Lab-1
To perform structure validation of a compound using Lipinski's rule in PubChem
Introduction
Lipinski's Rule of Five is a rule of thumb to evaluate druglikeness, or determine if a chemical compound
with a certain pharmacological or biological activity has properties that would make it a likely orally
active drug in humans. The rule was formulated by Christopher A. Lipinski in 1997, based on the
observation that most orally administered drugs are relatively small, moderately lipophilic molecules.
The rule describes molecular properties important for a drug's pharmacokinetics in the human body,
including absorption, distribution, metabolism, and excretion ("ADME"). However, the rule does not
predict whether a compound is pharmacologically active.
Lipinski's Rule of Five states that, in general, an
orally active drug has no more than one violation of the following criteria:
- Not more than 5 hydrogen bond donors (nitrogen or oxygen atoms with one or more hydrogen atoms)
- Not more than 10 hydrogen bond acceptors (nitrogen or oxygen atoms)
- A molecular weight under 500 g/mol
- A partition coefficient log P less than 5
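The four criteria translate directly into a small check. The following is a minimal Python sketch, not part of PubChem or any library: the function names `lipinski_violations` and `passes_rule_of_five` and their parameter names are illustrative assumptions. The aspirin values are the ones reported for CID 2244 in the steps of this lab.

```python
def lipinski_violations(h_donors, h_acceptors, mol_weight, log_p):
    """Return the list of Rule of Five criteria that a compound violates."""
    violations = []
    if h_donors > 5:
        violations.append("more than 5 hydrogen bond donors")
    if h_acceptors > 10:
        violations.append("more than 10 hydrogen bond acceptors")
    if mol_weight >= 500:
        violations.append("molecular weight of 500 g/mol or more")
    if log_p >= 5:
        violations.append("log P of 5 or more")
    return violations


def passes_rule_of_five(h_donors, h_acceptors, mol_weight, log_p):
    """An orally active drug generally has no more than one violation."""
    return len(lipinski_violations(h_donors, h_acceptors, mol_weight, log_p)) <= 1


# Aspirin (PubChem CID 2244): 1 donor, 4 acceptors, MW 180.15742, XLogP 1.4
print(lipinski_violations(1, 4, 180.15742, 1.4))   # []
print(passes_rule_of_five(1, 4, 180.15742, 1.4))   # True
```

Returning the list of violated criteria, rather than only a count, makes it easy to see which property pushes a compound outside the rule.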
STEPS:
- Go to the NCBI home page: http://www.ncbi.nlm.nih.gov
- Select PubChem Compound in the search menu and type the compound name, for example aspirin.
- Click on the link CID: 2244.
- You will get output like this:
Compound ID: 2244
Molecular Weight: 180.15742 [g/mol]
Molecular Formula: C9H8O4
XlogP: 1.4
H-Bond Donor: 1
H-Bond Acceptor: 4
IUPAC Name: 2-acetyloxybenzoic acid
Canonical SMILES: CC(=O)OC1=CC=CC=C1C(=O)O
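The molecular weight in this record can be cross-checked from the molecular formula by summing atomic weights. The sketch below is only an illustration: the helper names `parse_formula` and `molecular_weight` are assumptions, the parser handles only simple formulas without brackets or charges, and the atomic weights are standard values rounded to three decimals, so the result agrees with the PubChem value only to about two decimal places.

```python
import re

# Standard atomic weights (g/mol), rounded to three decimals; only the
# elements needed for simple organic formulas are included in this sketch.
ATOMIC_WEIGHTS = {"H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999}


def parse_formula(formula):
    """Split a simple formula such as 'C9H8O4' into {element: count}."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts


def molecular_weight(formula):
    """Sum the atomic weights over all atoms in the formula."""
    return sum(ATOMIC_WEIGHTS[el] * n for el, n in parse_formula(formula).items())


print(parse_formula("C9H8O4"))               # {'C': 9, 'H': 8, 'O': 4}
print(round(molecular_weight("C9H8O4"), 3))  # 180.159
```

For aspirin this is 9 × 12.011 + 8 × 1.008 + 4 × 15.999 = 180.159 g/mol.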
Lab-2
To understand molecule sketching and the SMILES file format, and to calculate molecular properties
Introduction
Biologically active compounds can act as drugs, and molecule-sketching software is helpful for drawing
them. The Simplified Molecular Input Line Entry System (SMILES) is a line notation for molecules.
SMILES strings include connectivity but do not include 2D or 3D coordinates.
Hydrogen atoms are usually left implicit. Other atoms are represented by their element symbols
B, C, N, O, F, P, S, Cl, Br, and I. The symbol "=" represents a double bond and "#" represents a triple
bond. Branching is indicated by (). Rings are indicated by pairs of matching digits.
| Name | Formula | SMILES String |
| Methane | CH4 | C |
| Ethanol | C2H6O | CCO |
| Benzene | C6H6 | C1=CC=CC=C1 or c1ccccc1 |
| Ethylene | C2H4 | C=C |
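To make the notation concrete, the sketch below counts the symbols described above in a SMILES string. It is a simplified illustration, not a full SMILES parser: the function name `summarize_smiles` is an assumption, and bracket atoms, charges, and two-digit ring closures are not handled.

```python
import re
from collections import Counter

# Atoms of the organic subset named above; two-letter symbols are tried first
# so that "Cl" is not read as "C". Lowercase letters are aromatic atoms.
ATOM_PATTERN = re.compile(r"Cl|Br|[BCNOFPSI]|[bcnops]")


def summarize_smiles(smiles):
    """Count heavy atoms, bond symbols, branches, and rings in a simple SMILES string."""
    atoms = Counter(symbol.capitalize() for symbol in ATOM_PATTERN.findall(smiles))
    return {
        "atoms": dict(atoms),
        "heavy_atoms": sum(atoms.values()),
        # Only explicit "=" and "#" symbols are counted; aromatic bonds written
        # with lowercase atoms (c1ccccc1) carry no bond symbol.
        "double_bonds": smiles.count("="),
        "triple_bonds": smiles.count("#"),
        "branches": smiles.count("("),
        # Each ring is opened and closed by the same digit, so digits come in pairs.
        "rings": sum(ch.isdigit() for ch in smiles) // 2,
    }


aspirin = summarize_smiles("CC(=O)OC1=CC=CC=C1C(=O)O")
print(aspirin["atoms"])        # {'C': 9, 'O': 4}
print(aspirin["heavy_atoms"])  # 13
print(aspirin["rings"])        # 1
```

The 9 carbons and 4 oxygens agree with the molecular formula of aspirin, C9H8O4; the 8 hydrogens are implicit in the string.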
STEPS:
- Open the site http://www.molinspiration.com and click on "free on-line cheminformatics services".
- Use the | icon for a single bond and the || icon for a double bond.
- The red cross is the eraser.
- Sketch the molecule aspirin.
- To look up its structure, open the site http://www.ncbi.nlm.nih.gov/sites/entrez?db=pccompound and search for aspirin.
- Take the SMILES string from the Canonical SMILES field in PubChem.
- Paste the SMILES string CC(=O)OC1=CC=CC=C1C(=O)O into the "paste SMILES here" box.
- Click on "calculate properties" and "predict bioactivity".
- You will get the predicted bioactivity and the calculated properties as output:
| GPCR ligand | -0.66 |
| Ion channel modulator | -0.91 |
| Kinase inhibitor | -0.49 |
| Nuclear receptor ligand | -1.23 |
| miLogP | 1.434 |
| TPSA | 63.604 |
| natoms | 13 |
| MW | 180.159 |
| nON | 4 |
| nOHNH | 1 |
| nviolations | 0 |
| nrotb | 3 |
| volume | 155.574 |
Lab-3
Prediction of the secondary structure of proteins using the GOR IV method
Introduction
Secondary structure is formally defined by the hydrogen bonds of the biopolymer, as observed in an atomic-resolution structure. In proteins, the secondary structure is defined by patterns of hydrogen bonds between backbone amide groups. Many methods exist for predicting the secondary structure of proteins, such as Chou-Fasman, PHD, and GOR. GOR IV is the fourth version of the GOR secondary structure prediction method, which is based on information theory (Garnier et al., 1996). GOR IV uses all possible pair frequencies within a window of 17 amino acid residues. Its output reports alpha helix, 3-10 helix, pi helix, beta bridge, extended strand, beta turn, bend region, and random coil, although in the example output below only alpha helix, extended strand, and random coil have non-zero counts.
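The idea of a 17-residue window can be illustrated with a short sketch. This is not an implementation of GOR IV: it does not include the statistical parameters or the scoring, and the function names `residue_windows` and `pairs_in_window` are assumptions made for illustration. It only shows how a window centred on each residue is cut from the sequence (truncated at the ends) and how many position pairs such a window contains if all unordered pairs are counted.

```python
def residue_windows(sequence, width=17):
    """Yield (index, window) with the window centred on each residue."""
    half = width // 2
    for i in range(len(sequence)):
        start = max(0, i - half)                  # truncate at the N-terminus
        end = min(len(sequence), i + half + 1)    # truncate at the C-terminus
        yield i, sequence[start:end]


def pairs_in_window(width=17):
    """Number of unordered pairs of positions in a full window."""
    return width * (width - 1) // 2


windows = dict(residue_windows("MRRLRRIKILATLGPASSDSAMVRRLFEAG"))
print(len(windows[15]))    # 17 (full window in the middle of the sequence)
print(len(windows[0]))     # 9 (window truncated at the first residue)
print(pairs_in_window())   # 136
```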
Steps:
- Go to the NCBI home page (http://www.ncbi.nlm.nih.gov).
- Select the Protein database, type pyruvate kinase in the search box, and click GO.
- Select the sequence entry YP_576506 (accession ID) and save it in FASTA format.
- Copy the entire sequence (except the first comment line, which starts with the > sign).
- Go to the GOR IV website: http://npsa-pbil.ibcp.fr/cgi-bin/npsa_automat.pl?page=npsa_gor4.html
- Paste the selected sequence into the blank space provided and click on the GO button.
- You will get output like this:
        10        20        30        40        50        60        70
         |         |         |         |         |         |         |
MRRLRRIKILATLGPASSDSAMVRRLFEAGADVFRINMSHTTHDKMRELVATIRNVEGSYGRPIGILVDL
ccccceeeeeeeccccccchhhhhhhhhhcceeeeecccchhhhhhhhhhhhhhccccccccceeeeecc
QGPKLRIGSFADGPIQLSNGDTFVLDSDNSPGDKTRVHLPHPEILAALRPGHTLLLDDGKVRLIAEETSP
ccccceeeccccccceecccceeeeecccccccceeecccchhhhhhccccceeecccceeeeeeccccc
GCAVTRVVVGGRMSDRKGVSLPDTDLPMSAMTPKDRSDLDAALEAGVDWIALSFVQRADDVAEAKKMIRG
ccceeeeeeccccccccceeccccccccccccccchhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh
RASVMAKIEKPQAIDRLPEIIEMADALMVARGDLGVELPLEQVPGLQKQMTRLARRAGKPVVIATQMLES
hhhhhhhhcchhhhccchhhhhhhhhhhhhhccccccccccccchhhhhhhhhhhhcccchhhhhhhhhh
MILSPVPTRAEVSDVATAVYEGADAIMLSAESAAGKYPVEAIATMNRIGEEVERDPTYRGVLNAQRPQPE
hhcccccccccccchhhhhhhhhhhhhhhhhhhccccchhhhhhhhhhccccccccchhhhhhccccccc
PTVGDAIADAARQIAETLDLSAIICWTSSGSTALRVARERPKPPVVAITPNLATGRKLAVVWGVHCVVAE
ccchhhhhhhhhhhhhhhcceeeeeecccccchhhhhccccccceeeeeccccccceeeeeeeeeeeeee
DAHDQDDMVDRAGSIAFRDGFAKAGQRIIIVAGVPLGTPGATNMTRIAFVGPNGETGV
cccchhhhhhhhhhhhhhhhhhhhcceeeeeeccccccccccceeeeeeeccceeeec

Sequence length : 478
GOR4 :
Alpha helix (Hh) : 190 is 39.75%
310 helix (Gg) : 0 is 0.00%
Pi helix (Ii) : 0 is 0.00%
Beta bridge (Bb) : 0 is 0.00%
Extended strand (Ee) : 89 is 18.62%
Beta turn (Tt) : 0 is 0.00%
Bend region (Ss) : 0 is 0.00%
Random coil (Cc) : 199 is 41.63%
Ambigous states (?) : 0 is 0.00%
Other states : 0 is 0.00%
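The summary percentages in this output are simply the counts of each state letter divided by the sequence length. The sketch below reproduces that arithmetic for the three states that occur here; the function name `summarize_states` and the `STATE_NAMES` mapping are illustrative assumptions, not part of the GOR IV server.

```python
from collections import Counter

# One-letter state codes that occur in the prediction lines of the output above.
STATE_NAMES = {"h": "Alpha helix", "e": "Extended strand", "c": "Random coil"}


def summarize_states(prediction):
    """Return {state name: (count, percent of sequence length)}."""
    total = len(prediction)
    if total == 0:
        raise ValueError("prediction string is empty")
    counts = Counter(prediction)
    return {
        name: (counts[code], round(100.0 * counts[code] / total, 2))
        for code, name in STATE_NAMES.items()
    }


# First 70 predicted states from the output above.
first_line = "ccccceeeeeeeccccccchhhhhhhhhhcceeeeecccchhhhhhhhhhhhhhccccccccceeeeecc"
for name, (count, percent) in summarize_states(first_line).items():
    print(f"{name}: {count} is {percent:.2f}%")
```

Applied to the full 478-residue prediction, with 190 h, 89 e, and 199 c states, the same arithmetic gives 39.75%, 18.62%, and 41.63%, matching the reported summary.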
Lab-4
Prediction of the structure of a protein (of unknown structure) using the SWISS-MODEL server.
Introduction
Determining the tertiary structure of proteins is an important problem in biochemistry, and since experimental structure determination is relatively difficult, protein structure prediction has been a long-standing problem. There are several ways to predict the structure of a protein, including ab initio methods, threading algorithms, and homology modeling, typically combined with energy minimization. Available tools include Swiss-PdbViewer and SWISS-MODEL. SWISS-MODEL is an automated homology modeling server developed by the Swiss Institute of Bioinformatics (SIB). For a given target sequence it searches for a homologous template structure and builds a model on it; the model is optimized by energy minimization and can then be checked with a Ramachandran plot for its stability.
Steps:
- Go to the NCBI home page (http://www.ncbi.nlm.nih.gov).
- Select Protein in the search menu, type calmodulin, and then click on the GO button.
- Select the sequence with an accession ID such as CAA36839.1.
- Click on the ID, then select FASTA in the display options.
- Copy the sequence only (without the header line).
- Open the SWISS-MODEL site by clicking on this link: http://swissmodel.expasy.org/
- Now follow the Instructor's guidance
- You will get output like this:
Model info:
modelled residue range: 4 to 473
based on template: 2e28A (2.40 Å)
Sequence Identity [%]: 39.286
Evalue: 0.00e-1
Alignment:
TARGET 4 LRRIKIL ATLGPASSDS AMVRRLFEAG ADVFRINMSH TTHDKMRELV
2e28A 1 m--krktkiv stigpasesv dklvqlmeag mnvarlnfsh gdheehgrri
TARGET sss sss h hhhhhhhhh sssssss hhhhhhhh
2e28A sss sss h hhhhhhhhh sssssss hhhhhhhh

TARGET 51 ATIRNVEGSY GRPIGILVDL QGPKLRIGSF ADGPIQLSNG DTFVLDSDNS
2e28A 49 anireaakrt grtvailldt kgpeirthnm engaielkeg sklvismsev
TARGET hhhhhhhhh ssssss ssss ssss
2e28A hhhhhhhhh ssssss ssss ssss

TARGET 101 PGDKTRVHLP HPEILAALRP GHTLLLDDGK VRLIAEETSP --GCAVTRVV
2e28A 99 lgtpekisvt ypsliddvsv gakillddgl islevnavdk qageivttvl
TARGET sss hh ssssss s sssssssss sssssss
2e28A ssssss hh ssssss s sssssssss sssssss

TARGET 149 VGGRMSDRKG VSLPDTDLPM SAMTPKDRSD LDAALEAGVD WIALSFVQRA
2e28A 149 nggvlknkkg vnvpgvkvnl pgitekdrad ilfgirqgid fiaasfvrra
TARGET sss ss hhhhhh hhhhhhh s ssss
2e28A sss ss hhhhhh hhhhhhh s ssss

TARGET 199 DDVAEAKKMI ----RGRASV MAKIEKPQAI DRLPEIIEMA DALMVARGDL
2e28A 199 sdvleirell eahdalhiqi iakieneegv anideileaa dglmvargdl
TARGET hhhhhhhhh ss ssss hhhh hhhhhh ssss hhhh
2e28A hhhhhhhhhh hh ss ssss hhhh hhhhhh sssssshh

TARGET 245 GVELPLEQVP GLQKQMTRLA RRAGKPVVIA TQMLESMILS PVPTRAEVSD
2e28A 249 gveipaeevp liqkllikks nmlgkpvita tqmldsmqrn prptraeasd
TARGET hhh hhhh hhhhhhhhhh hhh sssss hh hhhhhh
2e28A h hhhh hhhhhhhhhh hhh sssss h hhhhhh

TARGET 295 VATAVYEGAD AIMLSAESAA GKYPVEAIAT MNRIGEEVER DPTYRGVLNA
2e28A 299 vanaifdgtd avmlsgetaa gqypveavkt mhqialrteq alehrdilsq
TARGET hhhhhhh s ssss hhhhhhh hhhhhhhhhh hhhhhhh
2e28A hhhhhhh s ssss hhhhhhh hhhhhhhhhh hhhhhhh

TARGET 345 QRPQPEPTVG DAIADAARQI AETLDLSAII CWTSSGSTAL RVARERPKPP
2e28A 349 rtkesqttit daigqsvaht alnldvaaiv tptvsgktpq mvakyrpkap
TARGET h hh hhhhhhhhhh hhhh sss sss hhhh hhhh s
2e28A hh hh hhhhhhhhhh hhhh sss sss hhhh hhhh s

TARGET 395 VVAITPNLAT GRKLAVVWGV HCVVAEDAHD QDDMVDRAGS IAFRDGFAKA
2e28A 399 iiavtsneav srrlalvwgv ytkeaphvnt tdemldvavd aavrsglvkh
TARGET sssss hhh hh ss sssss hhhhhhhhhh hhhh
2e28A sssss hhh hh ss sssss hhhhhhhhhh hhhh

TARGET 445 GQRIIIVAGV PLGTPGATNM TRIAFVGPN ---------- ----------
2e28A 449 gdlvvitagv pvgetgstnl mkvhvisdll akgqgigrks afgkavvakt
TARGET sssssss s ssssss
2e28A sssssss s ssssss sss ssssss ss ssssssss

TARGET ---------- ---------- ---------- ---------- ----------
2e28A 499 aeearqkmvd ggilvtvstd admmpaieka aaiiteeggl tshaavvgls
TARGET
2e28A hhhhhhh ssss hhh sssss hhhhhhhh

TARGET ---------- ---------- ---------- ---------
2e28A 549 lgipvivgve nattlfkdgq eitvdggfga vyrghasvl
TARGET
2e28A h sssss s ssssss ss ssss
- Click on "display model pdb" to view the model, or on "download" if you want to download it.
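The step of copying only the sequence from a FASTA record, used both here and in Lab-3, can also be done programmatically. The sketch below is a minimal illustration for a single-record FASTA string: the function name `sequence_from_fasta` is an assumption, and the example record is a hypothetical placeholder, not the calmodulin entry.

```python
def sequence_from_fasta(fasta_text):
    """Return the bare sequence from a single-record FASTA string."""
    lines = [line.strip() for line in fasta_text.strip().splitlines()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("expected a FASTA header line starting with '>'")
    # Join the remaining lines and drop any blanks inside the sequence.
    return "".join(line.replace(" ", "") for line in lines[1:])


# Hypothetical record: the header text and the residues are placeholders.
example = """>example_id hypothetical protein
MKTAYIAKQR
QISFVKSHFS
"""
print(sequence_from_fasta(example))  # MKTAYIAKQRQISFVKSHFS
```

The returned string contains no header and no line breaks, so it can be pasted directly into a sequence input box.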

