BIRLA INSTITUTE OF SCIENTIFIC RESEARCH, JAIPUR

Summer Training May-July, 2010



Day5


Lab-1

To perform structure validation of a compound using Lipinski's rule in PubChem

Introduction

Lipinski's Rule of Five is a rule of thumb for evaluating druglikeness, that is, for determining whether a chemical compound with a certain pharmacological or biological activity has properties that would make it a likely orally active drug in humans. The rule was formulated by Christopher A. Lipinski in 1997, based on the observation that most orally administered drugs are relatively small and lipophilic molecules.

The rule describes molecular properties important for a drug's pharmacokinetics in the human body, including its absorption, distribution, metabolism, and excretion ("ADME"). However, the rule does not predict whether a compound is pharmacologically active.

Lipinski's Rule of Five states that, in general, an orally active drug has no more than one violation of the following criteria:

  • Not more than 5 hydrogen bond donors (nitrogen or oxygen atoms with one or more hydrogen atoms)

  • Not more than 10 hydrogen bond acceptors (nitrogen or oxygen atoms)

  • A molecular weight under 500 g/mol

  • A partition coefficient log P less than 5

STEPS:

  • Go to the NCBI home page
    http://www.ncbi.nlm.nih.gov

  • Select PubChem Compound in the search menu and type aspirin (for example)

  • Click on the link CID: 2244

  • You will get output like this:

Compound ID: 2244
Molecular Weight: 180.15742 [g/mol]
Molecular Formula: C9H8O4
XlogP: 1.4
H-Bond Donor: 1
H-Bond Acceptor: 4
IUPAC Name: 2-acetyloxybenzoic acid
Canonical SMILES: CC(=O)OC1=CC=CC=C1C(=O)O
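
The four criteria can be checked mechanically against the values reported by PubChem. The following is a minimal sketch, not part of the original exercise: the class and function names are hypothetical, and the thresholds follow the wording above (MW under 500, log P less than 5, at most one violation allowed).

```python
from dataclasses import dataclass


@dataclass
class Compound:
    """Property values as read from a PubChem compound page."""
    name: str
    mol_weight: float   # g/mol
    logp: float         # XlogP
    h_donors: int       # H-Bond Donor count
    h_acceptors: int    # H-Bond Acceptor count


def lipinski_violations(c):
    """Return the list of Rule of Five criteria that the compound violates."""
    violations = []
    if c.h_donors > 5:
        violations.append("more than 5 hydrogen bond donors")
    if c.h_acceptors > 10:
        violations.append("more than 10 hydrogen bond acceptors")
    if not c.mol_weight < 500:
        violations.append("molecular weight not under 500 g/mol")
    if not c.logp < 5:
        violations.append("log P not less than 5")
    return violations


def passes_rule_of_five(c):
    """An orally active drug is expected to have no more than one violation."""
    return len(lipinski_violations(c)) <= 1


# Values taken from the PubChem output for CID 2244 shown above.
aspirin = Compound("aspirin", 180.15742, 1.4, 1, 4)
print(lipinski_violations(aspirin))   # []
print(passes_rule_of_five(aspirin))   # True
```

Aspirin has no violations, which agrees with the nviolations value of 0 that Molinspiration reports in Lab-2.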

Lab-2

To understand molecule sketching, the SMILES file format, and property calculation

Introduction

A biologically active compound can act as a drug. The software used here helps with sketching molecules. The Simplified Molecular Input Line Entry Specification (SMILES) is a line notation for molecules. SMILES strings include connectivity but do not include 2D or 3D coordinates.

Hydrogen atoms are usually implicit and not written. Other atoms are represented by their element symbols: B, C, N, O, F, P, S, Cl, Br, and I. The symbol "=" represents a double bond and "#" a triple bond. Branching is indicated by parentheses (). Ring closures are indicated by pairs of matching digits, and aromatic atoms may be written in lowercase, as in c1ccccc1 for benzene.

Name        Formula    SMILES String
Methane     CH4        C
Ethanol     C2H6O      CCO
Benzene     C6H6       C1=CC=CC=C1 or c1ccccc1
Ethylene    C2H4       C=C
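
To make the notation concrete, the sketch below counts the non-hydrogen (heavy) atoms in a SMILES string using only the element symbols listed above. It is an illustration under stated assumptions, not a full SMILES parser: the function name is hypothetical, bracket atoms such as [Na+] are rejected, and bonds, branches, and ring digits are simply skipped.

```python
import re
from collections import Counter

# Element symbols of the organic subset. Two-letter symbols (Cl, Br) are
# listed first so that "Cl" is not read as carbon followed by something else.
# Lowercase letters are the aromatic forms, e.g. "c" in c1ccccc1.
ATOM_PATTERN = re.compile(r"Cl|Br|[BCNOFPSI]|[bcnops]")


def heavy_atom_counts(smiles):
    """Count heavy atoms per element in a simple SMILES string."""
    if "[" in smiles:
        raise ValueError("bracket atoms are not handled by this sketch")
    atoms = ATOM_PATTERN.findall(smiles)
    # Fold aromatic (lowercase) symbols into their element symbol.
    return Counter(atom.capitalize() for atom in atoms)


for name, smiles in [("ethanol", "CCO"),
                     ("benzene", "c1ccccc1"),
                     ("aspirin", "CC(=O)OC1=CC=CC=C1C(=O)O")]:
    counts = heavy_atom_counts(smiles)
    print(name, dict(counts), "total:", sum(counts.values()))
```

For aspirin this gives 9 carbon and 4 oxygen atoms, 13 heavy atoms in total, consistent with the formula C9H8O4.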

STEPS:

  • Open the site
    http://www.molinspiration.com

  • Click on free on-line cheminformatics services

  • Use the | icon for a single bond and the || icon for a double bond

  • The red cross is the eraser

  • Sketch the aspirin molecule.

  • To look it up, open the site
    http://www.ncbi.nlm.nih.gov/sites/entrez?db=pccompound

  • Search for aspirin

  • Copy the SMILES string from the PubChem Canonical SMILES field

  • Paste the SMILES string CC(=O)OC1=CC=CC=C1C(=O)O into the box labelled paste smiles here.

  • Click on calculate properties and on predict bioactivity

  • The predict bioactivity output will look like this:

  • GPCR ligand -0.66
    Ion channel modulator -0.91
    Kinase inhibitor -0.49
    Nuclear receptor ligand -1.23

  • The calculate properties output will look like this:


  • miLogP 1.434
    TPSA 63.604
    natoms 13
    MW 180.159
    nON 4
    nOHNH 1
    nviolations 0
    nrotb 3
    volume 155.574

Lab-3

Prediction of the secondary structure of proteins using the GOR IV method

Introduction

Secondary structure is formally defined by the hydrogen bonds of the biopolymer, as observed in an atomic-resolution structure. In proteins, the secondary structure is defined by patterns of hydrogen bonds between backbone amide groups. Many methods are available for predicting the secondary structure of a protein, such as Chou-Fasman, PHD, and GOR. GOR IV is the fourth version of the GOR secondary structure prediction method, which is based on information theory (Garnier et al., 1996). GOR IV uses all possible pair frequencies within a window of 17 amino acid residues. The server output lists the states alpha helix, 3-10 helix, pi helix, beta bridge, extended strand, beta turn, bend region, and random coil; in the example output in this lab, only alpha helix, extended strand, and random coil are assigned.

Steps:

  • Go to the NCBI home page by clicking on this link
    (http://www.ncbi.nlm.nih.gov)

  • Select the Protein database, type pyruvate kinase in the search box, and click GO.

  • Now select the sequence entry YP_576506 (Accession ID) and save it in FASTA format.

  • Copy the entire sequence (excluding the first comment line, which starts with the > sign).

  • Go to GOR IV website by clicking on this link
    http://npsa-pbil.ibcp.fr/cgi-bin/npsa_automat.pl?page=npsa_gor4.html

  • Paste the selected sequence into the blank space provided. Click on the GO button.

  • You will get output like this:

		
		
		         10        20        30        40        50        60        70
		         |         |         |         |         |         |         |
		MRRLRRIKILATLGPASSDSAMVRRLFEAGADVFRINMSHTTHDKMRELVATIRNVEGSYGRPIGILVDL
		ccccceeeeeeeccccccchhhhhhhhhhcceeeeecccchhhhhhhhhhhhhhccccccccceeeeecc
		QGPKLRIGSFADGPIQLSNGDTFVLDSDNSPGDKTRVHLPHPEILAALRPGHTLLLDDGKVRLIAEETSP
		ccccceeeccccccceecccceeeeecccccccceeecccchhhhhhccccceeecccceeeeeeccccc
		GCAVTRVVVGGRMSDRKGVSLPDTDLPMSAMTPKDRSDLDAALEAGVDWIALSFVQRADDVAEAKKMIRG
		ccceeeeeeccccccccceeccccccccccccccchhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh
		RASVMAKIEKPQAIDRLPEIIEMADALMVARGDLGVELPLEQVPGLQKQMTRLARRAGKPVVIATQMLES
		hhhhhhhhcchhhhccchhhhhhhhhhhhhhccccccccccccchhhhhhhhhhhhcccchhhhhhhhhh
		MILSPVPTRAEVSDVATAVYEGADAIMLSAESAAGKYPVEAIATMNRIGEEVERDPTYRGVLNAQRPQPE
		hhcccccccccccchhhhhhhhhhhhhhhhhhhccccchhhhhhhhhhccccccccchhhhhhccccccc
		PTVGDAIADAARQIAETLDLSAIICWTSSGSTALRVARERPKPPVVAITPNLATGRKLAVVWGVHCVVAE
		ccchhhhhhhhhhhhhhhcceeeeeecccccchhhhhccccccceeeeeccccccceeeeeeeeeeeeee
		DAHDQDDMVDRAGSIAFRDGFAKAGQRIIIVAGVPLGTPGATNMTRIAFVGPNGETGV
		cccchhhhhhhhhhhhhhhhhhhhcceeeeeeccccccccccceeeeeeeccceeeec   
		Sequence length :   478

		GOR4 :
		   Alpha helix     (Hh) :   190 is  39.75%
		   310  helix      (Gg) :     0 is   0.00%
		   Pi helix        (Ii) :     0 is   0.00%
		   Beta bridge     (Bb) :     0 is   0.00%
		   Extended strand (Ee) :    89 is  18.62%
		   Beta turn       (Tt) :     0 is   0.00%
		   Bend region     (Ss) :     0 is   0.00%
		   Random coil     (Cc) :   199 is  41.63%
		   Ambigous states (?)  :     0 is   0.00%
		   Other states         :     0 is   0.00%
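
The percentages in the GOR4 summary are simply the share of residues assigned to each state. The sketch below reproduces that calculation for a prediction string; the function name is hypothetical, and only the three states that appear in the output above (h, e, c) are tallied.

```python
from collections import Counter

# One-letter state codes as they appear in the prediction line.
STATE_NAMES = {"h": "Alpha helix", "e": "Extended strand", "c": "Random coil"}


def state_composition(prediction):
    """Return {state: (count, percentage)} for a GOR-style prediction string."""
    # Ignore whitespace so that several pasted rows can be joined freely.
    states = [s for s in prediction.lower() if not s.isspace()]
    total = len(states)
    if total == 0:
        raise ValueError("empty prediction string")
    counts = Counter(states)
    return {state: (counts[state], round(100.0 * counts[state] / total, 2))
            for state in STATE_NAMES}


# First row of the prediction shown above (residues 1 to 70).
first_row = ("ccccceeeeeeeccccccchhhhhhhhhhcceeeeecccc"
             "hhhhhhhhhhhhhhccccccccceeeeecc")
for state, (count, percent) in state_composition(first_row).items():
    print(f"{STATE_NAMES[state]:16s} {count:4d} is {percent:6.2f}%")
```

Applied to all 478 residues, the reported counts of 190 helix, 89 strand, and 199 coil residues correspond to 39.75%, 18.62%, and 41.63%.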
		
		

Lab-4

Prediction of the structure of a protein of unknown structure using the SWISS-MODEL server.

Introduction

Determining the tertiary structure of proteins is an important problem in biochemistry, and since experimental structure determination is relatively difficult, protein structure prediction has been a long-standing problem. There are many ways to predict protein structure, including ab initio methods, threading algorithms, and homology modeling, typically combined with energy minimization. SWISS-MODEL is an automated homology modeling server developed by the Swiss Institute of Bioinformatics (SIB); Swiss-PdbViewer is a related program for viewing structures. For a target sequence, the server searches for a homologous template of known structure, builds a model from the target-template alignment, and optimizes the model by energy minimization. The quality of the model can then be checked with a Ramachandran plot.

Steps:

  • Go to the NCBI home page by clicking on this link
    (http://www.ncbi.nlm.nih.gov)

  • Select Protein in the search menu, type calmodulin, and then click on the GO button

  • Select the sequence with an Accession ID like CAA36839.1

  • Click on the ID, then select FASTA in the display option

  • Copy the entire sequence (sequence only, without the header line)

  • Open swiss model site by clicking on this link http://swissmodel.expasy.org/

  • Now follow the Instructor's guidance

  • You will get output like this (the example shown was generated for the pyruvate kinase sequence from Lab-3):



		
		Model info: 
		
		modelled residue range:	4 to 473  
		based on template:   2e28A (2.40 Å)  
		Sequence Identity [%]:   39.286  
		Evalue:     0.00e-1  

		

Alignment:

		
		TARGET    4        LRRIKIL ATLGPASSDS AMVRRLFEAG ADVFRINMSH TTHDKMRELV
		2e28A     1     m--krktkiv stigpasesv dklvqlmeag mnvarlnfsh gdheehgrri
		                                                                      
		TARGET                 sss sss      h hhhhhhhhh   sssssss     hhhhhhhh
		2e28A                  sss sss      h hhhhhhhhh   sssssss     hhhhhhhh
		TARGET    51    ATIRNVEGSY GRPIGILVDL QGPKLRIGSF ADGPIQLSNG DTFVLDSDNS
		2e28A     49    anireaakrt grtvailldt kgpeirthnm engaielkeg sklvismsev
		                                                                      
		TARGET          hhhhhhhhh      ssssss     ssss                ssss    
		2e28A           hhhhhhhhh      ssssss     ssss                ssss    
		TARGET    101   PGDKTRVHLP HPEILAALRP GHTLLLDDGK VRLIAEETSP --GCAVTRVV
		2e28A     99    lgtpekisvt ypsliddvsv gakillddgl islevnavdk qageivttvl
		                                                                      
		TARGET              sss      hh        ssssss  s sssssssss     sssssss
		2e28A               ssssss   hh        ssssss  s sssssssss     sssssss
		TARGET    149   VGGRMSDRKG VSLPDTDLPM SAMTPKDRSD LDAALEAGVD WIALSFVQRA
		2e28A     149   nggvlknkkg vnvpgvkvnl pgitekdrad ilfgirqgid fiaasfvrra
		                                                                      
		TARGET                 sss ss             hhhhhh hhhhhhh  s ssss      
		2e28A                  sss ss             hhhhhh hhhhhhh  s ssss      
		TARGET    199   DDVAEAKKMI ----RGRASV MAKIEKPQAI DRLPEIIEMA DALMVARGDL
		2e28A     199   sdvleirell eahdalhiqi iakieneegv anideileaa dglmvargdl
		                                                                      
		TARGET          hhhhhhhhh          ss ssss  hhhh    hhhhhh   ssss hhhh
		2e28A           hhhhhhhhhh hh      ss ssss  hhhh    hhhhhh   sssssshh 
		TARGET    245   GVELPLEQVP GLQKQMTRLA RRAGKPVVIA TQMLESMILS PVPTRAEVSD
		2e28A     249   gveipaeevp liqkllikks nmlgkpvita tqmldsmqrn prptraeasd
		                                                                      
		TARGET          hhh   hhhh hhhhhhhhhh hhh  sssss      hh        hhhhhh
		2e28A             h   hhhh hhhhhhhhhh hhh  sssss         h      hhhhhh
		TARGET    295   VATAVYEGAD AIMLSAESAA GKYPVEAIAT MNRIGEEVER DPTYRGVLNA
		2e28A     299   vanaifdgtd avmlsgetaa gqypveavkt mhqialrteq alehrdilsq
		                                                                      
		TARGET          hhhhhhh  s ssss          hhhhhhh hhhhhhhhhh    hhhhhhh
		2e28A           hhhhhhh  s ssss          hhhhhhh hhhhhhhhhh    hhhhhhh
		TARGET    345   QRPQPEPTVG DAIADAARQI AETLDLSAII CWTSSGSTAL RVARERPKPP
		2e28A     349   rtkesqttit daigqsvaht alnldvaaiv tptvsgktpq mvakyrpkap
		                                                                      
		TARGET          h       hh hhhhhhhhhh hhhh   sss sss   hhhh hhhh     s
		2e28A           hh      hh hhhhhhhhhh hhhh   sss sss   hhhh hhhh     s
		TARGET    395   VVAITPNLAT GRKLAVVWGV HCVVAEDAHD QDDMVDRAGS IAFRDGFAKA
		2e28A     399   iiavtsneav srrlalvwgv ytkeaphvnt tdemldvavd aavrsglvkh
		                                                                      
		TARGET          sssss  hhh hh      ss sssss      hhhhhhhhhh hhhh      
		2e28A           sssss  hhh hh      ss sssss      hhhhhhhhhh hhhh      
		TARGET    445   GQRIIIVAGV PLGTPGATNM TRIAFVGPN  ---------- ----------
		2e28A     449   gdlvvitagv pvgetgstnl mkvhvisdll akgqgigrks afgkavvakt
		                                                                      
		TARGET           sssssss            s ssssss                          
		2e28A            sssssss            s ssssss sss ssssss  ss ssssssss  
		TARGET          ---------- ---------- ---------- ---------- ----------
		2e28A     499   aeearqkmvd ggilvtvstd admmpaieka aaiiteeggl tshaavvgls
		                                                                      
		TARGET                                                                
		2e28A           hhhhhhh      ssss         hhh     sssss       hhhhhhhh
		TARGET          ---------- ---------- ---------- ---------            
		2e28A     549   lgipvivgve nattlfkdgq eitvdggfga vyrghasvl            
		                                                                      
		TARGET                                                                
		2e28A           h  sssss            s ssssss  ss ssss     
		
		
  • Click on display model pdb to view the model, or on download if you want to download it
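
The Sequence Identity value in the model info summarizes how many aligned target and template residues match. The sketch below shows one common way to compute such a value from two aligned strings. It is an illustration only: the function name is hypothetical, identity is taken over columns where neither sequence has a gap, and SWISS-MODEL may use a different denominator, so the result need not reproduce the reported 39.286%.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of identical residues over the gap-free alignment columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    aligned = 0
    identical = 0
    # Compare case-insensitively, since the template is printed in lowercase.
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip columns where either sequence has a gap
        aligned += 1
        if a == b:
            identical += 1
    if aligned == 0:
        return 0.0
    return 100.0 * identical / aligned


# Second ten-residue group of the first alignment block shown above.
target = "ATLGPASSDS"
template = "stigpasesv"
print(percent_identity(target, template))  # 50.0
```

In this short excerpt five of the ten aligned positions are identical (T, G, P, A, S), so local stretches can score well above or below the overall value.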


