BIRLA INSTITUTE OF SCIENTIFIC RESEARCH, JAIPUR

One Day Workshop 16 February, 2010
on
"BIOINFORMATICS"





Section - 1

Accessing Bioinformatics Resources and Databases

Objective

The objective of this exercise is to make students aware of the biological information and databases available over the internet, and to show how to download the desired information from these databases. Numerous websites deal with bioinformatics-related information, but only a limited number of dedicated servers offer a range of services to the scientific community; such servers are referred to as "metaservers". The major information repositories include NCBI, EBI, ExPASy and DDBJ. In this exercise we will access the NCBI website for different information and resources.

Lab 1: Searching the literature

For the present exercise we take "Pyruvate kinase" as the research subject. The purpose of this exercise is to search the research literature in PubMed, a service of the National Library of Medicine that contains millions of records in abstract form and links to the journal articles where these are available online.

Steps:

  • Go to the NCBI home page by clicking on this link
    (http://www.ncbi.nlm.nih.gov)

  • Select the "PubMed" database from the drop-down list of databases. In the blank space provided after "for", type "Pyruvate kinase". Now click on "Go".

  • The result page reports 8244 records, of which only 20 are visible (this number may vary, as the database is updated regularly). You can change the number shown by selecting the desired option from the "Show" field at the top of the page. To look at the abstract of an article, select "Abstract" from the "Display settings" hyperlink on the page.

  • To download an abstract to your PC, select the checkbox of the abstract you are interested in, then click "Send to" at the top of the page and select the "File" option from the given values. A dialogue box will appear asking you to save the file. Click OK.
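The same search can also be expressed as a URL. The following is a minimal Python sketch, assuming NCBI's E-utilities ESearch service is used instead of the web form (E-utilities is not part of the workshop steps above, and the function name pubmed_search_url is hypothetical). It only builds the URL; it does not contact NCBI.

```python
from urllib.parse import urlencode

# Assumption: NCBI E-utilities ESearch endpoint, used here in place of the web form.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def pubmed_search_url(term, retmax=20):
    """Build an ESearch URL that looks up `term` in the PubMed database.

    `retmax` plays the role of the "Show" field: how many record IDs to return.
    """
    params = {"db": "pubmed", "term": term, "retmax": retmax}
    return ESEARCH_URL + "?" + urlencode(params)


if __name__ == "__main__":
    # Print the URL for the search used in this lab.
    print(pubmed_search_url("Pyruvate kinase"))
```

Opening the printed URL in a browser should return an XML list of matching PubMed IDs; the count will differ from the figure quoted above because the database keeps growing.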

Lab 2: Searching for Nucleotide and Protein Sequence

In this exercise we will fetch the nucleotide(s) and protein sequence(s) of "Pyruvate kinase" from NCBI.

Steps:

  • Go to the NCBI home page by clicking on this link
    (http://www.ncbi.nlm.nih.gov)

  • From the database options select the "Nucleotide" or "Protein" database and type "Pyruvate kinase" in the query space. Now click on "Go".

  • The options are similar to those discussed above, but here we will download the sequence in FASTA format. Select the checkbox of the specific record and select the "FASTA" option from the "Display" field. This converts the records from summary view to FASTA format. You can now download the files to a folder on your PC by clicking on "Download" and then selecting the "FASTA" option.

  • For proteins it will show 4304 items, and for nucleotides 3169 items (records).

  • You can restrict your search by organism. For example, if you are interested only in human sequences, type Pyruvate kinase AND human [organism] in the query field. The search is then filtered to human pyruvate kinase sequences only, and shows only 88 sequences.

  • The Entrez query engine accepts Boolean searches, that is, the use of the AND, OR and NOT operators within a query.

  • You can limit your search by field, organism, gene location, dates etc.; go to the "Limits" option for further details.
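The organism restriction and the FASTA download described above can be sketched in Python. This is an illustration only, assuming NCBI's E-utilities (ESearch and EFetch) are used in place of the web pages; the helper names entrez_query, search_url and fasta_url are hypothetical. The code builds URLs and does not contact NCBI.

```python
from urllib.parse import urlencode

# Assumption: base address of the NCBI E-utilities services.
EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/"


def entrez_query(text, organism=None):
    """Combine free text with an optional [organism] restriction using AND."""
    if organism:
        return f"{text} AND {organism}[organism]"
    return text


def search_url(db, query, retmax=20):
    """ESearch URL for `query` in database `db` (for example "protein" or "nucleotide")."""
    return EUTILS + "esearch.fcgi?" + urlencode({"db": db, "term": query, "retmax": retmax})


def fasta_url(db, record_id):
    """EFetch URL that returns one record as plain-text FASTA."""
    params = {"db": db, "id": record_id, "rettype": "fasta", "retmode": "text"}
    return EUTILS + "efetch.fcgi?" + urlencode(params)


if __name__ == "__main__":
    query = entrez_query("Pyruvate kinase", organism="human")
    print(query)
    print(search_url("protein", query))
```

The same entrez_query helper can be reused for the assignment below by changing the text and organism arguments.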

Assignment:

  • Search for the following nucleotide and protein sequences:

    • Prohibitin from Drosophila melanogaster

    • Insulin from human

  • Search other resources available at NCBI such as genomes, taxonomy, books etc.

Section - 2

Homology Searching

Lab 1: Similarity search using BLAST

In this exercise we will retrieve sequences similar to the protein sequence of pyruvate kinase from Methanocaldococcus jannaschii DSM 2661. First download this protein sequence from the NCBI home page, as we did in the previous section.

The sequence comparison we want to make is restricted to specific genomes. For this we select three completely sequenced genomes, viz.:

    • Sulfolobus solfataricus
    • Halobacterium salinarum
    • Escherichia coli

STEPS:

  • First download the protein sequence of pyruvate kinase from Methanocaldococcus jannaschii DSM 2661.

  • Go to Entrez at NCBI by clicking on this link: http://www.ncbi.nlm.nih.gov/Entrez

  • Click on "Genome" (whole genome sequences).

  • On the page displayed, click on the BLAST hyperlink on the right side of the page.

  • Paste the protein sequence into the text box displayed.

  • For the query and the database, select protein in both cases, like this:

Query: Protein Database: Protein

  • Also select the check boxes of the specific genomes:


    • Sulfolobus solfataricus
    • Halobacterium salinarum
    • Escherichia coli

  • Then press the BLAST button.

  • The page displayed has formatting options; select the options you require, then press the View Report button.

  • The BLAST results are displayed after this operation.

  • Many sequences are similar to the query; we selected the first three sequences, one for each of the three selected organisms:


    • ref|NP_342465.1| Pyruvate kinase (pyK) [Sulfolobus solfatar... 244 1e-65
    • ref|NP_753966.1| Pyruvate kinase I [Escherichia coli CFT073] 238 1e-63
    • ref|NP_279422.1| pyruvate kinase; PykA [Halobacterium sp. N... 197 2e-51

  • Click on the links and get the sequences in FASTA format from the Display field for each organism.

  • Save all four sequences (the query and the three hits) in the same text file for multiple sequence alignment.
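The three hit lines listed above can also be read programmatically, for example to sort hits by score. The following is a minimal Python sketch, assuming each line has the layout shown above (database tag, accession, description, score, E-value). The function name parse_hit is hypothetical, and the pattern only handles E-values written in the form seen here, such as 1e-65.

```python
import re

# Layout assumed from the hit lines above: db|accession| description  score  e-value
HIT_PATTERN = re.compile(
    r"^(?P<db>\w+)\|(?P<acc>[^|]+)\|\s*(?P<desc>.*?)\s+(?P<score>\d+)\s+(?P<evalue>\S+)$"
)


def parse_hit(line):
    """Split one BLAST summary line into accession, description, score and E-value."""
    match = HIT_PATTERN.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognised hit line: {line!r}")
    return {
        "accession": match["acc"],
        "description": match["desc"],
        "score": int(match["score"]),
        "evalue": float(match["evalue"]),
    }


if __name__ == "__main__":
    hits = [
        "ref|NP_342465.1| Pyruvate kinase (pyK) [Sulfolobus solfatar... 244 1e-65",
        "ref|NP_753966.1| Pyruvate kinase I [Escherichia coli CFT073] 238 1e-63",
        "ref|NP_279422.1| pyruvate kinase; PykA [Halobacterium sp. N... 197 2e-51",
    ]
    for hit in map(parse_hit, hits):
        print(hit["accession"], hit["score"], hit["evalue"])
```

A lower E-value means the match is less likely to have arisen by chance, so the Sulfolobus hit is the strongest of the three.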

Lab 2: Similarity search using FASTA

In this exercise, search for similar sequences using the FASTA ("FAST-All") tool, with the protein sequence of the same organism used in the exercise above.

Follow these STEPS:

  • Go to EMBL Toolbox page by clicking on this link:
    http://www.ebi.ac.uk/Tools/

  • Now click on the "Similarity and Homology" hyperlink.

  • This opens the similarity searching and homology page, which contains a list of tools. Click on "Fasta Protein".

  • Select the required options and paste the protein sequence of the organism in the space provided.

  • Click on the RUN button.

  • The results show a summary table, various options and a list of similar sequences.

  • First click on the CLEAR ALL button, then select the check boxes of the required sequences and click on the SHOW ALIGNMENTS button.

  • Study the results and try the other options as well.

Section - 3

SEQUENCE ANALYSIS

Lab 1: Pairwise Alignment

  • Open the SIM alignment tool for protein sequences by following this link: http://www.expasy.org/tools/sim-prot.html

  • Download any two protein sequences from NCBI and copy-paste them into the appropriate fields of the tool.

  • Then perform the alignment by clicking on the Submit button.
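To get a feel for what a pairwise alignment tool computes, the following Python sketch scores the best local alignment between two sequences with the Smith-Waterman recurrence. It is not the SIM implementation: it uses a fixed match/mismatch score and a linear gap penalty, whereas alignment servers normally use a substitution matrix and separate gap penalties. The function name and the scoring values are assumptions chosen for illustration.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best Smith-Waterman local alignment score of sequences a and b.

    Only the score is computed (no traceback), keeping one row of the
    dynamic-programming table in memory at a time.
    """
    prev = [0] * (len(b) + 1)  # row for the previous character of a
    best = 0
    for ca in a:
        curr = [0]  # column 0 is always 0 in local alignment
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)  # align ca with cb
            up = prev[j] + gap        # gap in b
            left = curr[j - 1] + gap  # gap in a
            cell = max(0, diag, up, left)  # 0 lets an alignment start anywhere
            curr.append(cell)
            best = max(best, cell)
        prev = curr
    return best


if __name__ == "__main__":
    # Toy sequences, not real proteins.
    print(local_alignment_score("XXACGTXX", "YYACGTYY"))
```

With these toy inputs the shared segment ACGT drives the score; the unrelated flanking characters are ignored, which is the defining property of a local alignment.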

Lab 2: MULTIPLE SEQUENCE ALIGNMENT using EBI tool

Follow these steps:

  • Search for the different species of Rhizobium on google.com

  • From the Rhizobium species, choose a few for multiple sequence alignment, such as:

    • Rhizobium cellulosilyticum (AM286429)

    • Rhizobium daejeonense (AM910856)

    • Rhizobium etli (HB764848)

    • Rhizobium galegae (EU074168)

  • Download the nucleotide sequences of these species in FASTA format from http://www.ncbi.nlm.nih.gov

  • Open the EBI ClustalW2 tool by clicking on this link:
    http://www.ebi.ac.uk/Tools/clustalw2/index.html

  • Paste all the downloaded Rhizobium sequences, in FASTA format, into the text area provided.

  • Now click on the RUN button.

  • Study the results you get.

  • Now repeat this exercise with the same sequences, this time changing the parameter values, and study how the results differ.
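When studying the alignment output, note the line of "*" characters under the aligned sequences: it marks columns in which every sequence has the same residue. The following Python sketch reproduces that marking for sequences that are already aligned. The function name conservation_line and the toy sequences are hypothetical; real input would be the aligned rows copied from the results page.

```python
def conservation_line(aligned):
    """Return a string with '*' under every fully conserved alignment column.

    `aligned` is a list of equal-length strings in which '-' stands for a gap.
    Columns containing a gap or more than one residue get a space.
    """
    lengths = {len(seq) for seq in aligned}
    if len(lengths) != 1:
        raise ValueError("aligned sequences must all have the same length")
    marks = []
    for column in zip(*aligned):  # walk the alignment column by column
        conserved = len(set(column)) == 1 and column[0] != "-"
        marks.append("*" if conserved else " ")
    return "".join(marks)


if __name__ == "__main__":
    # Toy alignment, not real Rhizobium data.
    rows = ["ACG-T", "ACGAT", "AC-AT"]
    for row in rows:
        print(row)
    print(conservation_line(rows))
```

Counting the stars before and after a parameter change gives a quick, rough way to compare two runs of the same sequences.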

Section - 4

Retrieve a PDB structure

In this exercise we will retrieve a PDB structure by its ID, for example "1DD6", from the Protein Data Bank (PDB).

For this exercise follow these steps:

  • Go to the PDB site: http://www.rcsb.org/pdb

  • Enter "1DD6" in the SEARCH text box (just before the "Site Search" button on the top blue bar).

  • Then click on the SITE SEARCH button.

  • To display the PDB file, click on the image next to 1DD6 (Display PDB file).

  • To download the PDB file, click on the image next to 1DD6 (Download PDB file), or, on the right side of the page, click on "Download Files" and then on "PDB text".

  • Save the PDB file in your exercise folder.

  • Do the same for 7LYZ to get its PDB structure.
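The download step can also be sketched in Python. This is a minimal illustration, assuming the RCSB file server address pattern https://files.rcsb.org/download/ID.pdb (not mentioned in the steps above); the function name pdb_download_url is hypothetical. The code checks the ID and builds the address; it does not download anything.

```python
import re


def pdb_download_url(pdb_id):
    """Return the assumed RCSB download address for a four-character PDB ID."""
    pdb_id = pdb_id.strip().upper()
    # Classic PDB IDs are four characters long: a digit followed by three letters or digits.
    if not re.fullmatch(r"[0-9][A-Z0-9]{3}", pdb_id):
        raise ValueError(f"not a four-character PDB ID: {pdb_id!r}")
    return f"https://files.rcsb.org/download/{pdb_id}.pdb"


if __name__ == "__main__":
    # The two entries used in this section.
    for entry in ("1DD6", "7LYZ"):
        print(pdb_download_url(entry))
```

The printed addresses can be opened in a browser, and the resulting files saved in the exercise folder as in the steps above.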

Section - 5

Visualization of Protein/DNA using RasMol.

Introduction

RasMol is a computer program for molecular graphics visualization, intended and used primarily for the depiction and exploration of biological macromolecule structures.

Getting Started

  • Start RasMol from your computer's Desktop.

  • This opens two windows: a black graphics window and a white command-line window. On Windows, the white command-line window starts minimized; look for it on the taskbar.

  • Commands below preceded by M are best done from the pull-down menus. Commands NOT preceded by M must be typed in the white command-line window. Commands listed below separated by semicolons should be typed on separate lines in the white window, pressing Enter after each command.

  • Run RasMol and do M(enu) File-- Open. Select 1d66.pdb (GAL4 transcriptional regulator complexed with DNA).

  1. How many chains are there?

    • reset; rotate z 90; zoom 150; rotate y 40

    • M(enu) Display-- Backbone; M Colours-- Chain
      (Each chain is now shown in a different colour. Click on each chain to report its ID letter code.)

  2. Is there anything else in this PDB file besides the protein/DNA chains?

    • select hetero; M Display-- Spacefill
      (Now you can see the oxygen atoms of the water in the X-rayed crystal.)

    • M Colours-- CPK

    • restrict not water
      (This hides the water. Click on what remains to find out what it is.)

  3. Which are the hydrophobic amino acids?

    • select hydrophobic; color magenta; wireframe 0.4; select not water

    • M Display-- Spacefill; M Option-- Slab mode

  4. Many more commands can be tried in the same way.
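Questions 1 and 2 above can be cross-checked by reading the PDB file as text. The following Python sketch assumes the standard fixed-column PDB layout, in which the chain ID is in column 22 and the residue name in columns 18 to 20 of ATOM and HETATM records. The function name summarize_pdb is hypothetical, and the file name in the usage line is whatever name you saved the file under.

```python
def summarize_pdb(text):
    """Return (chain IDs of ATOM records, residue names of HETATM records).

    Chains are listed in order of first appearance; hetero residue names
    (including HOH for water) are returned sorted.
    """
    chains = []
    hetero = set()
    for line in text.splitlines():
        record = line[:6].strip()  # record name occupies columns 1-6
        if record == "ATOM" and len(line) > 21:
            chain = line[21]  # column 22: chain identifier
            if chain not in chains:
                chains.append(chain)
        elif record == "HETATM" and len(line) >= 20:
            hetero.add(line[17:20].strip())  # columns 18-20: residue name
    return chains, sorted(hetero)


if __name__ == "__main__":
    import sys

    # Usage: python summarize_pdb.py 1d66.pdb
    if len(sys.argv) > 1:
        with open(sys.argv[1]) as handle:
            found_chains, found_hetero = summarize_pdb(handle.read())
        print("Chains:", found_chains)
        print("Hetero groups:", found_hetero)
```

The chain letters printed should match those reported when clicking on each chain in RasMol, and the hetero groups other than HOH correspond to what remains visible after "restrict not water".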


