BioInformatics Tools

Bioinformatics tools are software programs for storing, retrieving and analyzing biological data and for extracting information from it.

Factors that must be taken into consideration when designing these tools are:

The end user (the biologist) may not be a frequent user of computer technology, so the tools should be very user friendly.
The tools must be made available over the internet, given the global distribution of the scientific research community.

Bioinformatics tools may be grouped into the following categories:

Homology and Similarity Tools
Protein Function Analysis
Structural Analysis
Sequence Analysis

Homology and Similarity Tools

The term homology implies a common evolutionary relationship between two traits, whether they are DNA sequences or bristle patterns on a fly's nose. Homologous sequences are sequences that are related by divergence from a common ancestor. Thus the degree of similarity between two sequences can be measured, while their homology is either true or false. This set of tools can be used to identify similarities between novel query sequences of unknown structure and function and database sequences whose structure and function have been elucidated.
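
The difference can be made concrete: similarity is a number computed from an alignment, whereas homology is a conclusion drawn from such evidence. The following is a minimal Python sketch, not taken from any of the tools described here, that computes percent identity for two sequences that have already been aligned to the same length. The function name percent_identity and the example sequences are made up for illustration.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of gap-free alignment columns in which both sequences agree."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue  # skip columns that contain a gap in either sequence
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


if __name__ == "__main__":
    # 7 of the 8 columns agree, so this prints 87.5
    print(percent_identity("ACGTACGT", "ACGTTCGT"))
```

A high score of this kind is evidence for homology, but the score itself does not decide the question.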

Protein Function Analysis

Function analysis is the identification and mapping of all functional elements (both coding and non-coding) in a genome. This group of programs allows you to compare your protein sequence to the secondary (or derived) protein databases that contain information on motifs, signatures and protein domains. Highly significant hits against these pattern databases allow you to approximate the biochemical function of your query protein.
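
As an illustration of what a motif search involves, the sketch below scans a protein sequence for a pattern written in PROSITE-style syntax, PROSITE being one example of a pattern database. It is a simplified sketch under stated assumptions: only literal residues, x, [..], {..} and repeat counts are handled, the function names prosite_to_regex and find_motif are invented here, and the test sequence is made up. The pattern used is the N-glycosylation site pattern N-{P}-[ST]-{P}.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a subset of PROSITE pattern syntax into a Python regex."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        repeat = ""
        # A repeat count such as x(2) or x(2,4) follows the element.
        counted = re.fullmatch(r"(.+?)\((\d+(?:,\d+)?)\)", element)
        if counted:
            element = counted.group(1)
            repeat = "{" + counted.group(2) + "}"
        if element == "x":
            core = "."                        # any residue
        elif element.startswith("{") and element.endswith("}"):
            core = "[^" + element[1:-1] + "]"  # any residue except these
        else:
            core = element                    # literal residue or [..] set
        parts.append(core + repeat)
    return "".join(parts)


def find_motif(sequence: str, pattern: str):
    """Return (1-based position, matched text) for every hit, overlaps included."""
    regex = re.compile("(?=(" + prosite_to_regex(pattern) + "))")
    return [(m.start() + 1, m.group(1)) for m in regex.finditer(sequence)]


if __name__ == "__main__":
    # The second N is followed by P, so only the first site is reported.
    print(find_motif("MKNGSAANPTQ", "N-{P}-[ST]-{P}"))  # [(3, 'NGSA')]
```

Short patterns such as this one match by chance quite often, which is why the significance of a hit matters before any function is inferred from it.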

Structural Analysis

This set of tools allows you to compare structures with the known structure databases. The function of a protein is more directly a consequence of its structure than of its sequence, and structural homologs tend to share functions. Determining a protein's 2D/3D structure is therefore crucial to the study of its function.

Sequence Analysis

This set of tools allows you to carry out further, more detailed analysis on your query sequence, including evolutionary analysis and the identification of mutations, hydropathy regions, CpG islands and compositional biases. These and other biological properties are all clues that aid the search to elucidate the specific function of your sequence.
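
Two of these properties, compositional bias and CpG content, are simple enough to sketch in a few lines of Python. The code below is an illustration only, with function names chosen here: it computes the GC fraction of a DNA sequence and the observed/expected CpG ratio, defined as (number of CG dinucleotides x sequence length) / (number of C x number of G). One widely cited rule of thumb (Gardiner-Garden and Frommer) treats a region of at least 200 bp with GC content above 50% and a ratio above 0.6 as a candidate CpG island; real tools apply such criteria over sliding windows, which this sketch does not do.

```python
def gc_content(seq: str) -> float:
    """Fraction of bases that are G or C (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def cpg_obs_exp(seq: str) -> float:
    """Observed/expected CpG ratio: (#CG * length) / (#C * #G)."""
    seq = seq.upper()
    c_count = seq.count("C")
    g_count = seq.count("G")
    if c_count == 0 or g_count == 0:
        return 0.0  # ratio is undefined without both bases; report 0.0
    return seq.count("CG") * len(seq) / (c_count * g_count)


if __name__ == "__main__":
    example = "CGCGCGCG"  # made-up sequence, far shorter than a real island
    print(gc_content(example))   # 1.0
    print(cpg_obs_exp(example))  # 2.0
```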

Bioinformatics Tools

BLAST
The Basic Local Alignment Search Tool (BLAST) compares gene and protein sequences against others in public databases. It comes in several types, including PSI-BLAST, PHI-BLAST, and BLAST 2 sequences. Specialized BLASTs are also available for human, microbial, malaria, and other genomes, as well as for vector contamination, immunoglobulins, and tentative human consensus sequences.

FASTA
FASTA is a database search tool used to compare a nucleotide or peptide sequence to a sequence database. The program is based on the rapid sequence comparison algorithm described by Lipman and Pearson. It was the first widely used algorithm for database similarity searching. The program looks for optimal local alignments by scanning the sequence for small matches called "words". Initially, the scores of segments in which there are multiple word hits are calculated ("init1"). Later the scores of several segments may be summed to generate an "initn" score. An optimized alignment that includes gaps is shown in the output as "opt". The sensitivity and speed of the search are inversely related and are controlled by the "k-tup" variable, which specifies the size of a "word": a smaller k-tup gives a more sensitive but slower search.
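
The word-matching stage can be sketched as follows. This is a simplified Python illustration of the idea, not the FASTA program's actual code: it indexes every word of length ktup in the database sequence, looks up each query word, and counts hits per diagonal (database position minus query position). The rescoring that produces the init1, initn and opt scores is omitted, and the function names and sequences are invented for the example.

```python
from collections import defaultdict


def word_index(sequence: str, ktup: int):
    """Map each word of length ktup to the positions where it starts."""
    index = defaultdict(list)
    for i in range(len(sequence) - ktup + 1):
        index[sequence[i:i + ktup]].append(i)
    return index


def diagonal_hits(query: str, subject: str, ktup: int = 2):
    """Count exact word hits on each diagonal (subject pos - query pos)."""
    index = word_index(subject, ktup)
    hits = defaultdict(int)
    for q in range(len(query) - ktup + 1):
        for s in index.get(query[q:q + ktup], ()):
            hits[s - q] += 1
    return dict(hits)


def best_diagonal(query: str, subject: str, ktup: int = 2):
    """Return (diagonal, hit count) for the richest diagonal, or None."""
    hits = diagonal_hits(query, subject, ktup)
    if not hits:
        return None
    return max(hits.items(), key=lambda item: item[1])


if __name__ == "__main__":
    # All three query words (AC, CG, GT) fall on diagonal 2.
    print(best_diagonal("ACGT", "TTACGTTT", ktup=2))  # (2, 3)
```

Raising ktup shrinks the number of word hits that have to be examined, which is where the speed gain comes from, at the cost of missing matches shorter than the word size.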

EMBOSS
EMBOSS (The European Molecular Biology Open Software Suite) is a free, open source software analysis package developed for the needs of the molecular biology user community. Within EMBOSS you will find around 100 programs (applications) for sequence alignment, database searching with sequence patterns, protein motif identification and domain analysis, nucleotide sequence pattern analysis, codon usage analysis for small genomes, and much more.

A list of the applications included in the EMBOSS package can be found at http://www.hgmp.mrc.ac.uk/Software/EMBOSS/Apps/

ClustalW
ClustalW is a general purpose multiple sequence alignment program for DNA or proteins. It produces biologically meaningful multiple sequence alignments of divergent sequences, calculates the best match for the selected sequences, and lines them up so that the identities, similarities and differences can be seen.
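
To show how lined-up sequences make identities visible, here is a small Python sketch that takes sequences that have already been aligned and builds a marker line with "*" under every column in which all residues are identical, in the spirit of the consensus line in ClustalW output. It does not perform the alignment itself, it ignores the ":" and "." markers that ClustalW uses for conserved substitutions, and the function name and sequences are invented for the example.

```python
def conservation_line(aligned):
    """Return a string with '*' under each fully identical, gap-free column."""
    if not aligned:
        return ""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all aligned sequences must have the same length")
    marks = []
    for column in zip(*aligned):
        residues = set(column)
        identical = len(residues) == 1 and "-" not in residues
        marks.append("*" if identical else " ")
    return "".join(marks)


if __name__ == "__main__":
    alignment = ["ACGT-A", "ACCTGA", "ACGTGA"]  # made-up aligned sequences
    for seq in alignment:
        print(seq)
    print(conservation_line(alignment))
```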

RasMol
RasMol is a powerful research tool for displaying the structure of DNA, proteins, and smaller molecules. Protein Explorer, a derivative of RasMol, is easier to use.

Application Programs

Java in Bioinformatics:
Because of its platform independence, Java is emerging as a key player in bioinformatics. Physiome Sciences' computer-based biological simulation technologies and Bioinformatics Solutions' PatternHunter are two examples of the growing adoption of Java in bioinformatics.

Perl in Bioinformatics:
Perl is also used for processing biological data. One example of a Perl project is the BioPerl project.

Bioinformatics Projects:

BioJava:
The BioJava project provides Java tools for processing biological data.

BioPerl:
The BioPerl project provides many Perl modules for biological data processing.

BioXML:
A part of the BioPerl project, BioXML is a resource that gathers XML documentation, DTDs and XML-aware tools for biology in one location.
