Bioinformatics is a branch of biological science concerned with storing, retrieving and analyzing biological data such as nucleic acid (DNA/RNA) and protein sequences, structures, functions, pathways and genetic interactions. It generates new knowledge that is useful in fields such as drug design and the development of new software tools. Bioinformatics also draws on algorithms, databases and information systems, web technologies, artificial intelligence and soft computing, information and computation theory, structural biology, software engineering, data mining, image processing, modeling and simulation, discrete mathematics, control and system theory, circuit theory, and statistics.
[Figure: Map of the human X chromosome. Assembly of the human genome is one of the greatest achievements of bioinformatics.]
At the beginning of the "genomic revolution," the term bioinformatics referred to the creation and maintenance of databases to store biological information such as nucleotide and amino acid sequences. Developing this type of database involved not only design issues but also the development of complex interfaces through which researchers could access existing data as well as submit new or revised data.
To study how normal cellular activities are altered in different disease states, the biological data must be combined to form a comprehensive picture of these activities. The field of bioinformatics has therefore evolved so that its most pressing task is now the analysis and interpretation of various types of data, including nucleotide and amino acid sequences, protein domains and protein structures. The actual process of analyzing and interpreting data is referred to as computational biology. Important sub-disciplines within bioinformatics and computational biology include:
- the development of tools that enable efficient use of various types of information
- the development of new algorithms (mathematical formulas) and statistics with which to assess relationships among members of large data sets, for example methods to locate a gene within a sequence (gene finding), predict protein structure and/or function, and cluster protein sequences into families of related sequences.
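As a rough illustration of what "locating a gene within a sequence" can mean computationally, the sketch below scans the forward strand of a DNA string for open reading frames (ORFs): a start codon followed in the same frame by a stop codon. This is a deliberately simplified assumption; real gene finders also handle the reverse strand, introns, and statistical signals. The function name `find_orfs` and the `min_codons` parameter are hypothetical, chosen only for this example.

```python
# Minimal ORF scan on the forward strand (illustrative sketch only).
STOP_CODONS = {"TAA", "TAG", "TGA"}
START_CODON = "ATG"


def find_orfs(seq, min_codons=3):
    """Return sorted (start, end) index pairs of ORFs in the three forward frames.

    `end` is exclusive and includes the stop codon. `min_codons` counts the
    start and stop codons; it is an arbitrary threshold chosen for this sketch.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None  # index of the current open start codon, if any
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == START_CODON:
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None  # look for the next start codon in this frame
    return sorted(orfs)


if __name__ == "__main__":
    # ATG AAA TTT TAG sits in frame 2 of this toy sequence.
    print(find_orfs("CCATGAAATTTTAGGG"))
```

The scan keeps only the first start codon seen in each frame before a stop, which yields the longest ORF ending at that stop.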
The primary goal of bioinformatics is to increase the understanding of biological processes. What sets it apart from other approaches, however, is its focus on developing and applying computationally intensive techniques to achieve this goal. Examples include pattern recognition, data mining, machine learning algorithms, and visualization. Major research efforts in the field include sequence alignment, gene finding, genome assembly, drug design, drug discovery, protein structure alignment, and the modeling of evolution.
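Sequence alignment, the first research effort listed above, is commonly introduced through dynamic programming. The sketch below computes the optimal global alignment score of two sequences using the Needleman-Wunsch recurrence with a linear gap penalty. The scoring values (`match=1`, `mismatch=-1`, `gap=-2`) and the function name are assumptions made for illustration; practical tools use substitution matrices and more elaborate gap models.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Optimal global alignment score (Needleman-Wunsch, linear gap penalty).

    Only the score is returned; the alignment itself is not reconstructed.
    The scoring parameters are illustrative defaults, not standard values.
    """
    # prev[j] holds the best score for aligning the processed prefix of `a`
    # with b[:j]; the first row aligns an empty prefix against gaps only.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # a[:i] aligned against an empty prefix of b
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # gap in b
            left = curr[j - 1] + gap  # gap in a
            curr.append(max(diag, up, left))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(global_alignment_score("GATTACA", "GATTACA"))
    print(global_alignment_score("ACGT", "ACG"))
```

Keeping only two rows of the score table reduces memory use to the length of one sequence, at the cost of not being able to trace back the alignment.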
Gene Ontology, or GO, is a major bioinformatics initiative to unify the representation of gene and gene product attributes across all species. More specifically, the project aims to:
- maintain and develop its controlled vocabulary of gene and gene product attributes
- annotate genes and gene products and assimilate and disseminate annotation data
- offer tools for easy access to all aspects of the data provided by the project
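To make the idea of a controlled vocabulary with annotations more concrete, the sketch below models a tiny term hierarchy and a query that returns the genes annotated to a term or to any more specific term beneath it. Everything here is hypothetical: the `EX:` term identifiers, the hierarchy, the gene names, and the function names are invented for illustration and do not reproduce actual Gene Ontology data, file formats, or tools.

```python
# Hypothetical mini-vocabulary: term ID -> (name, list of parent term IDs).
# These identifiers are made up and are NOT real Gene Ontology accessions.
TERMS = {
    "EX:0001": ("biological process", []),
    "EX:0002": ("cell death", ["EX:0001"]),
    "EX:0003": ("apoptotic process", ["EX:0002"]),
}


def ancestors(term_id):
    """Return the set of all terms reachable by following parent links."""
    seen = set()
    stack = list(TERMS[term_id][1])
    while stack:
        term = stack.pop()
        if term not in seen:
            seen.add(term)
            stack.extend(TERMS[term][1])
    return seen


def genes_for_term(annotations, term_id):
    """Genes annotated to `term_id` directly or through a more specific term."""
    return sorted({
        gene
        for gene, annotated_term in annotations
        if annotated_term == term_id or term_id in ancestors(annotated_term)
    })


if __name__ == "__main__":
    # Hypothetical (gene, term) annotation pairs.
    annotations = [("geneA", "EX:0003"), ("geneB", "EX:0002")]
    print(genes_for_term(annotations, "EX:0002"))
```

Because annotations propagate upward through the hierarchy, a query on a general term also returns genes annotated to its more specific descendants.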