Bioinformatics is the recording, annotation, storage, analysis, and searching/retrieval of nucleic acid sequences (genes and RNAs), protein sequences and structural information. This includes databases of the sequences and structural information as well as methods to access, search, visualize and retrieve the information. Bioinformatics concerns the creation and maintenance of databases of biological information whereby researchers can both access existing information and submit new entries.
The following text is taken from Suresh Kumar, Bioinformatics Web - Comprehensive educational resource on Bioinformatics, 6th May 2005.
Bioinformatics has evolved into a full-fledged multidisciplinary subject that integrates developments in information and computer technology as applied to Biotechnology and Biological Sciences. Bioinformatics uses computer software tools for database creation, data management, data warehousing, data mining and global communication networking.
Functional genomics, biomolecular structure, proteome analysis, cell metabolism, biodiversity, downstream processing in chemical engineering, and drug and vaccine design are some of the areas in which Bioinformatics is an integral component.
Sub-disciplines within bioinformatics
There are three important sub-disciplines within bioinformatics involving computational biology:
The development of new algorithms and statistics with which to assess relationships among members of large data sets
The analysis and interpretation of various types of data, including nucleotide and amino acid sequences, protein domains, and protein structures
The development and implementation of tools that enable efficient access and management of different types of information
Activities in bioinformatics
We can split the activities in bioinformatics into two areas: (1) the organization and (2) the analysis of biological data.
Organization activity in Bioinformatics
The creation of databases of biological information
The maintenance of these databases
Analysis activity in Bioinformatics
Development of methods to predict the structure and/or function of newly discovered proteins and structural RNA sequences.
Clustering protein sequences into families of related sequences and the development of protein models.
Aligning similar proteins and generating phylogenetic trees to examine evolutionary relationships
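As a rough illustration of what "aligning similar proteins" involves, the sketch below computes a global alignment score by dynamic programming (the Needleman-Wunsch approach). The function name and the match, mismatch and gap values are assumptions chosen for illustration; practical tools typically score residue pairs with a substitution matrix and use more elaborate gap penalties.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Score of the best global alignment of sequences a and b.

    The scoring values are illustrative assumptions, not those of any tool.
    """
    # prev is the previous row of the dynamic-programming table.
    # Row 0 aligns the empty prefix of a against b[:j] using only gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        # Column 0 aligns a[:i] against the empty prefix of b.
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            curr.append(max(
                prev[j - 1] + pair,  # align a[i-1] with b[j-1]
                prev[j] + gap,       # a[i-1] aligned with a gap
                curr[j - 1] + gap,   # b[j-1] aligned with a gap
            ))
        prev = curr
    return prev[-1]
```

Pairwise scores of this kind can be turned into distances between sequences, which is one common starting point for clustering sequences into families or building a tree.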
Aims of Bioinformatics
The aims of bioinformatics are basically three-fold. They are:
Organization of data in such a way that it allows researchers to access existing information and to submit new entries as they are produced. While data creation is an essential task, the information stored in these databases is useless unless analysed. Thus the purpose of bioinformatics extends well beyond mere volume control.
Development of tools and resources that help in the analysis of data. For example, having sequenced a particular protein, it is of interest to compare it with previously characterized sequences. This requires more than just a straightforward database search. As such, programs such as FASTA and PSI-BLAST must consider what constitutes a biologically significant resemblance. Development of such resources requires extensive knowledge of computational theory, as well as a thorough understanding of biology.
Use of these tools to analyse individual systems in detail, and frequently to compare them with a few that are related.
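To give a feel for why sequence comparison is more than a plain text search, here is a minimal sketch of a word-based lookup, loosely in the spirit of the short-word (k-tuple) matching that FASTA uses as a first step. The function names, the toy database and the choice of k = 3 are all assumptions made for illustration; the sketch does no alignment, scoring matrix or significance statistics, which are where the real programs decide what counts as a biologically meaningful match.

```python
from collections import defaultdict

def build_word_index(database, k=3):
    """Map every length-k word to the names of the entries that contain it."""
    index = defaultdict(set)
    for name, seq in database.items():
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].add(name)
    return index

def rank_by_shared_words(query, index, k=3):
    """Count distinct query words found in each entry, best match first."""
    counts = defaultdict(int)
    query_words = {query[i:i + k] for i in range(len(query) - k + 1)}
    for word in query_words:
        for name in index.get(word, ()):
            counts[name] += 1
    # Sort by descending count, then by name for a stable order.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))

# Hypothetical toy entries, not real database records.
toy_database = {
    "protA": "MKTAYIAKQR",
    "protB": "MKTAYLLKQR",
    "protC": "GGGSSGGGSS",
}
toy_index = build_word_index(toy_database)
hits = rank_by_shared_words("KTAYIAK", toy_index)
```

Here `hits` ranks protA ahead of protB and omits protC, which shares no words with the query. A ranking like this only narrows the candidates; judging significance still needs proper alignment and statistics.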
Three levels of bioinformatics
Analysis of a single gene (protein) sequence.
Similarity with other known genes
Phylogenetic trees; evolutionary relationships
Identification of well-defined domains in the sequence
Sequence features (physical properties, binding sites, modification sites)
Prediction of subcellular localization
Prediction of secondary and tertiary structure
Analysis of complete genomes.
Which gene families are present, which missing?
Location of genes on the chromosomes, correlation with function or evolution
Expansion/duplication of gene families
Presence or absence of biochemical pathways
Identification of "missing" enzymes
Large-scale events in the evolution of organisms
Analysis of genes and genomes with respect to functional data.
Expression analysis; microarray data; mRNA concentration measurements
Proteomics; protein concentration measurements, covalent modifications
Comparison and analysis of biochemical pathways
Deletion or mutant genotypes vs. phenotypes
Identification of essential genes, or genes involved in specific processes
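As a small example from the first level above (sequence features such as modification sites), the sketch below scans a protein sequence for the N-glycosylation sequon, that is Asn followed by any residue except Pro and then Ser or Thr. The function name and the test sequence are assumptions for illustration, and a pattern match only flags a candidate site; it does not show that the residue is actually modified.

```python
import re

# Asn, then any residue except Pro, then Ser or Thr.
# The lookahead makes each match zero-width so overlapping sites are reported.
SEQUON = re.compile(r"(?=N[^P][ST])")

def find_sequons(protein):
    """Return the 1-based positions of the Asn in each candidate sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(protein.upper())]

# Hypothetical test sequence, not a real protein.
sites = find_sequons("MNGSANPTNNSS")
```

In this made-up sequence the Asn at position 6 is skipped because it is followed by Pro, while positions 2, 9 and 10 are reported.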
Bioinformatics and its scope
Bioinformatics uses advances in computer science, information science, computer and information technology, and communication technology to solve complex problems in the life sciences, particularly in biotechnology. Data capture, data warehousing and data mining have become major issues for biotechnologists and biological scientists because of the sudden growth in quantitative data in biology, such as complete genomes of biological species (including the human genome), protein sequences, protein 3-D structures, metabolic pathway databases, cell line and hybridoma information, and biodiversity-related information. Advances in information technology, particularly the Internet, are being used to gather and access the ever-increasing information in biology and biotechnology.
Functional genomics, proteomics, the discovery of new drugs and vaccines, molecular diagnostic kits and pharmacogenomics are some of the areas in which Bioinformatics has become an integral part of research and development. Knowledge of multimedia databases, and of tools for data analysis and for modeling molecules and biological systems on computer workstations as well as in a network environment, has become essential for any student of Bioinformatics. This multidisciplinary area has grown so much that it is now divided into molecular bioinformatics, organal bioinformatics and species bioinformatics.
Issues related to biodiversity and the environment, the cloning of higher animals such as Dolly and Polly, and the tissue culture and cloning of plants have shown that Bioinformatics is not only a support branch of science but also a subject that directs the future course of research in biotechnology and the life sciences. The importance and usefulness of Bioinformatics has been recognized by many industries in the last few years. Therefore, large Bioinformatics R&D divisions are being established in many pharmaceutical companies, biotechnology companies and even in other conventional industries dealing with biologicals.
Bioinformatics is thus rated as the number one career in the field of biosciences.
In short, Bioinformatics deals with database creation, data analysis and modeling. Data capture is done not only from printed material but also from network resources. Databases in biology are generally in multimedia form, organized in a relational database model. Modeling is done not only on single biological molecules but also on multiple systems, thus requiring the use of high-performance computing systems.
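The relational organization mentioned above can be sketched with Python's built-in sqlite3 module. The table layout, column names, helper functions and accession values below are assumptions invented for this example and do not reflect the schema of any real sequence database; the point is only to show submitting new entries and retrieving existing ones.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute(
    """CREATE TABLE sequence (
           accession TEXT PRIMARY KEY,
           organism  TEXT NOT NULL,
           molecule  TEXT NOT NULL CHECK (molecule IN ('DNA', 'RNA', 'protein')),
           residues  TEXT NOT NULL
       )"""
)

def submit_entry(accession, organism, molecule, residues):
    """Add a new record; the primary key rejects duplicate accessions."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO sequence VALUES (?, ?, ?, ?)",
            (accession, organism, molecule, residues),
        )

def fetch_by_organism(organism):
    """Return (accession, residues) pairs for one organism, sorted by accession."""
    rows = conn.execute(
        "SELECT accession, residues FROM sequence "
        "WHERE organism = ? ORDER BY accession",
        (organism,),
    )
    return rows.fetchall()

# Hypothetical records, made up for the example.
submit_entry("DEMO0001", "Homo sapiens", "protein", "MKTAYIAKQR")
submit_entry("DEMO0002", "Mus musculus", "DNA", "ATGGCGTAA")
human_entries = fetch_by_organism("Homo sapiens")
```

Submitting a second record with the accession DEMO0001 would raise sqlite3.IntegrityError, which is the kind of automatic check that keeps a shared database consistent as new entries arrive.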
The Potential of Bioinformatics
The potential of Bioinformatics in the identification of useful genes, leading to the development of new gene products, drug discovery and drug development, has led to a paradigm shift in biology and biotechnology: these fields are becoming more and more computationally intensive. The new paradigm now emerging is that all the genes will be known "in the sense of being resident in databases available electronically", and the starting point of a biological investigation will be theoretical: a scientist will begin with a theoretical conjecture and only then turn to experiment to follow up or test the hypothesis. With a much deeper understanding of biological processes at the molecular level, Bioinformatics scientists have developed new techniques to analyse genes on an industrial scale, resulting in a new area of science known as 'Genomics'.
The shift from gene biology has resulted in the development of strategies, from lab techniques to computer programmes, to analyse whole batches of genes at once. Genomics is revolutionizing drug development, gene therapy, and our entire approach to health care and human medicine.
Genomic discoveries are being translated into practical biomedical results through Bioinformatics applications. Work on proteomics and genomics will continue using highly sophisticated software tools and data networks that can carry multimedia databases. Thus, research will focus on the development of multimedia databases in various areas of life sciences and biotechnology. There will be an urgent need for the development of software tools for data mining, analysis and modelling, and downstream processing. Security of data, data transfer and data compression, and automatic checks on data accuracy and correctness will also be major research areas of bioinformatics. The use of virtual reality in drug design, metabolic pathway design, and unicellular organism design, paving the way to the design and modification of multicellular organisms, will be among the challenges that Bioinformatics scientists and specialists have to tackle. It has now been universally recognized that Bioinformatics is the key to the new, grand, data-intensive molecular biology that will take us into the 21st century.
Bioinformatics - Industry Overview
The Bioinformatics industry has grown to keep up with the information explosion, growing at 25-50% a year. In 2000, the US market research company Oscar Gruss estimated that the value of the Bioinformatics industry would touch $2 billion. Now it seems likely that this figure will be reached within the coming year. Demand for individuals capable of doing bioinformatics is soaring: industry's demand for scientists with skills in Bioinformatics far exceeds the supply of qualified specialists in the field. Therefore, companies are developing methods of spotting potential Bioinformatics experts and then training them on the job.
Bioinformatics and computational biology
Bioinformatics and computational biology each maintain close interactions with life sciences to realize their full potential. Bioinformatics applies principles of information sciences and technologies to make the vast, diverse, and complex life sciences data more understandable and useful. Computational biology uses mathematical and computational approaches to address theoretical and experimental questions in biology. Although bioinformatics and computational biology are distinct, there is also significant overlap and activity at their interface.
Biocomputing is often used as a catch-all term covering the whole area at the intersection of Biology and Computation, although many other terms are used to name the same area. We can distinguish the following (non-disjoint) sub-fields:
Bioinformatics - this includes management of biological databases, data mining and data modeling, as well as IT-tools for data visualization
Computational Biology - this includes efforts to solve biological problems with computational tools (such as modeling, algorithms, heuristics)
DNA computing and nano-engineering - this includes models and experiments to use DNA (and other) molecules to perform computations
Computations in living organisms - this is concerned with constructing computational components in living cells, as well as with studying computational processes taking place daily in living organisms
Computational Biology is the application of core technologies of computer science (e.g. algorithms, artificial intelligence, databases) to problems arising from biology. Computational biology is particularly exciting today because the problems are large enough to motivate efficient algorithms and, moreover, the demands of biology on computational science are increasing.
The most pressing tasks in bioinformatics involve the analysis of sequence information. Computational Biology is the name given to this process, and it involves the following:
Finding the genes in the DNA sequences of various organisms
Developing methods to predict the structure and/or function of newly discovered proteins and structural RNA sequences.
Clustering protein sequences into families of related sequences and the development of protein models.
Aligning similar proteins and generating phylogenetic trees to examine evolutionary relationships.
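The first task in this list, finding genes in DNA sequences, can be illustrated with a deliberately naive open reading frame (ORF) scan: look for an ATG start codon followed, in the same reading frame, by one of the standard stop codons. The function name and the min_codons parameter are assumptions for this sketch. It reads only the forward strand and ignores introns, alternative start codons and statistical signals, all of which practical gene finders have to deal with.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(dna, min_codons=3):
    """Return (start, end, frame) for forward-strand ATG...stop ORFs.

    start and end are 0-based with end exclusive; the span includes the
    stop codon, and min_codons counts the start and stop codons too.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        # Step through complete codons in this reading frame.
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first ATG opens a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # look for the next ORF in this frame
    return orfs
```

For example, in the made-up sequence GGATGCCCTGATT the only ORF is ATGCCCTGA, found in the third reading frame.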