Argonne's Christina Hueneke oversees a robot that produces 20 times as many protein clones as a traditional wet lab.

Biology and the Computer

Hundreds of computers have been pressed into service 24/7 at Argonne National Laboratory to analyze genomic information, one aspect of a relatively new field known as bioinformatics: the creation and advancement of computational techniques to solve problems posed by the analysis of biological data. It's a key component of Argonne's multimillion-dollar, multidisciplinary structural biology program, which provides guidance that can reduce the cost of identifying unique structures of medical and biotechnological significance.

The biology revolution came about with massive genome sequencing. More than 350 fully sequenced genomes are now publicly available and more than 780 are in the pipeline. Using bioinformatics tools and advanced computers, Argonne researchers can take a small amount of known information from one genome (for example, genes involved in energy production) and search for it in all other genomes. If the same sequences are found in other genomes, that suggests that those organisms have a similar physiology. In structural studies, comparison of protein sequences with those from proteins of known structure allows researchers to make conjectures about the presence of novel folds in proteins with no similarity to what is known. Comparative analysis of pathogenic and nonpathogenic Mycobacterium species implicated several genes in causing disease. Knowing these genes, researchers can seek them in other genomes; when found, their presence tells the researchers that the organism is potentially capable of causing disease. Medical researchers can use this data to develop treatments.
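The cross-genome search described above can be sketched in a few lines of Python. This is only a toy illustration: the gene names, organism names and sequences are invented, and exact substring matching stands in for the similarity-based sequence comparison that real bioinformatics tools perform. It is not a description of Argonne's software.

```python
# Toy sketch of searching for known genes across several genomes.
# All names and sequences below are invented for illustration only.
# Exact substring matching is a simplification; real tools score
# approximate similarity between sequences.

def find_genes_in_genomes(known_genes, genomes):
    """Return {genome_name: sorted list of known gene names found in it}."""
    hits = {}
    for genome_name, genome_seq in genomes.items():
        found = [
            gene_name
            for gene_name, gene_seq in known_genes.items()
            if gene_seq in genome_seq
        ]
        hits[genome_name] = sorted(found)
    return hits


# Hypothetical "known" genes taken from one genome.
known_genes = {
    "energy_gene_A": "ATGGCCTTA",
    "energy_gene_B": "GGATCCGTA",
}

# Hypothetical genomes to search.
genomes = {
    "organism_1": "TTTATGGCCTTACCCGGATCCGTAAA",
    "organism_2": "CCCATGGCCTTAGGG",
    "organism_3": "AAAAAAAAAAAA",
}

hits = find_genes_in_genomes(known_genes, genomes)
for name, found in hits.items():
    print(name, "->", found if found else "no known genes found")
```

In this sketch, an organism that shares both genes with the reference would be a candidate for having a similar physiology, while one that shares none would not.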

Structural genomics attempts to provide the link between the vast amount of genomic sequence data that is being generated and the structure of the proteins that these genomes encode. The eventual goal is to obtain a complete database of unique folds in proteins so that they can be used to predict the three-dimensional structures of all proteins from their gene sequences. The work involves many steps that use high-throughput methodology, including:

- The selection of gene sequence targets
- The cloning and expression of target polypeptides
- The crystallization of purified proteins by robots
- X-ray data collection
- Structure determination of phased data sets
- A structural genomics understanding of the resulting structures

A thorough understanding of the activity of a molecular system requires knowledge of the three-dimensional structure of the macromolecules involved in the process. Argonne is pioneering cutting-edge technologies to determine the three-dimensional structures of macromolecules and carry out comprehensive functional characterization of biomolecules.

Recent developments in genome-scale DNA sequencing, high-throughput analytical tools and computing technology have made the genome-wide analysis of biomolecular function feasible. Construction of complete functional maps of cellular behavior now appears to be achievable. Functional analysis of the thousands of proteins and other macromolecules needed for a comprehensive analysis of even the simplest prokaryote is a significant technological challenge that will require substantial enhancement of currently available experimental and computational capabilities. The amount of data needed to functionally characterize an organism greatly exceeds that required to sequence its genome. Furthermore, unlike genome sequencing, functional analysis requires multiple high-throughput experimental technologies and novel computational approaches.

The comprehensive characterization of biomolecular function has huge potential payoffs, providing the basis for developing entirely new strategies for modulating cellular activities and engineering novel cellular capabilities, enabling major benefits for environmental management, human health and general economic productivity.

Bioinformatics, computation and comparative analysis will drive rapid advances and collaboration among experts in biology, biophysics, microbiology and computer science. To address these needs, Argonne has created the first formal organization of Computing and Life Sciences, a new directorate headed by Rick Stevens. The directorate brings together two Argonne research divisions, the Biosciences Division and the Mathematics and Computer Science Division, along with the Computation Institute and the Institute for Genomics and Systems Biology. This new organizational structure will improve Argonne's ability to respond to current and emerging initiatives, national needs and opportunities in computational science and engineering, computer science, applied mathematics and structural and systems biology.

Argonne's Advanced Biosciences Initiative will couple the laboratory's capabilities in advanced computation and computer science with leadership at Argonne and The University of Chicago in structural biology and high-throughput life science to make Argonne a leader in four important areas of systems biology:

- Biological solutions for sustainable energy supplies and environmental management
- High-throughput facilities for protein production, genome-scale proteomics, protein engineering, and single-cell assays
- Integrated biological databases that enable the analysis and the construction of whole-cell models
- An integrated computational environment that supports high-throughput analysis of environmental genomics

This initiative could help lead to a number of practical applications in energy production, medicine and environmental stewardship, such as:

- Genome engineering that improves the conversion of biomass to transportation fuels
- Designer enzymes for new medical treatments and industrial processes
- Metabolic pathway engineering that optimizes the production of useful chemical feedstocks and the biodegradation of undesirable waste products
- New materials that combine nanotechnology with biomolecules for sensors, diagnostics and therapeutics
- The systematic use of genomic information to monitor and protect the health of the environment
- Improved understanding of biological pathogens and how they work

Argonne's leading research in the structure and function of proteins and other large-scale biological molecules, in the development of novel techniques for the expression and purification of membrane proteins, and in the study of the photoreaction center and protein structures through which plants collect and use energy will produce breakthrough synergies that address major national scientific and technological issues in the biosciences.

The Advanced Biosciences Initiative addresses important national needs for improving energy production and supply, health and national security. These needs include:

- Large-scale computing and data analysis to address core data problems in biology and medicine
- Systems and synthetic biology technologies to address energy production and environmental stewardship
- Genome-scale computational models of microbial cells

The Computation Institute plays a key role in life sciences and medicine. Computational approaches to problem solving have proven their worth in many fields of science, allowing the collection and analysis of unprecedented quantities of data, and the exploration via simulation of previously obscure phenomena. In principle, service-oriented approaches can have a transformative effect on scientific communities, allowing tools formerly accessible only to the specialist to be made available to all, and permitting previously manual data-processing and analysis tasks to be automated.

Argonne's Grid computing power enables researchers to perform in one week comparisons that would take 18 months with a single computer. Grid resources have already proven themselves indispensable to computational biology through work done at Argonne in running automated, large-scale genome analysis of public data sources and the assimilation of results of these analyses into an integrated database that provides important public resources to the community.
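To put the one-week versus 18-month comparison in perspective, the implied speedup can be worked out with simple arithmetic. The sketch below assumes 52 weeks per year and that both figures describe the same workload; the resulting factor is a back-of-the-envelope estimate, not a number reported by Argonne.

```python
# Rough speedup implied by "one week on the Grid vs. 18 months on one computer".
# Assumptions (not from the article): 52 weeks per year, identical workload.

WEEKS_PER_YEAR = 52
MONTHS_PER_YEAR = 12

single_computer_months = 18  # figure quoted in the article
grid_weeks = 1               # figure quoted in the article

# Convert 18 months to weeks, then divide by the Grid's one week.
single_computer_weeks = single_computer_months * WEEKS_PER_YEAR / MONTHS_PER_YEAR
speedup = single_computer_weeks / grid_weeks

print(f"18 months is about {single_computer_weeks:.0f} weeks")
print(f"Implied speedup: roughly {speedup:.0f}x")
```

Under these assumptions, the Grid finishes the comparisons roughly 78 times faster than a single machine.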

Scientists at the University of Chicago and Argonne have constructed a computer simulation that allows them to study the relationship between biochemical fluctuations within a single cell and the cell's behavior as it interacts with other cells and its environment. The simulation, called AgentCell, has possible applications in cancer research, drug development and combating bioterrorism. Other simulations of biological systems are limited to the molecular level, the single-cell level or the level of bacterial populations. AgentCell can simulate the behavior of entire populations of cells as they sense their environment, respond to stimuli and move in a three-dimensional world.

Argonne's contributions to bioinformatics include databases and analytical tools that guide biological research. The main database, PUMA2, combines information from 22 databases. The team has also developed "Pathos" and "Chisel," software tools that work with PUMA2 to find specific genomic sequences. Pathos is a database for biodefense research. It contains all publicly available genomes of pathogens, including anthrax and plague. Chisel enables identification of eukaryotic and bacterial versions of the same enzyme functions.

Industrial technology development is an important activity in moving the benefits of Argonne's publicly funded research to industry, helping strengthen the nation's technology base. Within the last 15 years, Argonne has produced dozens of spinoff companies and more than 750 patents.

Eleanor Taylor is a communications specialist at Argonne National Laboratory.