October / November 06 Volume 4 Number 5

Understanding Proteins

Proteins are the workhorses of the cell. Understanding how proteins fold into their correct shapes, or misfold into insoluble aggregates, is key to understanding many diseases, since virtually all human diseases are the result of protein dysfunction.

For example, tuberculosis is one of the oldest diseases known to mankind, found in the skeletal remains of prehistoric humans. While many people might think that tuberculosis is nearly extinct, it is in fact one of the top four most infectious diseases in the world today, killing more than 50 percent of its victims (over two million people annually).

Since the early 1900s people have known about Alzheimer's disease and yet it is still an unsolved mystery. Today, more than 4.5 million Americans have Alzheimer's disease, and that number will more than double by 2050.

Since these diseases were discovered, there has been a constant effort to find their cures. Researchers are still actively trying to develop cures by studying the structures of the proteins associated with these diseases, but these proteins are very difficult to study because of their physical properties. A typical method for studying proteins is to tag them with a reporter (a tool to examine the expression of a protein of interest). This allows scientists to better understand a protein and its characteristics by telling them when the protein is expressed and where it is located in the cell.

Green fluorescent protein, or GFP, is one of the most common tools used to report on a target protein's temporal and spatial expression, allowing researchers to study when and where a specific protein is being expressed. Since proteins are the workhorses of our cells, performing virtually all the functions in our body, GFP is one of the most powerful reporters available to scientists.

Yet, despite its usefulness, GFP is subject to several significant barriers:
• GFP can only provide information on a protein that is expressed and folded correctly. If there is no fluorescence, it can mean either that the protein has not been expressed, or that it is not folded correctly.
• GFP can sometimes report on only a part of the target protein, providing incorrect data, or false positives, which can be a costly and time-consuming mistake.
• GFP itself is a large protein consisting of over two hundred amino acids; this alone can cause problems such as interference in the folding of the target protein.

Hence, existing GFP technology can only provide correct data for proteins that are expressed and folded correctly. It therefore cannot be used to report on dysfunctional proteins, yet protein dysfunction is at the root of many diseases such as TB and Alzheimer's.

Now, Geoffrey S. Waldo, a scientist at Los Alamos National Laboratory (LANL), and his colleagues have developed a toolbox of interrelated, GFP-based technologies for studying target proteins that overcomes these three primary barriers to GFP's use as a reporter. The pharmaceutical industry could benefit immensely from LANL's GFP toolbox. Typically, pharmaceutical companies have developed small-molecule drugs that do not interact solely with their intended target but instead bind to many non-specific proteins, causing unwanted side effects.

Although this may seem like an insignificant problem, because the drugs that make it to market typically have minor side effects, the vast majority of drugs never get there. That's because side effects can cause severe damage to the patient, costing big pharma millions of dollars invested in drugs that never produce any revenue.

Recently, several biotechs such as Genentech and Amgen have started using protein-based drugs called monoclonal antibodies, or mAb. These drugs are designed to interact only with their target protein and thus eliminate most side effects. Despite success with mAb treatments, one major problem remains: the proteins used must be soluble. An insoluble protein can cause many internal problems such as kidney failure. The LANL GFP toolbox can identify mutations that make mAb drugs more soluble.

"LANL's GFP toolbox can help scientists determine folding and solubility characteristics of their protein or drug target, protein localization, and the interactions of proteins in protein complexes both in vitro and in vivo," said Mina Stemm, a business development executive for Los Alamos' Technology Transfer Division. "The GFP toolbox addresses problems common to many protein scientists. Nobody wants to work with a highly insoluble or unfolded protein: it either crashes out of solution or gets chewed up by the cell's machinery. This technology allows for creation of proteins that are less problematic."

Green fluorescent proteins were first discovered in the 1960s at the University of Washington in the jellyfish Aequorea victoria, and the GFP gene was first cloned in 1992. However, what exactly GFPs were going to be used for was somewhat of a mystery until the 1980s and 1990s. Since then, GFPs have been a popular focus for scientists and researchers interested in learning more about the movement of proteins in cells.

LANL's Waldo first began researching GFPs when he was studying a protein called ferritin (which stores iron in the body).

"I had been studying x-ray scattering by metals in proteins and ferritin was a protein that was particularly poorly folded," he said. "There weren't any high-throughput tests for ferritin activity and I wanted to make better folded versions of it, so it occurred to me to try to use the fluorescence of correctly folded GFP to tell about the folding of ferritin."

While using GFP, Waldo and his team realized that, because of the complex nature of proteins, there were many things it could not do. So the team created additional innovations, now called the GFP toolbox, that include Superfolder GFP, Insertion GFP, and Split GFP.

Superfolder GFP improves upon existing GFP by reporting or measuring a target protein's expression, independent of the target protein's folding. Superfolder GFP will fluoresce as long as the target protein is being expressed, regardless of how well folded it is, providing an entirely new realm of reporting available to scientists. Superfolder GFP is very well folded and can serve as the starting point for making new GFP technologies, such as pH sensors, that require this robust folding.

The Insertion GFP is a variation of GFP that attaches at both the N-terminus (beginning of the amino acid chain) and the C-terminus (end of the amino acid chain), thus placing the target protein inside the GFP. GFP is meant to attach only to the C-terminus; however, some proteins have tagging problems because they have internal places where GFP can attach. This results in false positives because GFP is only reporting on a small portion of the target protein. Use of the Insertion GFP helps to avoid these false positives.

The Split GFP solves the problem of large tags by attaching only a portion of GFP to the target protein. Because GFP is a fairly large tag, 238 amino acids long, it can alter the behavior of target proteins, which in turn can cause problems within the cell.

"Stephanie Cabantous [a scientist on Waldo's team at Los Alamos] worked