Disclaimer: I’m not a biologist. I’ll explain things to the best of my knowledge, but errors are very possible. If you find some, please let me know in the comments and I’ll fix them. Thank you.
Proteins are the building blocks of life.
According to a recent Nature article (which I’ve also cited here), “the human genome contains some 25,000 protein-encoding genes, [but because genes can code for more than one protein, and the products described in genes can be modified after being translated into protein,] a given person’s various cells might use up to a million different proteins to do different things at different times in the course of a life.” So there are lots of proteins that we need to study to better understand life.
There are two general ways to study the structure of proteins (a protein’s structure gives us important information about how it performs its function):
The Experimental Way
The experimental way is itself divided into at least two techniques (there may be more).
Image: A diffractometer. GFDL license.
First is X-ray crystallography. It is a long and expensive process, and unfortunately not all proteins will crystallize (which is why nuclear magnetic resonance (NMR) spectroscopy is important):
The technique of X-ray crystallography has three basic steps. The first and generally most difficult step is to produce an adequate crystal of the molecule(s) under study. The crystal must be sufficiently large, pure in composition and regular in structure, with no large internal imperfections such as cracks. In the second step, the crystal is placed in an intense beam of X-rays of a single wavelength, producing a series of spots called reflections. As the crystal is gradually rotated, previous reflections disappear and new ones appear; the intensity of every spot is recorded meticulously at every orientation of the crystal. Multiple data sets may have to be collected, with each covering a full rotation of the crystal and containing tens of thousands of reflection intensities. In the third step, these data are combined computationally with prior chemical information about the molecular structure to produce the atomic resolution model.
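The reflections described in the second step are governed by Bragg’s law, n·λ = 2d·sin θ, which relates the X-ray wavelength and reflection angle to the spacing between planes of atoms in the crystal. As a rough sketch (the wavelength and angle below are illustrative values, not figures from the article):

```python
import math

def bragg_spacing(wavelength_nm: float, theta_deg: float, order: int = 1) -> float:
    """Lattice plane spacing d from Bragg's law: n * lambda = 2 * d * sin(theta)."""
    return order * wavelength_nm / (2.0 * math.sin(math.radians(theta_deg)))

# Example: copper K-alpha X-rays (~0.154 nm) producing a reflection at 15 degrees
# implies planes of atoms spaced roughly 0.3 nm apart.
print(round(bragg_spacing(0.154, 15.0), 3))
```

Recording thousands of such reflections at many crystal orientations is what lets the third step reconstruct where the atoms must sit.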
The second is protein nuclear magnetic resonance (NMR) spectroscopy, which is what MIT researchers have recently improved. First, let’s look at the “old model”:
Image: Pacific Northwest National Laboratory 800 MHz NMR Spectrometer. Public Domain license.
Traditional NMR uses coils to detect the radio-frequency signals produced by some atoms, including hydrogen and carbon, when they are exposed to a magnetic field. […] Because the radio-frequency signals that NMR spectroscopy relies on are very weak, large samples are needed to perform experiments. The instruments also require large, powerful magnets, which contribute to their size and expense. Hence, biochemists have had limited access to the machines.
So NMR spectrometers are big, expensive and their access is limited. This could change:
This NMR probe, which is smaller than a credit card, greatly increases the analytical technique’s sensitivity. The probe uses a cheap, simple piece of copper similar to a cell-phone antenna. Credit: Yael Maguire, Technology Review
MIT researchers have significantly increased the sensitivity of nuclear magnetic resonance (NMR) spectroscopy […] The MIT method, which relies on a new kind of magnetic probe, could cut down the time it takes to perform these tests by a factor of 100 […]
the MIT researchers fabricated a highly sensitive NMR probe out of a flat strip of copper similar to the antennas in laptops and cell phones. “It’s simple to fabricate,” says Maguire. “The same companies that make antennas can make these.” A quick cut with a laser creates a small hole out of which a magnetic field can flow. […]
So far, the MIT researchers have used the probe to confirm known structures. In tests on a protein called ribonuclease, they were able to use 3,000 times less of the compound than is normally required to perform NMR spectroscopy; in tests on sucrose, they used 10,000 times less.
Sounds great, eh?
I hope that this new tool will be widely available soon and that it will allow more (and better) experimental protein research to be done. Combined with the progress being made in the second way to study the shape and structure of proteins, it could seriously accelerate research in many fields.
The Computational Way
There are two important things we know about proteins that can help us simulate their structures in computer models.
It’s long been recognized that for most proteins the native state is at a thermodynamic minimum. In plain English, that means the unique shape of a protein is the most stable state it can adopt. Picture a ball in a funnel: the ball will always roll down to the bottom of the funnel, because that is its most stable state.
The sequence of amino acids is sufficient to determine the native state of a protein. By virtue of their different chemical properties, some amino acids are attracted to each other (for example, oppositely charged amino acids) and so will associate; other amino acids will try to avoid water (because they are greasy) and so will drive the protein into a compact shape that excludes water from contacting most of the amino acids that “hide” in the core of this compacted protein.
But that doesn’t mean it’s easy to do. Small proteins can have 100 amino acids, and some large human proteins can have up to 1,000. This makes for an astronomical number of possible conformations, and trying all of them to find the one with the lowest energy would require massive amounts of computing power.
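To get a feel for how astronomical, here is a Levinthal-style back-of-the-envelope estimate. The figure of roughly three local backbone states per amino acid is a common textbook simplification, not a number from this article:

```python
import math

CONFS_PER_RESIDUE = 3  # assumption: ~3 local backbone states per amino acid

def search_space_magnitude(n_residues: int) -> int:
    """Order of magnitude (power of ten) of the conformational search space
    for a chain of n_residues amino acids under the toy model."""
    return int(n_residues * math.log10(CONFS_PER_RESIDUE))

for n in (10, 100, 1000):
    print(f"{n:>4} residues -> ~10^{search_space_magnitude(n)} conformations")
```

Even a 100-residue protein lands around 10^47 conformations, far beyond brute-force enumeration, which is why the search must be pruned cleverly.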
The Rosetta philosophy is to use both an understanding of the physical-chemical properties of different types of amino acid interactions and a knowledge of which local conformations short stretches of amino acids within a protein are likely to adopt, in order to limit the search space and to evaluate the energy of different possible conformations. By sampling enough conformations, Rosetta can find the lowest-energy, most stable native structure of a protein.
So in this case, both the clever design of the software to limit search space and the brute force of distributed computing are required.
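A standard way to sample conformations without enumerating them all is Metropolis Monte Carlo: take small random moves, always accept downhill moves, and occasionally accept uphill ones so the search can escape local traps. Rosetta’s actual search is far more sophisticated, so the following is only a toy sketch on a one-dimensional “funnel” energy landscape:

```python
import math
import random

def energy(x: float) -> float:
    """Toy funnel-shaped energy landscape with a single minimum at x = 2."""
    return (x - 2.0) ** 2

def metropolis_search(steps: int = 20000, temperature: float = 0.5,
                      seed: int = 42) -> float:
    """Metropolis Monte Carlo: accept downhill moves always, uphill moves
    with probability exp(-delta_E / T). Returns the best point found."""
    rng = random.Random(seed)
    x = rng.uniform(-10.0, 10.0)            # random starting "conformation"
    best_x, best_e = x, energy(x)
    for _ in range(steps):
        candidate = x + rng.gauss(0.0, 0.3)  # small local move
        delta = energy(candidate) - energy(x)
        if delta < 0 or rng.random() < math.exp(-delta / temperature):
            x = candidate
            if energy(x) < best_e:
                best_x, best_e = x, energy(x)
    return best_x

print(round(metropolis_search(), 1))  # converges near the minimum at x = 2
```

In a real protein the “coordinate” is thousands of backbone and side-chain angles rather than one number, which is exactly why the knowledge-based pruning described above, plus distributed computing, are both needed.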
The potential benefits of computational biology are immense: it can save time and resources on the experimental side and keep the focus on already-promising projects, and it lets researchers go through a lot more data than would otherwise be possible (Moore’s Law helps). Best of all, it makes it possible to design new proteins.
Why is that important? Because right now, a lot of the drugs that are produced by the pharmaceutical industry are more discovered than designed. They test thousands of compounds selected among potential candidates until they find one that gives the desired result without too many side-effects. It’s a bit like going out in the jungle, coming back with lots of different plants in the hope that one of them will do what you want it to. It’s better than nothing, but it’s expensive, inelegant and limited. We know we can do better.
Once we are far enough in our understanding of how proteins work, we’ll be able to design our own that will have exactly the effects that we want them to have, possibly with no negative side-effects. This could allow us to design cures for diseases that are currently deadly and stop a lot of suffering, but also to do things like design tools to break down toxic waste and separate it into benign elements, absorb CO2 from the atmosphere and make it easy to sequester, turn the sun’s energy into fuel more efficiently than photosynthesis, etc.
- Better Pictures of Proteins at MIT’s Technology Review
- Rosetta@home Science FAQ – by Vanita Sood
- Protein at Wikipedia
- X-ray crystallography at Wikipedia
- Protein nuclear magnetic resonance spectroscopy at Wikipedia
- Amino Acids at Wikipedia