Explainer: What are genes?
These stretches of DNA are the instruction manuals for making proteins, which do a cell’s work
Genes are the blueprints for building the chemical machinery that keeps cells alive. That’s true for humans and all other forms of life. But did you know that with 20,000 genes, people have almost 11,000 fewer genes than water fleas? If the number of genes doesn’t predict complexity, what does?
The answer is that our genetic material contains much more than the units we call genes. Just as important are the switches that turn a gene on and off. And how cells read and interpret genetic instructions is far more complex in people than in those water fleas.
Genes and the switches that control them are made of DNA. That’s a long molecule resembling a spiral ladder. Its shape is known as a double helix. A total of three billion rungs connect the two outer strands, the upright supports, of this ladder. We call the rungs base pairs because each one is made of a pair of chemicals, called bases. Scientists refer to each base by its initial: A (adenine), C (cytosine), G (guanine) and T (thymine). A always pairs with T; C always pairs with G.
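The pairing rule is simple enough to write as a tiny program. The sketch below is only an illustration: the function name `partner_strand` and the sample letters are made up for this example. It looks up the partner of each letter on one side of the ladder to work out the letters on the other side.

```python
# Pairing rules from the text: A always pairs with T, C always pairs with G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}

def partner_strand(strand):
    """Return the letter sitting across each rung, in the same left-to-right order."""
    return "".join(PAIRS[base] for base in strand)

print(partner_strand("ATGC"))  # prints TACG
```

Because each letter has exactly one partner, knowing one side of the ladder is enough to rebuild the other.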
In human cells, the double-stranded DNA doesn’t exist as one gigantic molecule. It’s split into smaller chunks called chromosomes (KROH-moh-soams). These are packaged into 23 pairs per cell. That makes 46 chromosomes in total. Together, all of the DNA on our 46 chromosomes, including its 20,000 genes, is referred to as the human genome.
The role of DNA is similar to the role of the alphabet. It has the potential to carry information, but only if the letters are combined in ways that make meaningful words. Stringing words together makes instructions, as in a recipe. So genes are instructions for the cell. Like instructions, genes have a “start.” Their string of base pairs must follow in a specific order until they reach some defined “end.”
If genes are like a basic recipe, alleles (Ah-LEE-uhls) are versions of that recipe. For instance, the alleles of the “eye color” gene give directions for making eyes blue, green, brown and so on. We inherit one allele, or gene version, from each of our parents. That means most of our cells contain two alleles of each gene, one on each chromosome of a pair.
But we aren’t exact copies of our parents (or siblings). The reason: Before we inherit them, alleles are shuffled like a deck of cards. This happens when the body makes egg and sperm cells. They are the only cells with just one version of each gene (instead of two), packaged into 23 chromosomes. Egg and sperm cells will fuse in a process known as fertilization. This starts the development of a new person.
By combining two sets of 23 chromosomes — one set from the egg, one set from the sperm cell — that new person ends up with the usual two alleles and 46 chromosomes. And her unique combination of alleles will never arise in the exact same way again. It’s what makes each of us unique.
A fertilized cell needs to multiply to make all of a baby’s organs and body parts. To multiply, a cell splits into two identical copies. The cell uses the instructions on its DNA and the chemicals in the cell to produce an identical DNA copy for the new cell. Then the process repeats itself many times as one cell copies to become two. And two copy to become four. And so on.
To make organs and tissues, the cells use the instructions on their DNA to build tiny machines. They control reactions between chemicals in the cell that eventually produce organs and tissues. The tiny machines are proteins. When a cell reads a gene’s instructions, we call it gene expression.
How does gene expression work?
Gene expression relies on helper molecules. These interpret a gene’s instructions to make the right types of proteins. One important group of those helpers is known as RNA. It’s chemically similar to DNA. One type of RNA is messenger RNA (mRNA). It’s a single-stranded copy of the double-stranded DNA.
Making mRNA from DNA is the first step in gene expression. That process is known as transcription and happens inside a cell’s core, or nucleus. The second step, called translation, takes place outside of the nucleus. It turns the mRNA message into a protein by assembling the appropriate chemical building blocks, known as amino (Ah-MEE-no) acids.
All human proteins are chains with different combinations of 20 amino acids. Some proteins control chemical reactions. Some carry messages. Still others function as building materials. All organisms need proteins so that their cells can live and grow.
To build a protein, molecules of another type of RNA, called transfer RNA (tRNA), line up along the mRNA strand. Each tRNA carries a three-letter sequence on one end and an amino acid on the other. For example, the tRNA that matches the mRNA sequence GCG always carries the amino acid alanine (AL-uh-neen). The tRNAs match up their sequence with the mRNA sequence, three letters at a time. Then, another helper molecule, known as a ribosome (RY-boh-soam), joins the amino acids on the other end to make the protein.
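The two steps of gene expression can be sketched in a few lines of code. This is a simplified illustration, not a model of the real cell machinery. It assumes we start from the DNA strand whose letters match the mRNA, and it uses the fact that RNA writes U (uracil) where DNA has T. The small lookup table holds only five of the 64 three-letter words in the real genetic code. The function names and the sample sequence are invented for the example.

```python
# A tiny slice of the genetic code: mRNA three-letter words and what they mean.
CODON_TABLE = {
    "AUG": "Met",   # methionine, also the usual "start" signal
    "GCG": "Ala",   # alanine, the example from the text
    "AAA": "Lys",   # lysine
    "UGG": "Trp",   # tryptophan
    "UAA": "STOP",  # tells the ribosome the protein is finished
}

def transcribe(coding_dna):
    """Step 1 (transcription): copy DNA letters into mRNA, writing U instead of T."""
    return coding_dna.replace("T", "U")

def translate(mrna):
    """Step 2 (translation): read the mRNA three letters at a time."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "STOP":
            break
        protein.append(amino_acid)
    return protein

mrna = transcribe("ATGGCGAAATGGTAA")
print(mrna)             # prints AUGGCGAAAUGGUAA
print(translate(mrna))  # prints ['Met', 'Ala', 'Lys', 'Trp']
```

Reading in steps of three is the key idea: shift the starting point by one letter and every word after it changes.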
One gene, several proteins
Scientists first thought that each gene held the code to make one protein only. They were wrong. Using the RNA machinery and its helpers, our cells can make way more than 20,000 proteins from their 20,000 genes. Scientists don’t know exactly how many more. It could be a few hundred thousand — perhaps a million!
How can one gene make more than one type of protein? Only some stretches of a gene, known as exons, code for amino acids. The regions in between them are introns. Before the mRNA leaves a cell’s nucleus, helper molecules remove its introns and stitch together its exons. Scientists refer to this as mRNA splicing.
The same mRNA may be spliced in different ways. This often happens in different tissues (perhaps skin, the brain or the liver). It’s like the readers “speak” different languages and interpret the same DNA message in multiple ways. That’s one way the body can have more proteins than genes.
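Splicing can be pictured as cutting and pasting text. In the sketch below, the sequence, the exon positions and the function name `splice` are all invented for illustration. Uppercase letters stand for exons and lowercase letters for introns. Keeping a different set of exons gives a different finished message from the same starting copy.

```python
def splice(pre_mrna, exons, keep):
    """Join the chosen exons in order; introns and skipped exons are left out."""
    return "".join(
        pre_mrna[start:end]
        for index, (start, end) in enumerate(exons)
        if index in keep
    )

# Made-up unspliced mRNA: three exons (uppercase) separated by two introns (lowercase).
pre_mrna = "AUGGCG" + "guaag" + "AAAUGG" + "guuag" + "GCGUAA"
exons = [(0, 6), (11, 17), (22, 28)]  # start and end position of each exon

print(splice(pre_mrna, exons, keep={0, 1, 2}))  # prints AUGGCGAAAUGGGCGUAA
print(splice(pre_mrna, exons, keep={0, 2}))     # prints AUGGCGGCGUAA
```

The second message skips the middle exon, so it would be translated into a shorter protein.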
Here’s another way. Most genes have multiple switches. The switches determine where the cell begins copying a DNA sequence into mRNA, and where it stops. Different start or end sites create different proteins, some longer and some shorter. Sometimes, transcription doesn’t start until several chemicals attach themselves to the DNA sequence. These DNA binding sites may be far away from the gene, but still influence when and how the cell reads its message.
Splicing variations and gene switches result in different mRNAs. And these are translated into different proteins. Proteins also may change after their building blocks have been assembled into a chain. For example, the cell may add chemicals to give a protein some new function.
DNA holds more than building instructions
Making proteins is far from DNA’s only role. In fact, only one percent of human DNA contains the exons that the cell translates into protein sequences. Estimates for the share of DNA that controls gene expression range from 25 to 80 percent. Scientists do not yet know the exact number because it’s harder to find these regulatory DNA regions. Some are gene switches. Others make RNA molecules that aren’t involved in building proteins.
Controlling gene expression is almost as complex as conducting a large symphony orchestra. Just consider what it takes for a single fertilized egg cell to develop into a baby within nine months.
So does it matter that water fleas have more protein-coding genes than people? Not really. Much of our complexity hides in the regulatory regions of our DNA. And decoding that part of our genome will keep scientists busy for many, many years.