What is DNA? What is a gene? What is a protein?

In this contributor article, Drs. CJ Schwartz and Marie Turner of Marigene Consulting explain the basics of genetics as they apply to cannabis cultivation.

The following is an article produced by a contributing author. Growers Network does not endorse nor evaluate the claims of our contributors, nor do they influence our editorial process. We thank our contributors for their time and effort so we can continue our exclusive Growers Spotlight service.

Society is currently experiencing a revolution. Just as it went through the industrial revolution and the computer revolution, it is now going through the DNA revolution. The secrets to the vast and amazing abilities of living things, and to human health, lie in the DNA code. So what is DNA?

DNA is a molecule made up of four kinds of building blocks. These building blocks, called monomers, are referred to as nucleotides. DNA is typically found as long chains called chromosomes within the nucleus of eukaryotic cells. Even though cells cannot be observed without a microscope, each individual cell contains several feet of DNA, depending on the species. If you laid out all of the DNA in your body end-to-end, it would extend to the moon and back multiple times.

DNA serves as the basic form of data storage required to encode life. DNA has the capacity to store an enormous amount of data, similar to a computer hard drive, but orders of magnitude more efficiently. To illustrate: computer code is binary and has two choices (0 or 1) per position, while DNA has four choices per position (A, C, G, or T). For example, if there are 5 positions, each with 4 choices, you end up with 4 x 4 x 4 x 4 x 4 = 4^5 = 1024 potential combinations. A binary code, on the other hand, would only have 2 x 2 x 2 x 2 x 2 = 2^5 = 32 potential combinations. As you add positions, the difference grows exponentially. In the Cannabis genome there are 820 million positions, or base pairs, of DNA (humans have 3.2 billion). The number of possible combinations is truly incredible!
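The arithmetic above can be checked with a few lines of Python. This is only a minimal sketch: the function names `sequence_count` and `decimal_digits` are made up for illustration, and the genome size is the 820 million figure quoted in the text.

```python
import math

def sequence_count(choices_per_position, positions):
    """Number of distinct sequences of the given length."""
    return choices_per_position ** positions

# The five-position example from the text.
dna_5 = sequence_count(4, 5)      # 4 choices per position: A, C, G, T
binary_5 = sequence_count(2, 5)   # 2 choices per position: 0, 1
print(dna_5, binary_5)            # prints "1024 32"

# One DNA position carries as much information as two binary positions.
bits_per_base = math.log2(4)      # 2.0

def decimal_digits(choices_per_position, positions):
    """Approximate number of decimal digits in choices ** positions.

    Uses logarithms so the huge number itself never has to be built.
    """
    return math.floor(positions * math.log10(choices_per_position)) + 1

cannabis_positions = 820_000_000  # genome size quoted in the article
print(decimal_digits(4, cannabis_positions))
```

The last line prints how many digits it would take just to write down the number of possible sequences of that length, which gives a feel for how quickly the count outgrows anything that could be listed.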

DNA sequences encode the 20 amino acids that are used as the building blocks for proteins found within cells. Proteins can be thought of as the basic working units of an organism, similar to the workers in a construction company. The specific biochemical properties and combinations of the individual amino acids determine the function of a protein, while the proteins themselves can be shaped differently to make different structures. THCAS (THCA Synthase) and CBDAS (CBDA Synthase) have very similar amino acid sequences (roughly 90% identical), but the differences in amino acids between the two proteins result in different end products.

Editor’s Note: Synthases are proteins that create certain molecules. THCA synthase makes THCA, and CBDA synthase makes CBDA.

Pictured Above: DNA codes for different amino acids which can then be combined into proteins.
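To make the idea of DNA coding for amino acids concrete, here is a small Python sketch. It goes one step beyond the text by assuming the standard genetic code, in which each group of three nucleotides (a codon) specifies one amino acid or a stop signal. The function name `translate` and the short example sequence are invented for illustration and are not taken from any real cannabis gene.

```python
from itertools import product

# Standard genetic code, with codons ordered by base T, C, A, G at each
# of the three positions. "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(dna):
    """Translate a DNA coding sequence into one-letter amino acid codes.

    Reads three nucleotides at a time and stops at the first stop codon.
    """
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3].upper()]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)

# Hypothetical 12-nucleotide sequence: ATG GCC TGG TAA
print(translate("ATGGCCTGGTAA"))  # prints "MAW"
```

Changing a single nucleotide in the input can change the amino acid that comes out, which is the kind of small difference that separates closely related proteins.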

A gene is most commonly defined as a DNA sequence that codes for a specific protein. E. coli has about 5000 genes, while more complex organisms have 25,000-30,000 genes. For any given gene, there exists some natural variation, which can result in a protein with a different function (or even no function). Thus, genetic differences (genotype) can result in physical differences (phenotype). The DNA sequence also determines the timing of gene expression to coordinate developmental processes, such as flowering time. In layman’s terms, the DNA sequence is like a building blueprint.

Editor’s Note: Geneticists have a variety of terms they use in genetic parlance. A “P” generation stands for the initial parents of a cross. The “F1” generation means “Filial 1”, or the children of the parents. The “F2” generation means “Filial 2”, or the children of the F1 generation. This can be continued indefinitely.

In any given cannabis strain, certain genes have been brought to the forefront through inbreeding. Strains become increasingly stable through inbreeding, which decreases heterozygosity (having two different copies of a specific gene). Crossing two strains often results in hybrid vigor, where the offspring displays the strengths of each parent strain and an overall increase in vigor for many traits. When F1 hybrids are allowed to progress to the next generation (F2), there is often a loss of hybrid vigor; thus, clonally propagated F1 plants are desirable for production. However, the loss of vigor in the F2 also comes with increased variation, which can be the starting material for a breeding program and may in fact produce even better production strains. Proper selection in the F2 and subsequent generations results in gene combinations that can phenotypically exceed the F1 plants.

Legend: Crossing a pest-resistant (Normal) parent (DD) with a sensitive (Lesion) parent (dd) produces identical F1 plants (Dd). In the F2, we can use DNA sequencing to identify stable lines (DD) for further breeding. Pest resistance would then be FIXED in this strain.
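The cross in the legend can be worked through as a simple Punnett square. The Python sketch below assumes a single gene with two alleles (D and d) and that each parent passes on either of its two copies with equal probability, which is the standard Mendelian expectation. The function name `cross` is hypothetical.

```python
from collections import Counter
from itertools import product

def cross(parent1, parent2):
    """Expected offspring genotype frequencies for a one-gene cross.

    Each parent is a two-letter genotype such as "DD", "Dd", or "dd".
    """
    offspring = Counter()
    for allele1, allele2 in product(parent1, parent2):
        # Sorting puts the uppercase allele first, so "dD" becomes "Dd".
        genotype = "".join(sorted(allele1 + allele2))
        offspring[genotype] += 1
    total = sum(offspring.values())
    return {genotype: count / total for genotype, count in offspring.items()}

# P generation: resistant (DD) x sensitive (dd) gives uniform F1 plants.
print(cross("DD", "dd"))  # prints "{'Dd': 1.0}"

# F1 x F1 gives the F2, where the genotypes separate again.
f2 = cross("Dd", "Dd")
print(f2["DD"], f2["Dd"], f2["dd"])  # prints "0.25 0.5 0.25"
```

Under these assumptions, about one in four F2 plants is expected to be DD, and those are the stable lines the legend describes selecting for further breeding.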

In the F2 plants, the genes from the parents have been shuffled, due to naturally occurring chromosomal rearrangements, resulting in novel genotypes and phenotypes. Stabilizing a genotype/phenotype is accomplished by sibling crosses (inbreeding) in subsequent generations (F3, F4, etc.), with heterozygosity decreasing with each generation.

With next-generation sequencing technologies, billions of DNA sequences are generated every day, and technologies to understand and make use of these sequences are improving rapidly. This information offers endless possibilities for improving human well-being as well as for breeding better plants with desired traits; it’s simply a matter of finding the right genes.

Editor’s Note: Finding the right combination of genes is not creating GMOs, merely classical breeding! Knowing which plants have which genes can lead to a more efficient and effective breeding program.


  1. Want to get in touch with Marigene Consulting? You can reach them via the following methods:
    1. Website: http://www.marigene.com/
    2. Phone: 970.372.5363

About the Author

CJ Schwartz has a BS degree in Genetics and Cell Biology from the University of Minnesota and a PhD in Biochemistry from the University of Wisconsin. For the last decade Dr. Schwartz’s research has focused on the genetic differences that control flowering time.