Practical Bioinformatics: the halting thought process of a working bioinformatician


The Central Dogma

So a lot of what I do depends on some basic understanding of molecular biology.  While an in-depth knowledge is desirable, this is one of those explanations you can probably just refer back to when confused about terms.  It does require some terminology, but it's not too bad.

The Central Dogma of Molecular Biology is a big idea.  So big, in fact, that Nobel prizes have been awarded for work on it, and all of modern biology depends on it.  It's a theory, meaning it's an internally-consistent, cohesive set of ideas we use to explain a wide variety of phenomena.  It's also a law, meaning we base a field of study on it, and the broad truth of the topic is no longer actively debated.  We still fight each other tooth and nail about little details, but the consensus of the scientific community is this:

DNA -> RNA -> Protein

DNA is transcribed into RNA, which is translated into proteins.

Since DNA and RNA are nucleic acids, they store data very well, but don't perform work very well.  Proteins are strong and functional, but very hard to replicate.  Modern organisms and many viruses use DNA as the source of genetic information.  They transcribe that DNA into an RNA copy, which either goes off into the cell to be used as-is, or is translated into a protein, which can perform sophisticated functions.
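To make the transcription step concrete, here is a minimal Python sketch.  The function names are my own invention for illustration, not from any library, and it assumes the input is a plain string of A, C, G, and T written 5' to 3'.  It shows the two usual ways of thinking about it: the RNA has the same letters as the coding strand with U in place of T, or equivalently it is the reverse complement of the template strand.

```python
# Hypothetical helper names; a sketch of transcription, not a real library API.

# Maps each DNA base to the RNA base that pairs with it.
DNA_TO_RNA_COMPLEMENT = str.maketrans("ACGT", "UGCA")


def transcribe_coding(coding_strand: str) -> str:
    """Return the RNA matching a coding (sense) strand: same letters, U for T."""
    return coding_strand.upper().replace("T", "U")


def transcribe_template(template_strand: str) -> str:
    """Return the RNA built against a template (antisense) strand.

    Assumes the template is written 5'->3', so the RNA is its
    complement read in reverse.
    """
    return template_strand.upper().translate(DNA_TO_RNA_COMPLEMENT)[::-1]


if __name__ == "__main__":
    print(transcribe_coding("ATGGCC"))    # coding strand in, RNA out
    print(transcribe_template("GGCCAT"))  # template for the same stretch
```

Both calls describe the same stretch of double-stranded DNA, so they should produce the same RNA.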

There are lots of exceptions to the rules, like reverse transcriptases, which copy an RNA template into DNA, or prions, which are self-replicating proteins.  (Kind of.)  The exceptions, though, are a tiny, tiny fraction of the overall diversity.  The overwhelming majority of everything that has ever lived follows the procession of DNA -> RNA -> Protein.

This process is highly conserved, meaning it operates much the same whether you're talking about a virus in a sulphurous hot spring, or a cat.  This is one of the major arguments we use to explain why evolution is the paradigm through which we view all biology.  So far, everything we've ever found uses essentially the same genetic code, with a bit of tweaking by domain.  Protein-coding DNA is read in 3-letter groups called codons, which are transcribed into a complementary RNA sequence.  Each RNA codon determines which amino acid is added next to a growing chain of amino acids, which becomes a protein when complete.  There are start codons and stop codons that initiate and halt translation.  One interesting artifact is that while the start codons can vary slightly between domains of life, the stop codons are far more consistent: in the standard code, UAA, UAG, and UGA all mean stop.  (A few mitochondria and oddball microbes reassign one of them, but those are rare exceptions.)  A stop is a stop is a stop, and they're mundane enough that in my line of work, we don't really even differentiate between them that much.
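Here is a rough Python sketch of translation, reading an RNA string one codon at a time.  It assumes the standard genetic code, and it makes some simplifications of my own: it starts at the first AUG it finds, stops at the first in-frame stop codon, and reports amino acids as one-letter codes.  The names CODON_TABLE and translate are made up for this example.

```python
from itertools import product

# Standard genetic code, with codons ordered by base U, C, A, G at each
# position.  '*' marks the three stop codons (UAA, UAG, UGA).
BASES = "UCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"  # codons starting with U
    "LLLLPPPPHHQQRRRR"  # codons starting with C
    "IIIMTTTTNNKKSSRR"  # codons starting with A
    "VVVVAAAADDEEGGGG"  # codons starting with G
)
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna: str) -> str:
    """Translate from the first AUG to the first in-frame stop codon.

    Returns one-letter amino acid codes; unknown codons become 'X'.
    Returns an empty string if there is no start codon.
    """
    mrna = mrna.upper()
    start = mrna.find("AUG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE.get(mrna[i:i + 3], "X")
        if amino_acid == "*":  # any stop codon ends translation
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    print(translate("GGAUGGCCUUUUAAGG"))
```

Note that the code treats all three stop codons identically, which is pretty much how I treat them in practice too.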

Most of my day-to-day work is examination of genetic code as it exists in an organism, and seeing if there is biological meaning I can pull out of the string of letters.  There are only four nucleotides, and an alphabet with only four letters is at first blush not very interesting.  Parsing out the meaning hidden in the letters is the science of Genomics, one of the most current and exciting fields of study within Bioinformatics.  I hope to help explain some of the tools we use to do this work, and illuminate the theory that underpins our conclusions.  Look for the BNFO101 Category of posts for more background information.

Filed under: BNFO101