DNAeXplained – Genetic Genealogy

Mitochondrial DNA: Part 2 – What Do Those Numbers Mean?

This is the second part in a series about mitochondrial DNA. The first article can be found here:

When people receive their results, generally the first thing they look at is matches, and the second thing is the actual results, found under the Mutations tab.

We’re going to leave working with matches until after we discuss what the numbers on the Mutations page actually mean.

Fair warning – if you’re not interested in the “science stuff,” then this article probably isn’t for you. We’re going to talk about the different kinds of mutations and how they affect your results and matching. I promise to make the science fun and understandable.

However, it’s only fair to tell you that you don’t need to understand the nitty-gritty to make use of your results in some capacity. We will be covering how to use every tab on your mitochondrial DNA page in future articles – but you may want to arm yourself with this information so you understand why tools, and matching, work the way they do. Not all matches and mismatches are created equal!

The next article in the series will be “Mitochondrial DNA: Part 3 – Haplogroups Unraveled” in which we’ll discuss how haplogroups are assigned, the differences between vendors, and how haplogroup results can be utilized for genealogy.

If you have your full sequence mitochondrial results from Family Tree DNA, it would be a good idea to sign on now, or to print out your results page so you can refer to your results while reading this article.

Results

I’m using my own results in these examples.

When you click on the “Results” icon on your personal page, above, this is what you’ll see.


After you read the information about your haplogroup origin, your eyes will drift down to the numbers below, where they will stop, panic spreading throughout your body.

Never fear – your decoder ring is right here.

Where Did Those Numbers Come From?

The numbers you are seeing are the locations in your mitochondrial DNA where a mutation has occurred. Mutations, in this sense, are not bad things, so don’t let that word frighten you. In fact, mutations are what enable genetic genealogy to work.

Most of the 16,569 locations never change. Only the locations that have experienced a mutation are shown. Locations not listed have not experienced a mutation.

The number shown is the location, or address, in the mitochondrial DNA where a mutation has occurred.
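If you like to see ideas in code, here is a minimal Python sketch of how a list of differences could be produced by comparing a sample against a reference. The ten-letter sequences are invented toys, not real mitochondrial DNA, and the function name `list_differences` is hypothetical. It is only meant to show why unchanged locations never appear in the list.

```python
def list_differences(reference: str, sample: str) -> list[str]:
    """Return the 1-based locations where sample differs from reference.

    Assumes both strings have the same length, so only simple
    substitutions are handled here, not insertions or deletions.
    """
    if len(reference) != len(sample):
        raise ValueError("toy example expects equal-length sequences")
    differences = []
    for index, (ref_base, sample_base) in enumerate(zip(reference, sample)):
        if ref_base != sample_base:
            # Report the location number followed by the sample's value.
            differences.append(f"{index + 1}{sample_base}")
    return differences


reference = "GATCACAGGT"  # invented 10-base "reference"
sample = "GATTACAGGC"     # same sequence with two changes
print(list_differences(reference, sample))  # ['4T', '10C']
```

Eight of the ten locations are identical, so only two entries are reported.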

However, there is more than one way to view your results.

Two Tabs – rCRS and RSRS


You’ll notice that there are two tabs at the top of the page. RSRS values are showing initially.

rCRS and RSRS are abbreviations for “revised Cambridge Reference Sequence” and “Reconstructed Sapiens Reference Sequence.”

The CRS, or Cambridge Reference Sequence, was the reference model created in 1981 at Cambridge University, when the first full sequencing of mitochondrial DNA was completed. Everyone has been compared to that anonymous individual ever since.

The problem is that the reference individual was a member of haplogroup H, not a haplogroup further back in time, closer to Mitochondrial Eve. Mitochondrial Eve was not the first woman to live, but the most recent woman with an unbroken line of descendants, mother to child, leading to everyone alive today. You can read more about the concept of Mitochondrial Eve here, and about rCRS/RSRS here.

Using a haplogroup H person for a reference is kind of like comparing everyone to the middle of a book – the part that came later is no problem, but how do you correctly classify the changes that preceded the mutations that produced haplogroup H?

Think of mitochondrial DNA as a kind of biological timeline.

In this concept example, you can see that Mitochondrial Eve lived long ago and mutations, Xs, that formed haplogroups accrued until haplogroup H was born, and additional mutations continued to accrue over thousands of years.

Haplogroup J, a different haplogroup, was born from one of mitochondrial Eve’s descendants with a string of their own mutations.

The exact same process occurred with every other haplogroup.

You can see a bare-bones tree in the image below, with H and J under different branches of R, at the bottom.

Using the rCRS model, the descendants of haplogroup J born today are being compared to the rCRS reference person who is a descendant of haplogroup H.

In reality, everyone should be compared directly to Mitochondrial Eve, or at least someone much closer to the root of the mitochondrial phylotree than haplogroup H. However, when the CRS and then the revised CRS (rCRS) were created, scientists didn’t know as much as they do today.

In 2012, Dr. Doron Behar et al rewrote the mitochondrial DNA phylotree in the paper A “Copernican” Reassessment of the Human Mitochondrial DNA Tree from its Root by discerning what mitochondrial Eve’s DNA looked like by tracking the mutations backwards in time.

Then, the scientists redrew the tree and compared everyone to Mitochondrial Eve at the base of the tree. The RSRS view shows those mutations, which is why I have more mutations in the RSRS model than in the rCRS model where I’m compared with the haplogroup H person who is closer in time than Mitochondrial Eve. In other words, mutations that were considered “normal” for haplogroup J because haplogroup H carried them are now considered mutations for both haplogroup J and H, because they are both being compared to Mitochondrial Eve.

Today, some papers and individuals utilize the rCRS version, and others utilize the RSRS version. People don’t adapt very well or quickly to change. Complicating this further, the older papers, published before 2012, continue to reference rCRS values, so maintaining the rCRS in addition to the RSRS seemed prudent.

You can see the actual mtDNA haplotree here and I wrote about how to use it here.

Let’s look at the differences in the displays and why each is useful.

The Cambridge Reference Sequence

My rCRS results look a little different than the RSRS results.


I have more mutations showing on the RSRS page, above, than on the rCRS page below, counting only the information above the second row of black headers.


That’s because my RSRS results are being compared to Mitochondrial Eve, much further back in time. Compared to Mitochondrial Eve, I have a lot more mutations than I have when compared to a haplogroup H individual.

Let’s look at the most common example. Do you see my mutation at location 16519C?

In essence, the rCRS person carried this mutation, which meant that it became “normal” and anyone who didn’t have the mutation shows with a mutation at this location.

Therefore, today, you’re very likely to have a mutation at location 16519C in the rCRS model.

In the RSRS results below, you can see that 16519C is missing from the HVR1 differences.

You can see that the other two mutations at locations 16069 and 16126 are still present, but so are several others not present in the rCRS model. This means that the mutations at locations 16129, 16187, 16189, 16223, 16230, 16278 and 16311 are all present in the rCRS model as “normal” so they weren’t reported in my results as mutations.

However, when compared to Mitochondrial Eve, the CRS individual AND I would both be reported with these mutations, because we are both being compared to Mitochondrial Eve.
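Here is a small Python sketch of why the same person gets two different lists. All three eight-letter sequences are invented for illustration: `eve_like` stands in for the RSRS, `h_like` for a haplogroup H reference person, and `me` for someone on a different branch. The names and the helper `differences` are hypothetical.

```python
def differences(reference: str, sample: str) -> list[str]:
    """List 1-based locations where sample differs from reference."""
    return [
        f"{position}{base}"
        for position, (ref_base, base) in enumerate(zip(reference, sample), start=1)
        if ref_base != base
    ]


eve_like = "AAAAAAAA"  # toy root sequence
h_like = "AGGACAAA"    # shares changes at 2 and 3 with me, plus its own at 5
me = "AGGAAATA"        # shares changes at 2 and 3, plus my own at 7

print(differences(h_like, me))    # ['5A', '7T']
print(differences(eve_like, me))  # ['2G', '3G', '7T']
```

Against the H-like reference, locations 2 and 3 vanish because the reference carries them too, while location 5 shows up even though I never changed there. That is the same effect as 16519C in my results. Against the Eve-like root, the shared changes reappear and location 5 drops out.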

Another difference is that at the bottom of the rCRS page you can see a list of mutations and their normal CRS value, along with your result.

For location 16069, the normal CRS value is C and your value is T.

Why don’t we have this handy chart for the RSRS?

We don’t need it, because in the RSRS model the value at location 16069 is written as C16069T, with the normal letter preceding the location and the mutated value after.

You might have noticed that you see 4 different letters scattered through your results. Why is that?

Letters

The letters stand for the nucleotide bases that comprise DNA, as follows:

T = Thymine
A = Adenine
C = Cytosine
G = Guanine

Looking at location 16069, above, we see that C is the normal value and T is the mutated value.
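For readers who keep their results in a spreadsheet or script, here is a minimal Python sketch that splits an RSRS-style label into its three parts. The function name `parse_substitution` and the dictionary keys are my own inventions, and the pattern only covers simple one-letter substitutions written with capital letters.

```python
import re

# Normal letter, location number, then the letter found in your results.
SUBSTITUTION = re.compile(r"^([ACGT])(\d+)([ACGT])$")


def parse_substitution(label: str) -> dict:
    """Split a label such as C16069T into normal value, location and result."""
    match = SUBSTITUTION.match(label)
    if match is None:
        raise ValueError(f"not a simple substitution label: {label!r}")
    normal, location, result = match.groups()
    return {"normal": normal, "location": int(location), "result": result}


print(parse_substitution("C16069T"))
# {'normal': 'C', 'location': 16069, 'result': 'T'}
```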

Let’s look at different kinds of mutations.

Transitions, Transversions and Reversions

DNA is normally paired in a particular way, Ts with As and Cs with Gs. You can read more about how that works here.

Sometimes a base is replaced by its closest chemical cousin: T and C swap with each other, and so do A and G. These swaps are known as transitions. A mutation with a capital letter at the end of the location is a transition.

For example, C14352T indicates that the normal value in this location is C, but it has mutated to T. This is a transition and T will be capitalized. The first letter is always capitalized.

If you notice that one of your trailing letters in your RSRS results is a small letter instead of a capital, that means the mutation is a transversion instead of a transition. For example, C14352a.

You can read more about transitions and transversions here and here.

When looking at your RSRS results, your letter before the allele number is the normal state and the trailing noncapital letter is the transversion. With C14352a, C is the normal state, but the mutation caused the change to a, which is a small letter to indicate that it is a transversion.

Original Value | Typical Transition Pairing (large trailing letter) | Unusual Transversion Pairing (small trailing letter)
T | C | a or g
A | G | c or t
C | T | a or g
G | A | c or t
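The table above can be expressed as a short Python sketch. It relies on the standard grouping of A and G as purines and C and T as pyrimidines, terms I haven't used elsewhere in this article: a swap within a group is a transition, and a swap across groups is a transversion. The function names `classify` and `format_rsrs` are hypothetical.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify(original: str, new: str) -> str:
    """Return 'transition' or 'transversion' for a single-base change."""
    original, new = original.upper(), new.upper()
    if original not in "ACGT" or new not in "ACGT" or original == new:
        raise ValueError("expected two different bases from A, C, G, T")
    pair = {original, new}
    same_group = pair <= PURINES or pair <= PYRIMIDINES
    return "transition" if same_group else "transversion"


def format_rsrs(original: str, location: int, new: str) -> str:
    """Write the change RSRS-style: small trailing letter for a transversion."""
    kind = classify(original, new)
    trailing = new.upper() if kind == "transition" else new.lower()
    return f"{original.upper()}{location}{trailing}"


print(format_rsrs("C", 14352, "T"))  # C14352T
print(format_rsrs("C", 14352, "A"))  # C14352a
```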

An exclamation mark (!) at the end of a labeled position denotes a reversion to the ancestral or original state. This means that the location used to have a mutation, but it has reverted back to the “normal” state. Why does this matter? Because DNA is a timeline and you need to know the mutation history to fully understand the timeline.

The number of exclamation marks stands for the number of sequential reversions in the given position from the RSRS (e.g., C152T, T152C!, and C152T!!).

This means that the original nucleotide at that location was C, it changed to T, then back to C, then back to T again, indicated by the double reversion marker, !!. Yes, a double reversion is very, very rare.
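To follow that history step by step, here is a minimal Python sketch that replays the three labels from the example and keeps track of the current value. The function name `replay` is hypothetical, and the sketch assumes every label in the list refers to the same location.

```python
import re

STEP = re.compile(r"^([ACGT])(\d+)([ACGT])(!*)$")


def replay(steps: list[str]) -> tuple[str, int]:
    """Walk through successive changes at one location.

    Returns the final base and the number of exclamation marks on the
    last step, which is the count of sequential reversions.
    """
    current = None
    reversions = 0
    for step in steps:
        match = STEP.match(step)
        if match is None:
            raise ValueError(f"unrecognized label: {step!r}")
        before, _location, after, marks = match.groups()
        if current is not None and before != current:
            raise ValueError(f"{step} does not start from {current}")
        current = after
        reversions = len(marks)
    return current, reversions


print(replay(["C152T", "T152C!", "C152T!!"]))  # ('T', 2)
```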

Insertions

Many people have mutations that appear with a decimal point. I have an insertion at location 315. The decimal point indicates that an insertion has occurred, and in this case, an extra nucleotide, a C, was inserted. Think of this as DNA cutting in line between two people with assigned parking spaces – locations 315 and 316. There’s no room for the cutter, so it’s labeled 315.1 plus the letter for the nucleotide that was inserted.

Sometimes you will see another insertion at the same location, which would be noted as 315.2C, or 315.2A if a different nucleotide was inserted.

Complex insertions are shown as 315.XC which means that there was an insertion of multiple nucleotides, C, in this case, of unknown length. So the number of Cs would be more than 1, but the number was not measurable so the unknown “X” was used.
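The insertion labels follow a regular pattern, so they are easy to pull apart. Here is a minimal Python sketch, with the hypothetical function name `parse_insertion` and dictionary keys of my own choosing. An "X" is stored as `None` because the count is unknown.

```python
import re

# Location, a dot, the insertion order (or X), then the inserted base.
INSERTION = re.compile(r"^(\d+)\.(\d+|X)([ACGT])$")


def parse_insertion(label: str) -> dict:
    """Split a label such as 315.1C into location, order and inserted base."""
    match = INSERTION.match(label)
    if match is None:
        raise ValueError(f"not an insertion label: {label!r}")
    location, order, base = match.groups()
    return {
        "after_location": int(location),
        "order": None if order == "X" else int(order),  # None = unknown length
        "base": base,
    }


print(parse_insertion("315.1C"))
# {'after_location': 315, 'order': 1, 'base': 'C'}
print(parse_insertion("315.XC"))
# {'after_location': 315, 'order': None, 'base': 'C'}
```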

Some locations, such as 309 and 315, are so unstable, mutating so often, that they are not included in matching.

Deletions

Deletions occur when a piece of DNA is forever removed. Once deleted, DNA cannot regenerate at that position.

A deletion is indicated by either a “d” or a “-” such as 522d or 522-.

Deletions at locations 522 and 523 are so common that they aren’t utilized in matching either.
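Here is a rough Python sketch of how one might spot deletions and set aside the unstable locations before comparing results. The set of ignored locations holds only the four named in this article (309, 315, 522 and 523). The function names are hypothetical, the list of labels is invented for illustration, and none of this is Family Tree DNA's actual code.

```python
import re

# Only the locations named in this article; a real list may be longer.
IGNORED_LOCATIONS = {309, 315, 522, 523}


def location_of(label: str) -> int:
    """Pull the location number out of labels like 522d, 315.1C or C16069T."""
    match = re.search(r"\d+", label)
    if match is None:
        raise ValueError(f"no location number in {label!r}")
    return int(match.group())


def is_deletion(label: str) -> bool:
    """A deletion is written with a trailing 'd' or '-'."""
    return label.endswith(("d", "-"))


def usable_for_matching(labels: list[str]) -> list[str]:
    """Drop labels that fall on locations excluded from matching."""
    return [label for label in labels if location_of(label) not in IGNORED_LOCATIONS]


labels = ["C16069T", "C14352T", "315.1C", "522d", "523d"]  # illustrative only
print([label for label in labels if is_deletion(label)])  # ['522d', '523d']
print(usable_for_matching(labels))  # ['C16069T', 'C14352T']
```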

Extra and Missing Mutations

On the RSRS tab, you’ll notice extra and missing mutations. These are mutations that vary from those normally found in people who carry your haplogroup. Missing and extra mutations are your own personal DNA filter that allows you to have genealogically meaningful matches.

Extra mutations are mutations that you have, but most people in your haplogroup don’t.

Missing mutations are mutations that most people in your haplogroup have, but you don’t.
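In programming terms, extra and missing mutations are simply two set differences. Here is a minimal Python sketch. The labels are invented placeholders rather than the defining mutations of any real haplogroup, and the function name `extra_and_missing` is hypothetical.

```python
def extra_and_missing(yours: set[str], expected: set[str]) -> tuple[list[str], list[str]]:
    """Compare your mutations with those expected for your haplogroup."""
    extra = sorted(yours - expected)    # you have them, the haplogroup doesn't
    missing = sorted(expected - yours)  # the haplogroup has them, you don't
    return extra, missing


expected_for_haplogroup = {"A100G", "C200T", "G300A"}  # placeholder labels
my_mutations = {"A100G", "C200T", "T400C"}             # placeholder labels

extra, missing = extra_and_missing(my_mutations, expected_for_haplogroup)
print(extra)    # ['T400C']
print(missing)  # ['G300A']
```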

Heteroplasmies

A heteroplasmy is quite interesting because it’s really a mutation in progress.

What this means is that you have two versions of the DNA sequence showing in your mitochondrial DNA at that location. At a specific location, you show both of two separate nucleotides. Amounts detected of a second nucleotide over 20% are considered a heteroplasmy. Amounts below 20% are ignored. Generally, within a few generations, the mutation will resolve in one direction or the other – although I have seen some heteroplasmies that seem to be persistent for several generations.

Heteroplasmies are indicated in your results by a different letter at the end of the location, so for example, C16069Y where the Y would indicate that a heteroplasmy had been detected.

The letter after the location has a specific meaning; in this case, Y means that both a C and a T were found, per the chart below.
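Here is a minimal Python sketch of the 20% rule. The two-base letters are the standard IUPAC ambiguity codes (R, Y, S, W, K and M). The function name `call_base` and the read fractions are invented for illustration. How a value of exactly 20% is treated isn't stated above, so the sketch assumes it is ignored, and it only looks at the two most common nucleotides.

```python
# Standard IUPAC codes for a mix of exactly two bases.
IUPAC_TWO_BASE = {
    frozenset("AG"): "R",
    frozenset("CT"): "Y",
    frozenset("CG"): "S",
    frozenset("AT"): "W",
    frozenset("GT"): "K",
    frozenset("AC"): "M",
}
HETEROPLASMY_THRESHOLD = 0.20  # second nucleotide must exceed this share


def call_base(fractions: dict[str, float]) -> str:
    """Return a single base, or a two-base code if a heteroplasmy is detected.

    fractions maps each observed base to its share of the reads.
    """
    ranked = sorted(fractions.items(), key=lambda item: item[1], reverse=True)
    major_base = ranked[0][0]
    if len(ranked) > 1 and ranked[1][1] > HETEROPLASMY_THRESHOLD:
        minor_base = ranked[1][0]
        return IUPAC_TWO_BASE[frozenset((major_base, minor_base))]
    return major_base


print(call_base({"C": 0.7, "T": 0.3}))  # Y
print(call_base({"C": 0.9, "T": 0.1}))  # C
```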

Heteroplasmy Matching

Technically, using the example of C16069Y, where Y tells us that both C and T were found, this location should match against anyone carrying any of the values C, T or Y.

However, currently at Family Tree DNA, the heteroplasmy only counts as a match to the Y (specific heteroplasmy indicator) and the CRS value of C, but not the mutated value of T.
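The difference between the two rules is easier to see side by side. Here is a small Python sketch of my reading of them. It is not Family Tree DNA's code, and both function names are hypothetical.

```python
# Which bases each standard two-base code stands for.
IUPAC_BASES = {
    "R": {"A", "G"},
    "Y": {"C", "T"},
    "S": {"C", "G"},
    "W": {"A", "T"},
    "K": {"G", "T"},
    "M": {"A", "C"},
}


def matches_in_principle(code: str, other: str) -> bool:
    """The code matches itself or either of the two bases it stands for."""
    return other == code or other in IUPAC_BASES[code]


def matches_as_described(code: str, reference_base: str, other: str) -> bool:
    """Rule as described above: only the code itself or the reference value."""
    return other == code or other == reference_base


# The C16069Y example: reference value C, heteroplasmy code Y.
for value in ("C", "T", "Y"):
    print(value, matches_in_principle("Y", value), matches_as_described("Y", "C", value))
```

Only the comparison against T gives different answers under the two rules.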

Genetic Distance

The difference in matching locations is called the genetic distance. I wrote about genetic distance in the article, Concepts – Genetic Distance which has lots of examples.

When you have unusual results, they can produce unexpected consequences. For example, if a heteroplasmy is found in the HVR 1 or 2 region, and a woman’s child doesn’t have a heteroplasmy, but does have the mutated value – the two individuals, mother and child, won’t be shown as a match at the HVR1/2 level because only exact matches are shown as matches at that level.

That can be pretty disconcerting.
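Here is a simplified Python sketch of the mother and child situation, counting genetic distance as the number of locations where two people's labels differ. The vendor's actual counting rules aren't described here, so treat this as an illustration only. The function names are hypothetical, the two result lists are invented, and the sketch assumes one label per location, so insertions such as 315.1C and 315.2C are not handled.

```python
import re


def by_location(labels: list[str]) -> dict[int, str]:
    """Index labels by their location number (one label per location)."""
    indexed = {}
    for label in labels:
        match = re.search(r"\d+", label)
        if match is None:
            raise ValueError(f"no location number in {label!r}")
        indexed[int(match.group())] = label
    return indexed


def genetic_distance(person_a: list[str], person_b: list[str]) -> int:
    """Count locations where the two people's results differ."""
    a, b = by_location(person_a), by_location(person_b)
    return sum(1 for location in a.keys() | b.keys() if a.get(location) != b.get(location))


mother = ["C16069Y", "C14352T"]  # heteroplasmy at 16069
child = ["C16069T", "C14352T"]   # resolved to the mutated value

print(genetic_distance(mother, child))  # 1
```

A distance of 1 is not an exact match, which is why the two would not appear on each other's match list at a level that requires exact matches.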

If you notice something unusual in your results, and you match someone exactly, you know that they have the same anomaly. If you don’t match the person exactly, you might want to ask them if they have the same unusual result.

If you expect to match someone, and don’t, it doesn’t hurt to begin discussions by asking about their haplogroup. While they might be hesitant to share their exact results values with you, sharing their haplogroup shouldn’t be problematic. If you don’t share at least the same base haplogroup, you don’t need to talk further. You’re not related in a genealogically relevant timeframe on your matrilineal line.

If you do share the same haplogroup, then additional discussion is probably warranted about your differences in results. I generally ask about the unusual “extra and missing” mutations, beginning with “how many do you have?” and discussing from there.

Summary

I know there’s a lot to grasp here. Many people don’t really want to learn the details any more than I want to change my car’s oil.

For more information, you can call, e-mail or e-chat with the support department at Family Tree DNA, free of charge.

Next Article – Haplogroups

Your haplogroup, which we’ll discuss in the next article, can eliminate people as being related to you in the past hundreds to thousands of years, but you need the information held in all of your 16,569 locations to perform granular genealogical matching and to obtain all of the available information. In order to obtain all 16,569 locations, you need to order the mtFull Sequence test at Family Tree DNA.

______________________________________________________________

Disclosure

I receive a small contribution when you click on some (but not all) of the links to vendors in my articles. This does NOT increase the price you pay but helps me to keep the lights on and this informational blog free for everyone. Please click on the links in the articles or to the vendors below if you are purchasing products or DNA testing.

Thank you so much.
