Welcome to the concepts articles. This series presents the concepts of genetic genealogy, not the details. I have written a lot of detailed articles, and I’ve linked to them for those of you who want more. My suggestion would be to read this article once, entirely, all the way through to understand the concepts with continuity of thought, then go back and reread and click through to other articles if you are interested.
All of autosomal genetic genealogy is based on these concepts of inheritance and matching, so if you don’t understand these, you won’t understand your matches, how they work, why, or how to interpret what they do or don’t tell you.
Someone sent me this question about autosomal DNA matching.
“I do not quite understand how the profiles can be identified to an ancestor since that person is not among us to provide DNA material for ‘testing’ and ‘comparison.’”
That’s a really good question, so let’s take a shot at answering it conceptually.
Do you have a cat or dog?
I bet I could tell if I could see your clothes, your house, your car or your quilt. Why or how? Because pets shed, and try as you might, it’s almost impossible to get rid of the evidence. I went to the dentist once and he looked at my sweatshirt and said, “German Shepherd?” I laughed.
When your ancestor had children, he or she shed their DNA, half of it, and it’s still being passed down to their descendants today, at least for the next several generations. Let’s look, conceptually, at how and why this works.
In the following diagram, on the left you can see the generations and the relationships of the people both to the ancestor and to each other.
Our ancestor, John Doe, married a wife, J, and had 2 children. Gender of the children, in this example, does not matter.
Everyone receives one strand of DNA from their mother and one from their father. If you’re interested in more detail about how this works, click here.
In our example below, I’ve divided this portion of John’s DNA into 10 buckets. Think of each of these buckets as having maybe 100 units of John’s DNA. You can think of pebbles in the bucket if you’d like. Our DNA is passed, often, in buckets where the group of pebbles sticks together, at least for a while. Since this is conceptual, our buckets are being passed intact from generation to generation.
John’s mother’s strand of DNA has her buckets labeled MATERNALAB and I’ve colored them pink to make them easy to identify. John’s father’s strand of DNA has his buckets labeled FATHERSIDE and is blue. Important note – buckets don’t come color coded pink or blue in nature – you have no idea which side your DNA comes from. Yes, I know, that’s a cruel joke of Nature.
John married J, call her Jean. Jean also has 2 strands of DNA, one from her mother and one from her father, but in order to simplify things, rather than have two colors for the wives, I’d rather you think of this generationally, so the wives in each generation only have one color. That way you can see the wives’ DNA mixing with the husbands by just looking at the colors. Jean’s color is lavender.
DNA “Shedding” to Descendants
So, now let’s look at how John “sheds” his DNA to his two children and their descendants – and why that matters to us several generations later.
In the examples above, the DNA that is descended in each generational line from John is bolded within the colored square. I also intentionally put it at the beginning and ends of the segments for each child so it’s easy to see.
In the first generation, John’s children each receive one strand of DNA from their mother, J, and one from John. John’s DNA that his children receive is mixed between John’s father’s DNA and John’s mother’s DNA – roughly 50-50 – but not exactly.
At every position, or bucket, during recombination, John’s child will receive either the value in John’s Mom’s bucket or the value at that location in John’s Dad’s bucket. In other words, the two strands of John’s parents’ DNA, in John, combine to make one strand to give to one of John’s children. Each time this happens, for each child conceived, the recombination happens differently.
In this case, John’s children will receive either the M or the F in bucket one. In buckets 2 and 3, the values are the same. This happens in DNA. The child’s bucket 4 will receive either an E or H. Bucket 5 an R or E. Bucket 6 an N or R. And so forth. This is how recombination works, and it’s called “random recombination” meaning that we have not been able to discern why or how the values for each location are chosen.
Is recombination really random, like a coin flip? No, it’s not. How do we know? Because clumps of neighboring DNA often stick together, in buckets – in fact we call them “sticky segments.” Groups of buckets stick together too, sometimes for many generations. So it’s not entirely random, but we don’t know why.
What we do know for absolutely positively sure is that every person gets exactly half of each parent’s DNA on chromosomes 1-22. We are not talking about the X chromosome (meaning chromosome 23) or mitochondrial DNA or Y DNA. Those are different topics entirely relative to inheritance.
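For those who like to see a concept written out, here is a minimal Python sketch of the bucket-by-bucket choice, with neighboring buckets traveling together. It is only a conceptual model. The function name recombine and the crossover positions are made up for illustration, and real recombination is far messier than this.

```python
def recombine(maternal, paternal, crossovers, start_with_maternal=True):
    """Build the single strand a parent passes to one child.

    Buckets are copied from one of the parent's two strands, switching to
    the other strand at each crossover position (0-based bucket index).
    Buckets between crossovers travel together, like sticky segments.
    """
    if len(maternal) != len(paternal):
        raise ValueError("both strands must have the same number of buckets")
    use_maternal = start_with_maternal
    switch_points = set(crossovers)
    passed_on = []
    for i in range(len(maternal)):
        if i in switch_points:
            use_maternal = not use_maternal
        passed_on.append(maternal[i] if use_maternal else paternal[i])
    return "".join(passed_on)


johns_mother_strand = "MATERNALAB"  # pink in the article
johns_father_strand = "FATHERSIDE"  # blue in the article

# Hypothetical crossover positions, chosen for illustration only.
child_strand = recombine(johns_mother_strand, johns_father_strand, crossovers=[3, 7])
print(child_strand)
```

With these particular crossover positions the strand that comes out is MATHERSLAB. Change the crossover list and you get a different child, which is the whole point: each conception reshuffles the buckets differently.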
You can see which buckets received which of John’s parents’ DNA based on the pink and blue color coding and the letters in the buckets. Jean’s contribution to Child 1 and Child 2 would be mixed between her parents’ DNA too.
In the first generation, Child 1 received 6 pink buckets (segments) from John’s mother and 4 blue buckets from John’s father – MATHERSLAB. Child 2 received 6 blue buckets from John’s father and 4 pink buckets from John’s mother – FATHERALAB. On average, half of the DNA John passes to a child comes from each of John’s parents, but in reality, neither child received exactly half from each.
Note that Child 1 and 2 did not necessarily receive the SAME buckets, or segments, from John’s parents, although Child 1 and 2 did receive some buckets with the same letters in them – ATHERLAB.
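The bucket counts above can be checked with a short Python sketch. The source strings below ("M" for John’s mother, "F" for John’s father) are my own encoding of the pink and blue colors described in the text, and the names are hypothetical.

```python
from collections import Counter

johns_mother_strand = "MATERNALAB"  # pink
johns_father_strand = "FATHERSIDE"  # blue

# Which of John's two strands each bucket came from, per the text:
# "M" = John's mother (pink), "F" = John's father (blue).
child1_sources = "MMMFFFFMMM"
child2_sources = "FFFFFFMMMM"


def build_strand(sources):
    """Assemble the strand John passed on from a per-bucket source list."""
    return "".join(
        johns_mother_strand[i] if source == "M" else johns_father_strand[i]
        for i, source in enumerate(sources)
    )


child1 = build_strand(child1_sources)
child2 = build_strand(child2_sources)

child1_counts = Counter(child1_sources)
child2_counts = Counter(child2_sources)

# Buckets where both children carry the same letter, whatever the color.
same_letters = "".join(a for a, b in zip(child1, child2) if a == b)

print(child1, dict(child1_counts))
print(child2, dict(child2_counts))
print(same_letters)
```

Child 1 comes out 6 pink and 4 blue, Child 2 the reverse, and the letters they have in common are ATHERLAB, even though some of those buckets came from different grandparents.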
If you’re thinking, “lies, damned lies and statistics” right about now, and chuckling, or maybe crying, join the club!
Looking at the next generation, John’s Child 1 married K and John’s Child 2 married O.
Let’s follow John’s pink and blue DNA in Child 1’s descendants. Child 1 married K and had one child.
John’s grandchild by Child 1 has one strand of DNA from Child 1’s spouse K and one strand from Child 1 which reads MATJJJJLAB. You can see this because the grandchild carries K’s entire strand, while the other strand, contributed by Child 1, is a mixture of John’s DNA and his wife J’s DNA. In this case, for these buckets, only John’s mother’s pink DNA is being passed on. John’s father’s buckets 4-7 were “washed out” in this generation and the grandchild received grandmother J’s DNA instead.
In the next generation, 3, John’s grandchild married P and had generation 4, the great-grandchild. Generation 4 of course carries a strand from wife P, but the Doe strand now carries less of John’s original DNA – just MA and LAB at the beginning and end of the grouping.
In the next generation, 5, the great-great-grandchild, you can see that now John Doe’s inherited DNA is reduced to only the AB at the right end.
In the next generation, 6, the great-great-great-grandchild carries only the A, and in the final generation, below, the great-great-great-great-grandchild, none of John Doe’s DNA is carried by that descendant in those particular buckets.
Can there be exceptions? Yes. Buckets are sometimes split and the X chromosome functions differently in male and female inheritance. But this example is conceptual, remember.
You always receive exactly half of each parent’s DNA, but after that, how much you receive of an ancestor’s DNA isn’t 50% in each generation. You saw that in our examples where both Child 1 and Child 2 inherited a little more or a little less than 50% of each of John’s parents’ DNA.
Sometimes groups of DNA buckets are passed together and sometimes, the entire bucket or group of buckets is replaced by DNA from “the next generation.”
To summarize for Child 1, from John Doe to generation 7, each generation inherited the following buckets from John, with the final generation, 7, having none of John’s DNA at all – at least not in these buckets.
Now, let’s see how the DNA of Child 2 stacks up.
You can follow the same sequence with Child 2. In the first generation, Child 2 has one strand of John’s DNA and one of their mother’s, J.
Child 2 marries O, Olive, and their child has one strand from O, and one from Child 2.
Child 2’s contributed strand is comprised of DNA from John Doe and mother J. You can see that the grandchild has FA and ALAB from John, but the rest is from mother J.
The grandchild (above) married Q and their child, generation 4, inherits most of John’s DNA, but did drop the A.
Sometimes the DNA between generations is passed on without recombining or dividing. That’s what happened in generation 5, above, and 6 below, with John’s DNA.
Generations 5 (great-great-grandchild) and 6 (great-great-great-grandchild) both receive John’s F and AB, above.
However, in the 7th generation, the great-great-great-great-grandchild only inherits John’s bucket with B. The F and A were both lost in this generation.
This summary of the inheritance of John’s DNA in Child 2’s descendants shows that in the 7th generation, that individual carries only one of John’s DNA buckets, the rest having been replaced by the DNA of other ancestors during the inheritance recombination process in each generation.
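Here is a rough Python tally of both lines, set next to what simple halving would predict. The bucket numbers are read from the text, with two assumptions of mine: John is counted as generation 1 (so his children are generation 2), and in Child 2’s generation 4 the dropped A is the one in bucket 7, which I inferred from the cousin matches described in the text rather than from a diagram.

```python
ALL_TEN = set(range(1, 11))

# Generation -> bucket numbers (1-10) that still hold John's DNA.
child1_line = {
    2: ALL_TEN,               # MATHERSLAB
    3: {1, 2, 3, 8, 9, 10},   # MAT and LAB
    4: {1, 2, 8, 9, 10},      # MA and LAB
    5: {9, 10},               # AB
    6: {9},                   # A
    7: set(),                 # none
}
child2_line = {
    2: ALL_TEN,               # FATHERALAB
    3: {1, 2, 7, 8, 9, 10},   # FA and ALAB
    4: {1, 2, 8, 9, 10},      # inferred: the A in bucket 7 was dropped
    5: {1, 9, 10},            # F and AB
    6: {1, 9, 10},            # F and AB, passed on intact
    7: {10},                  # B
}


def summarize(line):
    """Return (generation, actual bucket count, average expectation) rows."""
    rows = []
    for generation in sorted(line):
        expected = 10 / 2 ** (generation - 2)  # halves each generation, on average
        rows.append((generation, len(line[generation]), expected))
    return rows


for name, line in (("Child 1", child1_line), ("Child 2", child2_line)):
    print(name)
    for generation, actual, expected in summarize(line):
        print(f"  generation {generation}: {actual} buckets (average expectation {expected:g})")
```

Neither line tracks the average. Child 2’s descendants hold 3 buckets in generation 6 where halving predicts well under 1, while Child 1’s line is down to a single bucket by then.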
Half the Equation
The question of how we can identify the profile of a person long dead is not answered by this inheritance diagram, at least not directly, because we don’t KNOW how much of John’s DNA we inherited, or which parts. In fact, that’s what we’re trying to figure out, but first we had to understand how we inherited DNA from John (or not).
Matching with known family members is what actually identifies John’s DNA and tells us which parts of our DNA, if any, come from John.
Let’s say I’m in the first cousin generation and I’m comparing my autosomal DNA against my first cousin from this line. First cousins share common grandparents.
Assuming that they are genetically my first cousin (meaning no adoptions or misattributed parentage), they are close enough that we can both be expected to carry some of our common ancestor’s DNA. I wrote an in-depth article about first cousin matching here, but for our purposes, we know genetically that first cousins are going to match each other virtually 100% of the time.
The reason our autosomal DNA matches with our reasonably close relatives is because we share a common ancestor and have inherited at least a bucket, if not more than one bucket, of the same DNA from that ancestor.
That’s the ONLY WAY our DNA could match at the bucket level, given what we know about inheritance. The only way to get our DNA is through our parents who got their DNA through their parents and ancestors. Now, could we share more than one common ancestral line? Yes – but that’s beyond conceptual, for now. And yes, there is identical by chance (IBC), which in general doesn’t apply to close relatives or to larger buckets. If you want to read more about this complex subject, which is far beyond conceptual, click here.
Now, let’s see how we identify our ancestor’s DNA!
Let’s look at people of the same generation of descendants and see how they match each other. In other words, now we’re going to read left to right across rows, to compare the descendants of child 1 and 2. Previously, we were reading up and down columns where we tracked how DNA was inherited.
Bolded letters in buckets indicate buckets inherited from John, just like before, but buckets with black borders indicate buckets shared with a cousin from John’s other child. In other words, a black border means the DNA of those two people match at that location. Let’s look at the grandchildren of John compared to each other. John’s grandchildren are first cousins to each other.
Our first cousins match on 4 different buckets of John’s DNA: A, L, A and B. In this case, you can see that both individuals inherited some DNA from John that they don’t share with each other, such as their first letters, M for Child 1 and F for Child 2. John inherited those two pieces from different ancestors and passed a different one to each child, so the first cousins don’t match each other on that particular bucket. The letters in their individual buckets are different.
Yes, the first cousins also match on wife J’s DNA, but we’re just talking about John’s DNA here. Now, let’s look at the next generation.
Our second cousins, above, match on four buckets of John’s DNA. Yes, the A bucket was inherited from John’s Mom in one case, and John’s Dad in the other case, but because the letter in the bucket is the same, when matching, we can’t tell them apart. We only “know” which side they came from, in this case, because I told you and colored the buckets pink and blue to illustrate inheritance. All the actual software matching comparison has to go by is the letter in the bucket. Software doesn’t have the luxury of “knowing” because in nature there is no pink and blue color coding.
Our third cousins, above, match, but share only A and B, half as much of John’s DNA as the second cousins shared with each other.
Our 4th cousins, above, are lucky and do match, although they share only one bucket, A, of John’s DNA, which happens to have come from John’s mother.
By the time you get down to the 5th cousins, meaning the 7th generation, the cousins’ luck has run out, because these two 5th cousins don’t match on any of John’s DNA.
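The row-by-row comparison can be sketched in Python as well. As before, the bucket numbers are read from the text, John is counted as generation 1, and Child 2’s generation 4 layout is inferred from the matches described rather than taken from a diagram. Notice that the comparison only looks at letters, never at pink or blue, because that is all matching software has to go by.

```python
# The strands John passed to each child, from the text.
CHILD1_STRAND = "MATHERSLAB"
CHILD2_STRAND = "FATHERALAB"

# Generation -> bucket numbers (1-10) still holding John's DNA in each line.
line1 = {3: {1, 2, 3, 8, 9, 10}, 4: {1, 2, 8, 9, 10}, 5: {9, 10}, 6: {9}, 7: set()}
line2 = {3: {1, 2, 7, 8, 9, 10}, 4: {1, 2, 8, 9, 10}, 5: {1, 9, 10}, 6: {1, 9, 10}, 7: {10}}

COUSIN_LABELS = {3: "1st cousins", 4: "2nd cousins", 5: "3rd cousins",
                 6: "4th cousins", 7: "5th cousins"}


def shared_john_buckets(generation):
    """Buckets of John's DNA on which the two cousins of a generation match."""
    carried_by_both = line1[generation] & line2[generation]
    # A match requires the same letter in the same bucket; color is invisible.
    return sorted(
        bucket for bucket in carried_by_both
        if CHILD1_STRAND[bucket - 1] == CHILD2_STRAND[bucket - 1]
    )


results = {}
for generation, label in COUSIN_LABELS.items():
    buckets = shared_john_buckets(generation)
    letters = "".join(CHILD1_STRAND[bucket - 1] for bucket in buckets)
    results[label] = letters
    print(f"{label}: {len(buckets)} shared buckets {letters or '(none)'}")
```

The counts come out 4, 4, 2, 1 and 0, the same stair-step as in the comparisons above. Bucket 1 never counts as a match, even when both cousins carry John’s DNA there, because one holds M and the other holds F.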
Most 5th cousins don’t match and few 6th cousins match, at least not at the default thresholds used by the testing companies – but some do. Remember, we’re dealing with matching predictions based on averages, and actual individual DNA inheritance varies quite a bit. Lies, damned lies and statistics again!
You can adjust your own thresholds at GedMatch, in essence making the buckets smaller, so increasing the odds that the contents of the buckets will match each other, but also increasing the chances that the matches will be by chance. Again, beyond conceptual.
While this is how matching worked for these comparisons of descendants, it will work differently for every pair of people who are compared against each other, because they will have, or not have, inherited different (or the same) buckets of DNA from their common ancestor. That’s a long way of saying, “your mileage will vary.” These are concepts and guidelines, not gospel.
Now, let’s put these guidelines to work.
Matching People at Testing Companies
Ok, so now let’s say that I match Sarah Doe. I don’t know Sarah, but we are predicted to be in the 2nd or 3rd cousin range, based on the amount of our DNA that we share.
As we know, based on our inheritance example, amounts of shared DNA can vary, but we may well be able to discern a common ancestor by looking at our pedigree charts.
Sure enough, given her surname as a hint, we determined that John Doe is our common ancestor.
That’s great evidence that this DNA was passed from John to both of us, but to prove it takes a third person matching us on the same segment, also with proven descent from John Doe. Why? Because Sarah and I might also have a second common genealogical line, maybe even one we don’t know about, that isn’t on our pedigree chart. And yes, that happens far more than you’d think. To prove that the DNA Sarah Doe and I share is actually from John Doe or his wife, we need a third confirmed pedigree and DNA match on that same bucket.
A Circle is Not a Bucket
If you just said to yourself, “but Ancestry doesn’t show me buckets,” you’re right – and a Circle is not a bucket. A Circle means you match someone’s DNA and have a common tree ancestor. It doesn’t mean that you or any Circle members match each other on the same buckets. A bucket, or segment information, tells you if you match on common buckets, which buckets, and exactly where. You could match all those people in a Circle on different buckets, from completely different ancestors, and there is no way to know without bucket information. If you want to read more about the effects of lack of tools at Ancestry, click here and here.
Matching multiple people on the same buckets who descend from the same ancestor through different children is proof – and it’s the only proof except for very close relatives, like siblings, grandparents, first cousins, etc. Circles are hints, good hints, but far, far from proof. For buckets, you’ll need to transfer your Ancestry results to Family Tree DNA or to GedMatch, or preferably, both.
A group of at least three people who match on the same buckets and share an ancestor is called a triangulation group. I’m most comfortable when the members of that group descend from at least two different children of John. In other words, the first common ancestor of the matches is John and his wife, not their children.
The reason I like the different children aspect is because it removes the possibility that people are really matching on the downstream wives’ DNA, and not John’s. In other words, if you have two people who match on the same buckets, A and B above, who both descend from John’s Child 1 who married K, they also will share K’s DNA in addition to John’s. So their match to each other on a given bucket might be through K’s side and not through John’s line at all.
Let’s say A and B have a match to unknown person D who is adopted and doesn’t know their pedigree chart. We can’t make the presumption that D’s match to A and B is through John Doe and Jean, because it might be through K.
However, a match on the same buckets to a third person, C, who descends through John’s other child, Child 2, assuming that Child 2 did not also marry into K’s (or any other common) line, assures that the shared DNA of A and B (and C) in that bucket is through John or his wife – and therefore D’s match to A, B and C on that bucket is also through the same common ancestor.
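The triangulation logic can be written as a simple check. This Python sketch uses made-up data: the people A, B and C, the buckets each pair matches on, and the function names are all hypothetical, and real tools work with chromosome positions rather than bucket numbers.

```python
from itertools import combinations

# Hypothetical testers and the child of John each one descends from.
descends_from = {"A": "Child 1", "B": "Child 1", "C": "Child 2"}

# Hypothetical pairwise matches: the bucket numbers each pair shares.
pair_matches = {
    frozenset({"A", "B"}): {9, 10},
    frozenset({"A", "C"}): {9, 10},
    frozenset({"B", "C"}): {10},
}


def triangulated_buckets(group):
    """Buckets on which every pair of people in the group matches."""
    common = None
    for pair in combinations(group, 2):
        buckets = pair_matches.get(frozenset(pair), set())
        common = buckets if common is None else common & buckets
    return common or set()


def is_solid_triangulation(group):
    """At least three people, at least two of John's children represented,
    and at least one bucket on which every pair matches."""
    enough_people = len(group) >= 3
    different_children = len({descends_from[person] for person in group}) >= 2
    return enough_people and different_children and bool(triangulated_buckets(group))


print(triangulated_buckets(["A", "B", "C"]))
print(is_solid_triangulation(["A", "B", "C"]))
print(is_solid_triangulation(["A", "B"]))
```

A and B alone fail the check, since both descend through Child 1 and could be matching through K. Adding C, who descends through Child 2, is what ties bucket 10 to John or his wife.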
If you want to read more about triangulation, click here.
The beauty of autosomal DNA is that we carry some readily measurable portion of each of our ancestors, at least the ones in the past several generations, in us. The way we identify that DNA and assign it to that ancestor is through matching to other people on the same segments (buckets) who also descend from the same ancestor or ancestral line, preferably through different children. In many cases, after time, you’ll have a lot more than 3 people descended from that ancestral line matching on that same bucket. Your triangulation group will grow to many – all connected by the umbilical lifethread of your common ancestors’ DNA.
As you can see, the concepts, taken one step at a time, are pretty simple, but the layers of things that you need to think about can get complex quickly.
I’ll tell you though, this is the most interesting puzzle you’ll ever work on! It’s just that there’s no picture on the box lid. Instead, it’s an incredible real-life journey to the frontiers inside of you to discover your ancestors and their history :) Your ancestors are waiting for you, although my ancestors have a perverse sense of humor and we play hide and seek from time to time!