Recently, I had the opportunity to compare 2 children’s autosomal DNA against both of their parents. Since children obtain 50% of their DNA from each parent (except for the X chromosome in males), it stands to reason that all valid autosomal matches to these children not only will, but must match one parent or the other. If not, then the match is not valid – in other words – it’s an identical match by chance.
If you remember, the definition of a match by chance, or IBC (identical by chance) is when someone matches a child but doesn’t match either parent.
This means that the DNA segments, or alleles, just happen to line up so that it reads as a match for the child, by zigzagging back and forth between the DNA of both parents, but it really isn’t a valid genealogical match.
You can read about how this works in my article, How Phasing Works and Determining IBD Versus IBS Matches and also in the article, One Chromosome, Two Sides, No Zipper.
The absolute best way to determine if a match is a valid match or not, valid meaning that the DNA was handed down by ancestors, not a match by chance, is to compare a child’s matches against both parents. By doing that, we can quickly identify and isolate matches that aren’t real.
In the example above, you can see that Mom contributed all As to me and Dad contributed all Cs to me. Joe has alternating As and Cs, so he is a match to me on every location. However, he only matches my parents on half of their locations, so he is not a match to them, because it’s only chance that caused him to match me on those allele values in that order.
DNA matching programs have to take into consideration both allele values in their match routines, since you carry a value from your mother (A above) and a value from your father (C above), and they are not labeled as to which parent they come from.
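The zigzag effect can be sketched in a few lines of Python. This is only a toy model under stated assumptions, not how Family Tree DNA or any other vendor actually matches: each person is a list of unlabeled allele pairs, two people "match" at a location if they share at least one allele value there, and the eight locations and the helper names (`shares_allele`, `matching_fraction`) are made up to mirror the example.

```python
def shares_allele(genotype_a, genotype_b):
    """True if two unphased allele pairs have at least one value in common."""
    return bool(set(genotype_a) & set(genotype_b))


def matching_fraction(person_a, person_b):
    """Fraction of locations where two people share at least one allele value."""
    hits = sum(shares_allele(a, b) for a, b in zip(person_a, person_b))
    return hits / len(person_a)


LOCATIONS = 8  # hypothetical number of tested locations

# Toy data: Mom can only pass on A, Dad can only pass on C.
mom = [("A", "A")] * LOCATIONS
dad = [("C", "C")] * LOCATIONS

# The child carries one value from each parent, with no label saying which is which.
child = [("A", "C")] * LOCATIONS

# Joe alternates between A and C from one location to the next.
joe = [("A", "A") if i % 2 == 0 else ("C", "C") for i in range(LOCATIONS)]

joe_vs_child = matching_fraction(joe, child)
joe_vs_mom = matching_fraction(joe, mom)
joe_vs_dad = matching_fraction(joe, dad)

print(f"Joe vs child: {joe_vs_child:.0%}")
print(f"Joe vs Mom:   {joe_vs_mom:.0%}")
print(f"Joe vs Dad:   {joe_vs_dad:.0%}")
```

In this toy data Joe lines up with the child at every location, but with each parent at only half of them, which is the pattern that marks a match by chance.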
Valid matches will also match one parent or the other. After all, the child received all of their DNA from one parent or the other, so for someone to be a valid genealogical match to a child, they must match a parent.
Some time back, when I was matching to my own mother’s DNA, I noticed that I matched her on about 40% of my matches, which left 60% to either be matches to my father or identical by chance.
Notice, I’m not talking about IBS, or identical by state, because that phrase is used to mean both identical by chance and identical by population. Identical by population means that you did in fact inherit the DNA from an ancestor, but it’s either too far back in time to determine which ancestor, or that segment was present in a specific, probably endogamous population, and you could have inherited it from any number of ancestors.
So, identical by population is identical by descent, but we just can’t tell who we received that DNA from.
- IBC – identical by chance – not a valid match – you happen to match someone else on a particular segment, but it’s because the match software is jumping back and forth from your mother’s side to your father’s side.
- IBD – Identical by descent – you share a common segment of DNA because you and another person(s) inherited that DNA segment from a common ancestor who you can identify.
- IBS – Identical by state – currently used to mean both IBC and identical by population, where identical by population means that you did inherit this DNA from a common ancestor, but it’s so far back you can’t determine who, or that segment is so common within a particular population you could have inherited it from a number of people.
Now a 60-40 parental split is certainly possible, especially if one parent was from an endogamous population, which would mean more matches, or one parent immigrated more recently from the old country, which would mean fewer matches.
However, without my father’s DNA, which is not available, we’ll never know.
Since that time, I have obtained access to two sets of DNA results, each with a child plus both parents, so I wanted to take a look at how IBD versus IBC stacked up. These comparisons were done at Family Tree DNA.
|Total Matches||Non-Matching Either Parent||Percent Non-Matching|
Based on other evidence I’ve seen, this percentage seems about right, but the amount of shared DNA and the largest segment size surprised me. Keep in mind that the smallest possible segment size is 7cM which is Family Tree DNA’s lowest single segment threshold to be counted as a match (assuming you meet the 20cM total threshold first.) If you match, they show you your matching DNA down to 1cM, but these tables are measurements by the 7cM matching criteria only.
In plain English, this means that in this case, 12% and 13% of these matches were identical by chance, or false matches. These matches included people who shared up to 57cM of DNA and the largest block was 15cM.
|Largest Shared cM||Largest Longest Block|
Could something else be causing this? Certainly. Some of these non-matches could be read errors in the files. I’d certainly want to take a look at that if any of these became critical. Another possibility could be that valid match segments are “stitched together” by IBC segments creating longer segments in the child.
An alternative way to check validity would be to upload the files to GedMatch and see if the pattern continues using the same match criteria. Of course, testing at multiple labs and uploading the results to compare at GedMatch likely removes the issue of read errors in the first set of files. And if you really, REALLY, want to know, you can look at the raw data files themselves.
Just so you know, this wasn’t an anomaly with just one high read. Here are the highest 25 entries from Child 2, or about one fifth of her total mismatches. Only a few were in the 3-5th cousin range. None were closer. Most were 4th or 5th to remote.
If you want to do these comparisons yourself, they are easy to do if you have a child and both parents who have tested at Family Tree DNA.
On your Family Finder matches page, at the bottom, in the right corner, there is a button to download matches.
I download the matches into separate spreadsheets for the child, mother and father. I then color all of the rows pink in the mother’s results, and blue in the father’s results, then copy all three to a common spreadsheet. You can then sort on the match name and this is what you’ll see.
What you’re looking for is white (child) rows that don’t match either a blue row (father) or a pink row (mother.) Don’t worry about pink or blue rows that don’t have matches. It’s normal for the DNA not to be passed to the child part of the time, so these are expected.
In this example, all white rows matched one parent or the other, except for Winnie Whines. I colored this row red and added the Comment column where I entered the number of this non-matching entry. When I’m finished comparing and coloring, then all I have to do is sort that column, bringing all of the nonmatching rows together. I copied those nonmatching entries into a separate sheet so I could sort those alone and obtained the largest shared and longest segments. To determine the percent, just divide the total number of nonmatches, in this case, 133, by the child’s total number of matches, in this case, 959, giving a non-parent-match percentage of 13.9%.
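If you would rather not color rows by hand, the same comparison can be sketched in Python. This is a minimal sketch, not a Family Tree DNA tool: the column name `Full Name` is an assumption about the downloaded file, so check the header row of your own download, and all names other than Winnie Whines are invented for illustration. Matching on names alone also assumes no two different matches share the same name.

```python
import csv


def load_match_names(path, name_column="Full Name"):
    """Read one downloaded match list and return the set of match names.

    The column name is an assumption; adjust it to your file's header row.
    """
    with open(path, newline="", encoding="utf-8-sig") as handle:
        return {row[name_column].strip() for row in csv.DictReader(handle)}


def non_parent_matches(child, mother, father):
    """Child matches that appear in neither parent's match list."""
    return child - (mother | father)


def percent_non_matching(child, mother, father):
    """Percentage of the child's matches that match neither parent."""
    if not child:
        return 0.0
    return 100 * len(non_parent_matches(child, mother, father)) / len(child)


# Tiny in-memory demo with made-up names (no files needed).
child_matches = {"Ann Example", "Bob Sample", "Cara Demo", "Winnie Whines"}
mother_matches = {"Ann Example", "Cara Demo", "Mona Only"}
father_matches = {"Bob Sample", "Fred Only"}

unmatched = non_parent_matches(child_matches, mother_matches, father_matches)
demo_percent = percent_non_matching(child_matches, mother_matches, father_matches)
print(sorted(unmatched), f"{demo_percent:.1f}%")

# The arithmetic from the text: 133 nonmatches out of 959 total matches.
article_percent = round(100 * 133 / 959, 1)
print(f"{article_percent}%")
```

With real data, you would call `load_match_names` once each for the child, mother and father files and pass the three sets to `non_parent_matches`. Parent-only names, like the pink and blue rows without a white partner, are simply ignored.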
So, the take-home message is that not all small segment matches are genealogically irrelevant and not all larger segment matches are genealogically relevant. Thank goodness we have tools and processes to begin to tell the difference.
So, if you don’t have both parents to compare to, and you’re wondering why you just can’t find a common ancestor with someone you match, the answer might be that they fall into your 12 or 13% that are IBC matches.
If you perform this little exercise, comparing a child to both parents, please feel free to post your results in the comments section along with any commentary about endogamous populations or special circumstances. It really doesn’t take long, probably about an hour total, and the results are really interesting. Plus, you’ll have eliminated all those irrelevant matches.
I’ll be writing more about this interesting experiment in coming days.