While working on a client’s mitochondrial DNA report, I came across the worst case I’ve seen in a long time of mismatches being shown as matches at Ancestry.com. This has been a pervasive problem for a long time.
10 Point Question – If you match another person exactly at every location in HVR1 and HVR2, must you have the exact same haplogroup?
Answer: Most of the time.
You didn’t think this was going to be easy, did you?
Because Family Tree DNA is the only company to test to the full sequence level, their clients are going to have far more advanced, detailed and accurate haplogroup assignments than people who test at companies who only offer the HVR1+HVR2 regions.
Therefore, as in this case, we see a client whose haplogroup is H1. The “1” part of H1 is determined by location 3010A, a position in the coding region that can only be read by full sequence testing. So at Ancestry, and in other databases outside of Family Tree DNA, we would expect to see matches to both haplogroup H and H1 (assuming the database allows outside results to be entered), and possibly some other H haplogroups as well, if their HVR1+HVR2 mutations match those of our H1 person.
OK – next 10 point question. Will someone who is haplogroup H match someone who is haplogroup M or N or some other haplogroup?
Answer: No, not an exact match, but they may share some common mutations.
Then why does Ancestry show them as matches when a simple comparison would eliminate them?
The answer is twofold. Part of the issue could be how Ancestry assigns haplogroups. We really don’t know how they do it, and they aren’t as forthcoming about these things as Family Tree DNA is. The second, and probably bigger, issue is that Ancestry allows people to enter their own data from other labs into its database, including their haplogroup, apparently without any verification process. So, in essence, Ancestry has muddied its own waters.
My client’s 251 matches at Ancestry were all shown with “0” differences, which means they are supposed to be exact matches. That’s exciting to see, except it isn’t real.
I clicked on the “download matches” button, which dumps everything into a spreadsheet, a wonderfully handy feature. As we talk about this, keep in mind that my client had a total of 5 mutations in the HVR1+HVR2 regions, so based on “0” differences, everyone on that list should share all of those mutations with no additional mutations.
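To make the “0 differences” rule concrete, here is a minimal Python sketch of the comparison. The mutation lists are invented for illustration and are not my client’s actual results, and the function name `count_differences` is my own. We don’t know how Ancestry actually computes its difference count, so treat this as one reasonable definition, not theirs.

```python
def count_differences(profile_a, profile_b):
    """Count mutations present in one profile but not in the other."""
    return len(set(profile_a) ^ set(profile_b))


# Hypothetical HVR1+HVR2 mutation lists, invented for illustration only.
client = {"16093C", "16519C", "152C", "263G", "315.1C"}
same = {"16093C", "16519C", "152C", "263G", "315.1C"}
one_extra = same | {"16311C"}
unrelated = {"16223T", "16290T", "16319A", "73G"}

print(count_differences(client, same))       # identical lists
print(count_differences(client, one_extra))  # one additional mutation
print(count_differences(client, unrelated))  # nothing in common
```

Under this definition, only a profile with the same 5 mutations and nothing extra earns a “0”.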
Here’s what I found after sorting the spreadsheet.
Exact matches = 32, hardly the 251 displayed on the match page.
Of the 251 “exact” matches shown, the haplogroup breakdown is shown below:
A – 10 (Native American)
B – 7 (Native American)
C – 3 (Native American)
D – 2 (Native American)
H – 154, over half with no matching markers at all to client
HV – 10
I – 5
J – 5
K – 4
L – 12 (African)
M – 4
N – 5
R – 6
T – 7
U – 11
V – 3
W – 1
X – 1
Z – 1
But even this isn’t the worst part. Of the 251 matches shown with “0” differences, 32 are actually exact matches. Of those exact matches, we find 4 different haplogroups, including 3 in haplogroup M, a generally Asian haplogroup which is rare as hen’s teeth here in the US. Hmmm... anyone spot a problem?
Of the remaining 219, 162 share no mutations whatsoever with the client, so they not only shouldn’t be shown with “0” differences, they shouldn’t be shown at all. That leaves 57 matches that share at least one marker but aren’t exact matches, and those too are incorrectly shown with “0” differences.
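The sorting I did by hand in the spreadsheet can be sketched in a few lines of Python. This assumes each downloaded match has already been reduced to a set of mutations. The names `classify` and `tally` and the sample data are hypothetical, and the real spreadsheet layout is not reproduced here.

```python
from collections import Counter


def classify(client, candidate):
    """Label a candidate as 'exact', 'partial', or 'none' relative to the client."""
    client, candidate = set(client), set(candidate)
    if client == candidate:
        return "exact"    # same mutations, nothing extra
    if client & candidate:
        return "partial"  # at least one shared mutation
    return "none"         # no shared mutations at all


def tally(client, candidates):
    """Count how many candidates fall into each category."""
    return Counter(classify(client, c) for c in candidates)


# Hypothetical data, invented for illustration only.
client = {"16093C", "16519C", "152C", "263G", "315.1C"}
candidates = [
    {"16093C", "16519C", "152C", "263G", "315.1C"},
    {"16519C", "263G"},
    {"16223T", "73G"},
    {"16223T", "16290T", "16319A"},
]

print(tally(client, candidates))
```

Only the “exact” bucket belongs on a list of matches with “0” differences.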
So let’s give Ancestry a report card on this. 32 out of 251 correct equals 13% correct.
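For anyone who wants to check my arithmetic, here is a quick sketch using only the counts given in this post. The variable names are mine.

```python
total_shown = 251  # matches listed with "0" differences
exact = 32         # truly exact matches
no_shared = 162    # matches sharing no mutations with the client

# Whatever is left shares at least one marker but is not exact.
partial = total_shown - exact - no_shared

percent_correct = round(100 * exact / total_shown)
percent_wrong = 100 - percent_correct

print(partial, percent_correct, percent_wrong)
```

This gives 57 partial matches, 13% correct and 87% wrong.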
Last 10 point question – What letter grade do you get for 13% right, which is 87% wrong?
In my book, and in any school I ever attended, that was a big fat F!
And no, this is not just a recently introduced software bug. It’s been like this forever.
So now that we know how well Ancestry does on basic things like mitochondrial DNA matches, which are exceedingly easy, anyone feel good about how they’ll do with autosomal DNA? Comparatively speaking, that’s the tough stuff.