Geno 2.0, WTY, mtDNA Full Sequence Participants, and More

As we know, some of the WTY (Walk the Y) discoveries were used in the creation of the Geno 2.0 chip.  The entire point, of course, for the WTY test is to sequence the Y chromosome to search for new mutations.  As we can see by the plethora of new L SNPs on the SNP Tree at ISOGG, this has been quite successful.

What you may not know is that the WTY product has two prices: a price subsidized by Family Tree DNA if you agree to allow the use of the data for scientific research, and the private price.  The application for the WTY at Family Tree DNA clarifies the expectations and the pricing.

Therefore, anyone who did not pay the higher, private price of $1500 has agreed to let their results be used for research. In essence, those who did agree to participate in research received a significant discount, about 37%, paying $950.

Thank you Bennett and Max for underwriting this important scientific effort!

Speaking with Bennett about the process of vetting the new Geno 2.0 chip, he indicated that many of the WTY samples used were internal, meaning not customers.  Only 23 public WTY samples were used.

Spencer Wells, today, clarified the situation for those few whose results were used:

“The WTY and whole-mtDNA genome customers used in the chip validation process will receive their results when the results section of the website goes live for all Geno 2.0 participants this fall.  Your data belongs to you.  There will be no charge to them for this, and we hope that they enjoy the new Geno 2.0 experience and will become cheerleaders for the project.”

I notice, in addition to the WTY samples used, this also extends to any mtDNA full sequence results used as well.  Thank you Spencer!

Now, of course the next question will be what happens for those who have already placed orders.  Spencer says, “They will be able to cancel their orders, or give the kit to a friend or family member (which of course we would prefer…;-).  I really want to encourage them to help us expand our database.  It will benefit everyone, themselves included, and will allow us to make the 2.0 experience richer for everyone – especially the community features.  They will receive the whole Geno 2.0 experience, just like people who purchase kits.  We’ll provide them with GPIDs to use for logging in via email.”

In addition, an article appeared in BioArray News today by Justin Petrone that provides some additional information on the Illumina BeadChips used.  It’s free, but you do have to register to read it.  I’m providing the highlights below that add to the information we’ve already received.

Justin interviewed Spencer, who provides background information on the Genographic project.  He mentions that about 520,000 people have participated to date.

In addition to discussing the SNPs on chips information that Spencer has previously provided to our community, he also says that “National Geographic and its partners are preparing two publications that discuss the new chip and have submitted an abstract for the American Society of Human Genetics annual meeting, which will be held in San Francisco in November.”

Spencer also spoke a little about the new National Geographic online community capability.  This will be in addition to the option for participants to transfer their results to Family Tree DNA, for free.  He says that “participants will have the opportunity to choose to register for the Genographic online community to connect with other participants and find shared ancestry, helping to fill in the gaps between what they know about their recent genealogy and their genetic results.”

Geno 2.0 Answers from Spencer Wells

Lots of folks have had questions about the Geno 2.0 kits and different aspects of the testing.  Dr. Spencer Wells, National Geographic’s Explorer-in-Residence and director of the Genographic Project, has been kind enough to answer some of the questions he’s been receiving.  I know the genetic genealogy community appreciates the continued communication and involvement from Dr. Wells.  Thanks Spencer!!

1.    How many SNPs do we have in the test?

A total of around 146,000 ancestry-informative markers (AIMs):  ~130,000 autosomal and X-chromosomal, ~13,000 Y-chromosomal, and ~3200 mtDNA

2.    What is the difference between the Genographic Project and the 23andMe test?  And Ancestry.com?

Genographic is a non-profit National Geographic research project focused on mapping the human journey, and encompasses three core components:  scientific research, public participation and the Legacy Fund.  Our public participation component is available through the purchase of a Geno 2.0 DNA testing kit.  Our custom-designed genotyping chip looks at the markers outlined above, and is simply the best available platform for the study of genetic ancestry.  For-profit companies, including Ancestry and 23andMe, use slightly modified off-the-shelf chips which were optimized for medical research, not population history.

3.    Do we offer ancestry painting?

I assume you are referring to the chromosomal “painting” on the 23andMe website, and no – at this time we don’t offer this feature.  It is relatively straightforward to implement, however, and if there is sufficient interest among our participants, we may offer it in the future.

4.    Do we give African Americans their Asian percentage?

Everyone receives a breakdown of their regional affiliations, expressed as percentages.  This might include northeast Asian or southeast Asian in African Americans, if such components are present.

5.    Do we plan on adding a West African or East African to the affiliation?

We are continuing to refine our analysis of the chip data, and may be expanding our list of regional affiliations.

6.    How are we different from population finder?

It’s all about the markers:  again, because we have created our chip specifically for the study of ancestry, we feel that it is the most accurate tool for determining population affiliation.  Our AIMs were drawn from more than 450 world populations, and were chosen on the basis of their ancestry informativeness.  We are continuing to refine our analytical methods to provide the best ancestry testing experience available anywhere.

Dr. Wells has been busy answering questions today.  Cece Moore has some additional comments on her blog as well.  http://www.yourgeneticgenealogist.com/2012/07/a-short-update-from-spencer-wells-on.html

Adoptee Resources and Genetic Genealogy

Genetic genealogy has been a God-send for adoptees, especially those who have had no luck unsealing records or otherwise determining their parentage.  I write DNA reports for lots of adoptees.  There is nothing more rewarding than an adoptee “happy ending,” someone who has found their family.  Nothing makes you appreciate your family more than working with people who can’t find theirs.

Men, especially, are fortunate, because the Y chromosome typically follows the surname, which means that they may have a very strong match with a specific surname.  Even though this doesn’t identify the specific person, it’s certainly a very large step in the right direction.  In more than one case, it has led us ultimately to the right person, confirmed by additional autosomal tests on family members.

Nearly all adoptees take the autosomal tests as well, Family Finder at Family Tree DNA and the 23andMe test.  This allows them to fish in two pools, and both provide a list of matches.  The new Ancestry.com test, even though it’s new and we have no experience working with it yet, promises a third pool for adoptee fishing.

Genetic genealogy for adoptees is slightly different than for the rest of us.  For adoptees, you’re not so much looking for older genealogy, you’re looking to use common autosomal DNA matches to identify any common ancestor between two matches, then use that information to track the family forward in time.  You’re ultimately looking for very recent genealogy, their parents.

A group recommended for adoptees doing DNA testing is AdoptionDNA in the Yahoo groups. This group includes search angels and folks who are developing specially designed software to work with the GEDCOMs of adoptees’ matches.

Furthermore, I strongly recommend the DNA Adoption group at this link, and their classes for how to work with autosomal DNA, whether you are an adoptee or not.

While not specific to adoptees, the ISOGG list at Yahoo focuses on Genetic Genealogy.  They also sponsor a Newbie forum if that is more your speed.

Dick Hill, a genetic genealogist, himself an adoptee, succeeded in finding his birth family.  His story is particularly inspiring, and his book, Finding Family, will be released shortly.  Dick created this website to assist other adoptees with information and free resources.   http://www.dna-testing-adviser.com/

Here are some additional resources for adoptees:

http://www.americanadoptioncongress.org/

http://www.adoptiondatabase.org/

http://www.isrr.org/

http://www.adoptioninstitute.org

http://www.childwelfare.gov/adoption/search/

http://www.childwelfare.gov/systemwide/laws_policies/statutes/infoaccessap.cfm

Watch for new programs from the Mixed Roots Foundation beginning in the fall of 2012 including the Global Adoptee Genealogy Project. http://www.mixedrootsfoundation.org/

Geno 2.0 – Q&A with Bennett Greenspan

Bennett Greenspan, President of Family Tree DNA, was gracious enough to call me with the answers to several questions and responses to comments and speculation on blogs and lists today. He wants to thank everyone for their interest and personal support for the ongoing research and the new product.  I am putting these in a question and answer format.

Q:  Can I purchase the Geno 2.0 kit elsewhere?

A:  The Geno 2.0 product can only be purchased through the National Geographic Society.  This product cannot be ordered from Family Tree DNA.

Q:  Will there be a way to move my Geno 2.0 results to the Family Tree DNA database?

A: As with the original National Geographic product, we plan to have a link on the Geno 2.0 personal page to allow people to upload their results.  With the Geno 2.0 deep SNP results, they will be able to enter their Family Tree DNA account number, if they have an existing account at Family Tree DNA, and their deep SNP results will be included with their other test results on their personal page.

Q:  Does Family Tree DNA plan to offer a test that will be more extensive than the new Genographic test for the Y chromosome?

A:  No. The most extensive test for obtaining YDNA SNP data is available on the Geno 2.0 chip and Family Tree DNA has no plans to compete with its partner.  STR results will not be supplied by Geno 2.0 and all regular genealogical marker tests should be ordered through Family Tree DNA.  These two tests go hand in hand.

By way of example, in haplogroup R-M222 – the new Geno chip includes discoveries of at least three unique SNP’s downstream of R-M222.

These 10,000 new SNPs will provide, for almost everyone, one or two additional clades (subhaplogroups) down the tree from where they are located today.  For some people, these will reach into a genealogical timeframe, connecting their SNPs and their STR data.  The STR tests will then be used to further augment the Geno 2.0 SNP tests for genealogical comparisons within families.

Q:  When will the new Y tree be available?

A:  FTDNA is vetting the Y tree in conjunction with the Genographic Project prior to the release of these data.  The release won’t occur until they have had enough samples to fully vet the 12,000 tree SNPs, confirming the positions on the tree and that all SNP’s are working correctly.

Q:  What is the difference between the full mitochondrial sequence (FMS) test and the Geno 2.0 test for mitochondria?

A:  Chips can only tell you what is programmed on them.  The Geno 2.0 test is not as complete as the FMS.  Geno 2.0 includes all mtdna SNPs approved for research purposes at Family Tree DNA plus all known mutations found in Genbank.  The Geno 2.0 chip includes a total of about 3,100 locations, more than any other product using this same technology.

This test is very complete for European-centric haplogroups, such as H.  However the test is anthropological in nature, not genealogical.  This means that while you will receive your haplogroup assignment to the same level as a full sequence test, you will not receive other genealogical information that could be critically important to your research.  (Private SNP’s that are unknown will not be ‘discovered’ via chip testing).

If you want your anthropological information, meaning haplogroup information only, then the Geno 2.0 kit is the way to go.

Geno 2.0 has 50% more mtDNA SNP’s than the next best chip technology for mtDNA.  The only thing better is the full sequence test.  The full sequence test is the only test that can be universally used for scientific research as well.

Q:  There seems to be some confusion surrounding what products to order for what purposes.

Geno 2.0

Product – Purchase?

  • Y DNA – 12,000 SNPs – Deep Ancestry – Haplogroup identification: Yes
  • Mitochondrial DNA – Anthropology – Deep Ancestry – Haplogroup Identification: Yes
  • Ethnicity – Worldwide Populations – Ancestry Informative Markers – Deep Ancestry – 137,000 total SNP locations – covers many SNPs not in Family Finder: Yes

FTDNA Products

Product – Purchase?

  • Y DNA regular STR tests, 12, 25, 37, 67 and 111 markers: Yes
  • Mitochondrial DNA tests for genealogical comparisons: Yes
  • Family Finder – for genealogical matching – cousin matching provided from Family Tree DNA database: Yes
  • Y DNA deep clade test: Order Geno 2.0 unless time is of the essence
  • Y DNA WTY – after running Geno 2.0 on kit, discuss with Family Tree DNA: Case by case

National Geographic – Geno 2.0 Announcement – The Human Story

Have you ever dealt with something so massive and overwhelming it took a few days just to get your head wrapped around it?  Well, that’s how I’ve been feeling about the new National Geographic Geno 2.0 announcement.  It’s not just what has been announced, but the utterly massive amount of scientific research behind the scenes, and what it means to the rest of us.

If you think of all of the discoveries and progress that have been made in the 12 years since the advent of genetic genealogy, what you’re about to hear today dwarfs it all.  Hold on tight – this is a white knuckle ride of a lifetime.  The day I heard about this, I wandered around somewhat starry-eyed in amazement and kept muttering something terribly intelligent like “Wow, oh Wow.”

I’d like to share with you some of today’s big news and hope that you too share my sense of awe to be alive in such an exciting time, and not only to have a front row seat, but to be participating in making history.  This isn’t a movie, it’s the real McCoy!

Let’s start with a bit of history about Nat Geo 1.0, the Genographic Project.  Fasten your seatbelt, your E ticket ride starts here and now!

Nat Geo 1.0

Seven years ago, in April 2005, the National Geographic Genographic project was announced. The goal was to sell a total of 100,000 kits over 5 years to help fund the indigenous part of the project, which was to collect samples from indigenous peoples around the world to better understand population migration.

According to Nat Geo, this has been the most successful program they have ever undertaken.  That in and of itself is an amazing statement, especially considering that there was a lively debate within Nat Geo prior to the project launch.

Someone opined to Spencer Wells that they wouldn’t even sell 10,000 kits, let alone 100,000.  Well, they were wrong: 10,000 kits were sold the first day alone.  I’m guessing that Bennett and Max at Family Tree DNA, whose test kits Nat Geo uses, had a sense of controlled panic about that time.  The 100,000 kits were sold in the first 8 months and they still sell between 40,000 and 50,000 kits per year today.

How is that project doing?  Well, it was scheduled to run for 5 years, and it’s now into its 7th year.  They have collected over 75,000 samples from indigenous people and on the public side, over 500,000 people in over 130 countries have bought kits to help fund the research.  32 publications either have been released or will be shortly. Of the 45 million dollars the project has grossed, National Geographic has contributed more than 1.7 million dollars to the Legacy Fund for investment back into the indigenous communities that participated in the Genographic project.

You might recall that the original Nat Geo project only tested 12 markers for men and the HVR1 region on the maternal side.  At that time, 7 years ago, $99 for each of those was a great deal and the projects received a lot of new participants.  About 20% of the Nat Geo participants transferred their results to Family Tree DNA, for free, so they could join projects and participate in genetic genealogy.

Today, 12 markers is quite light and so is HVR1 testing alone.  Project administrators cringe when we see those, because we know it’s really not enough to do much with today.  We’ve learned so much in the past 7 years.  You don’t realize how much things have changed until you take a minute to look back.

At the same time we were learning, technology was also advancing.  Seven years ago, running autosomal tests was simply cost prohibitive. If you consider that computer technology has decreased in price and doubled in speed every year or two (Moore’s Law), the advances in DNA sequencing technology and understanding are moving in the same directions (increased capability and decreased costs) by a factor of 5 as compared to computer technology. Literally, we are moving at the speed of light.  See, I told you to hold on.  I meant it!

Geno 2.0 – The Big Announcement

It’s amazing that something this big has been kept this quiet.  Those of us involved have been bursting at the seams with excitement, and today is the big day.  Last night about 9 o’clock we received word that the countdown had begun.

For a look at the new National Geographic webpage, go to www.genographic.com.  This is the heart of the new Geno 2.0.

Geno 2.0 is still comprised of the 3 core components as before, the indigenous portion, the Legacy fund and the public participation portion.  However the technology is changing, dramatically, and the public participation arena is expanding.   Public participation will now include some “citizen science” projects, grants, an educational segment meaning kits in classrooms, and community based projects.  All of this is made possible by advances in the core sciences and technology.  This, plus the focus of the “Dream Team” of genetic genealogy and population genetics.

Thankfully, Spencer Wells at National Geographic and Bennett Greenspan and Max Blankfeld at Family Tree DNA prepared us in advance for what was coming, as much as you can prepare for a technological tsunami!

Let’s take a look at the technology and scientific advances that have occurred and what it means to us today.

New Chips and New Partnerships

The days of sequencing 12 markers in the lab are gone forever, replaced by high-speed sequencing that looks at half a million markers, or more, at a time, and for the same price as a 12 marker test and the mitochondrial DNA test, together, would have cost in Nat Geo 1.0.

However, when you’re looking at just the Y DNA and the mitochondrial, you’re missing 98% of the human genome, the part that isn’t Y or mitochondrial DNA.  And that 98% holds many secrets, the secrets of our ancestors.

The National Geographic Society recruited one of the top geneticists in the world at Johns Hopkins, focused on autosomal genetic markers.  He has spent the past two years identifying every known marker relevant to ancestry or population genetics that is NOT medically relevant.  This includes the X and Y chromosomes, mitochondrial DNA and the balance of the autosomal markers.

Are you sitting down?  Here’s the first of several bombs!

Relative to Y-line DNA, in 2010, just 2 years ago, the YCC SNP 2010 tree had a total of just over 800 SNPs that had been discovered.  Today it still hasn’t reached 900.  You can see the current tree at  http://www.isogg.org/tree/index.html.  Notice that all of the L SNPs were discovered by Thomas Krahn in the Family Tree DNA lab with the assistance of Family Tree DNA’s customers and project administrators.  This is truly “crowd-science” in the flash mob sense.

Today, after a concerted effort of discovery involving many people, there are a total of 12,000 Y SNPS and of that, 10,000 of them are unique and new and have never been seen or published before.  This means that your haplogroup will automatically be determined to the furthest branch of the tree with no additional SNPs to be tested.  As this test becomes available to Family Tree DNA clients as an upgrade, it will signal the demise of the deep clade test.

If there is a project administrator sitting next to you, they have just fainted.  The magnitude of this is simply mind-boggling.

Relative to mitochondrial DNA, 3352 unique (non-haplogroup defining) mutations have been discovered.  To measure all of the relevant mitochondrial DNA mutations, including insertions and deletions, over 31,000 probes (locations) are needed on the new high density chips.  Before this new approach, chip technology was unable to account for insertions and deletions, but that has been remedied by a new approach to an old problem.  This means that haplogroups will be determined to their deepest level and they will be accurate, including insertions and deletions critical to haplogroup assignment.

Relative to autosomal DNA, over 75,000 Ancestrally Informative Markers (AIMs) have been discovered and included on the new chip, and that’s after removing any that might be considered medically informative.  This astronomical number of SNPs will allow us to detect ethnicity and improve accuracy on a scale that we’ve never even dreamed about before.  I specifically asked Spencer Wells if this will help resolve those “messy” situations where we have European, Native American and African admixture, and he indicated that it would.  I can hardly wait.  For those of us who have been waiting patiently, and some not so patiently, to be able to identify small amounts of admixture, this is the best news you could ever hope to hear!  I told you that something wonderful was on the way!

Relative to admixture with Neanderthal, Denisovan and Melanesian man, meaning interbreeding, more than 30,000 SNPs have been identified that will signal interbreeding where it occurred between modern humans and ancient hominids.  And yes, this means that it did occur!  So indeed once again, you can begin wondering about your brother-in-law.  He’s probably wondering about you too.

Relative to the X chromosome, it’s included.  The X chromosome, because of its special inheritance pattern, gives us an additional, special tool when working with genetic genealogy.  We’ll cover this in a future blog.

The New Chip

In total, the new SNP count to be included on the new Nat Geo 2.0 chip (photo above) includes both new and known existing SNPs in the following amounts:

  • Autosomal including X – 75,000
  • Neanderthal – 26,000
  • Denisovan – 1,500
  • Aboriginal – 13,000
  • Eskimo – 12,000
  • Chimpanzee – 1,100
  • Y Chromosome – 12,000
  • mtDNA – 31,000

This chip has been designed to distinguish between populations.

OMG – What Happened to the Haplotree?

We’re not done yet with bombshells.

After this new chip was created by Illumina specifically for National Geographic, about 1200 samples were run as proof of concept, including 400 WTY (Walk the Y), 350 mitochondrial full sequence and 500 Y samples.  All of the samples run are checked and tested for all of the SNPs on the chip.  Of course, females’ samples will fail on all of the Y haplogroup locations, etc.

Just based on this test run alone of 900 Y chromosome kits, the haplotree expanded from 862 SNPs to a total of 6153.  If you’ve just said something akin to “Holy Cow,” you’re on the right track.  Imagine what it will do with another 1000 or 10,000 or 100,000 tests.  Right now, we’re making discoveries so fast we can hardly deal with them.

What Does This Mean?

In reality, what this means is that we will very soon use SNPs to determine heritage down to a genealogically meaningful timeframe, meaning 500 to maybe 1000 years.  The standard STR (Short Tandem Repeat) markers we know and love will become the leaves on the branches of the tree, and these will likely be used when there are no more SNPs to determine family groupings and line marker mutations within families.

New National Geographic Geno 2.0 Website

Needless to say, all of this discovery has prompted National Geographic to redo their website entirely.  New maps are forthcoming.  Yeah!!  New maps include the migration maps as well as new haplogroup “heat maps” where the colors are graduated based on frequency.

There are entirely new capabilities too.  The new website will show you as the center of a circle and you’ll be able to contact people who have tested at Nat Geo who are located near to you in the circle.  Those closest to you, you’re most closely related to.  Further away, more distantly related.  Before, there was no matching between Nat Geo participants.

And yes, Geno 2.0 participants will still be able to transfer into Family Tree DNA for free.  I hope they make that option much more visible or interactive.

A New Test Kit

Anyone wanting to participate in Geno 2.0 will have to order a new kit from National Geographic.  The previous Nat Geo kits, if you recall, were anonymous unless you chose to transfer to Family Tree DNA, plus the permission you gave was specifically for mtdna or Y-line, not autosomal testing.

Furthermore, the DNA in many kits will be too old and will have degraded too much to use.  Everyone ordering the new Geno 2.0 kit will receive a new swab kit, in an heirloom box.  The comprehensive Y-line (haplogroup only), mtdna (haplogroup only) and autosomal testing will cost $199.

For Family Tree DNA clients who will be offered the upgrade in the late summer or fall, you will be able to upgrade if your DNA is less than 4 or 5 years old.  Otherwise, you’ll receive a new swab kit too.

All processing will be done at the Family Tree DNA Houston facility.

New Results Pages

The new test of course requires all new results pages for participants.

Take a look at a few of the pages you can expect.

The results will be presented as a personal story.

Your story will also include information such as maps of where your ancestors lived and where they migrated.

I asked Spencer if participants will be able to download their results so that we can continue to compare them as we do today, using various phasing tools.   Spencer replied, “Yes, raw results WILL be available for download.  In the Genographic Project, you will always own your DNA results, and the genotype data will be yours to do with as you please.  I feel very strongly that this is a cornerstone of ethical DTC genetic testing.”  Way to go Spencer!!

As Geno 2.0 moves forward, additional analytical tools will be added.

Ordering

National Geographic is accepting pre-orders now.  They will ship before the end of October, and they expect to be shipping significantly before that.

In Summary

Our world is changing, rapidly, and for the better.  The door we’ve been peeking through for a decade now is swinging wide open.  More brick walls will fall.  We’ll find and meet new cousins.  Ethnicities will be identified at a level never before possible.  We’ll learn about our ancestors and the story of our past through their DNA that we carry today.  It is the frontier within.  DNA is truly the gift that keeps on giving!

“One small step for man, one giant leap for mankind.”

Neil Armstrong, July 20, 1969

The Dreaded “Middle East” Autosomal Result

One of our blog followers, Ron, asked this question:

“My late father and his brother were born and raised on Hatteras Island which was a very isolated community until relatively recent times. Curious about their genetic ancestry, I had my uncle do the Family Tree DNA Family Finder test. His results for the Family (Population) Finder were:

Europe (Western European) – Orcadian 91.37% ±2.82%

Middle East – Palestinian, Bedouin, Bedouin South, Druze, Jewish, Mozabite 8.63% ±2.82%

The 8.63% Middle East was surprising since most if not all of his ancestors, going back 4 or more generations, were born on the OBX (Outer Banks). Most of the original families on Hatteras Island trace their roots back to the British Isles and western Europe.

Since my mother’s parents were immigrants from eastern Europe, I thought it would be interesting to know what contributions my maternal grandparents added to my genetic ancestry, so I submitted my DNA samples for the same test.  The Population Finder test showed that I was Europe Orcadian 100.00% ±0.00%. I was shocked that some other population did not show in the results.

Can you help me understand how the representative populations are determined and why Middle East didn’t show in my sample?”

Yes, indeed, the dreaded “Middle Eastern” result.  I’ve seen this over and over again.  Let’s talk about what this is and why it might happen.  As it happens, the fact that Ron is from Hatteras Island provides us with a wonderful research opportunity, because it’s a population I’m quite familiar with.

Given that Dawn Taylor and I administer the Hatteras Families DNA Projects (Y-line, mtDNA and autosomal), I have a good handle on the genealogy of the Hatteras Island Families.  They are of particular interest because Hatteras Island is where Sir Walter Raleigh’s Lost Colonists are rumored to have gone and amalgamated with the Hatteras Indians.  The Hatteras Indians in turn appear to have partly died off, and partly married into the European Island population.  Both the Lost Colony Project and the Hatteras DNA Projects at  http://www.familytreedna.com/public/HatterasFathers and http://www.rootsweb.ancestry.com/~molcgdrg/hatteras/hifr-index.htm are ongoing and all Hatteras families are included.

As part of the Hatteras families endeavor, Dawn and I have assembled a data base of the Hatteras families with over 5000 early settlers and their descendants to about the year 1900 included.  What Ron says is accurate.  Most of the Hatteras Island families settled on the island quite early, beginning about 1710.  Nearly all of them came from Virginia, some directly and others after having settled on the NC mainland first for a generation or so in surrounding counties.  By 1750, almost all of the families found there in 1900 were present.  So indeed, this isolated island was settled by a group of people from the British Isles and a few of them intermarried with the local population of Hatteras Indians.

Once on the island, it was unusual to marry outside of the island population, so we have the situation known as endogamy, which is where an isolated population marries repeatedly within itself.  Other examples of this are the Amish and Jewish populations.  When this happens, the founding group of people’s DNA gets passed around in circles, so to speak, and no new DNA is introduced.

Typically what happens is that in each generation, 50% “new” DNA is introduced by the other parent.  When the new DNA is from someone nonrelated, it’s relatively easy to sort out using today’s DNA phasing tools.  But when the “new” DNA isn’t new at all, but comes from the same ancestral stock as the other parent, it has the effect of making relationships look “closer” in time.

Let’s look at an example.

You carry the following average percentages of DNA from these relatives:

  • Parents 50% from each parent
  • Grandparents 25%
  • Great-grandparents 12.5%
  • Great-great-grandparents 6.25%

As you can see, the percentage is divided in each generation.  However, if two of your great-grandparents are the same person, then you actually carry 25% of the DNA from that person, not 12.5.  When you’re looking at matches to other people in an endogamous community, nearly everyone looks more closely related than they are on paper due to the cumulative effect of shared ancestors.  In essence, genetically, they are much closer than they look to be on a genealogy pedigree chart.
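For those who like to see the arithmetic, here is a minimal sketch of the halving described above. It assumes the simple average model (exactly half from each parent in every generation); real inheritance varies around these averages, and the function names are just made up for illustration.

```python
def expected_share(generations_back: int) -> float:
    """Average fraction of your autosomal DNA from ONE ancestor who sits
    `generations_back` generations above you (1 = parent, 2 = grandparent)."""
    return 0.5 ** generations_back


def collapsed_share(generations_back: int, times_in_tree: int) -> float:
    """Average fraction from an ancestor who fills several slots at the
    same generation of your pedigree chart, as happens with endogamy."""
    return times_in_tree * expected_share(generations_back)


# Illustrative labels only; they mirror the list in the text.
labels = {
    1: "Parents (each)",
    2: "Grandparents (each)",
    3: "Great-grandparents (each)",
    4: "Great-great-grandparents (each)",
}

for generations, label in labels.items():
    print(f"{label}: {expected_share(generations) * 100:g}%")

# The same person appearing twice among your great-grandparents:
print(f"Doubled great-grandparent: {collapsed_share(3, 2) * 100:g}%")
```

The last line shows the effect of the shared ancestor: the doubled great-grandparent contributes as much, on average, as a grandparent would.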

Ok, back to the question at hand.  Where did the Middle Eastern come from?

Looking at the percentages above, you can see that if Ray’s Uncle was in fact 8% (plus or minus about 2%, so we’ll just call it 8%) Middle Eastern, his Middle Eastern relative would be either a great-grandparent or a great-great-grandparent.  Given that generational length is typically 25 to 30 years, assuming Ray’s birth in 1960 and his uncle’s in 1940, this means that this Middle Eastern person would have been living on Hatteras Island between 1835 and 1860 using 25 year generations and between 1810 and 1840 using 30 year generations.  Having worked with the original records extensively, I can assure you that there were no Middle Eastern people on Hatteras Island at that time.  Furthermore, there were no Middle Eastern people on Hatteras earlier in the 1800s or in the 1700s that are reflected in the records.  This includes all extant records: deeds, marriages, court, tax, census, etc.

What we do find, however, are Native Americans, slaves and free people of color who may be an admixture of either or both with Europeans.  In fact, we find an entire community adjacent to the Indian village that is admixed.

We published an article in the Lost Colony Research Group Newsletter that discusses this mixed community when we identified the families involved.  It’s titled, “Will the Real Scarborough, Basnett and Whidbee Please Stand Up” and details our findings.

These families were present on the island and were recorded as being “of color” before 1790, so the intermarriage occurred early in the history of the island.

Furthermore, these families continued to intermarry and they continued to live in the same community as before.  In fact, in May and June of 2012, we visited with a woman who still owns the Indian land sold by the Indians to her family members in 1788!  And yes, Ray’s surname is one of the surnames who intermarried with these families.  In fact, it was someone with his family surname who bought the land that included the Indian village in 1788 from a Hatteras Indian woman.

So what does this tell us?

Having worked with the autosomal results of people who are looking for small amounts of Native American ancestry, I often see this “Middle Eastern” admixture.  I’ve actually come to expect it.  I don’t believe it’s accurate.  I believe, for some reason, tri-racial admixture is being measured as “Middle Eastern.”  If you look at the non-Jewish Middle East, this actually makes some sense.  There is no other place in the world as highly admixed with a combination of African, European (Caucasian) and Asian.  I’m not surprised that early admixture in the US that includes white, African and Native American looks somewhat the same as Middle Eastern in terms of the population as a whole.  Regardless of why, this is what we are seeing on a regular basis.

New technology is on the horizon which will, hopefully, resolve some of this ambiguous minority admixture identification.  As new discoveries are made, as we discussed when we talked about “Ethnicity Finders” in the blog a few days ago, we learn more and will be able to refine these trace amounts of minority admixture more accurately.

If Ray’s ancestor in 1750 was a Hatteras Indian, and if there was no Lost Colonist European admixture already in the genetic mix, then using a 25 year generation, we would see the following percentages of ethnicity in subsequent generations, assuming marriage to a 100% Caucasian in each generation, as follows:

  • 1750 – 100% Indian
  • 1775 – next generation, married white settler – 50% Indian
  • 1800 – 25% Indian
  • 1825 – 12.5% Indian
  • 1850 – 6.25% Indian
  • 1875 – 3.12% Indian
  • 1900 – 1.56% Indian
  • 1925 – 0.78% Indian
  • 1950 – 0.39% Indian

Remember, however, about endogamy.  This group of people were neighbors and lived in a relatively isolated community.  They married each other.  Every time they married someone else who descended from someone who was a Hatteras Indian in 1750, their percentage of Native Heritage in the subsequent generation doubled as compared to what it would have been without double inheritance.  So if Ray’s Uncle is descended several times from Hatteras Indians due to intermarriage within that community, it’s certainly possible that he would carry 6-10% Native admixture.  There are also records that suggest possible African admixture early in the Native community.
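The table above, and the way endogamy pushes the number back up, can be sketched in a few lines.  This is only the simple averaging model used in this article (one 100% Native ancestor in 1750, a spouse with no Native ancestry in every later generation, 25 year generations).  The figure of 16 lines of descent is a made-up number for illustration, not something taken from the Hatteras records.

```python
def native_share_by_year(start_year=1750, end_year=1950, generation_years=25):
    """Yield (year, fraction), halving the Native share once per generation."""
    fraction = 1.0
    for year in range(start_year, end_year + 1, generation_years):
        yield year, fraction
        fraction /= 2


table = dict(native_share_by_year())
for year, fraction in table.items():
    print(f"{year}: {fraction:.2%} Indian")

# Endogamy: every separate line of descent contributes its own share,
# so the shares add up.  The count of 16 lines is hypothetical.
hypothetical_lines_of_descent = 16
combined = hypothetical_lines_of_descent * table[1950]
print(f"{hypothetical_lines_of_descent} lines of descent by 1950: {combined:.2%}")
```

Under those assumptions, a single line of descent leaves well under 1% by the mid-1900s, while a person who descends from 1750-era Hatteras Indians through many different lines lands in the range reported for Ray’s Uncle.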

So now to answer Ray’s last question about inheritance.

Ray wanted to know why he didn’t show any “Middle Eastern” admixture when his uncle did.

Remember that Ray’s Uncle has two “genetic transmission events” that differ from Ray’s line.  Ray’s Uncle, even though he had the same parents as Ray’s father, inherited differently from his parents.  Children inherit half of their DNA from each parent, but not necessarily the same half.  Maybe Ray’s father inherited little or none of the Native admixture.  In the next generation, Ray inherited half of his father’s DNA and half of his mother’s.  We have no way of knowing in which of these two transmission events Ray lost the Native admixture, or whether it’s there, but in such small pieces that the technology today can’t detect it.

Hopefully the new technology on the horizon will improve all aspects of autosomal admixture analysis and ethnicity detection.  But for today, if you see the dreaded “Middle East” result appear as one of your autosomal geographic locations and your family isn’t Jewish and has been in the states since colonial times, think to yourself ‘racial admixture’ and revisit this topic as the technology improves.  In other words, as far as I’m concerned, the jury is still out!

Racial Admixture in Elizabethan London

We typically don’t think of Africans in London in the 1500s, but they were there, as proven in parish and other records.  Thankfully, they were rare enough that when there was a record pertaining to them, their ethnicity was recorded.  But by 1600, after the Queen’s legendary decades-long conflict with Spain where galley slaves from Spanish ships were “rescued” when the ships were captured, the number of Africans and other “Moorish” people was becoming problematic, at least to the Queen, and she sought to repatriate at least some of them to “Barbary.”

Recently, the BBC ran a wonderful story about this which you can find at this link:  http://www.bbc.co.uk/news/magazine-18903391

In the haplogroup E1b1a project, it’s not uncommon for a person who knows their family to be “white” to discover their haplogroup is of African origin.  Many times, one can account for this by more fully researching the early colonial records of America, but not always.  Perhaps we need to extend the research net a bit wider to include both London and Bristol records.

Ethnicity Finders

It’s no secret in the genetic genealogy community that one of my special areas of interest is Native and mixed race heritage.  Both are obscured in the history of this country and this continent, and hampered by the lack of records.

Descendants are left to attempt to piece the history of their family together, many times with nothing more concrete than oral family history, faintly remembered.  For these people, and there are many, genetic genealogy is the best and final hope they have of discovering IF the family rumor is true.  If it is true, then perhaps by the judicious use of these new DNA tools, we can begin to get some idea of where to look on the family tree, as well as in historical records.

Someone asked a question on the blog the other day about how to interpret these results, and I do want to answer that question specifically in a future blog, but first, we need to talk about the tools themselves.

There are three kinds of tests or tools out there in the marketplace today.

Y-line and Mitochondrial DNA Tests

Why, you ask, are we talking about these tests when we’re supposed to be talking about ethnicity finders?  Well, simply put, because these are the old, proven gold standards, and people tend to forget about using them.  These tests DO prove ethnicity, but only for that one specific line.  But that’s also the beauty of this test, we know exactly which line the ethnicity pertains to.  Y-line of course is the paternal line and mitochondrial DNA is the direct maternal line only.  What does that tell us about their spouses?  Not a darned thing.

To discover ethnicity information about the spouse, you need to find someone directly descended from the spouse in the proper manner and have them test.  What you need to do is to build yourself a DNA pedigree chart so that you can determine, to the best of your ability, the ethnicity of your family, member by member.

There is a free paper on my website at www.dnaexplain.com under the Publications tab titled “Creating Your Personal DNA Pedigree Chart.”  Make good use of it and the color coded tree, included, shown below.

If you can obtain the Y-line and mtDNA of your great-grandparents (through descendants of course), you’ll know about 8 of your ancestors.  If you can obtain the DNA of your great-great-grandparents, you know the ethnicity of 16 of them.  That’s a lot of good information.

However, sometimes obtaining this information just isn’t possible.  Some people are adopted, some don’t know the identity of a parent for other reasons, sometimes couples don’t have children of the right genders for their descendants to take these kinds of DNA tests, and sometimes, you simply have relatives who aren’t interested or refuse to test.  Enter, autosomal testing.

CODIS Type Tests

The first entries into this field of autosomal testing were tests that used few markers.  I am grouping them here together, even though there were some differences and at the time, there was significant debate about which ones were better, more accurate and such.  But today, with the advent of what I’m calling the Wide Spectrum Chip Tests, they are all obsolete.

CODIS stood for the Combined DNA Index System and was developed by police to differentiate between people, not to find their ethnic similarities.  Most of them used either 15 or 21 markers that were standardized for police work.  One test specifically for genealogy used about 150 markers.

These tests were also used for early paternity testing and were fairly reliable for one generation, but beyond that, it was difficult to draw any conclusions.  My alleged half-brother and I took three of these tests to determine if we were in fact half-siblings.  One test came back inconclusive.  One test said “probably not” and one said “probably cousins, not half-siblings.”  Later, we both took two of the Wide Spectrum Chip Tests, and we are neither half-siblings nor cousins.  The results of both of the wide spectrum tests, taken at different companies, matched each other, so all doubt was removed.

I took several of these tests as they were released, and you can read about the differences in results in my paper on my website titled Revealing American Indian and Minority Heritage Using Y-line, Mitochondrial, Autosomal and X-Chromosomal Testing Data Combined with Pedigree Analysis.  This paper was published in JoGG, the Journal of Genetic Genealogy, in the Fall of 2010.

Wide Spectrum Chip Tests

In one large step, we went from 21 markers to half a million, give or take 100,000 or so.  It was kind of like moving from trying to find scant evidence under a microscope to a panoramic view of the galaxy.

All together, there have been 4 players in this field.  One of the first was DeCodeMe.  They have pretty well eliminated themselves.  With an impending bankruptcy a few years ago, they raised their prices into the $2000 range.  That combined with no comparative data base, like 23andMe had at the time, in essence killed them as a player.  Unfortunately, their ethnicity test was the only one that was able to classify my African heritage with a group of tribes.  I hated to see them leave the scene.

23andMe was the next player.  They introduced the concept of matching your cousins.  Genetic genealogy went crazy and we couldn’t order those tests quickly enough.  Unfortunately, their ethnicity comparison is disappointingly vague and is limited to 3 categories, European, Asian and African.  No updates or improvements have been offered in several years.  Genealogy is not their priority or focus.  People looking for Native American heritage must extrapolate that Asian is Native.

The other unfortunate part for genetic genealogists is that most of their customer base takes this test for health information.  While that means we’re fishing in a different pool than the normal genealogy group of people who test, it also means that many or most of them don’t reply to inquiries about their family history, and those that do often have no information.

Family Tree DNA was the next player to enter this space.  In addition to the cousin matches provided, their ethnic breakdown is far more detailed than any of the others, actually breaking down continents into several population categories.  While this detail is most welcome, it can also be confusing in some cases, especially if you receive an unexpected grouping.  They are the first company to bring us this level of detail, and we’ll talk in a minute about how this is done.  As with any new technology, there are pitfalls and this entire field is and has been a learning experience.

Ancestry.com recently entered this market as well.  They initially gave away thousands of kits, about 10,000 I believe, so that they would have something in their data base to compare results to when they began to sell the kits.  They did begin to sell the kits in the spring of 2012 by invitation only to customers, and now the early results are coming in.  They seem to have had some early issues with unwarranted Scandinavian results being reported, but as they fully develop the product, I would expect they would get this corrected.

So, as of today, we have three players using this Wide Spectrum Chip Technology.

There are two things you need to understand about this technology and how it is used to generate the results you’re seeing relative to ethnicity.

Chip Technology Itself

Technology has been a good friend to genetic genealogy, but most of us don’t know it.  New diagnostic technology has been developed in the medical field that we’ve been able to leverage.  Instead of manually looking for the results of 21 markers in the lab, new chips have been developed that are scanned for between 500,000 and 700,000 locations, and for about the same price.  This allows detailed analysis on the level that was previously not only impossible, but undreamed of.

Do you remember the videotape format war in the 1980s – VHS vs Beta?  If so, you’re probably groaning now.  Well, there was a similar DNA chip war too and you didn’t even know it happened.  As a result, today we use the Illumina chip.

If you were a Family Tree DNA customer who bought the early Family Finder test, you received a free upgrade when Family Tree DNA replaced their previous sequencer with the new Illumina model.  I’m sure that set them back a pretty penny, both the replacement sequencer and all of those free upgrades.  In any event, now that both 23andMe and Family Tree DNA use the same technology, their results can be compared.  You can upload 23andMe results to Family Tree DNA and you can upload both results to GedMatch for private comparisons.

We don’t know for sure what technology Ancestry is using, but it’s believed to be the Illumina platform.  However, it’s a moot point at this juncture, because they do not provide customers with their data files to download.  Genetic genealogists are hoping to change their minds in the future.  Without this capability, all of the advanced analysis is impossible.

(Update – Sept. 2013 – Ancestry does use the Illumina platform, does now provide raw data files, but still does not provide any comparison tools like a chromosome browser so that you can see if and where you actually do match the person you’re paired with through their system.)

Ok, all of this said, how is this technology used to determine ethnicity?

Determining Ethnicity

Whew, I bet you thought we’d never get to this part.  Ethnicity is really not determined by smoke and mirrors with the assistance of a fortuneteller and a crystal ball.  And no, you do not just pick up the Magic 8 Ball and look for the answer on the bottom.  If you remember the VHS wars, you’re probably laughing now.  If you aren’t, well, then, never mind.

Different marker values in our DNA are found in different proportions in differing populations.  We are all familiar with this relative to haplogroups – where they are found, originated and spread.  We know that African haplogroups are much more likely to be found in Africa than in Siberia, for instance.

Ancestry Informative Markers, called AIMS, aren’t any different.  What is different is that there is no centralized data base to compile them for research purposes.

Back to the CODIS markers, information about these markers was mined, for the most part, from forensic law enforcement publications.  The problem there was that there was no standardization or quality control.  For example, if you were being booked into the jail and someone asked you your ethnicity, how reliable was the answer?  Or did the jailer just look at you and write down what they thought?  Furthermore, results were very spotty and tended to be from high crime areas, not really representative of a world-wide population.    But it was all we had at the time and it was a baby step along the way.  This problem as a whole is known as data base normalization.

Relative to the CODIS type tests, they were pretty good at determining your primary ethnicity, something very important to law enforcement looking for an unknown suspect, but not useful to genealogists.  They were much less reliable looking for minority admixture and very unreliable looking for trace amounts of admixture.  These data bases were also easy to skew based on what data the researcher in question entered for comparison.  In other words, if you were interested in Native American ancestry, your data base would likely contain disproportionately more Native data than would proportionately be warranted.

As newer technology has become available and research has advanced, new information has become available.  For example, there are two DNA marker values that are known only to exist in the African and the Native American populations, respectively.  So, if you have one of these two values, then you unquestionably DO carry that heritage.  Of course, figuring out which ancestor or even which line it came from is another matter entirely.

No longer in the law enforcement and forensics arena, most AIMS now are discovered in academic settings.  In my paper, I do discuss the reference populations used for each of the testing companies.  The biggest challenge to all of them is finding and compiling the data.  It is buried in many academic papers and is not compiled centrally anyplace.  After the papers are read, the values are amassed, then the computer crunching needs to be done to determine which of these markers are really “ancestrally informative” and if so, how.  In general, unlike the one African and one Native marker, markers are generally found in a range of populations in varying frequencies.  This means that you’re now dealing with statistical probabilities.  Did your eyes just glaze over?

In a nutshell, what has to be done is to look at all of the AIM values that you carry, look at where they are most likely to be found, and put all of that together to come up with a composite picture of you.  Let’s say for example, you have that African marker, but very few others found in high frequencies in Africa, that Native marker plus several more found in Asia and a whole bunch found in Europe but seldom in Asia or Africa.  This person would obviously have European, Native and African heritage, but it’s up to the statistics to determine what percentage of which type and from where.
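To make the idea of a “composite picture” a little more concrete, here is a deliberately oversimplified toy sketch.  Every marker name and every frequency in it is made up for illustration, and the simple averaging it does is my own stand-in; it is not how any testing company actually calculates its percentages, which involves far more markers and far more sophisticated statistics.

```python
# HYPOTHETICAL data: how often each marker value is seen in each reference population.
FREQUENCIES = {
    "african_only_marker": {"African": 0.60, "Native/Asian": 0.00, "European": 0.00},
    "native_only_marker": {"African": 0.00, "Native/Asian": 0.50, "European": 0.00},
    "mostly_european_1": {"African": 0.05, "Native/Asian": 0.10, "European": 0.85},
    "mostly_european_2": {"African": 0.10, "Native/Asian": 0.10, "European": 0.80},
    "mostly_european_3": {"African": 0.02, "Native/Asian": 0.08, "European": 0.90},
    "mostly_european_4": {"African": 0.05, "Native/Asian": 0.05, "European": 0.90},
}


def composite_picture(carried_markers):
    """Average, over the markers a person carries, of where each marker is found.

    Each marker's frequencies are first rescaled to add up to 1, so a marker
    seen in only one population counts entirely toward that population.
    """
    totals = {}
    for marker in carried_markers:
        frequencies = FREQUENCIES[marker]
        marker_total = sum(frequencies.values())
        for population, frequency in frequencies.items():
            totals[population] = totals.get(population, 0.0) + frequency / marker_total
    return {population: weight / len(carried_markers) for population, weight in totals.items()}


picture = composite_picture(list(FREQUENCIES))
for population, share in sorted(picture.items(), key=lambda item: -item[1]):
    print(f"{population}: {share:.0%}")
```

Even in this toy version you can see why the percentages shift when the reference data changes: adjust one frequency in the table and every number in the composite moves with it.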

This is obviously a new field, actually, a new field within a new field.  Genetic genealogy itself is only 12 years old.  As more papers are published and more information is found, this affects the statistics and will affect the ethnicity percentages shown.  Keep in mind also that the African value, for example, could have been passed from many generations ago, from a long forgotten and otherwise genetically “absent” ancestor.

Blaine Bettinger had a great blog about this very topic.  You can see it at http://www.thegeneticgenealogist.com/2012/06/19/problems-with-ancestrydnas-genetic-ethnicity-prediction/.  While he is actually talking about the problem with Ancestry.com’s ethnicity predictions, he discusses a very important concept, and that is that you actually have two family trees.  The genealogy one we all know and love, and a genetic family tree that we are just now getting to know.

Of course, the gift box with the big beautiful bow holds for us, one by one, the branches of our genetic tree….and that gift may look nothing at all like the package wrapping suggests.

What Project do I Join?

You wouldn’t believe how often I receive this question.  It seems evident to those of us who work with this information, but it’s obviously not to others.  So this blog is for those who ask, and also for project administrators who want to make sure their projects are useful and friendly and reaching the people they want to reach.

This is referring to the projects at Family Tree DNA.  Ancestry also has surname projects, but they tend to be more like study groups because you don’t have to DNA test to join them.

At Family Tree DNA, there are three kinds of projects: surname projects, haplogroup projects and geographic projects.  Let’s look at all 3.

Surname Projects

Most males will want to join a surname project.  Since the Y chromosome follows the surname, unless we’re looking at cases of adoption (documented or otherwise), you’ll want to join the surname project most similar to your surname.

To find the surname project best suited for you, simply go to www.familytreedna.com and enter your surname into the surname search box.

Project administrators – be sure that all variants of the surname are listed in your project profile.

Ladies, surname projects are much less useful to you directly, since surnames changed with every generation.  In my own personal case, I “keep” people who have tested for particular surnames in that project, but that’s so I can find them easily.  For example, I have two women who tested to prove who their ancestor was, that she was the wife of one William Crumley, and so they are in the Crumley project.  However, that is as much for my convenience as anything.  There are 5 surnames between their generation and the Crumley connection, so any of those surnames would be as appropriate as any other.  Generally, women should focus more on the other project types.  Some Y-DNA project administrators don’t accept mitochondrial results into the project.

Be sure to look through the mtDNA Lineage projects on the project search page.  They are similar to surname projects for males.  Don’t know how to find the lineage projects?  Keep reading to discover how to find different kinds of projects.

Haplogroup Projects

I encourage everyone to join appropriate haplogroup projects.  There may be more than one for you.

Often there is a primary haplogroup project, for example, haplogroup H, then subprojects.  You can find these projects by going to the Projects tab at the top of your personal page and clicking on the “join” option.

You will see the following selections.

If you’re looking for mitochondrial haplogroup H, scroll down to the mitochondrial haplogroup section and click on H.  You will then see the following options.

In this case, I would suggest joining both the haplogroup H main project and the subproject appropriate for you. If you are haplogroup H1, then join that project as well.  So in this case, you would join two haplogroup projects.

What is the benefit of joining a haplogroup project?

First, you can help science along its way.  This is one way you can be a citizen scientist, contributing to the greater good.  Haplogroup projects group people so we can discover new haplogroup subgroups and learn about migration patterns, which brings me to the second reason, which isn’t so altruistic.

You can learn about where your ancestors lived and settled before the advent of surnames.  Do you want to know where they lived 1000, 2000 or 5000 years ago?  Well, by looking at the haplogroup maps, you can see where they and their descendants settled.

Many haplogroup administrators group participants within the haplogroup project by either haplogroup subgroups, common mutation patterns (which lead to new haplogroup subgroups) or other criteria.  Here’s an example of a subgroup from the haplogroup H1 project.  If you don’t know where your ancestors were from in Europe, wouldn’t a map like this showing where others with similar DNA patterns lived be useful?

If you’re not sure about which projects apply to you then click on the project link and read what the administrator has to say about the project.  Still not sure?  Most of the time the administrator’s name and e-mail is shown.

Project administrators, be sure that your project description in the project profile and on your project public website background page is current and useful.  If you’re receiving the same question over and over, put the answer where people can see it.  Be sure your name and e-mail are listed so that people can contact you with questions.  Please, enable mapping.  It’s free and it is a wonderful resource for your participants.  If any of these things are causing you problems, the helpdesk at Family Tree DNA is a god-send for project administrators and you can reach them at helpdesk@ftdna.com.

Geographic Projects

Named “geographic projects”, these really fall into the “all other” category, meaning those that aren’t surname projects and aren’t haplogroup projects.

I think these are the most interesting and most fun.  They group people by specific interests.  Sometimes that means geography, like the Cumberland Gap Project, sometimes ethnicity, like the Native American projects, and sometimes something else that someone wants to study.  They are also the most difficult to name appropriately so that people can find them, especially if they don’t know to look for them.

There are two ways to find these kinds of projects.  Go to the project tab on your personal page and click on join.

You will then see a listing of projects, a search box, and the index to projects.

Many people think that the projects shown are recommendations by Family Tree DNA and they join all of the projects.  This is NOT what this is.  This is a list of projects where the administrator has entered your surname, the one on your account, as a surname of interest to that project.  To see why, click on the project links. These projects may or may not be appropriate for your situation.

However, there may be other projects that are of interest to you.  You can begin by putting key words into the search box.  For example, putting the word “Indian” in the search box returned the following list of projects.

There are several projects shown, but I happen to know there are several more that aren’t.  Let’s say you’re interested in the Shawnee.  Try that word in the search box.  Still didn’t find anything?  Then resort to browsing.

Look through the various Y-line, mtDNA and dual (Y+mtDNA) projects to see what is listed.  You may be surprised at what you find that is interesting to you.  While looking for your Shawnee, you may also discover the Cumberland Gap project, the North Carolina Native project and others that might be relevant.  So take some time and look at what is available to you.

Hey look, I found the Shawnee project under PiquaShawnee in the P section of the Dual (Y+mtDNA) geographic projects.  I surely am glad I was browsing, because I would never have thought to look for that project name or to look under P!!

Project administrators, you want your project to be able to be found by those who need to find it.  If you have a Native American project, for example, you might add the names of tribes, the word “Indian” and the words “native” and “American” and “Native American” in the surname list on the project profile page.  Why?  Because those are things people might enter in that search box to find a relevant project.  In the above example, list the word “Shawnee” as well.

Once a project is named, the name can’t be changed, so think about how the project can most easily be found by a novice and name it appropriately.

The Trouble with Ancestry.com Matches

While working on a client’s mitochondrial DNA report, I came across the worst case I’ve seen in a long time of mismatches being shown as matches at Ancestry.com.  This has been a pervasive problem for a long time.

10 Point Question – If you match another person exactly on every location, HVR1&HVR2, must you have the exact same haplogroup?

Answer:  Most of the time.

You didn’t think this was going to be easy did you?

Because Family Tree DNA is the only company to test to the full sequence level, their clients are going to have far more advanced, detailed and accurate haplogroup assignments than people who test at companies who only offer the HVR1+HVR2 regions.

Therefore, like in this case, we see a client whose haplogroup is H1.  The “1” part of H1 is determined by location 3010A, a position found in the coding region that can only be read by full sequence testing.  So, at Ancestry, and in other data bases outside of Family Tree DNA, we would expect to see matches to both haplogroup H and H1 (assuming the data base allows outside results to be input), and possibly some other H haplogroups as well, if the HVR1+HVR2 region mutations match those of our H1 person.

OK – next 10 point question.  Will someone who is haplogroup H match someone who is haplogroup M or N or some other haplogroup?

Answer: No, not an exact match, but they may share some common mutations.

Then why does Ancestry show them as matches when a simple comparison would eliminate them?

The answer is two-fold.  Part of the issue could be how Ancestry assigns haplogroups.  We really don’t know how they do it, and they aren’t as forthcoming about these things as Family Tree DNA is.  Secondly, and probably the biggest issue is that Ancestry allows people to enter their own data from other labs into their data base, including their haplogroup, apparently without any verification process.  So, in essence, Ancestry has muddied their own waters.

My client’s 251 matches at Ancestry were all shown with “0” differences which means they are exact matches.  That’s exciting to see, except it isn’t real.

I clicked on the “download matches” button, which dumps everything into a spreadsheet, a wonderfully handy feature.  As we talk about this, keep in mind that my client had a total of 5 mutations in the HVR1+HVR2 regions, so based on “0” differences, everyone on that list should share all of those mutations with no additional mutations.

Here’s what I found after sorting the spreadsheet.

Exact matches = 32, hardly the 251 displayed on the match page.

Of the 251 “exact” matches shown, the haplogroup breakdown is shown below:

A – 10 (Native American)

B – 7 (Native American)

C – 3 (Native American)

D – 2 (Native American)

H – 154, over half with no matching markers at all to client

HV – 10

I – 5

J – 5

K – 4

L – 12 (African)

M – 4

N – 5

R – 6

T – 7

U – 11

V – 3

W – 1

X – 1

Z – 1

But even this isn’t the worst part.  Of the 251 matches shown with “0” differences, 32 are actually exact matches.  Of those exact matches, we find 4 different haplogroups, including 3 in haplogroup M, a generally Asian haplogroup which is rare as hen’s teeth here in the US.  Hmmm….anyone spot a problem?

Of the remaining 219, 162 have no mutations whatsoever that match the client’s, so they not only shouldn’t be shown with “0” differences, they shouldn’t be shown at all.  So this means that the balance of the matches that do share at least one marker but aren’t exact matches, 57 in number, are shown incorrectly, with “0” differences.
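The “simple comparison” I keep referring to really is simple.  Here is a rough sketch of the sorting I did by hand in the spreadsheet.  The mutation labels are placeholders, not my client’s actual results, and the function names are my own; I have no idea how Ancestry’s matching code is actually written.

```python
from collections import Counter


def classify_match(client_mutations, match_mutations):
    """Sort a 'match' into exact, partial (some shared mutations) or none."""
    client, match = set(client_mutations), set(match_mutations)
    if client == match:
        return "exact"
    if client & match:
        return "partial"
    return "none"


def count_differences(client_mutations, match_mutations):
    """Mutations found in one list but not in the other."""
    return len(set(client_mutations) ^ set(match_mutations))


# Placeholder labels standing in for the client's 5 HVR1+HVR2 mutations
client = ["mut_1", "mut_2", "mut_3", "mut_4", "mut_5"]
sample_matches = {
    "match_a": ["mut_1", "mut_2", "mut_3", "mut_4", "mut_5"],
    "match_b": ["mut_1", "mut_2", "mut_9"],
    "match_c": ["mut_7", "mut_8"],
}
for name, mutations in sample_matches.items():
    kind = classify_match(client, mutations)
    differences = count_differences(client, mutations)
    print(f"{name}: {kind}, {differences} differences")

# The tallies from the spreadsheet described in this article
tally = Counter(exact=32, partial=57, none=162)
total = sum(tally.values())
for category, count in tally.items():
    print(f"{category}: {count} of {total} ({count / total:.0%})")
```

Only a match in the “exact” group should ever be displayed with “0” differences.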

So let’s give Ancestry a report card on this.  32 out of 251 correct equals 13% correct.

Last 10 point question – What letter grade do you get for 13% right, which is 87% wrong?

In my book, and in any school I ever attended, that was a big fat F!

And no, this is not just a recently introduced software bug.  It’s been like this forever.

So now that we know how well Ancestry does on basic things like mitochondrial DNA matches, which are exceedingly easy, anyone feel good about how they’ll do with autosomal DNA?  Comparatively speaking, that’s the tough stuff.