Botocudo Ancient Remains from Brazil

One thing you can always count on in the infant science of population genetics…  whatever you think you know, for sure, for a fact…well….you don’t.  So don’t say too much, too strongly or you’ll wind up having to decide if you’d like catsup with your crow!  Well, not literally, of course.  It’s an exciting adventure that we’re on together and it just keeps getting better and better.  And the times…they are a changin’.

We have some very interesting news to report.  Fortunately, or unfortunately – the news weaves a new, but extremely interesting, mystery.

Ancient Mitochondrial DNA

Back in 2013, a paper, Identification of Polynesian mtDNA haplogroups in remains of Botocudo Amerindians from Brazil, identified both Native American and Polynesian haplogroups in a group of 14 skeletal remains of Botocudo Indians from Brazil.  The remains arrived at a museum in August of 1890, and the scientists felt these individuals had died in the second half of the 19th century.

Twelve of their mitochondrial haplogroups were the traditional Native haplogroup of C1.

However, two of the skulls carried Polynesian haplogroups, downstream of haplogroup B, specifically B4a1a1a and B4a1a1, that compare to contemporary individuals from Polynesian, Solomon Island and Fijian populations.  These haplotypes had not been found in Native people or previous remains.

Those haplogroups include what is known as the Polynesian motif and are found in Indonesian populations and also in Madagascar, according to the paper, but the time to the most recent common ancestor for that motif was calculated at 9,300 years plus or minus 2,000 years.  This suggests that the motif arose after the Asian people who would become the Native Americans had already entered North and South America through Beringia, assuming there were no later migration waves.

The paper discusses several possible scenarios as to how a Polynesian haplotype found its way to central Brazil among a now extinct Native people. Of course, the two options are either pre-Columbian (pre-1500) contact or post-Columbian contact, which would mean any time from the 1500s to the present and suggests that the founders who carried the Polynesian motif were perhaps either slaves or sailors.

In the first half of the 1800s, the Botocudo Indians had been pacified and worked side by side with African slaves on plantations.

Beyond that, without full genome sequencing there was no more that could be determined from the remains at that time.  We know they carried a Polynesian motif, were found among Native American remains and at some point in history, intermingled with the Native people because of where they were found.  Initial contact could have been 9,000 years ago or 200.  There was no way to tell.  They did have some exact HVR1 and HVR2 matches, so they could have been “current,” but I’ve also seen HVR1 and HVR2 matches that reach back to a common ancestor thousands of years ago…so an HVR1/HVR2 match is nothing you can take to the bank, certainly not in this case.

Full Genome Sequencing and Y DNA

This week, one of my subscribers, Kalani, mentioned that Felix Immanuel had uploaded another two kits to GedMatch of ancient remains.  Those two kits are indeed two of the Botocudo remains – the two with the Polynesian mitochondrial motif which have now been fully sequenced.  A corresponding paper has been published as well, “Two ancient genomes reveal Polynesian ancestry among the indigenous Botocudos of Brazil” by Malaspinas et al with supplemental information here.

There are two revelations which are absolutely fascinating in this paper and in citizen scientists’ subsequent work.

First, their Y haplogroups are C-P3092 and C-Z31878, both equivalent to C-B477 which identifies former haplogroup C1b2.  The Y haplogroups aren’t identified in the paper, but Felix identified them in the raw data files that are available (for those of you who are gluttons for punishment) at the google drive links in Felix’s article Two Ancient DNA from indigenous Botocudos of Brazil.

I’ve never seen haplogroup C1b2 as Native American, but I wanted to be sure I hadn’t missed a bus, so I contacted Ray Banks who is one of the administrators for the main haplogroup C project at Family Tree DNA and also is the coordinator for the haplogroup C portion of the ISOGG tree.

ISOGG y tree

You can see the position of C1b2, C-B477 in yellow on the ISOGG (2015) tree relative to the position of C-P39 in blue, the Native American SNP shown several branches below, both as branches of haplogroup C.

Ray maintains a much more descriptive tree of haplogroup C1 at this link and of C2 at this link.

Ray Banks C1 tree

The branch above is the Polynesian (B477) branch and below, the Native American (P39) branch of haplogroup C.

Ray Banks C2 tree

In addition to confirming the haplogroup that Felix identified, when Ray downloaded the BAM files and analyzed the contents, he found that both samples were also positive for M38 and M208, which moves them downstream two branches from C1b2 (B477).

Furthermore, one of the samples had a mutation at Z32295 which Ray has included as a new branch of the C tree, shown below.

Ray Banks Z32295

Ray indicated that the second sample had a “no read” at Z32295, so we don’t know if he carried this mutation.  Ray mentions that both men are negative for many of the B459 equivalents, which would move them down one more branch.  He also mentioned that about half of the Y DNA sites are missing, meaning they had no calls in the sequence read.  This is common in ancient DNA results.  It would be very interesting to have a Big Y or equivalent test on contemporary individuals with this haplogroup from the Pacific Island region.

Ray notes that all Pacific Islanders may be downstream of Z32295.

Not Admixed

The second interesting aspect of the genomic sequencing is that the remains did not show any evidence of admixture with European, Native American or African individuals.  More than 97% of their genome fits exactly with the Polynesian motifs.  In other words, they appear to be first generation Polynesians.  They carry Polynesian mitochondrial, Y and autosomal (nuclear) DNA, exclusively.

Botocudo not admixed

In total, 25 Botocudo remains have been analyzed and of those, two have Polynesian ancestry and those two, BOT15 and BOT17, have exclusively Polynesian ancestry as indicated in the graphic above from the paper.

When did they live?  Accelerator mass spectrometry radiocarbon dating with marine correction gives us dates of 1479-1708 AD and 1730-1804 for specimen BOT15 and 1496-1842 for BOT17.

The paper goes on to discuss four possible scenarios for how this situation occurred and the pros and cons of each.

The Polynesian Peru Slave Trade

This occurred between 1862-1864 and can be ruled out because the dates for the skulls predate this trade period, significantly.
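That elimination is just a comparison of date ranges, and it can be sketched in a few lines. The dates are the ones quoted in this article; the function name and the data layout are my own illustration and are not taken from the paper.

```python
# Calibrated radiocarbon ranges (years AD) as quoted in this article.
DATE_RANGES = {
    "BOT15": [(1479, 1708), (1730, 1804)],
    "BOT17": [(1496, 1842)],
}

PERU_TRADE_START = 1862  # the trade ran from 1862 to 1864


def predates(ranges, event_start):
    """Return True if every date range ends before the event began."""
    return all(end < event_start for _start, end in ranges)


for specimen, ranges in DATE_RANGES.items():
    print(specimen, predates(ranges, PERU_TRADE_START))
```

Both specimens come back True, which is all the "ruled out" statement is saying: even the latest possible date for either skull falls before the trade began.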

The Madagascar-Brazil Slave Trade

The researchers state that Madagascar is known to have been peopled by Southeast Asians and not by Polynesians.  Another factor excluding this option is that it’s known that the Malagasy ancestors admixed with African populations prior to the slave trade.  No such ancestry was detected in the samples, so these individuals were not brought as a result of the Madagascar-Brazil slave trade – contrary to what has been erroneously inferred and concluded.

Voyaging on European Ships as Crew, Passengers or Stowaways

Trade on Euroamerican ships in the Pacific only began after 1760 AD and by 1760, BOT15 and BOT17 were already deceased with a probability of 0.92 and 0.81, respectively, making this scenario unlikely, but not entirely impossible.

Polynesian Voyaging

Polynesian ancestors originated from East Asia and migrated eastwards, interacting with New Guineans before colonizing the Pacific.  These people did colonize the Pacific, as unlikely as it seems, traveling thousands of miles, reaching New Zealand, Hawaii and Easter Island between 1200 and 1300 AD.  Clearly they did not reach Brazil in this timeframe, at least not as related to these skeletal remains, but that does not preclude a later voyage.

Of the four options, the first two appear to be firmly eliminated which leaves only the second two options.

One of the puzzling aspects of this analysis is the “pure” Polynesian genome, with no admixture at all, which precludes an earlier arrival.

The second puzzling aspect is how the individuals, and there were at least two, came to find themselves in Minas Gerais, Brazil, and why we have not found this type of DNA on the more likely western coastal areas of South America.

Minas Gerais Brazil

Regardless of how they arrived, they did, and now we know at least a little more of their story.

GedMatch

At GedMatch, it’s interesting to view the results of the one-to-one matching.

Both kits have several matches.  At 5cM and 500 SNPs, kit F999963 has 86 matches.  Of those, the mitochondrial haplogroup distribution is overwhelmingly haplogroup B, specifically B4a1a1 with a couple of interesting haplogroup Ms.

F999963 mito

Y haplogroups are primarily C2, C3 and O.   C3 and O are found exclusively in Asia – meaning they are not Native.

F999963 Y

Kit F999963 matches a couple of people at over 30cM with a generation match estimate just under 5 generations.  Clearly, this isn’t possible given that this person had died by about 1760, according to the paper, which is 255 years or about 8.5-10 generations ago, but it says something about the staying power of DNA segments and probably about endogamy and a very limited gene pool as well.  All matches over 15cM are shown below.
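The generation arithmetic is easy to reproduce. This small sketch assumes a generation length of somewhere between 25 and 30 years, which is what the 8.5-10 figure implies, and takes 2015 as the "now" that gives 255 years; the function name is mine.

```python
def generations_between(start_year, end_year, years_per_generation):
    """Generations spanned by the interval, for an assumed generation length."""
    return (end_year - start_year) / years_per_generation


# 1760 is the approximate "deceased by" date; 2015 is assumed as the present.
low = generations_between(1760, 2015, 30)   # longer generations, fewer of them
high = generations_between(1760, 2015, 25)  # shorter generations, more of them
print(low, high)
```

Either way, the true distance is roughly double the "just under 5 generations" estimate, which is the point of the comparison.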

F999963 largest

Kit F999964 matches 97 people, many of whom are different from the people that kit F999963 matched.  So these ancient Polynesian people, F999963 and F999964, don’t appear to be immediate relatives.

F999964 mito

Again, a lot of haplogroup B mitochondrial DNA, but less haplogroup C Y DNA and no haplogroup O individuals.

F999964 Y

Kit F999964 doesn’t match anyone quite as closely as kit F999963 did in terms of total cM, but the largest segment is 12cM, so the generational estimate is still at 4.6.  All matches over 15cM are shown below.

F999964 largest

Who are these individuals that these ancient kits are matching?  Many of these individuals know each other because they are of Hawaiian or Polynesian heritage and have already been working together.  Several of the Hawaiian folks are upwards of 80%, one at 94% and one believed to be 100% Hawaiian.  Some of these matches are to Maori, a Polynesian people from New Zealand, with one believed to be 100% Maori in addition to several admixed Maori.  So obviously, these ancient remains are matching contemporary people with Polynesian ancestry.

The Unasked Question

Sooner or later, we as a community are going to have to face the question of exactly what is Native or aboriginal.  In this case, because we do have the definitive autosomal full genome testing that eliminates admixture, these two individuals are clearly NOT Native.  Without full genomic testing, we would have never known.

But what if they had arrived 200 years earlier, around 1500 AD, one way or another, possibly on an early European ship, and had intermixed with the Native people for 10 generations?  What if they carried a Polynesian mitochondrial (or Y) DNA motif, but they were nearly entirely Native, or so much Native that the Polynesian could no longer be found autosomally?  Are they Native?  Is their mitochondrial or Y DNA now also considered to be Native?  Or is it still Polynesian?  Is it Polynesian if it’s found in the Cook Islands or on Hawaii and Native if found in South America?  How would we differentiate?

What if they arrived, not in 1500 AD, but about the year 500 AD, or 1000 BCE or 2000 BCE or 3000 BCE – after the Native people from Asia arrived but unquestionably before European contact?  Does that make a difference in how we classify their DNA?

We don’t have to answer this yet today, but something tells me that we will, sooner or later…and we might want to start pondering the question.

Acknowledgements: 

I want to thank all of the people involved whose individual work makes this type of comparative analysis possible.  After all, the power of genetic genealogy, contemporary or ancient, is in collaboration.  Without sharing, we have nothing. We learn nothing.  We make no progress.

In addition to the various scientists and papers already noted, special thanks to Felix Immanuel for preparing and uploading the ancient files.  This is no small task and the files often take a month of prep each.  Thanks to Kalani for bringing this to my attention.  Thanks to Ray Banks for his untiring work with haplogroup C and for maintaining his haplogroup webpage with specifics about where the various subgroups are found.  Thanks to ISOGG’s volunteers for the haplotree.  Thanks to GedMatch for providing this wonderful platform and tools.  Thanks to everyone who uploads their DNA, and that of their relatives and works on specific types of projects – like Hawaiian and Maori.  Thanks to my haplogroup C-P39 co-administrators, Dr. David Pike and Marie Rundquist, for their contributions to this discussion and for working together on the Native American Haplogroup C-P39 Project.  It’s important to have other people who are passionate about the same subjects to bounce things off of and to work with.  This is the perfect example of the power of collaboration!

Kennewick Man is Native American

Finally, an answer, after almost 20 years and very nearly losing the opportunity of ever knowing.

Today, in Nature, a team of scientists released information about the full genomic sequencing of Kennewick Man who was discovered in 1996 in Washington state.  Previous DNA sequencing attempts had failed, and 8,000-year-old Kennewick Man was then embroiled in years of legal battles.  Ironically, the only reason DNA testing was allowed is because, based on cranial morphology, it was determined that he was likely more closely associated with Asian people or the Ainu than the Native American population, and therefore NAGPRA did not apply.  However, subsequent DNA testing has removed all question about Kennewick Man’s history.  He truly is the Ancient One.

Kennewick Man is Native American.  His Y haplogroup is Q-M3 and his mitochondrial DNA is X2a.  His autosomal DNA was analyzed as well, and compared to some current tribes, where available.

From the paper:

We find that Kennewick Man is closer to modern Native Americans than to any other population worldwide. Among the Native American groups for whom genome-wide data are available for comparison, several seem to be descended from a population closely related to that of Kennewick Man, including the Confederated Tribes of the Colville Reservation (Colville), one of the five tribes claiming Kennewick Man. We revisit the cranial analyses and find that, as opposed to genomic-wide comparisons, it is not possible on that basis to affiliate Kennewick Man to specific contemporary groups. We therefore conclude based on genetic comparisons that Kennewick Man shows continuity with Native North Americans over at least the last eight millennia.

Interestingly enough, the Colville Tribe, located near where Kennewick Man was found, decided to participate in the testing by submitting DNA for comparison.

Kennewick Colville

The ancestry and affiliations of Kennewick Man by Rasmussen, et al, Nature (2015) doi:10.1038/nature14625

Also from the paper:

Our results are in agreement with a basal divergence of Northern and Central/Southern Native American lineages as suggested from the analysis of the Anzick-1 genome [12]. However, the genetic affinities of Kennewick Man reveal additional complexity in the population history of the Northern lineage. The finding that Kennewick is more closely related to Southern than many Northern Native Americans (Extended Data Fig. 4) suggests the presence of an additional Northern lineage that diverged from the common ancestral population of Anzick-1 and Southern Native Americans (Fig. 3). This branch would include both Colville and other tribes of the Pacific Northwest such as the Stswecem’c, who also appear symmetric to Kennewick with Southern Native Americans (Extended Data Fig. 4). We also find evidence for additional gene flow into the Pacific Northwest related to Asian populations (Extended Data Fig. 5), which is likely to post-date Kennewick Man. We note that this gene flow could originate from within the Americas, for example in association with the migration of paleo-Eskimos or Inuit ancestors within the past 5 thousand years [25], or the gene flow could be post colonial [19].

The authors go on to say that Kennewick Man is significantly different from Anzick Child, which matches closely with many Meso and South American samples.  Kennewick, on the other hand, is closely related to the Chippewa and Anzick was not.

This divergence may suggest a population substructure and migration path within the Americas, although I would think significantly more testing of Native people would be in order before a migration path would be able to be determined or even suggested. It is very interesting that Anzick from Montana, 12,500 years ago, would match Meso American samples so closely.  I would have expected Kennewick to perhaps match Meso Americans more closely because I would have expected the migration pathway to be down the coastline.  Perhaps that migration had already happened by the time Kennewick man came onto the scene some 8000 years ago.

You can read the entire paper at this link.

How Much Indian Do I Have in Me???

I can’t believe how often I receive this question.

Here’s today’s version from Patrick.

“My mother had 1/8 Indian and my grandmother on my father’s side was 3/4, and my grandfather on my father’s side had 2/3. How much would that make me?”

First, this question was about Native American ancestry, but it could just as easily have been about African, European, Asian, Jewish….fill in the blank.

Secondly, Patrick’s initial question is a math question, but the real question is how much of a particular ethnicity do you have on paper versus how much you have genetically.

How could they be different?

Lots of ways.

Oral history in families tends to get diluted and condensed over time.  For example, maybe grandmother wasn’t really 3/4th – because her ancestors were admixed and she (or her descendants) didn’t know it.  And how does one have 2/3, exactly, with 4 grandparents?  So, the story may not be the whole story.

For our example, we’re going to eliminate the 2/3 number, because it can’t be correct.  A grandparent would be 1/4th, a great grandparent, 1/8th.  In other words, ancestors’ fractions come in halves, quarters, eighths and so on, but never thirds – because every person has exactly 2 parents in each generation.

So, you could have 3 of 4 ancestors who are native, which would make the person 3/4th, 2 of 4 which would make the person half, or 1 of 4 which would make the person one quarter, but you cannot have 1 of 3, 2 of 3 or 3 of 3, because you have 4 grandparents, not 3.
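If you want to test a family story against that rule, here is a rough sketch. It assumes that every ancestor in some generation was either fully Native or not Native at all, so a valid fraction must reduce to a denominator of 1, 2, 4, 8 and so on. The function name is my own.

```python
from fractions import Fraction


def is_possible_pedigree_fraction(value):
    """True if the reduced denominator is a power of 2 and the value is between 0 and 1."""
    frac = Fraction(value)
    if not 0 <= frac <= 1:
        return False
    d = frac.denominator
    return (d & (d - 1)) == 0  # power-of-two test: 1, 2, 4, 8, ...


for claim in ("1/8", "3/4", "2/3"):
    print(claim, is_possible_pedigree_fraction(claim))
```

Of Patrick's three numbers, only the 2/3 fails the check.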

Math

First, let’s answer the math question.

Math is your friend.

There are three easy steps.

1. Divide Each Generation By Half to Current

Each ancestral generation is reduced by one half, because the DNA is diluted by half in each generation.

So, if Patrick’s mother is 1/8, Patrick is 1/16 on their mother’s side, because Patrick received half of her DNA.  With fractions, you can’t reduce the top number of 1 by one half so you double the bottom number.

If grandmother was 3/4, then father was 3/8 on that side and Patrick is 3/16th.
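For anyone who would rather let the computer do the halving, this is a minimal sketch of step 1 using Python's built-in fractions. The function name is made up for this example; the numbers are Patrick's.

```python
from fractions import Fraction


def share_passed_down(ancestor_fraction, generations):
    """Halve the ancestor's fraction once for each generation down to the descendant."""
    return Fraction(ancestor_fraction) / (2 ** generations)


from_mother = share_passed_down("1/8", 1)       # a parent is one generation back
from_grandparent = share_passed_down("3/4", 2)  # a grandparent is two generations back
print(from_mother, from_grandparent)
```

This gives 1/16 from the mother and 3/16 from the 3/4 grandparent, the same numbers worked out by hand above.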

So, now, add the numbers for Patrick together.

2. Find the Common Denominator

The two numbers you need to add together from the above example are 1/16 and 3/16.  This is easy because the denominator is already the same – 16.  But let’s say you also have a third number, just for purposes of example.  Let’s say that third number is 3/32.

How do you add 1/16, 3/16 and 3/32?

The denominator has to be the same.  If you look at the denominators, you’ll see that if you double both the top and bottom numbers of the fractions with 16, they become fractions with 32 as their denominator.

So, for this example, 1/16 becomes 2/32, 3/16 becomes 6/32 and 3/32 remains the same.

3. Add the Top Numbers Together

Now just add the numerators, or the top numbers together.

2/32 + 6/32 + 3/32 = 11/32

That’s the answer.  In this example, our person, per their family history, is 11/32 Native or 34.38%.
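Steps 2 and 3 can be sketched the same way. This is only an illustration with a function name of my choosing; it needs Python 3.9 or later for math.lcm.

```python
from fractions import Fraction
from math import lcm  # available in Python 3.9+


def add_with_common_denominator(fractions):
    """Rescale each fraction to the least common denominator, then add the numerators."""
    fracs = [Fraction(f) for f in fractions]
    common = lcm(*(f.denominator for f in fracs))
    numerators = [f.numerator * (common // f.denominator) for f in fracs]
    return numerators, common, Fraction(sum(numerators), common)


numerators, common, total = add_with_common_denominator(["1/16", "3/16", "3/32"])
print(numerators, common, total, float(total) * 100)
```

The numerators come out as 2, 6 and 3 over 32, for a total of 11/32, or 34.375 percent before rounding.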

Patrick, who originally asked the question, is 1/16 + 3/16 which equals 4/16, which reduces to 1/4 (by dividing the same number, 4, into the top and bottom of the fraction), plus whatever amount that “2/3” really is.  So, Patrick is more than one quarter, at least on paper.

Genetics

The next question is often, “how do I prove that?”  In terms of Native ancestry, the answer varies on the purpose – general interest, tribal identification or tribal membership, etc.  I’ve written about that in two articles, here and here.

You can take a DNA test from Family Tree DNA called Family Finder that provides you with percentages of ethnicity, including Native American, as well as a list of cousin matches. They also offer additional testing that may be relevant if you descend from the native person paternally (if you are a male) or matrilineally (for both sexes.)

On the diagram below, you can see the Y DNA in blue, inherited by males from their father and the mitochondrial or matrilineal DNA in red, always inherited from the mother.  While the Y and mitochondrial tests give you very specific information on two lines, the Family Finder test provides you with ethnicity information from all of your lines.  It just can’t tell you which line or lines the Native heritage came from.

adopted pedigree

Often, due to admixture in the Native population over the past several hundred years, since the Europeans “discovered” America, the amount of Native DNA is less than expected and sometimes is so far back and such a small amount that it doesn’t show at all.

An individual could well be considered a full tribal member, yet have less than half Native heritage.  Examples that come to mind are Mary Jemison, an adopted captive who was European, but considered a full tribal member, and Sequoyah, who invented the Cherokee syllabary.   Even the Cherokee Chief, Benge, was at least half European, sporting red hair.  His mother was a member of the Cherokee tribe, so Benge was as well.  Cherokee Chief John Ross, born in 1790, was only one eighth Native.

So, the bottom line.  Enjoy your family history and heritage.  Document your family stories.  Understand that tribal membership was historically not a matter of percentages, at least not until the late 1800s and early 1900s.  Your ancestor either was or was not “Indian,” generally based on the tribal membership status of their mother.  There was no halfway and mixed didn’t matter.

DNA testing can confirm Native heritage.  It can also prove Native heritage in a variety of ways depending on how one descends from the Native ancestor(s), using Y and mitochondrial DNA.  Depending on whether Patrick is male or female, and how Patrick descends from his or her Native ancestors, the Y or mitochondrial DNA test can add a wealth of information to Patrick’s family history.

For some people, DNA testing is how one discovers that they have a Native ancestor.

So, how much Indian do you have in you, on paper and through DNA testing?

Are You Native? – Native American Haplogroup Origins and Ancestral Origins

At Family Tree DNA, having Haplogroup Origins and Ancestral Origins indicating Native American ancestry does not necessarily mean you are Native American or have Native American heritage.

This is a very pervasive myth that needs to be dispelled – although it’s easy to see how people draw that erroneous conclusion.  Let’s look at why – and how to draw a correct conclusion.

The good news is that more and more people are DNA testing.  The bad news is that errors in the system are tending to become more problematic, or said another way, GIGO – Garbage in, Garbage Out.

I want to address this problem in particular having to do with Native American ancestry – or the perception thereof.

At Family Tree DNA, everyone who tests their Y DNA or their mitochondrial DNA has both Haplogroup Origins and Ancestral Origins tabs as two of the 7 information tabs detailing their results.

haplogroup and ancestral orgins tab

The goals of these two pages are to provide the testers with locations around the world where their haplogroup is found, and locations where their matches’ ancestors are found – according to their matches.

Did a little neon danger sign start flashing?  It should have.

Haplogroup Origins

Haplogroup Origins provides testers with information about the origins of other individuals who match your haplogroup both exactly and nearly.  This database uses the location information from both the Family Tree DNA participant database and other academic or private databases.

haplogroup origins 2

Ancestral Origins

Ancestral Origins is comprised primarily of the results of the “most distant ancestor” country of your matches at Family Tree DNA.  This tab is designed to provide you a view into the locations where your closest matches are found at each of the testing levels.  After all, that’s where your ancestors are most likely to be from, as well.

ancestral origins 2

Most of the time this works really well, providing valuable information to testers, assuming two things:

1. Participants who are entering the information for their “most distant ancestor” understand that in the case of the Y line DNA – this is the most distant direct MALE ancestor who carries that paternal surname. Not his wife or someone else in that line.

Sometimes, people enter the name of the person in that line, in general, who lived to be the oldest – but that’s not what this field is requesting – the most distant – meaning further back in that direct line.

For mitochondrial DNA, this is the most distant FEMALE in your mother’s mother’s mother’s mother’s direct line – directly on up that maternal tree until you run out of mothers who have been identified. I can’t tell you how many male names I see listed as the “most distant ancestor” when I do DNA reports for people – and I know immediately that information is incorrect – along with their associated geographic locations.

mtdna matches

In this mitochondrial example, the third match shows a male Indian Chief.  The first problem is that this is a mitochondrial DNA test, so the mitochondrial DNA could not have descended from a male.  If you don’t understand how Y and mitochondrial DNA descends from ancestors, click here.

Secondly, there is no known genealogical descent from this chief – but that really doesn’t matter because the mtDNA cannot descend from a male and the batter is out with the first problem, before you ever get to the second issue.  However, if you are someone who is “looking for” Native American ancestry, this information is very welcome and even seems to be confirming – but it isn’t.  It’s a red herring.

Unfortunately, this may now have perpetuated itself in some fashion, because look at the first and last lines of this next entry – again – another male chief.  The second entry with a name is another male too, Domenico.  Hmmm….maybe information entered by other participants isn’t always reliable and shouldn’t be taken at face value….

mtdna matches 2

2. This approach works well if people enter only known, verified, proven information, not speculation. Herein lies the problem with Native American heritage. Let’s say that the family oral history says that my mother’s mother’s line is Native American. I decide to DNA test, so for the “Most Distant Ancestor” location I select “United States – Native American.”

united states selection

The DNA test comes back and shows heritage other than Native, but that previous information that I entered is never changed in the system.  Now, we have a non-Native haplogroup showing as a Native American result.

Unfortunately, I see this on an increasingly frequent basis – Native American “location” associated with non-Native haplogroups.

non native hap

This scenario has been occurring for some time now.  Family Tree DNA at one point attempted to help this situation by implementing a system in which you can select “United States” meaning you are brick walled here, and “United States Native American” which means your most distant ancestor in that line is Native American.

Native American Haplogroups

There are a very limited number of major haplogroups that include Native American results.  For mitochondrial DNA, they are A, B, C, D, X and possibly M.  I maintain a research list of the subgroups which are Native.  Each of these base haplogroups also have subgroups which are European and/or Asian.  The same holds true for Native American Y haplogroups Q and C.

In the Haplogroup Origins and Ancestral Origins, there are many examples where non-Native haplogroups are assigned as Native American, such as haplogroup H1a below.  Haplogroup H is European.

non native hap 2

A big hint as to an incorrect “Native” designation is when most or many of the other exact haplogroups, especially full sequence haplogroups, are not Native.  As Bennett Greenspan says, haplogroups and ethnicity are “guilt by genetic association.”  You aren’t going to find the same subhaplogroup in Czechoslovakia, Serbia or England and as a Native American too.

non native hap 3

Haplogroup J is European.

non native hap 4

Haplogroup K is European, and so is U2e1, below.

non native hap 5

Unfortunately, what is happening is that someone tests and sees that out of several matches, one is Native American.  People don’t even notice the rest of their matches, they only see the Native match, like the example above.  They then decide that they too must be Native, because they have a Native match, so they change their own “most distant ancestor” location to reflect Native heritage.  This happens most often when someone is brick walled in the US.

non native hap 6

Another issue is that people see haplogroup X and realize that haplogroup X is one of the 5 mitochondrial haplogroups, A, B, C, D and X, that define Native American DNA.  However, those haplogroups have many subgroups and only a few of those subgroups are Native American.  Many are Asian or European.  Regardless, participants see the main haplogroup designation of X and assume that means their ancestor was Native.  They then enter Native American.

In the example above, haplogroup X1c has never been found in a Native American individual or population, although we are still actively looking.  Haplogroup X2a is a Native American subgroup.
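The logic here is a two-step check: first the base haplogroup, then the subgroup. The sketch below is only an illustration. The subgroup table holds just the two calls mentioned in this article, and a real check would need a complete, maintained list of Native subgroups. All names in the code are my own.

```python
# Base mitochondrial haplogroups that include Native American subgroups,
# as listed in this article (M is only a "possibly", so it is left out here).
NATIVE_BASE_MTDNA = {"A", "B", "C", "D", "X"}

# Illustrative only: the two subgroup calls mentioned in the text.
KNOWN_SUBGROUPS = {"X2a": True, "X1c": False}


def assess_mtdna(haplogroup):
    """Return a rough verdict for a mitochondrial haplogroup string."""
    if haplogroup[:1].upper() not in NATIVE_BASE_MTDNA:
        return "base haplogroup not Native"
    if haplogroup in KNOWN_SUBGROUPS:
        if KNOWN_SUBGROUPS[haplogroup]:
            return "Native subgroup"
        return "subgroup not found in Native populations"
    return "base haplogroup can be Native; subgroup needs checking"


for hap in ("H1a", "X1c", "X2a", "B4a1a1"):
    print(hap, "->", assess_mtdna(hap))
```

Notice that the base letter alone never returns a "Native" verdict. It can only rule a haplogroup out or send you on to the subgroup check.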

In some cases, we are finding new subgroups of known Native haplogroups that are Native.  I recently wrote about this for haplogroup A4 where different subgroups are Asian, Jewish, Native and European.  This is, however, within an already known base haplogroup that includes a Native American subgroup – haplogroup A4.

When testers see these “Native American” results under Haplogroup and Ancestral Origins, they become very encouraged and excited.  Unfortunately, there is no way to verify which of your matches entered “Native American,” nor why, unless you have only a few matches and you can contact all of them.

When someone has tested at the full sequence level, remember that their results will show on these pages in the HVR1 section, the HVR2 section and the full sequence section.  So while it may look like there are three Native American results, there is only one, listed once in all three locations where it “counts.”  In the example below, there are two V3a1 full sequence matches that claim Native American.  Those were the chiefs shown above.  There are those two, plus one more HVR1+HVR2 individual who has entered Native American as well.  However, if the match total was one for the HVR1, HVR2 and coding regions, that would mean there is one person who tested and matched in all 3 categories, not that 3 people tested.  In other words, you don’t add the match totals together.
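Put another way, you count people, not rows. The sketch below uses made-up rows and assumes you could see a kit identifier for each entry, which the tabs themselves do not show, so treat it purely as an illustration of the counting rule.

```python
# Hypothetical rows: (kit id, testing level, self-reported location).
rows = [
    ("kit-1", "HVR1", "United States (Native American)"),
    ("kit-1", "HVR2", "United States (Native American)"),
    ("kit-1", "Full sequence", "United States (Native American)"),
    ("kit-2", "HVR1", "England"),
]


def count_rows(rows, location):
    """The naive total: every row counts, even repeats of the same tester."""
    return sum(1 for _kit, _level, loc in rows if loc == location)


def count_people(rows, location):
    """Count distinct testers reporting a location, however many levels list them."""
    return len({kit for kit, _level, loc in rows if loc == location})


place = "United States (Native American)"
print(count_rows(rows, place), count_people(rows, place))
```

Three rows, one person.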

non native hap 7

What Does A Native Match Look Like?

Of course, not all matches that indicate Native heritage are incorrect.  It’s a matter of looking at all of the available evidence and finding that guilt by genetic association.

In this first confirmed Native example, we see that the haplogroup is a known Native haplogroup, and all of the matches from outside the US are from areas known to have a preponderance of Native Americans in their population.  For example, about 80% of the people from Mexico carry Native American mitochondrial DNA.

Native 1

In this second example, we see Native American indicated, plus Mexico and Canada, which is typical.  In addition we see Spain.  Just like some people assume Native American, some people from Mexico, Central and South America presume that their ancestors are from Spain, so I always take these with a grain of salt.  Japan is a legitimate location for haplogroup B as well, especially given that this result is listed at the HVR1 level. If this individual tested at the HVR2 or full sequence level, they might be assigned to a different subgroup, and therefore would no longer be considered a match.

native 2

It’s not just what is present that’s important, but what is absent as well.  There is no long list of full sequence matches to people whose ancestors come from European countries like the U2 example above.  Spain is understandable, given the history of the settlement of the Americas, and that can be overlooked or considered and set aside.  Japan makes sense too.  But a European haplogroup combined with a long list of primarily European high level matches with only one or two “Native” matches is impossible to justify away.

What Does Native American Mean?

This discussion begs the question of what Native American means.

It’s certainly possible for someone with a European or African haplogroup to descend from someone who was a proven member of a tribe.  How is that possible?  Adoption, slavery and kidnapping.  All three were very prevalent practices in the Native culture.

For example, Mary Jemison is a very well-known frontierswoman adopted by the Seneca with many descendants today.  Was she Native?  Yes, she was adopted by the tribe.  Is her DNA Native?  No.  Were her ancestors Native?  No, they were European.  So, are her descendants Native, through her?  She married a Native man, so her descendants are clearly Native through him.  Whether you consider her descendants Native through her depends on how you define Native.  I think the answer would be both yes and no, and both should be a part of the history of Mary Jemison and her descendants.

If a European or African woman was kidnapped, enslaved or adopted into the tribe, and bore children, her children were full tribal members.  Of course, today her descendants might be unaware of her European or African roots, prior to her tribal membership.  Her mtDNA would, of course, come back as European or African, not Native.

This is a case where the culture of the tribe involved may overshadow the DNA in terms of definition of “Indian.”  However, genetically, that ancestor’s roots are still in either Europe or Africa, not in the Americas.

How Do We Know Which Haplogroups Are Native?

One of the problems we have today is that because there are so many people who carry the oral history of grandmother being “Cherokee,” it has become common to “self-assign” oneself as Native.  That’s all fine and good, until one begins to “self-assign” those haplogroups as Native as well – by virtue of that “Native” assignment in the Family Tree DNA data base.  That’s a horse of a different color.

Because having a Native American ancestor has become so popular, there are now entities who collect “self-assigned” Native descendants and ancestors and, if you match one of those “self-assigned” Native descendants and their haplogroups, voila, you too are magically Native.

I can tell you, being an administrator for the American Indian, Cherokee, Tuscarora, Lumbee and other Native American DNA projects – that list of “self-assigned” Native haplogroups would include every European and African haplogroup in existence – so we would one and all be Native – using that yardstick for comparison.  How about that!

Bottom line – no matter how unhappy it makes people – that’s just not true.

A great deal of research has been undertaken over the past two decades into Native American genetic heritage – and continues today.  The reason I started my Native American Mitochondrial DNA Haplogroup list is because it’s difficult to find and keep track of legitimate developments.  Any time someone tells me they have “heard” that haplogroup H, for example, is Native, I ask them for a credible source.  I’ve yet to see one.

How do we determine whether a haplogroup is Native, or not?

The litmus paper test is whether or not the haplogroup has been found in pre-contact burials.  If yes, then it can be considered that the ancestor was living on this continent prior to European contact.  Native people arrived from Asia, across Beringia into what is now Alaska, and then scattered over thousands of years across all of North and South America.  We see subgroups of these same haplogroups across this entire space.

In some locations, the Native people are much less admixed than, for example, the tribes that came into the earliest and closest contact with Europeans.  These tribes were decimated and many are now extinct.  I wrote about this in my paper titled, “Where Have All the Indians Gone.”

The tribes that are less admixed are probably the best barometers of Native heritage today.

We are hoping for new discoveries every day, but for today, we must rely on the information we have that is known and proven.

Interpreting Results Today

Native American haplogroup results today are subsets of Y DNA haplogroups Q and C.  If you find a haplogroup O result that might potentially be Native, PLEASE let me know.  This is also a possibility, but as yet unproven.

Mitochondrial Native American haplogroups include subgroups of A, B, C, D, X and possibly M.

If anyone tells you otherwise, personally or indirectly via Haplogroup or Ancestral Origins – keep in mind that extraordinary claims require extraordinary proof and data is only as good as its source.  Look at all of the information – what is present, what is absent, the testing level and what kind of documentation your matches have to share.

Finding your haplogroup listed as Native American in the Haplogroup or Ancestral Origins doesn’t make you Native American any more than it would make you an elephant if someone else listed “purple elephant.”

purple elephant

The only things that make you Native American are either a confirmed Native haplogroup subgroup, preferably with proven Native matches, or a confirmed genealogical paper trail.  Best of all scenarios is a combination of a Native haplogroup, matches that suggest or confirm your tribe and a proven paper trail.  That combination removes all doubt.

Evidence

Of the various kinds of evidence, some can stand alone, and some cannot.

Evidence Type | Evidence Results | Comments
DNA Y or mitochondrial | Confirmed Native American subgroup – can stand alone sometimes | With deep level testing, this can be enough to prove Native ancestry.  For Y  this generally means advanced SNP testing or matching to other proven Native participants.  For mitochondrial DNA, it means full sequence testing.
Proven paper trail | Proven Native tribal membership, but does not prove ancestral origins | Needs DNA evidence to prove whether the tribal member was admixed.
Matches to Haplogroup or Ancestral Origins | If Native is indicated, need to evaluate the rest of the information. | Level of testing, haplogroup, locations of most distant ancestors of other matches need to be evaluated, plus any paper trail evidence.
Autosomal DNA matches | To people with Native ancestry | Unless you can prove a common ancestor through triangulation, those individuals with Native ancestry could be related to you through any ancestor.  Matches to several people with Native ancestry does not indicate or suggest that you have Native ancestry.
Native DNA ethnicity through autosomal testing | Native American results | You can generally rely on these results, especially if they are over 5%.  Unless you have reason to believe that other regions could be providing some interfering results, this is probably a legitimate indication of Native heritage.  Locations that sometimes give Native results are Asia and eastern European countries that absorbed Asian invaders, such as the Slavic countries and Germany.  I wrote about this here.

If you don’t test, you can’t play.  If you think you have Native American ancestry, you can take the Y DNA test (at least to 37 markers) if you are a male, the full sequence test if you are testing mitochondrial DNA, or Family Finder to match family members from all ancestral lines and discover if you show any Native American in your ethnicity estimate provided in myOrigins.  Men can take all 3 tests and women can take the mitochondrial DNA and Family Finder tests.  Family Tree DNA is the only testing company providing this comprehensive level of testing.

Finding Your American Indian Tribe Using DNA

If I had a dollar for every time I get asked a flavor of this question, I’d be on a cruise someplace warm instead of writing this in the still-blustery cold winter weather of the northlands!

So, I’m going to write the recipe of how to do this.  The process is basically the same whether you’re utilizing Y or mitochondrial DNA, but the details differ just a bit.

So, to answer the first question.  Can you find your Indian tribe utilizing DNA?  Yes, it can sometimes be done – but not for everyone, not all the time and not even for most people.  And it takes work on your part.  Furthermore, you may wind up disproving the Indian heritage in a particular line, not proving it.  If you’re still in, keep reading.

I want you to think of this as a scavenger hunt.  No one is going to give you the prize.  You have to hunt and search for it, but I’m going to give you the treasure map.

Treasure map

I’m going to tell you, up front, I’m cheating and using an example case that I know works.  Most people aren’t this lucky.  Just so you know.  I don’t want to set the wrong expectations.  But you’ll never know if you don’t do the footwork to find out, so you’ve got nothing to lose and knowledge to gain, one way or another.  If you aren’t interested in the truth, regardless of what it is, then just stop reading here.

DNA testing isn’t the be-all and end-all.  I know, you’re shocked to hear me say this.  But, it’s not.  In fact, it’s generally just a beginning.  Your DNA test is not a surefire answer to much of anything.  It’s more like a door opening or closing.  If you’re looking for tribal membership or benefits of any kind, it’s extremely unlikely that DNA testing is going to help you.  All tribes have different rules, including blood quantum and often other insurmountable rules to join, so you’ll need to contact the tribe in question. Furthermore, you’ll need to utilize other types of records in addition to any DNA test results.

You’re going to have some homework from time to time in this article, and to understand the next portion, it’s really critical that you read the link to an article that explains about the 4 kinds of DNA that can be utilized in DNA testing for genealogy and how they work for Native testing.  It’s essential that you understand the difference between Y line, mitochondrial and autosomal DNA testing, who can take each kind of test, and why.

Proving Native American Ancestry Using DNA

For this article, I’m utilizing a mitochondrial DNA example, first because everyone has mitochondrial DNA and second because it’s often more difficult to use genealogically, since the surnames change.  Plus, I have a great case study to use.  For those who think mito DNA is useless, well all I can say is keep reading.

Y and mito

You’ll know from the article you just read that mitochondrial DNA is contributed to you, intact, from your direct line maternal ancestors, ONLY.  In other words, from your mother’s mother’s mother’s mother and on up that line.

In the above chart, you can see that this test only provides information about that one red line, and nothing at all about any of your other 15 great-great grandparents, or anyone else on that pedigree chart other than the red circles.  But oh what a story it can tell about the ancestors of those people in the red circles.

If this example was using Y DNA, then the process would be the same, but only for males – the blue squares.  If you’re a male, the Y DNA is passed unrecombined from your direct paternal, or surname, ancestor, only and does not tell you anything at all about any of your other ancestors except the line represented by the little blue squares.  Females don’t have a Y chromosome, which is what makes males male, so this doesn’t apply to females.
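The two lines on the chart can also be picked out with Ahnentafel numbering, a standard genealogical scheme in which you are number 1, each person’s father is twice that person’s number and each person’s mother is twice that number plus one.  This is just a sketch to show how narrow the two lines are.  The function names are my own and don’t come from any testing company’s tools.

```python
def patrilineal_line(generations):
    """Ahnentafel numbers for father, father's father, ... (the Y DNA line)."""
    numbers, current = [], 1
    for _ in range(generations):
        current = 2 * current        # step to this person's father
        numbers.append(current)
    return numbers


def matrilineal_line(generations):
    """Ahnentafel numbers for mother, mother's mother, ... (the mtDNA line)."""
    numbers, current = [], 1
    for _ in range(generations):
        current = 2 * current + 1    # step to this person's mother
        numbers.append(current)
    return numbers


# Four generations back reaches the 16 great-great grandparents (numbers 16 to 31).
print(patrilineal_line(4))   # the blue squares
print(matrilineal_line(4))   # the red circles
```

Of the 16 people in that last generation, a mitochondrial test only reaches number 31, and a Y test taken by a male only reaches number 16.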

First, you’ll need to test your DNA at Family Tree DNA.  This is the only testing company that offers either the Y (blue line) marker panel tests (37, 67 or 111), or the (red line) mitochondrial DNA full sequence tests.

For Y testing, order minimally the 37 marker test, but more is always better, so 67 or 111 is best.  For mitochondrial DNA, order the full sequence.  You’ll need your full mitochondrial haplogroup designation and this is the only way to obtain it.

I’m also going to be talking about how to incorporate your autosomal results into your search.  If you remember from the article, autosomal results give you a list of cousins that you are related to, and they can be from any and all of your ancestral lines.  In addition, you will receive your ethnicity result estimate expressed as a percentage.  It’s important to know that you are 25% Native, for example.  So, you also need to order the Family Finder test while you’re ordering.

You can click here to order your tests.

After you order, you’ll receive a kit number and password and you’ll have your own user page to display your results.

Fast forward a month or so now…and you have your results back.

A GEDCOM File

I hope you’ve been using that time to document as much about your ancestors as you can in a software program of some sort.  If so, upload your GEDCOM file to your personal page.  The program at Family Tree DNA utilizes your ancestral surnames to assist you in identifying matches to people in Family Finder.

It’s easy to upload, just click on the Family Tree icon in the middle of your personal page.

Family Tree icon

Don’t have a GEDCOM file?  You can build your tree online. Just click on the myFamilyTree to start.

Having a file online is an important tool for you and others for ancestor matching.

Your Personal Page

Take a little bit of time to familiarize yourself with how your personal page works.  For example, all of your options we’re going to be discussing are found under the “My DNA” link at the top left hand side of the page.

My dna tab

If you want to join projects, click on “My Projects,” to the right of “My DNA” on the top left bar, then click on “join.”  If you want to familiarize yourself with your security or other options, click on the orange “Manage Personal Information” on the left side of the page to the right of your image.

Personal info

Preparing Your Account

You need to be sure your account is prepared to give you the best return on your research efforts and investment.  You are going to be utilizing three tabs, Ancestral Origins, Haplogroup Origins and various projects, and you need to be sure your results are displayed accurately.  You need to do two things.

The first thing you need to do is to update your most distant ancestor information on your Matches Map page.  You’ll find this page under either the mtDNA or the Y DNA tabs and if you’ve tested for both, you need to update both.

matches map

Here’s my page, for example. At the bottom, click on “Update Ancestor’s Location” and follow the prompts to the end.  When you are finished, your page should look like mine – except of course, your balloon will be where your last known matrilineal ancestor lived – and that means for mitochondrial DNA, your mother’s mother’s mother’s line, on up the tree until you run out of mothers.  I can’t tell you how many men’s names I see in this field…and I know immediately someone is confused.  Remember, men can’t contribute mtDNA.

For men, if this is for your paternal Y line, this is your paternal surname line – because the Y DNA is passed in the same way that surnames are typically passed in the US – father to son.

It’s important to have your balloon in the correct location, because you’re going to see where your matches’ ancestors are found in relationship to your ancestor.  Your most distant ancestor’s location is represented by the white balloon.  However, you will only see balloons for matches who have entered the geographic information for their most distant ancestor. Now do you see why entering this information is important?  The more balloons, the more informative for everyone.

The second thing is that you need to make sure that the information about the location of your most distant ancestor is accurate.  Most Distant Ancestor information is NOT taken from the matches map page, but from the Most Distant Ancestors tab, which you reach through the orange “Manage Personal Information” link on your main page.  Click that link, then the Genealogy tab and then Most Distant Ancestors, shown below.

genealogy tab

If your ancestral brick wall is in the US, you can select 2 options, “United States” and “United States (Native American).”  Please Note – Please do not, let me repeat, DO NOT, enter the Native American option unless you have documented proof that your ancestor in this specific line is positively Native American.  Why?  Because people who match you will ASSUME you have proof and will then deduce they are Native because you are.

This is particularly problematic when someone sees they are a member of a haplogroup that includes a Native subgroup.  Haplogroup X1, which is not Native, is a prime example.  Haplogroup X2 is Native, but people in X1 see that X is Native, don’t look further or don’t understand that ALL of X is not Native – so they list their ancestry as United States (Native American) based on an erroneous assumption.  Then when other people see they match people who are X1 who are Native, they assume they are Native as well.  It’s like those horrible copied and copied again incorrect Ancestry trees.

distant ancestor US options

It’s important to update both the location and your most distant ancestor’s name. This is the information that will show in the various projects that you might join in both the “Ancestor Name” and the “Country” field.  As an example, the Estes Y project page is shown below.  You can see for yourself how useless those blank fields are under “Paternal Ancestor Name” and “Unknown Origin” under Country when no one has entered their information.

estes project tab

While you are working on these housekeeping tasks, this would be a good time to enter your ancestral surnames as well.  You can find this, also under the Genealogy Tab, under Surnames.  Surnames are used to show you other people who have taken the Family Finder test and who share the same surname, so this is really quite important.  These are surnames from both sides of your tree, from all of your direct ancestors.

surnames tab

Working With Results

Working with mitochondrial DNA genetic results is much easier than Y DNA.  To begin with, the full sequence test reads all of your mitochondrial DNA, and your haplogroup is fully determined by this test.  So once you receive those results, that’s all you need to purchase.

When working with Y DNA, there are the normal STR panels of 12, 25, 37, 67 and 111 markers which is where everyone interested in genealogy begins.  Then there are individual SNP tests you can take to confirm a specific haplogroup, panels of SNPs you can purchase and the Big Y test that reads the entire relevant portion of the Y chromosome.  You receive a haplogroup estimate that tends to be quite accurate with STR panel tests, but to confirm your actual haplogroup, or delve deeper, which is often necessary, you’ll need to work with project administrators to figure out which of the additional tests to purchase.  Your haplogroup estimate will reflect your main haplogroup of Q or C, if you are Native on that line, but to refine Q or C enough to confirm whether it is Native, European or Asian will require additional SNP testing unless you can tell based on close or exact STR panel matches to others who are proven Native or who have taken those SNP tests.

Y Native DNA

In the Y DNA lines, both haplogroups Q and C have specific SNP mutations that confirm Native heritage.  SNPs are the special mutations that define haplogroups and their branches.   With the new in-depth SNP testing available with the introduction of the Big Y test in 2013, new discoveries abound, but suffice it to say that by joining the appropriate haplogroup project, and the American Indian project, which I co-administer, you can work with the project administrators to determine whether your version of Q or C is Native or not.

Haplogroups Q and C are not evenly distributed.  For example, we often see haplogroup C in the Algonquian people of Eastern Canada and seldom in South America, whereas we see Q throughout the Americas.  This wiki page does a relatively good job of breaking this down by tribe.  Please note that haplogroup R1 has NEVER been proven to be Native – meaning that it has never been found in a pre-contact burial – and is not considered Native, although speculation abounds.

This page discusses haplogroup Q and this page, haplogroup C.

Haplogroup C in the Native population is defined by SNP C-P39 and now C-M217 as well.

Haplogroup Q is not as straightforward.  It was believed for some time that SNP Q-M3 defined the Native American population, but advanced testing has shown that is not entirely correct.  Not all Native Q men carry M3.  Some do not.  Therefore, Native people include those with SNPs M3, M346, L54, Z780 and one ancient burial with MEH2.  Recently, a newly defined SNP, Y4273 has been identified in haplogroup Q as possibly defining a group of Algonquian speakers.  Little by little, we are beginning to more clearly define the Native American genetic landscape although there is a very long way to go.

With or without the SNP tests, you can still tell a great deal based on who you match.

For Y and mitochondrial DNA (not autosomal), at the highest levels of testing, if you are matching only or primarily Jewish individuals, you’re not Native.  If you’re matching people in Scandinavia, or Asia, or Russia, nope, not Native.  If you’re matching individuals with known (proven) Native heritage in Oklahoma or New Mexico, then yep….you’re probably Native.

We’ll look at tools to do this in just a few minutes.

Mitochondrial Native DNA

There are several Native founder mitochondrial DNA lineages, meaning those that are believed to have developed about 15,000 years ago (plus or minus), during the time that the Native people spent living on Beringia, after leaving continental Asia and before dispersing in the Americas.

Those haplogroups (along with the Native Y haplogroups) are shown in this graphic from a paper by Tamm, et al, 2007, titled “Beringian Standstill and the Spread of Native American Founders.”

beringia map

The founder mitochondrial haplogroups and latecomers, based on this paper, are:

  • A2
  • B2
  • C1b
  • C1c
  • C1d
  • C4c
  • C1
  • D2
  • D2a
  • D4h3
  • X2a

Subsequent subgroups have been found, and another haplogroup, M, may also be Native.  I compiled a comprehensive list of all suspects.  This list is meant as a research tool, which is why it gives links to where you can find additional information and the source of each reference.  In some cases, you’ll discover that the haplogroup is found in both Asia and the Americas.  Oh boy, fun fun….just like the Y.

Be aware that because of the desire to “be Native” that some individuals have “identified” European haplogroups as Native.  I’ll be writing about this soon, but for now, suffice it to say that if you “self-identify” yourself as Native (like my family did) and then you turn up with a European haplogroup – that does NOT make that European haplogroup Native.  So, when the next person in that haplogroup tests, and you tell them they match “Native” people with European haplogroups – it’s misleading to say the least.

When working to identify your Native heritage, some of your best tools will be the offerings of Family Tree DNA on your personal page.  The same tools exist for both Y and mitochondrial DNA results, so let’s take a look.

Your Results

If your ancestor was Native on your direct matrilineal line, then her haplogroup will fall within one of 5 or 6 haplogroups.  The confirmed Native American mitochondrial haplogroups fall into major haplogroups A, B, C, D and X, with haplogroup M a possibility, but extremely rare and as yet, unconfirmed.  Known Y haplogroups are C and Q with O as an additional possibility.

Now, just because you find yourself with one of these haplogroups doesn’t mean automatically that it’s Native, or that your ancestors in this line were Native.  If your haplogroup isn’t one of these, then you aren’t Native on this line.  For example, we find male haplogroup C around the world, including in Europe.

Here is the list of known and possible Native mitochondrial DNA haplogroups and subgroups.

If your results don’t fall into these haplogroups, then your matrilineal ancestor was not Native on this particular line.  If your ancestor does fall into these base groups, then you need to look at the subgroup to confirm that they are indeed Native and not in one of the non-Native sister clades.  Does this happen often?  Yes, it does, and there are a whole lot of people who see Q or C for the Y DNA and immediately assume they are Native, as they do when they see A, B, C, D or X for mitochondrial.  Just remember about assume.
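The two-step check described above, base haplogroup first and subgroup second, can be sketched roughly in code.  Please treat this as a thinking aid and not a classifier.  The founder list is simply the one given earlier in this article, the function name and result wording are my own, and the name-prefix shortcut is an assumption that real haplogroup trees don’t always honor.

```python
# Major mitochondrial haplogroups that contain Native subgroups, per this article.
# M is included only as the unconfirmed possibility mentioned in the text.
BASE_GROUPS_WITH_NATIVE_SUBGROUPS = {"A", "B", "C", "D", "X", "M"}

# Founder lineages exactly as listed earlier in this article (Tamm et al., 2007).
FOUNDER_LINEAGES = ["A2", "B2", "C1b", "C1c", "C1d", "C4c", "C1",
                    "D2", "D2a", "D4h3", "X2a"]

NOT_NATIVE = "not Native on this line"
CHECK_SUBGROUP = "base group has Native subgroups, but this one is not on the founder list; research the subgroup"
FOUNDER = "falls under a listed founder lineage; confirm with full sequence results and matches"


def screen_mtdna(haplogroup):
    """Rough first-pass screen of a mitochondrial haplogroup name."""
    name = haplogroup.strip()
    if name[:1].upper() not in BASE_GROUPS_WITH_NATIVE_SUBGROUPS:
        return NOT_NATIVE
    # ASSUMPTION: a subgroup's name begins with its parent's name (X2a1 under X2a).
    # Real haplogroup naming does not always work this way, so this is a hint only.
    if any(name.startswith(lineage) for lineage in FOUNDER_LINEAGES):
        return FOUNDER
    return CHECK_SUBGROUP


for example in ["U2", "V3a1", "X1c", "X2a"]:
    print(example, "->", screen_mtdna(example))
```

Notice that X1c gets past the first step because its base haplogroup is X, and it is only the subgroup check that sets it apart from X2a.  That is exactly the step people skip.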

Scenario 1: 

Oh No! My Haplogroup is NOT Native???

Let’s say your mitochondrial ancestor is not in haplogroup A, B, C, D, X or M.

About now, many people choke, because they are just sure that their matrilineal ancestor was Native, for a variety of reasons, so let’s talk about that.

  1. Family history says so. Mine did too. It was wrong. Or more precisely, wrong about which line.  Test other contributing lineages to the ancestor who was identified as Native.
  2. The Native ancestor is on the maternal line, but not in the direct matrilineal line. There’s a difference. Remember, mitochondrial DNA only tests the direct matrilineal line. What this means is that, for example, if your grandmother’s father was Native, your grandmother is still Native, or half Native, but not through her mother’s side so IT WON’T SHOW ON A MITOCHONDRIAL DNA TEST. In times past, stories like “grandma was Indian” were what got passed down. Not, grandmother’s father’s father’s mother was Waccamaw. Any Indian heritage got conveyed in the message about that ancestor, without giving the source, which leads to a lot of incorrect assumptions – and a lot of DNA tests that don’t produce the expected results. This is exactly what happened in my family line.
  3. Your ancestor is “Native” but her genetic ancestor was not – meaning she may have been adopted into the tribe, or kidnapped or was for some other reason a tribal member, but not originally genetically Native on the direct matrilineal line.  Mary Jemison is the perfect example.
  4. My ancestor’s picture looks Native. Great! That could have come from any of her other ancestors on her pedigree chart. Let’s see what other evidence we can find.

At this point, you’re disappointed, but you are not dead in the water and there are ways to move forward to search for your Native heritage on other lines.  What I would suggest are the following three action items.

1. Look at your family pedigree chart and see who else can be tested to determine a haplogroup for other lineages. For example, let’s say, your grandmother’s father. He would not have passed on any of his mother’s mitochondrial DNA, but his sisters would have passed their mother’s mitochondrial DNA to their children, and their daughters would pass it on as well. So dig out your pedigree chart and see who is alive today that can test to represent other contributing ancestral lines.

2. Take a look at your Family Finder ethnicity chart under myOrigins and see how much Native DNA you have.

FF no Native

If your ethnicity chart looks like this one, with no New World showing, it means that if you have Native heritage, it’s probably more than 5 or 6 generations back in time and the current technology can’t measurably read those small amounts.  However, this is only measuring admixed or recombined DNA, meaning the DNA you received from both your mother and father.  Recombination in essence halves the DNA of each of your ancestors in each generation, so it’s not long until it’s so small that it’s unmeasurable today.
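The halving is easy to put into numbers.  Here is a minimal sketch, assuming a single Native ancestor and an exact halving in every generation.  In real life the amount inherited from any one ancestor varies around this average, so these are expected values, not what a test will report.

```python
def expected_share(generations_back):
    """Average percent of your autosomal DNA from one ancestor that many generations back."""
    return 100 / (2 ** generations_back)


# 1 = parent, 2 = grandparent, 3 = great-grandparent, and so on.
for n in range(1, 9):
    print(f"{n} generations back: about {expected_share(n):.2f}%")
```

A single Native grandparent works out to 25%, while an ancestor 6 generations back averages about 1.56%, which is already at the edge of what can be measured.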

You can also download your raw autosomal data file and upload it to http://www.gedmatch.com to utilize their admixture tools to look for small amounts of Native heritage.  However, beware that small amounts of Native admixture can also be found in people with Asian ancestors, like Slavic Europeans.

The person whose results are shown above does have proven Native Ancestry, both via paper documentation and mitochondrial DNA results – but her Native ancestor is back in French Canada in the 1600s.  Too much admixture has occurred between then and now for the Native to be found on the autosomal test, but mtDNA is forever.

If your Y or mtDNA haplogroup is Native, there is no division in each generation, so nothing washes out. If Y or mtDNA is Native, it stays fully Native forever, even if the rest of your autosomal Native DNA has washed out with succeeding generations.  That is the blessing of both Y and mtDNA testing!

FF native

If your myOrigins ethnicity chart looks like this one, which shows a significant amount of New World and other areas that typically, in conjunction with New World, are interpreted as additional Native contribution, such as the Asian groups, and your Y and/or mtDNA is not Native, then you’re looking at the wrong ancestor in your tree.  Your mtDNA or Y DNA test has just eliminated this specific line – but none of the lines that “married in.”

You can do a couple of things – find more people to test for Y and mtDNA in other lines.  In this case, 18% Native is significant.  In this person’s case, she could eliminate her father’s line, because he was known not to be Native.  Her mother was Hispanic – a prime candidate for Native ancestry.  The next thing for this person to do is to test her mother’s brother’s Y DNA to determine her mother’s father’s Y haplogroup.  He could be the source of the Native heritage in her family.

3. The third thing to do is to utilize Family Finder matching to see who you match that also carries Native heritage. In the chart below, you can see which of your Family Finder matches also carry a percentage of Native ancestry. This only shows their Native match percent if you have Native. In other words, it doesn’t show a category for your matches that you don’t also have.

ff native matches

Please note – just because you match someone who also carries Native American heritage does NOT mean that your Native line is how you match.

For example, in one person’s case, their Native heritage is on their mother’s side.  They also match their father’s cousin, who also carries Native heritage but he got his Native heritage from his mother’s line.  So they both carry Native heritage, but their matching DNA and ancestry are on their non-Native lines.

Lots of people send me e-mails that say things like this, “I match many people with Cherokee heritage.”  But what they don’t realize is that unless you share common proven ancestors, that doesn’t matter.  It’s circumstantial.  Think about it this way.

When measuring back 6 generations, which is generally (but not always) the last generation at which autosomal DNA can reliably find matches between people, you have 64 ancestors.  So does the other person.  You match on at least one of those ancestors (or ancestral lines), and maybe more.  If one of your ancestors and one of your match’s ancestors are both Native, then the chance that your match runs through that particular ancestor is 1 in 64.  So you’re actually much more likely to share a different ancestor.  Occasionally, you will actually match the same Native ancestor.  Just don’t assume, because you know what assume does – and you’ll be wrong 63 out of 64 times.
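To make the arithmetic concrete, here is a minimal sketch of where the 64 comes from. It assumes a simple tree in which every ancestor is a different person (no cousin marriages) and treats every ancestral line as equally likely to be the source of a match, which is a simplification. The function name is just for illustration.

```python
def ancestors_at_generation(generations: int) -> int:
    """Ancestral slots in one generation: 2 parents, 4 grandparents, 8 great-grandparents..."""
    if generations < 0:
        raise ValueError("generations must be zero or more")
    return 2 ** generations


slots = ancestors_at_generation(6)       # 64 ancestors six generations back
chance_same_line = 1 / slots             # odds the match runs through one specific ancestor
chance_other_line = 1 - chance_same_line

print(f"{slots} ancestors at generation 6")
print(f"Chance the match is through one given ancestor: {chance_same_line:.1%}")
print(f"Chance it is through some other ancestor: {chance_other_line:.1%}")
```

In endogamous families the same person can fill more than one slot, so real odds can differ from this simple estimate.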

Sharing Native ancestry with one or several of your matches is a possible clue, but nothing more.

Scenario 2:

Yippee!!  My Haplogroup IS Native!!!

Ok, take a few minutes to do the happy dance – because when you’re done – we still have work to do!!!

happy dance frog

Many people actually find out about their Native American heritage by a surprise Native American haplogroup result.  But now, it’s time to figure out if your haplogroup really IS Native.

As I mentioned before, many of the major haplogroups have some members who are from Europe, Asia and the Americas.  Fortunately, the New World lines have been separated from the Old World lines long enough to develop specific and separate mutations that enable us to tell the difference – most of the time.  If you’re interested, I recently wrote a paper about the various European, Jewish, Asian and Native American groups within subgroups of haplogroup A4.  If you’re curious about how haplogroups can have subgroups on different continents, then read this article about Haplogroups and The Three Brothers.  This is also an article that is helpful when trying to understand what your matches do, and don’t, mean.

So, before going any further, check your haplogroup subgroup and make sure your results really do fall into the Native subgroups.  If they don’t, then go back to the “Not Native” section.  If you aren’t sure, which typically means you’re a male with an estimated haplogroup of C or Q, then keep reading because we have some tools available that may help clarify the situation.

Utilizing Personal Page Y and Mito Tools to Find Your Tribe

Much of Y and mitochondrial DNA genetic genealogy matching is “guilt by genetic association,” to quote Bennett Greenspan.  In other words we can tell a great deal about your heritage by who you match – and who you don’t match.

Let’s say you are haplogroup B2a2 – that’s a really nice Native American haplogroup, a subgroup of B2a, a known Beringian founder.  B2a2 developed in the Americas and has never been found outside of the Native population in the Americas.  In other words, there is no controversy or drama surrounding this haplogroup.

It just so happens that our “finding your tribe” example is a haplogroup B2a2 individual, Cindy, so let’s take a look at how we work through this process.

Take a look at Cindy’s Matches Map tab, which shows the location of each match’s most distant ancestor on their matrilineal line (hopefully that’s what they entered).  Only one of Cindy’s full sequence matches has entered their ancestor’s geographic information.  However, it’s not far from Cindy’s ancestor, which is shown by the white balloon.

Cindy full seq match

Please note that Cindy, who is haplogroup B2a2, has NO European matching individuals.  In fact, no matches outside of North and South America.  Being Native, we would not expect her to have matches elsewhere, but since the match location field is self-entered and depends on the understanding of the person entering the information, sometimes information provided seems confusing.  Occasionally information found here has to be taken with a grain of salt, or confirmed with the individual who entered the information.

For example, I have one instance of someone with all Native matches having one Spanish match.  When asked about this, the person entering the information said, “Oh, our family was Spanish.”  And of course, if you see a male name entered in the most distant ancestor field for mtDNA, or a female for Y DNA, you know there is a problem.

While the full sequence test is by far the best, don’t neglect to look at the HVR1 and HVR2 results, because not everyone tests at higher levels and there may be hints waiting there for you.  There certainly was for Cindy.

Cindy HVR1 match

Look at Cindy’s cluster of HVR1 matches.  Let’s look at the New Mexico group more closely.

Cindy HVR1 NM matches

Look how tightly these are clustered.  One is so close to Cindy’s ancestor that the red balloon almost obscures her white balloon.  By clicking on the red balloons, that person’s information pops up.

You will also want to utilize the Haplogroup and Ancestral Origins tabs.  The Haplogroup Origins provides you with academic and research data with some participant data included.  The Ancestral Origins tab provides you with the locations where your matches say their most distant ancestor is from.

Cindy’s Haplogroup Origins page looks like this.

Cindy haplogroup origins

Keep in mind that your closest matches are generally the most precise – for mitochondrial DNA meaning the group at the bottom titled “HVR1, HVR2 and Coding Region Matches.”  In Cindy’s case, above, at both the HVR1 and HVR2 levels, she also matches individuals in haplogroup B4’5, but at the highest level, she will only match her own haplogroup.

Next Cindy’s Ancestral Origins tab shows us the locations where her matches indicate their most distant ancestor is found.

Cindy ancestral origins

These people, at least some of them, identified themselves as Native American and their DNA along with genealogy research confirmed their accuracy.

Now, it’s time to look at your matches.

Cindy fs matches

If you’re lucky, now that you know positively that your results are Native (because you carry an exclusive Native haplogroup), and so do your matches, one of them will not only list their most distant ancestor, they will also put a nice little heartwarming note like (Apache) or (Navajo) or (Pueblo).  Now that one word would just make your day.

Another word of caution.  Even though that would make your day, that’s not always YOUR answer.  Why not?  Because Native people intermarried with other tribes, sometimes willingly, and sometimes not by choice.  Willingly or not, their DNA went along with them and sometimes you will find someone among the Apache that is really a Plains Indian, for example.  So you can get excited, but don’t get too excited until you find a few matches who know positively what tribe their ancestor was from.

Proof

So let’s talk about what positive means.  When someone tells me they are a member of the Cherokee Tribe for example, I ask which Cherokee tribe, because there are many that are not the federally recognized tribes and accept a wide variety of people based on their family stories and little more except an enrollment fee.  I’m not saying that’s bad, I’m saying you don’t want to base the identity of your ancestor’s tribe, unwittingly, on a situation like that.

If the answer is the official Cherokee Nation in Oklahoma, for example, whose enrollment criteria I understand, then I ask them based on which ancestral line.  It could well be that they are a tribal member based on one relative and their mitochondrial DNA goes to an entirely different tribe.  In fact, I had this exact situation recently.  Their mitochondrial DNA was Seminole and they were a member of a different tribe based on a different lineage.

If the match is not a tribal member or descended from a tribal member, then I try, tactfully, to ask what proof they have that they are descended from that particular tribe.  It’s important to ask this in a nonconfrontational way, but you do need to know because if their claim to Native heritage is based on a family story, that’s entirely different than if it is based on the fact that their direct mitochondrial ancestor was listed on one of the government rolls on which tribal citizenship was predicated.

So, in essence, by your matches proving their mitochondrial lineage as Native and affiliated with a particular tribe, they are, in part, proving yours, or at least giving you a really big hint, because at some point you do share a common matrilineal ancestor.

You may find that two of your matches track their lineage to different tribes.  At that point, fall back to languages.  Are the tribes from the same language group?  If so, then your ancestor may be further back in time.  If not, then most likely someone married, was kidnapped, adopted or sold into slavery from one tribe to the other.  Take a look at the history and geography of the two tribes involved.

Advanced Matching

It’s difficult to tell with any reasonable accuracy how long ago you share a common ancestor with someone that you match on either Y or mtDNA.  Family Tree DNA does provide guidelines, but those are based on statistical probabilities, and while they are certainly better than nothing, one size does not fit all and doesn’t tend to fit anyone very well.  I don’t mean this to be a criticism of Family Tree DNA – it’s just the nature of the beast.

For Y DNA, you can utilize the TIP tool, shown as the orange icon on your match bar, and the learning center provides information about mitochondrial time estimates to a common ancestor.  Let me say that I find the 5 generation estimate at the 50th percentile for a full sequence match extremely optimistic.  This version is a bit older but more detailed.

mtdna mrca chart

However, you can utilize another tool to see if you match anyone autosomally that you also match on your mitochondrial or Y DNA.  Before you do this, take a look at your closest matches and make note of whether they took the Family Finder test.  That will be listed by their name on the match table, by the FF, at right, below.

mtdna matches plus ff

If they didn’t take the Family Finder test, then you obviously won’t match them on that test.

On your mtDNA or Y DNA options panel, select Advanced Matching.

advanced matching

You’ll see the following screen.  Select both Family Finder and ONE of the mtDNA selections.  Why just one?  Because you’re going to select “show only people I match on selected tests,” which means all the tests that you select.  Not everyone takes all the tests or matches on all three levels, so search one level of mtDNA plus Family Finder at a time.  This means if you have matches on all 3 mitochondrial levels, you’ll run this query 3 times.  If you’re working with Y DNA, then you’ll do the same thing, selecting the 12-111 panels one at a time in combination with Family Finder.
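If it helps to think of this as set logic, here is a rough sketch of why you run one mtDNA level at a time. It assumes “show only people I match on selected tests” behaves like an intersection of match lists, as described above. The match names and the function are made up for illustration and are not part of any Family Tree DNA tool.

```python
def advanced_match(family_finder: set[str], other_test: set[str]) -> set[str]:
    """People who appear on BOTH match lists."""
    return family_finder & other_test


# Hypothetical match lists, for illustration only
family_finder = {"Ana", "Ben", "Carla", "Dee"}
mtdna_levels = {
    "HVR1": {"Ana", "Ben", "Eli"},
    "HVR1+HVR2": {"Ana"},
    "Full sequence": set(),   # no matches at this level
}

# One query per mtDNA level, each combined with Family Finder
one_at_a_time = {
    level: advanced_match(family_finder, matches)
    for level, matches in mtdna_levels.items()
}

# Selecting every level at once requires a match on ALL of them
all_at_once = family_finder.intersection(*mtdna_levels.values())

for level, people in one_at_a_time.items():
    print(level, sorted(people))
print("All levels selected together:", sorted(all_at_once))
```

In this made-up case, selecting everything together returns nobody, while the HVR1 query alone finds two people who also match on Family Finder.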

The results show you who matches you on BOTH the Family Finder and the mtDNA test, one level at a time.  Here are the results for Cindy comparing her B2a2 HVR1 region mitochondrial DNA (where she had the most matches) and Family Finder.

advanced matches results

Remember those clusters of people that we saw near Cindy’s oldest ancestor on the map?  It’s Cindy’s lucky day.  She is extremely lucky to match three of her HVR1 matches on Family Finder.  And yes, that red balloon overlapping her own balloon is one of the matches here as well.  Cindy just won the Native American “find my tribe” lottery!!!!  Before testing, Cindy had no idea and now she has 3 new autosomal cousins AND she knows that her ancestor was Native and has a very good idea of which tribe.  Several of the people Cindy matches knew their ancestor’s tribal affiliation.

So, now we know that not only does Cindy share a direct matrilineal ancestor with these people, but that ancestor is likely to be within 5 or 6 generations, which is the typical reach for the Family Finder matching, with one caveat…and that’s endogamous populations.  And yes, Native American people are an endogamous group.  They didn’t have anyone else to marry except for other Native people for thousands of years.  In recent times, and especially east of the Mississippi, significant admixture has occurred, but not so much in New Mexico at least not across the board.  The message here is that with endogamous populations, autosomal relationships can look closer than they really are because there is so much common DNA within the population as a whole.  That said, Cindy did find a common ancestor with some of her matches – and because they matched on their mitochondrial DNA, they knew exactly where in their trees to look.

Identifying your Tribe

Being able to utilize DNA to find your tribe is much like a puzzle.  It’s a little bit science, meaning the DNA testing itself, a dose of elbow grease, meaning the genealogy and research work, and a dash of luck mixed with some magic to match someone (or ones) who actually know their tribal affiliation.  And if you’re really REALLY lucky, you’ll find your common ancestor while you’re at it!  Cindy did!

In essence, all of these pieces of information are evidence in your story.  In the end, you have to evaluate all of the cumulative pieces of evidence as to quality, accuracy and relevance.  These pieces of evidence are also breadcrumbs and clues for you to follow – to find your own personal answer.  After all, your story and that of your ancestors isn’t exactly like anyone else’s.  Yes, it’s work, but it’s possible and it happens.

In case you think Cindy’s case is a one time occurrence, it’s not.  Lenny Trujillo did the same thing and wrote about his experience.  Here’s hoping you’re the next person to make the same kind of breakthrough.

New Haplogroup C Native American Subgroups

Haplogroup C is one of two haplogroups, the other being Q, which are found as part of the Native American paternal population in the Americas.  Both C and Q were founded in Asia and subgroups of both are found today in Asia, Europe and the New World.  The subgroups found in the Americas are generally unique to that location.  I wrote about some of the early results of haplogroup Q being divided into subgroups through Big Y testing here.

In the Americas, haplogroup Q is much more prevalent in the Native population.  Haplogroup C is rarely found, and was originally found mostly in Canada.

Hap C Americas

This chart, compliments of Family Tree DNA, shows the frequency distribution in the Americas between haplogroups Q and C.

However, in the Zegura et al article in 2004, haplogroup C was found in very small percentages elsewhere.

The authors found the following P39 men among the samples:

Northern Athabaskan:

  • Tanana of Alaska, 5 of 12

Southern Athabaskan:

  • Apache, 14 of 96
  • Navajo, 1 of 78

Algonquian (Plains):

  • Cheyenne, 7 of 44

Siouan–Catawban (Plains):

  • Sioux, 5 of 44
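For readers who prefer percentages, the counts above can be converted with a few lines of code. This is only a sketch of the arithmetic using the numbers listed here. The sample sizes are small, so the percentages should be read loosely.

```python
# (P39 men, men sampled) for each group, as listed above
p39_counts = {
    "Tanana": (5, 12),
    "Apache": (14, 96),
    "Navajo": (1, 78),
    "Cheyenne": (7, 44),
    "Sioux": (5, 44),
}

# Percentage of sampled men in each group who carry P39
percent = {
    group: 100 * carriers / sampled
    for group, (carriers, sampled) in p39_counts.items()
}

for group, (carriers, sampled) in p39_counts.items():
    print(f"{group:<9}{carriers:>3} of {sampled:<3} = {percent[group]:.1f}%")

total_carriers = sum(carriers for carriers, _ in p39_counts.values())
total_sampled = sum(sampled for _, sampled in p39_counts.values())
print(f"Listed groups combined: {total_carriers} of {total_sampled}")
```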

I was speaking with Spencer Wells (from the Genographic Project) about this at one point and he said to keep in mind that the Athabaskan migration to the Southwest was only about 600 years ago. That is why our one Southwestern C-P39 looks like he is related to all the other families about 600 years ago.

There are competing theories about whether the Athabaskan came down across the plains or along the western mountains/coast. I found a few recent studies that say both are likely true.  We don’t know if the C-P39 found on the Plains is residual from the migration event or from another source.

In the American Indian DNA Project and other relevant DNA projects, we find haplogroup C in New Mexico, Virginia, Illinois, Canada, New Brunswick, Ontario and Nova Scotia.

In 2012, Marie Rundquist, founder of the Amerindian Ancestry Out of Acadia DNA Project as well as co-founder of the C-P39 DNA project, wrote a paper titled “C3b Y Chromosome DNA Test Results Point to Native American Deep Ancestry, Relatedness, among United States and Canadian Study Participants.”

At that time, haplogroup C-P39 (formerly C3b) was the only identified Native American subgroup of haplogroup C.  Since that time, additional people have tested and the Big Y has been introduced.  Just recently, another subgroup of haplogroup C, C-M217, was proven to be Native and can be seen as the first line in the haplotree chart shown below.

The past 18 months or so with the advent of full genome sequencing of the Y chromosome with the Big Y test from Family Tree DNA and other similar tests have provided significant information about new haplotree branches in all haplogroups.

Ray Banks, one of the administrators of the Y DNA haplogroup C project and a haplogroup coordinator for the ISOGG tree has been focused on sorting the newly found SNPs and novel variants discovered during Big Y testing into their proper location on the Y haplogroup tree.

I asked Ray to write a summary of his findings relative to the Native American aspect of haplogroup C.  He kindly complied, as follows:

By way of a simplified explanation, a 2012 study by Dulik et al. reported that southern Altaians (south central Russia) were the closest living relatives of Amerindian Haplogroup Q men they could identify.

Male haplogroup Q is the dominant finding within Amerindian populations of the Americas.

But male haplogroup C-P39 is also found in smaller percentages among Amerindians of North America.  A second type, of a different, poorly defined C, has been identified among rainforest Indians of northwestern South America.

The 2004 study by Zegura et al. reported that C-P39 was present in some quantities among some Plains and Southwest Indians of the United States, as well as among Tananas of Alaska.  No one has done a comprehensive inventory of Amerindian Y-DNA haplogroups.  A high percentage of the Amerindian samples at Family Tree DNA that are P39, in contrast, report ancestry in central or eastern Canada.

It does not seem that anyone has yet definitively addressed whether C-P39 men have a different relationship pattern in relation to Asian groups than seen in haplogroup Q.  Another question is whether they might have been involved in a more recent migration from Asia than Q men who seem to have quickly migrated to all areas of South America as well.

Four men in the Haplogroup C Projects have made their Big Y results available for analysis.  All are from Canada, living in areas varying from central to maritime Canada.

These results show that the four men can be divided into two main groups.  The mutations Z30750 and Z30764 have been tentatively assigned to represent these subgroups.  The number of unique mutations for each man suggests these two subgroups each diverged from the overall P39 group about 3,500 years ago.  This is based on the 150 years per mutation figure that is being widely used.  There is no consensus for what number of years per mutation should be used.  Likewise, the total number of shared SNPs within P39, suggests 14,100 years as the divergence time from any other identified Y-DNA subgroup.  The Composite Y-DNA Tree by Ray Banks contains about 3,700 Y subgroups for comparison.
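The dating method described here is simple multiplication: a count of mutations times an assumed number of years per mutation. The sketch below shows that arithmetic under the 150-year figure quoted above. The mutation counts themselves are not given in the text, so the code works backward from the published ages. The alternative rates of 120 and 180 years are hypothetical values, included only to show how sensitive the estimate is to the rate chosen.

```python
YEARS_PER_MUTATION = 150  # figure quoted in the text; there is no consensus value


def estimated_age(mutation_count: float,
                  years_per_mutation: float = YEARS_PER_MUTATION) -> float:
    """Age estimate in years for a branch with the given number of mutations."""
    return mutation_count * years_per_mutation


def implied_mutations(age_years: float,
                      years_per_mutation: float = YEARS_PER_MUTATION) -> float:
    """Work backward: how many mutations a given age implies at this rate."""
    return age_years / years_per_mutation


# Back-calculated from the ages in the text, not reported counts
shared_snps = implied_mutations(14_100)   # mutations shared across P39
private_snps = implied_mutations(3_500)   # rough average per subgroup

print(f"14,100 years implies about {shared_snps:.0f} shared mutations")
print(f"3,500 years implies about {private_snps:.1f} mutations per subgroup")

# Hypothetical alternative rates, to show how much the answer moves
for rate in (120, 150, 180):
    print(f"{rate} years/mutation -> {estimated_age(shared_snps, rate):,.0f} years")
```

Because there is no agreed rate, any age produced this way is better read as a range than as a single number.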

Ray Banks C Tree 3

The nearest subgroup to P39 has been identified as the F1756 subgroup, last line in the chart above.  These both share as a common earlier subgroup, F4015.   This parallel F1756 subgroup has been identified in Geno 2.0 testing as well as Big Y as containing mostly men from Kazakhstan, Kyrgyzstan and Afghanistan.  Some apparently have a tradition of a migration from Siberia.

There is available a Big Y test from among this group, and more recently complete Y sequencing in the sample file GS27578 at the Estonian Genome Centre.

Each of these men potentially could have shared one or more of the P39 equivalents creating a new subgroup older than P39.  But this is not the case.  The Big Y results are not complete genome sequencing, and they perhaps miss 30% of useful SNPs, mostly due to inconclusive reads.

The man in the Estonian collection is of particular interest because he is described as an Altaian of Kaysyn in Siberia, Russia.  He is not from the same town as samples in the earlier Dulik study, and thus no direct comparisons can be made.

The Big Y F1756 sample is geographically atypical because the man is Polish but still shares the unusual DYS448=null feature seen in all the available F1756 men in the C Project.  The project P39 men have either 20 or 21 repeats at this marker, instead of a null value.

In conclusion, the age of the P39 group and the failure of others so far to share its many equivalent mutations suggest together that the C-P39 men could have been part of the earliest migration to the Americas.  Like the Q men, the nearest relatives to C-P39 men have central Asian or Siberian origins.

Despite some identification of P39 branching, much work remains to be done to understand that branching, because so few samples are available.

So, what’s the bottom line?

  1. C-P39 is being divided into subgroups as more Big Y and similar test results become available. If additional individuals who carry C-P39 were to take the Big Y test, especially from the more unusual locations, we might well find additional new, undiscovered, haplogroups or subgroups.  Eventually, we may be able to associate subgroups with tribes or at least languages or regions.
  2. If you are a Y DNA haplogroup C individual, and in particular C-P39, and have taken the Big Y test, PLEASE join the haplogroup C and C-P39 projects. Without a basis for comparison, much of the benefit of these tests in terms of understanding haplogroup structure is lost entirely.

As always, the power of DNA testing is in sharing and comparing.

Thank you Ray Banks, Marie Rundquist and DNA testers who have contributed by testing and sharing.

Haplogroup A4 Unpeeled – European, Jewish, Asian and Native American

Mitochondrial DNA provides us with a unique periscope back in time to view our most distant ancestors, and the path that they took through time and place to become us, here, today.  Because mitochondrial DNA is passed from generation to generation through an all-female line, un-admixed with the DNA from the father, the mitochondrial DNA we carry today is essentially the same as that carried by our ancestors hundreds or even thousands of years ago, with the exception of an occasional mutation.

Y and mito

You can see in the pedigree chart above that the red mitochondrial DNA is passed directly down the matrilineal line.  Women contribute their mitochondrial DNA to all of their children, of both genders, but only the females pass it on.

Because this DNA is preserved in descendants, relatively unchanged, for thousands of years, we can equate haplogroups, or clans, to specific regions of the world where that particular haplogroup was born by virtue of a specific mutation.  All descendants carry that mutation from that time forward, so they are members of that new haplogroup.

For example, here we see the migration path of haplogroup A, after being born in the Middle East, spreading across Eurasia into the Americas, courtesy of Family Tree DNA.

Hap A map crop

This pie chart indicates the frequency level at which haplogroup A is found in the Americas as compared to haplogroups B, C, D and X.

Hap A distribution

However, not all of haplogroup A arrived in the Americas.  Some subgroups are found along the path in Asia, and some made their way into Europe.  There are currently 48 sub-haplogroups of haplogroup A defined, with most of them being found in Asia.  Every new haplogroup and sub-haplogroup is defined by a new mutation that occurs in that line.  I wrote about how this works recently in the article, Haplogroups and The Three Brothers.

In the Americas, Native American mitochondrial haplogroups are identified by being subgroups of haplogroup A, B, C, D and X, as shown in the chart below.

beringia map

In the paper, Beringian Standstill and Spread of Native American Founders, by Tamm et al (2007), haplogroup A2 was the only haplogroup A subgroup identified as being Native American.

As of that time, no other sub-haplogroups of A had been found in either confirmed Native American people or burials.

In June, 2013, I realized that a subgroup of mitochondrial haplogroup A4 might, indeed, be Native American.

The haplogroup A4 project was formed as a research project with Marie Rundquist as a co-administrator and we proceeded to recruit people to join who either were haplogroup A4 or a derivative at Family Tree DNA, or had tested at Ancestry.com and appeared to be haplogroup A4 based on a specific mutation at location 16249 in the HVR1 region.  As it turns out, location 16249 is a haplogroup defining marker for haplogroup A4a1.

There weren’t many of these Ancestry people – maybe 20 in total at that time.  Ancestry has since discontinued their mitochondrial and Y DNA testing and has destroyed the data base, so it’s a good thing I checked when I did.  That resource is gone today.

Family Tree DNA has always been extremely supportive of scientific studies, whether through traditional academic channels or via citizen science, and they were kind enough to subsidize our testing efforts by offering reduced prices for mitochondrial testing to project members.  I want to thank them for their support.

Other haplogroup administrators have also been supportive.  I contacted the haplogroup A administrator and she was kind enough to send e-mails to her project members who were qualified to join the A4 project.  Supportive collaboration is critically important.

I wrote an article about the possibility that A4 might be Native, and through that article, raised money to enable people to test at Family Tree DNA or upgrade to the full sequence test.  Full sequence testing is critical to obtaining a full haplogroup designation.  Many of these people were only, at that time, defined by HVR1 or HVR1+HVR2 testing as haplogroup A.  Haplogroup A is, indeed, a Native American haplogroup, but it’s also an Asian haplogroup and we see it in Europe from time to time as well.  The only way to tell the difference between these groups is through full sequence testing.  Haplogroup A was born in Asia, about 30,000 years ago and has many subgroups.

What Do We Know About Haplogroup A4?

Haplogroup A4 has been identified as a subgroup of the parent haplogroup A and is the parent haplogroup of A2.  In essence, haplogroup A gave birth (through a mutation) to subgroup A4 who gave birth through a mutation to subgroup A2.

To date, before this research, all confirmed Native American haplogroups were subgroups of haplogroup A2.

In the Kumar et al 2011 paper, Schematic representation of mtDNA phylogenetic tree of Native American haplogroups A2 and B2 and immediate Siberian-Asian sister clades (A2a, A2b, A4a, A4b and A4c), no A4 was reported in the Americas, although A4 is clearly shown as the parent haplogroup of A2, which is found in the Americas.

On the graph below, from the paper, you can see the color coded “tabs” to the right of the haplogroup A designations that indicate where this haplogroup is found.  As you can see, A4 and subgroups is found only in Siberia and Asia, not in the Americas, which is indicated by yellow.

Hap A and B genesis

Schematic representation of mtDNA phylogenetic tree of Native American haplogroups A2 and B2 and immediate Siberian-Asian sister clades (A2a, A2b, A4a, A4b and A4c). Coalescent age calculated in thousand years (ky) as per the slow mutation rate of Mishmar et al. [58] and as per calibrated mutation rate of Soares et al. [59] are indicated in blue and red color respectively. The founder age wherever calculated are italicized. The geographical locations of the samples are identified with colors. For more details see complete phylogenetic reconstruction in additional file 2 (panels A-B) and additional file 3. Kumar et al. BMC Evolutionary Biology 2011 11:293 doi:10.1186/1471-2148-11-293

I then checked both GenBank and www.mtdnacommunity.org for haplogroup A4 submissions.  Ian Logan’s checker program makes it easy to check submissions by haplogroup.

MtDNACommunity reflected one A4 submission from Mexico and from the United States, which does not necessarily mean that the United States submission is indigenous – simply that is where the submission originated.  The balance of the submissions are from either academic papers or from Asia.

During this process, I utilized PhyloTree, Build 15, shown below, as my reference tree.  Build 16 was introduced as of February 2014.  It renames the A4 haplogroups.  In order to avoid confusion, I am utilizing the Build 15 nomenclature.  These are the haplogroup names currently in use by the vendors and utilized in academic papers.

Hap A tree

I am also utilizing the CRS version, not the RSRS version of mutations.  Again, these are the mutations referenced by academic papers and the version generally used among genealogists.

Family Tree DNA provides an easy reference chart of which mutations are haplogroup defining.  For haplogroup A4, we find the following progression.

A4 T16362C
A4a G1442A
A4a1 G9713A, T16249C
A4a1a T4928C

This means that everyone who falls in haplogroup A4 carries this specific mutation at location 16362.  The original value at that location was a T and in haplogroup A4, that T has mutated to a C.  This defines haplogroup A4.  So, if you don’t have this mutation, you definitely aren’t in haplogroup A4 (unless you’ve had a back mutation, a very rare occurrence).

This is actually a wonderful turn of events, because it means that the defining mutation for A4 is in the HVR1 region, which further means that regardless of how the haplogroup A individual is classified, I can tell with a quick glance if they are A4 or not.

In addition, subgroups are defined by other mutations as well, shown above.  For example, haplogroup A4a carries the A4 mutation of T16362C plus the additional mutation of G1442A that defines subclade A4a.

Full sequence testing showed that there was actually quite a variety of subhaplogroups in the project participants.

What Did We Find?

In the haplogroup A4 project, we now have 55 participants who fell into 11 different haplogroups when full sequence tested.

A4 project distribution crop

I have removed all haplogroup A2 individuals from further discussion, as we already know A2 is Native.  We have established a haplogroup A2 project for them, as well.

A4b

We found two haplogroup A4b individuals.  The most distant known ancestor of one is found in Tennessee, but the most distant ancestor of the other is found in England.  These two individuals have 19 HVR1 matches, of which many are to other A4b individuals.  There is no evidence of Native American ancestry in this group.

A4-A200G

This unusual haplogroup name indicates that this is a subgroup of haplogroup A4, defined by a mutation at location 200 that has changed from A to G.  The new subgroup is waiting to be named.  So eventually A4-A200G will be replaced with something like A4z, just as an example.

This individual is from Asia, so this haplogroup is not Native.

A10

One individual, upon full sequence testing, was found to carry haplogroup A10, which is not a subgroup of A4.  This is quite interesting, because the most distant ancestor is Catherine Pillard, originally believed to be one of the “Kings Daughters,” meaning French.  This article explains the situation and the question at hand.

All five of her full sequence matches are either to other descendants of Catherine Pillard, or designated as French Canadian.

One of this woman’s ten HVR2 matches shows her ancestor, Annenghton Annenghto, as born at the Ossosane Mission, Huronia, La Rochelle, Ontario, Canada and died in 1657 in Canada.  If this is correct and can be confirmed, haplogroup A10 could be Native, not French.  Her daughter, Marie Catherine Platt, has a baptismal record dated March 30, 1651, was also born at the mission, and is believed to be Huron.

This article more fully explains the research and documents relevant to Catherine Pillard’s ancestry.

Based on these several articles, it seems that an assumption had originally been made that because the individual fell into haplogroup A, and haplogroup A was Asian and Native, that this individual would be Native as well.

This determination was made in 2007, based on only the HVR1 and HVR2 regions of the mitochondrial DNA, and on the fact that the DNA results fell within haplogroup A, as documented here.  The HVR1 and HVR2 regions do not include the haplogroup defining mutations for haplogroup A10, so until full sequence testing became available, this sequence could not be defined as A10.  The conclusion that haplogroup A equated to Native American was not a scientific certainty, only one of multiple possibilities, and may have been premature.

I contacted several French-Canadian scholars regarding the documents for Catherine Pillard and there is no consensus as to whether she was Native or European, based on the available documentation.  In fact, there are two very distinct and very different opinions.  There is also a possibility that there are two women whose records are confused or intermixed.

So it seems that both Catherine Pillard’s DNA and supporting documents are ambiguous at this point in time.

One of the ways we determine mitochondrial ethnicity in situations like this is “guilt by genetic association,” to quote Bennett Greenspan.  In other words, if you have exactly the same DNA and mutations as several other people, and they and their ancestors are proven to live in Scotland, or Paris, or Greece, you’re not Native American.  This works the other way too, as we’ll see in Kit 11 of the haplogroup A4 outliers group.
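As a rough sketch of the idea, the tally below counts the documented ancestral locations among a kit's exact matches.  The function name `origin_tally` and the sample list are hypothetical and are not project data.  A lopsided tally is only a hint, since it is the proven genealogy behind each match that carries the weight.

```python
from collections import Counter


def origin_tally(match_origins):
    """Count documented ancestral locations among exact matches.

    Entries of None (no documented origin) are ignored.
    """
    return Counter(origin for origin in match_origins if origin is not None)


# Hypothetical exact full sequence matches and the documented origin of each
matches = ["Scotland", "Scotland", None, "Scotland", "France"]
tally = origin_tally(matches)
print(tally.most_common())
```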

Looking at other resources, MtDNA Community shows two references to A10, one submitted from Family Tree DNA and one from the below referenced article.

Haplogroup A10 has one reference in Mitogenomic Diversity in Tatars from the Volga-Ural Region of Russia by Malyarchuk et al. (2010, Molecular Biology and Evolution) but has since been reassigned as haplogroup A8, as follows:

However, some of the singular haplotypes appear to be informative for further development of mtDNA classification. Sample 23_Tm could be assigned to A10 according to nomenclature suggested by van Oven and Kayser (2009). However, phylogenetic analysis of complete mtDNAs (fig. 1) reveals that this sample belongs to haplogroup A8, which is defined now by transition at np 64 and consists of two related groups of lineages—A8a, with control region motif 146-16242 (previously defined as A8 by Derenko et al. [2007]), and A8b, with motif 16227C-16230 (supplementary table S3, Supplementary Material online). Analysis of HVS I and II sequences in populations indicates that transition at np 64 appears to be a reliable marker of haplogroup A8 (supplementary table S3, Supplementary Material online). The only exception, the probable back mutations at nps 64 and 146, has been described in Koryak haplotype EU482363 by Volodko et al. (2008). Therefore, parallel transitions at np 64 define not only Native American clusters of haplogroup A2, that is, its node A2c’d’e’f’g’h’i’j’k’n’p (Achilli et al. 2008; van Oven and Kayser 2009), but also northern Eurasian haplogroup A8. Both A8 and subhaplogroups are spread at relatively low frequencies in populations of central and western Siberia and in the Volga-Ural region. A8a is present even in Transylvania at frequency of 1.1% among Romanians, thus indicating that the presence of such mtDNA lineages in Europe may be mostly a consequence of medieval migrations of nomadic tribes from Siberia and the Volga-Ural region to Central Europe (Malyarchuk et al. 2006; Malyarchuk, Derenko, et al. 2008).

On Phylotree build 15, A10 is defined as T5393C, C7468T, C9948A, C10094T, A16227c, T16311C! and the submissions are noted as the Malyarchuk 2010b paper, which labels it “A8b,” and a Family Tree DNA submission.

At this point, haplogroup A10 is indeterminate and could be either Native or European.  We won’t know until we have confirmed test results combined with confirmed genealogy or location for another A10 individual.

A4

Haplogroup A4 itself is not the haplogroup I originally suspected was Native.  When this project first began, we had few A4s, and I suspected that they would become A4a1 when full sequence tested.  I expected A4a1 would be Native American.

Subsequent testing has shown that haplogroup A4 very clearly falls into major subgroups, as defined by different mutations.

A4 European

The European A4 group is comprised of three participants.  Of those three, two are matches to each other and the third is quite distant with no matches.  I suspect that we are dealing with two different European sub-haplogroups of A4.

Two project participants, one from Romania and one from Poland, match each other, and both match one additional individual from Hungary who is not a project member.  This group is eastern European.

The Romanian and Polish kits that match each other both carry mutations at locations 16182C, 16183C, 16189C, 150T, 204C, 3213G, 3801C and 14025C.  The third person that they match, who is not a project member, from Hungary, matches one of those kits exactly, so that gives us three kits carrying this same series of mutations.  These mutations do not match any other individuals carrying haplogroup A4.  This group appears to be Jewish, as all three of the participants are of the Jewish faith.

This leaves the third project participant from Poland who does not have any matches today, within or outside of the project.  This participant is clearly a different subclade of A4.  They match none of the defining markers of the group above. They do have unique mutations at locations not found in other A4 participants within the project.

This provides us with the following European haplogroup A4 results:

  • Eastern European – Jewish – 2 participants plus one exact full sequence match outside of project
  • Eastern European – does not match group above, has no matches today, five unique mutations including 4 in the coding region

A4 Chinese

This A4 participant is from China.

This sequence is actually very interesting because of its relative age.  This individual has 109 matches at the HVR1 level, all of which are, of course, exact matches.  The matches come from varied locations and include people with Spanish surnames and participants from Michigan, Mexico and Asia, with haplogroup designations of A, A4 and A4-A200G.

At first this appears confusing, until you realize two things.  First, the participant doesn’t continue those matches at the HVR2 level and second, this means that all of those people still carry the Haplogroup “A4 signature” HVR1 mitochondrial DNA, exactly.

This means that those matches stretch back in time thousands of years, until before the divergence of Native Americans and Asians, so at least 12,000 years, if not longer.  People who have incurred mutations in the HVR1 region don’t match, but those who have not still match each other, reaching back to their common Asian ancestor many millennia ago.  Today, there are only 109 such people in the Family Tree DNA database.

This individual has developed two mutations in the HVR2 region at locations 156G and 159G.  The participant also does not carry the haplogroup A defining mutation at location 263G which means either that 263G actually defines a subgroup, or this participant has had a back mutation to the original state at this location.  This individual did not test at the full sequence level.

A4 Americas

This leaves a total of 14 haplogroup A4 individuals within the project.

In order to show a comparison, I have removed all private mutations where none of this group matches each other.  I have also removed the haplogroup defining mutations as well as 16519C and all insertions and deletions since those areas are considered to be unstable.  In other words, what I’m looking for are groups of mutations where this group matches each other and no one else.  These are very likely sub-haplogroup defining mutations.

In addition to all private mutations, deleted columns include: 16223, 16332, 16290, 16319, 16362, 16519, 73, 152, 235, 263, 309.1, 309.2, 315.1, 522, 523, 663, 750, 1438, 1736, 2706, 4248, 4769, 4824, 7028, 8794, 8860, 11719, 12705, 14766, 15326.
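The filtering described above can be sketched in a few lines of Python.  This is only an illustration under stated assumptions, not the procedure actually used for the chart: the function name `shared_mutations` and the three toy kits are hypothetical, and each kit is reduced to the set of locations where it differs from the reference.  The excluded columns are the ones listed above.

```python
from collections import defaultdict

# Columns removed in the text: haplogroup defining locations, 16519, and indels
EXCLUDED = {
    "16223", "16332", "16290", "16319", "16362", "16519", "73", "152", "235",
    "263", "309.1", "309.2", "315.1", "522", "523", "663", "750", "1438",
    "1736", "2706", "4248", "4769", "4824", "7028", "8794", "8860", "11719",
    "12705", "14766", "15326",
}


def shared_mutations(kits):
    """Map each non-excluded location to the kits carrying it.

    Locations found in only one kit (private mutations) are dropped.
    """
    carriers = defaultdict(set)
    for kit, positions in kits.items():
        for pos in positions - EXCLUDED:
            carriers[pos].add(kit)
    return {pos: ks for pos, ks in carriers.items() if len(ks) >= 2}


# Hypothetical toy data, not the project's actual results
kits = {
    "Kit A": {"16223", "16290", "16111", "146", "16209"},
    "Kit B": {"16223", "16290", "16111", "146", "16209", "4000"},
    "Kit C": {"16223", "16290", "16111", "9999"},
}
for pos, ks in sorted(shared_mutations(kits).items()):
    print(pos, sorted(ks))
```

What survives the filter is the set of candidate sub-haplogroup defining mutations: locations shared by two or more kits that are not already explained by the haplogroup itself.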

I then rearranged the remaining columns and color coded groups.

A4 mutations

Note: na means not available, indicating that the participant did not test at that level.  An x in the cell indicates that the mutation indicated in that column was present.

The purple and apricot groupings show different clusters of matches.  The light purple is the largest group, and within that group, we find both a dark purple group and an apricot group.  However, not everyone fits within the groups.

A4 – Virginia

The first thing that is immediately evident is that the first kit, Kit 1, is not a member of this purple grouping.  This person has three full sequence matches outside of the project, one whose ancestor was born in Texas.  This individual has three unique full sequence mutations.  This grouping may be Native, but lacks proof.

Additional genealogical research might establish a confirmed Native American connection. If Kit 1 is Native, this line diverged from this larger A4 group long ago, before any of these purple or apricot mutations developed.

This participant’s ancestor traces to Virginia.  Regardless of whether this haplotype is Native or not, it is most likely a sub-haplogroup of A4.

A4 – Colombia

The next least likely match is Kit 2.  This individual shares two of the common HVR2 markers, 146 and 153, but did not test at the full sequence level.  Given what I’m seeing here, I suspect that 146 might be a sub-haplogroup defining mutation for this light purple group.  In addition, 8027 and 12007 might be as well.  That includes everyone (who has tested at the relevant levels) except for Kit 1 and Kit 11.

Haplogroup A4 from Colombia is most likely Native.  Few people in the public databases are from Colombia.  One would expect several mutations to have occurred as groups migrated.  At the HVR1 level, this individual has 18 matches, most of which have Spanish surnames.  This participant has no HVR2 matches.

A4 – California Group

The next group is the apricot group which I’ve nicknamed the California group.  Both of these participants, Kit 3 and Kit 4, find their ancestors in either southern California or Baja California, into Mexico.  Finding these haplogroups among the Mexican, Central and South American populations is an indicator of Native heritage, as between 85% and 90% of Mexicans carry Native American matrilineal lineage.

These participants also match a third individual who is not a project member whose ancestor is also found in Baja California.  This group’s defining mutations are likely 16209C, 5054T, 7604A, 7861C and 12513G.  Fortunately, these will be relatively easy to discern due to the HVR1 mutation at 16209.

A4 – Puerto Rico Group

The dark purple group, Kits 5-9, is the Puerto Rican group even though it includes one kit from Mexico and one from Cuba.  The Mexican kit, Kit 5, in teal, is only a partial match.  Kits 6-9 match each other plus several additional people not in the project whose most distant ancestors are found in Puerto Rico as well.  This group has several defining markers including 16083T, 16256T, 214G, 2836T, 6632C and possibly 16126C, although Kit 5 carries 16126C while Kit 9 does not.

The Puerto Rico DNA project has another 18 individuals classified as haplogroup A or A4 and they all carry 16083T, 16256T and those who have taken the HVR2 test (10) carry 214G as well.  Only one carries 16126C, so that would not be a defining mutation for this major group, but could be for a subgroup of the Puerto Rico group.
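Because not everyone has tested at every level, a kit can only be checked against the part of the signature that its test covers.  The sketch below shows one way to express that, using the markers named above.  The grouping of markers into HVR1, HVR2 and coding regions is an assumption based on their locations, and `check_signature` and the sample kit are hypothetical.

```python
# Puerto Rico group markers from the text, grouped by region (an assumption)
SIGNATURE = {
    "HVR1": {"16083T", "16256T"},
    "HVR2": {"214G"},
    "coding": {"2836T", "6632C"},
}


def check_signature(results):
    """Compare a kit's results with the signature, region by region.

    `results` maps a region name to the set of mutations found there.
    A region absent from the dict was not tested ("na" in the chart).
    Returns (missing markers in tested regions, list of untested regions).
    """
    missing = set()
    untested = []
    for region, markers in SIGNATURE.items():
        if region not in results:
            untested.append(region)
        else:
            missing |= markers - results[region]
    return missing, untested


# Hypothetical kit that tested HVR1 and HVR2 but not the full sequence
kit = {"HVR1": {"16083T", "16256T", "16126C"}, "HVR2": {"214G"}}
print(check_signature(kit))
```

A kit with no missing markers but an untested region is consistent with the group so far, which is different from a confirmed match.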

Given the history of Puerto Rico, this is probably a signature of the Taino or Carib people.

In 2003, 27 Taino DNA sequences were obtained from pre-Columbian remains and reported in this paper by Lalueza-Fox et al.  This was very early in DNA processing, especially of remains, and they were found to carry only haplogroups C and D.  These remains were not from the islands, but were from the La Caleta site in the Dominican Republic.

The Taino today are considered to be culturally extinct due to disease, enslavement and harsh treatment by the Spanish, but they maintained their presence into the 20th century and were a significant factor in the population of the West Indies, including Puerto Rico.  Their descendants would be expected to be found within the population today.  The Taino were the primary tribe found on Puerto Rico and were an Arawak indigenous people who arrived from South America.  The Taino were in conflict with the Caribs from the southern Lesser Antilles.

Carib women were sometimes taken as captives by the Taino.  The Caribs originated in South America near the Orinoco River and settled on the islands around 1200AD, after the Taino were already settled in the region.

It’s therefore possible that haplogroup A4 is a Carib signature.  In 2001, Martinez-Cruzado et al published a paper titled Mitochondrial DNA analysis reveals substantial Native American ancestry in Puerto Rico in which they found that haplogroup A was absent in the Taino by testing the Yanomama, whose territory was close to the Taino.  If this is the case, then haplogroup A must have arisen in and admixed from another native culture, or, conversely, the Yanomama tested were an incomplete sampling or simply not adequately representative as a proxy for the Taino.  However, if haplogroup A4 is not found in the Taino, and assuming that the Martinez-Cruzado paper’s conclusions are accurate, the most likely candidate would be the Caribs.  Other candidates are the even older Ortoiroid or Saladoid cultures, or the Arawak tribe, who are believed to have assimilated with, or to have been another name for, the Taino.

A4 – Mexican/Puerto Rican Mutation 16126 Group

This group, Kits 5-8, is defined by mutation 16126C.  It’s quite interesting, because it includes Kit 5 that does not match the rest of the Puerto Rican markers.  Only some Puerto Rican samples carry 16126C.  Kits 5-8 in the A4 project do carry this mutation, but 18 of the haplogroup A kits in the Puerto Rican project which do carry the dark purple signature mutations do not carry this mutation.  This may be a later mutation that arose in a population of which some members settled on Puerto Rico and some remained on the mainland.  The most distant ancestor of Kit 5 is from Tangancícuaro de Arista, Michoacan de Ocampo, shown below.

Tangancícuaro de Arista, Michoacan de Ocampo

Kit 5 has five full sequence matches, all of which carry Spanish surnames.

A4 Outliers

This leaves only kits 10-14.  These kits don’t match each other but do fall, at least on some markers, within the light purple group.

Kit 12 is from Costa Rica and has no matches at the HVR1 level because of a mutation at location 16086C, but has not tested at the HVR2 or full sequence levels.   They might fit into a group easily with additional testing.

Kit 13 is from Mexico and has only two HVR1 matches who have not tested at a higher level.  This kit, like Kit 5, does not carry mutation 16111T which could indicate an early split from the main group or a back mutation.

Kit 10 is from Mexico, has 17 HVR1 matches, some of which indicate that their ancestors are from Texas and Mexico.  Kit 10 has no HVR2 or full sequence matches.

Kit 11 is from Honduras and interestingly, has 158 HVR1 matches to a wide variety of people including those from Costa Rica, Mexico, South Carolina, Oklahoma, a descendant of a Crow Tribal member, North Dakota, Guatemala, the Cree/Chippewa, a descendant of an Arikara and one person who indicated their oldest ancestor is from Aragon, in Spain.  This means that all of these people carry the light purple group defining 16111T mutation.

Kit 14 is from Honduras and has only two matches at the HVR1 level, one of which is from El Salvador.  Both of the matches have only tested to the HVR1 level.  Kit 14 does carry the 16111T mutation as well as most of the other light purple mutations, but is missing mutation 164C which is present in the entire rest of the light purple group.  This could signify a back mutation.  In addition, Kit 14 matches on marker 16189T with Kit 6 from Puerto Rico and on 16311C with Kit 1 from Virginia, but with no other participants on these markers.

These people and their matches and mutations could well represent additional subgroups of haplogroup A4.

A4a1

This leaves us with the A4a1 subgroup, which is where I started 18 months ago.

The haplogroup A4a1 group is very interesting, albeit not for the reasons I initially anticipated.  Again, the same columns were deleted as noted in A4, above, leaving only columns (mutations) unique to this group.  As with the other subgroups, these are likely sub-haplogroup defining mutations.

A4a1 mutations

Note:  na means not available, indicating that the participant did not test at that level

A4a1 Mexico

Kit 15, the pink individual, did not take the HVR2 or full sequence test, but does not match any other participants at the HVR1 level.  This person’s maternal line is from Mexico.  Kit 15 could be Native and with additional testing could be a different subclade.

A4a1 European Group

The three yellow rows are positively confirmed from Europe.  Kits 1 and 2 do not match each other nor any other participants.

Kit 3 however, matches Kits 4-14.

Kits 3-14 all match each other at the HVR1 level.  One individual has not taken the HVR2 test and one has not taken the full sequence test, but otherwise, they also all match at the HVR2 and full sequence level.  Note that Kit 3 is also in the confirmed European group based on two sets of census documentation.

Within the group of participants comprising kits 3-14, several have oral history and some have circumstantial evidence suggesting Native ancestry, but not one has any documented proof, either in terms of their own ancestors being proven Native, their ancestor’s family members being proven Native, or the people they match being proven as Native.

Kit 3 states that their ancestor was born in England in 1838.  I verified that the 1880 census for New York City confirms that birth location of their ancestor.  The daughter’s mother’s birthplace is also noted to be England in the 1900 census.

Therefore, based on the fact that Kit 3 is proven to be English, according to the census, and this kit matches the rest of the group, Kits 4-14, at the HVR1, HVR2 and full sequence levels, it is very unlikely that this group is Native.

Kit 15, who does not match this group, but who has not tested above the HVR1 level, is the only likely exception and may be Native.  Full sequence testing would likely suggest a different or expanded subgroup of haplogroup A4a1.

Further documentation could add substantially to this information, but at this point, none has been forthcoming.

In Summary – The Layers of Haplogroup A4

Full sequence testing was absolutely essential in sorting through the various participant results.  As demonstrated, the full sequence results were not always what was expected.

When full sequence tested, one participant was determined to be Haplogroup A10, which is not a subgroup of A4.  Haplogroup A10 is indeterminate and could be Native but could also be European.  Additional A10 results will hopefully be forthcoming in the future which will resolve this question.

None of the haplogroup A4a1 participants provide any direct evidence of Native ancestry, with the possible exception of one A4a1 kit whose matrilineal ancestors are from Mexico and who has not tested at a higher level.  Three A4a1 participants have confirmed European ancestry and one of those participants matches most of the others.  A4a1, with possibly one exception, appears to be European.  The A4a1 participant whose ancestors are from Mexico does not match any of the other participants and could eventually be classified as a subhaplogroup.

Haplogroup A4 itself appears to be divided into multiple subgroups, several of which may eventually form new sub-haplogroups based on their clusters of mutations.

There is clearly a European and a Chinese A4 grouping.  The European group is broken into two subgroups, one of which is Jewish.

In the Americas, there are several A4 subgroups, including:

  • Virginia – indeterminate whether Native
  • Colombia – likely Native
  • California – likely Native
  • Puerto Rico (2 groups) – very likely Native

There are also 5 outliers who don’t match others within the group, hailing from:

  • Costa Rica – likely Native
  • Mexico (2) – likely Native
  • Honduras – matching several confirmed Native people in multiple tribes at the HVR1 level
  • Honduras – likely Native

A4 grid v2

Note: Undet, short for undetermined, means that the results could be Native or European but available evidence has not been able to differentiate between those alternatives today.

*A4 needs to be further divided into additional haplogroup subgroups.

Dedication

Obviously, a study of this complexity couldn’t be done without the many resources I’ve mentioned and probably some that I’ve forgotten.  I thank everyone who contributed and continues to contribute.  I also want to thank the people who contributed to the funding for participant testing.  We could not have done this without your contributions in combination with the discounts offered by Family Tree DNA.

However, the most important resource is the participants and their willingness to share – their DNA, their research and their family stories.  During this project, two of our participants have passed away.  I would like to take this opportunity to dedicate this research to them, and I hope they know that their DNA keeps on giving.  This is their legacy.

Acknowledgements

I would like to thank Ian Logan for his assistance with haplogroup designation, Family Tree DNA for testing support and discounts, my project co-administrator, Marie Rundquist, Bennett Greenspan, Dr. Michelle Fiedler and Dr. David Pike for paper review.