Concepts – Calculating Ethnicity Percentages

There has been a lot of discussion about ethnicity percentages within the genetic genealogy community recently, probably because of the number of people who have recently purchased DNA tests to discover “who they are.”

Testers want to know specifically if ethnicity percentages are right or wrong, and what those percentages should be. The next question, of course, is which vendor is the most accurate.

Up front, let me say that “your mileage may vary.” The vendor that is the most accurate for my German ancestry may not be the same vendor that is the most accurate for the British Isles or Native American. The vendor that is the most accurate overall for me may not be the most accurate for you. And the vendor that is the most accurate for me today may no longer be the most accurate when another vendor upgrades their software tomorrow. There is no universal “most accurate.”

But then again, how does one judge “most accurate?” Is it just a feeling, or based on your preconceived idea of your ethnicity? Is it based on the results of one particular ethnicity, or something else?

As a genealogist, you have a very powerful tool to use to figure out the percentages that your ethnicity SHOULD BE. You don’t have to rely totally on any vendor. What is that tool? Your genealogy research!

I’d like to walk you through the process of determining what your own ethnicity percentages should be, or at least should be close to, barring any surprises.

By surprises, in this case, we’re assuming that all 64 of your GGGG-grandparents really ARE your GGGG-grandparents, or at least haven’t been proven otherwise. Even if one or two aren’t, that really only affects your results by 1.56% each. In the greater scheme of things, that’s trivial unless it’s that minority ancestor you’re desperately seeking.

A Little Math

First, let’s do a little very basic math. I promise, just a little. And it really is easy. In fact, I’ll just do it for you!

You have 64 great-great-great-great-grandparents.

Generation   # You Have   Who                                   Approximate Percentage of Their DNA That You Have Today
             1            You                                   100%
1            2            Parents                               50%
2            4            Grandparents                          25%
3            8            Great-grandparents                    12.5%
4            16           Great-great-grandparents              6.25%
5            32           Great-great-great-grandparents        3.12%
6            64           Great-great-great-great-grandparents  1.56%

Each of those GGGG-grandparents contributed 1.56% of your DNA, roughly.

Why 1.56%?

Because 100% of your DNA divided by 64 GGGG-grandparents equals 1.56% from each of those GGGG-grandparents. That means you have roughly 1.56% of your DNA from each of those GGGG-grandparents running in your veins.
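For readers who would rather let a computer do the dividing, here is a minimal sketch of the same arithmetic. The function names are mine, made up for illustration. It simply doubles the number of ancestors each generation and divides 100% by that count, which gives the average share only.

```python
def ancestors_in_generation(generation: int) -> int:
    """Number of ancestors in a generation, where 1 = parents, 6 = GGGG-grandparents."""
    return 2 ** generation


def average_share_percent(generation: int) -> float:
    """Average percentage of your DNA contributed by ONE ancestor in that generation."""
    return 100.0 / ancestors_in_generation(generation)


# Print the same rows as the table above, generations 1 through 6.
for generation in range(1, 7):
    count = ancestors_in_generation(generation)
    share = average_share_percent(generation)
    print(f"Generation {generation}: {count} ancestors, about {share:.2f}% each")
```

Keep in mind that these are averages, not promises about what you actually inherited from any one ancestor.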

OK, but why “roughly?”

We all know that we inherit 50% of each of our parents’ DNA.

So that means we receive half of the DNA of each ancestor that each parent received, right?

Well, um…no, not exactly.

Ancestral DNA isn’t divided exactly in half, by the “one for you and one for me” methodology. In fact, DNA is inherited in chunks, and often you receive all of a chunk of DNA from that parent, or none of it. Seldom do you receive exactly half of a chunk, or ancestral segment – but half is the AVERAGE.

Because we can’t tell exactly how much of any ancestor’s DNA we actually do receive, we have to use the average number, knowing full well we could have more than our 1.56% allocation of that particular ancestor’s DNA, or none that is discernible at current testing thresholds.

Furthermore, if that 1.56% is our elusive Native ancestor, but current technology can’t identify that ancestor’s DNA as Native, then our Native heritage melds into another category. That ancestor is still there, but we just can’t “see” them today.

So, the best we can do is to use the 1.56% number and know that it’s close. In other words, you’re not going to find that you carry 25% of a particular ancestor’s DNA that you’re supposed to carry 1.56% for. But you might have 3%, half of a percent, or none.
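To get a feel for why the number is only “roughly” 1.56%, here is a toy simulation. It is a deliberately crude sketch, not a model of real recombination. It assumes your DNA consists of a fixed number of chunks (200 is an arbitrary number I picked for illustration) and that each chunk traces back to one of your 64 GGGG-grandparents purely at random. Real inheritance is more complicated, so treat the output as an illustration of the idea only.

```python
import random


def simulate_shares(num_segments: int = 200, num_ancestors: int = 64, seed: int = 1) -> list[float]:
    """Toy model: assign each chunk of DNA to one randomly chosen ancestor.

    num_segments is an arbitrary assumption, not a biological constant.
    Returns the percentage of chunks traced to each ancestor.
    """
    rng = random.Random(seed)
    counts = [0] * num_ancestors
    for _ in range(num_segments):
        counts[rng.randrange(num_ancestors)] += 1
    return [100.0 * count / num_segments for count in counts]


shares = simulate_shares()
print(f"Average share:  {sum(shares) / len(shares):.2f}%")
print(f"Smallest share: {min(shares):.2f}%")
print(f"Largest share:  {max(shares):.2f}%")
```

The average always works out to 1.56%, because the shares have to add up to 100%, but the individual ancestors land above and below that figure. Change the seed or the number of chunks and the spread changes while the average does not.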

Your Pedigree Chart

To calculate your expected ethnicity percentages, you’ll want to work with a pedigree chart showing your 64 GGGG-grandparents. If you haven’t identified all 64 of your GGGG-grandparents – that’s alright – we can accommodate that. Work with what you do have – but accuracy about the ancestors you have identified is important.

I use RootsMagic, and in the RootsMagic software, I can display all 64 GGGG-grandparents by selecting all 4 of my grandparents one at a time.

In the first screen, below, my paternal grandfather is blue and my 16 GGGG-grandparents that are his ancestors are showing to the far right.

ethnicity-pedigree

Next, my paternal grandmother.

ethnicity-pedigree-1

Next, my maternal grandmother.

ethnicity-pedigree-2

And finally, my maternal grandfather.

ethnicity-pedigre-3

These displays are what you will work from to create your ethnicity table or chart.

Your Ethnicity Table

I simply displayed each of these 16 GGGG-grandparents and completed the following grid. I used a spreadsheet, but you can use a table or simply do this on a tablet of paper. Technology not required.

You’ll want 5 columns, as shown below.

  • Number 1-64, to make sure you don’t omit anyone
  • Name
  • Birth Location
  • 1.56% Source – meaning where in the world did the 1.56% of the DNA you received from them come from? This may not be the same as their birth location. For example, an Irish man born in Virginia counts as an Irish man.
  • Ancestry – meaning if you don’t know positively where that ancestor is from, what do you know about them? For example, you might know that their father was German, but be uncertain about the mother’s nationality.

My ethnicity table is shown below.

ethnicity-table

In some cases, I had to make decisions.

For example, I know that Daniel Miller’s father was a German immigrant, documented and proven. The family did not speak English. They were Brethren, a German religious sect whose members intermarried with other Brethren. Marriage outside the church meant dismissal, so the children of such a marriage would not have been Brethren. Therefore, based on both the language barrier and the Brethren religious customs, it would be extremely unlikely for Daniel’s mother, Magdalena, to be anything other than German. Plus, their children were Brethren.

We know that most people married people within their own group – partly because that is who they were exposed to, but also based on cultural norms and pressures. When it comes to immigrants and language, you married someone you could communicate with.

Filling in blanks another way, a local German man was likely the father of Eva Barbara Haering’s illegitimate child, born to Eva Barbara in her home village in Germany.

Obviously, there were exceptions, but they were just that, the exception. You’ll have to evaluate each of your 64 GGGG-grandparents individually.

Calculating Percentages

Next, we’re going to group locations together.

For example, I had a total of one, plus, that was British Isles, three and a half, plus, that were Scottish, and nine and a half that were Dutch.

ethnicity-summary

You can’t do anything with the “plus” designation, but you can multiply by everything else.

So, for Scottish, 3 and a half (3.5) times 1.56% equals 5.46% total Scottish DNA. Follow this same procedure for every category you’re showing.

Do the same for “uncertain.”
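If you keep your ethnicity table in a spreadsheet anyway, the multiplication is easy to automate. The sketch below is one way to do it, using only the three category counts I mentioned above. The dictionary and function names are made up for illustration, and the “plus” amounts are left out because they can’t be quantified. I use the rounded 1.56% figure so the results match the hand calculation.

```python
# Rounded per-ancestor figure used throughout this article (exact value is 100 / 64).
PERCENT_PER_ANCESTOR = 1.56

# Number of GGGG-grandparents per category, taken from the examples above.
# This is only a partial list, so these percentages will not add up to 100%.
ancestor_counts = {
    "British Isles": 1.0,
    "Scottish": 3.5,
    "Dutch": 9.5,
}


def expected_percentages(counts: dict[str, float]) -> dict[str, float]:
    """Multiply each category's ancestor count by the per-ancestor percentage."""
    return {
        region: round(count * PERCENT_PER_ANCESTOR, 2)
        for region, count in counts.items()
    }


for region, percent in expected_percentages(ancestor_counts).items():
    print(f"{region}: {percent}%")
```

Add a row for every category in your own table, including “uncertain,” and the counts should total 64.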

Incorporating History

In my case, because all of my uncertain lines are on my father’s colonial side, and I do know locations and something about their spouses and/or the population found in the areas where each ancestor is located, I am making an “educated speculation” that these individuals are from the British Isles. These families didn’t speak German, or French, or have French or German, Dutch or Scandinavian surnames. People married others like themselves, in their communities and churches.

I want to be very clear about this. It’s not a SWAG (serious wild-a** guess), it’s educated speculation based on the history I do know.

I would suggest that there is a difference between “uncertain” and “unknown origin.” Unknown origin connotes that there is some evidence that the individual is NOT from the same background as their spouse, or they are from a highly mixed region, but we don’t know.

In my case, this leaves a total of 2 and a half that are of unknown origin, based on the other “half” that isn’t known of some lineages. For example, I know there are other Native lines and at least one African line, but I don’t know what percentage of which ancestor how far back. I can’t pinpoint the exact generation in which that lineage was “full” and not admixed.

I have multiple Native lines on my mother’s side in the Acadian population, but they are further back than 6 generations and the population is endogamous – so those ancestors sometimes appear more than once and in multiple Acadian lines – meaning I probably carry more of their DNA than I otherwise would. These situations are difficult to calculate mathematically, so just keep them in mind.

Given the circumstances based on what I do know, the 3.9% unknown origin is probably about right, and in this case, the unknown origin is likely at least part Native and/or African and probably some of each.

ethnicity-summary-2

The Testing Companies

It’s very difficult to compare apples to apples between testing companies, because they display and calculate ethnicity categories differently.

For example, Family Tree DNA’s regions are fairly succinct, with some overlap between regions, shown below.

ethnicity-ftdna-map

Some of Ancestry’s regions overlap by almost 100%, meaning that any area in a region could actually be a part of another region.

ethnicity-ancestry-map-2

For example, look at the United Kingdom and Ireland. The United Kingdom region overlaps significantly into Europe.

ethnicity-ancestry-map

Here’s the Great Britain region close up, below, which is shown differently from the map above. The Great Britain region actually overlaps almost the entire western half of Europe.

ethnicity-ancestry-great-britain

That’s called hedging your bets, or maybe it’s simply the nature of ethnicity. Granted, the overlaps are a methodology for the vendor not to be “wrong,” but people and populations did and do migrate, and the British Isles was somewhat of a destination location.

This Germanic Tribes map, also from Ancestry’s Great Britain section, illustrates why ethnicity calculations are so difficult, especially in Europe and the British Isles.

ethnicity-invaders

Invaders and migrating groups brought their DNA.  Even if the invaders eventually left, their DNA often became resident in the host population.

The 23andMe map, below, is less detailed in terms of viewing how regions overlap.

ethnicity-23andme-map

The Genographic Project breaks ethnicity down into 9 world regions which they indicate reflect both recent influences and ancient genetics dating from 500 to 10,000 years ago. I fall into 3 regions, shown by the shadowy circles on the map, below.

ethnicity-geno-map-2

The following explanation is provided by the Genographic Project for how they calculate and explain the various regions, based on early European history.

ethnicity-geno-regions

Let’s look at how the vendors divide ethnicity and see what kind of comparisons we can make utilizing the ethnicity table we created that represents our known genealogy.

Family Tree DNA

MyOrigins results at Family Tree DNA show my ethnicity as:

ethnicity-ftdna-percents

I’ve reworked my ethnicity totals format to accommodate the vendor regions, creating the Ethnicity Totals Table, below. The “Genealogy %” column is the expected percentage based on my genealogy calculations. I have kept the “British Isles Inferred” percentage separate since it is the most speculative.

ethnicity-ftdna-table

I grouped the regions so that we can obtain a somewhat apples-to-apples comparison between vendor results, although that is clearly challenging based on the different vendor interpretations of the various regions.

Note the Scandinavian, which could potentially be a Viking remnant, but there would have had to be a whole boatload of Vikings, pardon the pun, or Viking is deeply embedded in several population groups.

Ancestry

Ancestry reports my ethnicity as:

ethnicity-ancestry-amounts

Ancestry introduces Italy and Greece, which is news to me. However, if you remember, Ancestry’s Great Britain ethnicity circle reaches all the way down to include the top of Italy.

ethnicity-ancestry-table

Of all my expected genealogy regions, the most definitive are my Dutch, French and German. Many are recent immigrants from my mother’s side, removing any ambiguity about where they came from. There is very little speculation in this group, with the exception of one illegitimate German birth and two inferred German mothers.

23andMe

23andMe allows customers to change their ethnicity view along a range from speculative to conservative.

ethnicity-23andme-levels

Generally, genealogists utilize the speculative view, which provides the greatest regional variety and breakdown. The conservative view, in general, simply rolls the detail into larger regions and assigns a higher percentage to unknown.

I am showing the speculative view, below.

ethnicity-23andme-amounts

Adding the 23andMe column to my Ethnicity Totals Table, we show the following.

ethnicity-23andme-table-2

Genographic Project 2.0

I also tested through the Genographic Project. Their results are much more general in nature.

ethnicity-geno-amounts

The Genographic Project results do not fit well with the others in terms of categorization. In order to include the Genographic ethnicity numbers, I’ve had to add the totals for several of the other groups together, in the gray bands below.

ethnicity-geno-table-2

Genographic Project results are the least like the others, and the most difficult to quantify relative to expected amounts of genealogy. Genealogically, they are certainly the least useful, although genealogy is not and never has been the Genographic focus.

I initially omitted this test from this article, but decided to include it for general interest. These four tests clearly illustrate the wide spectrum of results that a consumer can expect to receive relative to ethnicity.

What’s the Point?

Are you looking at the range of my expected ethnicity versus my ethnicity estimates from these four entities and asking yourself, “what’s the point?”

That IS the point. These are all proprietary estimates for the same person – and look at the differences – especially compared to what we do know about my genealogy.

This exercise demonstrates how widely estimates can vary when compared against a relatively solid genealogy, especially on my mother’s side – and against other vendors. Not everyone has the benefit of having worked on their genealogy as long as I have. And no, in case you’re wondering, the genealogy is not wrong. Where there is doubt, I have reflected that in my expected ethnicity.

Here are the points I’d like to make about ethnicity estimates.

  • Ethnicity estimates are interesting and alluring.
  • Ethnicity estimates are highly entertaining.
  • Don’t marry them. They’re not dependable.
  • Create and utilize your ethnicity chart based on your known, proven genealogy which will provide a compass for unknown genealogy. For example, my German and Dutch lines are proven unquestionably, which means those percentages are firm and should match up relatively well to vendor ethnicity estimates for those regions.
  • Take all ethnicity estimates with a grain of salt.
  • Sometimes the shaker of salt.
  • Sometimes the entire lick of salt.
  • Ethnicity estimates make great cocktail party conversation.
  • If the results don’t make sense based on your known genealogical percentages, especially if your genealogy is well-researched and documented, understand the possibilities of why and when a healthy dose of skepticism is prudent. For example, if your DNA from a particular region exceeds the total of both of your parents for that region, something is amiss someplace – which is NOT to suggest that you are not your parents’ child.  If you’re not the child of one or both parents, assuming they have DNA tested, you won’t need ethnicity results to prove or even suggest that.
  • Ethnicity estimates are not facts beyond very high percentages, 25% and above. At that level, the ethnicity does exist, but the percentage may be in error.
  • Ethnicity estimates are generally accurate to the continent level, although not always at low levels. Note weasel word, “generally.”
  • We should all enjoy the results and utilize these estimates for their hints and clues. For example, if you are an adoptee and you are 25% African, it’s likely that one of your grandparents was African, or two of your grandparents were roughly half African, or all four of your grandparents were one-fourth African. Hints and clues, not gospel and not cast in concrete. Maybe cast in warm Jello.
  • Ethnicity estimates showing larger percentages probably hold a pearl of truth, but how big the pearl and the quality of the pearl is open for debate. The size and value of the pearl is directly related to the size of the percentage and the reference populations.
  • Unexpected results are perplexing. In the case of my unknown 8% to 12% Scandinavian – the Vikings may be to blame, or the reference populations, which are current populations, not historical populations – or some of each. My Scandinavian amounts translate into between 5 and 8 of my GGGG-grandparents being fully Scandinavian – and that’s extremely unlikely in the middle of Virginia in the 1700s.
  • There can be fairly large slices of completely unexplained ethnicity. For example, Scandinavia at 8-12% and even more perplexing, Italy and Greece. All I can say is that there must have been an awful lot of Vikings buried in the DNA of those other populations. But enough to aggregate, cumulatively, to between a great-grandparent at 12.5% and a great-great-grandparent at 6.25%? I’m not convinced. However, all three vendors found some Scandinavian – so something is afoot. Did they all use the same reference population data for Scandinavian? For the time being, the Scandinavian results remain a mystery.
  • There is no way to tell what is real and what is not. Meaning, do I really have some ancient Italian/Greek and more recent Scandinavian, or is this deep ancestry or a reference population issue? And can the lack of my proven Native and African ancestry be attributed to the same?
  • Proven ancestors beyond 6 generations, meaning Native lineages, disappear while undocumentable and tenuous ancestors beyond 6 generations appear – apparently, en masse. In my case, kind of like a naughty Scandinavian ancestral flash mob, taunting and tormenting me. Who are those people??? Are they real?
  • If the known/proven ethnicity percentages from Germany, Netherlands and France can be highly erroneous, what does that imply about the rest of the results? Especially within Europe? The accuracy issue is especially pronounced looking at the wide ranges of British Isles between vendors, versus my expected percentage, which is even higher, although the inferred British Isles could be partly erroneous – but not on this magnitude. Apparently part of my British Isles ancestry is being categorized as either or both Scandinavian or European.
  • Conversely, these estimates can and do miss positively genealogically proven minority ethnicity. By minority, I mean minority to the tester. In my case, African and Native that is proven in multiple lines – and not just by paper genealogy, but by Y and mtDNA haplogroups as well.
  • Vendors’ products and their estimates will change with time as this field matures and reference populations improve.
  • Some results may reflect the ancient history of the entire population, as indicated by the Genographic Project. In other words, if the entire German population is 30% Mediterranean, then your ancestors who descend from that population can be expected to be 30% Mediterranean too. Except I don’t show enough Mediterranean ancestry to be 30% of my German DNA, which would be about 8% – at least not as reported by any vendor other than the Genographic Project.
  • Not all vendors display below 1% where traces of minority admixture are sometimes found. If it’s hard to tell if 8-12% Scandinavian is real, it’s almost impossible to tell whether less than 1% of anything is real.  Having said that, I’d still like to see my trace amounts, especially at a continental level which tends to be more reliable, given that is where both my Native and African are found.
  • If the reason my Native and African ancestors aren’t showing is because their DNA was not passed on in subsequent generations, causing their DNA to effectively “wash out,” why didn’t that happen to Scandinavian?
  • Ethnicity estimates can never disprove that an ancestor a few generations back was or was not any particular ethnicity. (However, Y and mitochondrial DNA testing can.)
  • Absence of evidence is not evidence of absence, except in very recent generations – like 2 (grandparents at 25%), maybe 3 generations (great-grandparents at 12.5%).
  • Continental level estimates above 10-12 percent can probably be relied upon to suggest that the particular continental level ethnicity is present, but the percentage may not be accurate. Note the weasel wording here – “probably” – it’s here on purpose. Refer to Scandinavia, above – although that’s regional, not continental, but it’s a great example. My proven Native/African is nearly elusive and my mystery Scandinavian/Greek/Italian is present in far greater percentages than it should be, based upon proven genealogy.
  • Vendors, all vendors, struggle to separate ethnicity regions within continents, in particular, within Europe.
  • Don’t take your ethnicity results too seriously and don’t be trading in your lederhosen for kilts, or vice versa – especially not based on intra-continental results.
  • Don’t change your perception of who you are based on current ethnicity tests. Otherwise you’re going to feel like a chameleon if you test at multiple vendors.
  • Ethnicity estimates are not a short cut to or a replacement for discovering who you are based on sound genealogical research.
  • No vendor, NOT ANY VENDOR, can identify your Native American tribe. If they say or imply they can, RUN, with your money. Native DNA is more alike than different. Just because a vendor compares you to an individual from a particular tribe, and part of your DNA matches, does NOT mean your ancestors were members of or affiliated with that tribe. These three major vendors plus the Genographic Project don’t try to pull any of those shenanigans, but others do.
  • Genetic genealogy and specifically, ethnicity, is still a new field, a frontier.
  • Ethnicity estimates are not yet a mature technology as is aptly illustrated by the differences between vendors.
  • Ethnicity estimates are that. ESTIMATES.
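The Scandinavian point above relies on turning a vendor’s percentage back into an equivalent number of GGGG-grandparents. Here is a minimal sketch of that conversion, assuming the same average of 100% divided by 64 ancestors. The function name is mine, for illustration only.

```python
# Average share of one GGGG-grandparent, in percent.
PERCENT_PER_ANCESTOR = 100 / 64


def equivalent_ancestors(reported_percent: float) -> float:
    """How many fully-that-ethnicity GGGG-grandparents a reported percentage implies."""
    return reported_percent / PERCENT_PER_ANCESTOR


# The low and high ends of my reported Scandinavian range.
for reported in (8, 12):
    count = equivalent_ancestors(reported)
    print(f"{reported}% is roughly {count:.1f} GGGG-grandparents")
```

Running a surprising percentage through this conversion is a quick sanity check. If the answer is more ancestors than your documented genealogy can plausibly hide, the estimate deserves that shaker of salt.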

If you’d like to learn more about ethnicity estimates and how they are calculated, you might want to read this article, Ethnicity Testing, A Conundrum.

Summary

This information is NOT a criticism of the vendors. Instead, this is a cautionary tale about correctly setting expectations for consumers who want to understand and interpret their results – and about how to use your own genealogy research to do so.

Not a day passes that I don’t receive very specific questions about the interpretation of ethnicity estimates. People want to know why their results are not what they expected, or why they have more of a particular geographic region listed than their two parents combined. Great questions!

This phenomenon is only going to increase with the popularity of DNA testing and the number of people who test to discover their identity as a result of highly visible ad campaigns.

So let me be very clear. No one can provide a specific interpretation. All we can do is explain how ethnicity estimates work – and that these results are estimates created utilizing different reference populations and proprietary software by each vendor.

Whether the results match each other or customer expectations, or not, these vendors are legitimate, as are the GedMatch ethnicity tools. Other vendors may be less so, and some are outright unethical, looking to exploit the unwary consumer, especially those looking for Native American heritage. If you’re interested in how to tell the difference between legitimate genetic information and a company utilizing pseudo-genetics to part you from your money, click here for a lecture by Dr. Jennifer Raff, especially about minutes 48-50.

Buyer beware, both in terms of purchasing DNA testing for ethnicity purposes to discover “who you are” and when internalizing and interpreting results.

The science just isn’t there yet for answers at the level most people seek.

My advice, in a nutshell: Stay with legitimate vendors. Enjoy your ethnicity results, but don’t take them too seriously without corroborating traditional genealogical evidence!

______________________________________________________________

Disclosure

I receive a small contribution when you click on some of the links to vendors in my articles. This does NOT increase the price you pay but helps me to keep the lights on and this informational blog free for everyone. Please click on the links in the articles or to the vendors below if you are purchasing products or DNA testing.

Thank you so much.

2016 Genetic Genealogy Retrospective

In past years, I’ve written a “best of” article about genetic genealogy happenings throughout the year. For several years, the genetic genealogy industry was relatively new, and there were lots of new tools being announced by the testing vendors and others as well.

This year is a bit different. I’ve noticed a leveling off – there have been very few announcements of new tools by vendors, with only a few exceptions.  I think genetic genealogy is maturing and has perhaps begun a new chapter.  Let’s take a look.

Vendors

Family Tree DNA

Family Tree DNA leads the pack this year with their new Phased Family Matches which utilizes close relatives, up to third cousins, to assign your matches to either maternal or paternal buckets, or both if the individual is related on both sides of your tree.

Both Buckets

They are the first and remain the only vendor to offer this kind of feature.

Phased FF2

Phased Family Matching is extremely useful in terms of identifying which side of your family tree your matches are from. This tool, in addition to Family Tree DNA’s nine other autosomal tools, helps identify common ancestors by showing you who is related to whom.

Family Tree DNA has also added other features such as a revamped tree with the ability to connect DNA results to family members.  DNA results connected to the tree is the foundation for the new Phased Family Matching.

The new Ancient Origins feature, released in November, was developed collaboratively with Dr. Michael Hammer at the University of Arizona Hammer Lab.

Ancient European Origins is based on the full genome sequencing work now being performed in the academic realm on ancient remains. These European results fall into three primary groups of categories based on age and culture. Customers’ DNA is compared to the ancient remains to determine how much of the customer’s European DNA came from which group. This exciting new feature allows us to understand more about our ancestors, long before the advent of surnames and paper or parchment records. Ancient DNA is redefining what we know, or thought we knew, about population migration.

2016-ancient-origins

You can view Dr. Hammer’s presentation given at the Family Tree DNA Conference in conjunction with the announcement of the new Ancient Origins feature here.

Family Tree DNA maintains its leadership position among the three primary vendors relative to Y DNA testing, mtDNA testing and autosomal tools.

Ancestry

In May of 2016, Ancestry changed the chip utilized by their tests, removing about 300,000 of their previous 682,000 SNPs and replacing them with medically optimized SNPs. The rather immediate effect was that due to the chip incompatibility, Ancestry V2 test files created on the new chip cannot be uploaded to Family Tree DNA, but they can be uploaded to GedMatch.  Family Tree DNA is working on a resolution to this problem.

I tested on the new Ancestry V2 chip, and while there is a difference in how much matching DNA I share with my matches as compared to the V1 chip, it’s not as pronounced as I expected. There is no need for people who tested on the earlier chip to retest.

Unfortunately, Ancestry has remained steadfast in their refusal to implement a chromosome browser, instead focusing on sales by advertising the ethnicity “self-discovery” aspect of DNA testing.

Ancestry does have the largest autosomal database, but many people tested only for ethnicity and either don’t have trees or have private trees. In my case, about half of my matches fall into that category.

Ancestry maintains its leadership position relative to DNA tree matching, known as a Shared Ancestor Hint, identifying common ancestors in the trees of people whose DNA matches.

ancestry-common-ancestors

23andMe

23andMe struggled for most of the year to meet a November 2015 deadline, which is now more than a year past, to transition its customers to the 23andMe “New Experience” which includes a new customer interface. I was finally transitioned in September 2016, and the experience has been very frustrating and extremely disappointing, and that’s putting it mildly. Some customers, specifically international customers, are still not transitioned, nor is it clear if or when they will be.

I tested on the older 23andMe V3 chip as well as their newer V4 chip. After my transition to the New Experience, I compared the results of the two tests. The new security rules incorporated into the New Experience meant that I was only able to view about 25% of my matches (400 of 1651 V3 matches or 1700 V4 matches). 23andMe has, in essence, relegated itself to non-player status for genetic genealogy, except perhaps for adoptees who need to swim in every pool – but only then as a last place candidate. And those adoptees had better pray that if they have a close match, that match falls into the 25% of their matches that are useful.

In December, 23andMe began providing segment information for ethnicity segments, except the parental phasing portion does not function accurately, calling into question the overall accuracy of the 23andMe ethnicity information. Ironically, up until now, while 23andMe slipped in every other area, they had been viewed as the best, meaning most accurate, in terms of ethnicity estimates.

New Kids on the Block

MyHeritage

In May of 2016, MyHeritage began encouraging people who have tested at other vendors to upload their results. I was initially very hesitant, because aside from GedMatch that has a plethora of genetic genealogy tools, I have seen no benefit to the participant to upload their DNA anyplace, other than Family Tree DNA (available for V3 23andMe and V1 Ancestry only).

Any serious genealogist is going to test at both Family Tree DNA and Ancestry, at least, and upload to GedMatch. MyHeritage was “just another upload site” with no tools, not even matching initially.

However, in September, MyHeritage implemented matching, although they have had a series of what I hope are “startup issues,” with numerous invalid matches, apparently resulting from their usage of imputation.

Imputation is when a vendor infers what they think your DNA will look like in regions where other vendors test, and your vendor doesn’t. The best example would be the 300,000 or so Ancestry locations that are unique to the Ancestry V2 chip. Imputation would result in a vendor “inferring” or imputing your results for these 300,000 locations based on…well, we don’t exactly know based on what. But we do know it cannot be accurate.  It’s not your DNA.

In the midst of this, in October, 23andMe announced on their forum that they had severed a previous business relationship with MyHeritage where 23andMe allowed customers to link to MyHeritage trees in lieu of having customer trees directly on the 23andMe site.  This approach had been problematic because customers are only allowed 250 individuals in their tree for free, and anything above that requires a MyHeritage subscription.  Currently 23andMe has no tree capability.

It appears that MyHeritage refined their DNA matching routines at least somewhat, because many of the bogus matches were gone in November when they announced that their beta was complete and that they were going to sell their own autosomal DNA tests. However, matching issues have not disappeared or been entirely resolved.

While Family Tree DNA’s lab will be processing the MyHeritage autosomal tests, the results will NOT be automatically placed in the Family Tree DNA data base.

MyHeritage will be doing their own matching within their own database. There are no comparison tools, tree matching or ethnicity estimates today, but MyHeritage says they will develop a chromosome browser and ethnicity estimates. However, it is NOT clear whether these will be available for free to individuals who have transferred their results into MyHeritage or if they will only be available to people who tested through MyHeritage.

2016-myheritage-matches

For the record, I have 28 matches today at MyHeritage.

2016-myheritage-second-match

I found that my second closest match at MyHeritage is also at Ancestry.

2016-myheritage-at-ancestry

At MyHeritage, they report that I match this individual on a total of 64.1 cM, across 7 segments, with the largest segment being 14.9 cM.

Ancestry reports this same match at 8.3 cM total across 1 segment, which of course means that the longest segment is also 8.3 cM.

Ancestry estimates the relationship as 5th to 8th cousin, and MyHeritage estimates it as 2nd to 4th.

While I think Ancestry’s Timber strips out too much DNA, there is clearly a HUGE difference in the reported results and the majority of this issue likely lies with the MyHeritage DNA imputation and matching routines.

I uploaded my Family Tree DNA autosomal file to MyHeritage, so MyHeritage is imputing at least 300,000 SNPs for me – almost half of the SNPs needed to match to Ancestry files.  They are probably imputing that many for my match’s file too, so that we have an equal number of SNPs for comparison.  Combined, this would mean that my match and I are comparing 382,000 actual SNPs that we both tested, and roughly 600,000 SNPs that we did not test and were imputed.  No wonder the MyHeritage numbers are so “off.”
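To put those numbers side by side, here is a small back-of-the-envelope calculation. It uses only the figures quoted above; the variable names are mine, and treating the tested and imputed counts as one combined comparison set is my assumption about how the matching works, not something MyHeritage has documented.

```python
# Figures quoted in the paragraphs above; the variable names are my own.
tested_by_both = 382_000   # SNPs that both files actually tested
imputed = 600_000          # rough count of SNPs that were inferred, not tested

# Assumption: the match is computed over tested plus imputed SNPs together.
total_compared = tested_by_both + imputed
imputed_share = imputed / total_compared

# Total shared cM reported for the same match by each vendor.
myheritage_cm = 64.1
ancestry_cm = 8.3
cm_ratio = myheritage_cm / ancestry_cm

print(f"SNPs compared: {total_compared:,}")
print(f"Imputed share: {imputed_share:.0%}")
print(f"MyHeritage total is {cm_ratio:.1f} times the Ancestry total")
```

Under that assumption, roughly six out of every ten SNPs in the comparison are inferred rather than tested, and the MyHeritage total for this one match is more than seven times the Ancestry total.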

MyHeritage has a long way to go before they are a real player in this arena. However, MyHeritage has potential, as they have a large subscriber base in Europe, where we desperately need additional testers – so I’m hopeful that they can attract additional genealogists that are willing to test from areas that are under-represented to date.

MyHeritage got off to a bit of a rocky start by requiring users to relinquish the rights to their DNA, but then changed their terms in May, according to Judy Russell’s blog.

All vendors can change their terms at any time, in a positive or negative direction, so I would strongly encourage all individuals considering utilizing any testing company or upload service to closely read all the legal language, including Terms and Conditions and any links found in the Terms and Conditions.

Please note that MyHeritage is a subscription genealogy site, similar to Ancestry.  MyHeritage also owns Geni.com.  One site, MyHeritage, allows individual trees and the other, Geni, embraces the “one world tree” model.  For a comparison of the two, check out Judy Russell’s articles, here and here.  Geni has also embraced DNA by allowing uploads from Family Tree DNA of Y, mitochondrial and autosomal, but the benefits and possible benefits are much less clear.

If the MyHeritage story sounds like a confusing soap opera, it is.  Let’s hope that 2017 brings both clarity and improvements.

Living DNA

Living DNA is a company out of the British Isles with a new test that purports to provide you with a breakdown of your ethnicity and the locations of your ancestral lines within 21 regions in the British Isles.  Truthfully, I’m very skeptical, but open minded.

They have had my kit for several weeks now, and testing has yet to begin.  I’ll write about the results when I receive them.  So far, I don’t know of anyone who has received results.

2016-living-dna

Genos

I debated whether or not I should include Genos, because they are not a test for genealogy and are medically focused. However, I am including them because they have launched a new model for genetic testing wherein your full exome is tested, you receive the results along with information on the SNPs where mutations are found. You can then choose to be involved with research programs in the future, if you wish, or not.

That’s a vastly different model than the current approach taken by 23andMe and Ancestry where you relinquish your rights to the sale of your DNA when you sign up to test.  I like this new approach with complete transparency, allowing the customer to decide the fate of their DNA. I wrote about the Genos test and the results, here.

Third Parties

Individuals sometimes create and introduce new tools to assist genealogists with genetic genealogy and analysis.

I have covered these extensively over the years.

GedMatch, WikiTree, DNAGedcom.com and Kitty Cooper’s tools remain my favorites.

I love Kitty’s Ancestor Chromosome Mapper which maps the segments identified with your ancestors on your chromosomes. I just love seeing which ancestors’ DNA I carry on which chromosomes.  Somehow, this makes me feel closer to them.  They’re not really gone, because they still exist in me and other descendants as well.

Roberta's ancestor map2

In order to use Kitty’s tool, you’ll have to have mapped at least some of your autosomal DNA to ancestors.

The Autosomal DNA Segment Analyzer written by Don Worth and available at DNAGedcom is still one of my favorite tools for quick, visual and easy to understand segment matching results.

ADSA Crumley cluster

GedMatch has offered a triangulation tool for some time now, but recently introduced a new Triangulation Groups tool.

2016-gedmatch-triangulation-groups

I have not utilized this tool extensively but it looks very interesting. Unfortunately, there is no explanation or help function available for what this tool is displaying or how to understand and interpret the results. Hopefully, that will be added soon, as I think it would be possible to misinterpret the output without educational material.

GedMatch also introduced their “Evil Twin” tool, which made me laugh when I saw the name.  Using parental phasing, you can phase your DNA to your parent or parents at GedMatch, creating kits that only have your mother’s half of your DNA, or your father’s half.  These phased kits allow you to see your matches that come from that parent, only.  However, the “Evil Twin” feature creates a kit made up of the DNA that you DIDN’T receive from that parent – so in essence it’s your other half, your evil twin – you know, that person who got blamed for everything you “didn’t do.”  In any case, this allows you to see the matches to the other half of your parent’s DNA that do not show up as your matches.

Truthfully, the Evil Twin tool is interesting, but since you have to have that parent’s DNA to phase against in the first place, it’s just as easy to look at your parent’s matches – at least for me.
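For those who like to see the logic spelled out, here is a toy sketch of the idea behind an “evil twin” kit. This is NOT GedMatch’s code, and the SNP names and genotypes are made up for illustration. It assumes that at each location you know the parent’s two alleles and which one you inherited from that parent; the evil twin simply gets the other one.

```python
def evil_twin_allele(parent_genotype, inherited_allele):
    """Return the parental allele the child did NOT inherit at one SNP."""
    first, second = parent_genotype
    if inherited_allele == first:
        return second
    if inherited_allele == second:
        return first
    raise ValueError("inherited allele is not present in the parent")


# Made-up SNP names and genotypes, purely for illustration.
parent = {"snp_a": "AG", "snp_b": "CC", "snp_c": "TC"}
from_parent = {"snp_a": "A", "snp_b": "C", "snp_c": "C"}

# Build the evil twin: at every SNP, take the allele that was left behind.
evil_twin = {
    snp: evil_twin_allele(parent[snp], from_parent[snp]) for snp in parent
}
print(evil_twin)
```

Where the parent carries two copies of the same allele, as at the made-up snp_b, you and your evil twin end up with the same value, which is why the two halves are not complete opposites.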

Others offer unique tools that are a bit different.

DNAadoption.com offers tools, search and research techniques, especially for adoptees and those looking to identify a parent or grandparents, but perhaps even more important, they offer genetic genealogy classes including basic and introductory.

I send all adoptees in their direction, but I encourage everyone to utilize their classes.

WikiTree has continued to develop and enhance their DNA offerings.  While WikiTree is not a testing service nor do they offer autosomal data tools like Family Tree DNA and GedMatch, they do allow individuals to discover whether anyone in their ancestral line has tested their Y, mitochondrial or autosomal DNA.

Specifically, you can identify the haplogroup of any male or female ancestor if another individual from that direct lineage has tested and provided that information for that ancestor on WikiTree.  While I am generally not a fan of the “one world tree” types of implementations, I am a fan of WikiTree because of their far-sighted DNA comparisons, the fact that they actively engage their customers, they listen and they expend a significant amount of effort making sure they “get it right,” relative to DNA. Check out WikiTree’s article,  Putting DNA Results Into Action, for how to utilize their DNA Features.

2016-wikitree-peter-roberts

Thanks particularly to Chris Whitten at WikiTree and Peter Roberts for their tireless efforts.  WikiTree is the only vendor to offer the ability to discover the Y and mtDNA haplogroups of ancestors by searching trees.

All of the people creating the tools mentioned above, to the best of my knowledge, are primarily volunteers, although GedMatch does charge a small subscription fee for their high end tools, including the triangulation and evil twin tools.  DNAGedcom does as well.  WikiTree generates some revenue for the site through ads on pages of non-members. DNAadoption charges nominally for classes but they do have need-based scholarships. Kitty has a donation link on her website and all of these folks would gladly accept donations, I’m sure.  Websites and everything that goes along with them aren’t free.  Donations are a nice way to say thank you.

What Defined 2016

I have noticed two trends in the genetic genealogy industry in 2016, and they are intertwined – ethnicity and education.

First, there is an avalanche of new testers, many of whom are not genetic genealogists.

Why would one test if they weren’t a genetic genealogist?

The answer is simple…

Ethnicity.

Or more specifically, the targeted marketing of ethnicity.  Ethnicity testing looks like an easy, quick answer to a basic human question, and it sells kits.

Ethnicity

“Kim just wanted to know who she was.”

I have to tell you, these commercials absolutely make me CRINGE.

Yes, they do bring additional testers into the community, BUT carrying significantly misset expectations. If you’re wondering about WHY I would suggest that ethnicity results really cannot tell you “who you are,” check out this article about ethnicity estimates.

And yes, that’s what they are, estimates – very interesting estimates, but estimates just the same.  Estimates that provide important and valid hints and clues, but not definitive answers.

ESTIMATES.

Nothing more.

Estimates based on proprietary vendor algorithms that tend to be fairly accurate at the continental level, and not so much within continents – in particular, not terribly accurate within Europe. Not all of this can be laid at the vendor’s feet.  For example, DNA testing is illegal in France.  Not to mention, genetic genealogy and population genetics are still new and emerging fields.  We’re on the frontier, folks.

The ethnicity results one receives from the 3 major vendors (Ancestry, Family Tree DNA and 23andMe) and the various tools at GedMatch don’t and won’t agree – because they use different reference populations, different matching routines, etc.  Not to mention people and populations move around and have moved around.

The next thing that happens, after these people receive their results, is that we find them on the Facebook groups asking questions like, “Why doesn’t my full blooded Native American grandmother show up?” and “I just got my Ancestry results back. What do I do?”  They mean that question quite literally.

I’m not making fun of these people, or light of the situation. Their level of frustration and confusion is evident. I feel sorry for them…but the genetic genealogy community and the rest of us are left with applying ointment and Band-Aids.  Truthfully, we’re out-numbered.

Because of the expectations, people who test today don’t realize that genetic testing is a TOOL, it’s not an ANSWER. It’s only part of the story. Oh, and did I mention, ethnicity is only an ESTIMATE!!!

But an estimate isn’t what these folks are expecting. They are expecting “the answer,” their own personal answer, which is very, very unfortunate, because eventually they are either unhappy or blissfully unaware.

Many become unhappy because they perceive the results to be in error without understanding anything about the technology or what information can reasonably be delivered, or they swallow “the answer” lock, stock and barrel, again, without understanding anything about the technology.

Ethnicity is fun, it isn’t “bad” but the results need to be evaluated in context with other information, such as Y and mitochondrial haplogroups, genealogical records and ethnicity results from the other major testing companies.

Fortunately, we can recruit some of the ethnicity testers to become genealogists, but that requires education and encouragement. Let’s hope that those DNA ethnicity results light the fires of curiosity and that we can fan those flames!

Education

The genetic genealogy community desperately needs educational resources, in part as a result of the avalanche of new testers – approximately 1 million a year, and that estimate may be low. Thankfully, we do have several education options – but we can always use more.  Unfortunately, the learning curve is rather steep.

My blog offers just shy of 800 articles, all key word searchable, but one has to first find the blog and want to search and learn, as opposed to being handed “the answer.”

Of course, the “Help” link is always a good place to start as are these articles, DNA Testing for Genealogy 101 and Autosomal DNA Testing 101.  These two articles should be “must reads” for everyone who has DNA tested, or wants to, for that matter.  Tips and Tricks for Contact Success is another article that is immensely helpful to people just beginning to reach out.

In order to address the need for basic understanding of autosomal DNA principles, tools and how to utilize them, I began the “Concepts” series in February 2016. To date I offer the following 15 articles about genetic genealogy concepts. To be clear, DNA testing is only the genetic part of genetic genealogy, the genealogical research part being the second half of the equation.

My blog isn’t the only resource of course.

Kelly Wheaton provides 19 free lessons in her Beginners Guide to Genetic Genealogy.

Other blogs I highly recommend include:

Excellent books in print that should be in every genetic genealogist’s library:

And of course, the ISOGG Wiki.

Online Conference Resources

The good news and bad news is that I’m constantly seeing a genetic genealogy seminar, webinar or symposium hosted by a group someplace that is online, and often free. When I see names I recognize as being reputable, I am delighted that there is so much available to people who want to learn.

And for the record, I think that includes everyone. Even professional genetic genealogists watch these sessions, because you just never know what wonderful tidbit you’re going to pick up.  Learning, in this fast moving field, is an everyday event.

The bad news is that I can’t keep track of everything available, so I don’t mean to slight any resource.  Please feel free to post additional resources in the comments.

You would be hard pressed to find any genealogy conference, anyplace, today that didn’t include at least a few sessions about genetic genealogy. However, genetic genealogy has come of age and has its own dedicated conferences.

Dr. Maurice Gleeson, the gentleman who coordinates Genetic Genealogy Ireland, films the sessions at the conference and then makes them available, for free, on YouTube. This link provides a list of the various sessions from 2016 and past years as well. Well worth your time!  A big thank you to Maurice!!!

The 19-video series from the I4GG Conference this fall is now available for $99. This series is an excellent opportunity for genetic genealogy education.

As always, I encourage project administrators to attend the Family Tree DNA International Conference on Genetic Genealogy. The sessions are not filmed, but the slides are made available after the conference, courtesy of the presenters and Family Tree DNA. You can view the presentations from 2015 and 2016 at this link.

Jennifer Zinck attended the conference and published her excellent notes here and here, if you want to read what she had to say about the sessions she attended. Thankfully, she can type much faster and more accurately than I can! Thank you so much Jennifer.

If you’d like to read about the unique lifetime achievement awards presented at the conference this year to Bennett Greenspan and Max Blankfeld, the founders of Family Tree DNA, click here. They were quite surprised!  This article also documents the history of genetic genealogy from the beginning – a walk down memory lane.

The 13th annual Family Tree DNA conference will be held November 10-12, 2017 at the Hyatt Regency North Houston. Registration is always limited due to facility size, so mark your calendars now, watch for the announcement and be sure to register in time.

Summary

2016 has been an extremely busy year. I think my blog has had more views, more comments and by far, more questions, than ever before.

I’ve noticed that the membership in the ISOGG Facebook group, dedicated to genetic genealogy, has increased by about 50% in the past year, from roughly 8,000 members to just under 12,000. Other social media groups have been formed as well, some focused on specific aspects of genetic genealogy, such as specific surnames, adoption search, Native American or African American heritage and research.

The genetic aspect of genealogy has become “normal” today, with most genealogists not only accepting DNA testing, but embracing the various tools and what they can do for us in terms of understanding our ancestors, tracking them, and verifying that they are indeed who we think they are.

I may have to explain the three basic kinds of DNA testing and how they are used today, but no longer do I have to explain THAT DNA testing for genealogy exists and that it’s legitimate.

I hope that each of us can become an ambassador for genetic genealogy, encouraging others to test, with appropriate expectations, and helping to educate, enlighten and encourage. After all, the more people who test and are excited about the results, the better for everyone else.

Genetic genealogy is and can only be a collaborative team sport.

Here’s wishing you many new cousins and discoveries in 2017.

Happy New Year!!!

______________________________________________________________

Disclosure

I receive a small contribution when you click on some of the links to vendors in my articles. This does NOT increase the price you pay but helps me to keep the lights on and this informational blog free for everyone. Please click on the links in the articles or to the vendors below if you are purchasing products or DNA testing.

Thank you so much.

23andMe’s New Ancestry Composition (Ethnicity) Chromosome Segments

I was excited to see 23andMe’s latest feature that provides customers with Ancestry Composition (ethnicity) chromosome segment information by location.  This means I can compare my triangulation groups to these segments and potentially identify which ethnicity each ancestor’s inherited DNA carries – right?? Another potential way to help discern whether I should ask Santa for lederhosen or a kilt?

Not so fast…

Theoretically yes, but as it turns out, after working with the results, this tool doesn’t fulfill its potential and has some very significant issues, or maybe this new tool just unveiled underlying issues.

Rats, I guess Santa is off the hook.

Let’s take a look and step through the process.

Ancestry Composition Chromosome Painting

To see your Ancestry Composition ethnicity chromosome painting, sign into 23andMe, then go to the Reports tab at the top of your page and click on Ancestry. Please note that you can click on any of the graphics in this article to enlarge.

23andme-eth-seg-1

Then click on Ancestry Composition, which shows you the following:

23andme-eth-seg-2

Scrolling down shows you your chromosomes, painted with your ethnicity. This isn’t new and it’s a great visual.

You may note that 23andMe paints both “sides” of each chromosome separately, the side you received from your mother and the side you received from your father. However, there is no way to determine which is which, and they are not necessarily the same side on each chromosome.

If one or both of your parents tested at 23andMe, you can connect your parents to your results and you can then see which ethnicity you received from which parent.

Let’s work through an example.

23andme-eth-seg-3

This person, we’ll call her Jasmine, received two segments of Native ancestry, one on chromosome 1 and one on chromosome 2, both on the first (top) strands or copies. She also received one segment of African on DNA strand (copy) 1 of chromosome 7.

Caveat

Words of warning.

JUST BECAUSE THESE ETHNICITIES APPEAR ON THE SAME STRANDS OF DIFFERENT CHROMOSOMES, STRAND ONE IN THIS CASE, DOES NOT MEAN THEY ARE INHERITED FROM THE SAME PARENT.

Each chromosome recombines separately and without a parent to compare to, there is no way to know which strand is mother’s or father’s on any chromosome. And figuring out which strand is which for one chromosome does NOT mean it’s the same for other chromosomes.

In fact, Jasmine’s mother has tested, and she has NO African on chromosome 7. However, Jasmine and her mother both have Native American on chromosomes 1 and 2 in the same location, so we know absolutely that Jasmine’s strand 1 on chromosome 7 is not from the same parent as strand 1 on chromosome 1 and 2, because Jasmine’s mother doesn’t have any African DNA in that location.

If you’re a seasoned 23andMe user and you’re saying to yourself, “That’s not right, the chromosome sides should be aligned if a parent tests,” you’re right, or at least that’s what we’ve all thought.  Keep reading.

Let’s dig a bit further.

Connecting Up

23andMe encourages everyone to connect their parents, if your parents have tested.

Jasmine’s mother has tested and is connected to Jasmine at 23andMe.

23andme-eth-seg-4

Even though the button says “Connect Mother,” which makes it appear that Jasmine’s mother isn’t connected, she is. Clicking on Jasmine’s “Connect Mother” button shows the following:

23andme-eth-seg-5

Furthermore, if the parent isn’t connected, you don’t see any parental side ethnicity breakdown – and we clearly see those results for Jasmine.  Below is an example of the same page of someone whose parents aren’t connected – and you can see the verbiage at the bottom saying that a parent must be connected to see how much ancestry composition was inherited from each parent.

23andme-eth-seg-not-connect

If a child is connected to at least one parent, 23andMe, based on that parent’s test, tells the child which sides they inherited which pieces of their ethnicity from, shown for Jasmine, below.

23andme-eth-seg-6

In this case, the mother is connected to Jasmine and the father’s ethnicity results are imputed by subtracting the results where Jasmine matches her mother. The balance of Jasmine’s DNA ethnicity results that don’t match her mother in that location are clearly from her father.

23andMe may sort the results into the correct buckets, but they do not correctly rearrange the chromosome “copies” or “sides” on the chromosome browser display based on the parents’ DNA, as seen from the African example on chromosome 7. Either that, or the ethnicity phasing is inaccurate, or both.

You can see that 23andMe tells Jasmine that all of her Native is from her mother’s side, which is correct.

23andMe tells Jasmine that part of her North African and Sub-Saharan African are from her mother, but some North African is also from her father. You can see Jasmine’s African on her chromosome 7, below.

23andme-eth-seg-7

There is no African on Jasmine’s mother’s chromosome 7, below.

23andme-eth-seg-8

So if African exists on chromosome 7, it MUST come from Jasmine’s father’s side. Therefore, side one of chromosome 7 cannot be Jasmine’s mother’s side, because that’s where Jasmine’s African resides.

This indicates that either the results are incorrect, or the “sides” showing have not been corrected or realigned by 23andMe after parental ethnicity phasing, or both.
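To show what I mean by realigning the sides, here is a minimal sketch. It is not how 23andMe does it (I have no visibility into their code). It assumes you have already worked out, chromosome by chromosome, which displayed copy matches the mother; the small lookup table mirrors the Jasmine example and is entered by hand, not exported from 23andMe.

```python
# Hypothetical lookup: which displayed copy (1 or 2) is the mother's,
# per chromosome. Chromosomes 1 and 2 match the mother on copy 1; on
# chromosome 7, copy 1 carries African the mother lacks, so her copy is 2.
maternal_copy = {1: 1, 2: 1, 7: 2}


def realign(chromosome, copy, maternal_copy):
    """Translate a displayed copy number into 'maternal' or 'paternal'."""
    known = maternal_copy.get(chromosome)
    if known is None:
        return "unknown"  # no parental evidence for this chromosome
    return "maternal" if copy == known else "paternal"


# Segments as (chromosome, displayed copy, ethnicity label).
segments = [
    (1, 1, "Native American"),
    (2, 1, "Native American"),
    (7, 1, "African"),
]
for chromosome, copy, label in segments:
    print(chromosome, copy, label, realign(chromosome, copy, maternal_copy))
```

The point of the lookup table is that “copy 1” means nothing by itself. The same displayed copy number can be the mother’s on one chromosome and the father’s on another.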

Here’s another example. Jasmine shows Middle East and North Africa on chromosomes 12 and 13 on sides one and two, respectively.

23andme-eth-seg-9

Jasmine’s mother shows Middle East and North Africa on chromosome 14, only, with none showing on chromosome 12 or 13.

23andme-eth-seg-10

Yet, 23andMe shows Jasmine receiving Middle East and North African DNA from her mother.

23andme-eth-seg-11

Jasmine is also shown as receiving Sub-Saharan African and West African from her mother, but Jasmine’s mother has no Sub-Saharan or West African, at all.

Interestingly, when you highlight both West African and Sub-Saharan African, shown below, it highlights the same segment of Jasmine’s DNA, so apparently these are not different categories, but subsets of each other, at least in this case, and reflect the same segment.

23andme-eth-seg-12

23andme-eth-seg-13

Jasmine’s mother shows this region of chromosome 7 to be “European” with no further breakdown.

Clearly Jasmine’s sides 1 and 2 have not been consistently assigned to her mother, because Jasmine’s African shows on both sides 1 and 2 of chromosomes 12 and 13 and Jasmine’s mother has no African on either of those chromosomes. Those segments should be assigned consistently to Jasmine’s father’s side. Based on Jasmine’s match to her mother on chromosome 1, side 1, Jasmine’s father’s “copy” should be Jasmine’s side 2.  This tool is not functioning correctly.

Jasmine’s father is deceased, so there is no way to test him.

The information provided by 23andMe contradicts itself.

Either the ethnicity assignment itself or the parental ethnicity phasing is inaccurate, or both. Additionally, we now know that the chromosome “sides,” meaning “copies” are inaccurately displayed, even when one parent’s DNA is available and connected, and the sides could and should be portrayed accurately.

This discrepancy has to be evident to 23andMe, if they are checking for consistency in assigning child to parent segments.  You can’t assign a child’s segment to a parent who doesn’t carry any of that ethnicity in a common location.  That situation should result in a big red neon sign flashing “STOP” in quality assurance.  Inaccurate results should never be delivered to testers, especially when there are easy ways to determine that something isn’t right.
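The kind of consistency check I have in mind can be sketched in a few lines. This is my own illustration, not 23andMe’s quality assurance process. The segment positions are made up, and the rule is deliberately simple: a child’s segment attributed to a parent is flagged unless that parent has a segment with the same label overlapping the same location.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    chromosome: int
    start: int
    end: int
    label: str


def overlaps(a, b):
    """True when two segments share at least one position."""
    return a.chromosome == b.chromosome and a.start <= b.end and b.start <= a.end


def unsupported_assignments(child_from_parent, parent_segments):
    """Child segments credited to a parent who has no matching segment there."""
    return [
        c for c in child_from_parent
        if not any(p.label == c.label and overlaps(c, p) for p in parent_segments)
    ]


# Made-up positions, loosely patterned on the Jasmine example.
credited_to_mother = [
    Segment(1, 10_000_000, 20_000_000, "Native American"),
    Segment(7, 50_000_000, 60_000_000, "African"),
]
mother = [
    Segment(1, 5_000_000, 25_000_000, "Native American"),
    Segment(7, 1, 159_000_000, "European"),
    Segment(14, 30_000_000, 40_000_000, "African"),
]

for segment in unsupported_assignments(credited_to_mother, mother):
    print("STOP:", segment)
```

With this made-up data, only the African segment on chromosome 7 is flagged. Exact label matching is a simplification, since a broad label such as “European” can cover more specific categories, so a real check would need to understand how the categories nest.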

The New Feature – Ethnicity Segments

Like I said, I was initially quite excited about this new feature, at least until I did the analysis. Now, I’m not excited at all, because if the results are flawed, so is the underlying segment data.

My original intention was to download the ethnicity segment information into my master spreadsheet so that I could potentially match the ethnicity segments against ancestors when I’ve identified an ancestral segment as belonging to a particular ancestral line.

This would have been an absolutely wonderful benefit.

Let’s walk through these steps so you can find your results and do your own analysis.

When you are on the Ancestry Composition page, you will be, by default, on the Summary page.

23andme-eth-seg-14

Click on the Scientific Details tab, at the top, and scroll down to the bottom of the page where you will see the following:

23andme-eth-seg-15

You will be able to select a confidence level, ranging from 50% to 90%, where 50% is speculative and 90% is the highest confidence. Hint – at the highest confidence level, many of the areas broken out in the speculative level are rolled up into general regions, like “European.”  Default is 50%.

23andme-eth-seg-16

Click on download raw data and you can then open or save a .csv file. I suggest then saving that file as an Excel file so you can do some comparisons without losing features like color.

In my case, I saved a 50% confidence file and a 90% confidence file to compare to each other.
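If you would rather compare the two downloads with a short script than by eye, something like the following could work. Treat it as a sketch under assumptions: the column names (“Ancestry”, “Copy”, “Chromosome”, “Start Point”, “End Point”) are my guess at the layout, so check the header row of your own file, and the sample rows are made up so that the example runs without a download.

```python
import csv
import io
from collections import Counter


def label_counts(csv_text, label_column="Ancestry"):
    """Count how many segments carry each label in one downloaded file."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row[label_column] for row in reader)


# Made-up stand-ins for the 50% and 90% confidence downloads.
file_50 = """Ancestry,Copy,Chromosome,Start Point,End Point
European,1,chr1,1000,5000
Northwestern European,1,chr1,1000,5000
Scandinavian,1,chr1,2000,4000
"""
file_90 = """Ancestry,Copy,Chromosome,Start Point,End Point
European,1,chr1,1000,5000
"""

counts_50 = label_counts(file_50)
counts_90 = label_counts(file_90)

# Labels that appear at the speculative level but vanish at 90%.
only_in_50 = sorted(set(counts_50) - set(counts_90))
print(only_in_50)
```

To run this against your own files, read each one into a string first, for example with open(path, newline="").read(), and pass the result to label_counts.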

I began my analysis with both strands of chromosome 1:

Strand 1 was easy.  (Click on graphic to enlarge.)

23andme-eth-seg-17

At the 50% confidence level, on the left, three segments are identified, but when you really look at the start and end positions, rows one and two overlap entirely. Looking back at the chromosome browser painting, this looks to be because that segment will show up in both of those categories, so this isn’t an either-or situation. Row 3 shows Scandinavian beginning at 79,380,466 and continuing through 230,560,900, which is a partial embedded segment of row 2.
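Sorting out whether two rows are identical, nested or only partly overlapping is easy to get wrong by eye, so here is a small helper that classifies a pair of segments. The Scandinavian positions are the ones quoted above. The span used for rows one and two is a placeholder I made up, since I have not listed their actual start and end positions here.

```python
def relationship(a, b):
    """Describe how segment a relates to segment b; each is (start, end)."""
    a_start, a_end = a
    b_start, b_end = b
    if a_end < b_start or b_end < a_start:
        return "disjoint"
    if (a_start, a_end) == (b_start, b_end):
        return "identical"
    if b_start <= a_start and a_end <= b_end:
        return "a inside b"
    if a_start <= b_start and b_end <= a_end:
        return "b inside a"
    return "partial overlap"


# Placeholder span for rows one and two (made up for illustration).
row_1 = (1, 249_000_000)
row_2 = (1, 249_000_000)
# Scandinavian segment, positions as quoted in the text.
row_3 = (79_380_466, 230_560_900)

print(relationship(row_1, row_2))
print(relationship(row_3, row_2))
```

A helper like this makes it clear when a more specific label is simply nested inside a broader one, as opposed to being a separate, competing assignment.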

At the 90% confidence level, on the right, above, this entire segment, meaning all of chromosome 1 on side 1, is simply called European.

You can see how this might get complex very quickly when trying to utilize this information in a Master DNA Spreadsheet with your matches, especially since individual segments can have 2 or 3 different labels.  However, I’d love to know where my mystery Scandinavian is coming from – assuming it’s real.

Now, let’s look at strand 2 of chromosome one. It’s a little more complex.

23andme-eth-seg-18

I’ve tried to color code identical, or partially-overlapping segments.

The red, green and apricot segments overlap or partially overlap at the 50% level, on the left, indicating that they show up in different categories.

The red segments are partially the same, with some overlapping, but are grouped differently within Europe.

The green Native/East Asian segments at the 90% level are interrupted by the blue unassigned segments in the middle of the green segments, while at the 50% confidence level, they remain contiguous.

All of the start and end positions change, even if the categories stay the same or generally the same. The grey example at the bottom is the easiest to see – the category changes to the more general “European” at the 90% level and the start position is slightly different.

Jasmine and Her Mother

As one last example, let’s look at the segments at the 50% confidence level, which should be the least restrictive, that we were comparing when discussing Jasmine and her mother.

You can see, below, that Jasmine’s Native portions of chromosomes 1 and 2 are either equal to or a subset of her mother’s Native portions, so these match accurately and are shown in green.

This tells us that Jasmine’s mother’s side of chromosomes 1 and 2 is Jasmine’s “copy 1” and given that we can identify Jasmine’s mother’s DNA, all of Jasmine’s “copy 1” should now be displayed as her mother’s DNA, but it isn’t.

23andme-eth-seg-19

On chromosomes 7 and 12, where Jasmine’s copy 1 shows African DNA, her mother has none. All African DNA segments are shown in red, above.

Furthermore, 23andMe attributes at least some portion of Jasmine’s African to Jasmine’s mother, but Jasmine’s mother’s only African DNA appears on chromosome 14, a location where Jasmine has none. There is no common African segment or segments between Jasmine and her mother, in spite of the fact that 23andMe indicates that Jasmine inherited part of her African DNA from her mother.  It’s true that Jasmine and her mother both carry African DNA, but not on any of the same segments, so Jasmine did not inherit her mother’s African DNA.  Jasmine’s African DNA had to have come from her father – and that’s evident if you compare Jasmine and her mother’s segment data.

Where Jasmine has African DNA segments, above, I’ve shown her mother’s corresponding DNA segments on both strands for comparison. I have not colored these segments. Conversely, where Jasmine’s mother has African, on chromosome 14, I have shown Jasmine’s corresponding DNA segments covering that segment.  There are no matches.

Clearly Jasmine did not inherit her African segments from her mother, or the segments have been incorrectly assigned as African or European, or multiple problems exist.
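The comparison described above boils down to a simple overlap test: if a child inherited a segment of a given ethnicity from a parent, the parent must carry that same ethnicity on an overlapping stretch of the same chromosome. Here is a minimal sketch of that check in Python. The segment positions below are invented purely for illustration and are not Jasmine’s or her mother’s actual data, and the Segment structure and function names are hypothetical, not anything 23andMe provides.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    chromosome: int
    start: int      # start position in base pairs (hypothetical values below)
    end: int        # end position in base pairs
    ancestry: str   # ethnicity label assigned to the segment


def overlaps(a: Segment, b: Segment) -> bool:
    """True when two segments share at least part of the same chromosome region."""
    return a.chromosome == b.chromosome and a.start < b.end and b.start < a.end


def shared_segments(child, parent, ancestry):
    """Pair every child segment of one ancestry with any overlapping
    parent segment carrying the same ancestry label."""
    child_hits = [s for s in child if s.ancestry == ancestry]
    parent_hits = [s for s in parent if s.ancestry == ancestry]
    return [(c, p) for c in child_hits for p in parent_hits if overlaps(c, p)]


# Invented example data that mirrors the pattern described in the text:
# the child has African on chromosomes 7 and 12, the mother only on 14.
child = [
    Segment(1, 5_000_000, 30_000_000, "Native American"),
    Segment(7, 10_000_000, 25_000_000, "African"),
    Segment(12, 40_000_000, 52_000_000, "African"),
]
mother = [
    Segment(1, 1_000_000, 45_000_000, "Native American"),
    Segment(14, 60_000_000, 75_000_000, "African"),
]

native_shared = shared_segments(child, mother, "Native American")
african_shared = shared_segments(child, mother, "African")

print("Shared Native American segments:", len(native_shared))
print("Shared African segments:", len(african_shared))
```

With data shaped like this, the Native segments pair up and the African segments do not, which is the situation that rules out the mother as the source of the child’s African DNA. A real comparison would also need to account for which chromosome copy each segment sits on.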

Summary

I initially thought the Ancestry Composition segments were a great addition to the genealogist’s toolset, but unfortunately, they have proven to be otherwise, highlighting deficiencies in one or more of the following areas:

  • Potentially, the ancestry composition ethnicity breakdown itself.  Is the underlying ethnicity assignment incorrect?  In either case, that would not explain the balance of the issues we encountered.
  • The chromosome “sides” or “copy” shown after the parental phasing – in other words, the child’s chromosome copies can be assigned to a particular parent with either or both parents’ DNA. Therefore, after parental phasing, all of the same parent’s DNA should consistently be assigned to either copy 1 or copy 2 for the child on all of their chromosomes.  It isn’t.
  • The child’s ethnicity source (parent) assignment based on the parent’s or parents’ ethnicity assignment(s).  Hence, the African segment assignment issues above.
  • The ethnicity phasing itself.  The assigning of the source of Jasmine’s African DNA to her mother when they share no common African segments.  Clearly this is incorrect, calling into question the validity of the rest of the parental ethnicity phasing.

Unfortunately, we really don’t have adequate tools to determine exactly where the problem or problems lie, but problems clearly do exist. This is very disappointing.

As a result, I won’t be adding this information to my Master DNA spreadsheet, and I’m surely glad I took the time to do the analysis BEFORE I copied the segment data into my spreadsheet.  In my excitement, I almost skipped the analysis step, trusting that 23andMe had this right.

All ethnicity results need to be taken with a large grain of salt, especially at the intra-continent level, because the reference populations and technology just haven’t been perfected.  It’s very difficult to discern between countries and regions of Europe, for example.  I discussed this in the article, “Ethnicity Testing – A Conundrum.”

However, it appears that adding parental phasing on top means that instead of a grain of salt, we’re looking at the entire shaker, at least at 23andMe – even at the continent level – in this case, Africa, which should be easily discernable from European. Parental phasing by its very nature should be able to help refine our results, not make them less reliable.

Is this new segment information just showing us the problems with the original ethnicity information?  I hate to even think about this or ask these difficult questions, but we must, because testers often rely on minority (to them) ethnicity admixture information to help confirm the ethnicity of distant ancestors. Are the display tools or 23andMe’s programs not working correctly, or is there a deeper problem, or both?

I think I just received a big lump of coal, or maybe a chunk of salt, in my stocking for Christmas.

Bah, humbug.

______________________________________________________________

Disclosure

I receive a small contribution when you click on some of the links to vendors in my articles. This does NOT increase the price you pay but helps me to keep the lights on and this informational blog free for everyone. Please click on the links in the articles or to the vendors below if you are purchasing products or DNA testing.

Thank you so much.

Beware The Sale of Your DNA – Just Because You Can Upload Doesn’t Mean You Should

You know something is coming of age when you begin to see knockoffs, opportunists – or ads on late night TV. As soon as someone figures out they can make money from something, rest assured, they will.

In the past few weeks, we’re beginning to see additional “opportunities” for places to upload your DNA files. Each of them has something to “give” you in return.  You can view this as genuine, or you can view this as bait – or maybe some of each.

So far, each of them also seems to have an agenda that is NOT serving us or our DNA – but serving only or primarily them. I’m not saying this is good or bad – that depends on your perspective – but I am saying that we need to be quite aware of a variety of factors before we participate or upload our autosomal DNA results.

Some sites are more straightforward than others.

I have already covered the fact that both 23andMe and Ancestry sell your DNA to whomever they choose, for whatever purpose they see fit.

Truthfully, I always knew that 23andMe was focused on health, but I mistakenly presumed it was on the study of diseases like Parkinson’s. My mother was diagnosed with Parkinson’s, so I had a personal stake in that game.  When their very first patent was for “designer babies,” I felt shell-shocked, stupid, naïve, duped and taken advantage of. I had willingly opted-in and contributed my information with the idea that I was contributing to Parkinson’s research, while in reality, my DNA may have been used in the designer baby patent research.  I have no way of knowing and I had no idea that’s the type of research they were doing.

Parkinson’s yes, designer babies no.  It’s a personal decision, but once your DNA is being utilized or sold, it can be used for anything and you have no control whatsoever.  While I was perfectly willing to participate in surveys and have my DNA utilized for a cure for diseases, in particular Parkinson’s, I was not and am not willing for my DNA to be utilized for things like designer babies so the wealthy can select blue eyed, blonde haired children carrying the genes most likely to allow them to become athletes or cheerleaders.

And once the DNA cat is out of the bag, so to speak, there is no putting it back in. In some cases, you can opt out of identified data, but you can’t opt out of what has already been used, and in many cases, you can’t opt out of having your anonymized data sold.

So, let me give you an example of just how much protection anonymizing your data will give you.

Anonymized Data

Let’s say that someone in one of those unknown firms wants to know who I am. All they have to do is drop my results into GedMatch and my name is right there, along with my e-mail.

Have a fake name at GedMatch? Well, think for a minute of the adoption search groups and how they identify people, sometimes very quickly and easily, by their matches.  Every day.

Not to mention, my children (and my parents, were they living) are very clearly identifiable utilizing my DNA. So while my DNA is mine, and legally belongs to me, it’s not entirely ONLY mine.

The promise of anonymized data by stripping out your identifying information has become somewhat of a hollow promise today. In a recent example, a cholesterol study volunteer recognized “herself” in a published paper, but was not notified of the results. In an earlier paper, several Y DNA volunteers were identified as well. Ironically, Dr. Erlich, who has since formed DNA.Land and is soliciting DNA uploads, was involved with this unmasking.

Knowing what I know today, I would NEVER have tested at 23andMe and I would have to think very long and hard about Ancestry. The hook that Ancestry has, of course, is all of those DNA plus matching trees.  Is having my anonymized DNA sold worth that?  I don’t really know.  For me, it’s too late for an Ancestry decision, because I’ve already tested there and you cannot opt out of having your anonymized data sold.

I already had an Ancestry subscription, but some testers don’t realize they have to have at least a minimum level subscription to receive all of the benefits of testing at Ancestry. That could certainly be a rude awakening – and unexpected when they purchased the test.  The $49 DNA base subscription is not available on Ancestry’s website either – you have to know about it and call support to purchase that level.  I’m sure most people simply purchase the normal subscription or do without.

One thing is for sure, our DNA is worth a lot of money to both research and Big Pharm, and apparently worth a lot of effort as well, given how many people are attempting to capture our DNA for sale.

In the past few weeks, there have been several new sites that have come online relative to autosomal DNA uploading and testing.

But before we talk about those, I’d like to take a moment for education.

The Sanger Survey

Sanger survey

I’d like to suggest that you take a few minutes to view the videos associated with the Sanger Institute DNA survey here. I think the videos do a good job of explaining at least some of the issues facing people about the usage of their DNA.  Of course, you have to take their survey to see the videos at each step – but it’s good food for thought and they do allow you to make comments.

So, please, take a few minutes for this survey before proceeding.

Genes and US

One of the first “sidebar” companies to appear, in September 2014, was at the site http://www.genesand.us/, which is now nonfunctional.

I took screen shots at that time, since I was going to write an article about what seemed quite interesting.

Genesandus

It was a free service that offered to “find the best genes that you can give to your child.” You had to test at 23andMe, then upload both your and your partner’s raw DNA files, and they would provide you with results.

I did just that, and the screen shot below shows the partial results. There were several pages.

Genesandus1

At the end of this section was a question asking if I wanted to “speak to a doctor about any of these benefits.” I didn’t, but I did want to know if gene selection was actually possible and being implemented.  I found the site’s contact information and sent this e-mail, which was never answered.

genesandus2

So let me ask you…where is my and my husband’s DNA today? I uploaded it.  Who has it?  Was this just a ploy to obtain our DNA files?  And for what purpose?  Who were these people anyway?  They are gone without a trace today.

DNA.Land

More recently, in the fall of 2015, DNA.Land came upon the scene.

As of today, 22,000+ people have uploaded their autosomal DNA files.

dna.land

What does DNA.Land offer the genealogist?

A different organization’s view of your ethnicity as well as relative matching to others who upload.

The quality and reliability of these enticements offered by companies in exchange for our DNA files may vary widely. For example, when DNA.Land launched, their matching routine didn’t find immediate family members.  No product should ever be launched in an alpha state, which calls into question the quality of the rest of their products and research.  That matching problem has reportedly been fixed.

The second enticement they offer is an ethnicity tool.

I can’t show you my example, because I have not uploaded my DNA to DNA.Land.   However, a genetic genealogy colleague conducted an interesting experiment.

TL Dixon uploaded four DNA files in late April 2016. He tested twice at 23andMe, both tests being the v3 version, and twice at Ancestry, in 2012 and 2014, and uploaded all 4 files to DNA.Land to see what the results would be, comparatively.

TL 23andMe test 1

23andMe v3 test 1

TL 23andme test 2

23andMe v3 test 2

TL Ancestry test 1 2014

Ancestry test from 2014

TL Ancestry test 2 2012

Ancestry test from 2012

We all know that ethnicity testing as a whole is not terribly reliable, but is the most reliable on the continent level, meaning Africa vs Europe vs Asia vs Native American. Even though the two Ancestry raw data files are from the same testing company, on the same chip platform, for the same person, the Ancestry 2012 and 2014 ethnicity results from DNA.Land are quite different from each other relative to African vs Eurasian DNA, and the 2012 results also differ from the 23andMe results – even at the continent level.  Said another way, both 23andMe results and the Ancestry 2014 results are very similar, with the Ancestry 2012 test, shown last, being the outlier.

Thanks to TL Dixon for both his multiple testing and sharing his results. According to TL’s known family history, the two 23andMe kits and the Ancestry 2014 kit are closest to accurate.  Just as an aside, TL, surprised by the differing results, utilized David Pike’s utilities to compare the two Ancestry files to see if one had a problem. They were both very similar, so the difference does not appear to be in the Ancestry kits themselves, which suggests the difference has to be at DNA.Land.

So, what I’m saying is that DNA.Land’s enticement of a different company’s view of ethnicity, even after several months, and even at the continent level, still needs work. This along with the original matching issue calls into question the quality of some of the enticements that are being used to attract DNA donors.  We should consider this not only at this site, but at others that provide enticement or “free” services or goodies as well.  Uploaders beware!

While the non-profit status of DNA.Land along with their verbiage leads people to believe that their work is entirely charitable, it is not, as reflected in this sentence from their consent information.

I understand that the research in this study may lead to new products, research tools, or inventions that have financial value. By accepting the terms of this consent, I understand that I will not be able to share in the profits from future commercialization of products developed from this study.

At least they are transparent about this, assuming you actually read all of the information provided on the site – which you should do with every site.

My Heritage Adds DNA Matching

This past week, My Heritage, a company headquartered in Israel, announced that it has added autosomal DNA matching. Some people think this is great, and others not so much.

MyHeritage

My Heritage, like Ancestry, is a subscription site. I happen to already be a member, so I was initially pretty excited about this, especially when I saw this in their blog.

Your DNA data will be kept private and secure on MyHeritage.

Our service will then match you to other people who share DNA with you: your relatives through a common ancestor. You will be able to review your matches’ family trees (excluding living people), and filter your matches by common surnames or geographies to focus on more relevant matches.

And also:

Who has access to the DNA data?

Only you do. Nobody else can see it, and nobody can even know that it was uploaded. Only the uploader can see the data, and you can delete it at any time. Users who are matched with your DNA will not have access to your DNA or your email address, but will be able to get in touch with you via MyHeritage.

I was thinking this might be a great opportunity, perhaps similar to the Ancestry trees, although they don’t say anything about tree matching.

However, their Terms of Service are not available to view unless you pretend to start an upload of your DNA (thanks for this tip, Ann Turner), and then the “Terms of Service” and “Consent Agreement” links become available to view. They should be available for everyone BEFORE you start your upload.

On the MyHeritage main site, you’ll see DNA matching at the top. I’m a member, so, if you’re not a member, your “main site” may look different.

MyHeritage1

Click on “learn more” on the DNA Matching tab.

MyHeritage2

Step two shows you two boxes saying you have read the DNA Terms of Use and Consent Agreement. Don’t just click through these – read them.  Not just at this vendor, at all vendors.

In the required DNA Terms of Use we find this in the 5th paragraph:

By submitting DNA Results to the Website, you grant MyHeritage a perpetual, royalty-free, world-wide, transferable license to use your DNA Results, and any DNA Results you submit for any person from whom you obtained legal authorization as described in this Agreement, and to use, host, sublicense and distribute the resulting analysis to the extent and in the form or context we deem appropriate on or through any media or medium and with any technology or devices now known or hereafter developed or discovered.

And this in item 7:

c. We may transfer, lease, rent, sell, share and/or or otherwise distribute de-identified information to third parties for any purpose, including without limitation, internal business purposes. Whenever we transfer, lease, rent, sell, share and/or or otherwise distribute your information to third parties, this information will be aggregated and personal identifiers (such as names, birth dates, etc.) will be removed.

In the optional Informed Consent agreement, we find this:

The Project collects, preserves and analyzes genealogical lineage, historical records, surveys, genetic information, and other records (collectively, “Research Information“) provided by users in order to conduct research studies to better understand, among other things, human evolution and migration, population genetics, regional health issues, ethnographic diversity and boundaries, genealogy and the history of the human species. Researchers hope that the Project will be an invaluable tool for a wide range of scholars and researchers interested in genealogy, anthropology, evolution, languages, cultures, medicine, and other topics and that the Project may benefit future generations. Discoveries made as a result of the Project may be used in the study of genealogy, anthropology, population genetics, population health issues, cultures, trends (for example, to identify health risks or spread of certain diseases), and other related topics. If we or a third party wants to conduct a study (1) on topics unrelated to the Project, or (2) using Research Information beyond what is described in this Informed Consent, we will re-contact you to seek your specific approval. In addition, we may contact you to ask you to complete a questionnaire or to ask you if you are willing to be interviewed about the Project or other matters.

  1. What are the costs and will I receive compensation? MyHeritage will not charge participants any fees in order to be part of the Project. There will be no financial compensation paid to Project participants. The data you share with us for the Project may benefit researchers and others in the future. If any commercial product is developed as a result of the Project or its outcomes, there will be no financial benefit to you.

You can’t see the terms of use or consent agreement unless you are in the process of uploading your DNA and, in addition, it appears that your DNA data is automatically available in anonymized fashion to third parties. The terms of service and informed consent language above do not seem to correlate with the marketing information, which states that “nobody else” can see your data.

The other thing that’s NOT obvious, is that you don’t HAVE to click the box on the Consent Agreement, but you do HAVE to click the box on the DNA Terms of Use.

If you are not alright with the entirety of the DNA Terms of Use, which is required, do not upload your DNA file to My Heritage.  If you are not alright with the Consent Agreement, don’t click the box.  Judy Russell wrote a detailed article about the terms here.

Uploading your DNA to MyHeritage is free today, but may be a pay service later. It is unclear whether a subscription is required today, or will be in the future.  However, at one time one could upload a family tree of up to 250 people to MyHeritage for free through 23andMe.  Larger files were accepted, but were only free for a certain time period and now the person whose tree was larger than 250 people and who did not subscribe is locked out of their account.  They can’t delete their larger-than-250 person tree unless they purchase a subscription.  It’s unclear what the future holds for DNA uploads, trees and subscriptions as well.

I have not uploaded my DNA to MyHeritage either, based on 7c. It would appear that even if you don’t give consent for additional “research information” to be collected and provided, they can still sell your anonymized DNA.

WeGene

WeGene

Very recently, a new company, WeGene at http://www.wegene.com has begun DNA testing focused on the Chinese marketplace.

Their website is in Chinese, but Google translates it, at least nominally, as does Chrome.

WeGene1

WeGene2

It does not appear that WeGene does matching between their customers, or if they do, I’ve missed it in the translations.

You can, however, upload at least 23andMe files to WeGene. I can’t tell about Family Tree DNA and Ancestry files.  Unless you have direct and fairly recent Chinese ancestry, I don’t know what the benefit would be.

Their privacy and security, such as it is, is at this link, although obviously autotranslated. Some people seem to have found other verbiage as well.  Navigating their site, written in Chinese, is very difficult and the accuracy of the autotranslation is questionable, at best.

Their autosomal DNA file is obviously available for download, because GedMatch now accepts these files.

I am certainly not uploading my DNA to WeGene, for numerous reasons.

Vendor Summary

This vendor summary was more difficult to put together than I thought it would be – in part because I am not a new user at either Ancestry or 23andMe and obviously can’t see what a new user would see on any of my accounts. Furthermore, Ancestry in particular has several documents that refer back and forth to each other, and let’s just say they are written more for the legal mind than the typical consumer.

vendor summary

* – Both 23andMe and Ancestry appear to utilize all clients’ DNA for anonymized distribution, but not for identified distribution without an individual opt-in.

*1 – According to the 23andMe Privacy Policy, although you can opt in to the higher level of research testing where your identity is not removed, you cannot opt out of the anonymized level of DNA sharing/sale. Please review current 23andMe documentation before making a decision.

*2 – Can Opt in or Opt out.

*3 – Can opt out of non-anonymized sales, but not anonymized sales. Please verify utilizing the current Ancestry documents before making a decision.

*4 – DNA.land indicates that you can withdraw consent, but does not say anything about deleting your DNA file.

*5 – DNA.Land states in their consent agreement that they will not provide identified DNA information without first contacting you.

*6 – At 23andMe, deleting DNA from the database closes the account.

*7 – Automatically opted in for anonymized sales/sharing, but must opt in for identified DNA sharing.

*8 – 23andMe has experienced and continues to experience significant difficulties, and at this point is not considered a viable genetic genealogy option by many, or stated another way, it would be the last choice of the main three testing companies.

*9 – All legal action must be brought in Tel Aviv, Israel, individually, and not as a class action suit, according to item 9 in the DNA Terms of Use document.

*10 – Website in Chinese, information through an automated English translator, so the information provided here is necessarily incomplete and may not be entirely accurate.

Please note that any or all of these factors are subject to change over time and the vendors’ documents should be consulted and read thoroughly at the time any decision is being made.

Please note that at some vendors there are many different documents that cross-reference each other. They are confusing and should all be read before any decision is made.

And of course, some vendors’ websites aren’t even in English.

Points to Consider

While these companies are the ones that have come to the forefront in the past few months, there will assuredly be more as this industry develops. Here is a list of things for you to think about and points to consider that may help you make your decision about whether you want to either test or upload your autosomal DNA with any particular company.  After all, your autosomal DNA file does contain that obviously much-sought-after medical information.

First, always read every document on a vendor site that says anything like “Terms of Use,” “Security and Privacy” or “Terms of Service” or “Informed Consent.” Many times the fine print is spread throughout several documents that reference each other.  If their policy does not say specifically, do NOT assume.

Also be aware that the verbiage of most companies says they can change their rules of engagement at any time without notification.

Here are the questions you may want to consider as you read these documents.

  • Does the company or organization sell or share your data?
  • Is the data that is sold or shared anonymized or nonanonymized, understanding that really no one is truly anonymous anymore?
  • Who do they sell your data to?
  • For what purpose?
  • Do you have the opportunity to authorize your DNA’s involvement per study?
  • If you do not live in the same country as the company with whom you are doing business, what recourse do you have to enforce any agreement?
  • How do you feel about your DNA being in the hands of either organizations or companies you don’t know for purposes you don’t know?
  • Are you asked up front if you want to participate?
  • Can you opt out of your DNA being shared or sold entirely from the beginning?
  • Can you opt out of your DNA being shared or sold entirely at any time if you have initially opted in?
  • Do you receive the opportunity to opt in, or are you automatically opted in?
  • If you are automatically opted in, do you get the opportunity, right then, to opt out, or only if you happen to discover the situation? And if you can opt out immediately, are you only able to opt out of non-anonymized data or can you opt out entirely?
  • Is the company up front and transparent about what they are doing with your DNA or do you have to dig to unearth the truth?
  • If you already tested, and gave up rights, were you aware that you did so, and do you understand if or how you can rescind that inadvertent authorization?
  • Do you have to dig for the terms of service and are they as represented in the marketing literature?
  • Do you feel like you are giving truly informed consent and understand what can and will happen to your DNA, and what your options are if you change your mind, and how to exercise those options? Are you comfortable with those options and the approach of the company towards DNA sale as a whole? Were they forthright?
  • For companies like MyHeritage and Ancestry, are there other unknown “gotchas” like a subscription being required in addition to testing or uploading to obtain the full benefits of the test or upload?
  • What happens to your DNA if the company no longer exists or goes out of business? For two examples, look at the Sorenson and Ancestry Y and mtDNA results. This is certainly not what any consumer or tester expected. Not to mention, I’m left wondering where my DNA submitted to genesandus is today.
  • Who owns the company?  What are their names?  Where can you find them?  What is the address of the company?  What does Google have to say about the owners or management?  LinkedIn?  Facebook?  If there is absolutely no history, that’s probably as damning as a bad history.  No one can exist today in a professional capacity and have no history.  Just saying.
  • Is the company acting in any way that would cause you not to trust them, their motives or agenda?  As my mother used to say, the best predictor of future behavior is past behavior.

Near and Dear to My Heart

I have family members who work in the medical field in various capacities. I also have family members who have or have had genetically heritable conditions and like everyone else, I would love to see those diseases cured.  My reticence to donate my DNA to whomever for whatever is not a result of being heartless.  It’s a function of wanting to be in control of who profits with/from my DNA and that of my family.

Let me share a personal story with you.

My brother died of cancer in 2012. He went for chemo treatments every two weeks, and before he could have his chemo treatment, he had to have bloodwork to assure that his system was able to handle the next dose of chemo.

If his white cell count was below a certain threshold, a shot of a drug called Neulasta was available to him to stimulate his body to increase the white blood cells. The shots were $8000 apiece.  And no, that is not a typo.  $8000!  His insurance did not cover the shots, because as far as they were concerned, he could just wait until his white cell numbers increased of their own accord and have the chemo then.  Of course, delaying the chemo decreased his chances of survival.

Over the course of his chemo, he had to have three of these $8000 shots. Fortunately, he did have the money to pay, although he did have to reschedule his appointment because he was required to bring a cashier’s check with the full payment in advance before the clinic would administer the shot.  After that, he simply carried an $8000 cashier’s check to each appointment, just in case.

I do not for one minute believe that those shots COST $8000 to manufacture, but I do believe that the pharmaceutical industry could, would and does CHARGE $8000 to desperate patients in order to continue the chemo that is their only hope of life. For those whose insurance pays, it’s entirely irrelevant. For those whose insurance does not pay, it’s a matter of life and death.  And yes, I’m equally as angry with the insurance company, but they aren’t the ones asking me to donate my DNA.

So, as for my DNA, no Big Pharm company will ever get their hands on it if there is ANYTHING I can do about it – although it’s probably too late now since I have tested with both 23andMe and Ancestry, who do not allow you to opt out entirely. I wish I had known before I tested.  At least I would have been giving informed consent, which was not the case.

Consequently, I want to know who is doing what with my DNA, so that I have the option of participating or not – and I want to know up front – and I don’t want it hidden in fine print with the company hoping I’ll just “click through” and never read the documentation. I don’t want it to be intentionally or unintentionally confusing, and I want unquestionable full disclosure – ahead of time.  Is that too much to ask?

My brother had the money for the shots, and he died anyway, but can you imagine being the family of someone who did not have $24,000?

And if you think for one minute that Big Pharm won’t do that, consider Turing Pharmaceuticals CEO Martin Shkreli, dubbed “the most hated man in America” in September 2015 for gouging patients dependent on a drug used for HIV and cancer treatment by raising the price from $13.50 per pill to $750 for the same pill, an increase of roughly 5,456% (more than 55 times the original price) – because he could.

Medical research to cure disease I’m supportive of in terms of DNA donation, but not designer babies and not Big Pharm – and today there seems to be no way to separate the bad from the good or to determine who our DNA is being sold to for what purpose. Worse yet, some medical research is funded by Big Pharm, so it’s hard to determine which medical research is independent and which is not.

The companies selling our DNA and Big Pharm are the only people who stand to benefit financially from that arrangement – and they stand to benefit substantially from our contributions by encouraging us to “help science.” We’ll never know if a study our donated DNA was used for produced a new drug – and if it’s one we can’t afford, you can bet the pharmaceutical industry and manufacturers care not one whit that we were one of the people who donated our DNA so they could develop the drug we can’t afford.  If any industry should not be soliciting free DNA donations for research, Big Pharm is that industry with their jaw-dropping profits.

So, How Much is Our DNA Worth Anyway?

I don’t know, directly, but we can get some idea from the deal that 23andMe struck with pharmaceutical company Genentech, the US unit of Swiss drug company, Roche, in January 2015, as reported by Forbes.

Quoting now, directly from the Forbes article:

According to sources close to the deal, 23andMe is receiving an upfront payment from Genentech of $10 million, with further milestones of as much as $50 million. The deal is the first of ten 23andMe says it has signed with large pharmaceutical and biotech companies.

Such deals, which make use of the database created by customers who have bought 23andMe’s DNA test kits and donated their genetic and health data for research, could be a far more significant opportunity than 23andMe’s primary business of selling the DNA kits to consumers. Since it was founded in 2006, 23andMe has collected data from 800,000 customers and it sells its tests for $99 each. That means this single deal with one large drug company could generate almost as much revenue as doubling 23andMe’s customer base.

The article further says that the drug company was particularly interested in the 12,000 Parkinson’s patients and 1,300 of their parents and siblings who had provided family information. Ten million divided by 13,300 means Genentech was willing to pay about $750 for each person’s DNA, out the door.  So the tester paid 23andMe $99 or upwards, depending on when they tested ($1,000 before September 2008, when the test dropped to $399), and then 23andMe made another $750 per kit from the tester’s donated DNA results.
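For anyone who wants to check that division, here is a minimal sketch of the arithmetic. The dollar figure and head counts are the ones quoted from Forbes above; the variable and function names are just mine for illustration.

```python
# Figures quoted from the Forbes article above.
UPFRONT_PAYMENT = 10_000_000   # dollars paid up front by Genentech
PARKINSONS_PATIENTS = 12_000
PARENTS_AND_SIBLINGS = 1_300


def payment_per_person(total_dollars, people):
    """Spread a payment evenly across the people whose data is involved."""
    return total_dollars / people


people = PARKINSONS_PATIENTS + PARENTS_AND_SIBLINGS
per_person = payment_per_person(UPFRONT_PAYMENT, people)
print(f"{people:,} people -> ${per_person:,.2f} each")
```

That works out to just under $752 per person, which is where the rounded $750 figure comes from.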

And that’s before the additional $50 million and the other deals 23andMe and the other DNA-sellers have struck with Big Pharm. So yes indeed, our DNA is worth a lot.

It’s no wonder so many people are trying to find a way to entice us to donate our results so they can sell them. In fact, it’s a wonder, and a testament to their integrity, that there is ANY company with access to our DNA results that isn’t selling them.  There are only two such companies, plus the Genographic Project.

Who Doesn’t Share or Sell Your Autosomal DNA?

Of the major companies, organizations and sites, the only three, as best I can tell, that do not share or sell your autosomal DNA (or reserve the right to do so) and specifically state that they do not are National Geographic’s Genographic Project, Family Tree DNA and GedMatch.

Of those three, Family Tree DNA, a subsidiary of Gene by Gene, is the only testing company and says the following:

Gene by Gene collects, processes, stores and shares your Personal Information in a responsible, transparent and secure environment that fosters our customers’ trust and confidence. To that end, Gene by Gene respects your privacy and will not sell or rent your Personal Information without your consent.

National Geographic utilizes Family Tree DNA for testing, and the worst thing I could find in their privacy policy is that they will share:

  • with other selected third parties so that they may send you promotional materials about goods and services that they offer. You have the opportunity to opt out of our sharing information about you as described below in the section entitled “Your Choices”;
  • in accordance with your consent.

Nothing problematic here.

Your Genographic DNA file is only uploadable to Family Tree DNA and Nat Geo does not accept uploaded data from other vendors.

GedMatch, which allows users to upload their raw data files from the major testing companies for comparison, says the following:

It is our policy to never provide your genealogy, DNA information, or email address to 3rd parties, except as noted above.

Please refer to the entire documents from these organizations for details.

Serious genealogists have probably already uploaded to GedMatch and tested at or uploaded to Family Tree DNA as well, so people are unlikely to find new matches at new sites that aren’t already in one of these two places.

To Be Clear

I just want to make sure there is no confusion about which type of companies we’ve been referencing, and who is excluded, and why. The only companies or organizations this article applies to are those who have access to your raw data autosomal DNA file.  Those would be either the companies who test your autosomal DNA (National Geographic, Family Tree DNA, Ancestry and 23andMe in the US and WeGenes in China), or if you download your raw data file from those companies and upload it to another company, organization or location, as discussed in this article.  The companies and organizations discussed may not be the only firms or organizations to which you can upload your autosomal DNA file today, and assuredly, there will be more in the future.

The line in the sand is that autosomal DNA file. Not your Y DNA, not your mitochondrial DNA, not your match list – just that raw data file – that’s what contains your DNA information that the medical and pharmaceutical industry seeks and is willing to pay handsomely to obtain.

There are other companies and organizations that offer helpful tools for autosomal DNA analysis and tree integration, but you do NOT upload your raw data file to those sites. Those sites would include sites like www.dnagedcom.com and www.wikitree.com. I want to be sure no one confuses sites that do NOT upload or solicit the upload of your raw autosomal DNA files with those that do.  I have not discussed these sites that do not upload your autosomal DNA files because they are not relevant to this discussion.

This article does not pertain to sites that do not utilize or have access to your autosomal raw data file – only those that do.

Summary

As the number of DNA testing consumers rises, the number of potential targets for DNA sales into the medical/pharmaceutical field rises equally, as does the number of targets for scammers.

Along with that, I increasingly feel like my ancestors and the data available through my DNA about my ancestors, specifically ethnicity since everyone seems to be looking for a better answer, is being used as bait to obtain my DNA for companies with a hidden, or less than obvious, agenda – that being to obtain my DNA for subsequent sale.

I greatly appreciate that the Genographic Project, Family Tree DNA and GedMatch, organizations that either test or accept autosomal file uploads, do not sell my DNA, and I hope that they are not forced into that position economically in order to survive. It’s quite obvious that there is significant money to be made from the sale of massive amounts of DNA to the medical and pharmaceutical communities.  They alone have resisted that temptation and stayed true to the cause of the study of indigenous cultures and population genetics in the case of Nat Geo, and genetic genealogy, and only genetic genealogy, in the case of Family Tree DNA and GedMatch.

In other words, just because you can doesn’t mean you should.

Frankly, I believe selling our data is fundamentally wrong unless that information is abundantly clear, as in truly informed consent as defined by the Office for Human Research Protections, in advance of purchasing (or uploading) the test, and not simply a required “click through box” that says you read something. I would be much more likely to participate in anything that was straightforward rather than something that was hidden or not straightforward, like perhaps the company or organization was hoping we wouldn’t notice, or we would automatically click the box without reading further, thinking we have no other option.

The notice needs to say something on the order of, “I understand that my DNA is going to be sold, may be used for profit making ventures, and I cannot opt out if I order this DNA test,” if that is the case. That is truly informed consent – not a check box that says “I have read the Consent Document.”

Yes, the companies that sell DNA testing and our DNA results would probably receive far fewer orders, but those who would order would be truly informed and giving informed consent. Today, in the large majority of cases, I don’t believe that’s happening.

We need to be aware as consumers and make informed decisions. I’m not telling you whether you should or should not utilize these various companies and sites, or whether you should or should not participate in contributing your DNA to research, or at which level, if at all. That is a personal decision we all have to make.

But I will tell you that I think you need to educate yourself and be aware of these trends and issues in the industry so you can make a truly informed decision each and every time you consider sharing your DNA. And you should know that in some cases, your DNA is being sold and there is absolutely nothing you can do about it if you utilize the services of that company.

Above all, read all of the fine print.

Let me say that again, channeling my best Judy Russell voice.

ALWAYS, READ ALL OF THE FINE PRINT!!!

ALWAYS.
READ.
ALL.
OF.
THE.
FINE.
PRINT.

Unfortunately, things are not always as they seem on the surface.

If you see a click-through box, a red neon danger light should now start flashing in your brain and refuse to allow you to click on that box until you’ve done what? Read all the fine print.

There really is no such thing as a free lunch – so be judiciously suspicious.

I will leave you with the same thought relative to testing companies and upload opportunities that I said about companies selling our data. Just because you can doesn’t mean you should.

I think early in this game we all got excited and presumed the best about the motives of companies and organizations, like I did with both 23andMe and genesandus, but now we know better – and that there may be more to the story than initially meets the eye.

And besides that, we all know that presume is the first cousin to assume…and well, we all know where this is going.  And by the way, that’s exactly how I feel about genesandus who disappeared with my and my husband’s DNA.  I wasn’t nearly suspicious or judicious enough then…but I am now.

______________________________________________________________

Disclosure

I receive a small contribution when you click on some of the links to vendors in my articles. This does NOT increase the price you pay but helps me to keep the lights on and this informational blog free for everyone. Please click on the links in the articles or to the vendors below if you are purchasing products or DNA testing.

Thank you so much.

Ethnicity Testing – A Conundrum

Ethnicity results from DNA testing.  Fascinating.  Intriguing.  Frustrating.  Exciting.  Fun. Challenging.  Mysterious.  Enlightening.  And sometimes wrong.  These descriptions all fit.  Welcome to your personal conundrum!  The riddle of you!  If you’d like to understand why your ethnicity results might not have been what you expected, read on!

Today, about 50% of the people taking autosomal DNA tests purchase them for the ethnicity results. Ironically, that’s the least reliable aspect of DNA testing – but apparently somebody’s ad campaigns have been very effective.  After all, humans are curious creatures and inquiring minds want to know.  Who am I anyway?

I think a lot of people who aren’t necessarily interested in genealogy per se are interested in discovering their ethnic mix – and maybe for some it will be a doorway to more traditional genealogy because it will fan the flame of curiosity.

Given the increase in testing for ethnicity alone, I’m seeing a huge increase in people who are both confused by and disappointed in their results. And of course, there are a few who are thrilled, trading their lederhosen for a kilt because of their new discovery.  To put it gently, they might be a little premature in their celebration.

A lot of whether you’re happy or unhappy has to do with why you tested, your experience level and your expectations.

So, for all of you who could write an e-mail similar to this one that I received – this article is for you:

“I received my ethnicity results and I’m surprised and confused. I’m half German yet my ethnicity shows I’m from the British Isles and Scandinavia.  Then I tested my parents and their results don’t even resemble mine, nor are they accurate.  I should be roughly half of what they are, and based on the ethnicity report, it looks like I’m totally unrelated.  I realize my ethnicity is not just a matter of dividing my parents results by half, but we’re not even in the same countries.  How can I be from where they aren’t? How can I have significantly more, almost double, the Scandinavian DNA that they do combined?  And yes, I match them autosomally as a child so there is no question of paternity.”

Do not, and I repeat, DO NOT, trade in your lederhosen for a kilt just yet.

lederhosen kilt

Lederhosen – By The original uploader was Aquajazz at German Wikipedia – Transferred from de.wikipedia to Commons., CC BY-SA 2.0 de, https://commons.wikimedia.org/w/index.php?curid=2746036 Kilt – By Jongleur100 – Own work, Public Domain, https://commons.wikimedia.org/w/index.php?curid=7917180

This technology is not really ripe yet for that level of confidence except perhaps at the continent level and for people with Jewish heritage.

  1. In determining majority ethnicity at the continent level, these tests are quite accurate, but then you can determine the same thing by looking in the mirror.  I’m primarily of European heritage.  I can see that easily and don’t need a DNA test for that information.
  2. When comparing between continental ethnicity, meaning sorting African from European from Asian from Native American, these tests are relatively accurate, meaning there is sometimes a little bit of overlap, but not much.  I’m between 4 and 5% Native American and African – which I can’t see in the mirror – but some of these tests can.
  3. When dealing with intra-continent ethnicity – meaning Europe in particular, comparing one country or region to another, these tests are not reliable and in some cases, appear to be outright wrong. The exception here is Ashkenazi Jewish results which are generally quite accurate, especially at higher levels.

There are times when you seem to have too much of a particular ethnicity, and times when you seem to have too little.

Aside from the obvious adoption, misattributed parent or the oral history simply being wrong, the next question is why.

Ok, Why?

So glad you asked!

Part of why has to do with actual population mixing. Think about the history of Europe.  In fact, let’s just look at Germany.  Wiki provides a nice summary timeline.  Take a look, because you’ll see that the overarching theme is warfare and instability.  The borders changed, the rulers changed, invasions happened, and most importantly, the population changed.

Let’s just look at one event. The Thirty Years War (1618-1648) devastated the population, wiped out large portions of the countryside entirely, to the point that after its conclusion, parts of Germany were entirely depopulated for years.  The rulers invited people from other parts of Europe to come, settle and farm.  And they did just that.  Hear those words, other parts of Europe.

My ancestors found in the later 1600s along the Rhine near Speyer and Mannheim were some of those settlers, from Switzerland. Where were they from before Switzerland, before records?  We don’t know and we wouldn’t even know that much were it not for the early church records.

So, who are the Germans?

Who or where is the reference population that you would use to represent Germans?

If you match against a “German” population today, what does that mean, exactly? Who are you really matching?

Now think about who settled the British Isles.

Where did those people come from and who were they?

Well, the Anglo-Saxon people were made up of Germanic tribes, the Angles and the Saxons.  Is it any wonder that if your heritage is German you’re going to be matching some people from the British Isles and vice versa?

Anglo-Saxons weren’t the only people who settled in the British Isles. There were Vikings from Scandinavia and the Normans from France who were themselves “Norsemen” aka from the same stock as the Vikings.

See the swirl and the admixture? Is there any wonder that European intracontinental admixture is so confusing and perplexing today?

Reference Populations

The second challenge is obtaining valid and adequate reference populations.

Each company that offers ethnicity tests assembles a group of reference populations against which they compare your results to put you into a bucket or buckets.

Except, it’s not quite that easy.

When comparing highly disparate populations, meaning those whose common ancestor was tens of thousands of years ago, you can find significant differences in their DNA. Think the four major continental areas here – Africa, Europe, Asia, the Americas.

Major, unquestionable differences are much easier to discern and interpret.

However, within population groups, think Europe here, it is much more difficult.

To begin with, we don’t have much (if any) ancient DNA to compare to. So we don’t know what the Germanic, French, Norwegian, Scottish or Italian populations looked like in, let’s say, the year 1000.

We don’t know what they looked like in the year 500, or 2000 BC either, and based on what we do know about warfare and the movement of people within Europe, those populations in the same location could genetically look entirely different at different points in history. Think before and after the Thirty Years War.

population admixture

By User:MapMaster – Own work, CC BY-SA 2.5, https://commons.wikimedia.org/w/index.php?curid=1234669

As an example, consider the population of Hungary and the Slavic portion of Germany before and after the Mongol invasion of Europe in the 13th century and Hun invasions that occurred between the 1st and 5th centuries.  The invaders’ DNA didn’t go away, it became part of the local population and we find it in descendants today.  But how do we know it’s Hunnic and not “German,” whatever German used to be, or Hungarian, or Norse?

That’s what we do know.

Now, think about how much we don’t know. There is no reason to believe the admixture and intermixing of populations on any other continent that was inhabited was any different.  People will be people.  They have wars, they migrate, they fight with each other and they produce offspring.

We are one big mixing bowl.

Software

A third challenge faced in determining ethnicity is how to calculate and interpret matching.

Population based matching is what is known as “best fit.”  This means that with few exceptions, such as some D9S919 values (Native American), the Duffy Null Allele (African) and Neanderthal DNA not being found in African populations, all of the DNA sequences used for ethnicity matching are found in almost all populations worldwide, just at differing frequencies.

So assigning a specific “ethnicity” to you is a matter of finding the best fit – in other words which population you match at the highest frequency for the combined segments being measured.

Let’s say that the company you’re using has 50 people from each “grouping” that they are using for buckets.

A bucket is something you’ll be assigned to. Buckets sometimes resemble modern-day countries, but most often the testing companies try to be less boundary aligned and more population group aligned – like British Isles, or Eastern European, for example.

Ethnic regions

How does one decide which “country” goes where? That’s up to the company involved.  As a consumer, you need to read what the company publishes about their reference populations and their bucket assignment methodology.

ethnic country

For example, one company groups the Czech Republic and Poland in with Western Europe, another groups them primarily with Eastern Europe but partly in Western Europe, and a third puts Poland in Eastern Europe and doesn’t say where they group the Czech Republic. None of these are inherently right or wrong – just understand that they are different and you’re not necessarily comparing apples to apples.

Two Strands of DNA

In the past, we’ve discussed the fact that you have two strands of DNA and they don’t come with a Mom side and a Dad side, a zipper, or instructions that tell you which is Mom’s and which is Dad’s.  Not fair – but it’s what we have to work with.

When you match someone because your DNA is zigzagging back and forth between Mom’s and Dad’s DNA sides, that’s called identical by chance.

It’s certainly possible that the same thing can happen in population genetics – where two strands when combined “look like” and match to a population reference sample, by chance.

pop ref 3

In the example above, you can see that you received all As from Mom and all Cs from Dad, and the reference population matches the As and Cs by zigzagging back and forth between your parents.  In this case, your DNA would match that particular reference population, but your parents would not.  The matching is technically accurate, it’s just that the results aren’t relevant because you match by chance and not because you have an ancestor from that reference population.
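Here is a small sketch of that zigzag effect. It is a toy model, not how any vendor’s software actually works: I’m assuming, purely for illustration, that Mom carries A on both of her strands and Dad carries C on both of his at six locations, and that a “match” just means the reference letter is one of the two letters a person carries at each location. The names matches_reference, mom, dad and child are hypothetical.

```python
def matches_reference(genotype, reference):
    """True if, at every location, the reference letter is one of the
    two letters (one per strand) that the person carries there."""
    return all(ref in pair for pair, ref in zip(genotype, reference))


# Hypothetical reference sample that alternates between A and C.
reference = "ACACAC"

# Assumption for illustration: Mom is A/A and Dad is C/C at all six locations.
mom = [("A", "A")] * 6
dad = [("C", "C")] * 6

# The child gets one letter from each parent at every location.
child = [(m[0], d[0]) for m, d in zip(mom, dad)]

print("Child matches reference:", matches_reference(child, reference))
print("Mom matches reference:  ", matches_reference(mom, reference))
print("Dad matches reference:  ", matches_reference(dad, reference))
```

The child “matches” because the comparison is free to pick the A from Mom’s side at one location and the C from Dad’s side at the next, while neither parent alone can supply both letters.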

Finding The Right Bucket

Our DNA, as humans, is more than 99% the same.  The differences are where mutations have occurred, allowing population groups and individuals to look different from one another, along with other minor differences.  Understanding the degree of similarity makes the concept of “race” a bit outdated.

For genetic genealogy, it’s those differences we seek, both on a population level for ethnicity testing and on a personal level for identifying our ancestors based on who else our autosomal DNA matches who also has those same ancestors.

Let’s look at those differences that have occurred within population groups.

Let’s say that one particular sequence of your DNA is found in the following “bucket” groups in the following percentages:

  • Germany – 50%
  • British Isles – 25%
  • Scandinavian – 10%

What do you do with that? It’s the same DNA segment found in all of the populations.  As a company, do you assume German because it’s where the largest reference population is found?

And who are the Germans anyway?

Does all German DNA look alike? We already know the answer to that.

Are multiple ancestors contributing German ancestry from long ago, or are they German today or just a generation or two back in time?

And do you put this person in just the German bucket, or in the other buckets too, just at lower frequencies?  After all, buckets are cumulative in terms of figuring out your ethnicity.
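To make those two choices concrete, here is a rough sketch using the Germany, British Isles and Scandinavian percentages from the example. Both functions are hypothetical illustrations of the two options just described; I’m not claiming either one is what any testing company actually does.

```python
# Frequencies of one DNA sequence in each bucket, from the example above.
frequencies = {"Germany": 0.50, "British Isles": 0.25, "Scandinavian": 0.10}


def best_fit(freqs):
    """Option 1: put the whole segment in the single highest-frequency bucket."""
    return max(freqs, key=freqs.get)


def proportional(freqs):
    """Option 2: split the segment across all buckets in proportion
    to how often the sequence is found in each one."""
    total = sum(freqs.values())
    return {name: value / total for name, value in freqs.items()}


print("Single bucket:", best_fit(frequencies))
for name, share in proportional(frequencies).items():
    print(f"{name}: {share:.1%}")
```

The first option calls the segment entirely German. The second credits Germany with a bit under 59%, the British Isles with about 29% and Scandinavia with about 12%, which is a very different answer from exactly the same data.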

If there isn’t a reference population, then the software of course can’t match to that population and moves to find the “next best fit.”  Keep in mind too that some of these reference populations are very small and may not represent the range of genetic diversity found within the entire region they represent.

If your ancestors are Hungarian today, they may find themselves in a bucket entirely unrelated to Hungary if a Hungarian reference population isn’t available AND/OR if a reference population is available but it’s not relevant to your ancestry from your part of Hungary.

If you’d like a contemporary example to equate to this, just think of a major American city today and the ethnic neighborhoods. In Detroit, if someone went to the ethnic Polish neighborhood and took 50 samples, would that be reflective of all of Detroit?  How about the Italian neighborhood?  The German neighborhood?  You get the drift.  None of those are reflective of Detroit, or of Michigan or even of the US.  And if you don’t KNOW that you have a biased sample, the only “matches” you’ll receive are Polish matches and you’ll have no way to understand the results in context.

Furthermore, that ethnic neighborhood 50 or 100 years earlier or later in time might not be comprised of that ethnic group at all.

Based on this example, you might be trading in your lederhosen for a pierogi or a Paczki, which are both wonderful, but entirely irrelevant to you.

paczki

Real Life Examples

Probably the best example I can think of to illustrate this phenomenon is that at least a portion of the Germanic population and the Native American population both originated in a common population in central northern Asia.  That Asiatic population migrated both to Europe to the west and eventually, to the Americas via an eastern route through Beringia.  Today, as a result of that common population foundation, some Germanic people show trace amounts of “Native American” DNA.  Is it actually from a Native American?  Clearly not, since neither these people nor their ancestors have ever set foot in the Americas, nor are they coastal.  However, the common genetic “signature” remains today and is occasionally detected in Germanic and eastern European people.

If you’re saying, “no, not possible,” remember for a minute that everyone in Europe carries some Neanderthal DNA from a population believed to be “extinct” now for between 25,000 and 40,000 years, depending on whose estimates you use and how you measure “extinct.”  Neanderthals aren’t extinct, they have evolved into us.  They assimilated, whether by choice or force is unknown, but the fact remains that they did because they are a forever part of Europeans, most Asians and yes, Native Americans today.

Back to You

So how can you judge the relevance or accuracy of this information aside from looking in the mirror?

Because I have been a genealogist for decades now, I have an extensive pedigree chart that I can use to judge the ethnicity predictions relatively accurately. I created an “expected” set of percentages here and then compared them to my real results from the testing companies.  This paper details the process I used.  You can easily do the same thing.

Part of how happy or unhappy you will be is based on your goals and expectations for ethnicity testing. If you want a definitive black and white, 100% accurate answer, you’re probably going to be unhappy, or you’ll be happy only because you don’t know enough about the topic to know you should be unhappy.  If you test with only one company, accept their results as gospel and go merrily on your way, you’ll never know that had you tested elsewhere, you’d probably have received a somewhat different answer.

If you’re scratching your head, wondering which one is right, join the party.  Perhaps, except for obvious outliers, they are all right.

If you know your pedigree pretty well and you’re testing for general interest, then you’ll be fine because you have a measuring stick against which to evaluate the results.

I found it fun to test with all 4 vendors, meaning Family Tree DNA, 23andMe and Ancestry along with the Genographic Project, and compare their results.

In my case, I was specifically interested in ascertaining minority admixture and determining which line or lines it descended from. This means both Native American and African.

You can do this too and then upload your raw data file to www.gedmatch.com and utilize their admixture utilities.

GedMatch admix menu

At GedMatch, there are several versions of various contributed admixture/ethnicity tools for you to use. The authors of these tools have in essence done the same thing the testing companies have done – compiled reference populations of their choosing and compared your results in a specific manner as determined by the software written by that author.  They all vary.  They are free.  Your mileage can and will vary too!

By comparing the results, you can clearly see the effects of including or omitting specific populations. You’ll come away wondering how they could all be measuring the same you, but it’s an incredibly eye-opening experience.

The Exceptions and Minority Ancestry

You know, there is always an exception to every rule and this is no exception to the exception rule. (Sorry, I couldn’t resist.)

By and large, the majority continental ancestry will be the most accurate, but it’s the minority ancestry many testers are seeking.  That which we cannot see in the mirror and may be obscured in written records as well, if any records existed at all.

Let me say very clearly that when you are looking for minority ancestry, the lack of that ancestry appearing in these tests does NOT prove that it doesn’t exist. You can’t prove a negative.  It may mean that it’s just too far back in time to show, or that the DNA in that bucket has “washed out” of your line, or that we just don’t recognize enough of that kind of DNA today because we need a larger reference population.  These tests will improve with time and all 3 major vendors update the results of those who tested with them when they have new releases of their ethnicity software.

Think about it – who is 100% Native American today that we can use as a reference population?  Are Native people from North and South America the same genetically?  And let’s not forget the tribes in the US do not view DNA testing favorably.  To say we have challenges understanding the genetic makeup and migrations of the Native population is an understatement – yet those are the answers so many people seek.

Aside from obtaining more reference samples, what are the challenges?

There are two factors at play.

Recombination – the “Washing Out” Factor

First, your DNA is divided in half with every generation, meaning that on average only about half of the DNA a parent carries from any given ancestor is passed on to the child.  Now in reality, half is an average and it doesn’t always work that way.  You may inherit an entire segment of an ancestor’s DNA, or none at all, instead of half.

I’ve graphed the “washing out factor” below and you can see that within a few generations, if you have only one Native or African ancestor, their DNA is found in such small percentages, assuming a 50% inheritance or recombination rate, that it won’t be found above 1% which is the threshold used by most testing companies.

Wash out factor 2

Therefore, the ethnicity of any ancestor born 7 generations ago, or before about 1780 may not be detectable.  This is why the testing companies say these tests are effective to about the rough threshold of 5 or 6 generations.  In reality, there is no line in the sand.  If you have received more than 50% of that ancestor’s DNA, or a particularly large segment, it may be detectable at further distances.  If you received less, it may be undetectable at closer distances.  It’s the roll of the DNA dice in every generation between them and you.  This is also why it’s important to test parents and other family members – they may well have received DNA that you didn’t that helps to illuminate your ancestry.
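If you’d like to reproduce the “washing out” numbers yourself, here is a minimal sketch. It assumes exactly 50% is passed on in every generation, which, as noted above, is only an average, and it uses the 1% reporting threshold mentioned earlier. Generation 1 is your parents. The function names are mine, for illustration only.

```python
REPORTING_THRESHOLD = 1.0  # percent, the cutoff most testing companies are said to use


def expected_percent(generations_back):
    """Average percent of your DNA from one ancestor that many generations
    back, assuming exactly half is passed on each time (a simplification)."""
    return 100 / 2 ** generations_back


def first_generation_below(threshold):
    """First generation at which one ancestor's expected share falls
    below the given percent threshold."""
    generation = 1
    while expected_percent(generation) >= threshold:
        generation += 1
    return generation


for generation in range(1, 9):
    print(f"{generation} generations back: {expected_percent(generation):.2f}%")

print("Falls below threshold at generation",
      first_generation_below(REPORTING_THRESHOLD))
```

At 6 generations the expected share is 1.56%, and at 7 generations it drops to 0.78%, which is under the 1% threshold. Remember that this is the average, so your actual share from any one ancestor can be higher or lower.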

Recombination – Population Admixture – the “Keeping In” Factor

The second factor at play here is population admixture which works exactly the opposite of the “washing out” factor. It’s the “keeping in” factor.  While recombination, the “washing out” factor, removes DNA in every generation, the population admixture “keeping in” factor makes sure that ancestral DNA stays in the mix. So yes, those two natural factors are kind of working at cross purposes and you can rest assured that both are at play in your DNA at some level.  Kind of a mean trick of nature isn’t it!

The population admixture factor, known as IBP, or identical by population, happens when identical DNA is found in an entire or a large population segment – which is exactly what ethnicity software is looking for – but the problem is that when you’re measuring the expected amount of DNA in your pedigree chart, you have no idea how to allow for endogamy and population based admixture from the past.

Endogamy IBP

This example shows that both Mom and Dad have the exact same DNA, because at these locations, that’s what this endogamous population carries.  Therefore the child carries this DNA too, because there isn’t any other DNA to inherit.  The ethnicity software looks for this matching string and equates it to this particular population.

Like Neanderthal DNA, population based admixture doesn’t really divide or wash out, because it’s found in the majority of that particular population and as long as that population is marrying within itself, those segments are preserved forever and just get passed around and around – because it’s the same DNA segment and most of the population carries it.

This is why Ashkenazi Jewish people have so many autosomal matches – they all descend from a common founding population and did not marry outside of the Jewish community.  This is also why a few contemporary living people with Native American heritage match the ancient Anzick Child at levels we would expect to see in genealogically related people within a few generations.

Small amounts of admixture, especially unexpected admixture, should be taken with a grain of salt. It could be noise or in the case of someone with both Native American and Germanic or Eastern European heritage, “Native American” could actually be Germanic in terms of who you inherited that segment from.

Have unexpected small percentages of Middle Eastern ethnic results?  Remember, the Mesolithic and Neolithic farmer expansion arrived in Europe from the Middle East some 7,000 – 12,000 years ago.  If Europeans and Asians can carry Neanderthal DNA from 25,000-45,000 years ago, there is no reason why you couldn’t match a Middle Eastern population in small amounts from 3,000, 7,000 or 12,000 years ago for the same historic reasons.

The Middle East is the supreme continental mixing bowl as well, the only location worldwide where historically we see Asian, European and African DNA intermixed in the same location.

Best stated, we just don’t know why you might carry small amounts of unexplained regional ethnic DNA.  There are several possibilities that include an inadequate population reference base, an inadequate understanding of population migration, quirks in matching software, identical segments by chance, noise, or real ancient or more modern DNA from a population group of your ancestors.

Using Minority Admixture to Your Advantage

Having said that, in my case and in the cases of others who have been willing to do the work, you can sometimes track specific admixture to specific ancestors using a combination of ethnicity testing and triangulation.

You cannot do this at Ancestry because they don’t give you ANY segment information.

Family Tree DNA and 23andMe both provide you with segment information, but not for ethnicity ranges without utilizing additional tools.

The easiest approach, by far, is to download your autosomal results to GedMatch and utilize their tools to determine the segment ranges of your minority admixture segments, then utilize that information to see which of your matches on that segment also have the same minority admixture on that same chromosome segment.
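
The overlap check at the heart of this approach can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the segment positions and cousin names are made up, the function names are hypothetical, and nothing here is GedMatch output or a GedMatch interface. It simply asks which matches have a minority admixture segment on the same chromosome region as yours.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    chromosome: int
    start: int  # start position in base pairs
    end: int    # end position in base pairs


def overlap(a: Segment, b: Segment) -> int:
    """Return how many base pairs two segments share (0 if they don't overlap)."""
    if a.chromosome != b.chromosome:
        return 0
    return max(0, min(a.end, b.end) - max(a.start, b.start))


def matches_sharing_segment(mine, others, min_overlap=1_000_000):
    """Names of matches whose minority admixture segments overlap `mine`."""
    return sorted(
        name
        for name, segments in others.items()
        if any(overlap(mine, seg) >= min_overlap for seg in segments)
    )


# Made-up example data, for illustration only.
my_native = Segment(2, 80_000_000, 90_000_000)
cousin_segments = {
    "Cousin A": [Segment(2, 82_000_000, 88_000_000)],  # same region
    "Cousin B": [Segment(2, 95_000_000, 99_000_000)],  # same chromosome, no overlap
    "Cousin C": [Segment(5, 80_000_000, 90_000_000)],  # different chromosome
}

shared = matches_sharing_segment(my_native, cousin_segments)
print(shared)
```

The `min_overlap` threshold of one million base pairs is an arbitrary choice for the sketch; in practice you would pick a threshold that suits your own research and the tools you are using.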

I wrote a several-part series detailing how I did this, called The Autosomal Me.

Let me sum the process up thus. I expected my largest Native segments to be on my father’s side.  They weren’t.  In fact, they were from my mother’s Acadian lines, probably because endogamy maintained (“kept in”) those Native segments in that population group for generations.  Thank you endogamy, aka, IBP, identical by population.

I made this discovery by discerning that my specifically identified Native segments matched my mother’s segments, also identified as Native, in exactly the same location, so I had obviously received those Native segments from her. Continuing to compare those segments and looking at GedMatch to see which of our cousins also had a match (to us) in that region pointed me to which ancestral line the Native segment had descended from.  Mitochondrial and Y DNA testing of those Acadian lines confirmed the Native ancestors.

That’s A Lot of Work!!!

Yes, it was, but well, well worth it.

This would be a good time to mention that I couldn’t have proven those connections without the cooperation of several cousins who agreed to test along with cousins I found because they tested, combined with the Mothers of Acadia and the AmerIndian Ancestry out of Acadia projects hosted by Family Tree DNA and the tools at GedMatch.  I am forever grateful to all those people because without the sharing and cooperation that occurs, we couldn’t do genetic genealogy at all.

If you want to be amused and perhaps trade your lederhosen for a kilt, then you can just take ethnicity results at face value.  If you’re reading this article, I’m guessing you’re already questioning “face value” or have noticed “discrepancies.”

Ethnicity results do make good cocktail party conversation, especially if you’re wearing either lederhosen or a kilt.  I’m thinking you could even wear lederhosen under your kilt……

If you want to be a bit more of an educated consumer, you can compare your known genealogy to ethnicity results to judge for yourself how close to reality they might be. However, you can never really know the effects of early population movements – except you can pretty well say that if you have 25% Scandinavian – you had better have a Scandinavian grandparent.  3% Scandinavian is another matter entirely.

If you’re saying to yourself, “this is part interpretive art and part science,” you’d be right.

If you want to take a really deep dive, and you carry significantly mixed ethnicity that is quite distinct from your other ancestry, meaning at the level of the four continents once again, you can work a little harder to track your ethnic segments back in time. So, if you have a European grandparent, an Asian grandparent, an African grandparent and a Native American grandparent – not only do you have an amazing and rich genealogy – you are the luckiest genetic genealogist I know, because you’ll pretty well know if your ethnicity results are accurate and your matches will easily fall into the correct family lines!

For some of us, utilizing the results of ethnicity testing for minority admixture combined with other tools is the only prayer we will ever have of finding our non-European ancestors.  If you fall into this group, that is an extremely powerful and compelling statement and represents the holy grail of both genealogy and genetic genealogy.

Let’s Talk About Scandinavia

We’ve talked about minority admixture and cases when we have too little DNA or unexpected small segments of DNA, but sometimes we have what appears to be too much.  Often, that happens in Scandinavia, although far more often with one company than the other two.  However, in my case, we have the perfect example of an unsolvable mystery introduced by ethnicity testing and of course, it involves Scandinavia.

23andMe, Ancestry and Family Tree DNA show me at 8%, 10% and 12% Scandinavian, respectively, which is simply mystifying. That’s a lot to be “just noise.”  That amount is in the great-grandparent or third generation range at 12.5%, but I don’t have anyone that qualifies, anyplace in my pedigree chart, as far back as I can go.  I have all of my ancestors identified and three-quarters (yellow) confirmed via DNA through the 6th generation, shown below.

The unconfirmed groups (uncolored) are genealogically confirmed via church and other records, just not genetically confirmed.  They are Dutch and German, respectively, and people in those countries have not embraced genetic genealogy to the degree Americans have.

Genetically confirmed means that through triangulation, I know that I match other descendants of these ancestors on common segments.  In other words, on the yellow ancestors, there is no possibility of misattributed parentage or an adoption in that line between me and that ancestor.

Six gen both

Barbara Mehlheimer, my mitochondrial line, does have Scandinavian mitochondrial DNA matches, but even if she were 100% Scandinavian, which she isn’t because I have her birth record in Germany, that would only account for approximately 3.12% of my DNA, not 8-12%.

In order for me to carry 8-12% Scandinavian legitimately from an ancestral line, four of these ancestors would need to be 100% Scandinavian to contribute 12.5% to me today, assuming that exactly half of that DNA is passed down in each generation, and my mother’s percentage of Scandinavian should be about twice mine, or 24%.
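
For anyone who wants to check this arithmetic, here is a minimal sketch in Python. It assumes the simple halving model used throughout this article (each ancestor's expected contribution is cut in half with every generation), and it counts "steps back" with your parents as step 1, which is my own labeling for the sketch rather than a standard. The function names are hypothetical.

```python
def expected_percent(steps_back: int) -> float:
    """Expected percentage of your DNA from ONE ancestor who is
    `steps_back` generations above you (parents = 1, grandparents = 2)."""
    return 100 * 0.5 ** steps_back


def ancestors_needed(target_percent: float, steps_back: int) -> float:
    """How many fully 'ethnic' ancestors at that level add up to the target."""
    return target_percent / expected_percent(steps_back)


# One row per level: steps back, number of ancestors, percent from each.
for steps in range(1, 7):
    print(steps, 2 ** steps, expected_percent(steps))

# An ancestor 5 steps back (one of 32) is expected to contribute 3.125%,
# so reaching 12.5% takes four of them.
needed = ancestors_needed(12.5, 5)
print(needed)
```

Real inheritance is not this tidy, of course, because recombination hands down more or less than half in any given generation, so treat these numbers as expectations and not measurements.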

My mother is only in one of the testing company data bases, because she passed away before autosomal DNA testing was widely available.  I was fortunate that her DNA had been archived at Family Tree DNA and was available for a Family Finder upgrade.

Mom’s Scandinavian results are 7%, or 8% if you add in Finland and Northern Siberia.  Clearly not twice mine, in fact, it’s less. If I received half of hers, that would be roughly 4%, leaving 8% of mine unaccounted for.  If I didn’t receive all of my “Scandinavian” from her, then the balance would have had to come from my father whose Estes side of the tree is Appalachian/Colonial American.  Even less likely that he would have carried 16% Scandinavian, assuming again, that I inherited half.  Even if I inherited all 8% of Mom’s, that still leaves me 4% short and means my father would have had approximately 8%, which is still between the great and great-great-grandfather level.  By that time, his ancestors had been in America for generations and none were Scandinavian.  Clearly, something else is going on.  Is there a Scandinavian line in the woodpile someplace?  If so, which lines are the likely candidates?
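
The back-of-the-envelope reasoning above can be written out as a short Python sketch. It assumes, as the text does, that I inherited exactly half of whatever my father carried, and it uses my highest vendor estimate (12%) and Mom's 8%. The function name is hypothetical.

```python
def other_parent_needs(child_pct: float, from_known_parent: float) -> float:
    """Percentage the other parent would need to carry, assuming the child
    inherited exactly half of that parent's amount."""
    shortfall = child_pct - from_known_parent
    return 2 * shortfall


MY_PCT = 12.0   # highest of the three vendor estimates
MOM_PCT = 8.0   # Mom's result including Finland and Northern Siberia

# Case 1: I received half of Mom's Scandinavian (about 4%).
half_case = other_parent_needs(MY_PCT, MOM_PCT / 2)

# Case 2: I somehow received all 8% of Mom's Scandinavian.
all_case = other_parent_needs(MY_PCT, MOM_PCT)

print(half_case, all_case)
```

Either way, the sketch lands on the same uncomfortable conclusion as the prose: my father would have needed somewhere between 8% and 16% Scandinavian, and his tree doesn't support that.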

In mother’s Ferverda/Camstra/deJong/Houtsma line, which is not DNA confirmed, we have several additional generations of records procured by a professional genealogist in the Netherlands from Leeuwarden, so we know where these ancestors originated and lived for generations, and it wasn’t Scandinavia.

The Kirsch/Lemmert line also reaches back in church records several generations in Mutterstadt and Fussgoenheim, Germany.  The Drechsel line reaches back several generations in Wirbenz, Germany and the Mehlheimer line reaches back one more generation in Speichersdorf before ending in an unmarried mother giving birth and not listing the father.  Aha, you say…there he is…that rogue Scandinavian.  And yes, it could be, but in that generation, he would account for only 1.56% of my DNA, not 8-12%.

So, what can we conclude about this conundrum?

  • The Scandinavian results are NOT a function of specific Scandinavian genealogical ancestors – meaning ones in the tree who would individually contribute that level of Scandinavian heritage.  There is no Scandinavian great-grandpa or Scandinavian heritage at all, in any line, tracking back more than 6 generations.  The first “available” spot with an unknown ancestor for a Scandinavian is in the 7th generation where they would contribute 1.56% of my DNA and 3.12% of my mother’s.
  • The Scandinavian results could be a function of a huge amount of population intermixing in several lines, but 8-12% is an awfully high number to attribute to unknown population admixture from many generations ago.
  • The Scandinavian results could be a function of a problematic reference population being utilized by multiple companies.
  • The Scandinavian results could be identical by chance matching, possibly in addition to population admixture in ancient lines.
  • The Scandinavian results could be a function of something we don’t yet understand.
  • The Scandinavian results could be a combination of several of the above.

It’s a mystery.  It may be unraveled as the tools improve and as additional population reference samples become available to the industry or are better understood.  Or, it may never be unraveled.  But one thing is for sure, it is very, very interesting!  However, I’m not trading lederhosen for anything based on this.

The Companies

I wrote a comparison of the testing companies when they introduced their second generation tools.  Not a lot has changed.  Hopefully we will see a third software generation soon.

I do recommend selecting between the main three testing companies plus National Geographic’s Genographic 2.0 products if you’re going to test for ethnicity.  Stay safe.  There are less than ethical people and companies out there looking to take advantage of people’s curiosity to learn about their heritage.

Today, 23andMe is double the price of either Family Tree DNA or Ancestry and they are having other issues as well.  However, they do sometimes pick up the smallest amounts of minority admixture.

Ancestry continues to have “a Scandinavian problem” where many/most of their clients have a significant amount (some as high as the 30% range) of Scandinavian ancestry assigned to them that is not reflected by other testing companies or tools, or the tester’s known heritage – and is apparently incorrect.

However, Ancestry did pick up my minority ancestry of both Native and African. How much credibility should I give that in light of the known Scandinavian issue?  In other words, if they can’t get 30% right, how could they ever get 4 or 5% right?

Remember what I said about companies doing pretty well on a comparative continental basis but sorting through ethnicity within a continent being much more difficult. This is the perfect example.  Ancestry also is not alone in reporting small amounts of my minority admixture.  The other companies do as well, although their amounts and descriptions don’t match each other exactly.

However, I can download any or all three of these raw data files to GedMatch and utilize their various ethnicity, triangulation and chromosome by chromosome comparison utilities. Both Family Tree DNA and Ancestry test more SNP locations than does 23andMe, and cost half as much, if you’re planning to test in order to upload your raw data file to GedMatch.

If you are considering ordering from either 23andMe or Ancestry, be sure you understand their privacy policy before ordering.

In Summary

I hate to steal Judy Russell’s line, but she’s right – it’s not soup yet if ethnicity testing is the only tool you’re going to use and if you’re expecting answers, not estimates.  View today’s ethnicity results from any of the major testing companies as interesting, because that’s what they are, unless you have a very specific research agenda, know what you are doing and plan to take a deeper dive.

I’m not discouraging anyone from ethnicity testing. I think it’s fun and for me, it was extremely informative.  But at the same time, it’s important to set expectations accurately to avoid disappointment, anxiety, misinformation or over-reliance on the results.

You can’t just discount these results because you don’t like them, and neither can you simply accept them.

If you think your grandfather was 100% Native American and you have no Native American heritage on the ethnicity test, the problem is likely not the test or the reference populations.  You should have 25% and carry zero.  The problem is likely that the oral history is incorrect.  There is virtually no one, and certainly not in the Eastern tribes, who was not admixed by two generations ago.  It’s also possible that he is not your grandfather.  View ethnicity results as a call to action to set forth and verify or refute their accuracy, especially if they vary dramatically from what you expected.  If it’s the truth you seek, this is your personal doorway to Delphi.

Just don’t trade in your lederhosen, or anything else, just yet based on ethnicity results alone, because this technology is still in its infancy, especially within Europe.  I mean, after all, it’s embarrassing to have to go and try to retrieve your lederhosen from the pawn shop.  They’re going to laugh at you.

I find it ironic that Y DNA and mtDNA, much less popular, can be very, very specific and yield definitive answers about individual ancestors, reaching far beyond the 5th or 6th generation – yet the broad brush ethnicity painting which is much less reliable is much more popular.  This is due, in part, I’m sure, to the fact that everyone can take the ethnicity tests, which represent all lines.  You aren’t limited to testing one or two of your own lines and you don’t need to understand anything about genetic genealogy or how it works.  All you have to do is spit or swab and wait for results.

You can take a look at how Y and mtDNA testing versus autosomal tests work here.  Maybe Y or mitochondrial should be next on your list, as they reach much further back in time on specific lines, and you can use these results to create a DNA pedigree chart that tells you very specifically about the ancestry of those particular lines.

Ethnicity testing is like any other tool – it’s just one of many available to you.  You’ll need to gather different kinds of DNA and other evidence from various sources and assemble the pieces of your ancestral story like a big puzzle.  Ethnicity testing isn’t the end, it’s the beginning.  There is so much more!

My real hope is that ethnicity testing will kindle the fires and that some of the folks that enter the genetic genealogy space via ethnicity testing will become both curious and encouraged and will continue to pursue other aspects of genealogy and genetic genealogy.  Maybe they will ask the question of “who” in their tree wore kilts or lederhosen and catch the genealogy bug.  Maybe they will find out more about grandpa’s Native American heritage, or lack thereof.  Maybe they will meet a match that has more information than they do and who will help them.  After all, ALL of genetic genealogy is founded upon sharing – matches, trees and information.  The more the merrier!

So, if you tested for ethnicity and would like to learn more, come on in, the water’s fine and we welcome both lederhosen and kilts, whatever you’re wearing today!  Jump right in!!!

______________________________________________________________

Disclosure

I receive a small contribution when you click on some of the links to vendors in my articles. This does NOT increase the price you pay but helps me to keep the lights on and this informational blog free for everyone. Please click on the links in the articles or to the vendors below if you are purchasing products or DNA testing.

Thank you so much.


The Best and Worst of 2015 – Genetic Genealogy Year in Review

2015 Best and Worst

For the past three years I’ve written a year-in-review article. You can see just how much the landscape has changed in the 2012, 2013 and 2014 versions.

This year, I’ve added a few specific “award” categories for people or firms that I feel need to be specially recognized as outstanding in one direction or the other.

In past years, some news items, announcements and innovations turned out to be very important like the Genographic Project and GedMatch, and others, well, not so much. Who among us has tested their full genome today, for example, or even their exome?  And what would you do with that information if you did?

And then there are the deaths, like the Sorenson database and Ancestry’s own Y and mitochondrial data base. I still shudder to think how much we’ve lost at the corporate hands of Ancestry.

In past years, there have often been big new announcements facilitated by new technology. In many ways, the big fish have been caught in a technology sense.  Those big fish are autosomal DNA and the Big Y types of tests.  Both of these have created an avalanche of data and we, personally and as a community, are still trying to sort through what all of this means genealogically and how to best utilize the information.  Now we need tools.

This is probably illustrated most aptly by the expansion of the Y tree.

The SNP Tsunami Growing Pains Continue

2015 snp tsunami

Going from 800+ SNPs in 2012 to more than 35,000 SNPs today has introduced its own set of problems. First, there are multiple trees in existence, completely or partially maintained by different organizations for different purposes.  Needless to say, these trees are not in sync with each other.  The criteria for adding a SNP to the tree are decided by the owner or steward of that tree, and there is no agreement as to the definition of a valid SNP or how many instances of that SNP need to exist before it is added to the tree.

This angst has been taking place for the most part outside of the public view, but it exists just the same.

For example, 23andMe still uses the old haplogroup names like R1b which have not been used in years elsewhere. Family Tree DNA is catching up with updating their tree, working with haplogroup administrators to be sure only high quality, proven SNPs are added to branches.  ISOGG maintains another tree (one branch shown above) that’s publicly available, utilizing volunteers per haplogroup and sometimes per subgroup.  Other individuals and organizations maintain other trees, or branches of trees, some very accurate and some adding a new “branch” with as little as one result.

The good news is that this will shake itself out. Personally, I’m voting for the more conservative approach for public reference trees to avoid “pollution” and a lot of shifting and changing downstream when it’s discovered that the single instance of a SNP is either invalid or in a different branch location.  However, you have to start with an experimental or speculative tree before you can prove that a SNP is where it belongs or needs to be moved, so each of the trees has its own purpose.

The full trees I utilize are the Family Tree DNA tree, available for customers, the ISOGG tree and Ray Banks’ tree which includes locations where the SNPs are found when the geographic location is localized. Within haplogroup projects, I tend to use a speculative tree assembled by the administrators, if one is available.  The haplogroup admins generally know more about their haplogroup or branch than anyone else.

The bad news is that this situation hasn’t shaken itself out yet, and due to the magnitude of the elephant at hand, I don’t think it will anytime soon. As this shuffling and shaking occurs, we learn more about where the SNPs are found today in the world, where they aren’t found, which SNPs are “family” or “clan” SNPs and the timeframes in which they were born.

In other words, this is a learning process for all involved – albeit a slow and frustrating one. However, we are making progress and the tree becomes more robust and accurate every year.

We may be having growing pains, but growing pains aren’t necessarily a bad thing and are necessary for growth.

Thank you to the hundreds of volunteers who work on these trees, and in particular, to Alice Fairhurst who has spearheaded the ISOGG tree for the past nine years. Alice retired from that volunteer position this year and is shown below after receiving two much-deserved awards for her service at the Family Tree DNA Conference in November.

2015 ftdna fairhurst 2

Best Innovative Use of Integrated Data

Dr. Maurice Gleeson receives an award this year for the best genealogical use of integrated types of data. He has utilized just about every tool he can find to wring as much information as possible out of Y DNA results.  Not only that, but he has taken great pains to share that information with us in presentations in the US and overseas, and by creating a video, noted in the article below.  Thanks so much Maurice.

Making Sense of Y Data

Estes pedigree

The advent of massive amounts of Y DNA data has been both wonderful and perplexing. We as genetic genealogists want to know as much about our family as possible, including what the combination of STR and SNP markers means to us.  In other words, we don’t want two separate “test results” but a genealogical marriage of the two.

I took a look at this from the perspective of the Estes DNA project. Of course, everyone else will view those results through the lens of their own surname or haplogroup project.

Estes Big Y DNA Results
http://dna-explained.com/2015/03/26/estes-big-y-dna-results/

At the Family Tree DNA Conference in November, James Irvine and Maurice Gleeson both presented sessions on utilizing a combination of STR and SNP data and various tools in analyzing their individual projects.

Maurice’s presentation was titled “Combining SNPs, STRs and Genealogy to build a Surname Origins Tree.”
http://www.slideshare.net/FamilyTreeDNA/building-a-mutation-history-tree

Maurice created a wonderful video that includes a lot of information about working with Y DNA results. I would consider this one of the very best Y DNA presentations I’ve ever seen, and thanks to Maurice, it’s available as a video here:
https://www.youtube.com/watch?v=rvyHY4R6DwE&feature=youtu.be

You can view more of Maurice’s work at:
http://gleesondna.blogspot.com/2015/08/genetic-distance-genetic-families.html

James Irvine’s presentation was titled “Surname Projects – Some Fresh Ideas.” http://www.slideshare.net/FamilyTreeDNA/y-dna-surname-projects-some-fresh-ideas

Another excellent presentation discussing Y DNA results was “YDNA maps Scandinavian Family Trees from Medieval Times and the Viking Age” by Peter Sjolund.
http://www.slideshare.net/FamilyTreeDNA/ydna-maps-scandinavian-family-trees-from-medieval-times-and-the-viking-age

Peter’s session at the genealogy conference in Sweden this year was packed. This photo, compliments of Katherine Borges, shows the room and the level of interest in Y-DNA and the messages it holds for genetic genealogists.

sweden 2015

This type of work is the wave of the future, although hopefully it won’t be so manually intensive. However, the process of discovery is by definition laborious.  From this early work will one day emerge reproducible methodologies, the fruits of which we will all enjoy.

Haplogroup Definitions and Discoveries Continue

A4 mutations

Often, haplogroup work flies under the radar today and gets dwarfed by some of the larger citizen science projects, but this work is fundamentally important. In 2015, we made discoveries about haplogroups A4 and C, for example.

Haplogroup A4 Unpeeled – European, Jewish, Asian and Native American
http://dna-explained.com/2015/03/05/haplogroup-a4-unpeeled-european-jewish-asian-and-native-american/

New Haplogroup C Native American Subgroups
http://dna-explained.com/2015/03/11/new-haplogroup-c-native-american-subgroups/

Native American Haplogroup C Update – Progress
http://dna-explained.com/2015/08/25/native-american-haplogroup-c-update-progress/

These aren’t the only discoveries, by any stretch of the imagination. For example, Mike Wadna, administrator for the Haplogroup R1b Project reports that there are now over 1500 SNPs on the R1b tree at Family Tree DNA – which is just about twice as many as were known in total for the entire Y tree in 2012 before the Genographic project was introduced.

The new Y DNA SNP Packs being introduced by Family Tree DNA which test more than 100 SNPs for about $100 will go a very long way in helping participants obtain haplogroup assignments further down the tree without doing the significantly more expensive Big Y test. For example, the R1b-DF49XM222 SNP Pack tests 157 SNPs for $109.  Of course, if you want to discover your own private line of SNPs, you’ll have to take the Big Y.  SNP Packs can only test what is already known and the Big Y is a test of discovery.

Best Blog

Jim Bartlett, hands down, receives this award for his new and wonderful blog, Segmentology.

Making Sense of Autosomal DNA

segmentology

Our autosomal DNA results provide us with matches at each of the vendors and at GedMatch, but what do we DO with all those matches and how do we utilize the genetic match information? How do we translate those matches into ancestral information?  And once we’ve assigned a common ancestor to a match with an individual, how does that match affect other matches on that same segment?

2015 has been the year of sorting through the pieces and defining terms like IBS (identical by state, which covers both identical by population and identical by chance) and IBD (identical by descent). There has been a lot written this year.

Jim Bartlett, a long-time autosomal researcher has introduced his new blog, Segmentology, to discuss his journey through mapping ancestors to his DNA segments. To the best of my knowledge, Jim has mapped more of his chromosomes than any other researcher, more than 80% to specific ancestors – and all of us can leverage Jim’s lessons learned.

Segmentology.org by Jim Bartlett
http://dna-explained.com/2015/05/12/segmentology-org-by-jim-bartlett/

When you visit Jim’s site, please take a look at all of his articles. He and I and others may differ slightly in the details of our approach, but the basics are the same and his examples are wonderful.

Autosomal DNA Testing – What Now?
http://dna-explained.com/2015/08/07/autosomal-dna-testing-101-what-now/

Autosomal DNA Testing 101 – Tips and Tricks for Contact Success
http://dna-explained.com/2015/08/11/autosomal-dna-testing-101-tips-and-tricks-for-contact-success/

How Phasing Works and Determining IBS vs IBD Matches
http://dna-explained.com/2015/01/02/how-phasing-works-and-determining-ibd-versus-ibs-matches/

Just One Cousin
http://dna-explained.com/2015/01/11/just-one-cousin/

Demystifying Autosomal DNA Matching
http://dna-explained.com/2015/01/17/demystifying-autosomal-dna-matching/

A Study Using Small Segment Matching
http://dna-explained.com/2015/01/21/a-study-utilizing-small-segment-matching/

Finally, A How-To Class for Working with Autosomal Results
http://dna-explained.com/2015/02/10/finally-a-how-to-class-for-working-with-autosomal-dna-results/

Parent-Child Non-Matching Autosomal DNA Segments
http://dna-explained.com/2015/05/14/parent-child-non-matching-autosomal-dna-segments/

A Match List Does Not an Ancestor Make
http://dna-explained.com/2015/05/19/a-match-list-does-not-an-ancestor-make/

4 Generation Inheritance Study
http://dna-explained.com/2015/08/23/4-generation-inheritance-study/

Phasing Yourself
http://dna-explained.com/2015/08/27/phasing-yourself/

Autosomal DNA Matching Confidence Spectrum
http://dna-explained.com/2015/09/25/autosomal-dna-matching-confidence-spectrum/

Earlier in the year, there was a lot of discussion and dissension about the definition of and use of small segments. I utilize them, carefully, generally in conjunction with larger segments.  Others don’t.  Here’s my advice.  Don’t get yourself hung up on this.  You probably won’t need or use small segments until you get done with the larger segments, meaning low-hanging fruit, or unless you are doing a very specific research project.  By the time you get to that point, you’ll understand this topic and you’ll realize that the various researchers agree about far more than they disagree, and you can make your own decision based on your individual circumstances. If you’re entirely endogamous, small segments may just make you crazy.  However, if you’re chasing a colonial American ancestor, then you may need those small segments to identify or confirm that ancestor.

It is unfortunate, however, that all of the relevant articles are not represented in the ISOGG wiki, allowing people to fully educate themselves. Hopefully this can be updated shortly with the additional articles, listed above and from Jim Bartlett’s blog, published during this past year.

Recreating the Dead

James Crumley overlapping segments

James and Catherine Crumley segments above, compliments of Kitty Cooper’s tools

As we learn more about how to use autosomal DNA, we have begun to reconstruct our ancestors from the DNA of their descendants. Not as in cloning, but as in attributing DNA found in multiple descendants that originates from a common ancestor, or ancestral couple.  The first foray into this arena was GedMatch with their Lazarus tool.

Lazarus – Putting Humpty Dumpty Back Together Again
http://dna-explained.com/2015/01/14/lazarus-putting-humpty-dumpty-back-together-again/

I have taken a bit of a different proof approach wherein I recreated an ancestor, James Crumley, born in 1712 from the matching DNA of roughly 30 of his descendants.
http://www.slideshare.net/FamilyTreeDNA/roberta-estes-crumley-y-dna

I did the same thing, on an experimental smaller scale about a year ago with my ancestor, Henry Bolton.
http://dna-explained.com/2014/11/10/henry-bolton-c1759-1846-kidnapped-revolutionary-war-veteran-52-ancestors-45/

This is the way of the future in genetic genealogy, and I’ll be writing more about the Crumley project and the reconstruction of James Crumley in 2016.

                         Lump of Coal Award(s)

This category is a “special category” that is exactly what you think it is. Yep, this is the award no one wants.  We have a tie for the Lump of Coal Award this year between Ancestry and 23andMe.

               Ancestry Becomes the J.R. Ewing of the Genealogy World

2015 Larry Hagman

Attribution : © Glenn Francis, http://www.PacificProDigital.com

Some of you may remember J.R. Ewing on the television show called Dallas that ran from 1978 through 1991. J.R. Ewing, a greedy and unethical oil tycoon, was one of the main characters.  The series was utterly mesmerizing, and literally everyone tuned in.  We all, and I mean universally, hated J.R. Ewing for what he unfeelingly and selfishly did to his family and others.  Finally, in a cliffhanger end of the season episode, someone shot J.R. Ewing.  OMG!!!  We didn’t know who.  We didn’t know if J.R. lived or died.  Speculation was rampant.  “Who shot JR?” was the theme on t-shirts everyplace that summer.  J.R. Ewing, over time, became the man all of America loved to hate.

Ancestry has become the J.R. Ewing of the genealogy world for the same reasons.

In essence, in the genetic genealogy world, Ancestry introduced a substandard DNA product, which remains substandard years later with no chromosome browser or comparison tools that we need….and they have the unmitigated audacity to try to convince us we really don’t need those tools anyway. Kind of like trying to convince someone with a car that they don’t need tires.

Worse, yet, they’ve introduced “better” tools (New Ancestor Discoveries), as in tools that were going to be better than a chromosome browser.  New Ancestor Discoveries “gives us” ancestors that aren’t ours. Sadly, there are many genealogists being led down the wrong path with no compass available.

Ancestry’s history of corporate stewardship is abysmal and continues with the obsolescence of various products and services including the Sorenson DNA database, their own Y and mtDNA database, MyFamily and most recently, Family Tree Maker. While the Family Tree Maker announcement has been met with great gnashing of teeth and angst among their customers, there are other software programs available.  Ancestry’s choice to obsolete the DNA data bases is irrecoverable and a huge loss to the genetic genealogy community.  That information is lost forever and not available elsewhere – a priceless, irreplaceable international treasure intentionally trashed.

If Ancestry had not bought up nearly all of the competing resources, people would be cancelling their subscriptions in droves to use another company – any other company. But there really is no one else anymore.  Ancestry knows this, so they have become the J.R. Ewing of the genealogy world – uncaring about the effects of their decisions on their customers or the community as a whole.  It’s hard for me to believe they have knowingly created such wholesale animosity within their own customer base.  I think having a job as a customer service rep at Ancestry would be an extremely undesirable job right now.  Many customers are furious and Ancestry has managed to upset pretty much everyone one way or another in 2015.

AncestryDNA Has Now Thoroughly Lost Its Mind
https://digginupgraves.wordpress.com/2015/04/02/ancestrydna-has-now-thoroughly-lost-its-mind/

Kenny, Kenny, Kenny
https://digginupgraves.wordpress.com/2015/04/10/kenny-kenny-kenny/

Dear Kenny – Any Suggestions for our New Ancestor Discoveries?
https://digginupgraves.wordpress.com/2015/04/13/dear-kenny-any-suggestions-for-our-new-ancestor-discoveries/

RIP Sorenson – A Crushing Loss
http://dna-explained.com/2015/05/15/rip-sorenson-a-crushing-loss/

Of Babies and Bathwater
http://www.legalgenealogist.com/blog/2015/05/17/of-babies-and-bathwater/

Facts Matter
http://legalgenealogist.com/blog/2015/05/03/facts-matter/

Getting the Most Out of AncestryDNA
http://dna-explained.com/2015/02/02/getting-the-most-out-of-ancestrydna/

Ancestry Gave Me a New DNA Ancestor and It’s Wrong
http://dna-explained.com/2015/04/03/ancestry-gave-me-a-new-dna-ancestor-and-its-wrong/

Testing Ancestry’s Amazing New Ancestor DNA Claim
http://dna-explained.com/2015/04/07/testing-ancestrys-amazing-new-ancestor-dna-claim/

Dissecting AncestryDNA Circles and New Ancestors
http://dna-explained.com/2015/04/09/dissecting-ancestrydna-circles-and-new-ancestors/

Squaring the Circle
http://legalgenealogist.com/blog/2015/03/29/squaring-the-circle/

Still Waiting for the Holy Grail
http://legalgenealogist.com/blog/2015/04/05/still-waiting-for-the-holy-grail/

A Dozen Ancestors That Aren’t aka Bad NADs
http://dna-explained.com/2015/04/14/a-dozen-ancestors-that-arent-aka-bad-nads/

The Logic and Birth of a Bad NAD (New Ancestor Discovery)
http://dna-explained.com/2015/08/12/the-logic-and-birth-of-a-bad-nad-new-ancestor-discovery/

Circling the Shews
http://legalgenealogist.com/blog/2015/05/24/circling-the-shews/

Naughty Bad NADs Sneak Home Under Cover of Darkness
http://dna-explained.com/2015/08/24/naughty-bad-nads-sneak-home-under-cover-of-darkness/

Ancestry Shared Matches Combined with New Ancestor Discoveries
http://dna-explained.com/2015/08/28/ancestry-shared-matches-combined-with-new-ancestor-discoveries/

Ancestry Shakey Leaf Disappearing Matches: Now You See Them – Now You Don’t
http://dna-explained.com/2015/09/24/ancestry-shakey-leaf-disappearing-matches-now-you-see-them-now-you-dont/

Ancestry’s New Amount of Shared DNA – What Does It Really Mean?
http://dna-explained.com/2015/11/06/ancestrys-new-amount-of-shared-dna-what-does-it-really-mean/

The Winds of Change
http://legalgenealogist.com/blog/2015/11/08/the-winds-of-change/

Confusion – Family Tree Maker, Family Tree DNA and Ancestry.com
http://dna-explained.com/2015/12/13/confusion-family-tree-maker-family-tree-dna-and-ancestry-com/

DNA: good news, bad news
http://legalgenealogist.com/blog/2015/01/11/dna-good-news-bad-news/

Check out the Alternatives
http://legalgenealogist.com/blog/2015/12/09/check-out-the-alternatives/

GeneAwards 2015
http://www.tamurajones.net/GeneAwards2015.xhtml

23andMe Betrays Genealogists

2015 broken heart

In October, 23andMe announced that it has reached an agreement with the FDA about reporting some health information such as carrier status and traits to their clients. As a part of or perhaps as a result of that agreement, 23andMe is dramatically changing the user experience.

In some aspects, the process will be simplified for genealogists with a universal opt-in. However, other functions are being removed and the price has doubled.  New advertising says little or nothing about genealogy and is entirely medically focused.  That, combined with the move of the trees offsite to MyHeritage, seems to signal that 23andMe has lost any commitment they had to the genetic genealogy community, effectively abandoning the very group that pulled their collective bacon out of the fire. This is greatly ironic given that it was the genetic genealogy community, through their testing recommendations, that kept 23andMe in business for the two years from November of 2013 through October of 2015, when the FDA had the health portion of their testing shut down.  This is a mighty fine thank you.

As a result of the changes at 23andMe relative to genealogy, the genetic genealogy community has largely withdrawn their support and recommendations to test at 23andMe in favor of Ancestry and Family Tree DNA.

Kelly Wheaton, writing on the Facebook ISOGG group along with other places has very succinctly summed up the situation:
https://www.facebook.com/groups/isogg/permalink/10153873250057922/

You can also view Kelly’s related posts from earlier in December and their comments at:
https://www.facebook.com/groups/isogg/permalink/10153830929022922/
and…
https://www.facebook.com/groups/isogg/permalink/10153828722587922/

My account at 23andMe has not yet been converted to the new format, so I cannot personally comment on the format changes yet, but I will write about the experience in 2016 after my account is converted.

Furthermore, I will also be writing a new autosomal vendor testing comparison article after their new platform is released.

I Hate 23andMe
https://digginupgraves.wordpress.com/2015/06/14/i-hate-23andme/

23andMe to Get Makeover After Agreement With FDA
http://dna-explained.com/2015/10/21/23andme-to-get-a-makeover-after-agreement-with-fda/

23andMe Metamorphosis
http://throughthetreesblog.tumblr.com/post/131724191762/the-23andme-metamorphosis

The Changes at 23andMe
http://legalgenealogist.com/blog/2015/10/25/the-changes-at-23andme/

The 23andMe Transition – The First Step
http://dna-explained.com/2015/11/05/the-23andme-transition-first-step-november-11th/

The Winds of Change
http://legalgenealogist.com/blog/2015/11/08/the-winds-of-change/

Why Autosomal Response Rate Really Does Matter
http://dna-explained.com/2015/02/24/why-autosomal-response-rate-really-does-matter/

Heads Up About the 23andMe Meltdown
http://dna-explained.com/2015/12/04/heads-up-about-the-23andme-meltdown/

Now…and not now
http://legalgenealogist.com/blog/2015/12/06/now-and-not-now/

                             Cone of Shame Award

Another award this year is the Cone of Shame award which is also awarded to both Ancestry and 23andMe for their methodology of obtaining “consent” to sell their customers’, meaning our, DNA and associated information.

Genetic Genealogy Data Gets Sold

2015 shame

Unfortunately, 2015 has been the year that the goals of both 23andMe and Ancestry have become clear in terms of our DNA data. While 23andMe has always been at least somewhat focused on health, Ancestry never was previously, but has now hired a health officer and teamed with Calico for medical genetics research.

Now, both Ancestry and 23andMe have made research arrangements. Both state in the release and privacy verbiage, which all customers must electronically sign (or click through) when purchasing their DNA tests, that they can sell, at minimum, your anonymized DNA data without any further consent.  And there is no opt-out at that level.

They can also use our DNA and data internally, meaning that 23andMe’s dream of creating and patenting new drugs can come true based on your DNA that you submitted for genealogical purposes, even if they never sell it to anyone else.

In an interview in November, 23andMe CEO Anne Wojcicki said the following:

23andMe is now looking at expanding beyond the development of DNA testing and exploring the possibility of developing its own medications. In July, the company raised $79 million to partly fund that effort. Additionally, the funding will likely help the company continue with the development of its new therapeutics division. In March, 23andMe began to delve into the therapeutics market, to create a third pillar behind the company’s personal genetics tests and sales of genetic data to pharmaceutical companies.

Given that the future of genetic genealogy at these two companies seems to be tied to the sale of their customers’ genetic and other information, which, based on the above, is very clearly worth big bucks, I feel that the fact that these companies are selling and utilizing their customers’ information in this manner should be fully disclosed. Even more appropriate, the DNA information should not be sold or utilized for research without an informed consent that would traditionally be used for research subjects.

Within the past few days, I wrote an article, providing specifics and calling on both companies to do the following.

  1. At a minimum, create transparent, understandable verbiage that informs their customers before the end of the purchase process that their DNA will be sold or utilized for unspecified research with the intention of financial gain and that there is no opt-out. However, a preferred plan of action would be a combination of 2 and 3, below.
  2. Implement a plan where customer DNA can never be utilized for anything other than to deliver the services to the consumers that they purchased unless a separate, fully informed consent authorization is signed for each research project, without coercion, meaning that the client does not have to sign the consent to obtain any of the DNA testing or services.
  3. Immediately stop utilizing the DNA information and results from customers who have already tested until they have signed an appropriate informed consent form for each research project in which their DNA or other information will be utilized.

And Now Ancestry Health
http://dna-explained.com/2015/06/06/and-now-ancestry-health/

Opting Out
http://legalgenealogist.com/blog/2015/07/26/opting-out/

Ancestry Terms of Use Updated
http://legalgenealogist.com/blog/2015/07/07/ancestry-terms-of-use-updated/

AncestryDNA Doings
http://legalgenealogist.com/blog/2015/07/05/ancestrydna-doings/

Heads Up About the 23andMe Meltdown
http://dna-explained.com/2015/12/04/heads-up-about-the-23andme-meltdown/

23andMe and Ancestry and Selling Your DNA Information
http://dna-explained.com/2015/12/30/23andme-ancestry-and-selling-your-dna-information/

                      Citizen Science Leadership Award

The Citizen Science Leadership Award this year goes to Blaine Bettinger for initiating the Shared cM Project, a crowdsourced project which benefits everyone.

Citizen Scientists Continue to Push the Edges of the Envelope with the Shared cM Project

Citizen scientists, in the words of Dr. Doron Behar, “are not amateurs.” In fact, citizen scientists have been contributing mightily and pushing the edge of the genetic genealogy frontier consistently now for 15 years.  This trend continues, with new discoveries and new ways of viewing and utilizing information we already have.

For example, Blaine Bettinger’s Shared cM Project was begun in March and continues today. This important project has provided real life information as to the real matching amounts and ranges between people of different relationships, such as first cousins, for example, as compared to theoretical match amounts.  This wonderful project produced results such as this:

2015 shared cM

I don’t think Blaine initially expected this project to continue, but it has and you can read about it, see the rest of the results, and contribute your own data here. Blaine has written several other articles on this topic as well, available at the same link.
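For comparison with the real-world ranges the project collects, the theoretical match amounts can be worked out with simple arithmetic. The following is only a minimal sketch and not part of the Shared cM Project itself: the function names are hypothetical, and the total of 6800 cM is an assumption, since each vendor reports a somewhat different total. Full nth cousins are expected, on average, to share 1/2^(2n+1) of their autosomal DNA, so first cousins come out at 12.5%.

```python
# Hypothetical helpers for the THEORETICAL sharing between full cousins.
# TOTAL_CM is an assumed figure; vendors report different totals.
TOTAL_CM = 6800.0


def expected_cousin_fraction(degree: int) -> float:
    """Average fraction of autosomal DNA shared by full nth cousins."""
    if degree < 1:
        raise ValueError("degree must be 1 or greater")
    # Two common ancestors, each reached through 2 * (degree + 1) meioses:
    # 2 * (1/2) ** (2 * degree + 2) == (1/2) ** (2 * degree + 1)
    return 0.5 ** (2 * degree + 1)


def expected_cousin_cm(degree: int, total_cm: float = TOTAL_CM) -> float:
    """Theoretical shared centimorgans for full nth cousins."""
    return expected_cousin_fraction(degree) * total_cm


for degree in range(1, 5):
    fraction = expected_cousin_fraction(degree)
    shared = expected_cousin_cm(degree)
    print(f"cousin degree {degree}: {fraction:.4%} (about {shared:.1f} cM)")
```

These are averages only. The point of the crowdsourced data is that actual matches fall in a range around figures like these, and that range widens as the relationship becomes more distant.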

Am I Weird or What?
http://dna-explained.com/2015/03/07/am-i-weird-or-what/

Jim Owston analyzed fourth cousins and other near distant relationships in his Owston one-name study:
https://owston.wordpress.com/2015/08/10/an-analysis-of-fourth-cousins-and-other-near-distant-relatives/

I provided distant cousin information in the Crumley surname study:
http://www.slideshare.net/FamilyTreeDNA/roberta-estes-crumley-y-dna

I hope more genetic genealogists will compile and contribute this type of real world data as we move forward. If you have compiled something like this, the Surname DNA Journal is peer reviewed and always looking for quality articles for publication.

Privacy, Law Enforcement and DNA

2015 privacy

Unfortunately, in May, a situation in which Y DNA was utilized in a murder investigation was reported in a sensationalist “scare” type fashion.  This action provided cause, ammunition or an excuse for Ancestry to remove the Sorenson data base from public view.

I find this exceedingly, exceedingly unfortunate. Given Ancestry’s history with obsoleting older data bases instead of updating them, I’m suspecting this was an opportune moment for Ancestry to be able to withdraw this database, removing a support or upgrade problem from their plate and blame the problem on either law enforcement or the associated reporting.

I haven’t said much about this situation, in part because I’m not a lawyer and in part because the topic is so controversial and there is no possible benefit since the damage has already been done. Unfortunately, nothing anyone can say or has said will bring back the Sorenson (or Ancestry) data bases and arguments would be for naught.  We already beat this dead horse a year ago when Ancestry obsoleted their own data base.  On this topic, be sure to read Judy Russell’s articles and her sources as well for the “rest of the story.”

Privacy, the Police and DNA
http://legalgenealogist.com/blog/2015/02/08/privacy-the-police-and-dna/

Big Easy DNA Not So Easy
http://legalgenealogist.com/blog/2015/03/15/big-easy-dna-not-so-easy/

Of Babies and Bathwater
http://www.legalgenealogist.com/blog/2015/05/17/of-babies-and-bathwater/

Facts Matter
http://legalgenealogist.com/blog/2015/05/03/facts-matter/

Genetic genealogy standards from within the community were already in the works prior to the Idaho case, referenced above, and were subsequently published as guidelines.

Announcing Genetic Genealogy Standards
http://thegeneticgenealogist.com/2015/01/10/announcing-genetic-genealogy-standards/

The standards themselves:
http://www.thegeneticgenealogist.com/wp-content/uploads/2015/01/Genetic-Genealogy-Standards.pdf

Ancient DNA Results Continue to Amass

“Moorleiche3-Schloss-Gottorf” by Commander-pirx at de.wikipedia – Own work. Licensed under CC BY-SA 3.0 via Commons

Ancient DNA is difficult to recover and even more difficult to sequence, reassembling tiny little blocks of broken apart DNA into an ancient human genome.

However, each year we see a few more samples and we are beginning to repaint the picture of human population movement, which is different than we thought it would be.

One of the best summaries of the ancient ancestry field was Michael Hammer’s presentation at the Family Tree DNA Conference in November titled “R1B and the Peopling of Europe: an Ancient DNA Update.” His slides are available here:
http://www.slideshare.net/FamilyTreeDNA/r1b-and-the-people-of-europe-an-ancient-dna-update

One of the best ongoing sources for this information is Dienekes’ Anthropology Blog. He covered most of the new articles and there have been several.  That’s the good news and the bad news, all rolled into one. http://dienekes.blogspot.com/

I have covered several that were of particular interest to the evolution of Europeans and Native Americans.

Yamnaya, Light Skinned Brown Eyed….Ancestors?
http://dna-explained.com/2015/06/15/yamnaya-light-skinned-brown-eyed-ancestors/

Kennewick Man is Native American
http://dna-explained.com/2015/06/18/kennewick-man-is-native-american/

Botocudo – Ancient Remains from Brazil
http://dna-explained.com/2015/07/02/botocudo-ancient-remains-from-brazil/

Some Native Americans Had Oceanic Ancestors
http://dna-explained.com/2015/07/22/some-native-americans-had-oceanic-ancestors/

Homo Naledi – A New Species Discovered
http://dna-explained.com/2015/09/11/homo-naledi-a-new-species-discovered/

Massive Pre-Contact Grave in California Yields Disappointing Results
http://dna-explained.com/2015/10/20/mass-pre-contact-native-grave-in-california-yields-disappointing-results/

I know of several projects involving ancient DNA that are in process now, so 2016 promises to be a wonderful ancient DNA year!

Education

2015 education

Many, many new people discover genetic genealogy every day and education continues to be an ongoing and increasing need. It’s a wonderful sign that all major conferences now include genetic genealogy, many with a specific track.

The European conferences have done a great deal to bring genetic genealogy testing to Europeans. European testing benefits those of us whose ancestors were European before immigrating to North America.  This year, ISOGG volunteers staffed booths and gave presentations at genealogy conferences in Birmingham, England; Dublin, Ireland; and Nyköping, Sweden, shown below, photo compliments of Katherine Borges.

ISOGG volunteers

Several great new online educational opportunities arose this year, outside of conferences, for which I’m very grateful.

DNA Lectures YouTube Channel
http://dna-explained.com/2015/04/26/dna-lectures-youtube-channel/

Allen County Public Library Online Resources
http://dna-explained.com/2015/06/03/allen-county-public-library-online-resources/

DNA Data Organization Tools and Who’s on First
http://dna-explained.com/2015/09/08/dna-data-organization-tools-and-whos-on-first/

Genetic Genealogy Educational Resource List
http://dna-explained.com/2015/12/03/genetic-genealogy-educational-resource-list/

Genetic Genealogy Ireland Videos
https://www.youtube.com/channel/UCHnW2NAfPIA2KUipZ_PlUlw

DNA Lectures – Who Do You Think You Are
https://www.youtube.com/channel/UC7HQSiSkiy7ujlkgQER1FYw

Ongoing and Online Classes in how to utilize both Y and autosomal DNA
http://www.dnaadoption.com/index.php?page=online-classes

Education Award

Family Tree DNA receives the Education Award this year along with a huge vote of gratitude for their 11 years of genetic genealogy conferences. They are the only testing or genealogy company to hold a conference of this type and they do a fantastic job.  Furthermore, they sponsor additional educational events by providing the “theater” for DNA presentations at international events such as the Who Do You Think You Are conference in England.  Thank you Family Tree DNA.

Family Tree DNA Conference

ftdna 2015

The Family Tree DNA Conference, held in November, was a hit once again. I’m not a typical genealogy conference person.  My focus is on genetic genealogy, so I want to attend a conference where I can learn something new, something leading edge about the science of genetic genealogy – and that conference is definitely the Family Tree DNA conference.

Furthermore, Family Tree DNA offers tours of their lab on the Monday following the conference for attendees, and actively solicits input on their products and features from conference attendees and project administrators.

2015 FTDNA lab

Family Tree DNA 11th International Conference – The Best Yet
http://dna-explained.com/2015/11/18/2015-family-tree-dna-11th-international-conference-the-best-yet/

All of the conference presentations that were provided by the presenters have been made available by Family Tree DNA at:
http://www.slideshare.net/FamilyTreeDNA?utm_campaign=website&utm_source=sendgrid.com&utm_medium=email

2016 Genetic Genealogy Wish List

2015 wish list

In 2014, I presented a wish list for 2015 and it didn’t do very well.  Will my 2015 list for 2016 fare any better?

  • Ancestry restores Sorenson and their own Y and mtDNA data bases in some format or contributes to an independent organization like ISOGG.
  • Ancestry provides a chromosome browser.
  • Ancestry removes or revamps Timber in order to restore legitimate matches removed by Timber algorithm.
  • Fully informed consent (per research project) implemented by 23andMe and Ancestry, and any other vendor who might aspire to sell consumer DNA or related information, without coercion, and not as a prerequisite for purchasing a DNA testing product. DNA and information will not be shared or utilized internally or externally without informed consent and current DNA information will cease being used in this fashion until informed consent is granted by customers who have already tested.
  • Improved ethnicity reporting at all vendors including ancient samples and additional reference samples for Native Americans.
  • Autosomal Triangulation tools at all vendors.
  • Big Y and STR integration and analysis enhancement at Family Tree DNA.
  • Ancestor Reconstruction
  • Mitochondrial and Y DNA search tools by ancestor and ancestral line at Family Tree DNA.
  • Improved tree at Family Tree DNA – along with new search capabilities.
  • 23andMe restores lost capabilities, drops price, makes changes and adds features previously submitted as suggestions by community ambassadors.
  • More tools (This is equivalent to “bring me some surprises” on my Santa list as a kid.)

My own goals haven’t changed much over the years. I still just want to be able to confirm my genealogy, to learn as much as I can about each ancestor, and to break down brick walls and fill in gaps.

I’m very hopeful each year as more tools and methodologies emerge.  More people test, each one providing a unique opportunity to match and to understand our past, individually and collectively.  Every year genetic genealogy gets better!  I can’t wait to see what 2016 has in store.

Here’s wishing you a very Happy and Ancestrally Prosperous New Year!

2015 happy new year

______________________________________________________________

Disclosure

I receive a small contribution when you click on some of the links to vendors in my articles. This does NOT increase the price you pay but helps me to keep the lights on and this informational blog free for everyone. Please click on the links in the articles or to the vendors below if you are purchasing products or DNA testing.

Thank you so much.

DNA Purchases and Free Transfers

Genealogy Services

Genealogy Research

Top 10 Most Popular Articles of 2015

Wordpress 2015

WordPress, the blogging software I use, provides a year-end summary that is quite interesting.

I really like this report, as I tend to be very focused on what I’m researching and writing, not on stats – so this is a refreshing break and summary. I thought you might be interested too.

The top 10 most viewed posts in 2015 were, in order from least to most:

10th: Promethease – Genetic Health Information Alternative – From December 2013

People are beginning to ask about how they can obtain some of the health information that they were previously receiving from 23andMe.  For $5, at Promethease,  you can upload any of the autosomal files from either Family Tree DNA, 23andMe or Ancestry.com.  They will process your raw data and provide you with a report that is available to download from their server for 45 days.  They also e-mail you a copy.

9th: X Marks the Spot – From September 2012

When using autosomal DNA, the X chromosome is a powerful tool with special inheritance properties.  Many people think that mitochondrial DNA is the same as the X chromosome.  It’s not.

8th: Thick Hair, Small Boobs, Shovel Shaped Teeth and More – From February 2013

Yep, there’s a gene for these traits, and more.  The same gene variant, EDARV370A in the gene EDAR (short for ectodysplasin A receptor), it turns out, also confers more sweat glands and distinctive teeth and is found in the majority of East Asian people.

7th: Mythbusting – Women, Fathers and DNA – From June 2013

I’m sometimes amazed at what people believe – and not just a few people – but a lot of people.

Recently, I ran across a situation where someone was just adamant that autosomal DNA could not help a female find or identify her father.  That’s simply wrong. Incorrect.  Nada!  This isn’t, I repeat, IS NOT, true of autosomal testing.

6th: 4 Kinds of DNA for Genetic Genealogy – from October 2012 – This is probably the article I refer people to most often.  It’s the basics, just the basics.

There seems to be a lot of confusion about the different “kinds” of DNA and how they can be used for genetic genealogy.

It used to be simple.  When this “industry” first started, in the year 2000, you could test two kinds of DNA and it was straightforward.  Now we’ve added more DNA, more tools and more testing companies and it’s not quite so straightforward anymore.

5th: Is History Repeating Itself at Ancestry? – from August 2012

Is history repeating itself at Ancestry?

I’ve been thinking about whether or not I should publish this posting.  As I write and rewrite it, I still haven’t made up my mind.  It’s one of those sticky wickets, as they are called.  One of the reasons I hesitate is that I have far more questions than answers.

4th: What is a Haplogroup? – From January 2013

Sometimes we’ve been doing genetic genealogy for so long we forget what it’s like to be new.  I’m reminded, sometimes humorously, by some of the questions I receive.

3rd: Autosomal DNA 2015 – Which Test is the Best? – From February 2015

This now obsolete article compared the autosomal tests from Family Tree DNA, Ancestry and 23andMe.  23andMe, as of year end (2015), is in the midst of rewriting their platform, which obsoletes some of the tools they offered previously.   As soon as the 23andMe transition to their new platform is complete, I’ll be writing an updated version of this article for 2016.  Until then, suffice it to say I am recommending Family Tree DNA and Ancestry, in that order.

2nd: Ethnicity Results – True or Not? – from October 2013

I can’t even begin to tell you how many questions I receive that go something like this:

“I received my ethnicity results from XYZ.  I’m confused.  The results don’t seem to align with my research and I don’t know what to make of them?”

1st: Proving Native American Ancestry Using DNA – From December 2012 – this has been the most popular article every year since 2012. This doesn’t surprise me, as it’s also the most common question I receive.

Every day, I receive e-mails very similar to this one.

“My family has always said that we were part Native American.  I want to prove this so that I can receive help with money for college.”

Interesting

I was surprised, at first, to see so many older posts, but then I realized they have had more time to accumulate hits.

Of these all-time Top 10, three of them, including the most popular, which is most popular by far, have to do with Native American ancestry, directly or indirectly. The most common questions I receive about ethnicity also relate to the discovery of Native American ancestry.

Thank you everyone for coming along with me on this wonderful journey.  It will be exciting to see what 2016 has to offer.  I already have some exciting research planned that I’ll be sharing with you.

Happy New Year everyone!  I’m wishing you new ancestors!

Ethnicity Testing and Results

I have written repeatedly about ethnicity results as part of the autosomal test offerings of the major DNA testing companies, but I still receive lots of questions about which ethnicity test is best, which is the most accurate, etc.  Take a look at “Ethnicity Percentages – Second Generation Report Card” for a detailed analysis and comparison.

First, let’s clarify which testing companies we are talking about.  They are:

Let’s make this answer unmistakable.

  1. Some of the companies are somewhat better than others relative to ethnicity – but not a lot.
  2. These tests are reasonably reliable when it comes to a continent level test – meaning African, European, Asian and sometimes, Native American.
  3. These tests are great at detecting ancestry over 25% – but if you know who your grandparents are – you already have that information.
  4. The usefulness of these tests for accurately providing ethnicity information diminishes as the percentage of that minority admixture declines.  Said another way – as your percentage of a particular ethnicity decreases, so does the testing companies’ ability to find it.
  5. Intra-continental results, meaning within Europe, for example, are speculative, at best.  Do not expect them to align with your known genealogy.  They likely won’t – and if they do at one vendor – they won’t at others.  Which one is “right”?  Who knows – maybe all of them when you consider population movement, migration and assimilation.
  6. As the vendors add to and improve their data bases, reference populations and analysis tools, your results change. I discussed how vendors determine your ethnicity percentages in the article, “Determining Ethnicity Percentages.”
  7. Sometimes unexpected results, especially continent level results, are a factor of ancient population mixing and migrations, not recent admixture – and it’s impossible to tell the difference. For example, the Celts, from the Germanic area of Europe also settled in the British Isles. Attila the Hun and his army, from Asia, invaded and settled in what is today, Germany, as well as other parts of Eastern Europe.
  8. Ethnicity tests are unreliable in consistently detecting minority admixture. Minority in this context means a small amount, generally less than 5%.  It does not refer to any specific ethnicity. Having said that, there are very few reference data base entries for Native American populations.  Most are from Canada and South America.

In the context of ethnicity, what does unreliable mean?

Unreliable means that the results are not consistent and often not reproducible across platforms, especially in terms of minority admixture.  For example, a German/Hungarian family member shows Native American admixture at low percentages, around 3%, at some, but not all, vendors.  His European family history does not reflect Native heritage and in fact, precludes it.  However, his results likely reflect a common underlying ancestral population, the Yamnaya, that contributed both to the Asian people who settled Hungary and parts of Germany and to the Native American population.

Unreliable can also mean that different vendors, measuring different parts of your DNA, can assign results to different regions.  For example, if you carry Celtic ancestry, would you be surprised to see Germanic results and think they are “wrong?”  Speaking of Celts, they didn’t just stay put in one region within Europe either.  And who were the Celts, and where did they ‘come from’ before they were Celts?  All of this current and ancient admixture is carried in your DNA.  Teasing it out and the meaning it carries is the challenge.

Unreliable may also mean that the tests often do not reflect what is “known” in terms of family history.  I put the word “known” in quotes here, because oral history does not constitute “known” and it’s certainly not proof.  For the most part, documented genealogy does constitute “known” but you can never “know” about an undocumented adoption, also referred to as a “nonparental event” or NPE.  Yes, that’s when one or both parents are not who you think they are based on traditional information.  With the advent of DNA testing, NPEs can, in some instances, be discovered.

So, the end result is that you receive very interesting information about your genetic history that often does not correlate with what you expected – and you are left scratching your head.

However, in some cases, if you’re looking for something specific – like a small amount of Native American or African ancestry, you, indeed, can confirm it through your DNA – and can confirm your family history.  One thing is for sure, if you don’t test, you will never know.

Minority Admixture

Let’s take a look at how ethnicity estimates work relative to minority admixture.

In terms of minority admixture, I’m referring to admixture that is several generations back in your tree.  It’s often revealed in oral history, but unproven, and people turn to genetic genealogy to prove those stories.

In my case, I have several documented Native American lines and a few that are not documented.  All of these results are too far back in time, the 1600s and 1700s, to realistically be “found” in autosomal admixture tests consistently.  I also have a small amount of African admixture.  I know which line this comes from, but I don’t know which ancestor, exactly.  I have worked through these small percentages systematically and documented the process in the series titled, “The Autosomal Me.”  This is not an easy or quick process – and if quick and easy is the type of answer you’re seeking – then working further, beyond what the testing companies give you, with small amounts of admixture, is probably not for you.

Let’s look at what you can expect in terms of inherited admixture.  You receive 50% of your DNA from each parent, approximately 25% from each grandparent, and so forth, halving with each generation, until eventually you receive very little DNA (or none) from your ancestors from many generations back in your tree.

Ethnicity DNA table

Let’s put this in perspective.  The first US census was taken in 1790, so your ancestors born in 1770 should be included in the 1790 census, probably as a child, and in following censuses as an adult.  You carry less than 1% of this ancestor’s DNA.

The first detailed census listing all family members was taken in 1850, so most of your ancestors that contributed more than 1% of your DNA would be found on that or subsequent detailed census forms.

These are often not the “mysterious” ancestors that we seek.  These ancestors, whose DNA we receive in amounts over 1%, are the ones we can more easily track through traditional means.

The reason the column of DNA percentages is labeled “approximate” is because, other than your parents, you don’t receive exactly half of your ancestor’s DNA.  DNA is not divided exactly in half and passed on to subsequent generations, except for what you receive from your parents.  Therefore, you can have more or less of any one ancestor’s individual DNA than would be predicted by the chart, above.  Eventually, as you continue to move further out in your tree, you may carry none of a specific ancestor’s DNA or it is in such small pieces that it is not detected by autosomal DNA testing.
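If you’d like to reproduce the approximate percentages yourself, the halving is easy to sketch in a few lines of Python.  This is only the average expectation, not what any vendor actually calculates, and the function names and relationship labels are my own, chosen for illustration.

```python
# Average (expected) autosomal DNA from a single ancestor, by generation.
# Real inheritance varies around these averages beyond the parent level.

def ancestors_in_generation(generations_back: int) -> int:
    """Number of ancestors in a generation (1 = parents, 2 = grandparents...)."""
    return 2 ** generations_back


def expected_share(generations_back: int) -> float:
    """Average percent of your DNA contributed by ONE ancestor in that generation."""
    return 100.0 / ancestors_in_generation(generations_back)


# Hypothetical labels, used only to make the printout readable.
RELATIONSHIPS = [
    "parents",
    "grandparents",
    "great-grandparents",
    "GG-grandparents",
    "GGG-grandparents",
    "GGGG-grandparents",
    "GGGGG-grandparents",
]

if __name__ == "__main__":
    for generation, label in enumerate(RELATIONSHIPS, start=1):
        count = ancestors_in_generation(generation)
        share = expected_share(generation)
        print(f"{generation}  {label:<20} {count:>4} ancestors  ~{share:.4f}% each")
```

Six generations back gives 64 ancestors at roughly 1.56% each, and one more generation drops each ancestor below 1%, which is why those ancestors are so hard to find in ethnicity results.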

The Vendors

At least two of the three major vendors have made changes of some sort this year in their calculations or underlying data bases.  Generally, they don’t tell us, and we discover the change by noticing a difference when we look at our results.

Historically, Ancestry has been the worst, with widely diverging estimates, especially within continents.  Their current version is picking up both my Native and African.  However, with their history of inconsistency and wildly inaccurate results, it’s hard to have much confidence, even when the current results seem more reasonable and in line with other vendors.  I’ve adopted a reserved “wait and see” position with Ancestry relative to ethnicity.

Family Tree DNA’s Family Finder product is in the middle with consistent results, but they don’t report less than 1% admixture which is often where those distant ancestors’ minority ethnicity would be found, if at all.  However, Family Tree DNA does provide Y and mitochondrial mapping comparisons, and ethnicity comparisons to your matches that are not provided by other vendors.

Ethnicity DNA matches

In this view, you can see the matching ethnicity percentages for those whom you match autosomally.

23andMe is currently best in terms of minority ethnicity detection, in part because they report amounts less than 1%, have a speculative view, which is preferred by most genetic genealogists, and because they paint your ethnicity on your chromosomes, shown below.  You can see that both chromosomes 1 and 2 show Native segments.

Ethnicity 23andMe chromosome

So, looking at minority admixture only – let’s take a look at today’s vendor results as compared to the same vendors in May 2014.

Ethnicity 2014-2015 compare

The Rest of the Story

Keep in mind, we’re only discussing ethnicity here – and there is a lot more to autosomal DNA testing than ethnicity – for example – matching to cousins, tools, such as a chromosome browser (or lack thereof), trees, ease of use and ability to contact your matches.  Please see “Autosomal DNA 2015 – Which Test is the Best?”  Unless ethnicity is absolutely the ONLY reason you are DNA testing, then you need to consider the rest of the story.

And speaking of the rest of the story, National Geographic has been pretty much omitted from this discussion because they have just announced a new upgrade, “Geno 2.0: Next Generation,” to their offering, which promises to be a better biogeographical tool.  I hope so – as National Geographic is in a unique position to evaluate populations with their focus on sample collection from what is left of unique and sometimes isolated populations.  We don’t have much information on the new product yet, and of course, no results because the new test won’t be released until September 2015.  So the jury is out on this one.  Stay tuned.

GedMatch – Not A Vendor, But a Great Toolbox

Finally, most people who are interested in ethnicity test at one (or all) of the companies, utilize the rest of the tools offered by that company, then download their results to www.gedmatch.com, a donation based site, and make use of the numerous contributed admixture tools there.

Ethnicity GedMatch

GedMatch offers lots of options and several tools that provide a wide range of focus.  For example, some tools are specifically written for European, African, Asian or even comparison against ancient DNA results.

Ethnicity ancient admixture

Conclusion

So what is the net-net of this discussion?

  1. There is a lot more to autosomal DNA testing than just ethnicity – so take everything into consideration.
  2. Ethnicity determination is still an infant and emerging field – with all vendors making relatively regular updates and changes. You cannot take minority results to the bank without additional and confirming research, often outside of genetic genealogy. However, mitochondrial or Y DNA testing, available only through Family Tree DNA, can positively confirm Native or minority ancestry in the lines available for testing. You can create a DNA Pedigree Chart to help identify or eliminate Native lines.
  3. If the ancestors you seek are more than a few generations removed, you may not carry enough of their ethnic DNA to be identified.
  4. Your “100% Cherokee” ancestor was likely already admixed – and so their descendants may carry even less Native DNA than anticipated.
  5. You cannot prove a negative using autosomal DNA (but you can with both Y and mitochondrial DNA). In other words, a negative autosomal ethnicity result alone, meaning no Native heritage, does NOT mean your ancestors were not Native. It MIGHT mean they weren’t Native. It also might mean that they were either very admixed or the Native ancestry is too far back in your tree to be found with today’s technology. Again, mitochondrial and Y DNA testing provide confirmed ancestry identification for the lines they represent. Y is the male paternal (surname) line and mitochondrial is the matrilineal line of both males and females – the mother’s, mother’s, mother’s line, on up the tree until you run out of mothers.
  6. It is very unlikely that you will be able to find your tribe, although it is occasionally possible. If a company says they can do this, take that claim with a very big grain of salt. Your internal neon warning sign should be flashing about now.
  7. If you’re considering purchasing an ethnicity test from a company other than the four I mentioned – well, just don’t.  Many use very obsolete technology and oversell what they can reliably provide.  They don’t have any better reference populations available to them than the major companies and Nat Geo, and let’s just say there are ways to “suggest” people are Native when they aren’t. Here are two examples of accidental ways people think they are Native or related – so just imagine what kind of damage could be done by a company that was intentionally providing “marginal” or misleading information to people who don’t have the experience to know that because they “match” someone who has a Native ancestor doesn’t mean they share that same Native ancestor – or any connection to that tribe. So, stay with the known companies if you’re going to engage in ethnicity testing. We may not like everything about the products offered by these companies, but we know and understand them.

My Recommendation

By all means, test.

Test with all three companies, 23andMe, Family Tree DNA and Ancestry – then download your results from either Family Tree DNA or Ancestry (who test more markers than 23andMe) to GedMatch and utilize their ethnicity tools.  When I’m looking for minority admixture, I tend to look for consistent trends – not just at results from any one vendor or source.

If you have already tested at Ancestry, or you tested at 23andMe on the V3 chip, prior to December 2013, you can download your raw data file to Family Tree DNA and pay just $39.  Family Tree DNA will process your raw data within a couple days and you will then see your myOrigins ethnicity results as interpreted by their software.  Of course, that’s in addition to having access to Family Tree DNA‘s other autosomal features, functions and tools.  The transfer price of $39 is significantly less expensive than retesting.

Just understand that what you receive from these companies in terms of ethnicity is reflective of both contemporary and ancient admixture – from all of your ancestral lines.  This field is in its infancy – your results will change from time to time as we learn – and the only part of ethnicity that is cast in concrete is probably your majority ancestry which you can likely discern by looking in the mirror.  The rest – well – it’s a mystery and an adventure.  Welcome aboard to the miraculous mysterious journey of you, as viewed through the DNA of your ancestors!

______________________________________________________________

Some Native Americans Had Oceanic Ancestors

This week has seen a flurry of new scientific and news articles.  What has been causing such a stir?  It appears that Australian or more accurately, Australo-Melanese DNA has been found in South America’s Native American population. In addition, it has also been found in Aleutian Islanders off the coast of Alaska.  In case you aren’t aware, that’s about 8,500 miles as the crow flies.  That’s one tired crow.  As the person paddles or walks along the shoreline, it’s even further, probably about 12,000 miles.

Aleutians to Brazil

Whatever the story, it was quite a journey and it certainly wasn’t all over flat land.

This isn’t the first inkling we’ve had.  Just a couple weeks ago, it was revealed that the Botocudo remains from Brazil were Polynesian and not admixed with either Native, European or African.  This admixture was first discovered via mitochondrial DNA, but full genome sequencing confirmed their ancestry and added the twist that they were not admixed – an extremely unexpected finding.  This is admittedly a bit confusing, because it implies that there were new Polynesian arrivals in the 1600s or 1700s.

Unlikely as it seems, it obviously happened, so we set that aside as relatively contemporary.

The findings in the papers just released are anything but contemporary.

The First Article

The first article in Science, “Genomic evidence for the Pleistocene and recent population history of Native Americans” by Raghavan et al., published this week, provides the following summary (bolding is mine):

How and when the Americas were populated remains contentious. Using ancient and modern genome-wide data, we find that the ancestors of all present-day Native Americans, including Athabascans and Amerindians, entered the Americas as a single migration wave from Siberia no earlier than 23 thousand years ago (KYA), and after no more than 8,000-year isolation period in Beringia. Following their arrival to the Americas, ancestral Native Americans diversified into two basal genetic branches around 13 KYA, one that is now dispersed across North and South America and the other is restricted to North America. Subsequent gene flow resulted in some Native Americans sharing ancestry with present-day East Asians (including Siberians) and, more distantly, Australo-Melanesians. Putative ‘Paleoamerican’ relict populations, including the historical Mexican Pericúes and South American Fuego-Patagonians, are not directly related to modern Australo-Melanesians as suggested by the Paleoamerican Model.

This article in EurekAlert and a second one here discuss the Science paper.

Raghavan 2015

Migration map from the Raghavan paper.

The paper included the gene flow and population migration map, above, along with dates.

The scientists sequenced the DNA of 31 living individuals from the Americas, Siberia and Oceania as follows:

Siberian:

  • Altai – 2
  • Buryat – 2
  • Ket – 2
  • Koryak – 2
  • Sakha – 2
  • Siberian Yupik – 2

North American Native:

  • Tsimshian (number not stated, but by subtraction, it’s 1)

Southern North American, Central and South American Native:

  • Pima – 1
  • Huichol -1
  • Aymara – 1
  • Yukpa – 1

Oceania:

  • Papuan – 14
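The “by subtraction” figure for the Tsimshian can be checked with a quick tally.  This is just my own arithmetic on the counts listed above, not anything taken from the paper’s supplementary data, and the variable names are made up for the purpose.

```python
# Tally of the stated sample counts; the Tsimshian count is whatever is
# left over from the 31 living individuals reported in total.

TOTAL_LIVING_INDIVIDUALS = 31

siberian = {
    "Altai": 2,
    "Buryat": 2,
    "Ket": 2,
    "Koryak": 2,
    "Sakha": 2,
    "Siberian Yupik": 2,
}
southern_native = {"Pima": 1, "Huichol": 1, "Aymara": 1, "Yukpa": 1}
oceanian = {"Papuan": 14}

stated_total = (
    sum(siberian.values())
    + sum(southern_native.values())
    + sum(oceanian.values())
)
tsimshian = TOTAL_LIVING_INDIVIDUALS - stated_total

if __name__ == "__main__":
    print(f"Stated samples: {stated_total}")
    print(f"Tsimshian by subtraction: {tsimshian}")
```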

The researchers also state that they utilized 17 specimens from relict groups such as the Pericues from Mexico and Fuego-Patagonians from the southernmost tip of South America.  They also sequenced two pre-Columbian mummies from the Sierra Tarahumara in northern Mexico.  In total, 23 ancient samples from the Americas were utilized.

They then compared these results with a reference panel of 3053 individuals from 169 populations which included the ancient Saqqaq Greenland individual at about 4,000 years of age as well as the Anzick child from Montana from about 12,500 years ago and the Mal’ta child from Siberia at 24,000 years of age.

Not surprisingly, all of the contemporary samples with the exception of the Tsimshian genome showed recent western Eurasian admixture.

As expected, the results confirm that the Yupik and Koryak are the closest Eurasian population to the Americas.  They indicate that there is a “clean split” between the Native American population and the Koryak about 20,000 years ago.

They found that “Athabascans and Anzick-1, but not the Greenlandic Inuit and Saqqaq belong to the same initial migration wave that gave rise to present-day Amerindians from southern North America and Central and South America, and that this migration likely followed a coastal route, given our current understanding of the glacial geological and paleoenvironmental parameters of the Late Pleistocene.”

Evidence of gene flow between the two groups was also found, meaning between the Athabascans and the Inuit.  Additionally, they found evidence of post-split gene flow between Siberians and Native Americans which seems to have stopped about 12,000 years ago, which meshes with the time that the Beringia land bridge was flooded by rising seas, cutting off land access between the two land masses.

They state that the results support all Native migration from Siberia, contradicting claims of an early migration from Europe.

The researchers then studied the Karitiana people of South America and determined that the two groups, Athabascans and Karitiana diverged about 13,000 years ago, probably not in current day Alaska, but in lower North America.  This makes sense, because the Clovis Anzick child, found in Montana, most closely matches people in South America.

By the Clovis period of about 12,500 years ago, the Native American population had already split into two branches, the northern and southern, with the northern including Athabascan and other groups such as the Chippewa, Cree and Ojibwa.  The Southern group included people from southern North America and Central and South America.

Interestingly, while admixture with the Inuit was found with the Athabascan, Inuit admixture was not found among the Cree, Ojibwa and Chippewa.  The researchers suggest that this may be why the southern branch, such as the Karitiana are genetically closer to the northern Amerindians located further east than to northwest coast Amerindians and Athabascans.

Finally, we get to the Australian part.  The researchers when trying to sort through the “who is closer to whom” puzzle found unexpected results.  They found that some Native American populations including Aleutian Islanders, Surui (Brazil) and Athabascans are closer to Australo-Melanesians compared to other Native Americans, such as Ojibwa, Cree and Algonquian and South American Purepecha (Mexico), Arhuaco (Colombia) and Wayuu (Colombia, Venezuela).  In fact, the Surui are one of the closest populations to East Asians and Australo-Melanese, the latter including Papuans, non-Papuan Melanesians, Solomon Islanders and hunter-gatherers such as Aeta. The researchers acknowledge these are weak trends, but they are nonetheless consistently present.

Dr. David Reich, from Harvard, a co-author of another paper, also published this past week, says that 2% of the DNA of Amazonians is from Oceania.  If that is consistent, it speaks to a founder population in isolation, such that the 2% just keeps getting passed around in the isolated population, never being diluted by outside DNA.  I would suggest that is not a weak signal.
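To illustrate why isolation matters to that argument, here is a deliberately simple sketch.  The model is my own assumption, not something from either paper: each generation, a fixed fraction of the population’s DNA is replaced by outside DNA that carries none of the Oceanic signal.  The function name and the gene flow rates are hypothetical.

```python
# Toy dilution model (an assumption for illustration only): every generation,
# a fixed fraction of the gene pool is replaced by outside DNA that carries
# none of the signal, so the signal shrinks by (1 - outside_gene_flow).

def remaining_share(initial_share: float, outside_gene_flow: float,
                    generations: int) -> float:
    """Population-average share of the signal after the given generations."""
    return initial_share * (1.0 - outside_gene_flow) ** generations


if __name__ == "__main__":
    start = 0.02  # the 2% figure quoted above
    for flow in (0.0, 0.001, 0.01):  # hypothetical per-generation rates
        after = remaining_share(start, flow, generations=400)
        print(f"outside gene flow {flow:.3f}: {after * 100:.4f}% after 400 generations")
```

With no outside gene flow, the average stays at 2% indefinitely, while even a small steady inflow wears it down over hundreds of generations.  That is the sense in which a persistent 2% points to a population that remained isolated.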

The researchers suggest that the variance in the strength of this Oceanic signal indicates that the introduction of the Australo-Melanese occurred after the initial peopling of the Americas.  The ancient samples cluster with the Native American groups, do not show the Oceanic markers and show no evidence of gene flow from Oceania.

The researchers also included cranial morphology analysis, which I am omitting since cranial morphology seems to have led researchers astray in the past, specifically in the case of Kennewick man.

One of the reasons cranial morphology is such a hotly debated topic is because of the very high degree of cranial variance found in early skeletal remains.  One of the theories evolving from the cranial differences involving the populating of the Americas has been that the Australo-Melanese were part of a separate and earlier migration that gave rise to the earliest Americans who were then later replaced by the Asian ancestors of current day Native Americans.  If this were the case, then the now-extinct Fuego-Patagonian samples from the location furthest south on the South American land mass should have included DNA from Oceania, but they didn’t.

The Second Article

A second article published this week, titled “’Ghost population’ hints at long lost migration to the Americas” by Ewen Callaway, discusses similar findings, presented in a draft letter to Nature titled “Genetic evidence for two founding populations of the Americas” by Skoglund et al.  This second group finds the same Australo-Melanesian DNA signal in Native American populations but suggests that it may be from the original migration and settlement event, that there may have been two distinct founding populations that settled at the same time, or that there were two founding events.

EurekAlert discusses the article as well.

It’s good to have confirmation and agreement between the two labs who happened across these results independently that the Australo-Melanesian DNA is present in some Native populations today.

Their interpretations and theories about how this Oceanic DNA arrived in some of the Native populations vary a bit, but if you read the details, it’s really not quite as different as it first appears from the headlines.  Neither group claims to know for sure, and both discuss possibilities.

Questions remain.  For example, if the founding group was small, why, then, don’t all of the Native people and populations have at least some Oceanic markers?  The Anzick Child from 12,500 years ago does not.  He is most closely related to the tribes in South America, where the Oceanic markers appear with the highest frequencies.

In the Harvard study, the scientists fully genome sequenced 63 individuals without discernable evidence of European or African ancestors in 21 Native American populations, restricting their study to individuals from Central and South America that have the strongest evidence of being entirely derived from a homogenous First American ancestral population.

Their results show that the two Amazonian groups, Surui and Karitiana, are closest to the “Australasian populations, the Onge from the Andaman Island in the Bay of Bengal (a so-called ‘Negrito’ group), New Guineans, Papuans and indigenous Australians.”  Within those groups, the Australasian populations are the only outliers – meaning no African, European or East Asian DNA was found in the Native American people.

When repeating these tests, utilizing blood instead of saliva, a third group was shown to also carry these Oceanic markers – the Xavante, a population from the Brazilian plateau that speaks a language of the Ge group that is different from the Tupi language group spoken by the Karitiana and Surui.

Skoglund 2015-2

The closest populations that these Native people matched in Oceania, shown above on the map from the draft Skoglund letter, were, in order, New Guineans, Papuans and Andamanese.  The researchers further state that populations from west of the Andes or north of the Panama isthmus show no significant evidence of an affinity to the Onge from the Andaman Islands with the exception of the Cabecar (Costa Rica).

That’s a very surprising finding, given that one would expect more admixture on the west, which is the side of the continent where the migration occurred.

The researchers then compared the results with other individuals, such as Mal’ta child who is known to have contributed DNA to the Native people today, and found no correlation with Oceanic DNA.  Therefore, they surmised that the Oceanic admixture cannot be explained by a previously known admixture event.

They propose that a mystery population they have labeled as “Population Y” (after Ypykuera which means ancestor in the Tupi language family) contributed the Australasian lineage to the First Americans and that it was already mixed into the lineage by the time it arrived in Brazil.

According to their work, Population Y may itself have been admixed, and the 2% of Oceanic DNA found in the Brazilian Natives may reflect anywhere between 2% and 85% of the DNA of the Surui, Karitiana and Xavante having come from Population Y.  They mention that this result is striking in that the majority of the craniums that are more Oceanic in nature than Asiatic, the latter being what would be expected from people who migrated from Siberia, are found in Brazil.

They conclude that the variance in the presence or absence of DNA in Native people and remains, and the differing percentages argue for more than one migration event and that “the genetic ancestry of Native Americans from Central and South America cannot be due to a single pulse of migration south of the Late Pleistocene ice sheets from a homogenous source population, and instead must reflect at least two streams of migration or alternatively a long drawn out period of gene flow from a structured Beringian or Northeast Asian source.”

Perhaps even more interesting is the following statement:

“The arrival of population Y ancestry in the Americas must in any scenario have been ancient: while Population Y shows a distant genetic affinity to Andamanese, Australian and New Guinean populations, it is not particularly closely related to any of them, suggesting that the source of population Y in Eurasia no longer exists.”

They further state they find no admixture indication that would suggest that Population Y arrived in the last few thousand years.

So, it appears that perhaps the Neanderthals and Denisovans were not the only people who were our ancestors, but no longer exist as a separate people, only as an admixed part of us today.  We are their legacy.

The Take Away

When I did the Anzick extractions, we had hints that something of this sort might have been occurring.  For example, I found surprising instances of haplogroup M, which is neither European, African nor Native American, so far as we know today.  This may have been a foreshadowing of this Oceanic admixture.  It may also be a mitochondrial artifact.  Time will tell.  Perhaps haplogroup M will turn out to be Native by virtue of being Oceanic and admixed thousands of years ago.  There is still a great deal to learn.  Regardless of how these haplogroups and Oceanic DNA arrived in Brazil in South America and in the Aleutian Islands off of Alaska, one thing is for sure, it did.

We know that the Oceanic DNA found in the Brazilian people studied for these articles is not contemporary and is ancient.  This means that it is not related to the Oceanic DNA found in the Botocudo people, who, by the way, also sport mitochondrial haplogroups that are within the range of Native people, meaning haplogroup B, but have not been found in other Native people.  Specifically, haplogroups B4a1a1 and B4a1a1a.  Additionally, there are other B4a1a, B4a1b and B4a1b1 results found in the Anzick extract which could also be Oceanic.  You can see all of the potential and confirmed Native American mitochondrial DNA results in my article “Native American Mitochondrial Haplogroups” that I update regularly.

We don’t know how the Botocudo arrived, but the when has been narrowed to the 1600s or 1700s.  We don’t know how the Oceanic DNA in the Brazilian people arrived either, but the when was ancient.  This means that Oceanic DNA has arrived in South America at least twice and is found among the Native peoples both times.

We know that some Native groups have some Oceanic admixture, and others seem to have none, in particular the Northern split group that became the Cree, Ojibwa, Algonquian, and Chippewa.

We know that the Brazilian Native groups are most closely related to Oceanic groups, but that the first paper also found Oceanic admixture in the Aleutian Islands.  The second paper focused on the Central and South American tribes.

We know that the eastern American tribes, specifically the Algonquian tribes are closely related to the South Americans, but they don’t share the Oceanic DNA and neither do the mid-continent tribes like the Cree, Ojibwa and Chippewa.  The only Paleolithic skeleton that has been sequenced, Anzick, from 12,500 years ago in Montana also does not carry the Oceanic signature.

In my opinion, the disparity between who does and does not carry the Oceanic signature suggests that the source of the Oceanic DNA in the Native population could not have been a member of the first party to exit out of Beringia and settle in what is now the Americas.  Given that this had to be a small party, all of the individuals would have been thoroughly admixed with each other’s ancestral DNA within just a couple of generations.  It would have been impossible for one ancestor’s DNA to only be found in some people.  To me, this argues for one of two scenarios.

First, a second immigration wave that joined the first wave but did not admix with some groups that might have already split off from the original group such as the Anzick/Montana group.

Second, multiple Oceanic immigration events.  We still have to consider the possibility that there were multiple events that introduced Oceanic DNA into the Native population.  In other words, perhaps the Aleutian Islands Oceanic DNA is not from the same migration event as the Brazilian DNA which we know is not from the same event as the Botocudo.  I would very much like to see the Oceanic DNA appear in a migration path of people, not just in one place and then the other.  We need to connect the dots.

What this new information does is to rule out the possibility that there truly was only one wave of migration – one group of people who settled the Americas at one time.  More likely, at least until the land bridge submerged, is that there were multiple small groups that exited Beringia over the 8,000 or so years it was inhabitable.  Maybe one of those groups included people from Oceania.  Someplace, sometime, as unlikely as it seems, it happened.

The amazing thing is that it’s more than 10,000 miles from Australia to the Aleutian Islands, directly across the Pacific.  Early adventurers would have likely followed a coastal route to be sustainable, which would have been significantly longer.  The fact that they survived and sent their DNA on a long adventure from Australia to Alaska to South America, and that it’s still present today, is absolutely amazing.

Australia to Aleutians

We know we still have a lot to learn and this is the tip of a very exciting iceberg.  As more contemporary and ancient Native people have their full genomes sequenced, we’ll learn more answers.  The answer is in the DNA.  We just have to sequence enough of it and learn how to understand the message being delivered.

______________________________________________________________

Disclosure

I receive a small contribution when you click on some of the links to vendors in my articles. This does NOT increase the price you pay but helps me to keep the lights on and this informational blog free for everyone. Please click on the links in the articles or to the vendors below if you are purchasing products or DNA testing.

Thank you so much.

DNA Purchases and Free Transfers

Genealogy Services

Genealogy Research

Botocudo Ancient Remains from Brazil

Update: Please note that I am leaving this article because the scientific information is accurate, BUT, it was subsequently discovered that the remains were mislabeled in the museum and were not Native.

One thing you can always count on in the infant science of population genetics…  whatever you think you know, for sure, for a fact…well….you don’t.  So don’t say too much, too strongly or you’ll wind up having to decide if you’d like catsup with your crow!  Well, not literally, of course.  It’s an exciting adventure that we’re on together and it just keeps getting better and better.  And the times…they are a changin’.

We have some very interesting news to report.  Fortunately, or unfortunately – the news weaves a new, but extremely interesting, mystery.

Ancient Mitochondrial DNA

Back in 2013, a paper, Identification of Polynesian mtDNA haplogroups in remains of Botocudo Amerindians from Brazil, was published that identified both Native American and Polynesian haplogroups in a group of 14 skeletal remains of Botocudo Indians from Brazil whose remains arrived at a museum in August of 1890 and who, the scientists felt, died in the second half of the 19th century.

Twelve of their mitochondrial haplogroups were the traditional Native haplogroup of C1.

However, two of the skulls carried Polynesian haplogroups, downstream of haplogroup B, specifically B4a1a1a and B4a1a1, that compare to contemporary individuals from Polynesian, Solomon Island and Fijian populations.  These haplotypes had not been found in Native people or previous remains.

Those haplogroups include what is known as the Polynesian motif and are found in Indonesian populations and also in Madagascar, according to the paper, but the time to the most recent common ancestor for that motif was calculated at 9,300 years plus or minus 2,000 years.  This suggests that the motif arose after the Asian people who would become the Native Americans had already entered North and South America through Beringia, assuming there were no later migration waves.

The paper discusses several possible scenarios as to how a Polynesian haplotype found its way to central Brazil among a now extinct Native people. The two broad options are either pre-Columbian (pre-1500) contact or post-Columbian contact, meaning anytime from the 1500s to the present, with the latter suggesting that the founders who carried the Polynesian motif were perhaps either slaves or sailors.

In the first half of the 1800s, the Botocudo Indians had been pacified and worked side by side with African slaves on plantations.

Beyond that, without full genome sequencing there was no more that could be determined from the remains at that time.  We know they carried a Polynesian motif, were found among Native American remains and at some point in history, intermingled with the Native people because of where they were found.  Initial contact could have been 9,000 years ago or 200.  There was no way to tell.  They did have some exact HVR1 and HVR2 matches, so they could have been “current,” but I’ve also seen HVR1 and HVR2 matches that reach back to a common ancestor thousands of years ago…so an HVR1/HVR2 match is nothing you can take to the bank, certainly not in this case.

Full Genome Sequencing and Y DNA

This week, one of my subscribers, Kalani, mentioned that Felix Immanuel had uploaded another two kits to GedMatch of ancient remains.  Those two kits are indeed two of the Botocudo remains – the two with the Polynesian mitochondrial motif which have now been fully sequenced.  A corresponding paper has been published as well, “Two ancient genomes reveal Polynesian ancestry among the indigenous Botocudos of Brazil” by Malaspinas et al with supplemental information here.

There are two revelations in this paper and in the subsequent work of citizen scientists which are absolutely fascinating.

First, their Y haplogroups are C-P3092 and C-Z31878, both equivalent to C-B477 which identifies former haplogroup C1b2.  The Y haplogroups aren’t identified in the paper, but Felix identified them in the raw data files that are available (for those of you who are gluttons for punishment) at the google drive links in Felix’s article Two Ancient DNA from indigenous Botocudos of Brazil.

I’ve never seen haplogroup C1b2 as Native American, but I wanted to be sure I hadn’t missed a bus, so I contacted Ray Banks who is one of the administrators for the main haplogroup C project at Family Tree DNA and also is the coordinator for the haplogroup C portion of the ISOGG tree.

ISOGG y tree

You can see the position of C1b2, C-B477 in yellow on the ISOGG (2015) tree relative to the position of C-P39 in blue, the Native American SNP shown several branches below, both as branches of haplogroup C.

Ray maintains a much more descriptive tree of haplogroup C1 at this link and of C2 at this link.

Ray Banks C1 tree

The branch above is the Polynesian (B477) branch and below, the Native American (P39) branch of haplogroup C.

Ray Banks C2 tree

In addition to confirming the haplogroup that Felix identified, when Ray downloaded the BAM files and analyzed the contents, he found that both samples were also positive for M38 and M208, which moves them downstream two branches from C1b2 (B477).

Furthermore, one of the samples had a mutation at Z32295 which Ray has included as a new branch of the C tree, shown below.

Ray Banks Z32295

Ray indicated that the second sample had a “no read” at Z32295, so we don’t know if he carried this mutation.  Ray mentions that both men are negative for many of the B459 equivalents, which would move them down one more branch.  He also mentioned that about half of the Y DNA sites are missing, meaning they had no calls in the sequence read.  This is common in ancient DNA results.  It would be very interesting to have a Big Y or equivalent test on contemporary individuals with this haplogroup from the Pacific Island region.

Ray notes that all Pacific Islanders may be downstream of Z32295.

Not Admixed

The second interesting aspect of the genomic sequencing is that the remains did not show any evidence of admixture with European, Native American or African individuals.  More than 97% of their genome fits exactly with the Polynesian motifs.  In other words, they appear to be first generation Polynesians.  They carry Polynesian mitochondrial, Y and autosomal (nuclear) DNA, exclusively.

Botocudo not admixed

In total, 25 Botocudo remains have been analyzed and of those, two have Polynesian ancestry and those two, BOT15 and BOT17, have exclusively Polynesian ancestry as indicated in the graphic above from the paper.

When did they live?  Accelerator mass spectrometry radiocarbon dating with marine correction gives us dates of 1479-1708 AD and 1730-1804 for specimen BOT15 and 1496-1842 for BOT17.

The paper goes on to discuss four possible scenarios for how this situation occurred and the pros and cons of each.

The Polynesian Peru Slave Trade

This occurred between 1862-1864 and can be ruled out because the dates for the skulls predate this trade period, significantly.

The Madagascar-Brazil Slave Trade

The researchers state that Madagascar is known to have been peopled by Southeast Asians and not by Polynesians.  Another factor excluding this option is that it’s known that the Malagasy ancestors admixed with African populations prior to the slave trade.  No such ancestry was detected in the samples, so these individuals were not brought as a result of the Madagascar-Brazil slave trade – contrary to what has been erroneously inferred and concluded.

Voyaging on European Ships as Crew, Passengers or Stowaways

Trade on Euroamerican ships in the Pacific only began after 1760 AD and by 1760, BOT15 and BOT17 were already deceased with a probability of .92 and .81, respectively, making this scenario unlikely, but not entirely impossible.

Polynesian Voyaging

Polynesian ancestors originated from East Asia and migrated eastwards, interacting with New Guineans before colonizing the Pacific.  These people did colonize the Pacific, as unlikely as it seems, traveling thousands of miles, reaching New Zealand, Hawaii and Easter Island between 1200 and 1300 AD.  Clearly they did not reach Brazil in this timeframe, at least not as related to these skeletal remains, but that does not preclude a later voyage.

Of the four options, the first two appear to be firmly eliminated, which leaves only the last two options.
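The date-based part of this elimination can be sketched in a few lines of Python. This is only a rough illustration, not the paper's method: the ranges and start years are the ones quoted above, while the function name `classify` and the three labels are hypothetical. It also ignores the probability weighting behind the .92 and .81 figures and simply checks whether the calibrated ranges fall before a scenario's start year.

```python
# Calibrated radiocarbon ranges (years AD) as quoted above from the paper.
CALIBRATED_RANGES = {
    "BOT15": [(1479, 1708), (1730, 1804)],
    "BOT17": [(1496, 1842)],
}

# Earliest year each date-dependent scenario could have begun, per the text above.
SCENARIO_START = {
    "Polynesian Peru slave trade": 1862,
    "Voyaging on European ships": 1760,
}


def classify(ranges, start_year):
    """Compare a specimen's calibrated ranges with a scenario's start year.

    Hypothetical helper: a plain interval comparison, with no probabilities.
    """
    earliest = min(begin for begin, _ in ranges)
    latest = max(end for _, end in ranges)
    if latest < start_year:
        return "excluded by dates"
    if earliest >= start_year:
        return "consistent with dates"
    return "not excluded by dates alone"


for scenario, start in SCENARIO_START.items():
    for specimen, ranges in CALIBRATED_RANGES.items():
        print(f"{scenario} / {specimen}: {classify(ranges, start)}")
```

Under these assumptions, every range ends before 1862, so the Peru slave trade is excluded on dates alone, while the ranges straddle 1760, which is why the European ship scenario is unlikely but cannot be ruled out entirely.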

One of the puzzling aspects of this analysis is the “pure” Polynesian genome, eliminating admixture which precludes earlier arrival.

The second puzzling aspect is how the individuals, and there were at least two, came to find themselves in Minas Gerais, Brazil, and why we have not found this type of DNA on the more likely western coastal areas of South America.

Minas Gerais Brazil

Regardless of how they arrived, they did, and now we know at least a little more of their story.

GedMatch

At GedMatch, it’s interesting to view the results of the one-to-one matching.

Both kits have several matches.  At 5cM and 500 SNPs, kit F999963 has 86 matches.  Of those, the mitochondrial haplogroup distribution is overwhelmingly haplogroup B, specifically B4a1a1 with a couple of interesting haplogroup Ms.

F999963 mito

Y haplogroups are primarily C2, C3 and O.   C3 and O are found exclusively in Asia – meaning they are not Native.

F999963 Y

Kit F999963 matches a couple of people at over 30cM with a generation match estimate just under 5 generations.  Clearly, this isn’t possible given that this person had died by about 1760, according to the paper, which is 255 years or about 8.5-10 generations ago, but it says something about the staying power of DNA segments and probably about endogamy and a very limited gene pool as well.  All matches over 15cM are shown below.

F999963 largest
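The generation arithmetic for kit F999963 can be reproduced with a short Python sketch. The reference year of 2015 is an assumption implied by the 255-year figure (1760 + 255), and the 25 and 30 year generation lengths are assumptions chosen because they reproduce the 8.5-10 range; the function name is hypothetical.

```python
def generations_elapsed(death_year, reference_year, years_per_generation):
    """Approximate number of generations between two years."""
    return (reference_year - death_year) / years_per_generation


DEATH_YEAR = 1760      # latest likely year of death, per the paper
REFERENCE_YEAR = 2015  # assumed "today", implied by the 255-year figure

for years_per_generation in (30, 25):  # assumed generation lengths
    count = generations_elapsed(DEATH_YEAR, REFERENCE_YEAR, years_per_generation)
    print(f"{years_per_generation} years per generation: {count:.1f} generations")
```

Either way, the result is roughly double the "just under 5 generations" estimate, which is the mismatch the paragraph above attributes to endogamy and a limited gene pool.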

Kit F999964 matches 97 people, many of whom are different from the people that kit F999963 matched.  So these ancient Polynesian people, F999963 and F999964, don’t appear to be immediate relatives.

F999964 mito

Again, a lot of haplogroup B mitochondrial DNA, but less haplogroup C Y DNA and no haplogroup O individuals.

F999964 Y

Kit F999964 doesn’t match anyone quite as closely as kit F999963 did in terms of total cM, but the largest segment is 12cM, so the generational estimate is still at 4.6.  All matches over 15cM are shown below.

F999964 largest

Who are these individuals that these ancient kits are matching?  Many of these individuals know each other because they are of Hawaiian or Polynesian heritage and have already been working together.  Several of the Hawaiian folks are upwards of 80%, one at 94% and one believed to be 100% Hawaiian.  Some of these matches are to Maori, a Polynesian people from New Zealand, with one believed to be 100% Maori in addition to several admixed Maori.  So obviously, these ancient remains are matching contemporary people with Polynesian ancestry.

The Unasked Question

Sooner or later, we as a community are going to have to face the question of exactly what is Native or aboriginal.  In this case, because we do have the definitive autosomal full genome testing that eliminates admixture, these two individuals are clearly NOT Native.  Without full genomic testing, we would have never known.

But what if they had arrived 200 years earlier, around 1500 AD, one way or another, possibly on an early European ship, and had intermixed with the Native people for 10 generations?  What if they carried a Polynesian mitochondrial (or Y) DNA motif, but they were nearly entirely Native, or so much Native that the Polynesian could no longer be found autosomally?  Are they Native?  Is their mitochondrial or Y DNA now also considered to be Native?  Or is it still Polynesian?  Is it Polynesian if it’s found in the Cook Islands or on Hawaii and Native if found in South America?  How would we differentiate?

What if they arrived, not in 1500 AD, but about the year 500 AD, or 1000 BCE or 2000 BCE or 3000 BCE – after the Native people from Asia arrived but unquestionably before European contact?  Does that make a difference in how we classify their DNA?
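To put a rough number on the 10-generation scenario above, here is a minimal Python sketch. It assumes a single Polynesian ancestor with every other ancestor Native, and it uses the simple average of one half per generation; actual inheritance varies because of recombination, so a given descendant could carry more, less, or none. The function name is hypothetical.

```python
def expected_autosomal_share(generations):
    """Average share of autosomal DNA expected from one ancestor
    this many generations back (simple halving, an approximation)."""
    return 0.5 ** generations


for generations in (1, 5, 10):
    share = expected_autosomal_share(generations)
    print(f"{generations} generations back: {share:.4%} on average")
```

At 10 generations the average works out to roughly a tenth of one percent, which is small enough that autosomal testing could easily miss it while the mitochondrial or Y line would still carry the Polynesian motif intact.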

We don’t have to answer this yet today, but something tells me that we will, sooner or later…and we might want to start pondering the question.

Acknowledgements: 

I want to thank all of the people involved whose individual work makes this type of comparative analysis possible.  After all, the power of genetic genealogy, contemporary or ancient, is in collaboration.  Without sharing, we have nothing. We learn nothing.  We make no progress.

In addition to the various scientists and papers already noted, special thanks to Felix Immanuel for preparing and uploading the ancient files.  This is no small task and the files often take a month of prep each.  Thanks to Kalani for bringing this to my attention.  Thanks to Ray Banks for his untiring work with haplogroup C and for maintaining his haplogroup webpage with specifics about where the various subgroups are found.  Thanks to ISOGG’s volunteers for the haplotree.  Thanks to GedMatch for providing this wonderful platform and tools.  Thanks to everyone who uploads their DNA, and that of their relatives and works on specific types of projects – like Hawaiian and Maori.  Thanks to my haplogroup C-P39 co-administrators, Dr. David Pike and Marie Rundquist, for their contributions to this discussion and for working together on the Native American Haplogroup C-P39 Project.  It’s important to have other people who are passionate about the same subjects to bounce things off of and to work with.  This is the perfect example of the power of collaboration!
