Peopling of Europe 2014 – Identifying the Ghost Population

Beginning with the full sequencing of the Neanderthal genome, first published in May 2010 by the Max Planck Institute with Svante Pääbo at the helm, and followed shortly thereafter by a Denisovan specimen, we began to unravel our ancient history.

neanderthal reconstructed

Neanderthal man, reconstructed at the National Museum of Nature and Science in Tokyo

The photo below shows a step in the process of extracting DNA from ancient bones at Max Planck.

planck extraction

Our Y and mitochondrial DNA haplogroups take us back thousands of years in time, but at some point, where and how people were settling and intermixing becomes fuzzy. Ancient DNA can put the people of that time and place in context.  We have discovered that current populations do not necessarily represent the ancient populations of a particular locale.

Recent information discovered from ancient burials tells us that the ancestry of Europeans fits a three-pronged model. Until recently, it was believed that Europeans descended from Paleolithic hunter-gatherers and Neolithic farmers, a two-pronged model.

Previously, it was believed that Europe was peopled by the ancient hunter-gatherers, the Paleolithic, who originally settled in Europe beginning about 45,000 years ago. At this time, the Neanderthal were already settled in Europe but weren’t considered to be anatomically modern humans, and it was believed, incorrectly, that the two groups did not interbreed.  These hunter-gatherers were the people who settled in Europe before the last major ice age, the Younger Dryas, taking refuge in the southern portions of Europe and Eurasia, and repeopling the continent after the ice receded, about 12,000 years ago.  By that time, the Neanderthals were gone, or as we now know, at least partially assimilated.

This graphic shows Europe during the last ice age.

ice age euripe

The second settlement wave, the agriculturalist farmers from the Near East, either overran or integrated with the hunter-gatherers in the Neolithic period, depending on which theory you subscribe to, about 8,000-10,000 years ago.

2012 – Ancient North Eurasian (ANE) Hints

In 2012, we began to see hints of a third lineage, from the north, that also contributed to the peopling of Europe. Buried in the 2012 paper, Estimating admixture proportions and dates with ADMIXTOOLS by Patterson et al, was a very interesting tidbit.  This new technique showed a third population that contributed to the European population, referred to by many as a “ghost population” because no one knew who they were.

patterson ane

The new population was termed Ancient North Eurasian, or ANE.

Dienekes covered this paper in his blog, but without additional information, the community in general responded with little more than a yawn.

2013 – Mal’ta Child Stirs Excitement

The first real hint of meat on the bones of ANE came in the form of ancient DNA analysis of a 24,000-year-old Siberian boy who has come to be named Mal’ta (Malta) Child. In the original paper, by Raghavan et al, Upper Palaeolithic Siberian genome reveals dual ancestry of Native Americans, he was referred to as MA-1.  I wrote about this in my article titled Native American Gene Flow – Europe?, Asia and the Americas.   Dienekes wrote about this paper as well.

This revelation caused quite a stir, because it was reported that the ancestor of Native Americans in Asia was 30% Western Eurasian.  Unfortunately, in some cases, this was immediately interpreted to mean that Native Americans had come directly from Europe, which is not what this paper said, nor implied.  It was also inferred that the haplogroups of this child, R* (Y) and U (mtDNA), were Native American, which is also incorrect.  To date, there is no evidence for migration to the New World from Europe in ancient times, but that doesn’t mean we aren’t still looking for that evidence in early burials.

What this paper did show was that Europeans and Native Americans shared a common ancestor, and that the Siberian population had contributed to the European population as well as the Native American population.  In other words, descendants settled in both directions, east and west.

The most fascinating aspect of this paper was the match distribution map, below, showing which populations Malta child matched most closely.

malta child map

As you can see, MA-1, Malta Child, matches the Native American population most closely, followed by the northern European and Greenland populations. The further south in Europe and Asia, the more distant the matches and the darker the blue.

2013 – Michael Hammer and Haplogroup R

Last fall at the Family Tree DNA conference, Dr. Michael Hammer, from the Hammer Lab at the University of Arizona, discussed new findings relative to ancient burials, specifically in relation to haplogroup R, or more specifically, the absence of haplogroup R in those early burials.

hammer 2013

hammer 2013-1

hammer 2013-2

hammer 2013-3

Based on the various theories and questions, ancient burials were enlightening.

hammer 2013-4

hammer 2013-5

In 2013, there were a total of 32 burials from the Neolithic period, after farmers arrived from the Near East, and haplogroup R did not appear. Instead, haplogroups G, I and E were found.

hammer 2013-7

What this tells us is that haplogroup R and other haplogroups weren’t present in Europe at this time. Having said this, these burials were in only 4 locations and, although unlikely, R could be found in other locations.

hammer 2-13-8

hammer 2013-9

hammer 2013-10

hammer 2013-11

Last year, Dr. Hammer concluded that haplogroup R was not found in the Paleolithic and likely arrived with the Neolithic farmers. That shook the community, as it had been widely believed that haplogroup R was one of the founding European haplogroups.

hammer 2013-12

While this provided tantalizing information, we still needed additional evidence. No paper has yet been published that addresses these findings.  The mass full sequencing of the Y chromosome over this past year with the introduction of the Big Y will provide extremely valuable information about the Y chromosome and eventually, the migration path into and across Europe.

2014 – Europe’s Three Ancient Tribes

In September 2014, another paper was published by Lazaridis et al that more fully defined this new ANE branch of the European human family tree.  An article in BBC News titled Europeans drawn from three ancient ‘tribes’ describes it well for the non-scientist.  Of particular interest in this article is the artistic rendering of the ancient individual, based on their genetic markers.  You’ll note that they had dark skin, dark hair and blue eyes, a rather unexpected finding.

In discussing the paper, David Reich from Harvard, one of the co-authors, said, “Prior to this paper, the models we had for European ancestry were two-way mixtures. We show that there are three groups. This also explains the recently discovered genetic connection between Europeans and Native Americans.  The same Ancient North Eurasian group contributed to both of them.”

The paper, Ancient human genomes suggest three ancestral populations for present-day Europeans, appeared as a letter in Nature and is behind a paywall, but the supplemental information is free.

The article summary states the following:

We sequenced the genomes of a ~7,000-year-old farmer from Germany and eight ~8,000-year-old hunter-gatherers from Luxembourg and Sweden. We analysed these and other ancient genomes with 2,345 contemporary humans to show that most present-day Europeans derive from at least three highly differentiated populations: west European hunter-gatherers, who contributed ancestry to all Europeans but not to Near Easterners; ancient north Eurasians related to Upper Palaeolithic Siberians, who contributed to both Europeans and Near Easterners; and early European farmers, who were mainly of Near Eastern origin but also harboured west European hunter-gatherer related ancestry. We model these populations’ deep relationships and show that early European farmers had ~44% ancestry from a ‘basal Eurasian’ population that split before the diversification of other non-African lineages.

This paper utilized ancient DNA from several sites and composed the following genetic contribution diagram that models the relationship of European to non-European populations.

Lazaridis tree

Present day samples are colored purple, ancient in red and reconstructed ancestral populations in green. Solid lines represent descent without admixture and dashed lines represent admixture.  WHG=western European hunter-gatherer, EEF=early European farmer and ANE=ancient north Eurasian.

2014 – Michael Hammer on Europe’s Ancestral Population

For anyone interested in ancient DNA, 2014 has been a banner year. At the Family Tree DNA conference in Houston, Texas, Dr. Michael Hammer brought the audience up to date on Europe’s ancestral population, including the newly sequenced ancient burials and the information they are providing.

hammer 2014

hammer 2014-1

Dr. Hammer said that ancient DNA is the key to understanding the historical processes that led up to the modern. He stressed that we need to be careful inferring that the current DNA pattern is reflective of the past because so many layers of culture have occurred between then and now.

hammer 2014-2

Until recently, it was assumed that the genes of the Neolithic farmers replaced those of the Paleolithic hunter-gatherers. Ancient DNA is suggesting that this is not true, at least not on a wholesale level.

hammer 2014-3

The theory, of course, is that we should be able to see them today if they still exist. The migration and settlement pattern in the slide below was from the theory set forth in the 1990s.

hammer 2014-4

In 2013, Dr. Hammer discussed the theory that haplogroup R1b spread into Europe with the farmers from the Near East in the Neolithic. This year, he expanded upon that topic based on the new findings from ancient burials.

hammer 2014-5

Last year, Dr. Hammer discussed 32 burials from 4 sites. Today, we have information from 15 ancient DNA sites and many of those remains have been full genome sequenced.

hammer 2014-6

Information from papers and recent research suggests that Europeans also have genes from a third source lineage, nicknamed the “ghost population of North Eurasia.”

hammer 2014-7

Scientists are finding a signal of northeast Asian related admixture in northern Europeans, first suggested in 2012.  This was confirmed with the sequencing of Malta child and then in a second sequencing of Afontova Gora2 in south central Siberia.

hammer 2014-8

We have complete genomes from nine ancient Europeans – Mesolithic hunter-gatherers and Neolithic farmers. Hammer refers to the Mesolithic here, which is a time period between the Paleolithic (hunter-gatherers with stone tools) and the Neolithic (farmers).

hammer 2014-9

In the PCA charts, shown above, you can see that Europeans and people from the Near East cluster separately, except for a bridge formed by a few Mediterranean and Jewish populations. On the slide below, the hunter-gatherers (WHG) and early farmers (EEF) have been overlaid onto the contemporary populations along with the MA-1 (Malta Child) and AG2 (Afontova Gora2) representing the ANE.

hammer 2014-10

When sequenced, separate groups formed, including western hunter-gatherers and early European farmers, the latter including Otzi, the iceman.  A third group is the north-south clinal variation, with ANE contributing to northern European ancestry.  The groups are represented by the circles, above.

hammer 2014-11

hammer 2014-12

Dr. Hammer said that the team who wrote the “Ancient Human Genomes” paper just recently published used an F3 test, results shown above, which shows whether populations are an admixture of a reference population based on their entire genome. He mentioned that this technique goes well beyond PCA.
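To make the idea behind an f3 test a little more concrete, here is a minimal sketch in Python. It is not the code used for the paper, and it leaves out the bias correction, normalization and significance testing that the real ADMIXTOOLS software applies. It assumes only the textbook form of the statistic, the average across SNPs of (c - a) * (c - b), where c, a and b are allele frequencies in the tested population and two reference populations. The function name and the allele frequencies are invented for illustration. A clearly negative value suggests that the tested population is a mixture of groups related to the two references.

```python
from statistics import mean


def f3(target, source_a, source_b):
    """Naive f3(target; a, b): the mean over SNPs of (c - a) * (c - b).

    Each argument is a list of allele frequencies, one entry per SNP.
    This sketch omits the corrections a real analysis would apply.
    """
    if not (len(target) == len(source_a) == len(source_b)):
        raise ValueError("all populations need the same number of SNPs")
    return mean((c - a) * (c - b) for c, a, b in zip(target, source_a, source_b))


# Invented allele frequencies at four SNPs, for illustration only.
pop_a = [0.10, 0.80, 0.30, 0.90]
pop_b = [0.70, 0.20, 0.90, 0.10]

# A hypothetical population that is an even mix of the two references.
mixed = [0.5 * a + 0.5 * b for a, b in zip(pop_a, pop_b)]

print(f3(mixed, pop_a, pop_b))  # negative: consistent with admixture
print(f3(pop_a, mixed, pop_b))  # positive: no admixture signal for pop_a
```

The appeal of this kind of statistic is that it uses frequencies across the whole genome and gives a directional answer about mixture, rather than only showing which populations plot near one another.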

hammer 2014-13

Mapped onto populations today, most European populations are a combination of the three early groups. However, the ANE is not found in the ancient Paleolithic or Neolithic burials.  It doesn’t arrive until later.

hammer 2014-14

This tells us that there was a migration event 45,000 years ago from the Levant, followed about 7000 years ago by farmers from the Near East, and that ANE entered the population some time after that. All Europeans today carry some amount of ANE, but ancient burials do not.

These burials also show that southern Europe has more Neolithic farmer genes and northern Europe has more Paleolithic/Mesolithic hunter-gatherer genes.

hammer 2014-15

Pigmentation for light skin came with farmers – blue eyes existed in hunter gatherers even though their skin was dark.

hammer 2014-16

Dr. Hammer created these pie charts of the Y and mitochondrial haplogroups found in the ancient burials as compared to contemporary European haplogroups.

hammer 2014-17

The pie chart on the left shows the haplogroups of the Mesolithic burials, all haplogroup I2 and subclades. Note that in the current German population today, no I2a1b and no I1 was found.  The chart on the right shows current Germans where haplogroup I is a minority.

hammer 2014-18

Therefore, we can conclude that haplogroup I is a good candidate to be identified as a Paleolithic/Mesolithic haplogroup.

This information shows that the past is very different from today.

hammer 2014-19

In 2014 we have many more burials that have been sequenced than last year, as shown on the map above.

Green represents Neolithic farmers, red are Mesolithic hunter-gatherers, brown at bottom right represents more recent samples from the Metal Ages.

hammer 2014-20

There are a total of 48 Neolithic burials where haplogroup G dominates. In the Mesolithic, there are a total of six haplogroup I.

This suggests that haplogroup I is a good candidate to be the father of the Paleolithic/Mesolithic and haplogroup G, the founding father of the Neolithic.

In addition to haplogroup G in the Neolithic, one sample each of E1b1b1 (M35) and C was also found in Spain.  E1b1b1 isn’t surprising given its North African genesis, but C was quite interesting.

The Metal Ages, which according to wiki begin about 3300BC in Europe, are when haplogroup R, along with I1, first appears.

diffusion of metallurgy

Please note that the diffusion of metallurgy map above is not part of Dr. Hammer’s presentation. I have added it for clarification.

hammer 2014-21

Nothing is constant in Europe. The Y DNA saw a great deal of upheaval, as indicated on the graphic above.  Mitochondrial DNA shifted from pre-Neolithic to Neolithic, which isn’t terribly different from the present day.

Dr. Hammer did not say this, but looking at the Y versus the mtDNA haplogroups, I wonder if this suggests that indeed there was more of a replacement of the males in the population, but that the females were more widely assimilated. This would certainly make sense, especially if the invaders were warriors and didn’t have females with them.  They would have taken partners from the invaded population.

Haplogroup G represents the spread of farming into Europe.

hammer 2014-22

The most surprising revelation is that haplogroup R1b appears to have emerged after the Neolithic agriculture transition. Given that just three years ago we thought that haplogroup R1b was one of the original European settlers thousands of years ago, based on the prevalence of haplogroup R in Europe today, at about 50%, this is a surprising turn of events.  Last year’s revelation that R was maybe only 7000-8000 years old in Europe was a bit of a whammy, but the age of R in Europe in essence just got halved again and the source of R1b changed from the Near East to the Asian steppes.

Obviously, something conferred an advantage to these R1b men. Given that they arrived in the early Metal Age, was it weapons and chariots that enabled the R1b men who arrived to quickly become more than half of the population?

hammer 2014-23

The Bronze Age saw the first use of metal to create weapons. Warrior identity became a standard part of daily life.  Celts ranged over Europe and were the most dominant iron age warriors.  Indo-European languages and chariots arrived from Asia about this time.

hammer 2014-24

hammer 2014-25

hammer 2014-26

The map above shows the Hallstatt and La Tène Celtic cultures in Europe, about 600BC. This was not a slide presented by Dr. Hammer.

hammer 2014-27

Haplogroup R1b was not found in an ancient European context prior to a Bell Beaker period burial in Germany 4.8-4.0 kya (thousand years ago, i.e. 4,800-4,000 years ago).  R1b arrives about 4.6 kya and is also found in a Corded Ware culture burial in Germany.  A late introduction of these lineages which now predominate in Europe corresponds to the autosomal signal of the entry of Asian and Eastern European steppe invaders into western Europe.

hammer 2014-28

Local expansion occurred in Europe of R1b subgroups U106, L21 and U152.

hammer 2014-29

hammer 2014-30

A current haplogroup R distribution map that reflects the findings of this past year is shown above.

Haplogroup I is interesting for another reason. It looks like haplogroup I2a1b (M423) may have been replaced by I1 which expanded after the Mesolithic.

hammer 2014-31

On the slide above, the Loschbour sample from Luxembourg was mapped onto a current haplogroup I SNP map where his closest match is a current day Russian.

One of the benefits of ancient DNA genome processing is that we will be able to map current trees into maps of old SNPs and be able to tell who we match most closely.

Autosomal DNA can also be mapped to see how much of our DNA is from which ancient population.

hammer 2014-32

Dr. Hammer mapped the percentages of European Mesolithic/Paleolithic hunter-gatherers in blue, Neolithic Farmers from the Near East in magenta and Asian Steppe Invaders representing ANE in yellow, over current populations. Note the ancient DNA samples at the top of the list.  None of the burials except for Malta Child carry any yellow, indicating that the ANE entered the European population with the steppe invaders; the same group that brought us haplogroup R and possibly I1.

Dr. Hammer says that ANE was introduced to and assimilated into the European population by one or more incursions. We don’t know today if ANE in Europeans is a result of a single blast event or multiple events.  He would like to do some model simulations and see if it is related to timing and arrival of swords and chariots.

We know too that there are more recent incursions, because we’re still missing major haplogroups like J.

The further east you go, meaning the closer to the steppes and Volga region, the less well this fits the known models. In other words, we still don’t have the whole story.

At the end of the presentation, Michael was asked if the whole genomes sequenced are also obtaining Y STR data, which would allow us to compare our results on an individual versus a haplogroup level. He said he didn’t know, but he would check.

Family Tree DNA was asked if they could show a personal ancient DNA map in myOrigins, perhaps as an alternate view. Bennett took a vote and that seemed pretty popular, which he interpreted as a yes, we’d like to see that.

In Summary

The advent of and subsequent drop in the price of whole genome sequencing combined with the ability to extract ancient DNA and piece it back together have provided us with wonderful opportunities.  I think this is just the proverbial tip of the iceberg, and I can’t wait to learn more.

If you are interested in other articles I’ve written about ancient DNA, check out these links:

Anzick (12,707-12,556), Ancient One, 52 Ancestors #42

anzick burial location

His name is Anzick, named for the family land, above, where his remains were found, and he is 12,500 years old, or more precisely, born between 12,707 and 12,556 years before the present.  Unfortunately, my genealogy software is not prepared for a birth year with that many digits.  That’s because, until just recently, we had no way to know that we were related to anyone of that age….but now….everything has changed ….thanks to DNA.

Actually, Anzick himself is not my direct ancestor.  We know that definitively, because Anzick was a child when he died, in present day Montana.

anzick on us map

Anzick was loved and cherished, because he was smeared with red ochre before he was buried in a cave, where he would be found more than 12,000 years later, in 1968, just beneath a layer of approximately 100 Clovis stone tools, shown below.  I’m sure his parents then, just as parents today, stood and cried as they laid their son to rest….never suspecting just how important their son would be some 12,500 years later.

anzick clovis tools

From 1968 until 2013, the Anzick family looked after Anzick’s bones, and in 2013, Anzick’s DNA was analyzed.

DNA analysis of Anzick provided us with his mitochondrial haplogroup,  D4h3a, a known Native American grouping, and his Y haplogroup was Q-L54, another known Native American haplogroup.  Haplogroup Q-L54 itself is estimated to be about 16,900 years old, so this finding is certainly within the expected range.  I’m not related to Anzick through Y or mitochondrial DNA.

Utilizing the admixture tools at GedMatch, we can see that Anzick aligns most closely with Native American and Arctic, with a bit of east Siberian.  This all makes sense.

Anzick MDLP K23b

Full genome sequencing was performed on Anzick, and from that data, it was discovered that Anzick was related to Native Americans, closely related to Mexican, Central and South Americans, and not closely related to Europeans or Africans.  This was an important discovery, because it in essence disproves the Solutrean hypothesis that Clovis predecessors emigrated from Southwest Europe during the last glacial maximum, about 20,000 years ago.

anzick matches

The distribution of these matches was a bit surprising, in that I would have expected the closest matches to be from North America, in particular, near to where Anzick was found, but his closest matches are south of the US border.  Although, in all fairness, few people in Native tribes in the US have DNA tested and many are admixed.

This match distribution tells us a lot about population migration and distribution of the Native people after they left Asia and crossed Beringia on the land bridge, now submerged, into present day Alaska.

This map of Beringia, from the 2008 paper by Tamm et al, shows the migration of Native people into (and back from) the New World.

beringia map

Anzick’s ancestors crossed Beringia during this time, and over the next several thousand years, found their way to Montana.  Some of Anzick’s relatives found their way to Mexico, Central and South America.  The two groups may have split when Anzick’s family group headed east instead of south, possibly following the edges of glaciers, while the south-moving group followed the coastline.

Recently, from Anzick’s full genome data, another citizen scientist extracted the DNA locations that the testing companies use for autosomal DNA results, created an Anzick file, and uploaded the file to the public autosomal matching site, GedMatch.  This allowed everyone to see if they matched Anzick.  We expected no, or few, matches, because after all, Anzick was more than 12,000 years old and all of his DNA would have washed out long ago due to the 50% replacement in every generation….right?  Wrong!!!

What a surprise to discover fairly large segments of DNA matching Anzick in living people, and we’ve spent the past couple of weeks analyzing and discussing just how this has happened and why.  In spite of some technical glitches in terms of just how much individual people carry of the same DNA Anzick carried, one thing is for sure, the GedMatch matches confirm, in spades, the findings of the scientists who wrote the recent paper that describes the Anzick burial and excavation, the subsequent DNA processing and results.

For people who carry known Native heritage, matches, especially relatively large matches to Anzick, confirm not only their Native heritage, but his too.

For people who suspect Native heritage, but can’t yet prove it, an Anzick match provides what amounts to a clue – and it may be a very important clue.

In my case, I have proven Native heritage through the Micmac who intermarried with the Acadians in the 1600s in Nova Scotia.  Given that Anzick’s people were clearly on a west to east movement, from Beringia to wherever they eventually wound up, one might wonder if the Micmac were descended from or otherwise related to Anzick’s people.  Clearly, based on the genetic affinity map, the answer is yes, but not as closely related to Anzick as Mexican, Central and South Americans.

After several attempts utilizing various files, thresholds and factors that produced varying levels of matching to Anzick, one thing is clear – there is a match on several chromosomes.  Someplace, sometime in the past, Anzick and I shared a common ancestor – and it was likely on this continent, or Beringia, since the current school of thought is that all Native people entered the New World through this avenue.  The school of thought is not united in an opinion about whether there was a single migration event, or multiple migrations to the New World.  Regardless, the people came from the same base population in far northeast Asia and intermingled after arriving here if they were in the same location with other immigrants.

In other words, there probably wasn’t much DNA to pass around.  In addition, it’s unlikely that the founding population was a large group – probably just a few people – so in very short order their DNA would be all the same, being passed around and around until they met a new population, which wouldn’t happen until the Europeans arrived on the east side of the continent in the 1400s.  The tribes least admixed today are found south of the US border, not in the US.  So it makes sense that today the least admixed people would match Anzick the most closely – because they carry the most common DNA, which is still the same DNA that was being passed around and around back then.

Many of us with Native ancestors do carry bits and pieces of the same DNA as Anzick.  Anzick can’t be our ancestor, but he is certainly our cousin, about 500 generations ago, using a 25 year generation, so roughly our 500th cousin.  I had to laugh at someone this week, an adoptee who said, “Great, I can’t find my parents but now I have a 12,500 year old cousin.”  Yep, you do!  The ironies of life, and of genealogy, never fail to amaze me.
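The arithmetic behind that "500 generations" figure, and behind the expectation that Anzick's DNA should have washed out long ago, can be sketched in a few lines of Python. This is an illustration only. It assumes the 25 year generation used above and a simple halving at every generation, which deliberately ignores the small, closely related founding population described earlier. The function names are made up for this example.

```python
def generations_back(years, years_per_generation=25):
    """Number of generations that fit into a span of years."""
    return years / years_per_generation


def naive_share(generations):
    """Expected share of one ancestor's DNA if it simply halves each generation."""
    return 0.5 ** generations


gens = generations_back(12_500)
print(gens)               # 500.0 generations
print(naive_share(10))    # 0.0009765625, already under a tenth of a percent
print(naive_share(gens))  # vanishingly small, effectively zero
```

Under that simple model nothing measurable should survive, so the fact that matching segments do show up points to the same DNA being passed around and around within a small population rather than diluting steadily.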

Utilizing the most conservative matching routine possible, on a phased kit, meaning one that combines the DNA shared by my mother and myself, and only that DNA, we show the following segment matches with Anzick.

Chr   Start Location   End Location   Centimorgans (cM)   SNPs
2     218855489        220351363      2.4                 253
4     1957991          3571907        2.5                 209
17    53111755         56643678       3.4                 293
19    46226843         48568731       2.2                 250
21    35367409         36761280       3.7                 215

Being less conservative produces many more matches, some of which are questionable as to whether they are simply convergence, so I haven’t utilized the less restrictive match thresholds.

Of those matches above, the one on chromosome 17 matches to a known Micmac segment from my Acadian lines and the match on chromosome 2 also matches an Acadian line, but I share so many common ancestors with this person that I can’t tell which family line the DNA comes from.

There are also Anzick autosomal matches on my father’s side.  My Native ancestry on his side reaches back to colonial America, in either Virginia or North Carolina, or both, and is unproven as to the precise ancestor and/or tribe, so I can’t correlate the Anzick DNA with proven Native DNA on that side.  Neither can I associate it with a particular family, as most of the Anzick matches aren’t to areas on my chromosome that I’ve mapped positively to a specific ancestor.

Running a special utility at GedMatch that compared Anzick’s X chromosome to mine, I find that we share a startlingly large X segment.  Sometimes, the X chromosome is passed for generations intact.

Interestingly enough, the segment 100,479,869-103,154,989 matches a segment from my mother exactly, but the large 6cM segment does not match my mother, so I’ve inherited that piece of my X from my father’s line.

Chr   Start Location   End Location   Centimorgans (cM)   SNPs
X     100479869        103154989      1.4                 114
X     109322285        113215103      6.0                 123

This tells me immediately that this segment comes from one of the pink or blue lines on the fan chart below that my father inherited from his mother, Ollie Bolton, since men don’t inherit an X chromosome from their father.  Utilizing the X pedigree chart reduces the possible lines of inheritance quite a bit, and is very suggestive of some of those unknown wives.

olliex
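The rule that narrows down the X chromosome lines can be written out as a short Python sketch. It assumes only the standard inheritance pattern mentioned above: everyone can receive X DNA from their mother, but only daughters receive an X from their father. The function name and the way paths are written are invented for this illustration and are not part of any GedMatch tool.

```python
def x_ancestor_paths(sex, generations):
    """List the ancestral lines that could have passed X DNA to a person.

    `sex` is "female" or "male". Each path is a tuple such as
    ("father", "mother"), read outward from the person being tested.
    """
    paths = [((), sex)]
    for _ in range(generations):
        next_paths = []
        for path, current_sex in paths:
            # Everyone can inherit X DNA from their mother.
            next_paths.append((path + ("mother",), "female"))
            # Only daughters receive an X chromosome from their father.
            if current_sex == "female":
                next_paths.append((path + ("father",), "male"))
        paths = next_paths
    return [path for path, _ in paths]


# A woman's possible X lines two generations back (her grandparents).
for line in x_ancestor_paths("female", 2):
    print(" > ".join(line))

# Only 3 of 4 grandparents and 5 of 8 great-grandparents remain candidates.
print(len(x_ancestor_paths("female", 2)), len(x_ancestor_paths("female", 3)))
```

For a segment that came through the father, every surviving path begins with father > mother, which is why a paternal X match points to the father's mother's side of the chart.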

It’s rather amazing, if you think about it, that anyone today matches Anzick, or that we can map any of our ancestral DNA that both we and Anzick carry to a specific ancestor.

Indeed, we do live in exciting times.

Honoring Anzick

On a rainy Saturday in June, 2014, on a sagebrush hillside in Montana, Anzick, in Native parlance our “grandfather,” was reburied, bringing his journey full circle.  Sarah Anzick, a molecular biologist, the daughter of the family that owns the land where the bones were found, and who did part of the genetic discovery work on Anzick, returned the box with his bones for reburial.

anzick bones

More than 50 people, including scientists, members of the Anzick family and representatives of six Native American tribes, gathered for the nearly two-hour reburial ceremony. Tribe members said prayers, sang songs, played drums and rang bells to honor the ancient child. The bones were placed in the grave and sprinkled with red ocher, just like when his parents buried him some 12,500 years before.

Participants at the reburial ceremony filled in the grave with handfuls, then shovelfuls of dirt and covered it with stones. A stick tied with feathers marks Anzick’s final resting place.

Sarah Anzick tells us that, “At that point, it stopped raining. The clouds opened up and the sun came out. It was an amazing day.”

I wish I could have been there.  I would have, had I known.  After all, he is part of me, and I of him.

anzick grave'

Welcome to the family, Anzick, and thank you, thank you oh so much, for your priceless, unparalleled gift!!!

tobacco

If you want to read about the Anzick matching journey of DNA discovery, here are the articles I’ve written in the past two weeks.  It has been quite a roller coaster ride, but I’m honored and privileged to be doing this research.  And it’s all thanks to an ancient child named Anzick.

Utilizing Ancient DNA at Gedmatch

Analyzing the Native American Anzick Clovis Native American Results

New Native American Mitochondrial DNA Haplogroups Extrapolated from Anzick Match Results

Ancient DNA Matching, A Cautionary Tale

More Ancient DNA Samples for Comparison

Tenth Annual Family Tree DNA Conference Wrapup

baber summary

This slide, by Robert Baber, pretty well sums up our group obsession and what we focus on every year at the Family Tree DNA administrator’s conference in Houston, Texas.

Getting to Houston, this year, was a whole lot easier than getting out of Houston. They had storms yesterday and many of us spent the entire day becoming intimately familiar with the airport.  Jennifer Zinck, of Ancestor Central, is still there today and doesn’t have a flight until late.

And this is how my day ended, after I finally got out of Houston and into my home airport. This isn’t at the airport, by the way.  Everything was fine there, but I made the apparent error of stopping at a Starbucks on the way home.  This is the parking lot outside an hour or so later.  What can I say?  At least I had my coffee, and AAA rocks, as did the tow truck driver and my daughter for getting out of bed to come and rescue me!!!  Hmmm, I think maybe things have gone full circle.  I remember when I used to go and rescue her:)

jeep tow

So far, today hasn’t improved any, so let’s talk about something much more pleasant…the conference itself.

Resources

One of the reasons I mentioned Jennifer Zinck, aside from the fact that she’s still stuck in the airport, is because she did a great job actually covering the conference as it happened. Since our gates weren’t terribly far apart, I had some time yesterday to visit with her, so I asked her how she got that done.  I took notes too, and photos, but she turned out a prodigious amount of work in a very short time.  While I took a lightweight MacBook Air, she took her regular PC that she is used to typing on, and she literally transcribed as the sessions were occurring.  She just added her photos later, and since she was working on a platform that she was familiar with, she could crop and make the other adjustments you never see but we perform behind the scenes before publishing a photo.

On the other hand, I struggled with a keyboard that works differently and is a different size than I’m used to as well as not being familiar with the photo tools to reduce the size of pictures, so I just took rough notes and wrote the balance later.  Having familiar tools makes such a difference.  I think I’ll carry my laptop from now on, even though it is much heavier.  Kudos to Jennifer!

I was initially going to summarize each session, but since Jen did such a good job, I’m posting her links. No need to recreate a wheel that doesn’t need to be recreated.

http://www.ancestorcentral.com/decennial-conference-on-genetic-genealogy/

ISOGG, the International Society of Genetic Genealogy, is not affiliated with Family Tree DNA or any testing company, but Family Tree DNA is generous enough to allow an ISOGG meeting on Sunday before the first conference session.

http://www.ancestorcentral.com/decennial-conference-on-genetic-genealogy-isogg-meeting/

http://www.ancestorcentral.com/decennial-conference-on-genetic-genealogy-sunday/

You can find my conference postings here:

http://dna-explained.com/2014/10/11/tenth-annual-family-tree-dna-conference-opening-reception/

http://dna-explained.com/2014/10/12/tenth-annual-family-tree-dna-conference-day-2/

http://dna-explained.com/2014/10/13/tenth-annual-family-tree-dna-conference-day-3/

Several people were also posting on a Twitter feed.

https://twitter.com/search?q=%23FTDNA2014&src=tyah

Those of you who are members of the ISOGG Yahoo group for project administrators can view photos posted by Katherine Borges in that group, and there are also some postings on the Facebook ISOGG group.

Now that you have the links for the summaries, what I’d like to do is to discuss some of the aspects I found the most interesting.

The Mix

When I attended my first conference 10 years ago, I somehow thought that for the most part, the same group of people would be at the conferences every year. Some were, and in fact, a handful of the 160+ people attending this conference have attended all 10 conferences.  I know of two others for certain, but there were maybe another 3 or so who stood up when Bennett asked for everyone who had been present at all 10 conferences to stand.

Doug Mumma, the very first project administrator was with us this weekend, and still going strong. Now, if Doug and I could just figure out how we’re related…

Some of the original conference group has passed on to the other side where I’m firmly convinced that one of your rewards is that you get to see all of those dead ends of your tree. If we’re lucky, we get to meet them as well and ask all of those questions we have on this side.  We remember our friends fondly, and their departure sadly, but they enriched us while they were here and their memories make us smile.  I’m thinking specifically of Kenny Hedgepath and Leon Little as I write this, but there have been others as well.

The definition of a community is that people come and go, births, deaths and moves.

This year, about half of the attendees had never attended a conference before. I was very pleased to see this turn of events – because in order to survive, we do need new people who are as crazy as we are…er….I mean as dedicated as we are.

isogg reception

ISOGG traditionally hosts a potluck reception on Saturday evening. Lots of putting names with faces going on here.

Collaboration

I asked people about their favorite part of the conference or their favorite session. I was surprised at the number of people who said lunches and dinners.  Trust me, the food wasn’t that wonderful, so I asked them to elaborate.  In essence, the most valuable aspect of the conference was working with and talking to other administrators.

bar talk

It’s not like we don’t talk online, but there is somehow a difference between online communications and having a group discussion, or a one-on-one discussion. Laptops were out and in use everyplace, along with iPads and other tools.  It was so much fun to walk by tables and hear snippets of conversations like “the mutation at location 309.1….” and “null marker at 425” and “I ordered a kit for my great uncle…..”

I agree, as well. I had pre-arranged two dinners before arriving in order to talk with people with whom I share specific interests.  At lunches, I either tried to sit with someone I specifically needed to talk to, or I tried to meet someone new.

I also asked people about their specific goals for the next year. Some people had a particular goal in mind, such as a specific brick wall that needs focus.  Some, given that we are administrators, had wider-ranging project based goals, like Big Y testing certain family groups, and a surprising number had the goal of better utilizing the autosomal results.

Perhaps that’s why there were two autosomal sessions, an introduction by Jim Bartlett and then Tim Janzen’s more advanced session.

Autosomal DNA Results

jim bartlett

Note the cool double helix light fixture behind the speakers.

tim janzen

Tim specifically mentioned two misconceptions which I run across constantly.

Misconception 1 – A common surname means that’s how you match.  Just because you find a common surname doesn’t mean that’s your DNA match.  This belief is particularly prevalent in the group of people who test at Ancestry.com.

Misconception 2 – Your common ancestor has to be within the past 6 generations.  Not true, many matches can be 6-10th cousins because there are so many descendants of those early ancestors, even as many as 15 generations back.

Tim also mentioned that endogamous relationships are a tough problem with no easy answer. Examples include Polynesians, Ashkenazi Jews, Low German Mennonites, Acadians, Amish, and island populations.  Do I ever agree with him!  I have Brethren, Mennonite and Acadian in the same parent’s line.

Tim has been working with the Mennonite DNA project now for many years.

Tim included a great resource slide.

tim slide1

Tim has graciously made his entire presentation available for download.

tim slide2

There are probably a dozen or so of us that are actively mapping our ancestors, and a huge backlog of people who would like to. As Tim pointed out with one of his slides, this is not an easy task nor is it for the people who simply want to receive “an answer.”

tim slide3

I will also add that we “mappers” are working with and actively encouraging Family Tree DNA to develop tools so that the mapping is less spreadsheet manual work and more automated, because it certainly can be.

Upload GEDCOM Files

If you haven’t already, upload your GEDCOM to Family Tree DNA.  This is becoming an essential part of autosomal matching.  Furthermore, Family Tree DNA will utilize this file to construct your surname list and that will help immensely determining common surnames and your common ancestor with your Family Finder matches.  If you have sponsored tests for cousins, then upload a GEDCOM file for them or at least construct a basic tree on their Family Tree DNA page.

Ethics

Family Tree DNA always tries to provide a speaker about ethics, and the only speakers I’ve ever felt understood anything about what we want to do are Judy Russell and Blaine Bettinger.  I was glad to see Blaine presenting this year.

blaine bettinger

The essence of Blaine’s speech is that ethics isn’t about law. Law is cut and dried.  Ethics isn’t, and there are no ethics police.

Sometimes our decisions are colored necessarily by right and wrong.  Sometimes those decisions are more about the difference between a better and a worse way.

As a community, we want to reduce negative press coverage and increase positive coverage. We want to be proactive, not reactive.

Blaine stresses that while informed consent is crucial, DNA doesn’t reveal secrets that aren’t also revealed by other genealogical forms of research. DNA often reveals more recent secrets, such as adoptions and NPEs, so it’s possibly more sensitive.

Two things need to govern our behavior. First, we need to do only things that we would be comfortable seeing above the fold in the New York Times.  Second, understand that we can’t make promises about topics like anonymity or about the absence of medical information, because we don’t know what we don’t know.

The SNP Tsunami

One of my concerns has been and remains the huge number of new SNPs that have been discovered over the past year or so with the Big Y by Family Tree DNA and  corresponding tests from other vendors.

When I say concern, I’m thrilled about this new technology and the advances it is allowing us to make as a community to discover and define the evolution of haplogroups. My concern is that the amount of data is overwhelming.  However, we are working through that, thanks to the hours and hours of volunteer work by haplogroup administrators and others.

Alice Fairhurst, who volunteers to maintain the ISOGG haplotree, mentioned that she has added over 10,000 SNPs to the Y tree this year alone, bringing the total to over 14,000. Those SNPs are fully vetted and placed.  There are many more in process and yet more still being discovered.  On the first page of the Y SNP tree, the list of SNP sources and other critical information, such as the criteria for a SNP to be listed, is provided.

isogg tree3

isogg snps

isogg snps 2014

So, if you’re waiting for that next haplotree poster, give it up because there isn’t a printing press that big, unless you want wallpaper.

isogg new development 2014

These slides are from Alice’s presentation. The ISOGG tree provides an invaluable resource for not only the genetic genealogy community, but also researchers world-wide.

As one example of how the SNP tsunami has affected the Y tree, Alice provided the following summary of R-U106, one of the two major branches of haplogroup R.

From the ISOGG 2006 Y tree, this was the entire haplogroup R Y tree. You can see U106 near the bottom with 3 sub-branches.  While this probably makes you chuckle today, remember that 2006 was only 8 years ago and that this tree didn’t change much for several years.

2006 entire tree

2007 was the same.

2008 u106 tree

2008 shows 5 subclades and one of the subclades had 2 subclades.

2009 u106 tree

2009 showed a total of 12 sub-branches and 2010 added one more.

2011 however, showed a large change. U106 in 2011 had 44 subgroups total and became too large to show on one screen shot.  2012 shows 99 subclades, if I counted accurately.  The 2014 U106 tree is shown below.

before big y

after big y

u106 now

u106 now2

There’s another slide too, but I didn’t manage to get the picture.  You get the idea though…

As you can imagine, for Family Tree DNA, trying to keep up with all of the haplogroups, not just one subgroup like U106 is a gargantuan task that is constantly changing, like hourly. Their Y tree is currently the National Geographic tree, and while they would like to update it, I’m sure, the definition of “current tree” is in a constant state of flux.  Literally, Mike Walsh, one of the admins in the R-L21 group uploads a new tree spreadsheet several times every day.

In order to attempt to deal with this, and to encourage people who don’t want to do a Big Y discovery type test, but do want to ferret out their location on their assigned portion of the tree, Family Tree DNA is reintroducing the Backbone tests.

They are starting with M222, also known as the Niall of the 9 Hostages haplogroup which is their beta for the new product and new process. You can see the provisional tree and results in the two slides they provided, below.  I apologize for the quality, but it was the best I could do.

M222

m222 pie

Haplogroup administrators are going to be heavily involved in this process. Family Tree DNA is putting SNP panels together that will help further define the tree and where various SNPs that have been recently discovered, and continue to be discovered, will fall on the tree.

As Big Y tests arrive, haplogroup project administrators typically assemble a spreadsheet of the SNPs and provisionally where they fall on the tree, based on the Big Y results.

What Bennett asked is for the admins to work with Family Tree DNA to assemble a testing panel based on those results. The goal is for the cost to be between $1.50 and $2 (US) for each SNP in the panel, which will reduce the one-off SNP testing and provide a much more complete and productive result at a far reduced price as compared to the current $29 or $39 per individual SNP.
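To put those prices in perspective, here is a minimal back-of-the-envelope sketch. The per-SNP prices are the ones quoted above; the panel size of 50 SNPs is purely a hypothetical number I picked for illustration, not anything Family Tree DNA announced.

```python
# Prices quoted at the conference (US dollars per SNP).
PANEL_PRICE_RANGE = (1.50, 2.00)         # target price per SNP inside a panel
INDIVIDUAL_PRICE_RANGE = (29.00, 39.00)  # current price per one-off SNP test

# Hypothetical panel size, chosen only for illustration.
PANEL_SIZE = 50


def cost_range(num_snps, price_range):
    """Return the (low, high) total cost of testing num_snps SNPs."""
    low, high = price_range
    return (num_snps * low, num_snps * high)


panel = cost_range(PANEL_SIZE, PANEL_PRICE_RANGE)
one_off = cost_range(PANEL_SIZE, INDIVIDUAL_PRICE_RANGE)

print(f"{PANEL_SIZE} SNPs in a panel:   ${panel[0]:,.2f} to ${panel[1]:,.2f}")
print(f"{PANEL_SIZE} SNPs individually: ${one_off[0]:,.2f} to ${one_off[1]:,.2f}")
```

Under that assumed panel size, the same set of SNPs would cost well over a thousand dollars if ordered one at a time, which is why nobody orders them that way.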

If you are a haplogroup administrator, get in touch with Family Tree DNA to discuss your desired backbone panels. New panels, when it’s your turn, will take about 2 weeks to develop.

Keep in mind that the following SNPs, according to Bennett, are not optimal for panels:

  • Palindromic regions
  • Often mutating regions designated as .1, .2, etc.
  • SNPs in STRs

Nir Leibovich, the Chief Business Officer, also addressed the future and the Big Y to some extent in his presentation.

nir leibovich

ftdna future 2014

Utilizing the Big Y for Genealogy

In my case, during the last sale, I ordered several Big Y tests for my Estes family line because I have several genealogically documented lines from the original Estes family in Kent, England through our common ancestor, Robert Estes born in 1555 and his wife Anne Woodward. The participants also agreed to extend their markers to 111 markers as well.  When the results are back, we’ll be able to compare them on a full STR marker set, and also their SNPs.  Hopefully, they will match on their known SNPs and there will be some new novel variants that will be able to suffice as line marker mutations.

We need more BIG Y tests of these types of genealogically confirmed trees that have different sons’ lines from a distant common ancestor to test descendant lines. This will help immensely to determine the actual, not imputed, SNP mutation rate and allow us to extrapolate the ages of haplogroups more accurately.  Of course, it also goes without saying that it helps to flesh out the trees.

I personally expect the next couple of years will be major years of discovery. Yes, the SNP tsunami has hit land, but it’s far from over.

Research and Development

David Mittleman, Chief Scientific Officer, mentioned that Family Tree DNA now has their own R&D division where they are focused on how to best analyze data. They have been collaborating with other scientists.  A haplogroup G1 paper will be published shortly which states that SNP mutation rates equate to Sanger data.

FTDNA wants to get Big Y data into the public domain. They have set up consent for this to be done by uploading into NCBI.  Initially they sent a survey to a few people that  sampled the interest level.  Those who were interested received a release document.  If you are interested in allowing FTDNA to utilize your DNA for research, be it mitochondrial, Y or autosomal, please send them an e-mail stating such.

Don’t Forget About Y Genealogy Research

It’s very easy for us to get excited about the research and discovery aspect of DNA – and the new SNPs and extending haplotrees back in time as far as possible, but sometimes I get concerned that we are forgetting about the reason we began doing genetic genealogy in the first place.

Robert Baber’s presentation discussed the process of how to reconstruct a tree utilizing both genealogy and DNA results. It’s important to remember that the reason most of our participants test is to find their ancestors, not, primarily, to participate in the scientific process.

Robert baber

edward baber

Robert has succeeded in reconstructing 110 or 111 markers of the oldest known Baber ancestor, shown above. I wrote about how to do this in my article titled, Triangulation for Y DNA.

Not only does this allow us to compare everyone with the ancestor’s DNA, it also provides us with a tool to fit individuals who don’t know their specific genealogical line into the tree relatively accurately. When I say relatively, the accuracy is based on line marker mutations that have, or haven’t, happened within that particular family.

Jim illustrated how to do this as well, and his methodology is available at the link on his slide, below.

baber method

I had to laugh. I’ve often wondered what our ancestors would think of us today.  Robert said that 11 generations after Edward Baber died, he flew over the church where Edward was buried and wondered what Edward would have thought about what we know and do today – cars, airplanes, DNA, radio, TV, etc.  If someone looked in a crystal ball and told Edward what the future held 11 generations later, he would have thought that they were stark raving mad.

Eleven generations from my birth is roughly the year 2280. I’m betting we won’t be trying to figure out who our ancestors were through this type of DNA analysis then.  This is only a tiny stepping stone to an unknown world, as different to us as our world is to Edward Baber and all of our ancestors who lived in a time where we know their names but their lives and culture are entirely foreign to ours.

Publications

When the Journal of Genetic Genealogy was active, I, along with other citizen scientists, published regularly.  The benefit of the journal was that it was peer reviewed and that assured some level of accuracy and because of that, credibility, and it was viewed by the scientific community as such.  My co-authored works published in JOGG as well as others have been cited by experts in the academic community.  In other words, it was a very valuable journal.  Sadly, it has fallen by the wayside and nothing has been published since 2011.  A new editor was recruited, but given their academic load, they have not stepped up to the plate.  For the record, I am still hopeful for a resurrection, but in the meantime, another opportunity has become available for genetic genealogists.

Brad Larkin has founded the Surname DNA Journal, which, like JOGG, is free to both authors and subscribers. In case you weren’t aware, most academic journals aren’t.  While this isn’t a large burden for a university, fees ranging from just over $1000 to $5000 are beyond the budget of genetic genealogists.  Just think of how many DNA tests one could purchase with that money.

brad larkin

surname dna journal

Brad has issued a call for papers. These papers will be peer reviewed, similarly to how they were reviewed for JOGG.

call for papers

Take a look at the articles published in this past year, since the founding of Surname DNA Journal.

The citizen science community needs an avenue to publish and share. Peer reviewed journals provide us with another level of credibility for our work. Sharing is clearly the lynchpin of genetic genealogy, as it is with traditional genealogy. Give some thought about what you might be able to contribute.

Brad Larkin solicited nominations prior to the conference and awarded a Genetic Genealogist of the Year award. This year’s award was dually presented to Ian Kennedy in Australia, who, unfortunately, was not present, and to CeCe Moore, who just happened to follow Brad’s presentation with her own.

Don’t Forget about Mitochondrial DNA Either

I believe that mitochondrial DNA is the most underutilized DNA tool that we have, often because how to use mitochondrial DNA, and what it can tell you, is poorly understood. I wrote about this in an article titled, Mitochondrial, The Maligned DNA.

Given that I work with mitochondrial DNA daily when I’m preparing clients’ Personalized DNA Reports (orderable from your personal page at Family Tree DNA or directly from my website), I know just how useful mitochondrial can be and see those examples regularly. Unfortunately, because these are client reports, I can’t write about them publicly.

CeCe Moore, however, isn’t constrained by this problem, because one of the ways she contributes to genetic genealogy is by working with the television community, in particular Genealogy Roadshow and the PBS series, Finding Your Roots. Now, I must admit, I was very surprised to see CeCe scheduled to speak about mitochondrial DNA, because the area of expertise where she is best known is autosomal DNA, especially in conjunction with adoptee research.

cece moore

cece mtdna

During the research for the production of these shows, CeCe has utilized mitochondrial DNA with multiple celebrities to provide information such as the ethnic identification of the ancestor who provided the mitochondrial DNA as Native American.

Autosomal DNA testing has a broad but shallow reach, across all of your lines, but just back a few generations.  Both Y and mitochondrial DNA have a very deep reach, but only on one specific line, which makes them excellent for identifying a common ancestor on that line, as well as the ethnicity of that individual.

I have seen other cases, where researchers connected the dots between people where no paper trail existed, but a relationship between women was suspected.

CeCe mentioned that currently there are only 44,000 full sequence results in the Family Tree DNA data base and 185K total HVR1, HVR2 and full sequence tests. Y has half a million.  We need to increase the data base, which, of course, increases matches and makes everyone happier.  If you haven’t tested your mitochondrial DNA to the full sequence level, this would be a great time!

There are several lessons on how to utilize mitochondrial DNA at this ISOGG link.

I’m very hopeful that CeCe’s presentation will be made available as I think her examples are quite powerful and will serve to inspire people.  Actually, since CeCe is in the “movie business,” perhaps a short video clip could be made available on the FTDNA website for anyone who hasn’t tested their mitochondrial DNA so they can see an example of why they should!

myOrigins

I would be fibbing to you if I told you I am happy with myOrigins. I don’t feel that it is as sensitive as other methods for picking up minority admixture, in particular, Native American, especially in small amounts.  Unfortunately, those small amounts are exactly what many people are looking for.

If someone has a great-great-great-great grandparent that is Native, they carry, on average, about 1.5% (1/64), more or less, of the Native ancestor’s DNA today. A 4X great grandparent puts their birth year in the range of 1800-1825 – or just before the Trail of Tears.  People whose colonial American families intermarried with Native families did so, generally, before the Trail of Tears.  By that time, many tribes were already culturally extinct and those east of the Mississippi that weren’t extinct were fighting for their lives, both literally and figuratively.
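The percentage quoted above comes from simple halving: on average, you carry half of each parent's DNA, a quarter of each grandparent's, and so on. Here is a minimal sketch of that arithmetic. The function names are my own, and keep in mind that this is only the statistical average; because of how DNA recombines, the amount actually inherited from any one distant ancestor can be noticeably higher or lower, and sometimes nothing at all.

```python
def generations_back(num_greats):
    """Generations between you and an ancestor with num_greats 'greats'.

    parent = 1, grandparent = 2, great-grandparent = 3, and so on.
    """
    return num_greats + 2


def expected_share(generations):
    """Average fraction of autosomal DNA inherited from one ancestor."""
    return 0.5 ** generations


for greats in range(0, 7):
    gens = generations_back(greats)
    label = f"{greats}X great grandparent" if greats else "grandparent"
    print(f"{label:>24}: {gens} generations, {expected_share(gens):.2%} on average")
```

A 4X great grandparent is 6 generations back, which works out to 1/64, or a bit over 1.5%, and that is why a test needs to be quite sensitive to pick it up reliably.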

We really need the ability to develop the most sensitive testing to report even the smallest amounts of Native DNA and map those segments to our chromosomes so that we can determine who, and what line in our family, was Native.

I know that Family Tree DNA is looking to improve their products, and I provided this feedback to them. Many people test autosomally only for their ethnicity results and I surely would love to have those people’s results available as matches in the FTDNA data base.

Razib Khan has been working with Family Tree DNA on their myOrigins product and spoke about how the myOrigins data is obtained.

razib kahn

my origins pieces

Given that all humans are related, one way or another, far enough back in time, myOrigins has to be able to differentiate between groups that may not be terribly different. Furthermore, even groups that appear different today may not have been historically.  His own family, from India, has no oral history of coming from the East, but the genetic data clearly indicates that they did, along with a larger group, about 1000 years ago.  This may well be a result of the adage that history is written by the victors, or maybe whatever happened was simply too long ago or unremarkable to be recorded.

Razib mentioned that depending on the cluster and the reference samples, that these clusters and groups that we see on our myOrigins maps can range from 1000-10,000 years in age.

relatedness of clusters

The good news is that genetics is blind to any preconceived notions. The bad news is that the software has to fit your results to the best population, even though it may not be directly a fit.  Hopefully, as we have more and better reference populations, the results will improve as well.

my origin components

pca chart

Razib showed a PCA (principal components analysis) graph, above. These graphs chart reference populations in different quadrants.  Where the different populations overlap is where they share common historic ancestors.  As you can see, on this graph with these reference populations, there is a lot of overlap in some cases, and none in others.

Your personal results would then be plotted on top of the reference populations. The graph below shows me, as the white “target” on a PCA graph created by Doug McDonald.

my pca chart

The Changing Landscape

A topic discussed privately among the group, and primarily among the bloggers, is the changing landscape of genetic genealogy over the past year or so.  In many ways I think the bloggers are the canaries in the mine.

One thing that clearly happened is that the proverbial tipping point occurred, and we’re past it. DNA someplace along the line became mainstream.  Today, DNA is a household word.  At gatherings, at least someone has tested, and most people have heard about DNA testing for genealogy or at least consumer based DNA testing.

The good news in all of this is that more and more people are testing. The bad news is that they are typically less informed and are often impulse purchasers.  This gives us the opportunity for many more matches and to work with new people.  It also means there is a steep learning curve and those new testers often know little about their genealogy.  Those of us in the “public eye,” so to speak, have seen an exponential spike in questions and communications in the past several months.  Unfortunately, many of the new people don’t even attempt to help themselves before asking questions.

Sometimes opportunity comes with work clothes – for them and us both.

I was talking with Spencer about this at the reception and he told me I was stealing his presentation.  He didn’t seem too upset by this:)

spencer and me

I had to laugh, because this falls clearly into the “be careful what you wish for, you may get it” category. The Genographic project through National Geographic is clearly, very clearly, a critical component of the tipping point, and this was reflected in Spencer’s presentation.  Although I covered quite a bit of Spencer’s presentation in my day 2 summary, I want to close with Spencer here.  I also want to say that if you ever have the opportunity to hear Spencer speak, please do yourself the favor and be sure to take that opportunity.  Not only is he brilliant, he’s interesting, likeable and very approachable.  Of course, it probably doesn’t hurt that I’ve known him now for 9 years!  I’ve never thought to have my picture taken with Spencer before, but this time, one of my friends did me the favor.

I have to admit, I love talking to Spencer, and listening to him. He is the adventurer through whom we all live vicariously.  The photo below is from the trip where Spencer, along with his crew, drove from London to Mongolia.  Not sure why he is standing on the top of the Land Rover, but I’m sure he will tell us in his upcoming book about that journey.

spencer on roof

I’m warning you all now, if I win the lottery, I’m going on the world tour that he hosts with National Geographic, and of course, you’ll all be coming with me via the blog!

Spencer talked about the consumer genomics market and where we are today.

spencer genomics

Spencer mentioned that genetic genealogy was a cottage industry originally. It was, and it was even smaller than that, if possible.  It actually was started by Bennett and his cell phone.  I managed to snap a picture of Bennett this weekend on the stage looking at his cell, and I thought to myself, “this is how it all started 14 years ago.”  Just look where we are today.  Thank you Michael Hammer for telling Bennett that you received “lots of phone calls from crazy genealogists like you.”

bennett first office

So, where exactly are we today?  In 2013, the industry crossed the millionth kit line.  The second millionth kit was sold in early summer 2014 and the third million will be sold in 2015.  No wonder we feel like a tidal wave has hit.  It has.

Why now?

DNA has become part of the national consciousness.  Businesses advertise that “it’s in our DNA.”  People are now comfortable sharing via social media like Facebook and Twitter.  Knowledge of what DNA can do and show you, and the secrets it can unlock, is spreading by word of mouth.  Spencer termed this the “viral spread threshold” and we’ve crossed that invisible line in the sand.  He terms 2013 as the year of infection and based on my blog postings, subscriptions, hits, reach and the number of e-mails I receive, I would completely agree.  Hold on tight for the ride!

Spencer talked about predictions for the near-term future and said a 5 year plan is impossible and that an 18 month plan is more realistic. He predicts that we will continue to see exponential growth over the next several years.  He feels that genetic genealogy testing will be the primary driver of growth because medical or health testing is subject to the clinical utility trap being experienced currently by 23andMe.  The Big 4 testing companies (Ancestry, 23andMe, Family Tree DNA and National Geographic) control 99% of the consumer market in the US.

Spencer sees a huge international market potential that is not currently being tapped. I do agree with him, but many in European countries are hesitant, and in some places, like France, DNA testing that might expose paternity is illegal.  When Europeans see DNA testing as a genealogical tool, he feels they will become more interested.  Most Europeans know where their ancestral village is, or they think they do, so it doesn’t have the draw for them that it does for some of us.

Ancestry testing (aka genetic genealogy as opposed to health testing) is now a mature industry with 100% growth rate.

Spencer also mentioned that while the Genographic data base is not open access, that affiliate researchers can send Nat Geo a proposal and thereby gain research access to the data base if their proposal is approved. This extends to citizen scientists as well.

spencer near term

Michael Hammer

You’ll notice that Michael Hammer’s presentation, “Ancient and Modern DNA Update, How Many Ancestral Populations for Europe,” is missing from this wrapup. It was absolutely outstanding, and fascinating, which is why I’m writing a separate article about his presentation in conjunction with some additional information.  So, stay tuned.

Testing, More Testing

It’s becoming quite obvious that the people who are doing the best with genetic genealogy are the ones who are testing the most family members, both close and distant. That provides them with a solid foundation for comparison and better ways to “drop matches” into the right ancestor box.  For example, if someone matches you and your mother’s sister, Aunt Margaret, especially if your mother is not available to test, that’s a very important hint that your match is likely from your mother’s line.
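The "ancestor box" idea above can be sketched in a few lines. This is only an illustration of the logic, assuming you have exported the list of people who match each tested relative; the names and match lists are made up, and the function is my own, not a feature of any vendor's tools. A shared match is a hint about which side someone is on, not proof. Confirming it still means checking that the matching segments actually overlap.

```python
def likely_side(match, maternal_kin_matches, paternal_kin_matches):
    """Suggest which side of the family a match probably comes from.

    maternal_kin_matches / paternal_kin_matches are the sets of people who
    also match a tested relative on that side (for example, a maternal aunt).
    """
    on_maternal = match in maternal_kin_matches
    on_paternal = match in paternal_kin_matches
    if on_maternal and on_paternal:
        return "both sides, needs segment comparison"
    if on_maternal:
        return "maternal"
    if on_paternal:
        return "paternal"
    return "unknown"


# Hypothetical match lists, for illustration only.
my_matches = ["Match A", "Match B", "Match C", "Match D"]
aunt_margaret_matches = {"Match A", "Match C"}   # mother's sister
paternal_cousin_matches = {"Match B", "Match C"}

for person in my_matches:
    side = likely_side(person, aunt_margaret_matches, paternal_cousin_matches)
    print(f"{person}: {side}")
```

The more relatives you test on each side, the fewer matches land in the "unknown" box, which is really the whole argument for testing everyone.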

So, in essence, while initially we would advise people to test the oldest person in a generational line, now we’ve moved to the “test everyone” mentality.  Instead of a survey, now we need a census.  The exception might be that the “child” does not necessarily need to be tested because both parents have tested.  However, having said that, I would perhaps not make that child’s test a priority, but I would eventually test that child anyway.  Why?  Because that’s how we learn.  Let me give you an example.

I was sitting at lunch with David Pike. We were discussing autosomal DNA generational transmission and inheritance.  He pulled out his iPad, passed it to me, and showed me a chromosome (not the X) that has been passed entirely intact from one generation to the next.  Had the child not been tested, we would never have known that.  Now, of course, if you’ll remember the 50% rule, by statistical prediction, the child should get half of the mother’s chromosome and half of the father’s, but that’s not how it worked.  So, because we don’t know what we don’t know, I’m now testing everyone I can find and convince in my family.  Unfortunately, my family is small.

Full genome testing is in the future, but we’re not ready yet. Several presenters mentioned full genome testing in some context.  Here’s the bottom line.  It’s not truly full genome testing today, only 95-96%.  The technology isn’t there yet, and we’re still learning.  In a couple of years, we will have the entire genome available for testing, and over time, the prices will fall.  Keep in mind that most of our genome is identical to that of all humans, and the autosomal tests today have been developed in order to measure what is different and therefore useful genealogically.  I don’t expect big breakthroughs due to full genome testing for genetic genealogy, although I could be wrong.  You can, however, count me in, because I’m a DNA junkie.  When the full genome test is below $1000, when we have comparison tools and when the coverage won’t necessitate doing a second or upgrade test a few years later, I’ll be there.

Thank you

I want to offer a heartfelt thank you to Max Blankfeld and Bennett Greenspan, founders of Family Tree DNA, shown with me in the photo below, for hosting and subsidizing the administrator’s conference – now for a decade. I look forward to seeing them, and all of the other attendees, next year.

I anticipate that this next decade will see many new discoveries resulting in tools that make our genealogy walls fall.  I can’t help but wonder what the article I’ll be writing on the 20th anniversary looking back at nearly a quarter century of genetic genealogy will say!

roberta, max and bennett

Ancient DNA Matching – A Cautionary Tale

egg

I hope that all of my readers realize that you are literally watching science hatch.  We are on the leading, and sometimes bleeding, edge of this new science of genetic genealogy.  Because many of these things have never been done before, we have to learn by doing and experimenting.  Because I blog about this, these experiments are “in public,” so there is no option of a private “oops.”  Fortunately, I’m not sensitive about these kinds of things.  Plus, I think people really enjoy coming along for the ride of discovery.  I mean, where else can you do that?  It’s really difficult to get a ride-along on the space shuttle!

One of the best pieces of advice I ever got was from someone who was taken from my life far too early.  I had made a mistake of some sort…don’t even remember what…and he gave me a card that said, “The only people who don’t make mistakes are the people who don’t try.”

This isn’t an “oops” moment.  More like an “aha” moment.  Or more precisely, a “huh” moment.  It falls in the “Houston, we’ve got a problem” category.

So, this week’s new discovery is that there seems to be some inconsistency in the matching to the Anzick kit at GedMatch.  Before I go any further, I want to say very clearly that this is in no way a criticism of anyone or any tool.  Every person involved is a volunteer and we would not be making any of these steps forward, including a few backwards, without these wonderful volunteers and tools.

I have reached out to the people involved and asked for their help to unravel this mystery, and I’m sharing the story with you, partly so you can understand what is involved, and the process, partly so that you don’t inadvertently encounter the same kinds of issues and draw unrealistic or incorrect conclusions, and partly so you can help.  If there has been any common theme in all of my articles about ancient DNA in the past week or so, it has been that we really don’t understand what conclusions to draw yet…and we still don’t.  So don’t.

Let’s introduce the players here.

The Players

Felix Chandrakumar has very graciously prepared the various ancient DNA files and uploaded them to GedMatch.  Felix has written a number of DNA analysis tools as well.

John Olson is one of the two volunteers who created and does everything at GedMatch, plus works a full time job.  By the way, in case you’re not aware, this is a contribution site, meaning they depend on your financial contributions to function, purchase hardware, servers, etc.  If you use this site, periodically scroll down and click on the donate button.  We, as a community, would be lost without John and his partner.

David Pike is a long time genetic genealogist who I have had the pleasure of working with on a number of Native American and related topics over the years. He also has created several genetic genealogy tools to deal with autosomal DNA. David prepared the Anzick files for some private work we were doing several months ago, so he has experience with this DNA as well.  Dr. Pike has a great deal of experience analyzing the endogamous population of Newfoundland, which is also admixed with Native Americans.

Marie Rundquist is also a long time genetic genealogist, who specializes in both technology and Acadian history along with genetic genealogy.  Acadians are proven to be admixed with Native Americans.  Marie shares my deep interest and commitment to Native American study and genetics.  Furthermore, Marie and I also share ancestors and co-administer several related projects.   As you might imagine, Marie and I took this opportunity immediately to see if she and her mother share any of Anzick’s segments with me and my mother.

So, a big thank you to all of these people.

The Mystery

When Felix originally e-mailed me about the Anzick kit being uploaded to GedMatch, as you might imagine, I stopped doing whatever I was doing and immediately went to study Anzick and the other ancient DNA kits.

I wrote about this experience in the article, “Utilizing Ancient DNA at GedMatch.”

As part of that process, I not only ran Anzick’s kit utilizing the “one to many” option, I also compared my own kit to Anzick’s.  My proven Native lines descend through my mother, so I ran her kit against Anzick’s as well, at the same thresholds, and I combined the two results to see where mother and I overlapped.

I showed these overlaps in the article, along with which genealogy lines they matched by utilizing my ancestor matching spreadsheet.

Everything was hunky dory…for then.

Day 2

The next day, I received a note from Felix that the Anzick kit may not have been fully tokenized at GedMatch previously, so I reran the Anzick “one to all” comparison and wrote about those results in the second article, “Analyzing the Native American Clovis Ancient Results.”  Because it wasn’t yet fully processed originally, the second results produced more matches, not fewer.

I wasn’t worried about the one to one comparison of Anzick to my own kit, because one to one comparisons are available immediately, while one to many comparisons are not, per the GedMatch instructions.

“Once you have loaded your data, you will be able to use some features of the site within a minute or so. Additional batch processing, which usually takes a couple of days, must complete before you can use some of the tools comparing you to everyone in the data pool.”

So, everything was still hunky dory.

Day 3

The next day, Marie and I had a few minutes, sometime between 2 and 3AM, and no, I’m not kidding.  We decided to compare results.  I decided it would be quicker to run the match again at GedMatch than to sort through my Master spreadsheet, into which I had copied the results and added other information.  So, I did a second download of the Anzick comparison, utilizing the exact same thresholds (200 SNPs, 2cM, and the rest left at the default,) and added them to a spreadsheet that Marie and I were passing back and forth, and sent them to Marie.  I noticed that there seemed to be fewer matches, but by then it was after 3AM and I decided to follow up on that later.

Not so hunky dory…but I didn’t know it yet.

Day 4

The following day, Dr. Ann Turner (MD), also a long-time genetic genealogist, posted the following comment on the article.

“These results, finding “what appear to be contemporary matches for the Anzick child”, seemed very counter-intuitive to me, so I asked John Olson of GEDMatch to look under the hood a bit more. It turns out the ancient DNA sequence has many no-calls, which are treated as universal matches for segment analysis. Another factor which should be examined is whether some of the matching alleles are simply the variants with the highest frequency in all populations. If so, that would also lead to spurious matching segments. It may not be appropriate to apply tools developed for genetic genealogy to ancient DNA sequences like this without a more thorough examination of the underlying data.”

I had been aware of the no-calls due to the work that Dr. David Pike did back in March with the Anzick raw data files, but according to David, that shouldn’t affect the results.

Here’s what Dr. Pike, a Professor of Mathematics, had to say:

“Yes, these forensic samples have very high No-Call rates, which may give rise to more false matches than we would normally experience.  Also, be aware that false matches are more prone to occur when using reduced thresholds (such as 100 SNPs and 1 cM) and unphased data.  In this case I don’t think there’s any way around using low thresholds, simply because we’re looking for very small blocks of DNA (probably nobody alive today will have any large matching blocks with the Anzick child).

On the assumption that there will be a nearly constant noise ratio, meaning that most people will have about the same number of false matches with the Anzick child, those who are from the same gene pool should have an increased number of real matches.  So by comparing the total amount of matching DNA, it ought to be possible to gauge people’s affinity with Anzick’s gene pool.”
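Dr. Pike's idea can be sketched in a few lines of Python. This is only a rough illustration, assuming each kit's comparison to Anzick is available as a list of segment lengths in cM, and assuming the median total across kits is a reasonable stand-in for the "noise" level. The kit names and numbers below are invented for the example and are not real results.

```python
from statistics import median

def total_cm(segments):
    """Sum the lengths (in cM) of all matching segments for one kit."""
    return sum(segments)

def affinity_over_baseline(kits):
    """Return each kit's total shared cM minus the median total across kits.

    Assumption: the median total approximates the roughly constant level
    of false matches that everyone is expected to have.
    """
    totals = {name: total_cm(segs) for name, segs in kits.items()}
    baseline = median(totals.values())
    return {name: round(total - baseline, 1) for name, total in totals.items()}

# Hypothetical kits and segment lengths, for illustration only.
example_kits = {
    "kit_a": [2.1, 2.4, 3.0],
    "kit_b": [2.0, 2.2],
    "kit_c": [2.3, 2.5, 3.4, 3.7, 2.2],
}
excess = affinity_over_baseline(example_kits)
for name, value in sorted(excess.items()):
    print(name, value)
```

A kit with a clearly positive value would be a candidate for real affinity with the Anzick gene pool, while values near or below zero would look like background noise. How large a difference is meaningful is not something this sketch can answer.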

Here are Felix’s comments about no-calls as well:

“Personally, no calls are fine as long as there are more SNPs matching above the threshold level because the possibility of errors occurring exactly on no-call positions for all the matches in all their matching segments is impossible.”

Courtesy of Felix, we’ll see an example of how no calls intersperse in  a few minutes.

If no-calls were causing spurious matches in the Anzick kit, you’d expect to see the same for the other ancient DNA kits.  I know that the Denisovan and Neanderthal kits also have many no-calls, and based on the nature of ancient DNA, I’m sure all of them do.  So, if no calls are the culprit, they should be affecting matches to the other kits in the same way, and they aren’t.

Hunky-doryness is being replaced by a nonspecific nagging feeling…same one I used to get when my teenagers were up to something.

Day 5

A day or so later, Felix uploaded file F999913 to replace F999912 with the complete SNPs from all of the companies.  The original 999912 kit only included the SNP locations utilized by Family Tree DNA.  Felix added the SNPs utilized by 23andMe not utilized at Family Tree DNA, and the ones from Ancestry as well.  This is great news for anyone who tested at those two companies, but I had utilized my kit from Family Tree DNA, so for me, there should be no difference at all.

I later asked Felix if he had changed anything else in the file, and he said that he had not.  He provided extensive documentation about what he had done.

I waited until kit F999912 was deleted to be sure tokenizing was complete for F999913 and then compared the data again.  As expected, Anzick’s one to all had more matches than before, because additional people were included due to the added SNPs from 23andMe and Ancestry.

Some of Anzick’s matches are in the contemporary range, at 3.1 estimated generations, with the largest cM segment of 22.8 and total cMs of 202.8.

anzick 999913

These relatively large segments cause Felix to question whether the sample is actually ancient.  I addressed my feelings on this in the article, “Ancient DNA Matches – What Do They Mean?”

Marie and Dr. Pike, both with extensive experience with admixed populations, addressed this as well.  Marie commented,

“Native DNA found in the Anzick sample hasn’t changed all of that much and may still be found in modern, Native American populations, and that if people have Native American ancestry, they’ll match to it.”

Dr. Pike says:

“I agree with Marie on this… within endogamous populations, there is an increased likelihood of blocks of DNA being preserved over lengthy time frames.  Moreover, even if a block of DNA gets cut up via recombination, within an endogamous population the odds of some parts of the block later reuniting in a person’s DNA are higher than otherwise.  And it exaggerates the closeness of [the] relationship that gets predicted when comparing people.

I have seen something similar within the Newfoundland & Labrador Family Finder Project, whereby lots of people are sharing small blocks of DNA, likely as a result of DNA from the early colonists still circulating among the modern gene pool.

As an anecdotal example, I have a semi-distant relative (with ancestry from Newfoundland) at 23andMe who shares 3 blocks of DNA with my father, 2 with my mother and 5 with me.  As you can imagine, the relative is predicted to be a closer cousin to me than she is to either of my parents!

It doesn’t take an endogamous or isolated population to see this effect.

It can also happen in families involving cousin marriages too, although that would be more pronounced and not quite the same thing as we’re discussing with respect to ancient DNA.”

This addition of other companies’ SNPs should not affect my matches with Anzick because my kits are both from FTDNA and won’t utilize the added SNPs.

However, I ran my and my mother’s matches again, and we had a significantly different outcome than either of the previous times.

I utilized the same threshold for all downloads and those are the only values I changed – 200 SNPs and 2cM, leaving the other values at default, for all Anzick comparisons to my mother and my kits.

I am not hunky-dory anymore.

The Heartburn

These matches, which should be the same in all three downloads, produced significantly different results.

Here are the numbers of matches, at the same thresholds, comparing me and Mom to the Anzick file:

Me and Anzick

  • original download 999912 – 47 matches
  • second download 999912 – 21 matches
  • 999913 – 35 matches

Mom and Anzick

  • original download 999912 – 63
  • second download 999912 – 37
  • 999913 – 36

And no, the 36/35 matches that mom and I have for 999913 are not all the same.

Kit | Number of Matches Between Me, Mother and Anzick
#1 - F999912 original download | 19
#2 - F999912 second download | 6
#3 - F999913 | 11

Of those various downloads, the following grid shows which ones matched each other.

Downloads Compared | #1 to #2 | #2 to #3 | #1 to #3 | All 3
# of Matches | 6 | 2 | 3 | 2

So, comparing the first download to the last download, of the 19 original matches, we lost 16 matches.  In the third download, we gained 8 matches and only 3 remained as common matches. So of 30 total matches between my mother, myself and Anzick, in two downloads that should have been exactly the same, only 3 matches held, or 10%.
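For anyone who wants to check the arithmetic, here is a minimal sketch in Python. The function name is my own invention for illustration; the three numbers are the ones from the first and last downloads above.

```python
def overlap_summary(first_count, last_count, common_count):
    """Summarize how two downloads of the same comparison differ.

    first_count  - matches in the first download
    last_count   - matches in the last download
    common_count - matches present in both downloads
    """
    lost = first_count - common_count      # in the first download only
    gained = last_count - common_count     # in the last download only
    total_rows = first_count + last_count  # every match row, both downloads
    share_held = common_count / total_rows
    return lost, gained, total_rows, share_held

lost, gained, total_rows, share_held = overlap_summary(19, 11, 3)
print(f"lost={lost}, gained={gained}, held={share_held:.0%} of {total_rows}")
# prints: lost=16, gained=8, held=10% of 30
```

Note that the 30 counts every match row in both downloads. Counting each distinct segment only once would give 19 + 11 - 3 = 27, so the share that held is roughly 10% either way.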

Obviously, something is wrong, but what, and where?  At that point, I asked Marie to download her and her mother’s results again too, and she experienced the same issue.

Clearly a problem exists someplace.  That’s the question I asked Felix, John and David to help answer.

I realize that this spreadsheet is very long, and I apologize, but I think this issue is much easier to see visually.  I’ve compiled the matches by color and shade to make looking at them relatively easy.

My matches to the Anzick kit are in shades of pink – the first match download being the lightest and the last one to kit F999913 being the darkest.  Mother is green, same shading scheme.

The three columns to the right show the matching segments for each download – shaded in green.  You can easily see which ones line up, meaning which ones match consistently across all three downloads.  There aren’t many.  They should all match.

anzick me mom problem

Obviously this led to many questions that I asked of the various players involved.

My first thought was that perhaps a matching algorithm change occurred in GedMatch, but John assured me that he had made no changes.

Next question was whether or not Felix changed something other than adding the 23andMe and Ancestry SNPs.  He had not.

Felix was kind enough to explain about bunching and to do some analysis on the files.

“When you have low thresholds, make sure you don’t allow errors. For example, at 200 SNPs, the default ‘Mismatch Evaluation window’ in GEDMatch is the same as the SNP threshold and the ‘Mismatch-Bunching limit’ is half of the mismatch evaluation window. So, at 200 [SNPs], you are allowing 1 error every 100 SNPs apart from no-calls.

I did some analysis on your phased mother’s kit, PF6656M1 so that at least we know that it is an IBD for one generation.  The spreadsheet (below) are segments I found at 2 cM/200 SNPs threshold without allowing any errors.”

Kit PF6656M1 is one single kit created by phasing my data against my mother’s so that we don’t have to run both kits.  I had not utilized the phased kit previously, so I was interested in his results.

felix anzick

The results above confirm the matches on chromosomes 2, 17, 19 and 21, but introduce a new match on chromosome 4.  This match was present in the original download, but not in the second or third download, so once again, we have disparate data, except the thresholds Felix used were at a different level.

One of the more interesting things that Felix included is the no-call match information, the three columns to the right.  I want to show what the no-calls look like.  There are not huge segments that are blank and are being called as matches because they are no-calls, when they shouldn’t be.  No calls are scattered like salt and pepper.  In fact, no calls happen in every kit and they are called as matches so that they don’t disrupt a valid match string, which could otherwise make it too small to be considered a match.  Of course, ancient DNA has more no-calls than contemporary DNA kits.
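To make the "no-calls count as matches" idea concrete, here is a small Python sketch. It is not GedMatch's actual algorithm, just an illustration under my own assumptions: genotypes are two-letter strings, a no-call is written as "--", and two unphased genotypes are treated as matching when they share at least one allele. The six positions are invented.

```python
NO_CALL = "--"  # assumption: how a missing genotype is encoded in this sketch

def genotypes_compatible(a, b, no_call_matches=True):
    """True if two unphased genotypes share at least one allele.

    A no-call on either side is treated as a match when no_call_matches
    is True, and as a mismatch otherwise.
    """
    if a == NO_CALL or b == NO_CALL:
        return no_call_matches
    return bool(set(a) & set(b))

def longest_matching_run(kit_a, kit_b, no_call_matches=True):
    """Length, in SNPs, of the longest run of consecutive matching positions."""
    best = current = 0
    for a, b in zip(kit_a, kit_b):
        if genotypes_compatible(a, b, no_call_matches):
            current += 1
            best = max(best, current)
        else:
            current = 0
    return best

# Invented genotypes at six consecutive positions, for illustration only.
modern = ["AG", "CC", "TT", "AA", "GG", "CT"]
ancient = ["AA", "--", "TT", "--", "GG", "CC"]

print(longest_matching_run(modern, ancient))                         # 6
print(longest_matching_run(modern, ancient, no_call_matches=False))  # 2
```

With no-calls counted as matches, the run stays intact across all six positions. If each scattered no-call broke the run instead, the same data would yield only short fragments, which is why a kit with many no-calls would otherwise rarely reach any SNP threshold.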

Below are the first few match positions from chromosome 2 where mother, Anzick and I have a confirmed match across all downloads.  The genotype shows you that both kits match.

felix no calls

For consistency, I ran the same kits that Felix ran, PF6656M1 and F999913, with the original thresholds I had used, and found the following:

Chr Start Location End Location Centimorgans (cM) SNPs
1 31358221 33567640 2.0 261
2 218855489 220351363 2.4 253
4 1957991 3571907 2.5 209
5 2340730 2982499 2.3 200
17 53111755 56643678 3.4 293
19 46226843 48568731 2.2 250
21 35367409 36761280 3.7 215

This introduces chromosomes 1 and 5, not shown above.   The chromosome 1 match was shown in the first and second download, but not the third, and the chromosome 5 match was shown in the first download only, but not the second or third.

Can you see me beating my head against the wall yet??

In a fit of apparent insanity, I decided to try, once again, an individual download of Anzick compared to my mother and to me, but not utilizing the phased kit – the original F6656 and F9141, and at the original thresholds, for consistency.  I wanted to see if the matches were the same now as they were a day or so ago.  They should be exact.  This first one is mine.

me second 999913

What you should see are two identical downloads.  I have color coded the rows so you can see easily – and what you should see are candy-cane stripes – one red and one white for every match location.

That’s not what we’re seeing.  The kits are the same, the match parameters are the same, but the results are not.  Once again, the downloads don’t match.

I did another match on mother and Anzick, and her results were consistent between the first and second match to kit F999913.

mom second 999913

This begs the next question.  Have mother’s results always been consistent, suggesting a problem with my kit?

I sorted all of her downloads, and no, they are not consistent, except for the first and second download matches to kit F999913, shown above.  The inconsistencies show up in both mother’s and my kits, although not in the same locations.  Recall also that Marie had the same issue.

In Summary

Something is wrong, someplace.  I know that sounds intuitively obvious – NOW.  But it wasn’t initially and I wouldn’t even have suspected a problem without running the second and third downloads, quite unintentionally.  Most people never do that, because once you’ve done the match, you have no reason to ever match to that particular person again.  Given that, you’ll never know if a problem exists.

So, the only Anzick GedMatch matches I have any confidence in at all, at this point, are the few that are consistent between all of the downloads, and I didn’t add the fourth download into the mix.  I don’t see any point because I’ve pretty much concluded that until we determine where the issue resides, I won’t have confidence in the results.

The next question that comes to mind, and that I can’t answer, is whether or not this issue is present in contemporary matching kits – or if this is somehow an ancient DNA problem – although I don’t know quite how that could be – since matching is matching.

I haven’t saved any matches that I’ve run to other people in spreadsheets, so I can’t go back and see if a GedMatch match today produces the exact same results as a previous match.

Clearly there is no diagnosis or solution in this summary.  We are not yet hunky dory.

What You Can Do

  1. Run your Anzick and ancient DNA matches multiple times, at the same exact thresholds, on different days, to see if your results are consistent or inconsistent. Same kit, same thresholds, the results should be identical.
  2. If you have some saved GedMatch matches with contemporary people, and you are positive of the match thresholds used, please run them again to see if the results are identical. They should be.
  3. No drawing of or jumping to conclusions, please, especially about ancient DNA:) It’s a journey and we are fellow pilgrims!
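If you save your downloads, a few lines of Python can do the comparison for you. This is a minimal sketch, assuming each download has been turned into a list of rows with chromosome, start and end positions; the function names are my own. The first run below uses the segments from my PF6656M1 to F999913 comparison earlier in this article, and the second run is hypothetical, with the chromosome 1 and 5 rows removed just to show what a discrepancy looks like.

```python
def segment_key(row):
    """Identify a segment by chromosome, start and end position."""
    return (row["chr"], row["start"], row["end"])

def compare_runs(run_a, run_b):
    """Report which segments appear in both runs and which in only one."""
    keys_a = {segment_key(r) for r in run_a}
    keys_b = {segment_key(r) for r in run_b}
    return {
        "both": sorted(keys_a & keys_b),
        "only_first": sorted(keys_a - keys_b),
        "only_second": sorted(keys_b - keys_a),
    }

# Segments from the PF6656M1 to F999913 comparison shown earlier.
run_a = [
    {"chr": 1, "start": 31358221, "end": 33567640},
    {"chr": 2, "start": 218855489, "end": 220351363},
    {"chr": 4, "start": 1957991, "end": 3571907},
    {"chr": 5, "start": 2340730, "end": 2982499},
    {"chr": 17, "start": 53111755, "end": 56643678},
    {"chr": 19, "start": 46226843, "end": 48568731},
    {"chr": 21, "start": 35367409, "end": 36761280},
]
# Hypothetical second run: same comparison with two segments missing.
run_b = [row for row in run_a if row["chr"] not in (1, 5)]

report = compare_runs(run_a, run_b)
print("consistent:", len(report["both"]))
print("only in first run:", report["only_first"])
print("only in second run:", report["only_second"])
```

If both "only" lists come back empty, the two runs are identical. If you load the rows from a saved spreadsheet export, for example with Python's csv.DictReader, remember that the values arrive as text and the column names will be whatever your file uses.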

If your results are not consistent, please document the problem and let the appropriate person know.  I don’t want to overwhelm John at GedMatch but I’m concerned at this point that the problem may not be isolated to ancient DNA matching since the issue seems to extend to Marie’s results as well.

If your results, especially to Anzick, from previous matches to now are consistent, that’s worth knowing too.  Please add a comment to that effect.

Thoughts and ideas are welcome.

Ancient DNA Matches – What Do They Mean?

The good news is that my three articles about the Anzick and other ancient DNA of the past few days have generated a lot of interest.

The bad news is that it has generated hundreds of e-mails every day – and I can’t possibly answer them all personally.  So, if you’ve written me and I don’t reply, I apologize and  I hope you’ll understand.  Many of the questions I’ve received are similar in nature and I’m going to answer them in this article.  In essence, people who have matches want to know what they mean.

Q – I had a match at GedMatch to <fill in the blank ancient DNA sample name> and I want to know if this is valid.

A – Generally, when someone asks if an autosomal match is “valid,” what they really mean is whether or not this is a genealogically relevant match or if it’s what is typically referred to as IBS, or identical by state.  Genealogically relevant samples are referred to as IBD, or identical by descent.  I wrote about that in this article with a full explanation and examples, but let me do a brief recap here.

In genealogy terms, IBD is typically used to mean matches over a particular threshold that can be or are GENEALOGICALLY RELEVANT.  Those last two words are the clue here.  In other words, we can match them with an ancestor with some genealogy work and triangulation.  If the segment is large, and by that I mean significantly over the threshold of 700 SNPs and 7cM, even if we can’t identify the common ancestor with another person, the segment is presumed to be IBD simply because of the math involved with the breakdown of segments into pieces.  In other words, a large segment match generally means a relatively recent ancestor and a smaller segment means a more distant ancestor.  You can readily see this breakdown on this ISOGG page detailing autosomal DNA transmission and breakdown.

Unfortunately, often smaller segments, or ones determined to be IBS are considered to be useless, but they aren’t, as I’ve demonstrated several times when utilizing them for matching to distant ancestors.  That aside, there are two kinds of IBS segments.

One kind of IBS segment is where you do indeed share a common ancestor, but the segment is small and you can’t necessarily connect it to the ancestor.  These are known as population matches and are interpreted to mean your common ancestor comes from a common population with the other person, back in time, but you can’t find the common ancestor.  By population, we could mean something like Amish, Jewish or Native American, or a country like Germany or the Netherlands.

In the cases where I’ve utilized segments significantly under 7cM to triangulate ancestors, those segments would have been considered IBS until I mapped them to an ancestor, and then they suddenly fell into the IBD category.

As you can see, the definitions are a bit fluid and are really defined by the genealogy involved.

The second kind of IBS is where you really DON’T share an ancestor, but your DNA and your match’s DNA have managed to mutate to a common state by convergence, or where your Mom’s and Dad’s DNA combine to form a pseudo match, in which you match someone on a segment run long enough to be considered a match at a low level.  I discussed how this works, with examples, in this article.  Look at example four, “a false match.”

So, in a nutshell, if you know who your common ancestor is on a segment match with someone, you are IBD, identical by descent.  If you don’t know who your common ancestor is, and the segment is below the normal threshold, then you are generally considered to be IBS – although that may or may not always be true.  There is no way to know if you are truly IBS by population or IBS by convergence, with the possible exception of phased data.
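The nutshell above can be written as a simple rule of thumb. This is only a sketch of the labels as I use them in this article, with a hypothetical function name. I said "significantly over" the 700 SNP and 7cM threshold, so treating the threshold as a hard cutoff is a simplification.

```python
def classify_match(cm, snps, ancestor_identified,
                   cm_threshold=7.0, snp_threshold=700):
    """Label a segment match using the rule of thumb described above.

    ancestor_identified - True if the segment has been mapped to a
    known common ancestor through genealogy work and triangulation.
    """
    if ancestor_identified:
        return "IBD"
    if cm >= cm_threshold and snps >= snp_threshold:
        return "presumed IBD"
    # Cannot tell population from convergence without more evidence.
    return "IBS (population or convergence)"

print(classify_match(2.4, 253, ancestor_identified=True))   # IBD
print(classify_match(2.4, 253, ancestor_identified=False))  # IBS (population or convergence)
```

The same small segment changes category as soon as it is mapped to an ancestor, which is exactly the fluidity described above.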

Data phasing is when you can compare your autosomal DNA with one or both parents to determine which half you obtained from whom.  If you are a match by convergence where your DNA run matches that of someone else because the combination of your parents’ DNA happens to match their segment, phasing will show that clearly.  Here’s an example for only one location utilizing only my mother’s data phased with mine.  My father is deceased and we have to infer his results based on my mother’s and my own.  In other words, mine minus the part I inherited from my mother = my father’s DNA.

My Result | My Result | Mother’s Result | Mother’s Result | Father’s Inferred Result | Father’s Inferred Result
T | A | T | G | A | (unknown)

In this example of just one location, you can see that I carry a T and an A in that location.  My mother carries a T and a G, so I obviously inherited the T from her because I don’t have a G.  Therefore, my father had to have carried at least an A, but we can’t discern his second value.
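Here is the same reasoning as a small Python sketch for a single location. It is not the GedMatch phasing tool, just an illustration with a hypothetical function name, assuming genotypes are two-letter strings.

```python
def phase_against_mother(child, mother):
    """Split a child's genotype into (maternal, paternal) alleles.

    Returns None when the location cannot be resolved from the mother
    alone, or when the two genotypes are inconsistent with each other.
    """
    c1, c2 = child
    if c1 == c2:
        # Homozygous child: one copy came from each parent.
        return (c1, c2) if c1 in mother else None
    from_mother = [allele for allele in (c1, c2) if allele in mother]
    if len(from_mother) != 1:
        # Either allele (or neither) could have come from the mother.
        return None
    maternal = from_mother[0]
    paternal = c2 if maternal == c1 else c1
    return (maternal, paternal)

# The location from the example above: I carry T and A, my mother T and G.
print(phase_against_mother("TA", "TG"))  # ('T', 'A')
```

The paternal allele returned is only the one my father passed to me. His second allele at that location stays unknown, just as in the table. When the mother carries the same two alleles as the child, the location cannot be resolved from her data alone, and the sketch returns None.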

This example utilized only one location.  Your autosomal data file will hold between 500,000 and 700,000 locations, depending on the vendor you tested with and the version level.

You can phase your DNA with that of your parent(s) at GedMatch.  However, if both of your parents are living, an easier test would be to see if either of your parents match the individual in question.  If neither of your parents match them, then your match is a result of convergence or a data read error.

So, this long conversation about IBD and IBS is to reach this conclusion.

All of the ancient specimens are just that, ancient, so by definition, you cannot find a genealogy match to them, so they are not IBD.  Best case, they are IBS by population.  Worst case, IBS by convergence.  You may or may not be able to tell the difference.  The reason, in my example earlier this week, that I utilized my mother’s DNA and only looked at locations where we both matched the ancient specimens was because I knew those matches were not by convergence – they were in fact IBS by population because my mother and I both matched Anzick.

ancient compare5

Q – What does this ancient match mean to me?

A – Doggone if I know.  No, I’m serious.  Let’s look at a couple possibilities, but they all have to do with the research you have, or have not, done.

If you’ve done what I’ve done, and you’ve mapped your DNA segments to specific ancestors, then you can compare your ancient matching segments to your ancestral spreadsheet map, especially if you can tell unquestionably which side the ancestral DNA matches.  In my case, shown above, the Clovis Anzick matched my mother and me on the same segment and we both matched Cousin Herbie.  We know unquestionably who our common ancestor is with cousin Herbie – so we know, in our family line, which line this segment of DNA shared with Anzick descends through.

ancient compare6

If you’re not doing ancestor mapping, then I guess the Anzick match would come in the category of, “well, isn’t that interesting.”  For some, this is a spiritual connection to the past, a genetic epiphany.  For others, it’s “so what.”

Maybe this is a good reason to start ancestor mapping!  This article tells you how to get started.

Q – Does my match to Anzick mean he is my ancestor?

A – No, it means that you and Anzick share common ancestry someplace back in time, perhaps tens of thousands of years ago.

Q – I match the Anzick sample.  Does this prove that I have Native American heritage? 

A – No, and it depends.  Don’t you just hate answers like this?

No, this match alone does not prove Native American heritage, especially not at IBS levels.  In fact, many people who don’t have Native heritage match small segments.  How can this be?  Well, refer to the IBS by convergence discussion above.  In addition, the Anzick child descended from an Asian population whose members migrated, crossing from Asia via Beringia.  That Eurasian population also settled part of Europe – so you could be matching on very small segments from a common population in Eurasia long ago.  In a paper just last year, this was discussed when Siberian ancient DNA was shown to be related to both Native Americans and Europeans.

In some cases, a match to Anzick on a segment already attributed to a Native line can confirm or help to confirm that attribution.  In my case, I found the Anzick match on segments in the Lore family who descend from the Acadians who were admixed with the Micmac.  I have several Anzick match segments that fit those criteria.

A match to Anzick alone doesn’t prove anything, except that you match Anzick, which in and of itself is pretty cool.

Q – I’m European with no ancestors from America, and I match Anzick too.  How can that be?

A – That’s really quite amazing, isn’t it?  Just this week in Nature, a new article was published discussing the three “tribes” that settled or founded the European populations.  This, combined with the Siberian ancient DNA results that connect the dots between an ancient population that contributed to both Europeans and Native Americans, explains a lot.

3 European Tribes

If you think about it, this isn’t a lot different than the discovery that all Europeans carry some small amount of Neanderthal and Denisovan DNA.

Well, guess what….so does Anzick.

Here are his matches to the Altai Neanderthal.

Chr Start Location End Location Centimorgans (cM) SNPs
2 241484216 242399416 1.1 138
3 19333171 21041833 2.6 132
6 31655771 32889754 1.1 133

He does not match the Caucasus Neanderthal.  He does, however, match the Denisovan individual on one location.

Chr Start Location End Location Centimorgans (cM) SNPs
3 19333171 20792925 2.1 107

Q – Maybe the scientists are just wrong and the burial is not 12,500 years old,  maybe just 100 years old and that’s why the results are matching contemporary people.

A – I’m not an archaeologist, nor do I play one…but I have been closely involved with numerous archaeological excavations over the past decade with The Lost Colony Research Group, several of which recovered human remains.  The photo below is me with Anne Poole, my co-director, sifting at one of the digs.

anne and me on dig

There are very specific protocols that are followed during and following excavation and an error of this magnitude would be almost impossible to fathom.  It would require  kindergarten level incompetence on the part of not one, but all professionals involved.

In the Montana Anzick case, in the paper itself, the findings and protocols are both discussed.  First, the burial was discovered directly beneath the Clovis layer where more than 100 tools were found, and the Clovis layer was undisturbed, meaning that this is not a contemporary burial that was buried through the Clovis layer.  Second, the DNA fragmentation that occurs as DNA degrades correlated closely to what would be expected in that type of environment at the expected age based on the Clovis layer.  Third, the bones themselves were directly dated using XAD-collagen to 12,707-12,556 calendar years ago.  Lastly, if the remains were younger, the skeletal remains would match most closely with Native Americans of that region, and that isn’t the case.  This graphic from the paper shows that the closest matches are to South Americans, not North Americans.

anzick matches

This match pattern is also confirmed independently by the recent closest GedMatch matches to South Americans.

Q – How can this match from so long ago possibly be real?

A – That’s a great question and one that was terribly perplexing to Dr. Svante Paabo, the man who is responsible for producing the full genome sequence of the first, and now several more, Neanderthals.  The expectation was, understanding that autosomal DNA gets watered down by 50% in every generation through recombination, that ancient genomes would be long gone and not present in modern populations.  Imagine Svante’s surprise when he discovered that not only is that not true, but those ancient DNA segments are present in all Europeans and many Asians as well.  He too agonized over the question about how this is possible, which he discussed in this great video.  In fact he repeated these tests over and over in different ways because he was convinced that modern individuals could not carry Neanderthal DNA – but all those repeated tests did was to prove him wrong.  (Paabo’s book, Neanderthal Man, In Search of Lost Genomes is an incredible read that I would highly recommend.)
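To put the 50% expectation in numbers, here is a minimal sketch.  It treats the halving as a simple average and assumes a generation length of 25 years, which is only an illustrative assumption and not a figure from Paabo’s work.

```python
# Naive expected share of one ancestor's autosomal DNA after n generations,
# under the simple "halved in every generation" expectation.

YEARS_PER_GENERATION = 25  # assumed generation length, for illustration only


def expected_share(generations: int) -> float:
    """Return 0.5 ** generations, the naive expected fraction inherited."""
    return 0.5 ** generations


def generations_ago(years: int) -> int:
    """Convert a number of years into a rough generation count."""
    return years // YEARS_PER_GENERATION


if __name__ == "__main__":
    for label, years in [("great-grandparent", 75), ("Anzick child", 12_500)]:
        n = generations_ago(years)
        print(f"{label}: {n} generations, expected share {expected_share(n):.3g}")
```

Under that naive model, the share from roughly 500 generations back is effectively zero, which is why intact matching segments were such a surprise.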

What this means is that the population at one time, and probably at several different times, had to be very small.  In fact, it’s very likely that many times different pockets of the human race were in great jeopardy of dying out.  We know about the ones that survived.  Probably many did perish leaving no descendants today.  For example, no Neanderthal mitochondrial DNA has been found in any living or recent human.

Consider a small population, let’s say 5 males and 5 females who somehow got separated from their family group and founded a new group, by necessity.  In fact, this could well be a description of how the Native Americans crossed Beringia.  Those 5 males and 5 females are the founding population of the new group.  If they survive, all of the males will carry the founding men’s haplogroups – let’s say they are Q and C, and all of the descendants will carry the mitochondrial haplogroups of the females – let’s say A, B, C, D and X.

There is a very limited amount of autosomal DNA to pass around.  If all of those 10 people are entirely unrelated, which is virtually impossible, there will be only 10 possible sources of DNA to be selected from.  Within a few generations, everyone will carry part of those 10 ancestors’ DNA.  We all have 8 ancestors at the great-grandparent level.  By the time those original settlers’ descendants had great-great-grandparents, of which each one had 16, at least 6 of those 16 places in the tree had to be filled by original people who already appear elsewhere in it.
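The counting behind that last statement is a pigeonhole argument, and it can be sketched in a few lines.  This assumes, for illustration, that every ancestor slot at the chosen generation is filled by one of the founders themselves; the function names are made up for this example.

```python
# Pigeonhole count of repeated ancestor slots in a closed founding population.


def ancestor_slots(generations_back: int) -> int:
    """Number of ancestor slots that many generations back (2 parents per step)."""
    return 2 ** generations_back


def minimum_repeated_slots(generations_back: int, founders: int) -> int:
    """Fewest slots that must be filled by a founder already used elsewhere."""
    return max(0, ancestor_slots(generations_back) - founders)


if __name__ == "__main__":
    # 3 = great-grandparents, 4 = great-great-grandparents, and so on.
    for g in range(3, 7):
        print(g, ancestor_slots(g), minimum_repeated_slots(g, founders=10))
```

With 10 founders, 8 great-grandparent slots can all be different people, but 16 great-great-grandparent slots cannot, and the shortfall doubles with every further generation.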

There was only so much DNA to be passed around.  In time, some of the segments would no longer be able to be recombined because when you look at phasing, the parents’ DNA was exactly the same, example below.  This is what happens in endogamous populations.

My Result | My Result | Mother’s Result | Mother’s Result | Father’s Result | Father’s Result
T | T | T | T | T | T

Let’s say this group’s descendants lived without contact with other groups, for maybe 15,000 years in their new country.  That same DNA is still being passed around and around because there was no source for new DNA.  Mutations did occur from time to time, and those were also passed on, of course, but that was the only source of changed DNA – until they had contact with a new population.

When they had contact with a new population and admixture occurred, the normal 50% recombination/washout in every generation began – but for the previous 15,000 years, there had been no 50% shift because the DNA of the population was, in essence, all the same.  A study about the Ashkenazi Jews that suggests they had only a founding population of about 350 people 700 years ago was released this week – explaining why Ashkenazi Jewish descendants have thousands of autosomal matches and match almost everyone else who is Ashkenazi.  I hope that eventually scientists will do this same kind of study with Anzick and Native Americans.

If the “new population” we’ve been discussing was Native Americans, their males 15,000 years later would still carry haplogroups Q and C and the mitochondrial DNA would still be A, B, C, D and X.  Those haplogroups, and subgroups formed from mutations that occurred in their descendants, would come to define their population group.

In some cases, today, Anzick matches people who have virtually no non-Native admixture at the same level as if they were just a few generations removed, shown on the chart below.

anzick gedmatch one to all

Since, in essence, these people still haven’t admixed with a new population group, those same ancient DNA segments are being passed around intact, which tells us how incredibly inbred this original small population must have been.  This is known as a genetic bottleneck.

The admixture report below is for the first individual on the Anzick one to all Gedmatch compare at 700 SNPs and 7cM, above.  In essence, this currently living non-admixed individual still hasn’t met that new population group.

anzick1

If this “new population” group was Neanderthal, perhaps they lived in small groups for tens of thousands of years, until they met people exiting Africa, or Denisovans, and admixed with them.

There weren’t a lot of people anyplace on the globe, so by virtue of necessity, everyone lived in small population groups.  Looking at the odds of survival, it’s amazing that any of us are here today.

But, we are, and we carry the remains, the remnants of those precious ancestors, the Denisovans, the Neanderthals and Anzick.  Through their DNA, and ours, we reach back tens of thousands of years on the human migration path.  Their journey is also our journey.  It’s absolutely amazing and it’s no wonder people have so many questions and such a sense of enchantment.  But it’s true – and only you can determine exactly what this means to you.

Analyzing the Native American Clovis Anzick Ancient Results

This ancient DNA truly is the gift that keeps on giving.

Today, Felix Chandrakumar e-mailed me and told me that the Anzick results were not yet fully processed at Gedmatch when I performed a “compare to all.”  He knows this because he knows when he uploaded the results, and after they were finished, he ran the same compare and obtained vastly different results.  I am updating my original article to point to this one, so the data will be accurately reflected.

In fact, the results are utterly fascinating, take your breath away kind of fascinating.  Felix wrote an article about his findings, Clovis-Anzick-1 ancient DNA have matches with living people!

While finding what appear to be contemporary matches for the Anzick child may sound ho-hum, it’s not, and when you look at the results and the message they hold for us, it’s absolutely astounding.

Felix ran his comparison with default values of 7cM.  This is the threshold that is typically utilized as the line in the sand between “real” matches and IBS matches – real meaning the results are, or could be if you could find your common ancestor, genealogically relevant.  In this case, that clearly isn’t true.

The exception to this rule is heavily endogamous groups, such as Ashkenazi Jewish people, who are related to every other Ashkenazi Jewish person autosomally.  It seems, looking at these results, that this is the same situation we find with the 12,500 year old Anzick child and currently living people.  This population had to be painfully small for a very long time and the DNA had to exist in every person within that population group for it to be passed in segments this large to people living today.

After receiving Felix’s e-mail, of course, I had to go back and run the compares again.  In particular, I wanted to run the one to many, as he had.

I began at the 1cM level and noticed that I received exactly 1500 results, which seemed to me like a cutoff – not an actual number of matches.  So, I upped that threshold to 2, then 3, then 4, then 5, then 6, then finally to the default of 7.  It was only at 7, the IBS/IBD default, that the results were under the 1500 threshold, at 1466.
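The stepping process described above can be sketched as a small search.  This is only an illustration: the function name is hypothetical, the match sizes in the usage note are synthetic, and it assumes the tool simply truncates its result list at a fixed display cap, which is my reading of the 1500 figure rather than documented GedMatch behavior.

```python
from typing import Iterable, Optional


def lowest_uncapped_threshold(
    match_sizes_cm: Iterable[float],
    thresholds_cm: Iterable[float],
    display_cap: int,
) -> Optional[float]:
    """Return the first threshold whose match count falls below the cap."""
    sizes = list(match_sizes_cm)
    for threshold in thresholds_cm:
        # Count the matches that would still be listed at this threshold.
        count = sum(1 for size in sizes if size >= threshold)
        if count < display_cap:
            return threshold
    return None  # every threshold tried still hit the cap
```

For example, with synthetic sizes `[1.5, 2.5, 3.5, 7.2, 8.0]`, thresholds `range(1, 8)` and a cap of 3, the first threshold that gets under the cap is 4.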

1466 current matches?????

This is absolutely amazing.  The Anzick child lived about 12,500 years ago in Montana.  How are 1466 matches to currently living people possible?

Many of these matches are to people from the southwest and Mexico today.  They are not, for the most part, from eastern Canada.

Let’s take a look at what we found.

In the 1466 results, as Felix mentioned, the closest matches match at current “cousin” levels to Anzick.  The highest 7 matches that show haplogroups are haplogroup Q1a3a.  Unfortunately, with the constant renaming of the haplogroups recently, it’s difficult to interpret the haplogroup exactly, which is why we’ve gone to SNP names.  Looking at some of the names and e-mails, several appear to carry Spanish surnames or be from Mexico or South America.

Of the 1466 results:

  • 2 were Y haplogroup C
  • 79 were Y haplogroup Q
  • 520 carried a mitochondrial DNA haplogroup of A, B, C, D, M or X
  • Of the 79 haplogroup Q carriers, 52 also carried a Native mitochondrial haplogroup.
  • A total of 549 individuals out of 1466 carried at least one Native American haplogroup, or about 37.5%.  That’s amazingly high.
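The 549 total follows from the counts above by adding the Y and mitochondrial carriers and subtracting the people counted twice.  A minimal sketch of that arithmetic, assuming the 2 haplogroup C carriers do not also carry a Native mitochondrial haplogroup (only the overlap for haplogroup Q is reported):

```python
# Inclusion-exclusion over the haplogroup counts listed above.


def carriers_of_either(y_carriers: int, mt_carriers: int, both: int) -> int:
    """People carrying a Native Y haplogroup, a Native mtDNA haplogroup, or both."""
    return y_carriers + mt_carriers - both


y_native = 2 + 79   # Y haplogroup C plus Y haplogroup Q
mt_native = 520     # mitochondrial A, B, C, D, M or X
both = 52           # haplogroup Q carriers who also carry a Native mtDNA haplogroup
total_matches = 1466

either = carriers_of_either(y_native, mt_native, both)
print(either, f"{either / total_matches:.1%}")
```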

Of these closest matches who are Y haplogroup Q, all also carry various Native American mitochondrial DNA haplogroups, so these people may not be heavily admixed.  In other words, they may be almost “pure” Native American.

In order to test this theory, I entered the number of the kit that rated the highest in terms of total cM at 160.1 with the largest segment at 14.8.  You can click on the images to enlarge.

anzick1

As you can see, this individual is very nearly 100% Native American.

The second individual on the list, who may be from Guatemala, also carries almost no admixture.

anzick2

Of the highest 21 matches that listed any haplogroup information, all have either or both Native Y or mitochondrial DNA haplogroups.

Out of curiosity, I ran the first person on the list who had neither a Native American Y nor mitochondrial haplogroup – both being European.

As you can see, below, they are still clearly heavily Native American, but clearly admixed.

anzick3

I moved to the last person of the 1466 on this list whose DNA matched at a total of 7cM, who did not carry a Native haplogroup.  This individual, below, is more heavily admixed.

anzick 3.5

Lastly, I ran the same admixture tool on the last person with a total of 7cM matching who did have a Native American mitochondrial haplogroup.

anzick4

Not surprisingly, the individual with almost no non-Native admixture is much more likely to carry the ancient segments in higher percentages than the individuals who are admixed.   This again strongly suggests that at one point, these segments were present in an entire group of Native people and may still be present in very high numbers in people who carry no admixture.

Out of curiosity, and assuming that these first two individuals are not known to be related to each other, I ran them against each other in a one to one comparison.

There were no matches at the default values, but by dropping them just a little, to 5cM and 500 SNPs, they match on 6 segments.

anzick5

It looks like they should match on chromosome 17 at the 700 SNP/7 cM default threshold.

At 200 SNPs and 2cM, there were 67 segments.  These are clearly ancient in nature and size, but matching just the same.  By lowering the threshold to 100 SNPs and 1cM, they share a whopping 990 segments.

Indeed, these two men very clearly share a lot of population specific DNA from the ancient people of the New World, including that of the Anzick male child who lived in Montana 12,500 years ago.

Utilizing Ancient DNA at GedMatch

Mummy of 6 month old boy found in Greenland

It has been a wonderful week for those of us following ancient DNA full genome sequencing, because now we can compare our own results to those of the ancient people found whose DNA has been fully sequenced, including one Native American.

Felix Chandrakumar has uploaded the autosomal files of five ancient DNA specimens that have been fully sequenced to GedMatch.  Thanks Felix.

When news of these sequences first hit the academic presses, I was wishing for a way to compare our genomes – and now my wish has come true.

Utilizing GedMatch’s compare one to all function, I ran all of the sequences individually and found, surprisingly, that there are, in some cases, matches to contemporary people today.  I dropped the cM measure to 1 for both autosomal and X.

Please note that because these are ancient DNA sequences, they will all have some segments missing and none can be expected to be entirely complete.  Still, these sequences are far better than nothing.

1.  Montana Anzick at GedMatch

This is the only clearly Native American sample.

http://www.y-str.org/2014/09/clovis-anzick-dna.html

F999912

9-27-2014 – Please note that kit F999912 has been replaced by kit F999913.

No matches at 1cM in the compare to all.  This must be because the SNP count is still at default thresholds, in light of information discovered later in this article.

Update – as it turns out, this kit was not finished processing when I did the one to all compare.  After it finished, the results were vastly different.  See this article for results.

2.  Paleo Eskimo from Greenland at GedMatch

http://www.y-str.org/2013/12/palaeo-eskimo-2000-bc-dna.html

F999906

Thirty-nine matches with segments as large as 3.8cM.  One group of matches appears to be a family.  One of these matches is my cousin’s wife.  That should lead to some interesting conversation around the table this holiday season!  All of these matches, except 1, are on the X chromosome.  This must be a function of these segments being passed intact for many generations.

I wrote about some unusual properties of X chromosomal inheritance and this seems to confirm that tendency in the X chromosome, or the matching thresholds are different at GedMatch for the X.

3.  Altai Neanderthal at GedMatch

http://www.y-str.org/2013/08/neanderthal-dna.html

F999902

One match to what is obviously another Neanderthal entry.

4.  Russian Caucasus Neanderthal at GedMatch

Another contribution from the Neanderthal Genome Project.

http://www.y-str.org/2014/09/mezmaiskaya-neanderthal-dna.html

F999909

No matches.

5.  Denisova at GedMatch

http://www.y-str.org/2013/08/denisova-dna.html

F999903

Two matches, one to yet another ancient entry and one to a contemporary individual on the X chromosome.

But now, for the fun part.

My Comparison

Before I start this section, I want to take a moment to remind everyone just how old these ancient segments are.

  • Anzick – about 12,500 years old
  • Paleo-Eskimo – about 4,000 years old
  • Altai Neanderthal – about 50,000 years old
  • Russian Caucasus Neanderthal – about 29,000 years old
  • Denisova – about 30,000 years old

In essence, the only way for these segments to survive intact to today would have been for them to enter the population of certain groups, as a whole, to be present in all of the members of that group, so that the segment would no longer be divided and would be passed intact for many generations, until that group interbred with another group who did not carry that segment.  This is exactly what we see in endogamous populations today, such as the Ashkenazi Jewish population, which is believed, based on their common shared DNA, to have descended from about 350 ancestors about 700 years ago.  Their descendants today number in the millions.

So, let’s see what we find.

I compared my own kit at GedMatch utilizing the one to one comparison feature, beginning with 500 SNPs and 1cM, dropping the SNP values to 400, then 300, then 200, until I obtained a match of some sort, if I obtained a match at all.

Typically in genetic genealogy, we’re looking for genealogy matches, so the default matching thresholds are set relatively high.  In this case, I’m looking for deep ancestral connections, if they exist, so I was intentionally forcing the thresholds low.  I’m particularly interested in the Anzick comparison, in light of my Native American and First Nations heritage.

The definition of IBS, identical by state, vs IBD, identical by descent segments varies by who is talking and in what context, but in essence, IBD means that there is a genealogy connection in the past several generations.

IBS means that the genealogy connection cannot be found and the IBS match can be a function of coming from a common population at some time in the past, or it can be a match by convergence, meaning that your DNA just happened to mutate to the same state as someone else’s.  If this is the case, then you wouldn’t expect to see multiple segments matching the same person and you would expect the matching segments to be quite short.  The chances of hundreds of SNPs just happening to align becomes increasingly unlikely the longer the matching SNP run.

So, having said that, here are my match results.

Anzick

I had 2 matches at 400 SNPs, several at 300 and an entire list at 200, shown below.

Chr Start Location End Location Centimorgans (cM) SNPs
1 6769350 7734985 1.7 232
1 26552555 29390880 1.9 264
1 31145273 33730360 2.7 300
1 55655110 57069976 1.9 204
1 71908934 76517614 2.8 265
1 164064635 165878596 2.8 264
1 167817718 171330902 3.3 466
1 186083870 192208998 4.2 250
2 98606363 100815734 1.4 256
2 171132725 173388331 2.0 229
2 218855489 220373983 2.5 261
3 128892631 131141396 1.7 263
3 141794591 143848459 2.5 207
4 1767539 3571907 2.7 235
4 70345811 73405268 2.5 223
5 2340730 2982499 2.3 200
5 55899022 57881001 2.3 231
5 132734528 134538202 1.9 275
5 137986213 140659207 1.7 241
6 34390761 36370969 1.8 293
8 17594903 18464321 1.9 200
8 23758017 25732105 1.7 240
8 109589884 115297391 1.9 203
9 122177526 124032492 1.6 229
10 101195132 102661955 1.2 264
10 103040561 105596277 1.3 304
10 106135611 108371247 1.5 226
12 38689229 41184500 1.6 247
13 58543514 60988948 1.6 220
13 94528801 95252127 1.0 277
14 60929984 62997711 1.8 255
14 63724184 65357663 1.7 201
14 72345879 74206753 1.7 263
15 36850933 38329491 2.7 238
16 1631282 2985328 2.5 273
16 11917282 13220406 3.7 276
16 15619825 17324720 3.1 305
16 29085336 31390250 1.3 263
16 51215026 52902771 3.4 224
17 52582669 56643678 4.7 438
19 11527683 13235913 1.7 203
19 15613137 16316773 1.2 204
19 46195917 49338412 3.3 397
20 17126434 18288231 2.1 225
21 35367409 36969215 4.1 254
21 42399499 42951171 1.6 233
22 33988022 35626259 5.0 289
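A table like the one above is easy to work with once it is parsed.  The sketch below uses only the chromosome 1 rows from the table, and the helper names are made up for illustration; it is a way to count segments at a given SNP threshold and total their cM, not a GedMatch feature.

```python
from typing import List, NamedTuple


class Segment(NamedTuple):
    chrom: str
    start: int
    end: int
    cm: float
    snps: int


# Chromosome 1 rows copied from the table above: Chr, Start, End, cM, SNPs.
CHR1_ROWS = """\
1 6769350 7734985 1.7 232
1 26552555 29390880 1.9 264
1 31145273 33730360 2.7 300
1 55655110 57069976 1.9 204
1 71908934 76517614 2.8 265
1 164064635 165878596 2.8 264
1 167817718 171330902 3.3 466
1 186083870 192208998 4.2 250
"""


def parse_segments(text: str) -> List[Segment]:
    """Turn whitespace-separated rows into Segment records."""
    segments = []
    for line in text.splitlines():
        if not line.strip():
            continue
        chrom, start, end, cm, snps = line.split()
        segments.append(Segment(chrom, int(start), int(end), float(cm), int(snps)))
    return segments


def at_snp_threshold(segments: List[Segment], min_snps: int) -> List[Segment]:
    """Keep only the segments with at least min_snps SNPs."""
    return [s for s in segments if s.snps >= min_snps]


def total_cm(segments: List[Segment]) -> float:
    """Sum the cM column, rounded to one decimal like the table."""
    return round(sum(s.cm for s in segments), 1)


if __name__ == "__main__":
    segs = parse_segments(CHR1_ROWS)
    for threshold in (400, 300, 200):
        kept = at_snp_threshold(segs, threshold)
        print(threshold, len(kept), total_cm(kept))
```

Pasting in the full table instead of the chromosome 1 rows gives the same kind of per-threshold counts for the whole comparison.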

In my case, I’m particularly fortunate, because my mother tested her DNA as well.  By process of elimination, I can figure out which of my matches are through her, and then by inference, which are through my father or are truly IBS by convergence.
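The process of elimination can be sketched as an overlap check.  In this example my two segments are taken from the table above, but the mother’s segment is hypothetical, made up purely for illustration.  Overlap in position is treated as “through mother”, which is a simplification of what the spreadsheet comparison does.

```python
from typing import List, Tuple

Segment = Tuple[str, int, int]  # (chromosome, start, end)


def overlaps(a: Segment, b: Segment) -> bool:
    """True when two segments share a chromosome and at least one position."""
    return a[0] == b[0] and a[1] <= b[2] and b[1] <= a[2]


def split_by_mother(
    child: List[Segment], mother: List[Segment]
) -> Tuple[List[Segment], List[Segment]]:
    """Split the child's segments into maternal and 'father's side or IBS'."""
    maternal, other = [], []
    for seg in child:
        if any(overlaps(seg, m) for m in mother):
            maternal.append(seg)
        else:
            other.append(seg)
    return maternal, other


if __name__ == "__main__":
    mine = [("17", 52582669, 56643678), ("22", 33988022, 35626259)]
    mothers = [("17", 53000000, 56000000)]  # hypothetical, for illustration
    maternal, other = split_by_mother(mine, mothers)
    print("through mother:", maternal)
    print("father's side or IBS by convergence:", other)
```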

I carry Native heritage on both sides, but my mother’s is proven to specific Native ancestors where my father’s is only proven to certain lines and not yet confirmed through genealogy records to specific ancestors.

Because I had so many matches, quite to my surprise, I also compared my mother’s DNA to the Anzick sample, combined the two results and put them in a common spreadsheet, shown below.  White are my matches.  Pink are Mom’s matches, and the green markers are on the segments where we both match the Anzick sample, confirming that my match is indeed through mother.

ancient compare

We’ll work with this information more in a few minutes.

Paleo

At the 200 SNP level, 2 segments.

Chr Start Location End Location Centimorgans (cM) SNPs
1 26535949 27884441 1.1 258
2 127654021 128768822 1.2 228

My mother matches on 9 segments, but neither of the two above, so they are either from my father’s side or truly IBS by convergence.

Altai Neanderthal

ancient compare2

Russian Neanderthal

Neither my mother nor I have any matches at 100 SNPs and 1cM.

Denisovan

I have one match.

Chr Start Location End Location Centimorgans (cM) SNPs
4 8782230 9610959 1.2 100

My mother matches 2 segments at 100 SNPs but neither match is the same as my segment.

Matching to Ancestral Lines

I’ve been mapping my DNA to specific ancestors utilizing the genealogy information of matches and triangulation for some time.  This consists of finding common ancestors with your matches.  Finding one person who matches you and maps to a common ancestor on a particular segment consists of a hint.  Finding two that share the same ancestral line and match you and each other on the same segment is confirmation – hence, the three of you triangulate.  More than three is extra gravy:)
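As a rough sketch of the triangulation check: the three pairwise matches (me to A, me to B, and A to B) must all cover a common stretch of the same chromosome.  The function names and the positions in the usage note are hypothetical and chosen only for illustration; confirming the shared ancestral line still has to come from the genealogy.

```python
from typing import Optional, Tuple

Segment = Tuple[str, int, int]  # (chromosome, start, end)


def intersect(a: Segment, b: Segment) -> Optional[Segment]:
    """Return the shared stretch of two segments, or None if there is none."""
    if a[0] != b[0]:
        return None
    start, end = max(a[1], b[1]), min(a[2], b[2])
    return (a[0], start, end) if start <= end else None


def triangulated_region(
    me_vs_a: Segment, me_vs_b: Segment, a_vs_b: Segment
) -> Optional[Segment]:
    """Region where all three pairwise matches overlap, if any."""
    shared = intersect(me_vs_a, me_vs_b)
    if shared is None:
        return None
    return intersect(shared, a_vs_b)
```

For example, `triangulated_region(("5", 100, 500), ("5", 300, 700), ("5", 250, 450))` returns `("5", 300, 450)`, while a third match on a different chromosome returns `None`.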

I have also recorded other relevant information in my matches file, like the GedMatch Native chromosomal comparisons when I wrote “The Autosomal Me” series about hunting for my Native chromosomal segments.

So, after looking at the information above, it occurred to me that I should add this ancestral match information to my matches spreadsheet, just for fun, if nothing else.

I added these matches, noted the source as GedMatch and then sorted the results, anxious to see what we might find.  Would at least one of these segments fall into the proven Native segments or the matches to people who also descend from those lines?

What I found was both astonishing and confusing….and true to form to genealogy, introduced new questions.

I have extracted relevant matching groups from my spreadsheet and will discuss them and why they are relevant.  You can click on any of the images to see a larger image.

ancient compare3

This first set of matches is intensely interesting, and equally as confusing.

First, these matches are to both me and mother, so they are confirmed through my mother’s lines.  In case anyone notices, yes, I did switch my mother’s line color to white and mine to pink to be consistent with my master match spreadsheet coloration.

Second, both mother and I match the Anzick line on the matches I’ve utilized as examples.

Third, both 23andMe and Dr. Doug McDonald confirmed the segments in red as Native which includes the entire Anzick segment.

Fourth, utilizing the Gedmatch admixture tools, mother and I had this range in common.  I described this technique in “The Autosomal Me” series.

Fifth, these segments show up for two distinct genealogy lines that do not intersect until my grandparents, the Johann Michael Miller line AND the Acadian Lore line.

Sixth, the Acadian Lore line is the line with proven Native ancestors.

Seventh, the Miller line has no Native ancestors and only one opportunity for a Native ancestor, which is the unknown wife of Philip Jacob Miller, who was married about 1750 to a woman rumored to be Magdalena Rochette, but research shows absolutely no source for that information, nor any Rochette family anyplace in any proximity in the same or surrounding counties to the Miller family.  The Millers were Brethren.  Furthermore, there is no oral history of a Native ancestor in this line, but there have been other hints along the way, such as the matching segments of some of the “cousins” who show as Native as well.

Eighth, this makes my head hurt, because this looks, for all the world, like Philip Jacob Miller who was living in Bedford County, PA when he married about 1750 may have married someone related to the Acadian lines who had intermarried with the Micmac.  While this is certainly possible, it’s not a possibility I would ever have suspected.

Let’s see what else the matches show.

ancient compare4

In this matching segment, Mom and I both match Emma, who descends from Marie, a Micmac woman.  Mom’s Anzick match is part of this same segment.

ancient compare5

In this matching segment, Mom and I both match cousin Denny, who descends from the Lore line, which is Acadian and confirmed to have Micmac ancestry.  Mom’s Anzick segments all fit in this range as well.

ancient compare6

In this matching segment, cousin Herbie’s match to Mom and me falls inside the Anzick segments of both Mom and me.

ancient compare7

More matching to the proven Miller line.

ancient compare8

This last grouping with Mom is as confusing as the first.  Mom and I both match cousin Denny on the Lore side, proven Acadian.

Mom and I both match the Miller side too, and the Anzick for both of us falls dead center in these matches.

There are more, several more matches, that also indicate these same families, but I’m not including them because they don’t add anything not shown in these examples.  Interestingly enough, there are no pointers to other families, so this isn’t something random.  Furthermore, on my father’s side, as frustrating as it is, there are no Anzick matches that correlate with proven family lines.  ARGGHHHHHH……

On matches that I don’t share with mother, there is one of particular interest.

ancient compare9

You’ll notice that the Anzick and the Paleo-Greenland samples match each other, as well as me.  This is my match, and by inference, not through mother.  Unfortunately, the other people in this match group don’t know their ancestors or we can’t identify a common ancestor.

Given the genetic genealogy gold standard of checking to see if your autosomal matches match each other, I went back to GedMatch to see if the Paleo-Greenland kit matched the Clovis Anzick kit on this segment, and indeed, they do, plus many more segments as well.  So, at some time, in some place, the ancestors of these two people separated by thousands of miles were related to each other.  Their common ancestor would have either been in Asia or in the Northern part of Canada if the Paleo people from Greenland entered from that direction.

Regardless, it’s interesting, very interesting.

What Have I Learned?

Always do experiments.  You never know what you’ll find.

I’m much more closely related to the Anzick individual than I am to the others. This isn’t surprising given my Native heritage along with the endogamous culture of the Acadians.

My relationship level to these ancient people is as follows:

Specimen | Lived Years Ago | Relatedness | Comments
Montana Anzick | 12,500 | 107.4cM at 200 SNP level | Confirmed to Lore (Acadian) and Miller, but not other lines
Greenland Paleo | 4,000 | 2.3cM at 200 SNP level | No family line matches, does match to Anzick in one location
Altai Neanderthal | 50,000 | 2.1cM at 200 SNP level | No family line matches
Russian Neanderthal | 29,000 | 0 |
Denisovan | 30,000 | 1.2cM at 100 SNP level | No family line matches

The Lores and the Millers

Looking further at the Lore and Miller lines, there are only two options for how these matching segments could have occurred.  There are too many for them all to be convergence, so we’ll have to assume that they are indeed because we shared a common population at some time and place.

The small size of the segments testifies that this is not a relatively recent common ancestor, but how “unrecent” is open to debate.  Given that Neanderthal and Denisovan ancient segments are found in all Europeans today, it’s certainly possible for these segments to be passed intact, even after thousands of years.

The confirmations to the Lore line come through proven Lore cousins and also through other proven Acadian non-specific matches.  This means that the Acadian population is highly endogamous and when I find an Acadian match, it often means that I’m related through many ancestors many times.  This, of course, increases the opportunity for the DNA to be passed forward, and decreases the opportunity for it to be lost in transmission, but it also complicates the genealogy greatly and makes determining which ancestor the DNA segment came from almost impossible.

However, I think we are safe to say the segments are from the Acadian population, although my assumption would be that they are from the Native Ancestors and not the French, given the high number of Anzick matches, Anzick being proven to be Native.  Having said that, that assumption may not be entirely correct.

The Miller line is relatively well documented and entirely from Germany/Switzerland, immigrating in the early 1700s, with the exception of the one unknown wife in the first generation married in the US.  Further examination would have to be done to discover if any of the matches came through Johann Michael Miller’s sons other than Philip Jacob Miller, my ancestor.  There are only three confirmed children, all sons.  If this segment shows up in Johann Michael Miller’s line not associated with son Philip Jacob Miller, then we would confirm that indeed the segment came from Europe and not a previously unknown Native or mixed wife of Philip Jacob.

Bottom Line

So, what’s the bottom line here?  I know far more than I did.  The information confirms, yet again, the Acadian Native lines, but it introduces difficult questions about the Miller line.  I have even more tantalizing questions for which I have no answers today, but I tell you what, I wouldn’t trade this journey along the genetic pathway with all of its unexpected bumps, rocks, slippery slopes and crevices for anything!!  That’s why it’s called an adventure!