Tenth Annual Family Tree DNA Conference Wrapup

baber summary

This slide, by Robert Baber, pretty well sums up our group obsession and what we focus on every year at the Family Tree DNA administrator’s conference in Houston, Texas.

Getting to Houston, this year, was a whole lot easier than getting out of Houston. They had storms yesterday and many of us spent the entire day becoming intimately familiar with the airport.  Jennifer Zinck, of Ancestor Central, is still there today and doesn’t have a flight until late.

And this is how my day ended, after I finally got out of Houston and into my home airport. This isn’t at the airport, by the way.  Everything was fine there, but I made the apparent error of stopping at a Starbucks on the way home.  This is the parking lot outside an hour or so later.  What can I say?  At least I had my coffee, and AAA rocks, as did the tow truck driver and my daughter for getting out of bed to come and rescue me!!!  Hmmm, I think maybe things have gone full circle.  I remember when I used to go and rescue her:)

jeep tow

So far, today hasn’t improved any, so let’s talk about something much more pleasant…the conference itself.

Resources

One of the reasons I mentioned Jennifer Zinck, aside from the fact that she’s still stuck in the airport, is because she did a great job actually covering the conference as it happened. I had some time yesterday to visit with her, since our gates weren’t terribly far apart, so I asked her how she got that done.  I took notes too, and photos, but she turned out a prodigious amount of work in a very short time.  While I took a lightweight MacBook Air, she took her regular PC that she is used to typing on, and she literally transcribed as the sessions were occurring.  She just added her photos later, and since she was working on a platform that she was familiar with, she could crop and make the other adjustments you never see but we perform behind the scenes before publishing a photo.

On the other hand, I struggled with a keyboard that works differently and is a different size than I’m used to as well as not being familiar with the photo tools to reduce the size of pictures, so I just took rough notes and wrote the balance later.  Having familiar tools makes such a difference.  I think I’ll carry my laptop from now on, even though it is much heavier.  Kudos to Jennifer!

I was initially going to summarize each session, but since Jen did such a good job, I’m posting her links. No need to recreate a wheel that doesn’t need to be recreated.

http://www.ancestorcentral.com/decennial-conference-on-genetic-genealogy/

ISOGG, the International Society of Genetic Genealogy, is not affiliated with Family Tree DNA or any testing company, but Family Tree DNA is generous enough to allow an ISOGG meeting on Sunday before the first conference session.

http://www.ancestorcentral.com/decennial-conference-on-genetic-genealogy-isogg-meeting/

http://www.ancestorcentral.com/decennial-conference-on-genetic-genealogy-sunday/

You can find my conference postings here:

http://dna-explained.com/2014/10/11/tenth-annual-family-tree-dna-conference-opening-reception/

http://dna-explained.com/2014/10/12/tenth-annual-family-tree-dna-conference-day-2/

http://dna-explained.com/2014/10/13/tenth-annual-family-tree-dna-conference-day-3/

Several people were also posting on a Twitter feed.

https://twitter.com/search?q=%23FTDNA2014&src=tyah

Those of you who are members of the ISOGG Yahoo group for project administrators can view photos posted by Katherine Borges in that group, and there are also some postings on the Facebook ISOGG group.

Now that you have the links for the summaries, what I’d like to do is to discuss some of the aspects I found the most interesting.

The Mix

When I attended my first conference 10 years ago, I somehow thought that for the most part, the same group of people would be at the conferences every year. Some were, and in fact, a handful of the 160+ people attending this conference have attended all 10 conferences.  I know of two others for certain, but there were maybe another 3 or so who stood up when Bennett asked for everyone who had been present at all 10 conferences to stand.

Doug Mumma, the very first project administrator was with us this weekend, and still going strong. Now, if Doug and I could just figure out how we’re related…

Some of the original conference group has passed on to the other side where I’m firmly convinced that one of your rewards is that you get to see all of those dead ends of your tree. If we’re lucky, we get to meet them as well and ask all of those questions we have on this side.  We remember our friends fondly, and their departure sadly, but they enriched us while they were here and their memories make us smile.  I’m thinking specifically of Kenny Hedgepath and Leon Little as I write this, but there have been others as well.

The definition of a community is that people come and go, births, deaths and moves.

This year, about half of the attendees had never attended a conference before. I was very pleased to see this turn of events – because in order to survive, we do need new people who are as crazy as we are…er….I mean as dedicated as we are.

isogg reception

ISOGG traditionally hosts a potluck reception on Saturday evening. Lots of putting names with faces going on here.

Collaboration

I asked people about their favorite part of the conference or their favorite session. I was surprised at the number of people who said lunches and dinners.  Trust me, the food wasn’t that wonderful, so I asked them to elaborate.  In essence, the most valuable aspect of the conference was working with and talking to other administrators.

bar talk

It’s not like we don’t talk online, but there is somehow a difference between online communications and having a group discussion, or a one-on-one discussion. Laptops were out and in use everyplace, along with iPads and other tools.  It was so much fun to walk by tables and hear snippets of conversations like “the mutation at location 309.1….” and “null marker at 425” and “I ordered a kit for my great uncle…..”

I agree, as well. I had pre-arranged two dinners before arriving in order to talk with people with whom I share specific interests.  At lunches, I either tried to sit with someone I specifically needed to talk to, or I tried to meet someone new.

I also asked people about their specific goals for the next year. Some people had a particular goal in mind, such as a specific brick wall that needs focus.  Some, given that we are administrators, had wider-ranging project based goals, like Big Y testing certain family groups, and a surprising number had the goal of better utilizing the autosomal results.

Perhaps that’s why there were two autosomal sessions, an introduction by Jim Bartlett and then Tim Janzen’s more advanced session.

Autosomal DNA Results

jim bartlett

Note the cool double helix light fixture behind the speakers.

tim janzen

Tim specifically mentioned two misconceptions which I run across constantly.

Misconception 1 – A common surname means that’s how you match.  Just because you find a common surname doesn’t mean that’s your DNA match.  This belief is particularly prevalent in the group of people who test at Ancestry.com.

Misconception 2 – Your common ancestor has to be within the past 6 generations.  Not true, many matches can be 6-10th cousins because there are so many descendants of those early ancestors, even as many as 15 generations back.

Tim also mentioned that endogamous relationships are a tough problem with no easy answer. Examples include Polynesians, Ashkenazi Jews, Low German Mennonites, Acadians, Amish, and island populations.  Do I ever agree with him!  I have Brethren, Mennonite and Acadian in the same parent’s line.

Tim has been working with the Mennonite DNA project now for many years.

Tim included a great resource slide.

tim slide1

Tim has graciously made his entire presentation available for download.

tim slide2

There are probably a dozen or so of us that are actively mapping our ancestors, and a huge backlog of people who would like to. As Tim pointed out with one of his slides, this is not an easy task nor is it for the people who simply want to receive “an answer.”

tim slide3

I will also add that we “mappers” are working with and actively encouraging Family Tree DNA to develop tools so that the mapping is less spreadsheet manual work and more automated, because it certainly can be.

Upload GEDCOM Files

If you haven’t already, upload your GEDCOM to Family Tree DNA.  This is becoming an essential part of autosomal matching.  Furthermore, Family Tree DNA will utilize this file to construct your surname list and that will help immensely in determining common surnames and your common ancestor with your Family Finder matches.  If you have sponsored tests for cousins, then upload a GEDCOM file for them or at least construct a basic tree on their Family Tree DNA page.

Ethics

Family Tree DNA always tries to provide a speaker about ethics, and the only speakers I’ve ever felt understood anything about what we want to do are Judy Russell and Blaine Bettinger.  I was glad to see Blaine presenting this year.

blaine bettinger

The essence of Blaine’s speech is that ethics isn’t about law. Law is cut and dried.  Ethics isn’t, and there are no ethics police.

Sometimes our decisions are colored necessarily by right and wrong.  Sometimes those decisions are more about the difference between a better and a worse way.

As a community, we want to reduce negative press coverage and increase positive coverage. We want to be proactive, not reactive.

Blaine stresses that while informed consent is crucial, DNA doesn’t reveal secrets that aren’t also revealed by other genealogical forms of research. DNA often reveals more recent secrets, such as adoptions and NPEs, so it’s possibly more sensitive.

Two things need to govern our behavior. First, we need to do only things that we would be comfortable seeing above the fold in the New York Times.  Second, understand that we can’t make promises about topics like anonymity or about the absence of medical information, because we don’t know what we don’t know.

The SNP Tsunami

One of my concerns has been and remains the huge number of new SNPs that have been discovered over the past year or so with the Big Y by Family Tree DNA and corresponding tests from other vendors.

When I say concern, I’m thrilled about this new technology and the advances it is allowing us to make as a community to discover and define the evolution of haplogroups. My concern is that the amount of data is overwhelming.  However, we are working through that, thanks to the hours and hours of volunteer work by haplogroup administrators and others.

Alice Fairhurst, who volunteers to maintain the ISOGG haplotree, mentioned that she has added over 10,000 SNPs to the Y tree this year alone, bringing the total to over 14,000. Those SNPs are fully vetted and placed.  There are many more in process and yet more still being discovered.  On the first page of the Y SNP tree, the list of SNP sources and other critical information, such as the criteria for a SNP to be listed, is provided.

isogg tree3

isogg snps

isogg snps 2014

So, if you’re waiting for that next haplotree poster, give it up because there isn’t a printing press that big, unless you want wallpaper.

isogg new development 2014

These slides are from Alice’s presentation. The ISOGG tree provides an invaluable resource for not only the genetic genealogy community, but also researchers world-wide.

As one example of how the SNP tsunami has affected the Y tree, Alice provided the following summary of R-U106, one of the two major branches of haplogroup R.

From the ISOGG 2006 Y tree, this was the entire haplogroup R Y tree. You can see U106 near the bottom with 3 sub-branches.  While this probably makes you chuckle today, remember that 2006 was only 8 years ago and that this tree didn’t change much for several years.

2006 entire tree

2007 was the same.

2008 u106 tree

2008 shows 5 subclades and one of the subclades had 2 subclades.

2009 u106 tree

2009 showed a total of 12 sub-branches and 2010 added one more.

2011 however, showed a large change. U106 in 2011 had 44 subgroups total and became too large to show on one screen shot.  2012 shows 99 subclades, if I counted accurately.  The 2014 U106 tree is shown below.

before big y

after big y

u106 now

u106 now2

There’s another slide too, but I didn’t manage to get the picture.  You get the idea though…

As you can imagine, for Family Tree DNA, trying to keep up with all of the haplogroups, not just one subgroup like U106 is a gargantuan task that is constantly changing, like hourly. Their Y tree is currently the National Geographic tree, and while they would like to update it, I’m sure, the definition of “current tree” is in a constant state of flux.  Literally, Mike Walsh, one of the admins in the R-L21 group uploads a new tree spreadsheet several times every day.

In an attempt to deal with this, and to encourage people who don’t want to do a Big Y discovery type test, but do want to ferret out their location on their assigned portion of the tree, Family Tree DNA is reintroducing the Backbone tests.

They are starting with M222, also known as the Niall of the 9 Hostages haplogroup which is their beta for the new product and new process. You can see the provisional tree and results in the two slides they provided, below.  I apologize for the quality, but it was the best I could do.

M222

m222 pie

Haplogroup administrators are going to be heavily involved in this process. Family Tree DNA is putting SNP panels together that will help further define the tree and where various SNPs that have been recently discovered, and continue to be discovered, will fall on the tree.

As Big Y tests arrive, haplogroup project administrators typically assemble a spreadsheet of the SNPs and provisionally where they fall on the tree, based on the Big Y results.

What Bennett asked is for the admins to work with Family Tree DNA to assemble a testing panel based on those results. The goal is for the cost to be between $1.50 and $2 (US) for each SNP in the panel, which will reduce the one-off SNP testing and provide a much more complete and productive result at a far reduced price as compared to the current $29 or $39 per individual SNP.
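To put those prices in perspective, here is a small Python sketch comparing a panel to ordering the same SNPs one at a time. Only the per-SNP prices come from Bennett’s remarks; the panel size of 40 SNPs is purely a hypothetical number I picked for illustration, not something Family Tree DNA announced.

```python
# Per-SNP prices quoted at the conference (US dollars).
PANEL_PRICE_RANGE = (1.50, 2.00)     # target cost per SNP inside a panel
SINGLE_PRICE_RANGE = (29.00, 39.00)  # current cost per individually ordered SNP


def panel_cost(num_snps, price_per_snp):
    """Total cost of testing num_snps SNPs at a flat per-SNP price."""
    return num_snps * price_per_snp


def singles_matching_panel(num_snps, panel_price, single_price):
    """How many one-off SNP tests cost the same as the whole panel."""
    return panel_cost(num_snps, panel_price) / single_price


if __name__ == "__main__":
    num_snps = 40  # hypothetical panel size, not a Family Tree DNA figure
    low = panel_cost(num_snps, PANEL_PRICE_RANGE[0])
    high = panel_cost(num_snps, PANEL_PRICE_RANGE[1])
    print(f"{num_snps}-SNP panel: ${low:.2f} to ${high:.2f}")
    print(f"Same SNPs one at a time: "
          f"${panel_cost(num_snps, SINGLE_PRICE_RANGE[0]):.2f} to "
          f"${panel_cost(num_snps, SINGLE_PRICE_RANGE[1]):.2f}")
```

Under those assumptions, even at the high end of the panel price, the whole panel costs less than three individually ordered SNPs at the lower single-SNP price.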

If you are a haplogroup administrator, get in touch with Family Tree DNA to discuss your desired backbone panels. New panels, when it’s your turn, will take about 2 weeks to develop.

Keep in mind that the following SNPs, according to Bennett, are not optimal for panels:

  • Palindromic regions
  • Often mutating regions designated as .1, .2, etc.
  • SNPs in STRs

Nir Leibovich, the Chief Business Officer, also addressed the future and the Big Y to some extent in his presentation.

nir leibovich

ftdna future 2014

Utilizing the Big Y for Genealogy

In my case, during the last sale, I ordered several Big Y tests for my Estes family line because I have several genealogically documented lines from the original Estes family in Kent, England through our common ancestor, Robert Estes born in 1555 and his wife Anne Woodward. The participants also agreed to extend their markers to 111 markers as well.  When the results are back, we’ll be able to compare them on a full STR marker set, and also their SNPs.  Hopefully, they will match on their known SNPs and there will be some new novel variants that will be able to suffice as line marker mutations.

We need more Big Y tests of these types of genealogically confirmed trees that have different sons’ lines from a distant common ancestor to test descendant lines. This will help immensely to determine the actual, not imputed, SNP mutation rate and allow us to extrapolate the ages of haplogroups more accurately.  Of course, it also goes without saying that it helps to flesh out the trees.

I personally expect the next couple of years will be major years of discovery. Yes, the SNP tsunami has hit land, but it’s far from over.

Research and Development

David Mittleman, Chief Scientific Officer, mentioned that Family Tree DNA now has their own R&D division where they are focused on how to best analyze data. They have been collaborating with other scientists.  A haplogroup G1 paper will be published shortly which states that SNP mutation rates equate to Sanger data.

FTDNA wants to get Big Y data into the public domain. They have set up consent for this to be done by uploading into NCBI.  Initially they sent a survey to a few people that sampled the interest level.  Those who were interested received a release document.  If you are interested in allowing FTDNA to utilize your DNA for research, be it mitochondrial, Y or autosomal, please send them an e-mail stating such.

Don’t Forget About Y Genealogy Research

It’s very easy for us to get excited about the research and discovery aspect of DNA – and the new SNPs and extending haplotrees back in time as far as possible, but sometimes I get concerned that we are forgetting about the reason we began doing genetic genealogy in the first place.

Robert Baber’s presentation discussed the process of how to reconstruct a tree utilizing both genealogy and DNA results. It’s important to remember that the reason most of our participants test is to find their ancestors, not, primarily, to participate in the scientific process.

Robert baber

edward baber

Robert has succeeded in reconstructing 110 or 111 markers of the oldest known Baber ancestor, shown above. I wrote about how to do this in my article titled, Triangulation for Y DNA.

Not only does this allow us to compare everyone with the ancestor’s DNA, it also provides us with a tool to fit individuals who don’t know their specific genealogical line into the tree relatively accurately. When I say relatively, the accuracy is based on line marker mutations that have, or haven’t, happened within that particular family.

Robert illustrated how to do this as well, and his methodology is available at the link on his slide, below.

baber method
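For readers who like to see the idea in code, here is a much simplified Python sketch of reconstructing an ancestral marker set. It assumes the ancestral value at each marker is simply the most common value among independent descendant lines, which is my own simplification and not necessarily the method on Robert’s slide. The marker names are real Y-STR markers, but the repeat values and the three lines are made up for illustration.

```python
from collections import Counter


def reconstruct_ancestor(lines):
    """Return the most common value per marker across descendant lines.

    `lines` maps a line name to {marker: repeat count}. Markers with no
    single most common value are reported as None (unresolved).
    """
    markers = sorted({m for haplotype in lines.values() for m in haplotype})
    ancestor = {}
    for marker in markers:
        counts = Counter(h[marker] for h in lines.values() if marker in h)
        ranked = counts.most_common(2)
        tied = len(ranked) == 2 and ranked[0][1] == ranked[1][1]
        ancestor[marker] = None if tied else ranked[0][0]
    return ancestor


def differences(haplotype, ancestor):
    """Markers where a tester differs from the reconstructed ancestor."""
    return {m: (v, ancestor[m]) for m, v in haplotype.items()
            if ancestor.get(m) is not None and v != ancestor[m]}


# Made-up values for three hypothetical lines descending from different sons.
lines = {
    "son_A": {"DYS393": 13, "DYS390": 24, "DYS19": 14},
    "son_B": {"DYS393": 13, "DYS390": 25, "DYS19": 14},
    "son_C": {"DYS393": 13, "DYS390": 24, "DYS19": 15},
}
ancestor = reconstruct_ancestor(lines)
print(ancestor)
print(differences(lines["son_B"], ancestor))
```

In this toy data, the value of 25 at DYS390 appears only in the son_B line, so it would be a candidate line marker mutation for that branch. A tie between lines is left unresolved rather than guessed, which is where the paper genealogy and more testers have to do the work.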

I had to laugh. I’ve often wondered what our ancestors would think of us today.  Robert said that 11 generations after Edward Baber died, he flew over the church where Edward was buried and wondered what Edward would have thought about what we know and do today – cars, airplanes, DNA, radio, TV etc.  If someone looked in a crystal ball and told Edward what the future held 11 generations later, he would have thought that they were stark raving mad.

Eleven generations from my birth is roughly the year 2280. I’m betting we won’t be trying to figure out who our ancestors were through this type of DNA analysis then.  This is only a tiny stepping stone to an unknown world, as different to us as our world is to Edward Baber and all of our ancestors who lived in a time where we know their names but their lives and culture are entirely foreign to ours.

Publications

When the Journal of Genetic Genealogy was active, I, along with other citizen scientists, published regularly.  The benefit of the journal was that it was peer reviewed and that assured some level of accuracy and because of that, credibility, and it was viewed by the scientific community as such.  My co-authored works published in JOGG as well as others have been cited by experts in the academic community.  In other words, it was a very valuable journal.  Sadly, it has fallen by the wayside and nothing has been published since 2011.  A new editor was recruited, but given their academic load, they have not stepped up to the plate.  For the record, I am still hopeful for a resurrection, but in the meantime, another opportunity has become available for genetic genealogists.

Brad Larkin has founded the Surname DNA Journal, which, like JOGG, is free to both authors and subscribers. In case you weren’t aware, most academic journals aren’t.  While this isn’t a large burden for a university, fees ranging from just over $1000 to $5000 are beyond the budget of genetic genealogists.  Just think of how many DNA tests one could purchase with that money.

brad larkin

surname dna journal

Brad has issued a call for papers. These papers will be peer reviewed, similarly to how they were reviewed for JOGG.

call for papers

Take a look at the articles published in this past year, since the founding of Surname DNA Journal.

The citizen science community needs an avenue to publish and share. Peer reviewed journals provide us with another level of credibility for our work. Sharing is clearly the lynchpin of genetic genealogy, as it is with traditional genealogy. Give some thought about what you might be able to contribute.

Brad Larkin solicited nominations prior to the conference and awarded a Genetic Genealogist of the Year award. This year’s award was dually presented to Ian Kennedy in Australia, who, unfortunately, was not present, and to CeCe Moore, who just happened to follow Brad’s presentation with her own.

Don’t Forget about Mitochondrial DNA Either

I believe that mitochondrial DNA is the most underutilized DNA tool that we have, often because how to use mitochondrial DNA, and what it can tell you, is poorly understood. I wrote about this in an article titled, Mitochondrial, The Maligned DNA.

Given that I work with mitochondrial DNA daily when I’m preparing clients’ Personalized DNA Reports (orderable from your personal page at Family Tree DNA or directly from my website), I know just how useful mitochondrial can be and see those examples regularly. Unfortunately, because these are client reports, I can’t write about them publicly.

CeCe Moore, however, isn’t constrained by this problem, because one of the ways she contributes to genetic genealogy is by working with the television community, in particular Genealogy Roadshow and the PBS series, Finding Your Roots. Now, I must admit, I was very surprised to see CeCe scheduled to speak about mitochondrial DNA, because the area of expertise where she is best known is autosomal DNA, especially in conjunction with adoptee research.

cece moore

cece mtdna

During the research for the production of these shows, CeCe has utilized mitochondrial DNA with multiple celebrities to provide information such as the ethnic identification of the ancestor who provided the mitochondrial DNA as Native American.

Autosomal DNA testing has a broad but shallow reach, across all of your lines, but just back a few generations.  Both Y and mitochondrial DNA have a very deep reach, but only on one specific line, which makes them excellent for identifying a common ancestor on that line, as well as the ethnicity of that individual.

I have seen other cases, where researchers connected the dots between people where no paper trail existed, but a relationship between women was suspected.

CeCe mentioned that currently there are only 44,000 full sequence results in the Family Tree DNA data base and 185K total HVR1, HVR2 and full sequence tests. Y has half a million.  We need to increase the data base, which, of course increases matches and makes everyone happier.  If you haven’t tested your mitochondrial DNA to the full sequence level, this would be a great time!

There are several lessons on how to utilize mitochondrial DNA at this ISOGG link.

I’m very hopeful that CeCe’s presentation will be made available as I think her examples are quite powerful and will serve to inspire people.  Actually, since CeCe is in the “movie business,” perhaps a short video clip could be made available on the FTDNA website for anyone who hasn’t tested their mitochondrial DNA so they can see an example of why they should!

myOrigins

I would be fibbing to you if I told you I am happy with myOrigins. I don’t feel that it is as sensitive as other methods for picking up minority admixture, in particular, Native American, especially in small amounts.  Unfortunately, those small amounts are exactly what many people are looking for.

If someone has a great-great-great-great grandparent that is Native, they carry about 1%, more or less, of the Native ancestor’s DNA today. A 4X great grandparent puts their birth year in the range of 1800-1825 – or just before the Trail of Tears.  People whose colonial American families intermarried with Native families did so, generally, before the Trail of Tears.  By that time, many tribes were already culturally extinct and those east of the Mississippi that weren’t extinct were fighting for their lives, both literally and figuratively.
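The “about 1%” figure comes from simple halving: on average, each generation back contributes half as much autosomal DNA as the one before. Here is a minimal Python sketch of that arithmetic. It shows expected averages only; because of how DNA recombines, the amount anyone actually inherits from a specific distant ancestor can be noticeably more or less, which is the “more or less” part. The function names are my own.

```python
def generations_back(num_greats):
    """Generations between you and an ancestor with `num_greats` 'greats'.

    Parent = 1, grandparent = 2, great-grandparent = 3, and so on.
    """
    return num_greats + 2


def expected_fraction(generations):
    """Average share of autosomal DNA from one ancestor, halving each generation."""
    return 0.5 ** generations


for greats in range(1, 7):
    g = generations_back(greats)
    print(f"{greats}X great-grandparent: {g} generations back, "
          f"{expected_fraction(g):.2%} expected on average")
```

A 4X great-grandparent is 6 generations back, so the expected average is 1/64, or roughly 1.6%, and one more generation back it drops below 1%.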

We really need the ability to develop the most sensitive testing to report even the smallest amounts of Native DNA and map those segments to our chromosomes so that we can determine who, and what line in our family, was Native.

I know that Family Tree DNA is looking to improve their products, and I provided this feedback to them. Many people test autosomally only for their ethnicity results and I surely would love to have those people’s results available as matches in the FTDNA data base.

Razib Khan has been working with Family Tree DNA on their myOrigins product and spoke about how the myOrigins data is obtained.

razib kahn

my origins pieces

Given that all humans are related, one way or another, far enough back in time, myOrigins has to be able to differentiate between groups that may not be terribly different. Furthermore, even groups that appear different today may not have been historically.  His own family, from India, has no oral history of coming from the East, but the genetic data clearly indicates that they did, along with a larger group, about 1000 years ago.  This may well be a result of the adage that history is written by the victors, or maybe whatever happened was simply too long ago or unremarkable to be recorded.

Razib mentioned that, depending on the cluster and the reference samples, these clusters and groups that we see on our myOrigins maps can range from 1000-10,000 years in age.

relatedness of clusters

The good news is that genetics is blind to any preconceived notions. The bad news is that the software has to fit your results to the best population, even though it may not be directly a fit.  Hopefully, as we have more and better reference populations, the results will improve as well.

my origin components

pca chart

Razib showed a PCA (principal components analysis) graph, above. These graphs chart reference populations in different quadrants.  Where the different populations overlap is where they share common historic ancestors.  As you can see, on this graph with these reference populations, there is a lot of overlap in some cases, and none in others.

Your personal results would then be plotted on top of the reference populations. The graph below shows me, as the white “target” on a PCA graph created by Doug McDonald.

my pca chart

The Changing Landscape

A topic discussed privately among the group, and primarily among the bloggers, is the changing landscape of genetic genealogy over the past year or so.  In many ways I think the bloggers are the canaries in the mine.

One thing that clearly happened is that the proverbial tipping point occurred, and we’re past it. DNA someplace along the line became mainstream.  Today, DNA is a household word.  At gatherings, at least someone has tested, and most people have heard about DNA testing for genealogy or at least consumer based DNA testing.

The good news in all of this is that more and more people are testing. The bad news is that they are typically less informed and are often impulse purchasers.  This gives us the opportunity for many more matches and to work with new people.  It also means there is a steep learning curve and those new testers often know little about their genealogy.  Those of us in the “public eye,” so to speak, have seen an exponential spike in questions and communications in the past several months.  Unfortunately, many of the new people don’t even attempt to help themselves before asking questions.

Sometimes opportunity comes with work clothes – for them and us both.

I was talking with Spencer about this at the reception and he told me I was stealing his presentation.  He didn’t seem too upset by this:)

spencer and me

I had to laugh, because this falls clearly into the “be careful what you wish for, you may get it” category. The Genographic project through National Geographic is clearly, very clearly, a critical component of the tipping point, and this was reflected in Spencer’s presentation.  Although I covered quite a bit of Spencer’s presentation in my day 2 summary, I want to close with Spencer here.  I also want to say that if you ever have the opportunity to hear Spencer speak, please do yourself the favor and be sure to take that opportunity.  Not only is he brilliant, he’s interesting, likeable and very approachable.  Of course, it probably doesn’t hurt that I’ve known him now for 9 years!  I’ve never thought to have my picture taken with Spencer before, but this time, one of my friends did me the favor.

I have to admit, I love talking to Spencer, and listening to him. He is the adventurer through whom we all live vicariously.  In the photo below, Spencer, along with his crew, drove from London to Mongolia.  Not sure why he is standing on the top of the Land Rover, but I’m sure he will tell us in his upcoming book about that journey.

spencer on roof

I’m warning you all now, if I win the lottery, I’m going on the world tour that he hosts with National Geographic, and of course, you’ll all be coming with me via the blog!

Spencer talked about the consumer genomics market and where we are today.

spencer genomics

Spencer mentioned that genetic genealogy was a cottage industry originally. It was, and it was even smaller than that, if possible.  It actually was started by Bennett and his cell phone.  I managed to snap a picture of Bennett this weekend on the stage looking at his cell, and I thought to myself, “this is how it all started 14 years ago.”  Just look where we are today.  Thank you Michael Hammer for telling Bennett that you received “lots of phone calls from crazy genealogists like you.”

bennett first office

So, where exactly are we today?  In 2013, the industry crossed the millionth kit line.  The second millionth kit was sold in early summer 2014 and the third million will be sold in 2015.  No wonder we feel like a tidal wave has hit.  It has.

Why now?

DNA has become part of the national consciousness.  Businesses advertise that “it’s in our DNA.”  People are now comfortable sharing via social media like Facebook and Twitter.  What DNA can do and show you, and the secrets it can unlock, are spreading by word of mouth.  Spencer termed this the “viral spread threshold” and we’ve crossed that invisible line in the sand.  He terms 2013 as the year of infection and based on my blog postings, subscriptions, hits, reach and the number of e-mails I receive, I would completely agree.  Hold on tight for the ride!

Spencer talked about predictions for the near-term future and said a 5 year plan is impossible and that an 18 month plan is more realistic. He predicts that we will continue to see exponential growth over the next several years.  He feels that genetic genealogy testing will be the primary driver of growth because medical or health testing is subject to the clinical utility trap being experienced currently by 23andMe.  The Big 4 testing companies (Ancestry, 23andMe, Family Tree DNA and National Geographic) control 99% of the consumer market in the US.

Spencer sees a huge international market potential that is not currently being tapped. I do agree with him, but many in European countries are hesitant, and in some places, like France, DNA testing that might expose paternity is illegal.  When Europeans see DNA testing as a genealogical tool, he feels they will become more interested.  Most Europeans know where their ancestral village is, or they think they do, so it doesn’t have the draw for them that it does for some of us.

Ancestry testing (aka genetic genealogy as opposed to health testing) is now a mature industry with a 100% growth rate.

Spencer also mentioned that while the Genographic database is not open access, affiliate researchers can send Nat Geo a proposal and thereby gain research access to the database if their proposal is approved. This extends to citizen scientists as well.

spencer near term

Michael Hammer

You’ll notice that Michael Hammer’s presentation, “Ancient and Modern DNA Update, How Many Ancestral Populations for Europe,” is missing from this wrapup. It was absolutely outstanding, and fascinating, which is why I’m writing a separate article about his presentation in conjunction with some additional information.  So, stay tuned.

Testing, More Testing

It’s becoming quite obvious that the people who are doing the best with genetic genealogy are the ones who are testing the most family members, both close and distant. That provides them with a solid foundation for comparison and better ways to “drop matches” into the right ancestor box.  For example, if someone matches you and your mother’s sister, Aunt Margaret, especially if your mother is not available to test, that’s a very important hint that your match is likely from your mother’s line.

So, in essence, while initially we would advise people to test the oldest person in a generational line, now we’ve moved to the “test everyone” mentality.  Instead of a survey, now we need a census.  The exception might be that the “child” does not necessarily need to be tested because both parents have tested.  However, having said that, I would perhaps not make that child’s test a priority, but I would eventually test that child anyway.  Why?  Because that’s how we learn.  Let me give you an example.

I was sitting at lunch with David Pike. We were discussing autosomal DNA generational transmission and inheritance.  He pulled out his iPad, passed it to me, and showed me a chromosome (not the X) that has been passed entirely intact from one generation to the next.  Had the child not been tested, we would never have known that.  Now, of course, if you’ll remember the 50% rule, by statistical prediction, the child should get half of the mother’s chromosome and half of the father’s, but that’s not how it worked.  So, because we don’t know what we don’t know, I’m now testing everyone I can find and convince in my family.  Unfortunately, my family is small.

Full genome testing is in the future, but we’re not ready yet. Several presenters mentioned full genome testing in some context.  Here’s the bottom line.  It’s not truly full genome testing today, only 95-96%.  The technology isn’t there yet, and we’re still learning.  In a couple of years, we will have the entire genome available for testing, and over time, the prices will fall.  Keep in mind that most of our genome is identical to that of all humans, and the autosomal tests today have been developed in order to measure what is different and therefore useful genealogically.  I don’t expect big breakthroughs due to full genome testing for genetic genealogy, although I could be wrong.  You can, however, count me in, because I’m a DNA junkie.  When the full genome test is below $1000, when we have comparison tools and when the coverage won’t necessitate doing a second or upgrade test a few years later, I’ll be there.

Thank you

I want to offer a heartfelt thank you to Max Blankfeld and Bennett Greenspan, founders of Family Tree DNA, shown with me in the photo below, for hosting and subsidizing the administrator’s conference – now for a decade. I look forward to seeing them, and all of the other attendees, next year.

I anticipate that this next decade will see many new discoveries resulting in tools that make our genealogy walls fall.  I can’t help but wonder what the article I’ll be writing on the 20th anniversary looking back at nearly a quarter century of genetic genealogy will say!

roberta, max and bennett

Ancient DNA Matches – What Do They Mean?

The good news is that my three articles about the Anzick and other ancient DNA of the past few days have generated a lot of interest.

The bad news is that it has generated hundreds of e-mails every day – and I can’t possibly answer them all personally.  So, if you’ve written me and I don’t reply, I apologize and  I hope you’ll understand.  Many of the questions I’ve received are similar in nature and I’m going to answer them in this article.  In essence, people who have matches want to know what they mean.

Q – I had a match at GedMatch to <fill in the blank ancient DNA sample name> and I want to know if this is valid.

A – Generally, when someone asks if an autosomal match is “valid,” what they really mean is whether or not this is a genealogically relevant match or if it’s what is typically referred to as IBS, or identical by state.  Genealogically relevant samples are referred to as IBD, or identical by descent.  I wrote about that in this article with a full explanation and examples, but let me do a brief recap here.

In genealogy terms, IBD is typically used to mean matches over a particular threshold that can be or are GENEALOGICALLY RELEVANT.  Those last two words are the clue here.  In other words, we can match them with an ancestor with some genealogy work and triangulation.  If the segment is large, and by that I mean significantly over the threshold of 700 SNPs and 7cM, even if we can’t identify the common ancestor with another person, the segment is presumed to be IBD simply because of the math involved with the breakdown of segments into pieces.  In other words, a large segment match generally means a relatively recent ancestor and a smaller segment means a more distant ancestor.  You can readily see this breakdown on this ISOGG page detailing autosomal DNA transmission and breakdown.

Unfortunately, often smaller segments, or ones determined to be IBS are considered to be useless, but they aren’t, as I’ve demonstrated several times when utilizing them for matching to distant ancestors.  That aside, there are two kinds of IBS segments.

One kind of IBS segment is where you do indeed share a common ancestor, but the segment is small and you can’t necessarily connect it to the ancestor.  These are known as population matches and are interpreted to mean your common ancestor comes from a common population with the other person, back in time, but you can’t find the common ancestor.  By population, we could mean something like Amish, Jewish or Native American, or a country like Germany or the Netherlands.

In the cases where I’ve utilized segments significantly under 7cM to triangulate ancestors, those segments would have been considered IBS until I mapped them to an ancestor, and then they suddenly fell into the IBD category.

As you can see, the definitions are a bit fluid and are really defined by the genealogy involved.

The second kind of IBS is where you really DON’T share an ancestor, but your DNA and your match’s DNA have managed to mutate to a common state by convergence, or where your Mom’s and Dad’s DNA combine to form a pseudo match, where you match someone on a segment run long enough to be considered a match at a low level.  I discussed how this works, with examples, in this article.  Look at example four, “a false match.”

So, in a nutshell, if you know who your common ancestor is on a segment match with someone, you are IBD, identical by descent.  If you don’t know who your common ancestor is, and the segment is below the normal threshold, then you are generally considered to be IBS – although that may or may not always be true.  There is no way to know if you are truly IBS by population or IBS by convergence, with the possible exception of phased data.
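For readers who like to see the logic spelled out, here is a minimal Python sketch of the rule of thumb just described. The function name and the labels it returns are made up for illustration, and treating "at or above 700 SNPs and 7cM" as the cut-off is an assumption, since the text only says "significantly over" the threshold.

```python
from typing import Optional

# Thresholds quoted in the text. Using ">=" as the cut-off is an assumption.
MIN_SNPS = 700
MIN_CM = 7.0


def classify_segment(cm: float, snps: int, known_ancestor: Optional[str] = None) -> str:
    """Label one shared segment using the rule of thumb described above."""
    if known_ancestor:
        # Mapped to an ancestor, so it counts as IBD whatever its size.
        return "IBD"
    if cm >= MIN_CM and snps >= MIN_SNPS:
        # Large enough to be presumed IBD even without a named ancestor.
        return "presumed IBD"
    # Small and unmapped: IBS by population or by convergence.
    return "IBS"


print(classify_segment(12.0, 1500))
print(classify_segment(2.6, 132))
print(classify_segment(2.6, 132, known_ancestor="hypothetical ancestor"))
```

Note how the same small segment changes label the moment an ancestor is attached to it, which is exactly the fluidity in the definitions mentioned above.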

Data phasing is when you can compare your autosomal DNA with one or both parents to determine which half you obtained from whom.  If you are a match by convergence where your DNA run matches that of someone else because the combination of your parents’ DNA happens to match their segment, phasing will show that clearly.  Here’s an example for only one location utilizing only my mother’s data phased with mine.  My father is deceased and we have to infer his results based on my mother’s and my own.  In other words, mine minus the part I inherited from my mother = my father’s DNA.

My Result | My Result | Mother’s Result | Mother’s Result | Father’s Inferred Result | Father’s Inferred Result
T | A | T | G | A | ?

In this example of just one location, you can see that I carry a T and an A in that location.  My mother carries a T and a G, so I obviously inherited the T from her because I don’t have a G.  Therefore, my father had to have carried at least an A, but we can’t discern his second value.
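The single-location reasoning above can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the function name is hypothetical, genotypes are given as two letters, and no-calls or read errors are not handled beyond raising an error when the child shares nothing with the mother.

```python
from typing import Optional, Tuple

Genotype = Tuple[str, str]


def phase_with_mother(child: Genotype, mother: Genotype) -> Optional[Tuple[str, str]]:
    """Return (allele from mother, allele from father) for one location.

    Returns None when the location is ambiguous, and raises ValueError when
    the child carries nothing the mother could have passed on.
    """
    first, second = child
    first_fits = first in mother
    second_fits = second in mother
    if not first_fits and not second_fits:
        raise ValueError("child shares no allele with mother; check the data")
    if first == second:
        # Homozygous child: the same value came from each parent.
        return (first, first)
    if first_fits and second_fits:
        # Mother carries both values, so we cannot tell which one she passed on.
        return None
    return (first, second) if first_fits else (second, first)


# The example from the text: child T/A, mother T/G.
print(phase_with_mother(("T", "A"), ("T", "G")))
```

For the example in the text this gives T from the mother and A from the father. The father's second value stays unknown, just as described.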

This example utilized only one location.  Your autosomal data file will hold between 500,000 and 700,000 locations, depending on the vendor you tested with and the version level.

You can phase your DNA with that of your parent(s) at GedMatch.  However, if both of your parents are living, an easier test would be to see if either of your parents matches the individual in question.  If neither of your parents matches them, then your match is a result of convergence or a data read error.
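If you keep your match lists in a spreadsheet or text file, the "does either parent match too?" check is a simple set operation. Here is a small Python sketch. The function name and the kit labels are hypothetical, and it assumes both parents have tested and that the three lists were produced at the same thresholds.

```python
from typing import Set


def matches_without_parent_support(child: Set[str], mother: Set[str], father: Set[str]) -> Set[str]:
    """Return the child's matches that neither parent shares.

    Following the reasoning above, these are the ones to suspect of being
    matches by convergence or data read errors.
    """
    return child - (mother | father)


# Hypothetical kit labels, for illustration only.
child_matches = {"kit A", "kit B", "kit C"}
mother_matches = {"kit A", "kit D"}
father_matches = {"kit B"}

print(matches_without_parent_support(child_matches, mother_matches, father_matches))
```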

So, this long conversation about IBD and IBS is to reach this conclusion.

All of the ancient specimens are just that, ancient, so by definition, you cannot find a genealogy match to them, so they are not IBD.  Best case, they are IBS by population.  Worst case, IBS by convergence.  You may or may not be able to tell the difference.  The reason, in my example earlier this week, that I utilized my mother’s DNA and only looked at locations where we both matched the ancient specimens was because I knew those matches were not by convergence – they were in fact IBS by population because my mother and I both matched Anzick.

ancient compare5

Q – What does this ancient match mean to me?

A – Doggone if I know.  No, I’m serious.  Let’s look at a couple possibilities, but they all have to do with the research you have, or have not, done.

If you’ve done what I’ve done, and you’ve mapped your DNA segments to specific ancestors, then you can compare your ancient matching segments to your ancestral spreadsheet map, especially if you can tell unquestionably which side the ancestral DNA matches.  In my case, shown above, the Clovis Anzick matched my mother and me on the same segment and we both matched Cousin Herbie.  We know unquestionably who our common ancestor is with cousin Herbie – so we know, in our family line, which line this segment of DNA shared with Anzick descends through.

ancient compare6

If you’re not doing ancestor mapping, then I guess the Anzick match would come in the category of, “well, isn’t that interesting.”  For some, this is a spiritual connection to the past, a genetic epiphany.  For others, it’s “so what.”

Maybe this is a good reason to start ancestor mapping!  This article tells you how to get started.

Q – Does my match to Anzick mean he is my ancestor?

A – No, it means that you and Anzick share common ancestry someplace back in time, perhaps tens of thousands of years ago.

Q – I match the Anzick sample.  Does this prove that I have Native American heritage? 

A – No, and it depends.  Don’t you just hate answers like this?

No, this match alone does not prove Native American heritage, especially not at IBS levels.  In fact, many people who don’t have Native heritage match small segments.  How can this be?  Well, refer to the IBS by convergence discussion above.  In addition, the Anzick child came from an Asian population when his ancestors migrated, crossing from Asia via Beringia.  That Eurasian population also settled part of Europe – so you could be matching on very small segments from a common population in Eurasia long ago.  In a paper just last year, this was discussed when Siberian ancient DNA was shown to be related to both Native Americans and Europeans.

In some cases, a match to Anzick on a segment already attributed to a Native line can confirm or help to confirm that attribution.  In my case, I found the Anzick match on segments in the Lore family who descend from the Acadians who were admixed with the Micmac.  I have several Anzick match segments that fit that criterion.

A match to Anzick alone doesn’t prove anything, except that you match Anzick, which in and of itself is pretty cool.

Q – I’m European with no ancestors from America, and I match Anzick too.  How can that be?

A – That’s really quite amazing, isn’t it?  Just this week in Nature, a new article was published discussing the three “tribes” that settled or founded the European populations.  This, combined with the Siberian ancient DNA results that connect the dots between an ancient population that contributed to both Europeans and Native Americans, explains a lot.

3 European Tribes

If you think about it, this isn’t a lot different than the discovery that all Europeans carry some small amount of Neanderthal and Denisovan DNA.

Well, guess what….so does Anzick.

Here are his matches to the Altai Neanderthal.

Chr | Start Location | End Location | Centimorgans (cM) | SNPs
2 | 241484216 | 242399416 | 1.1 | 138
3 | 19333171 | 21041833 | 2.6 | 132
6 | 31655771 | 32889754 | 1.1 | 133

He does not match the Caucasus Neanderthal.  He does, however, match the Denisovan individual on one location.

Chr | Start Location | End Location | Centimorgans (cM) | SNPs
3 | 19333171 | 20792925 | 2.1 | 107

Q – Maybe the scientists are just wrong and the burial is not 12,500 years old,  maybe just 100 years old and that’s why the results are matching contemporary people.

A – I’m not an archaeologist, nor do I play one…but I have been closely involved with numerous archaeological excavations over the past decade with The Lost Colony Research Group, several of which recovered human remains.  The photo below is me with Anne Poole, my co-director, sifting at one of the digs.

anne and me on dig

There are very specific protocols that are followed during and following excavation and an error of this magnitude would be almost impossible to fathom.  It would require  kindergarten level incompetence on the part of not one, but all professionals involved.

In the Montana Anzick case, in the paper itself, the findings and protocols are both discussed.  First, the burial was discovered directly beneath the Clovis layer where more than 100 tools were found, and the Clovis layer was undisturbed, meaning that this is not a contemporary burial that was buried through the Clovis layer.  Second, the DNA fragmentation that occurs as DNA degrades correlated closely to what would be expected in that type of environment at the expected age based on the Clovis layer.  Third, the bones themselves were directly dated using XAD-collagen to 12,707-12,556 calendar years ago.  Lastly, if the remains were younger, the skeletal remains would match most closely with Native Americans of that region, and that isn’t the case.  This graphic from the paper shows that the closest matches are to South Americans, not North Americans.

anzick matches

This match pattern is also confirmed independently by the recent closest GedMatch matches to South Americans.

Q – How can this match from so long ago possibly be real?

A – That’s a great question and one that was terribly perplexing to Dr. Svante Paabo, the man who is responsible for producing the full genome sequence of the first, and now several more, Neanderthals.  The expectation was, understanding that autosomal DNA gets watered down by 50% in every generation through recombination, that ancient genomes would be long gone and not present in modern populations.  Imagine Svante’s surprise when he discovered that not only is that not true, but those ancient DNA segments are present in all Europeans and many Asians as well.  He too agonized over the question about how this is possible, which he discussed in this great video.  In fact he repeated these tests over and over in different ways because he was convinced that modern individuals could not carry Neanderthal DNA – but all those repeated tests did was to prove him wrong.  (Paabo’s book, Neanderthal Man, In Search of Lost Genomes is an incredible read that I would highly recommend.)
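To see why the expectation was that ancient DNA would be long gone, it helps to run the 50% rule forward. The short Python sketch below does only that. It is the simple halving model, an average expectation that assumes every ancestor in the tree is a different person, which is exactly the assumption that fails in small populations. The function name is made up for illustration.

```python
def expected_share(generations: int) -> float:
    """Average share of one ancestor's autosomal DNA left after this many
    generations, under the simple 50%-per-generation model."""
    return 0.5 ** generations


for g in (1, 3, 10, 20):
    print(g, expected_share(g))
```

After 10 generations the expected share is already under a tenth of a percent, and after 20 it is under one part in a million.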

What this means is that the population at one time, and probably at several different times, had to be very small.  In fact, it’s very likely that many times different pockets of the human race were in great jeopardy of dying out.  We know about the ones that survived.  Probably many did perish leaving no descendants today.  For example, no Neanderthal mitochondrial DNA has been found in any living or recent human.

Take a small population, let’s say 5 males and 5 females who somehow got separated from their family group and founded a new group, by necessity.  In fact, this could well be a description of how the Native Americans crossed Beringia.  Those 5 males and 5 females are the founding population of the new group.  If they survive, all of the male descendants will carry the founding men’s Y haplogroups – let’s say they are Q and C, and all of the descendants will carry the mitochondrial haplogroups of the females – let’s say A, B, C, D and X.

There is a very limited amount of autosomal DNA to pass around.  If all of those 10 people are entirely unrelated, which is virtually impossible, there will be only 10 possible combinations of DNA to be selected from.  Within a few generations, everyone will carry part of those 10 ancestors’ DNA.  We all have 8 ancestors at the great-grandparent level.  By the time those original settlers’ descendants had great-great-grandparents, of which each one had 16, at least 6 of those 16 positions in their tree would have to be filled by founders who already appear elsewhere in it.

There was only so much DNA to be passed around.  In time, some of the segments would no longer be able to be recombined because when you look at phasing, the parents’ DNA was exactly the same, example below.  This is what happens in endogamous populations.
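The counting argument here is just slots versus people, and a tiny Python sketch makes it concrete. The function names are hypothetical, and the sketch assumes the whole generation of ancestors is drawn from the founders themselves, as in the 10-person example.

```python
def ancestor_slots(generations_back: int) -> int:
    """Positions in the tree: 2 parents, 4 grandparents, 8 great-grandparents..."""
    return 2 ** generations_back


def minimum_repeated_slots(founders: int, generations_back: int) -> int:
    """Fewest positions that must be filled by a founder already in the tree."""
    return max(0, ancestor_slots(generations_back) - founders)


print(ancestor_slots(3), minimum_repeated_slots(10, 3))  # great-grandparents
print(ancestor_slots(4), minimum_repeated_slots(10, 4))  # great-great-grandparents
```

With 10 founders, the 8 great-grandparent positions can still all be different people, but the 16 great-great-grandparent positions cannot, and every generation further back only makes the overlap larger.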

My Result | My Result | Mother’s Result | Mother’s Result | Father’s Result | Father’s Result
T | T | T | T | T | T

Let’s say this group’s descendants lived without contact with other groups, for maybe 15,000 years in their new country.  That same DNA is still being passed around and around because there was no source for new DNA.  Mutations did occur from time to time, and those were also passed on, of course, but that was the only source of changed DNA – until they had contact with a new population.

When they had contact with a new population and admixture occurred, the normal 50% recombination/washout in every generation began – but for the previous 15,000 years, there had been no 50% shift because the DNA of the population was, in essence, all the same.  A study about the Ashkenazi Jews that suggests they had only a founding population of about 350 people 700 years ago was released this week – explaining why Ashkenazi Jewish descendants have thousands of autosomal matches and match almost everyone else who is Ashkenazi.  I hope that eventually scientists will do this same kind of study with Anzick and Native Americans.

If the “new population” we’ve been discussing was Native Americans, their males 15,000 years later would still carry haplogroups Q and C and the mitochondrial DNA would still be A, B, C, D and X.  Those haplogroups, and subgroups formed from mutations that occurred in their descendants, would come to define their population group.

In some cases, today, Anzick matches people who have virtually no non-Native admixture at the same level as if they were just a few generations removed, shown on the chart below.

anzick gedmatch one to all

Since, in essence, these people still haven’t admixed with a new population group, those same ancient DNA segments are being passed around intact, which tells us how incredibly inbred this original small population must have been.  This is known as a genetic bottleneck.

The admixture report below is for the first individual on the Anzick one to all Gedmatch compare at 700 SNPs and 7cM, above.  In essence, this currently living non-admixed individual still hasn’t met that new population group.

anzick1

If this “new population” group was Neanderthal, perhaps they lived in small groups for tens of thousands of years, until they met people exiting Africa, or Denisovans, and admixed with them.

There weren’t a lot of people anyplace on the globe, so by virtue of necessity, everyone lived in small population groups.  Looking at the odds of survival, it’s amazing that any of us are here today.

But, we are, and we carry the remains, the remnants of those precious ancestors, the Denisovans, the Neanderthals and Anzick.  Through their DNA, and ours, we reach back tens of thousands of years on the human migration path.  Their journey is also our journey.  It’s absolutely amazing and it’s no wonder people have so many questions and such a sense of enchantment.  But it’s true – and only you can determine exactly what this means to you.

Big Y DNA Results Divide and Unite Haplogroup Q Native Americans

feather

One of my long standing goals has been to resurrect the lost heritage of the Native American people.  By this I mean, primarily, for genealogists who search for and can’t find their Native ancestors.  My blog, www.nativeheritageproject.com, is one of the ways that I contribute towards that end.  Many times, records are buried, don’t exist at all, or don’t reflect anything about Native heritage.  While documents can be somewhat evasive and frustratingly vague, the Y DNA of the male descendants is not.  It’s rock solid.

The Native communities became admixed beginning with the first visits of Europeans to what would become the Americas.  Native people accepted mixed race individuals as full tribal members, based on the ethnicity of the mother.  Adoption also played a key role.  If a female, the mother, was an adopted white child, the mother was considered to be fully Native, as was her child, regardless of the ethnicity of the father.

Therefore, some people who test their DNA expecting to find Native genetics do not – they instead find European or African – but that alone does not mean that their ancestors were not tribal members.  It means that these individuals have to rely on non-genetic records to prove their ancestors’ Native heritage – or they need to test a different line – like the descendants of the mother, through all females, for example, for mitochondrial DNA.

On the other hand, some people are quite surprised when their DNA results come back as Native.  Many have heard a vague story, but often, they don’t have a clue as to which genealogical line, if any, the Native ancestry originated in.  Native ancestry was often hidden because the laws that prevailed at the time sanctioned discrimination of many kinds against people “of color,” and if you weren’t entirely of European origin, you were “of color.”  Many admixed people, as soon as they could, “became” white socially and never looked back.  That did not change until recently, the late 20th century, when discrimination had for the most part become a thing of the past and one could embrace their Native or African heritage without fear of legal or social reprisal.

Back in December of 2010, we found the defining SNP that divided haplogroup Q between Europeans and Native Americans.  At the time, this was a huge step forward, a collaboration between testing participants, haplogroup administrators, citizen scientists and Family Tree DNA.

This allowed us to determine who was, and was not, included in Native American haplogroups, but it was also the tip of the iceberg.  You can see below just how much the tree has expanded and its branches have been shuffled.  This is a big part of the reason for the change from haplogroup names like Q1a3 to Q-M346.  For example, at one time or another the SNP M3 was associated with haplogroup names Q1a3a, Q1a3a1 and Q1a3a1a.  On the ISOGG tree below, today M3 is associated with Q1a2a1a1.

isogg q tree

The new Family Tree DNA 2014 tree is shown below for one of the Big Y participants whose terminal SNP is L568, found beneath SNP CTS1780 which is found beneath L4, which is beneath L213 which is beneath L474 which is beneath MEH2 which is beneath L232 which is, finally, beneath M242.

ftdna 2014 q tree

The introduction of the Big Y product from Family Tree DNA, which sequences a large portion of the Y chromosome, provided us with the opportunity to make huge strides in unraveling and deciphering the haplogroup Q (and C, the other male Native haplogroup in the Americas) tree.  I am hopeful that in time, and with enough people taking the Big Y test, that we will one day be able to at least sort participants into language and perhaps migration groups.

In November, 2013, we asked for the public and testers to support our call for funds to be able to order several Big Y tests.  The project administrators intentionally did not order tests in family groups, but attempted to scatter the tests to the far corners, so to speak, and to include at least one person from each disparate group we have in the haplogroup Q project, based on STR matches, or lack thereof, and previous SNP testing.

Thanks to the generosity of contributors, we were able to order several tests.  In addition, some participants were able to order their own tests, and did.  Thank you one and all.

The tests are back now, and with the new Big Y SNP matching, recently introduced by Family Tree DNA, comparisons are a LOT easier.

So, of course, I had to see what I could find by comparing the SNP results of the several gentlemen who tested.

To protect the privacy of everyone involved, I have reduced their names to initials.  I have included their terminal SNP as identified at Family Tree DNA as well as any tribal, ethnic or location information we have available for their most distant paternal ancestor.

There are two individuals who believe their ancestors are from Europe, and there is a very large group of European haplogroup Q members, but I’m not convinced that the actual biological ancestors of these two gentlemen are from Europe.  I have included both of these individuals as well. Let’s just say the jury is still out. As a control, I have also included a gentleman who actually lives in Poland.

native match clusters

Of the individuals above, SD, CT and CM are SNP matches.

CD, WJS and WBS are SNP matches with each other.

BG and ETW are also SNP matches to each other.

None of the rest of these individuals have SNP matches.  (Note, you can click to enlarge the chart.)

native snp matches

In the table above, the Non-Matching Known SNPs are shown with the number of Shared Novel Variants.  For example, SD and CT have 4 non-matching SNPs and share 161 Novel Variants and are noted as 4/161.

We can easily tell which of the known SNPs are nonmatching, because they are shown on the participant’s match page.

snp matches page

What we don’t know, and can’t tell, is how many Novel Variants these people share with each other, and how many they might share with the individuals that aren’t shown as matches.

Keep in mind that there may be individuals here that are not shown as matches due to no-calls.  Only people with up to and including 4 non-matching Known SNPs are counted as matches.  If you have the wrong combination of no-calls, or aren’t in the same terminal haplogroup, you may not be shown as a match when you otherwise would be.
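If you want to work with the "4/161" style cells from the table in a spreadsheet export or script, a couple of lines of Python will split them apart and apply the cut-off just described. This is only a sketch. The function names are made up, and it says nothing about how Family Tree DNA treats no-calls internally, which the text does not spell out.

```python
from typing import Tuple

# Limit stated in the text: up to and including 4 non-matching Known SNPs.
MAX_NON_MATCHING_KNOWN_SNPS = 4


def parse_cell(cell: str) -> Tuple[int, int]:
    """Split a cell such as '4/161' into (non-matching known SNPs, shared novel variants)."""
    non_matching, shared_novel = cell.split("/")
    return int(non_matching), int(shared_novel)


def is_listed_match(non_matching_known_snps: int) -> bool:
    """True when the pair falls within the stated cut-off."""
    return non_matching_known_snps <= MAX_NON_MATCHING_KNOWN_SNPS


non_matching, shared_novel = parse_cell("4/161")
print(non_matching, shared_novel, is_listed_match(non_matching))
```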

The other reason for my intense interest in the Novel Variants is to see if they are actually Novel, as in found only in a few people, or if they are more widespread.

I downloaded each person’s Novel Variants through the Export Utility (blue button to the right at the top of your personal page,) and combined the Novel Variants into a single spreadsheet.  I colorized each person’s result rows so that they would be easy to track.  I have redacted their names. The white row, below, is the individual who lives in Poland.

novel variant 1

There are a total of 3506 Novel Variants between these men.  When sorting, many clustered as you would expect.  There is the Algonquian group and what I’ve taken to calling the Borderlands group.  This group has someone whose ancestor was born in VA and two in SC.  I have documentation for the Virginia family having descendants in SC, so that makes sense.  The third group is an unusual combination of the gentleman who believes his ancestors are from Germany and the gentleman whose ancestors are found in a New Mexico Pueblo tribe, but whose ancestor was, likely, based on church records, a detribalized Plains Indian who had been kidnapped and sold.

Clusters that I felt needed some scrutiny, for one reason or another, I highlighted in yellow in the Terminal SNP column.  Obviously the Polish/Pueblo matching needs some attention.

Another very interesting type of match are several where either all or nearly all of the individuals share a Novel Variant – 15 or 16 of 16 total participants.  I don’t think these will remain Novel Variants very long.  They clearly need to be classified as SNPs.  I’m not sure about the process that Family Tree DNA will use to do this, but I’ll be finding out shortly.
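For anyone repeating this exercise outside a spreadsheet, counting how many participants carry each Novel Variant is straightforward. Below is a minimal Python sketch. The participant labels and positions are invented toy data, not the project's results, and the function name is hypothetical.

```python
from collections import Counter
from typing import Dict, List, Set


def widely_shared(variants_by_person: Dict[str, Set[int]], min_carriers: int) -> List[int]:
    """Return positions carried by at least min_carriers participants."""
    counts = Counter(
        position
        for positions in variants_by_person.values()
        for position in positions
    )
    return sorted(pos for pos, n in counts.items() if n >= min_carriers)


# Toy data: four hypothetical participants and made-up positions.
sample = {
    "P1": {1001, 1002},
    "P2": {1001, 1002, 1003},
    "P3": {1001},
    "P4": {1001, 1002},
}

print(widely_shared(sample, min_carriers=3))
print(widely_shared(sample, min_carriers=4))
```

With 16 real participants, setting `min_carriers=15` would pull out the "all or nearly all" group described above.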

Here’s an example where everyone shares this Novel Variant at location 7688075, except the gentleman who lives in Poland, the man who believes his ancestor is from Germany, and the Creek descendant.

novel variant 2

I was very surprised at how many Novel Variants appear in all 16 results of the participants, including the gentleman who lives in Poland – represented by the white row below.

novel variant 3

So, how were the Novel Variants distributed?

Category | # of Variants | Comments
Algonquian Group | 140 | This is to be expected since it’s within a specific group.  Any matches that include people outside the 3 Algonquian individuals are counted in a separate category.  These matches give us the ability to classify anyone who tests with these marker results as provisionally Algonquian.
Borderlands | 83 | This confirms that these three individuals are indeed a “group” of some sort.  This also gives us the ability to classify future participants using these mutations.
All or Nearly All – 15 or 16 Participants | 80 | These are clearly candidates for SNPs, and, given that they are found in the Native and the European groups, they appear to predate the division of haplogroup Q.
Several Native and European, Combined | 45 | This may or may not include the person who lives in Poland.  This group needs additional scrutiny to determine if it actually does exist in Europe, but given that there are more than 3 individuals with each of these Novel Variants, they need to be considered for SNPhood.
Pueblo/NC | 1 |
Poland/Borderlands | 2 |
Mexico/Algonquian | 2 |
German/Pueblo | 9 | I wonder if this person is actually German.
Poland/Mexico | 20 | I wonder if this person’s ancestors are actually from Poland.
Algonquian, NC, Creek | 1 |
Borderland, Mexico, Creek | 1 |
Algonquian/Cherokee | 1 |
All Native, no Euro | 2 |
Algonquian, Borderlands, Mexico, NC | 1 |
Algonquian, Mexico, Borderlands | 1 |
Borderlands, Pueblo | 1 |
Borderlands, Creek, NC | 1 |
Algonquian, Cherokee, Mexico | 3 |
Algonquian, Pueblo, Creek, Borderlands | 1 |
Cherokee, NC | 2 |
Algonquian, Borderlands | 2 |
Borderlands, NC | 1 |
Algonquian, NC | 1 |
Polish/NC | 10 |
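A distribution like the one above can also be tallied programmatically once each participant is tagged with a group. The Python sketch below shows one way this could be done. It is not how the table was actually built (that was done by sorting a spreadsheet), and the participants, groups and positions are invented toy data.

```python
from collections import Counter, defaultdict
from typing import Dict, FrozenSet, Set


def tally_by_group(
    variants_by_person: Dict[str, Set[int]],
    group_of: Dict[str, str],
    min_carriers: int = 2,
) -> "Counter[FrozenSet[str]]":
    """Count shared variants by the combination of groups that carry them."""
    carriers = defaultdict(set)
    for person, positions in variants_by_person.items():
        for position in positions:
            carriers[position].add(person)

    tally: "Counter[FrozenSet[str]]" = Counter()
    for people in carriers.values():
        # A variant seen in only one person is not a shared variant.
        if len(people) >= min_carriers:
            tally[frozenset(group_of[p] for p in people)] += 1
    return tally


# Toy data: hypothetical participants, groups and positions.
groups = {"P1": "Algonquian", "P2": "Algonquian", "P3": "Borderlands", "P4": "Poland"}
variants = {"P1": {1, 2}, "P2": {1, 3}, "P3": {3, 4}, "P4": {4, 5}}

for combo, count in tally_by_group(variants, groups).items():
    print(sorted(combo), count)
```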

Some of this distribution makes me question if these SNP mutations truly are a “once in the history of mankind” kind of thing.  For example, how did the same SNP appear in the Polish person and the NC person, or the Pueblo person, and not in the rest of the Native people?

New SNPs?

So, are you sitting down?

Based on these numbers, it looks like we have at least 125 new SNP candidates for haplogroup Q.  If we count the Algonquian and the Borderlands groups of matches, that number rises to about 250.  This is very exciting.  Far, far more than I ever expected.  Of these SNPs, about half will identify Native people, even Native groupings of people.  This is a huge step forward, a red letter day for Native American ancestry!

SNPs and STRs

Lastly, I wanted to see how the SNP matching compared to STR matching, or if it did at all, for these men.

Only two men match each other on any STR markers.  CD and WJS matched on 12 markers, but not on higher panels.  The TIP calculator estimated their common ancestor at the 50th percentile to be 17 generations, or between 425 and 510 years ago.  We all know how unrealistic it is to depend on the TIP calculator, but it’s the only tool we have in situations like this.

Given that these are the only two men who do match on STR markers, albeit distantly, in a genealogical timeframe, let’s see what the estimate of 150 years per SNP mutation comes up with.  This estimate is just that, an estimate, devised by the haplogroup R-U106 project administrators, and others, based on their project findings.  150 years is actually the high end of the estimate, 98 being the lower end.  Of course, different haplogroups may vary and these results are very early.  Just saying.

CD has 207 high quality Novel Variants.  He shares 188 of those with WJS, leaving 19 unshared Novel Variants.  Utilizing this number, and multiplying by 150, this suggests that, if the 150 years per SNP is anyplace close to accurate, their common ancestor lived about 2850 years ago.  If you presume that both men are incurring mutations at the same rate in their independent lines, then you would divide the number of years in half, so the common ancestor would be more likely 1425 years ago.  If you use 100 years instead of 150, the higher number of years is 1900 and the half number is about 950 years.
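
For anyone who wants to rerun this arithmetic with their own numbers, here is a minimal sketch in Python. The function name and its arguments are my own invention for illustration, and the years-per-SNP rates are only the rough project estimates discussed above, not established values.

```python
def snp_age_estimates(total_variants, shared_variants, years_per_snp):
    """Return (unshared variants, full estimate in years, halved estimate).

    The halved figure assumes both men accumulate mutations at the same
    rate in their independent lines, as described in the text.
    """
    unshared = total_variants - shared_variants
    full_years = unshared * years_per_snp
    return unshared, full_years, full_years / 2


# CD has 207 high quality Novel Variants and shares 188 with WJS.
for rate in (150, 100):
    unshared, full_years, halved = snp_age_estimates(207, 188, rate)
    print(f"{rate} years/SNP: {unshared} unshared, "
          f"{full_years} years, or {halved:.0f} years if halved")
```

Changing the rate is the only thing that moves the answer, which is exactly why these figures should be treated as speculation until the rate itself is better understood.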

It’s fun to speculate a bit, but until a lot more study has occurred, we won’t be able to reasonably estimate SNP age or age to common ancestor from this information.  Having said all of that, it’s not a long stretch from 510 years to 950 years.

It looks like STR markers are still the way to go for genealogical matching and that SNPs may help to pull together the deeper ancestry, migration patterns and perhaps define family lines.  I hope the day comes soon that I can order the Big Y for lots more project members.  Most of these men do have STR marker matches, and to men with both the same and different surnames.  I’d love to see the Big Y results for those individuals who match more closely in time.

This is still the tip of the iceberg.  There is a lot left to discover!  If you or a family member have haplogroup Q results, please consider ordering the Big Y.  It would make a wonderful gift and a great way to honor your ancestors!

You can also contribute to the American Indian project at this link:

https://www.familytreedna.com/group-general-fund-contribution.aspx?g=AIP

In order to donate to the haplogroup C-P39 project which also includes Native Americans, please click this link:

http://www.familytreedna.com/group-general-fund-contribution.aspx?g=Y-DNAC-P39

Big Y Matching

A few days ago, Family Tree DNA announced and implemented Big Y Matching between participants who have taken the Big Y test.

This is certainly welcome news.  Let’s take a look at Big Y matching, what it means and how to utilize the features.

First, there are really two different groups of people who will benefit from the Big Y tests.

People trying to sort through lines of a common and related surname – like the McDonald or Campbell families, for example – and haplogroup researchers and project administrators.

My own family, for example, is badly brick walled with Charles Campbell first found in Hawkins County, TN in the 1780s.  We know, via STR testing, that he does indeed match the Campbell Clan from Scotland, but we have no idea who his father might have been.  STR testing hasn’t been definitive enough on Charles’ two known sons’ descendants, so I’m very hopeful that someday enough Campbell men will test that we’ll be able, between STR and SNP mutations, to at least narrow the possible family lines.  If I’m incredibly lucky, maybe there will be a family line SNP (Novel Variant) and it won’t just narrow the line, it will give me a long-awaited answer by genetically announcing which line was his.  Could I be that lucky???  That’s like winning the genetic genealogy lottery!

For today, the Big Y test at $695 is expensive to run on an entire project of people, not to mention that many of the original participants in projects, the long-time hard-core genealogists, have since passed away.  We are now into our 15th year of genetic genealogy.

For those studying haplogroups, the Big Y is a huge sandbox and those researchers have lost no time whatsoever comparing various individuals’ SNPS, both known and novel, and creating haplogroup trees of those SNPs.  This is done by hand today, or maybe more accurately stated, by Excel.  This is “not fun” to put it mildly.  We owe these folks a huge debt of gratitude.  Their results are curated and posted, provisionally, on the ISOGG Tree.

There is an in-between group as well, and those are people who are working to establish relationships between people of different surnames.  In my case, Native American ancestors whose descendants have different surnames today, but who do share a common ancestor in some timeframe.  That timeframe of course could be anyplace from a couple hundred to several thousand years, since their entry into the Americas across Beringia someplace in the neighborhood of 12-15 thousand years ago.

The Big Y matching is extremely helpful to projects.

Let’s take a look.

Big Y Matches

Big Y landing

On your personal page, under “Other Results,” you’ll see the Big Y results.  Click on “Results” and you’ll see the following page.

big y results

The Known SNPs and Novel Variants tabs have been there since release, but the Matching tab, top left, is new.

By clicking on the Matching tab, you will then see the men you match based on your terminal SNP as determined in the Big Y Known SNPs data base.  You will be matched to men who carry up to and including 4 mutations difference in known SNPs, and unlimited novel variant differences.  If you have a zero in the “Known SNP Difference” column, that means you have no differences at all in known SNPs.

big y matches cropped2

The individual being used for an example here has paternal ancestry from Hungary.  His terminal SNP is reported as R-CTS11962.  Therefore, all of the people he matches should also carry this same SNP as their terminal SNP.

This is actually quite interesting, because of his 10 exact matches, 9 of them have surnames or genealogy that suggests eastern European/Slavic ancestry.  The 10th, however, which happens to be his closest match, carries an English surname and reports their ancestor to be from Yorkshire, England.  His one mutation differences carry the same pattern, with one being from England and two of the other three from eastern Europe.

Our participant has 155 total Novel Variants, 135 high quality and 20 medium quality.  Only high quality are listed in the comparison.  Medium quality are not.

Ancestral Location | Known SNP Difference | Shared Novel Variants | Non Matching Known SNPs
Yorkshire, England | 0 | 134 | None
Prussia | 0 | 127 | None
Ukraine | 0 | 121 | None
Poland | 0 | 121 | None
Belarus | 0 | 119 | None
Poland | 0 | 116 | None
Poland | 0 | 116 | None
Russian e-mail | 0 | 113 | None
Bulgaria | 0 | 113 | None
Slovakia | 0 | 111 | None
English surname | 1 | 126 | PF6085
Undetermined, poss German | 1 | 121 | F1816
Poland | 1 | 118 | F552
Poland | 1 | 116 | CTS10137
Prussia | 2 | 122 | CTS11840 PF4522
Poland | 2 | 112 | L1029 PR6932
Russia | 3 | 116 | CTS3184 L1029 PF3643
Poland | 3 | 106 | CTS11962 L1029 L260
Ukraine | 3 | 105 | CTS11962 L1029 L260
Poland | 3 | 104 | CTS11962 L1029 L260
Poland | 3 | 100 | CTS11962 L1029 L260
Poland | 3 | 99 | CTS11962 L1029 L260
Eastern European surname | 3 | 98 | CTS11962 L1029 L260
Poland/Germany | 3 | 97 | CTS11962 L1029 L260
Austria/Galacia | 3 | 93 | CTS11962 L1029 L260
Poland | 4 | 97 | CTS11562 CTS11962 L1029 L260

It’s also very interesting to note that his non-matching known SNPs tend to cluster.  Non-matching known SNPs can go in either direction – meaning that they could be absent in our participant and present in the rest, or vice versa.

l1029 search

It’s easy to tell.  In the Big Y Results, under Known SNPs, there is a search feature.  This means that it’s easy to search for SNPs and to determine their status.  For example, above, our participant does carry SNP L1029 (he’s derived or positive (+) for the mutation in question).  This means that our participant has developed L1029, and, it just so happens, also CTS11962 and L260, the three clustered SNPs, since these men shared a common ancestor.

It’s difficult not to speculate a little.  If the TMRCA Big Y SNP estimates are correct, this suggests that these 3 clustered SNPs occurred someplace between 4350 and about 5000 years ago, based on the range (93-106) of the number of high quality novel variant differences.  We’ll talk more about this in a minute.

f552 search

For SNP F552, our participant is negative, meaning that that other person has developed this SNP since their shared ancestor.  In fact, he’s negative for all of the other Known SNP differences.

Novel Variants

The Novel Variants are quite interesting.  Novel Variants are mutations that if found in enough people who are not related within a family group will someday become SNPs on the tree.  Think of them as ripening SNPs.

By clicking on the “Show All” dropdown box you can see the list of the participant’s Novel Variants and how many of his matches share each Novel Variant.

novel variant list

In this example, all 26 of our participant’s matches share Novel Variant 13142597.  I’m thinking that this Novel Variant will someday become classified as a SNP and not as a Novel Variant anymore.  When that happens, and no, we don’t know how often Family Tree DNA will be reviewing the Novel Variants for SNP candidates, it will no longer be in the Novel Variant list.  The Novel Variants are meant to be family, novel or lineage SNPs, not population based SNPs that apply to a wide variety of people.  Finding these, of course, and adding them to the human haplotree is the entire purpose of full sequence Y chromosomal testing.  Just look at all of this new information about this man’s ancestors and the DNA that they passed on to this gentleman.

By scrolling down to the bottom of that list, we find that our participant has 8 different Novel Variants where he matches only one individual.  By clicking on the Novel Variant number, you can see who he matches.  Of those 8, 7 of them match to the man who carries the English surname and one matches to a gentleman from Prussia.

This information is extremely interesting, but it gets even more interesting when compared against STR matches.  Our participant has a fairly unusual haplotype above 12 markers.  He has three 67 marker matches, two 37 marker matches and thirty-three 25 marker matches.  None of the men he matches on the SNP test match him on any of those tests.  I did not check his 12 marker matches, because I felt that anyone who would invest the money in the Big Y would certainly have tested above 12 markers, plus our participant has several hundred 12 marker matches.

The numbers being bandied about by people working with SNP information suggest that one Big Y mutation equals about 150 years.  If this is true, then his closest match, the English gentleman from Yorkshire, England would share an ancestor about 2850 years ago.  That is clearly beyond the reach of STR markers in terms of generational predictions, so maybe STR matches are not expected in this situation, IF the 150 year per novel variant estimate is close to accurate.

Another interesting piece of information that can be deduced from this information is how many SNPs were actually found.

At the bottom of our participant’s page, under Known SNPs, it says “Showing 24 of…571 entries (filtered from 36,274 total entries.)”  We know that the entire data base of SNPs that Family Tree DNA is utilizing, which includes but is not limited to the 12,000+ Geno 2.0 SNPs, is 36,274.  In other words, 36,274 is the number of SNPs available to be found and counted as a SNP because they have already been defined as such.  Any other SNPs discovered are counted as Novel Variants.

Not all available SNPs are found and read in this type of next generation test.  The number of “Matching SNPs” with each individual gives us an idea of how many SNPs actually were found and read at either a medium or high confidence level.  Low confidence SNPs and no-calls are eliminated from reporting.

Our participant’s best match matches him on 25,397 SNPs.  This leaves a total of 10,877 SNPs that were not called.
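
The same subtraction can be repeated for any match on the list. Here is a small Python sketch; the function names are hypothetical, and the only inputs are the two counts quoted above from the results page.

```python
TOTAL_KNOWN_SNPS = 36_274  # size of the known SNP data base quoted above


def uncalled_snps(matching_snps, total_snps=TOTAL_KNOWN_SNPS):
    """Known SNPs that were not read for a given match."""
    return total_snps - matching_snps


def fraction_read(matching_snps, total_snps=TOTAL_KNOWN_SNPS):
    """Share of the known SNP data base that was actually read."""
    return matching_snps / total_snps


best_match = 25_397
print(uncalled_snps(best_match))
print(f"{fraction_read(best_match):.1%} of known SNPs read")
```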

The Future

SNP Matching is a wonderful feature and a first in this industry.  A hearty thank you to Family Tree DNA!

However, like all passionate people, we are already looking ahead to see what can be and should be done.

Here are some suggestions and questions I have about how the future will unwrap relative to Big Y SNP testing and matching.

  1. Within surname projects, matching should be relatively easy, unless hundreds of people test. I would be happy to have that problem. Today, administrators are creating spreadsheets of matches and novel SNPs and attempting to “reverse engineer” trees. In family groups, those trees would be of Novel SNPs, and in haplogroup projects, those trees would be of both Known SNPs and Novel Variants and where the Novel SNPS slip in-between the known SNPs to create new branches and sub-branches of the haplotree. We, as a community, need some tools to assist in this endeavor, for both the surname project admin and the haplogroup project admin as well.
  2. As new SNPs are discovered in the future, one will not be retested on this platform. As new SNPs are added to the tree, this could affect the matching by terminal SNP. Family Tree DNA needs to be prepared to deal with this eventuality.
  3. As a community, we desperately need a better tool to determine our actual “terminal SNP” as opposed to the Geno 2.0 terminal SNP. Yes, I know the ISOGG tree is provisional, but the contributed tools initially provided by volunteers to search the ISOGG tree utilizing the known SNPs reported in Big Y no longer work. We desperately need something similar while Family Tree DNA is revamping its own tree. I would hope that Family Tree DNA could add something like a secondary “search ISOGG tree” function as a customer courtesy, even if it needs some disclaimer verbiage as to the provisional nature of the tree.
  4. With the number of SNPs being searched for and reported, no-calls begin to become an issue, especially if the no-call happens to be on the terminal SNP. We need to be able to determine whether a non-match with someone is actually a non-match or could be the result of a no-call, and without resorting to searching raw data files. Today, participants can order a SNP test of a SNP position that has been reported as a no-call, but one needs to first figure out that it is a no-call by looking at the BAM and BED files, something that is beyond the capability of most genetic genealogists. Furthermore, in the case of a “suspicious” no-call, where, for example, individuals in the same surname project with the same surname and other matching SNPs and STRs fail to match on one SNP, some type of “smart-matching” needs to be put into place to alert the participant and project admin of this situation so that they can decide upon a proper course of action. In other words, no-calls need to be reported and accounted for in some fashion, as they are important data points for the genetic genealogist.

I am extremely grateful to Family Tree DNA for their efforts and for Big Y matching.  After all, matching is the backbone of genetic genealogy.  This list is not a complaint list, in any sense.  Family Tree DNA has a very long history of being responsive to their client base and I fully expect they will do the same with the next step in the Big Y journey.

The story of our DNA is not yet told.  Where our STR matches are found and where our SNP matches are found tells the story of the migration of our ancestors.  Today, SNPs and STRs promise to overlap, and already have in some cases.  If I could, I would order a Big Y test for every individual that I sponsor and for every person in each of my projects. I feel that these tests, combined, will help immensely to complete the puzzle to which we have disparate pieces today.  I look forward to the day when the time to the most recent common ancestor can be calculated by utilizing the Y STR markers, the known SNPs and the Novel Variants.  In a very large sense, the future has arrived today.  Now, we just have to test and figure out how all of the puzzle pieces fit together.

If you haven’t yet ordered a Big Y, you can order here.  The more people who test, the larger the comparison data base, and the sooner we will all have the answers we seek.

Haplogroups, SNPs and Family Group Confusion

The transition at Family Tree DNA from the old haplogroup naming convention to the new SNP-only naming convention has generated a great deal of confusion.  It’s like surgery – had to be done – but it has been painful.

I’ve received several questions, many that are similar, so I’d like to attempt to resolve some of the confusing points here.

First, just a little background.

Ancient History

Remember, in 2008, when Michael Hammer et al rewrote the Y tree?  If you do, then count yourself as an old-timer.  Names such as R1b1c became R1b1a2.  E3a became E1b1a and E3b became E1b1b1.  We thought we were all going to die.  But we didn’t – and now, if I hadn’t just told you, you wouldn’t even be able to remember the previous name of R1b1a2.

Why did this happen?  Because when you have a step-wise tree where each step is given a number and letter, like this, you have no room for expansion.

R

R1

R1a

R1a1

Each of these haplogroup names is assigned a SNP, and when a new SNP is discovered between R and R1, for example, the name R1 gets assigned to the new SNP and everyone downstream gets renamed and/or a new SNP assigned.  If you think this is confusing, it is and was – terribly so.  In fact, as testimony to this, the last version of the FTDNA tree, the ISOGG tree and the tree used by 23andMe are entirely out of sync with each other.

With the shift from about 800 SNPs to 12,000 SNPs with the Geno2.0 chip, it was definitely time to redo and rethink how haplogroup names are assigned.  What seemed initially like a great idea turned out not to be when the magnitude of the number of SNPs that actually exist was realized.  In reality, they needed to be obsoleted, but the familiar cadence of the letter number path will forever be gone – with the exception of the fact that the SNP is prefaced with the haplogroup name.  We will no longer have our signposts, sadly, but our signposts were becoming overwhelmingly long.  Here’s one example I copied from the ISOGG tree.  R1b1a2a1a1c2b2a1a1b2a1a – seriously – I can’t remember that.

So, today, and forever more, R1b1a2 will be R-M269.  It will not be shifted or “become” anything else.  Moving a SNP to a new location becomes painless, because it will not affect anything upstream or downstream.

However, as you get used to this new beast, you’re going to want to refer to “what something was” before.  You’ll find that articles, papers and who knows what else will refer to the haplogroup name – and you’ll need a conversion reference.

Here’s a link to that reference.  I don’t know about you, but I copied this and created a .pdf file in case this reference disappears – not that that ever happens in the electronic world.

Why the Confusion?

Within projects, men with the same surname now have different haplogroups assigned, and the SNP names look entirely different.  Before, if most of the surname group was R1b1a2, and one person had SNP tested at a deeper level and showed R1b1a2a1a1b4, it was easy to tell by looking that R1b1a2a1a1b4 fell underneath R1b1a2, and was a subclade.  Today, with the new tree, everyone that was R1b1a2 is now shown as R-M269 and the lone R1b1a2a1a1b4 person is shown as R-L21.  You can’t tell by looking if R-L21 is a subclade of R-M269 or the other way around.  Add another few SNP tests at different levels into the mix, and you have one confused administrator.
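
Since the names no longer carry the path, the relationship has to be looked up in a tree. Here is a minimal Python sketch of that lookup. The parent table is a toy I made up for illustration: it holds only the relationships mentioned in this article and omits every intermediate branch, so it is not the real haplotree.

```python
# Toy parent table (an assumption for illustration, heavily simplified).
# Each SNP-named haplogroup points to the branch directly above it.
PARENT = {
    "R-L21": "R-M269",   # intermediate branches omitted
    "R-M269": "R",
    "I-M253": "I",
}


def is_downstream(snp, ancestor, parent=PARENT):
    """True if `snp` sits somewhere below `ancestor` in the tree."""
    current = parent.get(snp)
    while current is not None:
        if current == ancestor:
            return True
        current = parent.get(current)
    return False


print(is_downstream("R-L21", "R-M269"))   # R-L21 is a subclade of R-M269
print(is_downstream("R-M269", "R-L21"))   # not the other way around
print(is_downstream("I-M253", "R-M269"))  # a different haplogroup entirely
```

With the old names, the string itself answered this question. With SNP names, something like this table, or a chart from your project administrator, has to answer it instead.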

One thing hasn’t changed.  Notice the haplogroup I-M253 individual in the purple group below.  There is a note that their parentage is uncertain.  Given the completely different haplogroup – this individual does not fit into any groups of Estes males biologically.  So completely different haplogroups are still exclusive, meaning you can tell at a glance that these folks do not share a common ancestor, even though their genealogy says that they should.

estes project cropped

Ok, got that now?  Good, because it gets more confusing.

Family Tree DNA did not do a one to one conversion, meaning they did not create a conversion table where R1b1a2=R-M269.  They did an entirely new prediction routine.  This makes sense, because they don’t hard code the haplogroup – it’s fluid and based on either a hard and fast SNP test or a prediction routine. This also allows for easy future improvements, and they utilize 37 markers for haplogroup predictions now instead of just 12, in most cases.

Unfortunately, or fortunately, the prediction routine produces different results for people within the same family group, based on STR marker results and how many STRs are tested.

What this means is that different people in the same family line will have different haplogroup predictions, as you can see in the groups above of individuals all descended from one male, Abraham Estes.

This isn’t wrong, as in incorrect, but it is confusing, especially when you’re used to seeing everyone who has not been SNP tested have a matching haplogroup within families.

Enter the Terminal SNP

The terminal SNP is your SNP that is furthest down the tree based on the SNPs that you have tested.  That second part is really important – based on the SNPs that you have tested.

When you’re looking at your matches, you can see their terminal SNP in the column below to the right, but what you can’t tell is if they have tested for any downstream SNPs and were found negative.

Estes match cropped

For example, if you have tested positive for R-M269 (formerly R1b1a2) and someone else that you match is R-L21, which is downstream of R-M269 – this does not exclude them as valid matches, UNLESS the first R-M269+ gentleman has actually tested for R-L21 and is negative.  You, of course, have no way of knowing this without asking the other participant.
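
That rule can be written down as a short Python sketch. Everything here is hypothetical: the function names, the toy parent table (simplified, with intermediate branches omitted), and the idea of passing in a set of SNPs a man has tested negative for, which in real life you would only learn by asking him. Treating two unrelated branches as "cannot share a line" is also my simplifying assumption.

```python
# Toy parent table (an assumption for illustration, heavily simplified).
PARENT = {
    "R-L21": "R-M269",   # intermediate branches omitted
    "R-M269": "R",
    "I-M253": "I",
}


def path_between(descendant, ancestor, parent=PARENT):
    """SNPs from `descendant` up to, but not including, `ancestor`.

    Returns None if `descendant` is not at or below `ancestor`.
    """
    path = []
    current = descendant
    while current is not None and current != ancestor:
        path.append(current)
        current = parent.get(current)
    return path if current == ancestor else None


def may_share_line(terminal_a, terminal_b,
                   negatives_a=frozenset(), negatives_b=frozenset(),
                   parent=PARENT):
    """Could two men with these terminal SNPs belong to the same line?"""
    below_a = path_between(terminal_b, terminal_a, parent)
    if below_a is not None:
        # B is at or downstream of A: fine unless A tested negative there.
        return not any(snp in negatives_a for snp in below_a)
    below_b = path_between(terminal_a, terminal_b, parent)
    if below_b is not None:
        return not any(snp in negatives_b for snp in below_b)
    return False  # different branches of the tree


print(may_share_line("R-M269", "R-L21"))
print(may_share_line("R-M269", "R-L21", negatives_a={"R-L21"}))
print(may_share_line("I-M253", "R-M269"))
```

The second call is the UNLESS case: once the R-M269 man is known to be negative for R-L21, the two results are no longer compatible.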

Also, testing “negative” is a bit subjective, because there are known no-calls in the Geno 2.0 results – so if the Geno 2.0 result did not include the terminal haplogroup you expected, and the outcome is truly important to you, meaning family defining – have that defining SNP, if it’s absent in the Geno 2.0 raw data results, tested individually through regular Sanger sequencing – meaning purchase it separately through Family Tree DNA.  A non-positive result in the Geno 2.0 results is typically interpreted to mean negative, but that is not always the case.  In most situations, if everything else matches, meaning surname, STRs and other SNPs, it’s not necessary to test the SNP separately – but it is available if you need to know, positively.

Secondly, the terminal SNP on the new Family Tree DNA haplotree and in your results, if you have taken the Big Y, the Walk Through the Y or purchased individual SNPs, may be different.  Why, and how would you know?

The why is because Family Tree DNA has synced to the Geno 2.0 tree at this point, and there have been many new SNPs discovered since the Geno 2.0 tree was developed in 2012.  The ISOGG tree is more current, but keep in mind that it is a provisional tree.  However, you still need to have a way to determine your terminal SNP beyond the Geno 2.0 criteria if you have had advanced testing.

There were originally some tools created by individuals to help with this dilemma, but both tools appear to no longer work.  Kitty Cooper blogged about this, and was apparently recently successful, but I was not.  I downloaded the updated version of the Big Y Chromosome extension that I wrote about and was using the Morley tree but that no longer functions either.  Let’s just say that the word frustrated doesn’t even begin to apply….

My suggestion is to work closely with your haplogroup and surname project administrator(s).  Many of the administrators have put together provisional charts and the haplogroup project pages are grouped by SNP groupings with suggestions for additional relevant testing.

The U106 project is a great example of proactive administrators.  Individual participants are clearly categorized and the categories suggest an appropriate “next step.”  Looking at their home page, the administrators make themselves readily available to project members for consulting about how to proceed.

u106 project

Yes, all of this change is a bit fuzzy right now, but give it a bit of time and the fog will clear.  It did in 2008 and we all survived.

Tree Updates

Family Tree DNA has committed to at least one more tree update this year, and let’s hope that it includes all of the SNPs in the reference data base they are using for the Big Y.

I’ll be talking about Big Y comparisons in a future article.

Charles Campbell (c1750 – c1825) and the Great Warrior Path – 52 Ancestors #19

When I discovered that I was going to be visiting Scotland in the fall of 2013, I couldn’t bypass the opportunity to visit the seat of the Clan Campbell.

Campbell isn’t my maiden name, but it was the maiden name of my ancestor, Elizabeth Campbell born about 1802 who married in about 1820, probably in Claiborne County, TN, to Lazarus Dodson, born about 1795.  Elizabeth’s father was John Campbell, born 1772-1775 in Virginia and her mother was Jane “Jenny” Dobkins.  John’s brother is believed to be George Campbell, born around 1770-1771.  We are fairly certain that their father was one Charles Campbell who died before May 31, 1825 in Hawkins County, Tennessee when a survey for his neighbor mentions the heirs of Charles Campbell.

Charles Campbell was in Hawkins County by about 1788.  A Charles Campbell was mentioned in Sullivan County, the predecessor of Hawkins, as early as 1783, but we don’t know if it’s the same man.  The history of Charles Campbell’s Hawkins County land begins in 1783 when it was originally granted to Edmond Holt.

1783, Oct 25, 440 (pg 64 Tn Land Entries John Armstrong’s office) – Edmond Holt enters 300 ac on the South side of Holston river near the west end of Bays Mountain, includes a large spring near the mountain and runs about, includes Holt’s improvement at an Indian old War Ford, warrant issued June 7, 1784, grant to Mark Mitchell.

Hawkins view of Campbell land

This photo shows the area of Dodson’s creek from across the Holston River atop a high hill.  Dodson’s Creek, today, is located beside the TVA power plant.  In this photo, Dodson’s Creek would be just slightly to the right of the power plant in the distance.  You can’t see the Holston River in this photo, but it is just in front of the power plant.  This is a good representation of the rolling mountains of this region.  I stayed in this house for nearly a week while doing research in Hawkins County before realizing that the land I was looking at, daily, out the back door, off of the porch swing, was the land of both my Campbell and Dodson ancestors.  Talk about a jolting moment.

The Old War Ford is the crossing of the Holston River at the mouth of Dodson Creek where the Indians used to camp and cross, on the Great Warrior Path.

Indian war path

My cousin helped me locate the Great Warrior Path crossing and I took the  photos below during a visit to locate the Dodson and Campbell lands.

1790, May 26 – Mark Mitchell to Charles Campbell 100# Virginia money, Dodson’s Ck, Beginning at a synns on the nw side Bays mountain thence on Stokely Donelson’s, north 60 then west 218 poles to a small black and post oak on a flat Hill then south 30 west 219 to two white oaks in a flat, then s 60 east 218 poles to a stake then north 30 east 219 poles along Bays Mountain to the beginning containing 300 acres. Signed, wit John (I) Owen mark, William Wallen, George Campbell mark (kind of funny P), R. Mitchell (it appears that this transaction actually took place in 1788, but wasn’t registered until later.) south side of the Holston on the west fork of Dodson Creek.

Today, the road that originally led to the ford of the Holston River dead ends into a road and the part of the road that was the “ford” is gone.  A field exists in its place, and a historical marker, and that’s it.  Not even any memories as the ford was no longer needed when bridges were built, and by now, there have already been several generations of bridges.

old war ford

Here’s the field.  The trees grow along the river and help to control erosion from flooding today.  Walking up to the area, you can see the actual ford area, although there is nothing to give away the fact that this used to be a ford of the river.  The locals say there is bedrock here.

old war ford 2

This area is flood plain, so one would not live here.  The old cemetery where we believe Raleigh Dodson is buried is across the current road and up the hill.  The land where we think Charles Campbell lived is just up Dodson Creek from this area as well, but on somewhat higher ground.

Possible Campbell land

I believe this is or is very near the current day location of the Charles Campbell land.  Dodson Creek runs adjacent the road, and you have to cross the creek to get to the farmable land from the road.  You can see the makeshift bridge above.

Beautiful pool at the bend in Dodson Creek where it leaves the road.

Dodson Creek is beautiful and lush.

Dodson Creek 2

1793/1794 – Charles Campbell to George and John Campbell, all of Hawkins County, for 45#, 150 acres on the south side of the Holston, west fork of Dodson Ck beginning at 2 white oaks then (metes and bounds), signed, John Payne witness.

1802, Feb 26 – George Campbell and John Campbell of Hawkins County to Daniel Leyster (Leepter?, Seyster, Septer) of same, 225# tract on west fork of Dodson’s Creek being same place where said John Campbell now lives, 149 acres, then (metes and bounds) description. Both sign,  Witness, Charles Campbell, Michael Roark and William Paine.  Proven in May session 1802 by oath of Michael Roark (inferring that the sellers are gone from the area).

Is the difference between 149 and 150 acres a cemetery, a church or a school?

Dodson Creek is where Charles Campbell lived.  This is the Dodson family who John Campbell’s daughter, Elizabeth, would marry into a generation later in Claiborne County.  Dodson Creek was also just a few miles from Jacob Dobkins’ home, whose daughters George and John Campbell would marry.  Jacob Dobkins, George and John Campbell and their Dobkins wives would be in Claiborne County, Tennessee by 1802.

We believe Charles Campbell came from the Augusta or Rockingham County area of Virginia, but we don’t know for sure.  Unfortunately the deed where his heirs conveyed his land is recorded in the court record, but never in the deed book, so we have no idea who his heirs were.  The will of his neighbor, Michael Roark, who was born in Bucks County, PA and then lived in Rockingham Co., VA stated that he bought the land of Charles Campbell from his heirs joining the tract “I live on.”  Charles’ other neighbor was a Grigsby, and so was Michael Roark’s wife. It’s not unlikely that Charles Campbell was related to one or both of these men.  Perhaps the key to finding Charles Campbell back in Virginia is to find both Michael Roark and the Grigsby family as well.

In the 1783 Shenandoah Co., VA, tax list, we find both Charles Campbell and Jacob Dobkins in Alexander Hite’s district. Jacob Dobkins is the father of Jane “Jenny” Dobkins, who would eventually marry John Campbell, and of her sister, Elizabeth Dobkins, who would marry George Campbell, believed to be the brother of John Campbell.

Several years ago, we DNA tested a male Campbell descendant of John and another of George and confirmed that these lines do match each other as well as the Campbell clan line from Scotland.  Descendants of both men also match autosomally as cousins, further confirming that John and George were most likely brothers.  This was good news, because even though we don’t know the names of Charles’ ancestors, thanks to DNA, we still know the history of those ancestors before they immigrated, probably in the early 1700s with the first waves of the Scotch-Irish.

So, for me, the opportunity to visit the clan seat and personally meet Torquhil Campbell, the current Duke of Argyll, the 26th chief of the Clan Campbell and the 12th Duke of Argyll, was literally the chance of a lifetime.

The Duke, Torquhil Campbell, is much different from other aristocracy.  He lives at Inveraray Castle, the clan seat, but parts of the castle are open to the public.  In addition, the castle is his actual full time residence and he actively manages the estate, including signing books about Inveraray in the gift shop in the castle.


You can’t miss him if he’s there, as he has on an apron that says “Duke.”  He’s a lot younger than I expected as well, born in 1968, but extremely gracious and welcoming.  There must be tens of thousands of Campbell descendants, and many probably make their way back to Inveraray like the butterflies return to Mexico every winter.

While I was visiting Inveraray, I purchased two books about the clan Campbell and a third, written by the Duke himself, about Inveraray. The Campbell clan origins are shrouded in myth and mists, as you might imagine, but let me share them with you anyway.

Campbell coat of arms

The first origin story, from a book called “Campbell, The Origins of the Clan Campbell and Their Place in History” by John Mackay, says:

“The first Campbells were a Scots family who crossed from Ireland to the land of the Picts.  The Clan Campbell originated from the name O’Duibhne, one of whose chiefs in ancient times was known as Diarmid and the name Campbell was first used in the 1050s in the reign of Malcolm Canmore after a sporran-bearer or purse-bearer to the king previously called Paul O’Duibhne was dubbed with his new surname.

Historians after such obscure and legendary times, have agreed that the clan name comes from the Gaelic ‘cam’ meaning crooked and ‘beul’ meaning the mouth, when it was the fashion to be surnamed from some unusual physical feature, in this case by the characteristic curved or crooked mouth of the family of what is certainly one of the oldest clan names in the Highlands.

It was the Marquis who insisted that he was descended from a Scots family in Ireland who had crossed to what was then mostly the land of Picts to establish the first Scots colony in the district of Dalriada – a comparatively small part of what we know today as Argyll at the heart of what would in time become the kingdom of Scotland.  It is marked by the fort of Dunadd, off the A816, a few miles north of Lochgilphead, set in the inlet called Loch Gilp off from Loch Fyne.”

Loch Fyne is where the current castle of Inveraray, clan seat, is located and where I visited.

The second source is a booklet called “Campbell, Your Clan Heritage,” by Alan McNie, which is condensed from a larger book, Highland Clans of Scotland by George Eyre-Todd, published in 1923.

It says:

“Behind Torrisdale in Kintyre rises a mountain named Ben an Tuire, the “Hill of the Boar.”  It takes its name from a famous event in Celtic legend.  There, according to tradition, Diarmid O’Duibhne slew the fierce boar which had ravaged the district.  Diarmid was of the time of the Ossianic heroes.

Diarmid is said to have been the ancestor of the race of O’Duibhne who owned the shores of Loch Awe, which were the original Oire Gaidheal, or Argyll, the “Land of the Gael.”

The race is said to have ended in the reign of Alexander III in an heiress, Eva, daughter of Paul O’Duibhne, otherwise Paul of the Sporran, so named because, as the king’s treasurer, he was supposed to carry the money-bag.  Eva married a certain Archibald or Gillespie Campbell, to whom she carried the possession of her house.  This tradition is supported by a charter of David II in 1368 which secured to Archibald Campbell of that date certain lands of Loch Awe ‘as freely as these were enjoyed by his ancestor, Duncan O’Duibhne.’

Who the original Archibald Campbell was remains a matter of dispute.  By some he is said to have been a Norman knight by the name of De Campo Bello.  The name Campo Bello, however, is not Norman but Italian.  It is out of all reason to suppose that an Italian ever made his way into the Highlands at such a time to secure a footing as a Highland Chief.”

This book then goes on to recite the “crooked mouth” story as well.

A third origin story is recorded in the book written by the current Duke, himself, “Inveraray Castle, Ancestral Home of the Dukes of Argyll.”  In this book, the Duke says:

“The Campbells, thought to be of British stock, from the Kingdom of Strathclyde, probably arrived in Argyll as part of a royal expedition in circa 1220.  They settled on Lochaweside where they were placed in charge of the king’s land in the area.

The Chief of Clan Campbell takes his Gaelic title of ‘MacCailein Mor’ from Colin Mor Campbell – ‘Colin the Great’ – who was killed in a quarrel with the MacDougalls of Lorne in 1296.

His son was Sir Neil Campbell, boon companion and brother-in-law to King Robert the Bruce, whose son, Sir Colin was rewarded in 1315 by the grant of the lands of Lochawe and Ardscotnish of which he now became Lord.

From Bruce’s time at least, their headquarters had been at the great castle of Innischonnell, on Loch Awe.   Around the mid 1400s, Sir Duncan Campbell of Lochawe, great-grandson of Sir Colin, moved his headquarters to Inveraray, controlling most of the landward communications of Argyll.”

From the Campbell DNA Project website, we find this pedigree chart of the Clan Campbell, beginning with the present Duke at the bottom.

Campbell pedigree

Let’s see if Y chromosome DNA results can tell us about the Campbell Clan history.

Originally, the DNA testing told us that the Campbell men were R1b1.  The predicted haplogroup was R1b1a2, now known as R-M269, but some of the Campbell men who have tested further are haplogroup R1b1a2a1b4, or R-L21.

Looking at my cousin’s matches map at 37 markers, below, the Campbell men cluster heavily around the Loch Lomond/Greenock region, which is very close to the traditional Campbell seat of Inveraray.

Campbell cluster

At 12 markers, the cluster near Greenock, slightly northwest of Glasgow, is quite pronounced.  Most of these matches are Campbell surnames.

Campbell Greenock cluster

Another item of interest is that several men in this cluster have tested for SNP L1335.  This is the SNP that Jim Wilson announced is an indicator of Pictish heritage, although it is widely thought that this was a marketing move with little solid data behind it.  Otherwise, Jim Wilson, a geneticist, would surely be publishing academically, not via press announcements from a company that has previously damaged their own credibility, several times.

Regardless, our Campbell group tested positive for this SNP.  I contacted Kevin Campbell, the Campbell DNA project administrator, who is equally cautious about the Pictish label, but we both agree that this marker indicates ancient, “indigenous Scots,” and yes, they could be Picts.  Time will tell!

In the next few days, I’ll be writing about my visit to Inveraray.  I hope you’ll join me!

2014 Y Tree Released by Family Tree DNA

On April 25th, DNA Day and Arbor Day, Family Tree DNA updated and released their 2014 Y haplotree, created in partnership with the Genographic project.  This has been a massive project, expanding the tree from about 850 SNPs to over 6200.  About 1200 of those are “terminal,” meaning the end of a branch, and the rest have proven to be duplicates, that is, synonymous SNPs that sit on the same branches.

If you’re a newbie, this would be a good place perhaps to read about what a haplogroup is and the new Y naming convention which replaces the well-known group names like R1b1a2 with the SNP shorthand version of the same haplogroup name, R-M269.  From this time forward, the haplogroups will be known by their SNP names and the longhand version is obsolete, although you will always see it in older documents, articles and papers.  In fact, this entire tree has been made possible by SNP testing by both academic organizations and consumers.  To understand the difference between regular STR marker testing and SNP testing, click here.

I’ve divided this article into two parts.  The first part is the “what did they do and why” part and the second is the “what does it mean to you” portion.

This tree update has been widely anticipated for some time now.  We knew that Family Tree DNA was calibrating the tree in partnership with the Genographic project, but we didn’t know what else would be included until the tree was released.

What Did Family Tree DNA Do, and Why?

Janine Cloud, the liaison at Family Tree DNA for Project Administrators, has provided some information as to the big picture.

“First, we’re committed to the next iteration of the tree and it will be more comprehensive, but we’re going to be really careful about the data we use from other sources. It HAS to be from raw data, not interpreted data. Second, I’ve italicized what I think is really the mission statement for all the work that’s been done on this tree and that will be done in the future.”

Janine interviewed Elliott Greenspan of Family Tree DNA about the new tree, and here are some of the salient points from that discussion.

“This year we’re committing to launching another tree. This tree will be more comprehensive, utilizing data from external sources: known Sanger data, as well as data such as Big Y, and if we have direct access to the raw data to make the proof (from large companies, such as the Chromo2) or a publication, or something of that nature. That is our intention that it be added into the data.

We’re definitely committed to update at least once per year. Our intention is to use data from other sources, as well as any SNPs we can, but it must be well-vetted. NGS and SNP technology inherently has errors. You must curate for those errors otherwise you’re just putting slop out to customers. There are some SNPs that may bind to the X chromosome that you didn’t know. There are some low coverages that you didn’t know.

With technology such as this you’re able to overcome the urge to test only what you’re likely to be positive for, and instead use the shotgun method and test everything. This allows us to make the discovery that SNPs are not nearly as stable as we thought, and they have a larger potential use in that sense.

Not only does the raw data need to be vetted but it needs to make sense.  Using Geno 2.0, I only accepted samples that had the highest call rate, not just because it was the best quality but because it was the most data. I don’t want to be looking at data where I’m missing potential information A, or I may become confused by potential information B.  That is something that will bog us down. When you’re looking at large data sets, I’d much rather throw out 20% of them because they’re going to take 90% of the time than to do my best to get 1 extra SNP on the tree or 1 extra branch modified, that is not worth all of our time and effort. What is, is figuring out what the broader scope of people are, because that is how you break down origins. Figuring one single branch for one group of three people is not truly interesting until it’s 50 people, because 50 people is a population. Three people may be a family unit.  You have to have enough people to determine relevance. That’s why using large datasets and using complete datasets are very, very important.

I want it to be the most accurate tree it can be, but I also want it to be interesting. That’s the key. Historical relevance is what we’re to discover. Anthropological relevance. It’s not just who has the largest tree, it’s who can make the most sense out of what you have is important.”

Thanks to both Janine and Elliott for providing this information.

What is Provided in the Update?

The genetic genealogy community was hopeful that the new 2014 tree would be comprehensive, meaning that it would include not only the Genographic SNPs, but ones from Walk the Y, perhaps some Chromo2, Full Genomes results and the Big Y.  Perhaps we were being overly optimistic, especially given the huge influx of new SNPs, the SNP tsunami as we call it, over the past few months.  Family Tree DNA clearly had to put a stake in the sand and draw the line someplace.  So, what is actually included, how did they select the SNPs for the new tree and how does this integrate with the Genographic information?  This information was provided by Family Tree DNA.

Family Tree DNA created the 2014 Y-DNA Haplotree in partnership with the National Geographic Genographic Project using the proprietary GenoChip. Launched publicly in late 2012, the chip tests approximately 10,000 Y-DNA SNPs that had not, at the time, been phylogenetically classified.

The team used the first 50,000 male samples with the highest quality results to determine SNP positions. Using only tests with the highest possible “call rate” meant more available data, since those samples had the highest percentage of SNPs that produced results, or “calls.”
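To make the “call rate” idea concrete, here is a minimal Python sketch.  It is not Family Tree DNA’s actual pipeline.  The kit names, the result values, the use of None for a no-call and the top_n cutoff are all assumptions made up for illustration.

```python
def call_rate(calls):
    """Fraction of tested SNPs that produced a result (a "call")."""
    if not calls:
        return 0.0
    called = sum(1 for result in calls.values() if result is not None)
    return called / len(calls)


def best_samples(samples, top_n):
    """Return the names of the top_n samples, ranked by call rate (highest first)."""
    ranked = sorted(samples, key=lambda name: call_rate(samples[name]), reverse=True)
    return ranked[:top_n]


# Hypothetical data: SNP name -> "+" (derived), "-" (ancestral) or None (no call).
samples = {
    "kit_A": {"M269": "+", "L21": "+", "L1335": None, "SNP_X": "-"},
    "kit_B": {"M269": "+", "L21": "+", "L1335": "+", "SNP_X": "-"},
    "kit_C": {"M269": None, "L21": None, "L1335": "+", "SNP_X": None},
}

rates = {name: call_rate(calls) for name, calls in samples.items()}
selected = best_samples(samples, top_n=2)
print(rates)
print(selected)
```

Ranking by call rate and keeping only the top samples is the same trade-off Elliott describes above: a sample with many no-calls contributes less data and takes more effort to interpret.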

In some cases, SNPs that were on the 2010 Y-DNA Haplotree didn’t work well on the GenoChip, so the team used Sanger sequencing on anonymous samples to test those SNPs and to confirm ambiguous locations.

For example, if it wasn’t clear if a clade was a brother (parallel) clade, or a downstream clade, they tested for it.

The scope of the project did not include going farther than SNPs currently on the GenoChip in order to base the tree on the most data available at the time, with the cutoff for inclusion being about November of 2013.

Where data were clearly missing or underrepresented, the team curated additional data from the chip where it was available in later samples. For example, there were very few Haplogroup M samples in the original dataset of 50,000, so to ensure coverage, the team went through eligible Geno 2.0 samples submitted after November, 2013, to pull additional Haplogroup M data. That additional research was not necessary on, for example, the robust Haplogroup R dataset, for which they had a significant number of samples.

Family Tree DNA, again in partnership with the Genographic Project, is committed to releasing at least one update to the tree this year. The next iteration will be more comprehensive, including data from external sources such as known Sanger data, Big Y testing, and publications. If the team gets direct access to raw data from other large companies’ tests, then that information will be included as well. We are also committed to at least one update per year in the future.

Known SNPs will not intentionally be renamed. Their original names will be used since they represent the original discoverers of the SNP. If there are two names, one will be chosen to be displayed and the additional name will be available in the additional data, but the team is taking care not to make synonymous SNPs seem as if they are two separate SNPs. Some examples of that may exist initially, but as more SNPs are vetted, and as the team learns more, those examples will be removed.

In addition, positions or markers within STRs, as they are discovered, or large insertion/deletion events inside homopolymers, potentially may also be curated from additional data because the event cannot accurately be proven. A homopolymer is a sequence of identical bases, such as AAAAAAAAA or TTTTTTTTT. In such cases it’s impossible to tell which of the bases the insertion is, or if/where one was deleted. With technology such as Next Generation Sequencing, trying to get SNPs in regions such as STRs or homopolymers doesn’t make sense because we’re discovering non-ambiguous SNPs that define the same branches, so we can use the non-ambiguous SNPs instead.
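For readers who like to see the idea in code, here is a small Python sketch of how homopolymer runs could be located so that positions inside them can be set aside.  The minimum run length of 5 and the example sequence are assumptions for illustration only.  Family Tree DNA has not published the thresholds they use.

```python
import re


def homopolymer_runs(sequence, min_length=5):
    """Return (start, end, base) for each run of one repeated base.

    start is 0-based and end is exclusive, like Python slices.
    """
    # One base followed by at least (min_length - 1) copies of itself.
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_length - 1))
    return [(m.start(), m.end(), m.group(1)) for m in pattern.finditer(sequence.upper())]


def in_homopolymer(position, runs):
    """True if a 0-based position falls inside any of the given runs."""
    return any(start <= position < end for start, end, _base in runs)


# Made-up sequence containing the two runs used as examples in the text.
seq = "GATTAC" + "A" * 9 + "GC" + "T" * 9 + "CG"
runs = homopolymer_runs(seq)
print(runs)
print(in_homopolymer(10, runs), in_homopolymer(3, runs))
```

A candidate SNP or insertion/deletion reported at a position where in_homopolymer is true would be one of the ambiguous cases described above, since the bases in the run are indistinguishable from each other.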

Some SNPs from the 2010 tree have been intentionally removed. In some cases, those were SNPs for which the team never saw a positive result, so while it may be a legitimate SNP, even haplogroup defining, it was outside of the current scope of the tree. In other cases, the SNP was found in so many locations that it could cause the orientation of the tree to be drawn in more than one way. If the SNP could legitimately be positioned in more than one haplogroup, the team deemed that SNP to not be haplogroup defining, but rather a highly polymorphic location.

To that end, SNPs no longer have .1, .2, or .3 designations. For example, J-L147.1 is simply J-L147, and I-L147.2 is simply I-L147.  Those SNPs are positioned in the same place, but back-end programming will assign the appropriate haplogroup using other available information such as additional SNPs tested or haplogroup origins listed. If other SNPs have been tested and can unambiguously prove the location of the multi-locus SNP for the sample, then that data is used. If not, matching haplogroup origin information is used.
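The suffix change and the placement rule can be sketched in a few lines of Python.  This is only my illustration of the logic described above, not Family Tree DNA’s back-end code.  The function names and the candidate table are hypothetical.

```python
import re

# Matches a trailing ".1", ".2", ".3" and so on at the end of a SNP name.
_LOCUS_SUFFIX = re.compile(r"\.\d+$")


def strip_locus_suffix(name):
    """Drop the old multi-locus suffix: 'J-L147.1' becomes 'J-L147'."""
    return _LOCUS_SUFFIX.sub("", name)


def place_multi_locus(candidates, other_positive_snps):
    """Pick the haplogroup for a multi-locus SNP from other positive SNPs.

    candidates maps each possible haplogroup to a set of SNPs that would
    confirm it.  Returns the haplogroup if exactly one candidate is
    supported, otherwise None (meaning: fall back to haplogroup origins).
    """
    supported = [hap for hap, markers in candidates.items()
                 if markers & other_positive_snps]
    return supported[0] if len(supported) == 1 else None


# Hypothetical confirmation table for a SNP that occurs in both J and I.
candidates = {"J": {"M304"}, "I": {"M170"}}

print(strip_locus_suffix("J-L147.1"))
print(place_multi_locus(candidates, {"M170", "L147"}))
print(place_multi_locus(candidates, {"L147"}))
```

When no other tested SNP settles the question, the function returns None, which corresponds to the fallback in the text where matching haplogroup origin information is used instead.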

We will also move to shorthand haplogroup designations exclusively. Since we’re committing to at least one iteration of the tree per year, using longhand that could change with each update would be too confusing.  For example, Haplogroup O used to have three branches: O1, O2, and O3. A SNP was discovered that combined O1 and O2, so they became O1a and O1b.
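As a rough sketch of why the shorthand is more stable, the shorthand name only needs the major haplogroup letter plus the terminal SNP, while the longhand string changes whenever branches above it are rearranged.  The helper below is hypothetical and simply assumes that the leading capital letters of the longhand name are the major haplogroup.

```python
import re


def shorthand(longhand, terminal_snp):
    """Build a shorthand name such as 'R-M269' from a longhand name and its terminal SNP."""
    match = re.match(r"[A-Z]+", longhand)
    if match is None:
        raise ValueError("longhand name must start with a haplogroup letter: %r" % longhand)
    return "%s-%s" % (match.group(0), terminal_snp)


# Both longhand names below appear earlier in this article.
print(shorthand("R1b1a2", "M269"))
print(shorthand("R1b1a2a1b4", "L21"))
```

If a future tree update renumbers the longhand path, only the first argument changes.  The shorthand result stays the same as long as the terminal SNP does.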

There are over 1200 branches on the 2014 Y Haplogroup tree, as compared to about 400 on the 2010 tree. Those branches contain over 6200 SNPs, so we’ve chosen to display select SNPs as “active” with an adjacent “More” button to show the synonymous SNPs if you choose.

In addition to the Family Tree DNA updates, any sample tested with the Genographic Project’s Geno 2.0 DNA Ancestry Kit, then transferred to FTDNA will automatically be re-synched on the Geno side. The Genographic Project is currently integrating the new data into their system and will announce on their website when the process is complete in the coming weeks.  At that time, all Geno 2.0 participants’ results will be updated accordingly and will be accessible via the Genographic Project website.

In summary:

  • Created in partnership with National Geographic’s Genographic Project
  • Used GenoChip containing ~10,000 previously unclassified Y-SNPs
  • Some of those SNPs came from Walk Through the Y and the 1000 Genomes Project
  • Used first 50,000 high-quality male Geno 2.0 samples
  • Verified positions from 2010 YCC by Sanger sequencing additional anonymous samples
  • Filled in data on rare haplogroups using later Geno 2.0 samples

Statistics

  • Expanded from approximately 400 to over 1200 terminal branches
  • Increased from around 850 SNPs to over 6200 SNPs
  • Cut-off date for inclusion for most haplogroups was November 2013

Total number of SNPs broken down by haplogroup

A: 406   |  DE: 16   |  IJ: 29      |  LT: 12  |  P: 81
B: 69    |  E: 1028  |  IJK: 2      |  M: 17   |  Q: 198
BT: 8    |  F: 90    |  J: 707      |  N: 168  |  R: 724
C: 371   |  G: 401   |  K: 11       |  NO: 16  |  S: 5
CT: 64   |  H: 18    |  K(xLT): 1   |  O: 936  |  T: 148
D: 208   |  I: 455   |  L: 129
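As a quick sanity check on the breakdown above, the per-haplogroup counts can be totaled with a few lines of Python.  The numbers are copied from the table.  The variable names are just mine.

```python
# SNP counts per haplogroup, as listed in the table above.
snp_counts = {
    "A": 406, "B": 69, "BT": 8, "C": 371, "CT": 64, "D": 208,
    "DE": 16, "E": 1028, "F": 90, "G": 401, "H": 18, "I": 455,
    "IJ": 29, "IJK": 2, "J": 707, "K": 11, "K(xLT)": 1, "L": 129,
    "LT": 12, "M": 17, "N": 168, "NO": 16, "O": 936, "P": 81,
    "Q": 198, "R": 724, "S": 5, "T": 148,
}

total = sum(snp_counts.values())
largest = max(snp_counts, key=snp_counts.get)
print(total, largest)
```

The total comes to 6318, which agrees with the “over 6200 SNPs” figure, and haplogroup E carries the most SNPs with 1028.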

myFTDNA Interface

  • Existing customers receive free update to predictions and confirmed branches based on existing SNP test results.
  • Haplogroup badge updated if new terminal branch is available
  • Updated haplotree design displays new SNPs and branches for your haplogroup
  • Branch names now listed in shorthand using terminal SNPs
  • For SNPs with more than one name, in most cases the original name for SNP was used, with synonymous SNPs listed when you click “More…”
  • No longer using SNP names with .1, .2, .3 suffixes. Back-end programming will place SNP in correct haplogroup using available data.
  • SNPs recommended for additional testing are pre-populated in the cart for your convenience. Just click to remove those you don’t want to test.
  • SNPs recommended for additional testing are based on 37-marker haplogroup origins data where possible, 25- or 12-marker data where 37 markers weren’t available.
  • Once you’ve tested additional SNPs, that information will be used to automatically recommend additional SNPs for you if they’re available.
  • If you remove those prepopulated SNPs from the cart, but want to re-add them, just refresh your page or close the page and return.
  • Only one SNP per branch can be ordered at one time.  Synonymous SNPs can possibly be ordered from the Advanced Orders section on the Upgrade Order page.
  • Tests taken have moved to the bottom of the haplogroup page.

Coming attractions

  • Group Administrator Pages will have longhand removed.
  • At least one update to the tree to be released this year.
  • Update will include: data from Big Y, relevant publications, other companies’ tests from raw data.
  • We’ll set up a system for those who have tested with other big data companies to contribute their raw data file to future versions of the tree.
  • We’re committed to releasing at least one update per year.
  • The Genographic Project is currently integrating the new data into their system and will announce on their website when the process is complete in the coming weeks. At that time, all Geno 2.0 participants’ results will be updated accordingly and accessible via the Genographic Project website.

What Does This Mean to You?

Your Badge

On your welcome page, your badges are listed.  Your badge previously would have included the longhand form of the haplogroup, such as R1b1a2, but now it shows R-M269.

2014 y 1

Please note that badges are not yet showing on all participants’ pages.  If yours aren’t yet showing, click on the Haplotree and SNP page under the YDNA option on the blue options bar, where your more detailed information is shown, as below.

Your Haplogroup Name

Your haplogroup is now noted only as the SNP designation, R-M269, not the older longhand names.

2014 y 2 v2

Haplogroup R is a huge haplogroup, so you’ll need to scroll down to see your confirmed or predicted haplogroup, shown in green below.

2014 y 3

Redesigned Page

The redesigned haplotree page includes an option to order SNPs downstream of your confirmed or predicted haplogroup.  This refines your haplogroup and helps isolate your branch on the tree.  You may or may not want to do this.  In some cases, this does help your genealogy, especially in cases where you’re dealing with haplogroup R.  For the most part, haplogroups are more historical in nature.  For example, they will help you determine whether your ancestors are Native American, African, Anglo Saxon or maybe Viking.  Haplogroups help us reach back before the advent of surnames.

The new page shows which SNPs are available for you to order from the SNPs on the tree today, shown above, in blue to the right of the SNP branch.

SNPs not on the Tree

Not all known SNPs are on the tree.  Like I said, a line in the sand had to be drawn.  There are SNPs, many recently discovered, that are not on the tree.

To put this in perspective, the new tree incorporates 6200 SNPs (up from 850), but the Big Y “pool” of known SNPs against which Family Tree DNA is comparing those results was 36,562 when the first results were initially released at the end of February.

If you have taken advanced SNP testing, such as the Walk the Y, the Big Y, or tested individual SNPs, your terminal SNP may not be on the tree, which means that your terminal SNP shown on your page, such as R-M269 above, MAY NOT BE ACCURATE in light of that testing.  Why?  Because these newly discovered SNPs are not yet on the tree. This only affects people who have done advanced testing which means it does not affect most people.

Ordering SNPs

You can order relevant SNPs for your haplogroup on the tree by clicking on the “Add” button beside the SNP.

You can order SNPs not on the tree by clicking on the “Advanced Order Form” link available at the bottom of the haplotree page.

2014 y 4

If you’re not sure of what you want to do, or why, you might want to touch base with your project administrators.  Depending on your testing goal, it might be much more advantageous, both scientifically and financially, for you to take either the Geno2 test or the Big Y.

At this point, in light of some of the issues with the new release, I would suggest maybe holding tight for a bit in terms of ordering new SNPs unless you’re positive that your haplogroup is correct and that the SNP selection you want to order would actually be beneficial to you.

Words of Caution

There are some bugs in this massive update.  You might want to check your haplogroup assignment to be sure it accurately reflects any SNP testing you have had done, excepting, of course, the very advanced tests mentioned above.

If you discover something that is inaccurate or questionable, please notify Family Tree DNA.  This is especially relevant for project administrators who are familiar with family groups and know that people who are in the same surname group should share a common base haplogroup, although some people who have taken further SNP testing will be shown with a downstream haplogroup, further down that particular branch of the tree.

What kind of result might you find suspicious or questionable?  For example, if in your surname project, your matching surname cousins are all listed at R-M269 and you were too previously, but now you’re suddenly in a different haplogroup, like E, there is clearly an error.
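For project administrators who keep their own spreadsheet of members, this kind of check is easy to automate.  Here is a minimal Python sketch, assuming you already have a table of kit numbers and shorthand haplogroups for men who match each other within one surname group.  The kit names and haplogroup values are invented for the example.

```python
from collections import Counter


def major_haplogroup(shorthand):
    """Return the part before the dash: 'R-M269' gives 'R'."""
    return shorthand.split("-", 1)[0]


def flag_outliers(group):
    """List kits whose major haplogroup differs from the group's most common one."""
    majors = {kit: major_haplogroup(hap) for kit, hap in group.items()}
    most_common, _count = Counter(majors.values()).most_common(1)[0]
    return sorted(kit for kit, major in majors.items() if major != most_common)


# Hypothetical surname group whose members all match each other on STR markers.
group = {
    "kit_1": "R-M269",
    "kit_2": "R-M269",
    "kit_3": "R-L21",   # downstream of R-M269, so not an error
    "kit_4": "E-M96",   # different major haplogroup, worth reporting
}

suspicious = flag_outliers(group)
print(suspicious)
```

Note that kit_3 is not flagged.  A member who has done further SNP testing will legitimately show a downstream haplogroup on the same branch, so the sketch compares only the major haplogroup letter.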

Any suspected or confirmed errors should be reported to Family Tree DNA.

They have made it very easy by providing a “Feedback” button on the top of the page and there is a “Y tree” option in the dropdown box.

2014 y 5

For administrators providing reports that involve more than one participant, please send to Groups@familytreedna.com and include the kit numbers, the participants’ names and the nature of the issue.

Additional Information

Family Tree DNA provides a free webinar that can be viewed about the 2014 Y Tree release.  You can see all of the webinars that are archived and available for viewing at:  https://www.familytreedna.com/learn/ftdna/webinars/

What’s Next?

The Genographic Project is in the process of updating to the same tree so their results can be synchronized with the 2014 tree.  A date for this has not yet been released.

Family Tree DNA has committed to at least one more update this year.

I know that this update was massive and required extensive reprogramming that affected almost every aspect of their webpage.  If you think about it, nearly every page had to be updated, from the main page to the order page.  The tree is the backbone of everything.  I want to thank the combined Family Tree DNA and Genographic team for their efforts and Bennett Greenspan for making sure this did happen, just as he committed to do in November at the last conference.

Like everyone else, I want everything NOW, not tomorrow.  We’re all passionate about this hobby – although I think it is more of a life mission for many – and surpassed hobby status long ago.

I know there are issues with the tree and they frustrate me, like everyone else.  Those issues will be resolved.  Family Tree DNA is actively working on reported issues and many have already been fixed.

There is some amount of disappointment in the genetic genealogy community about the SNPs not included on the tree, especially the SNPs recently discovered in advanced tests like the Big Y.  Other trees, like the ISOGG tree, do in fact reflect many of these newly discovered SNPs.

There are a couple of major differences.  First, ISOGG has a virtual army of volunteers who are focused on maintaining this tree.  We are all very lucky that they do, and that Alice Fairhurst coordinates this effort and has done so now for many years.  I would be lost without the ISOGG tree.

However, when a change is made to the ISOGG tree, and there have been thousands of changes, adds and moves over the years, nothing else is affected.  No one’s personal page, no one’s personal tree, no projects, no maps, no matches and no order pages.  ISOGG has no “responsibility” to anyone – in other words – it’s widely known and accepted that they are a volunteer organization without clients.

Family Tree DNA, on the other hand, has half a million (or so) paying customers.  Tree changes have a huge domino effect there, not only on their customers’ personal pages, but on their entire website, projects, support and orders.  A change at Family Tree DNA is much more significant than on the ISOGG page.  Not to mention, they don’t have the same army of volunteers, and they have to rely on the raw science, not interpretation, as they said in the information they provided.  A tree update at Family Tree DNA is a very different animal than updating a stand-alone tree, especially considering their collaboration with various scientific organizations, including the National Geographic Society.

I commend Family Tree DNA for this update and thank them for the update and the educational materials.  I’m also glad to see that they do indeed rely only on science, not interpretation.  Frustrating to the genetic genealogist in me?  Sure.  But in the long run, it’s worth it to be sure the results are accurate.

Could this release have been smoother and more accurate?  Certainly.  Hopefully this is the big speed bump and future releases will be much more graceful.  It’s easy to see why there aren’t any other companies providing this type of comprehensive testing.  It’s gone from an easy 12 marker “do we match” scenario to the forefront of pioneering population genetics.  And all within a decade.  It’s amazing that any company can keep up.