Big Y DNA Results Divide and Unite Haplogroup Q Native Americans

One of my long-standing goals has been to resurrect the lost heritage of the Native American people.  By this I mean, primarily, for genealogists who search for and can’t find their Native ancestors.  My blog, www.nativeheritageproject.com, is one of the ways that I contribute towards that end.  Many times, records are buried, don’t exist at all, or don’t reflect anything about Native heritage.  While documents can be somewhat evasive and frustratingly vague, the Y DNA of the male descendants is not.  It’s rock solid.

The Native communities became admixed beginning with the first visits of Europeans to what would become the Americas.  Native people accepted mixed race individuals as full tribal members, based on the ethnicity of the mother.  Adoption also played a key role.  If a female, the mother, was an adopted white child, the mother was considered to be fully Native, as was her child, regardless of the ethnicity of the father.

Therefore, some people who test their DNA expecting to find Native genetics do not – they instead find European or African – but that alone does not mean that their ancestors were not tribal members.  It means that these individuals have to rely on non-genetic records to prove their ancestors’ Native heritage – or they need to test a different line – like the descendants of the mother, through all females, for example, for mitochondrial DNA.

On the other hand, some people are quite surprised when their DNA results come back as Native.  Many have heard a vague story, but often, they don’t have a clue as to which genealogical line, if any, the Native ancestry originated in.  Native ancestry was often hidden because the laws that prevailed at the time sanctioned discrimination of many kinds against people “of color,” and if you weren’t entirely of European origin, you were “of color.”  Many admixed people, as soon as they could, “became” white socially and never looked back.  It was not until recently, the late 20th century, that discrimination had for the most part become a thing of the past and one could embrace their Native or African heritage without fear of legal or social reprisal.

Back in December of 2010, we found the defining SNP that divided haplogroup Q between Europeans and Native Americans.  At the time, this was a huge step forward, a collaboration between testing participants, haplogroup administrators, citizen scientists and Family Tree DNA.

This allowed us to determine who was, and was not, included in Native American haplogroups, but it was also the tip of the iceberg.  You can see below just how much the tree has expanded and its branches have been shuffled.  This is a big part of the reason for the change from haplogroup names like Q1a3 to Q-M346.  For example, at one time or another the SNP M3 was associated with haplogroup names Q1a3a, Q1a3a1 and Q1a3a1a.  On the ISOGG tree below, today M3 is associated with Q1a2a1a1.

isogg q tree

The new Family Tree DNA 2014 tree is shown below for one of the Big Y participants whose terminal SNP is L568, found beneath SNP CTS1780 which is found beneath L4, which is beneath L213 which is beneath L474 which is beneath MEH2 which is beneath L232 which is, finally, beneath M242.

ftdna 2014 q tree

The introduction of the Big Y product from Family Tree DNA, which sequences a large portion of the Y chromosome, provided us with the opportunity to make huge strides in unraveling and deciphering the haplogroup Q (and C, the other male Native haplogroup in the Americas) tree.  I am hopeful that in time, and with enough people taking the Big Y test, we will one day be able to at least sort participants into language and perhaps migration groups.

In November, 2013, we asked for the public and testers to support our call for funds to be able to order several Big Y tests.  The project administrators intentionally did not order tests in family groups, but attempted to scatter the tests to the far corners, so to speak, and to include at least one person from each disparate group we have in the haplogroup Q project, based on STR matches, or lack thereof, and previous SNP testing.

Thanks to the generosity of contributors, we were able to order several tests.  In addition, some participants were able to order their own tests, and did.  Thank you one and all.

The tests are back now, and with the new Big Y SNP matching, recently introduced by Family Tree DNA, comparisons are a LOT easier.

So, of course, I had to see what I could find by comparing the SNP results of the several gentlemen who tested.

To protect the privacy of everyone involved, I have reduced their names to initials.  I have included their terminal SNP as identified at Family Tree DNA as well as any tribal, ethnic or location information we have available for their most distant paternal ancestor.

There are two individuals who believe their ancestors are from Europe, and there is a very large group of European haplogroup Q members, but I’m not convinced that the actual biological ancestors of these two gentlemen are from Europe.  I have included both of these individuals as well. Let’s just say the jury is still out. As a control, I have also included a gentleman who actually lives in Poland.

native match clusters

Of the individuals above, SD, CT and CM are SNP matches.

CD, WJS and WBS are SNP matches with each other.

BG and ETW are also SNP matches to each other.

None of the rest of these individuals have SNP matches.

native snp matches

In the table above, the Non-Matching Known SNPs are shown with the number of Shared Novel Variants.  For example, SD and CT have 4 non-matching SNPs and share 161 Novel Variants and are noted as 4/161.

We can easily tell which of the known SNPs are nonmatching, because they are shown on the participant’s match page.

snp matches page

What we don’t know, and can’t tell, is how many Novel Variants these people share with each other, and how many they might share with the individuals that aren’t shown as matches.

Keep in mind that there may be individuals here that are not shown as matches due to no-calls.  Only people with up to and including 4 non-matching Known SNPs are counted as matches.  If you have the wrong combination of no-calls, or aren’t in the same terminal haplogroup, you may not be shown as a match when you otherwise would be.

The other reason for my intense interest in the Novel Variants is to see if they are actually Novel, as in found only in a few people, or if they are more widespread.

I downloaded each person’s Novel Variants through the Export Utility (blue button to the right at the top of your personal page) and combined the Novel Variants into a single spreadsheet.  I colorized each person’s result rows so that they would be easy to track.  I have redacted their names.  The white row, below, is the individual who lives in Poland.
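For anyone who would rather script this than sort a spreadsheet, the combining step can be sketched in a few lines of Python.  This is only an illustration under stated assumptions: the participant labels and positions are made up, and each person’s export is assumed to have already been reduced to a set of Novel Variant positions (the actual export file layout is not described here).

```python
from collections import defaultdict

# Hypothetical data: participant label -> set of Novel Variant positions.
# Real exports would first have to be parsed into this shape.
novel_variants = {
    "A": {1001, 1002, 1003},
    "B": {1001, 1002},
    "C": {1001, 2001},
}

def carriers_by_position(variants_by_person):
    """Map each position to the sorted list of participants who carry it."""
    carriers = defaultdict(list)
    for person in sorted(variants_by_person):
        for position in variants_by_person[person]:
            carriers[position].append(person)
    return dict(carriers)

def group_by_carrier_set(carriers):
    """Count how many positions are shared by each exact combination of people."""
    counts = defaultdict(int)
    for people in carriers.values():
        counts[tuple(people)] += 1
    return dict(counts)

carriers = carriers_by_position(novel_variants)
clusters = group_by_carrier_set(carriers)

# Print the combinations, largest first, e.g. "A/B/C 1".
for people, n in sorted(clusters.items(), key=lambda kv: -kv[1]):
    print("/".join(people), n)
```

Grouping by the exact combination of carriers is what produces categories such as “all participants” or “only these two people,” which is the same kind of sorting done by hand in the spreadsheet.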

novel variant 1

There are a total of 3506 Novel Variants between these men.  When sorting, many clustered as you would expect.  There is the Algonquian group and what I’ve taken to calling the Borderlands group.  This group has someone whose ancestor was born in VA and two in SC.  I have documentation for the Virginia family having descendants in SC, so that makes sense.  The third group is an unusual combination of the gentleman who believes his ancestors are from Germany and the gentleman whose ancestors are found in a New Mexico Pueblo tribe, but whose ancestor was, likely, based on church records, a detribalized Plains Indian who had been kidnapped and sold.

Clusters that I felt needed some scrutiny, for one reason or another, I highlighted in yellow in the Terminal SNP column.  Obviously the Polish/Pueblo matching needs some attention.

Another very interesting type of match are several where either all or nearly all of the individuals share a Novel Variant – 15 or 16 of 16 total participants.  I don’t think these will remain Novel Variants very long.  They clearly need to be classified as SNPs.  I’m not sure about the process that Family Tree DNA will use to do this, but I’ll be finding out shortly.

Here’s an example where everyone shares this Novel Variant at location 7688075, except the gentleman who lives in Poland, the man who believes his ancestor is from Germany, and the Creek descendant.

novel variant 2

I was very surprised at how many Novel Variants appear in all 16 results of the participants, including the gentleman who lives in Poland – represented by the white row below.

novel variant 3

So, how were the Novel Variants distributed?

Category | # of Variants | Comments
Algonquian Group | 140 | This is to be expected since it’s within a specific group.  Any matches that include people outside the 3 Algonquian individuals are counted in a separate category.  These matches give us the ability to classify anyone who tests with these marker results as provisionally Algonquian.
Borderlands | 83 | This confirms that these three individuals are indeed a “group” of some sort.  This also gives us the ability to classify future participants using these mutations.
All or Nearly All – 15 or 16 Participants | 80 | These are clearly candidates for SNPs, and, given that they are found in the Native and the European groups, they appear to predate the division of haplogroup Q.
Several Native and European, Combined | 45 | This may or may not include the person who lives in Poland.  This group needs additional scrutiny to determine if it actually does exist in Europe, but given that there are more than 3 individuals with each of these Novel Variants, they need to be considered for SNPhood.
Pueblo/NC | 1 |
Poland/Borderlands | 2 |
Mexico/Algonquian | 2 |
German/Pueblo | 9 | I wonder if this person is actually German.
Poland/Mexico | 20 | I wonder if this person’s ancestors are actually from Poland.
Algonquian, NC, Creek | 1 |
Borderland, Mexico, Creek | 1 |
Algonquian/Cherokee | 1 |
All Native, no Euro | 2 |
Algonquian, Borderlands, Mexico, NC | 1 |
Algonquian, Mexico, Borderlands | 1 |
Borderlands, Pueblo | 1 |
Borderlands, Creek, NC | 1 |
Algonquian, Cherokee, Mexico | 3 |
Algonquian, Pueblo, Creek, Borderlands | 1 |
Cherokee, NC | 2 |
Algonquian, Borderlands | 2 |
Borderlands, NC | 1 |
Algonquian, NC | 1 |
Polish/NC | 10 |

Some of this distribution makes me question if these SNP mutations truly are a “once in the history of mankind” kind of thing.  For example, how did the same SNP appear in the Polish person and the NC person, or the Pueblo person, and not in the rest of the Native people?

New SNPs?

So, are you sitting down?

Based on these numbers, it looks like we have at least 125 new SNP candidates for haplogroup Q.  If we count the Algonquian and the Borderlands groups of matches, that number rises to about 250.  This is very exciting.  Far, far more than I ever expected.  Of these SNPs, about half will identify Native people, even Native groupings of people.  This is a huge step forward, a red letter day for Native American ancestry!

SNPs and STRs

Lastly, I wanted to see how the SNP matching compared to STR matching, or if it did at all, for these men.

Only two men match each other on any STR markers.  CD and WJS matched on 12 markers, but not on higher panels.  The TIP calculator estimated their common ancestor at the 50th percentile to be 17 generations, or between 425 and 510 years ago.  We all know how unrealistic it is to depend on the TIP calculator, but it’s the only tool we have in situations like this.
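The 425 to 510 year range is simply 17 generations multiplied by a generation length.  A quick sketch of that conversion follows; the 25 and 30 years per generation are inferred from the figures quoted above and are an assumption, not something the TIP calculator itself reports.

```python
def generations_to_years(generations, years_per_generation=(25, 30)):
    """Convert a generation count to a (low, high) range of years.

    The default 25-30 years per generation is an assumption inferred
    from the 425-510 year range quoted in the text.
    """
    low, high = years_per_generation
    return generations * low, generations * high

print(generations_to_years(17))  # (425, 510)
```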

Given that these are the only two men who do match on STR markers, albeit distantly, in a genealogical timeframe, let’s see what the estimates using the 150 years per SNP mutation comes up with.  This estimate is just that, devised by the haplogroup R-U106 project administrators, and others, based on their project findings.  150 years is actually the high end of the estimate, 98 being the lower end.  Of course, different haplogroups may vary and these results are very early.  Just saying.

CD has 207 high quality Novel Variants.  He shares 188 of those with WJS, leaving 19 unshared Novel Variants.  Utilizing this number, and multiplying by 150, this suggests that, if the 150 years per SNP is anyplace close to accurate, their common ancestor lived about 2850 years ago.  If you presume that both men are incurring mutations at the same rate in their independent lines, then you would divide the number of years in half, so the common ancestor would be more likely 1425 years ago.  If you use 100 years instead of 150, the higher number of years is 1900 and the half number is about 950 years.
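The arithmetic in that paragraph can be laid out as a small sketch.  The function name is made up for illustration, and the years-per-SNP rates are the rough project estimates quoted above, not established constants.

```python
def snp_age_estimate(total_variants, shared_variants, years_per_snp):
    """Return (unshared variants, full estimate, halved estimate) in years.

    The halved figure assumes both lines accumulate mutations at the
    same rate, as described in the text.
    """
    unshared = total_variants - shared_variants
    full = unshared * years_per_snp
    return unshared, full, full / 2

# CD has 207 high quality Novel Variants and shares 188 with WJS.
for rate in (150, 100):
    unshared, full, halved = snp_age_estimate(207, 188, rate)
    print(f"{rate} years/SNP: {unshared} unshared, {full} years, {halved:.0f} halved")
```

At 150 years per SNP this gives 2850 and 1425 years, and at 100 years per SNP it gives 1900 and 950 years, the same figures as above.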

It’s fun to speculate a bit, but until a lot more study has occurred, we won’t be able to reasonably estimate SNP age or age to common ancestor from this information.   Having said all of that, it’s not a long stretch from 710 years to 950 years.

It looks like STR markers are still the way to go for genealogical matching and that SNPs may help to pull together the deeper ancestry, migration patterns and perhaps define family lines.  I hope the day comes soon that I can order the Big Y for lots more project members.  Most of these men do have STR marker matches, and to men with both the same and different surnames.  I’d love to see the Big Y results for those individuals who match more closely in time.

This is still the tip of the iceberg.  There is a lot left to discover!  If you or a family member have haplogroup Q results, please consider ordering the Big Y.  It would make a wonderful gift and a great way to honor your ancestors!

You can also contribute to the American Indian project at this link:

https://www.familytreedna.com/group-general-fund-contribution.aspx?g=AIP

In order to donate to the haplogroup C-P39 project which also includes Native Americans, please click this link:

http://www.familytreedna.com/group-general-fund-contribution.aspx?g=Y-DNAC-P39

______________________________________________________________

Disclosure

I receive a small contribution when you click on some of the links to vendors in my articles. This does NOT increase the price you pay but helps me to keep the lights on and this informational blog free for everyone. Please click on the links in the articles or to the vendors below if you are purchasing products or DNA testing.

Thank you so much.

DNA Purchases and Free Transfers

Genealogy Services

Genealogy Research

Big Y Matching

A few days ago, Family Tree DNA announced and implemented Big Y Matching between participants who have taken the Big Y test.

This is certainly welcome news.  Let’s take a look at Big Y matching, what it means and how to utilize the features.

First, there are really two different groups of people who will benefit from the Big Y tests.

People trying to sort through lines of a common and related surname – like the McDonald or Campbell families, for example – and haplogroup researchers and project administrators.

My own family, for example, is badly brick walled with Charles Campbell first found in Hawkins County, TN in the 1780s.  We know, via STR testing, that indeed, he matches the Campbell Clan from Scotland, but we have no idea who his father might have been.  STR testing hasn’t been definitive enough on Charles’ two known sons’ descendants, so I’m very hopeful that someday enough Campbell men will test that we’ll be able between STR and SNP mutations to at least narrow the possible family lines.  If I’m incredibly lucky, maybe there will be a family line SNP (Novel Variant) and it won’t just narrow the line, it will give me a long-awaited answer by genetically announcing which line was his.  Could I be that lucky???  That’s like winning the genetic genealogy lottery!

For today, the Big Y test at $695 is expensive to run on an entire project of people, not to mention that many of the original participants in projects, the long-time hard-core genealogists, have since passed away.  We are now into our 15th year of genetic genealogy.

For those studying haplogroups, the Big Y is a huge sandbox and those researchers have lost no time whatsoever comparing various individuals’ SNPs, both known and novel, and creating haplogroup trees of those SNPs.  This is done by hand today, or maybe more accurately stated, by Excel.  This is “not fun” to put it mildly.  We owe these folks a huge debt of gratitude.  Their results are curated and posted, provisionally, on the ISOGG Tree.

There is an in-between group as well, and those are people who are working to establish relationships between people of different surnames.  In my case, Native American ancestors whose descendants have different surnames today, but who do share a common ancestor in some timeframe.  That timeframe of course could be anyplace from a couple hundred to several thousand years, since their entry into the Americas across Beringia someplace in the neighborhood of 12-15 thousand years ago.

The Big Y matching is extremely helpful to projects.

Let’s take a look.

Big Y Matches

Big Y landing

On your personal page, under “Other Results,” you’ll see the Big Y results.  Click on “Results” and you’ll see the following page.

big y results

The Known SNPs and Novel Variants tabs have been there since release, but the Matching tab, top left, is new.

By clicking on the Matching tab, you will then see the men you match based on your terminal SNP as determined in the Big Y Known SNPs data base.  You will be matched to men who carry up to and including 4 mutations difference in known SNPs, and unlimited novel variant differences.  If you have a zero in the “Known SNP Difference” column, that means you have no differences at all in known SNPs.
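The listing rule described above can be written down as a tiny sketch.  This is not Family Tree DNA’s code, just the stated threshold expressed in Python with made-up names: up to and including 4 known SNP differences, with the number of novel variant differences playing no part.

```python
MAX_KNOWN_SNP_DIFFERENCE = 4  # threshold as stated in the text

def is_listed_as_match(known_snp_difference,
                       max_difference=MAX_KNOWN_SNP_DIFFERENCE):
    """Return True if a tester would appear on the match list.

    Only the count of non-matching known SNPs is considered; novel
    variant differences are unlimited and therefore ignored here.
    """
    return 0 <= known_snp_difference <= max_difference

for difference in (0, 4, 5):
    print(difference, is_listed_as_match(difference))
```

A difference of 0 or 4 is listed, while a difference of 5 is not, which is why an unlucky combination of no-calls can push a genuine relative off the list.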

big y matches cropped2

The individual being used for an example here has paternal ancestry from Hungary.  His terminal SNP is reported as R-CTS11962.  Therefore, all of the people he matches should also carry this same SNP as their terminal SNP.

This is actually quite interesting, because of his 10 exact matches, 9 of them have surnames or genealogy that suggests eastern European/Slavic ancestry.  The 10th, however, which happens to be his closest match, carries an English surname and reports their ancestor to be from Yorkshire, England.  His one-mutation-difference matches carry the same pattern, with one being from England and two of the other three from eastern Europe.

Our participant has 155 total Novel Variants, 135 high quality and 20 medium quality.  Only high quality are listed in the comparison.  Medium quality are not.

Ancestral Location | Known SNP Difference | Shared Novel Variants | Non Matching Known SNPs
Yorkshire, England | 0 | 134 | None
Prussia | 0 | 127 | None
Ukraine | 0 | 121 | None
Poland | 0 | 121 | None
Belarus | 0 | 119 | None
Poland | 0 | 116 | None
Poland | 0 | 116 | None
Russian e-mail | 0 | 113 | None
Bulgaria | 0 | 113 | None
Slovakia | 0 | 111 | None
English surname | 1 | 126 | PF6085
Undetermined, poss German | 1 | 121 | F1816
Poland | 1 | 118 | F552
Poland | 1 | 116 | CTS10137
Prussia | 2 | 122 | CTS11840 PF4522
Poland | 2 | 112 | L1029 PR6932
Russia | 3 | 116 | CTS3184 L1029 PF3643
Poland | 3 | 106 | CTS11962 L1029 L260
Ukraine | 3 | 105 | CTS11962 L1029 L260
Poland | 3 | 104 | CTS11962 L1029 L260
Poland | 3 | 100 | CTS11962 L1029 L260
Poland | 3 | 99 | CTS11962 L1029 L260
Eastern European surname | 3 | 98 | CTS11962 L1029 L260
Poland/Germany | 3 | 97 | CTS11962 L1029 L260
Austria/Galacia | 3 | 93 | CTS11962 L1029 L260
Poland | 4 | 97 | CTS11562 CTS11962 L1029 L260

It’s also very interesting to note that his non-matching known SNPs tend to cluster.  Non-matching known SNPs can go in either direction – meaning that they could be absent in our participant and present in the rest, or vice versa.
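One way to see the clustering is to count how often each non-matching known SNP appears across the matches.  The sketch below transcribes the last column of the table above by hand; the counting approach is just one convenient way to do it, not a feature of the Big Y results page.

```python
from collections import Counter

# Non-matching known SNPs per match, transcribed from the table above.
# The 10 matches with no known SNP differences are omitted.
non_matching = [
    ["PF6085"],
    ["F1816"],
    ["F552"],
    ["CTS10137"],
    ["CTS11840", "PF4522"],
    ["L1029", "PR6932"],
    ["CTS3184", "L1029", "PF3643"],
] + [["CTS11962", "L1029", "L260"]] * 8 + [
    ["CTS11562", "CTS11962", "L1029", "L260"],
]

# Count how many matches list each SNP as non-matching.
snp_counts = Counter(snp for row in non_matching for snp in row)

for snp, count in snp_counts.most_common(3):
    print(snp, count)
```

L1029 turns up in 11 of the 16 matches that have any difference, and CTS11962 and L260 in 9 each, while every other SNP appears only once.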

l1029 search

It’s easy to tell.  In the Big Y Results, under Known SNPs, there is a search feature.  This means that it’s easy to search for SNPs and to determine their status.  For example, above, our participant does carry SNP L1029 (he’s derived or positive (+) for the mutation in question).  This means that our participant has developed L1029, and, it just so happens, also CTS11962 and L260, the three clustered SNPs, since these men shared a common ancestor.

It’s difficult not to speculate a little.  If the TMRCA Big Y SNP estimates are correct, this suggests that these 3 clustered SNPs occurred someplace between 4350 and about 5000 years ago, based on the range (93-106) of the number of high quality novel variant differences.  We’ll talk more about this in a minute.

f552 search

For SNP F552, our participant is negative, meaning that the other person has developed this SNP since their shared ancestor.  In fact, he’s negative for all of the other Known SNP differences.

Novel Variants

The Novel Variants are quite interesting.  Novel Variants are mutations that if found in enough people who are not related within a family group will someday become SNPs on the tree.  Think of them as ripening SNPs.

By clicking on the “Show All” dropdown box you can see the list of the participant’s novel variants and how many of his matches share that Novel Variant.

novel variant list

In this example, all 26 of our participant’s matches share Novel Variant 13142597.  I’m thinking that this Novel Variant will someday become classified as a SNP and not as a Novel Variant anymore.  When that happens, and no, we don’t know how often Family Tree DNA will be reviewing the Novel Variants for SNP candidates, it will no longer be in the Novel Variant list.  The Novel Variants are meant to be family, novel or lineage SNPs, not population based SNPs that apply to a wide variety of people.  Finding these, of course, and adding them to the human haplotree is the entire purpose of full sequence Y chromosomal testing.  Just look at all of this new information about this man’s ancestors and the DNA that they passed on to this gentleman.

By scrolling down to the bottom of that list, we find that our participant has 8 different Novel Variants where he matches only one individual.  By clicking on the Novel Variant number, you can see who he matches.  Of those 8, 7 of them match to the man who carries the English surname and one matches to a gentleman from Prussia.

This information is extremely interesting, but it gets even more interesting when compared against STR matches.  Our participant has a fairly unusual haplotype above 12 markers.  He has three 67 marker matches, two 37 marker matches and thirty-three 25 marker matches.  None of the men he matches on the SNP test match him on any of those tests.  I did not check his 12 marker matches, because I felt that anyone who would invest the money in the Big Y would certainly have tested above 12 markers, plus our participant has several hundred 12 marker matches.

The numbers being bandied about by people working with SNP information suggest that one Big Y mutation equals about 150 years.  If this is true, then his closest match, the English gentleman from Yorkshire, England would share an ancestor about 2850 years ago.  That is clearly beyond the reach of STR markers in terms of generational predictions, so maybe STR matches are not expected in this situation, IF the 150 year per novel variant estimate is close to accurate.

Another interesting piece of information that can be deduced from this information is how many SNPs were actually found.

At the bottom of our participant’s page, under Known SNPs, it says “Showing 24 of…571 entries (filtered from 36,274 total entries.)”  We know that the entire data base of SNPs that Family Tree is utilizing, which includes but is not limited to the 12,000+ Geno 2.0 SNPs, is 36,274.  In other words, 36,274 is the number of SNPs available to be found and counted as a SNP because they have already been defined as such.  Any other SNPs discovered are counted as Novel Variants.

Not all available SNPs are found and read in this type of next generation test.  The number of “Matching SNPs” with each individual gives us an idea of how many SNPs actually were found and read at either a medium or high confidence level.  Low confidence SNPs and no-calls are eliminated from reporting.

Our participant’s best match matches him on 25,397 SNPs.  This leaves a total of 10,877 SNPs that were not called.
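The subtraction behind that figure, as a short sketch.  The function name is made up, and treating every non-matching SNP as “not called” follows the reasoning above rather than anything reported directly by Family Tree DNA.

```python
def uncalled_snps(total_known_snps, matching_snps):
    """Known SNPs in the data base that were not matched with this person."""
    return total_known_snps - matching_snps

total_known_snps = 36_274  # size of the known SNP data base quoted above
matching_snps = 25_397     # SNPs matched with the closest match

print(uncalled_snps(total_known_snps, matching_snps))             # 10877
print(f"{matching_snps / total_known_snps:.1%} of known SNPs read")
```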

The Future

SNP Matching is a wonderful feature and a first in this industry.  A hearty thank you to Family Tree DNA!

However, like all passionate people, we are already looking ahead to see what can be and should be done.

Here are some suggestions and questions I have about how the future will unwrap relative to Big Y SNP testing and matching.

  1. Within surname projects, matching should be relatively easy, unless hundreds of people test. I would be happy to have that problem. Today, administrators are creating spreadsheets of matches and novel SNPs and attempting to “reverse engineer” trees. In family groups, those trees would be of Novel SNPs, and in haplogroup projects, those trees would be of both Known SNPs and Novel Variants and where the Novel SNPs slip in-between the known SNPs to create new branches and sub-branches of the haplotree. We, as a community, need some tools to assist in this endeavor, for both the surname project admin and the haplogroup project admin as well.
  2. As new SNPs are discovered in the future, one will not be retested on this platform. As new SNPs are added to the tree, this could affect the matching by terminal SNP. Family Tree DNA needs to be prepared to deal with this eventuality.
  3. As a community, we desperately need a better tool to determine our actual “terminal SNP” as opposed to the Geno 2.0 terminal SNP. Yes, I know the ISOGG tree is provisional, but the contributed tools initially provided by volunteers to search the ISOGG tree utilizing the known SNPs reported in Big Y no longer work. We desperately need something similar while Family Tree DNA is revamping its own tree. I would hope that Family Tree DNA could add something like a secondary “search ISOGG tree” function as a customer courtesy, even if it needs some disclaimer verbiage as to the provisional nature of the tree.
  4. With the number of SNPs being searched for and reported, no-calls begin to become an issue, especially if the no-call happens to be on the terminal SNP. We need to be able to determine whether a non-match with someone is actually a non-match or could be the result of a no-call, and without resorting to searching raw data files. Today, participants can order a SNP test of a SNP position that has been reported as a no-call, but one needs to first figure out that it is a no-call by looking at the BAM and BED files, something that is beyond the capability of most genetic genealogists. Furthermore, in the case of a “suspicious” no-call, where, for example, individuals in the same surname project with the same surname and other matching SNPs and STRs fail to match, some type of “smart-matching” needs to be put into place to alert the participant and project admin of this situation so that they can decide upon a proper course of action. In other words, no-calls need to be reported and accounted for in some fashion, as they are important data points for the genetic genealogist.

I am extremely grateful to Family Tree DNA for their efforts and for Big Y matching.  After all, matching is the backbone of genetic genealogy.  This list is not a complaint list, in any sense.  Family Tree DNA has a very long history of being responsive to their client base and I fully expect they will do the same with the next step in the Big Y journey.

The story of our DNA is not yet told.  Where our STR matches are found and where our SNP matches are found tells the story of the migration of our ancestors.  Today, SNPs and STRs promise to overlap, and already have in some cases.  If I could, I would order a Big Y test for every individual that I sponsor and for every person in each of my projects. I feel that these tests, combined, will help immensely to complete the puzzle to which we have disparate pieces today.  I look forward to the day when the time to the most recent common ancestor can be calculated by utilizing the Y STR markers, the known SNPs and the Novel Variants.  In a very large sense, the future has arrived today.  Now, we just have to test and figure out how all of the puzzle pieces fit together.

If you haven’t yet ordered a Big Y, you can order here.  The more people who test, the larger the comparison data base, and the sooner we will all have the answers we seek.


Charles Campbell (c1750 – c1825) and the Great Warrior Path – 52 Ancestors #19

When I discovered that I was going to be visiting Scotland in the fall of 2013, I couldn’t bypass the opportunity to visit the seat of the Clan Campbell.

Campbell isn’t my maiden name, but it was the maiden name of my ancestor, Elizabeth Campbell born about 1802 who married in about 1820, probably in Claiborne County, TN, to Lazarus Dodson, born about 1795.  Elizabeth’s father was John Campbell, born 1772-1775 in Virginia and her mother was Jane “Jenny” Dobkins.  John’s brother is believed to be George Campbell, born around 1770-1771.  We are fairly certain that their father was one Charles Campbell who died before May 31, 1825 in Hawkins County, Tennessee when a survey for his neighbor mentions the heirs of Charles Campbell.

Charles Campbell was in Hawkins County by about 1788.  A Charles Campbell was mentioned in Sullivan County, the predecessor of Hawkins, as early as 1783, but we don’t know if it’s the same man.  The history of Charles Campbell’s Hawkins County land begins in 1783 when it was originally granted to Edmond Holt.

1783, Oct 25, 440 (pg 64 Tn Land Entries John Armstrong’s office) – Edmond Holt enters 300 ac on the South side of Holston river near the west end of Bays Mountain, includes a large spring near the mountain and runs about, includes Holt’s improvement at an Indian old War Ford, warrant issued June 7, 1784, grant to Mark Mitchell.

Hawkins view of Campbell land

This photo shows the area of Dodson’s creek from across the Holston River atop a high hill.  Dodson’s Creek, today, is located beside the TVA power plant.  In this photo, Dodson’s Creek would be just slightly to the right of the power plant in the distance.  You can’t see the Holston River in this photo, but it is just in front of the power plant.  This is a good representation of the rolling mountains of this region.  I stayed in this house for nearly a week while doing research in Hawkins County before realizing that the land I was looking at, daily, out the back door, off of the porch swing, was the land of both my Campbell and Dodson ancestors.  Talk about a jolting moment.

The Old War Ford is the crossing of the Holston River at the mouth of Dodson Creek where the Indians used to camp and cross, on the Great Warrior Path.

Indian war path

My cousin helped me locate the Great Warrior Path crossing and I took the  photos below during a visit to locate the Dodson and Campbell lands.

1790, May 26 – Mark Mitchell to Charles Campbell 100# Virginia money, Dodson’s Ck, Beginning at a synns on the nw side Bays mountain thence on Stokely Donelson’s, north 60 then west 218 poles to a small black and post oak on a flat Hill then south 30 west 219 to two white oaks in a flat, then s 60 east 218 poles to a stake then north 30 east 219 poles along Bays Mountain to the beginning containing 300 acres. Signed, wit John (I) Owen mark, William Wallen, George Campbell mark (kind of funny P), R. Mitchell (it appears that this transaction actually took place in 1788, but wasn’t registered until later.) south side of the Holston on the west fork of Dodson Creek.

Today, the road that originally led to the ford of the Holston River dead ends into a road and the part of the road that was the “ford” is gone.  A field exists in its place, and a historical marker, and that’s it.  Not even any memories as the ford was no longer needed when bridges were built, and by now, there have already been several generations of bridges.

old war ford

Here’s the field.  The trees grow along the river and help to control erosion from flooding today.  Walking up to the area, you can see the actual ford area, although there is nothing to give away the fact that this used to be a ford of the river.  The locals say there is bedrock here.

old war ford 2

This area is flood plain, so one would not live here.  The old cemetery where we believe Raleigh Dodson is buried is across the current road and up the hill.  The land where we think Charles Campbell lived is just up Dodson Creek from this area as well, but on somewhat higher ground.

Possible Campbell land

I believe this is or is very near the current day location of the Charles Campbell land.  Dodson Creek runs adjacent the road, and you have to cross the creek to get to the farmable land from the road.  You can see the makeshift bridge above.

Beautiful pool at the bend in Dodson Creek where it leaves the road.

Dodson Creek is beautiful and lush.

Dodson Creek 2

1793/1794 – Charles Campbell to George and John Campbell, all of Hawkins County, for 45#, 150 acres on the south side of the Holston, west fork of Dodson Ck beginning at 2 white oaks then (metes and bounds), signed, John Payne witness.

1802, Feb 26 – George Campbell and John Campbell of Hawkins County to Daniel Leyster (Leepter?, Seyster, Septer) of same, 225# tract on west fork of Dodson’s Creek being same place where said John Campbell now lives, 149 acres, then (metes and bounds) description. Both sign,  Witness, Charles Campbell, Michael Roark and William Paine.  Proven in May session 1802 by oath of Michael Roark (inferring that the sellers are gone from the area).

Is the difference between 149 and 150 acres a cemetery, a church or a school?

Dodson Creek is where Charles Campbell lived.  This is the Dodson family who John Campbell’s daughter, Elizabeth, would marry into a generation later in Claiborne County.  Dodson Creek was also just a few miles from the home of Jacob Dobkins, whose daughters George and John Campbell would marry.  Jacob Dobkins, George and John Campbell and their Dobkins wives would be in Claiborne County, Tennessee by 1802.

We believe Charles Campbell came from the Augusta or Rockingham County area of Virginia, but we don’t know for sure.  Unfortunately the deed where his heirs conveyed his land is recorded in the court record, but never in the deed book, so we have no idea who his heirs were.  The will of his neighbor, Michael Roark, who was born in Bucks County, PA and then lived in Rockingham Co., VA stated that he bought the land of Charles Campbell from his heirs joining the tract “I live on.”  Charles’ other neighbor was a Grigsby, and so was Michael Roark’s wife. It’s not unlikely that Charles Campbell was related to one or both of these men.

Michael Roark’s will, dated August 25, 1834 and proven on February 4, 1839, says, among other things, that he leaves to grandson James Rork, son of John, tract of land that I now live on after wife and I die, son John 4 shares of tract of land that I bought of the heirs of Charles Campbell joining the tract I live on and containing about 150 acres. Unfortunately, the deed between the Campbell heirs and Michael Roark was never registered.

A deed from Michael Roark to Neil and Simpson, with John Scruggs as their trustee, was registered July 17, 1835.  Michael Roark had in essence mortgaged his land in November of 1830 and by 1835 was unable to pay his debt.  The verbiage says in part that Michael not only conveys his land, which is described, but he adds “and also the interest I have in the shares of the 4 legatees of Charles Campbell, decd, to a tract of land lying on Dodson’s Creek.”  He does not say that his wife is a daughter of Charles Campbell, but it’s certainly possible.  He described one of the two tracts of Roark land he is conveying as having been conveyed to him by James Roark in 1811.

This 1835 entry tells us that Charles Campbell’s land apparently had not yet been sold and that there were at least 4 legatees.

Roark, Michael cabin

Years ago, in a book in the library in Hawkins County, I stumbled across this photo of a picture of the cabin of Michael Roark.  You know that Charles Campbell’s cabin didn’t look much different.  A quite elderly descendant of Michael, Libby Roark Schmalzreid, claimed that her grandfather built his house on this land, and is buried on a hill just above the home he built.  She was in her 90s more than half a decade ago, and never said who her grandfather was.  She did say on Rootsweb that the location is on Dodson Creek not far from Strahl.  Given that Michael Roark and Charles Campbell were neighbors, if we find Michael’s cabin, we can also find Charles’ land.  I mean his actual land, not just a general area.  On the map below, Dodson Creek is shown by the arrows, and Strahl is marked as well.  It’s about 2000 feet from Strahl to the red arrow below noting Dodson Creek.  Dodson Creek and its branches wander all over this neighborhood.  So, if anyone knows who Libby’s grandfather was, where he built his house or where he is buried, please give me a shout.

Strahl

Perhaps the key to finding Charles Campbell back in Virginia is to find both Michael Roark and the Grigsby family as well.

On the 1783 Shenandoah Co., VA, tax list, we find both Charles Campbell and Jacob Dobkins in Alexander Hite’s district. Jacob Dobkins is the father of Jane “Jenny” Dobkins who would eventually marry John Campbell and her sister,  Elizabeth Dobkins who would marry George Campbell, believed to be the brother of John Campbell.

Of course, there were also 2 Charles Campbells in Rockingham County, VA in 1782 and 1 in Fayette and one in Lincoln, both in 1787.

Several years ago, we DNA tested a male Campbell descendant of each of John and George and confirmed that indeed, these lines match each other as well as the Campbell clan line from Scotland, and that the descendants of the lines of both men also match autosomally as cousins, further confirming that John and George were most likely brothers.  This was good news, because even though we don’t know the exact names of Charles’ ancestors, thanks to DNA, we still know the history of those ancestors before they immigrated, probably in the early 1700s with the first waves of the Scotch-Irish.

So, for me, the opportunity to visit the clan seat, and meet the current Duke of Argyll, the 26th chief of the Clan Campbell and the 13th Duke of Argyll, Torquhil Campbell, personally, was literally the chance of a lifetime.

The Duke, Torquhil Campbell, is much different from other aristocracy.  He lives at Inveraray Castle, the clan seat, but parts of the castle are open to the public.  In addition, the castle is his actual full time residence and he actively manages the estate, including signing books about Inveraray in the gift shop in the castle.

You can’t miss him if he’s there, as he has on an apron that says “Duke.”  He’s a lot younger than I expected as well, born in 1968, but extremely gracious and welcoming.  There must be tens of thousands of Campbell descendants and many probably make their way back to Inveraray like the butterflies return to Mexico every winter.

While I was visiting Inveraray, I purchased two books about the clan Campbell and a third, written by the Duke himself, about Inveraray. The Campbell clan origins are shrouded in myth and mists, as you might imagine, but let me share them with you anyway.

Campbell coat of arms

The first origin story, from a book called “Campbell, The Origins of the Clan Campbell and Their Place in History” by John Mackay, says:

“The first Campbells were a Scots family who crossed from Ireland to the land of the Picts.  The Clan Campbell originated from the name O’Duibhne, one of whose chiefs in ancient times was known as Diarmid and the name Campbell was first used in the 1050s in the reign of Malcolm Canmore after a sporran-bearer or purse-bearer to the king previously called Paul O’Duibhne was dubbed with his new surname.

Historians after such obscure and legendary times, have agreed that the clan name comes from the Gaelic ‘cam’ meaning crooked and ‘beul’ meaning the mouth, when it was the fashion to be surnamed from some unusual physical feature, in this case by the characteristic curved or crooked mouth of the family of what is certainly one of the oldest clan names in the Highlands.

It was the Marquis who insisted that he was descended from a Scots family in Ireland who had crossed to what was then mostly the land of Picts to establish the first Scots colony in the district of Dalriada – a comparatively small part of what we know today as Argyll at the heart of what would in time become the kingdom of Scotland.  It is marked by the fort of Dunadd, off the A816, a few miles north of Lochgilphead, set in the inlet called Loch Gilp off from Loch Fyne.”

Loch Fyne is where the current castle of Inveraray, clan seat, is located and where I visited.

The second source is a booklet called “Campbell, Your Clan Heritage,” by Alan McNie, which is condensed from a larger book, Highland Clans of Scotland by George Eyre-Todd, published in 1923.

It says:

“Behind Torrisdale in Kintyre rises a mountain named Ben an Tuire, the “Hill of the Boar.”  It takes its name from a famous event in Celtic legend.  There, according to tradition, Diarmid O’Duibhne slew the fierce boar which had ravaged the district.  Diarmid was of the time of the Ossianic heroes.

Diarmid is said to have been the ancestor of the race of O’Duibhne who owned the shores of Loch Awe, which were the original Oire Gaidheal, or Argyll, the “Land of the Gael.”

The race is said to have ended in the reign of Alexander III in an heiress, Eva, daughter of Paul O’Duibhne, otherwise Paul of the Sporran, so named because as the king’s treasurer, he was supposed to carry the money-bag.  Eva married a certain Archibald of Gillespie Campbell, to whom she carried the possession of her house.  This tradition is supported by a charter of David II in 1368 which secured to Archibald Campbell of that date certain lands of Loch Awe ‘as freely as these were enjoyed by his ancestor, Duncan O’Duibhne.’

Who the original Archibald Campbell was remains a matter of dispute.  By some he is said to have been a Norman knight by the name of De Campo Bello.  The name Campo Bello, however, is not Norman but Italian.  It is out of all reason to suppose that an Italian ever made his way into the Highlands at such a time to secure a footing as a Highland Chief.”

This book then goes on to recite the “crooked mouth” story as well.

A third origin story is recorded in the book written by the current Duke, himself, “Inveraray Castle, Ancestral Home of the Dukes of Argyll.”  In this book, the Duke says:

“The Campbells, thought to be of British stock, from the Kingdom of Strathclyde, probably arrived in Argyll as part of a royal expedition in circa 1220.  They settled on Lochaweside where they were placed in charge of the king’s land in the area.

The Chief of Clan Campbell takes his Gaelic title of ‘MacCailein Mor’ from Colin Mor Campbell – ‘Colin the Great’ – who was killed in a quarrel with the MacDougalls of Lorne in 1296.

His son was Sir Neil Campbell, boon companion and brother-in-law to King Robert the Bruce, whose son, Sir Colin was rewarded in 1315 by the grant of the lands of Lochawe and Ardscotnish of which he now became Lord.

From Bruce’s time at least, their headquarters had been at the great castle of Innischonnell, on Loch Awe.   Around the mid 1400s, Sir Duncan Campbell of Lochawe, great-grandson of Sir Colin, moved his headquarters to Inveraray, controlling most of the landward communications of Argyll.”

From the Campbell DNA Project website, we find this pedigree chart of the Clan Campbell, beginning with the present Duke at the bottom.

Campbell pedigree

Let’s see if Y chromosome DNA results can tell us about the Campbell Clan history.

Originally, the DNA testing told us that the Campbell men were R1b1.  The predicted haplogroup was R1b1a2, now known as R-M269, but some of the Campbell men who have tested further are haplogroup R1b1a2a1b4, or R-L21.

Looking at my cousin’s matches map at 37 markers, below, the Campbell men cluster heavily around the Loch Lomond/Greenock region which is very close to the traditional Campbell seat of Inveraray.

Campbell cluster

At 12 markers, the cluster near Greenock, slightly northwest of Glasgow, is quite pronounced.  Most of these matches are Campbell surnames.

Campbell Greenock cluster

Another item of interest is that several men in this cluster have tested for SNP L1335.  This is the SNP that Jim Wilson announced is an indicator of Pictish heritage, although it is widely thought that this was a marketing move with little solid data behind it.  Otherwise, Jim Wilson, a geneticist, would surely be publishing academically, not via press announcements from a company that has previously damaged their own credibility, several times.

Regardless, our Campbell group tested positive for this SNP.  I contacted Kevin Campbell, the Campbell DNA project administrator, who is equally cautious about the Pictish label, but we both agree that this marker indicates ancient, “indigenous Scots,” and yes, they could be Picts.  Time will tell!

In the next few days, I’ll be writing about my visit to Inveraray.  I hope you’ll join me!

______________________________________________________________

Disclosure

I receive a small contribution when you click on some of the links to vendors in my articles. This does NOT increase the price you pay but helps me to keep the lights on and this informational blog free for everyone. Please click on the links in the articles or to the vendors below if you are purchasing products or DNA testing.

Thank you so much.

2014 Y Tree Released by Family Tree DNA

On April 25th, DNA Day and Arbor Day, Family Tree DNA updated and released their 2014 Y haplotree created in partnership with the Genographic project.  This has been a massive project, expanding the tree from about 850 SNPs to over 6200, of which about 1200 are “terminal,” meaning the end of a branch, and the rest being proven to be duplicates.

If you’re a newbie, this would be a good place perhaps to read about what a haplogroup is and the new Y naming convention which replaces the well-known group names like R1b1a2 with the SNP shorthand version of the same haplogroup name, R-M269.  From this time forward, the haplogroups will be known by their SNP names and the longhand version is obsolete, although you will always see it in older documents, articles and papers.  In fact, this entire tree has been made possible by SNP testing by both academic organizations and consumers.  To understand the difference between regular STR marker testing and SNP testing, click here.

I’ve divided this article into two parts.  The first part is the “what did they do and why” part and the second is the “what does it mean to you” portion.

This tree update has been widely anticipated for some time now.  We knew that Family Tree DNA was calibrating the tree in partnership with the Genographic project, but we didn’t know what else would be included until the tree was released.

What Did Family Tree DNA Do, and Why?

Janine Cloud, the liaison at Family Tree DNA for Project Administrators has provided some information as to the big picture.

“First, we’re committed to the next iteration of the tree and it will be more comprehensive, but we’re going to be really careful about the data we use from other sources. It HAS to be from raw data, not interpreted data. Second, I’ve italicized what I think is really the mission statement for all the work that’s been done on this tree and that will be done in the future.”

Janine interviewed Elliott Greenspan of Family Tree DNA about the new tree, and here are some of the salient points from that discussion.

“This year we’re committing to launching another tree. This tree will be more comprehensive, utilizing data from external sources: known Sanger data, as well as data such as Big Y, and if we have direct access to the raw data to make the proof (from large companies, such as the Chromo2) or a publication, or something of that nature. That is our intention that it be added into the data.

We’re definitely committed to update at least once per year. Our intention is to use data from other sources, as well as any SNPs we can, but it must be well-vetted. NGS and SNP technology inherently has errors. You must curate for those errors otherwise you’re just putting slop out to customers. There are some SNPs that may bind to the X chromosome that you didn’t know. There are some low coverages that you didn’t know.

With technology such as this you’re able to overcome the urge to test only what you’re likely to be positive for, and instead use the shotgun method and test everything. This allows us to make the discovery that SNPs are not nearly as stable as we thought, and they have a larger potential use in that sense.

Not only does the raw data need to be vetted but it needs to make sense.  Using Geno 2.0, I only accepted samples that had the highest call rate, not just because it was the best quality but because it was the most data. I don’t want to be looking at data where I’m missing potential information A, or I may become confused by potential information B.  That is something that will bog us down. When you’re looking at large data sets, I’d much rather throw out 20% of them because they’re going to take 90% of the time than to do my best to get 1 extra SNP on the tree or 1 extra branch modified, that is not worth all of our time and effort. What is, is figuring out what the broader scope of people are, because that is how you break down origins. Figuring one single branch for one group of three people is not truly interesting until it’s 50 people, because 50 people is a population. Three people may be a family unit.  You have to have enough people to determine relevance. That’s why using large datasets and using complete datasets are very, very important.

I want it to be the most accurate tree it can be, but I also want it to be interesting. That’s the key. Historical relevance is what we’re to discover. Anthropological relevance. It’s not just who has the largest tree, it’s who can make the most sense out of what you have is important.”

Thanks to both Janine and Elliott for providing this information.

What is Provided in the Update?

The genetic genealogy community was hopeful that the new 2014 tree would be comprehensive, meaning that it would include not only the Genographic SNPs, but ones from Walk the Y, perhaps some Chromo2, Full Genomes results and the Big Y.  Perhaps we were being overly optimistic, especially given the huge influx of new SNPs, the SNP tsunami as we call it, over the past few months.  Family Tree DNA clearly had to put a stake in the sand and draw the line someplace.  So, what is actually included, how did they select the SNPs for the new tree and how does this integrate with the Genographic information?  This information was provided by Family Tree DNA.

Family Tree DNA created the 2014 Y-DNA Haplotree in partnership with the National Geographic Genographic Project using the proprietary GenoChip. Launched publicly in late 2012, the chip tests approximately 10,000 Y-DNA SNPs that had not, at the time, been phylogenetically classified.

The team used the first 50,000 male samples with the highest quality results to determine SNP positions. Using only tests with the highest possible “call rate” meant more available data, since those samples had the highest percentage of SNPs that produced results, or “calls.”
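To make the “call rate” idea concrete, here is a minimal sketch in Python.  It is not Family Tree DNA’s actual pipeline.  The kit names, the use of `None` to mark a SNP that produced no result, and the 0.75 cutoff are all assumptions made up for illustration.

```python
def call_rate(calls):
    """Fraction of tested SNPs that produced a result; None marks a no-call."""
    if not calls:
        return 0.0
    return sum(c is not None for c in calls) / len(calls)


def select_high_quality(samples, min_rate):
    """Return kit names whose call rate meets the cutoff, best samples first."""
    rated = [(call_rate(calls), kit) for kit, calls in samples.items()]
    return [kit for rate, kit in sorted(rated, reverse=True) if rate >= min_rate]


# Hypothetical kits, each tested on the same four SNP positions.
samples = {
    "kit_A": ["A", "G", "T", "C"],     # 4 of 4 called
    "kit_B": ["A", None, "T", "C"],    # 3 of 4 called
    "kit_C": ["A", "G", None, None],   # 2 of 4 called
}

# The 0.75 cutoff is an illustrative assumption, not a published threshold.
print(select_high_quality(samples, min_rate=0.75))
```

The point is simply that a sample with more calls contributes more usable positions, so keeping only the best-called samples gives the tree builders the most data per sample.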

In some cases, SNPs that were on the 2010 Y-DNA Haplotree didn’t work well on the GenoChip, so the team used Sanger sequencing on anonymous samples to test those SNPs and to confirm ambiguous locations.

For example, if it wasn’t clear if a clade was a brother (parallel) clade, or a downstream clade, they tested for it.

The scope of the project did not include going farther than SNPs currently on the GenoChip in order to base the tree on the most data available at the time, with the cutoff for inclusion being about November of 2013.

Where data were clearly missing or underrepresented, the team curated additional data from the chip where it was available in later samples. For example, there were very few Haplogroup M samples in the original dataset of 50,000, so to ensure coverage, the team went through eligible Geno 2.0 samples submitted after November, 2013, to pull additional Haplogroup M data. That additional research was not necessary on, for example, the robust Haplogroup R dataset, for which they had a significant number of samples.

Family Tree DNA, again in partnership with the Genographic Project, is committed to releasing at least one update to the tree this year. The next iteration will be more comprehensive, including data from external sources such as known Sanger data, Big Y testing, and publications. If the team gets direct access to raw data from other large companies’ tests, then that information will be included as well. We are also committed to at least one update per year in the future.

Known SNPs will not intentionally be renamed. Their original names will be used since they represent the original discoverers of the SNP. If there are two names, one will be chosen to be displayed and the additional name will be available in the additional data, but the team is taking care not to make synonymous SNPs seem as if they are two separate SNPs. Some examples of that may exist initially, but as more SNPs are vetted, and as the team learns more, those examples will be removed.

In addition, positions or markers within STRs, as they are discovered, or large insertion/deletion events inside homopolymers, potentially may also be curated from additional data because the event cannot accurately be proven. A homopolymer is a sequence of identical bases, such as AAAAAAAAA or TTTTTTTTT. In such cases it’s impossible to tell which of the bases the insertion is, or if/where one was deleted. With technology such as Next Generation Sequencing, trying to get SNPs in regions such as STRs or homopolymers doesn’t make sense because we’re discovering non-ambiguous SNPs that define the same branches, so we can use the non-ambiguous SNPs instead.
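The homopolymer problem is easy to see with a toy example.  The sketch below, in Python, uses a made-up sequence and an arbitrary minimum run length of 4 (both are my assumptions, not anything from Family Tree DNA).  It finds the runs of identical bases and then shows that deleting any one base inside a run produces exactly the same result, which is why the position of the deletion can’t be determined.

```python
import re


def homopolymer_runs(seq, min_len=4):
    """Return (start, base, length) for each run of identical bases >= min_len."""
    pattern = r"([ACGT])\1{%d,}" % (min_len - 1)
    return [(m.start(), m.group(1), len(m.group(0)))
            for m in re.finditer(pattern, seq)]


def delete_base(seq, i):
    """Return seq with the base at index i removed."""
    return seq[:i] + seq[i + 1:]


seq = "GCAAAAAATG"  # made-up sequence containing a run of six A's
runs = homopolymer_runs(seq)
start, base, length = runs[0]

# Delete each base of the run in turn and collect the distinct outcomes.
outcomes = {delete_base(seq, i) for i in range(start, start + length)}
print(runs)
print(outcomes)  # only one distinct sequence, whichever A was removed
```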

Some SNPs from the 2010 tree have been intentionally removed. In some cases, those were SNPs for which the team never saw a positive result, so while it may be a legitimate SNP, even haplogroup defining, it was outside of the current scope of the tree. In other cases, the SNP was found in so many locations that it could cause the orientation of the tree to be drawn in more than one way. If the SNP could legitimately be positioned in more than one haplogroup, the team deemed that SNP to not be haplogroup defining, but rather a high polymorphic location.

To that end, SNPs no longer have .1, .2, or .3 designations. For example, J-L147.1 is simply J-L147, and I-L147.2 is simply I-L147.  Those SNPs are positioned in the same place, but back-end programming will assign the appropriate haplogroup using other available information such as additional SNPs tested or haplogroup origins listed. If other SNPs have been tested and can unambiguously prove the location of the multi-locus SNP for the sample, then that data is used. If not, matching haplogroup origin information is used.
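For anyone keeping their own spreadsheets or project notes with the old labels, here is a small Python sketch of the renaming.  The function name is my own invention, and this is only the label cleanup.  It does not attempt the back-end placement logic described above, which needs other SNP results or haplogroup origins.

```python
import re


def strip_locus_suffix(label):
    """Drop a trailing .1/.2/.3-style suffix from a haplogroup-SNP label."""
    return re.sub(r"\.\d+$", "", label)


old_labels = ["J-L147.1", "I-L147.2", "R-M269"]
new_labels = [strip_locus_suffix(label) for label in old_labels]
print(new_labels)

# After stripping, the SNP name alone no longer says which haplogroup it is in,
# so the haplogroup letter in front of the dash has to carry that information.
snp_names = {label.split("-", 1)[1] for label in new_labels[:2]}
print(snp_names)
```

Labels without a suffix, like R-M269, pass through unchanged.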

We will also move to shorthand haplogroup designations exclusively. Since we’re committing to at least one iteration of the tree per year, using longhand that could change with each update would be too confusing.  For example, Haplogroup O used to have three branches: O1, O2, and O3. A SNP was discovered that combined O1 and O2, so they became O1a and O1b.
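A quick way to see why shorthand is more stable: the shorthand name is just the major haplogroup letter plus the terminal SNP, so it doesn’t change when the branches in between are relabeled.  Here is a minimal Python sketch.  The function is hypothetical, and the second longhand name in the example is invented purely to show a relabeling.

```python
import re


def shorthand(longhand, terminal_snp):
    """Build the shorthand name: major haplogroup letter(s), a dash, the terminal SNP."""
    major = re.match(r"[A-Z]+", longhand).group(0)
    return f"{major}-{terminal_snp}"


print(shorthand("R1b1a2", "M269"))
print(shorthand("R1b1a2a1b4", "L21"))

# If a tree update relabeled the longhand (the name below is invented),
# the shorthand for the same terminal SNP would be unaffected.
assert shorthand("R1b1a2", "M269") == shorthand("R1b1a9", "M269")
```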

There are over 1200 branches on the 2014 Y Haplogroup tree, as compared to about 400 on the 2010 tree. Those branches contain over 6200 SNPs, so we’ve chosen to display select SNPs as “active” with an adjacent “More” button to show the synonymous SNPs if you choose.

In addition to the Family Tree DNA updates, any sample tested with the Genographic Project’s Geno 2.0 DNA Ancestry Kit, then transferred to FTDNA will automatically be re-synched on the Geno side. The Genographic Project is currently integrating the new data into their system and will announce on their website when the process is complete in the coming weeks.  At that time, all Geno 2.0 participants’ results will be updated accordingly and will be accessible via the Genographic Project website.

In summary:

  • Created in partnership with National Geographic’s Genographic Project
  • Used GenoChip containing ~10,000 previously unclassified Y-SNPs
  • Some of those SNPs came from Walk Through the Y and the 1000 Genomes Project
  • Used first 50,000 high-quality male Geno 2.0 samples
  • Verified positions from 2010 YCC by Sanger sequencing additional anonymous samples
  • Filled in data on rare haplogroups using later Geno 2.0 samples

Statistics

  • Expanded from approximately 400 to over 1200 terminal branches
  • Increased from around 850 SNPs to over 6200 SNPs
  • Cut-off date for inclusion for most haplogroups was November 2013

Total number of SNPs broken down by haplogroup

A     406    DE    16     IJ       29     LT   12     P    81
B      69    E   1028     IJK       2     M    17     Q   198
BT      8    F     90     J       707     N   168     R   724
C     371    G    401     K        11     NO   16     S     5
CT     64    H     18     K(xLT)    1     O   936     T   148
D     208    I    455     L       129
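As a sanity check on the per-haplogroup counts above, here is a short Python sketch that totals them.  The numbers are copied straight from the table.  The only thing I’ve added is the arithmetic.

```python
snp_counts = {
    "A": 406, "B": 69, "BT": 8, "C": 371, "CT": 64, "D": 208,
    "DE": 16, "E": 1028, "F": 90, "G": 401, "H": 18, "I": 455,
    "IJ": 29, "IJK": 2, "J": 707, "K": 11, "K(xLT)": 1, "L": 129,
    "LT": 12, "M": 17, "N": 168, "NO": 16, "O": 936,
    "P": 81, "Q": 198, "R": 724, "S": 5, "T": 148,
}

total = sum(snp_counts.values())
largest = max(snp_counts, key=snp_counts.get)

print(f"{len(snp_counts)} haplogroups, {total} SNPs in total")
print(f"Largest: haplogroup {largest} with {snp_counts[largest]} SNPs")
```

The total comes to 6318, which agrees with the “over 6200” figure quoted for the new tree.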

myFTDNA Interface

  • Existing customers receive free update to predictions and confirmed branches based on existing SNP test results.
  • Haplogroup badge updated if new terminal branch is available
  • Updated haplotree design displays new SNPs and branches for your haplogroup
  • Branch names now listed in shorthand using terminal SNPs
  • For SNPs with more than one name, in most cases the original name for SNP was used, with synonymous SNPs listed when you click “More…”
  • No longer using SNP names with .1, .2, .3 suffixes. Back-end programming will place SNP in correct haplogroup using available data.
  • SNPs recommended for additional testing are pre-populated in the cart for your convenience. Just click to remove those you don’t want to test.
  • SNPs recommended for additional testing are based on 37-marker haplogroup origins data where possible, 25- or 12-marker data where 37 markers weren’t available.
  • Once you’ve tested additional SNPs, that information will be used to automatically recommend additional SNPs for you if they’re available.
  • If you remove those prepopulated SNPs from the cart, but want to re-add them, just refresh your page or close the page and return.
  • Only one SNP per branch can be ordered at one time – synonymous SNPs can possibly be ordered from the Advanced Orders section on the Upgrade Order page.
  • Tests taken have moved to the bottom of the haplogroup page.

Coming attractions

  • Group Administrator Pages will have longhand removed.
  • At least one update to the tree to be released this year.
  • Update will include: data from Big Y, relevant publications, other companies’ tests from raw data.
  • We’ll set up a system for those who have tested with other big data companies to contribute their raw data file to future versions of the tree.
  • We’re committed to releasing at least one update per year.
  • The Genographic Project is currently integrating the new data into their system and will announce on their website when the process is complete in the coming weeks. At that time, all Geno 2.0 participants’ results will be updated accordingly and accessible via the Genographic Project website.

What Does This Mean to You?

Your Badge

On your welcome page, your badges are listed.  Your badge previously would have included the longhand form of the haplogroup, such as R1b1a2, but now it shows R-M269.

2014 y 1

Please note that badges are not yet showing on all participants’ pages.  If yours aren’t yet showing, click on the Haplotree and SNP page under the YDNA option on the blue options bar, where your more detailed information is shown, as below.

Your Haplogroup Name

Your haplogroup is now noted only as the SNP designation, R-M269, not the older longhand names.

2014 y 2 v2

Haplogroup R is a huge haplogroup, so you’ll need to scroll down to see your confirmed or predicted haplogroup, shown in green below.

2014 y 3

Redesigned Page

The redesigned haplotree page includes an option to order SNPs downstream of your confirmed or predicted haplogroup.  This refines your haplogroup and helps isolate your branch on the tree.  You may or may not want to do this.  In some cases, this does help your genealogy, especially in cases where you’re dealing with haplogroup R.  For the most part, haplogroups are more historical in nature.  For example, they will help you determine whether your ancestors are Native American, African, Anglo Saxon or maybe Viking.  Haplogroups help us reach back before the advent of surnames.

The new page shows which SNPs are available for you to order from the SNPs on the tree today, shown above, in blue to the right of the SNP branch.

SNPs not on the Tree

Not all known SNPs are on the tree.  Like I said, a line in the sand had to be drawn.  There are SNPs, many recently discovered, that are not on the tree.

To put this in perspective, the new tree incorporates 6200 SNPs (up from 850), but the Big Y “pool” of known SNPs against which Family Tree DNA is comparing those results was 36,562 when the first results were initially released at the end of February.

If you have taken advanced SNP testing, such as the Walk the Y, the Big Y, or tested individual SNPs, your terminal SNP may not be on the tree, which means that your terminal SNP shown on your page, such as R-M269 above, MAY NOT BE ACCURATE in light of that testing.  Why?  Because these newly discovered SNPs are not yet on the tree. This only affects people who have done advanced testing which means it does not affect most people.

Ordering SNPs

You can order relevant SNPs for your haplogroup on the tree by clicking on the “Add” button beside the SNP.

You can order SNPs not on the tree by clicking on the “Advanced Order Form” link available at the bottom of the haplotree page.

2014 y 4

If you’re not sure of what you want to do, or why, you might want to touch base with your project administrators.  Depending on your testing goal, it might be much more advantageous, both scientifically and financially, for you to take either the Geno2 test or the Big Y.

At this point, in light of some of the issues with the new release, I would suggest maybe holding tight for a bit in terms of ordering new SNPs unless you’re positive that your haplogroup is correct and that the SNP selection you want to order would actually be beneficial to you.

Words of Caution

There are some bugs in this massive update.  You might want to check your haplogroup assignment to be sure it is reflected accurately based on any SNP testing you have had done, of course, excepting the very advanced tests mentioned above.

If you discover something that is inaccurate or questionable, please notify Family Tree DNA.  This is especially relevant for project administrators who are familiar with family groups and know that people who are in the same surname group should share a common base haplogroup, although some people who have taken further SNP testing will be shown with a downstream haplogroup, further down that particular branch of the tree.

What kind of result might you find suspicious or questionable?  For example, if in your surname project, your matching surname cousins are all listed at R-M269 and you were too previously, but now you’re suddenly in a different haplogroup, like E, there is clearly an error.

Any suspected or confirmed errors should be reported to Family Tree DNA.

They have made it very easy by providing a “Feedback” button on the top of the page and there is a “Y tree” option in the dropdown box.

For administrators providing reports that involve more than one participant, please send them to Groups@familytreedna.com and include the kit numbers, the participants’ names and the nature of the issue.

Additional Information

Family Tree DNA provides a free webinar that can be viewed about the 2014 Y Tree release.  You can see all of the webinars that are archived and available for viewing at:  https://www.familytreedna.com/learn/ftdna/webinars/

What’s Next?

The Genographic Project is in the process of updating to the same tree so their results can be synchronized with the 2014 tree.  A date for this has not yet been released.

Family Tree DNA has committed to at least one more update this year.

I know that this update was massive and required extensive reprogramming that affected almost every aspect of their webpage.  If you think about it, nearly every page had to be updated from the main page to the order page.  The tree is the backbone of everything.  I want to thank the Family Tree DNA and Genographic combined team for their efforts and Bennett Greenspan for making sure this did happen, just as he committed to do in November at the last conference.

Like everyone else, I want everything NOW, not tomorrow.  We’re all passionate about this hobby – although I think it is more of a life mission for many – and surpassed hobby status long ago.

I know there are issues with the tree and they frustrate me, like everyone else.  Those issues will be resolved.  Family Tree DNA is actively working on reported issues and many have already been fixed.

There is some amount of disappointment in the genetic genealogy community about the SNPs not included on the tree, especially the SNPs recently discovered in advanced tests like the Big Y.  Other trees, like the ISOGG tree, do in fact reflect many of these newly discovered SNPs.

There are a couple of major differences.  First, ISOGG has a virtual army of volunteers who are focused on maintaining this tree.  We are all very lucky that they do, and that Alice Fairhurst coordinates this effort and has done so now for many years.  I would be lost without the ISOGG tree.

However, when a change is made to the ISOGG tree, and there have been thousands of changes, adds and moves over the years, nothing else is affected.  No one’s personal page, no one’s personal tree, no projects, no maps, no matches and no order pages.  ISOGG has no “responsibility” to anyone – in other words – it’s widely known and accepted that they are a volunteer organization without clients.

Family Tree DNA, on the other hand, has half a million (or so) paying customers.  Tree changes have a huge ripple effect there – not only on their customers’ personal pages, but on their entire website, projects, support and orders.  A change at Family Tree DNA is much more significant than on the ISOGG page – not to mention – they don’t have the same army of volunteers and they have to rely on the raw science, not interpretation, as they said in the information they provided.  A tree update at Family Tree DNA is a very different animal than updating a stand-alone tree, especially considering their collaboration with various scientific organizations, including the National Geographic Society.

I commend Family Tree DNA for this update and thank them for the update and the educational materials.  I’m also glad to see that they do indeed rely only on science, not interpretation.  Frustrating to the genetic genealogist in me?  Sure.  But in the long run, it’s worth it to be sure the results are accurate.

Could this release have been smoother and more accurate?  Certainly.  Hopefully this is the big speed bump and future releases will be much more graceful.  It’s easy to see why there aren’t any other companies providing this type of comprehensive testing.  It’s gone from an easy 12 marker “do we match” scenario to the forefront of pioneering population genetics.  And all within a decade.  It’s amazing that any company can keep up.

______________________________________________________________

Disclosure

I receive a small contribution when you click on some of the links to vendors in my articles. This does NOT increase the price you pay but helps me to keep the lights on and this informational blog free for everyone. Please click on the links in the articles or to the vendors below if you are purchasing products or DNA testing.

Thank you so much.

DNA Purchases and Free Transfers

Genealogy Services

Genealogy Research

 

Haplogroup Comparisons Between Family Tree DNA and 23andMe

Recently, I’ve received a number of questions about comparing people and haplogroups between 23andMe and Family Tree DNA.  I can tell by the questions that a significant amount of confusion exists about the two, so I’d like to talk about both.  If you need a review of “What is a Haplogroup?”, click here.

Comparing haplogroup information between Family Tree DNA and 23andMe is not an apples to apples comparison.  In essence, the haplogroups are not calculated in the same way, and the data at Family Tree DNA is much more extensive.  Understanding the differences is key to comparing and understanding results. Unfortunately, I think a lot of misinterpretation is happening due to misunderstanding of the essential elements of what each company offers, and what it means.

There are two basic kinds of tests to establish haplogroups, and a third way to estimate.

Let’s talk about mitochondrial DNA first.

Mitochondrial DNA

You have a very large jar of jellybeans.  This jar is your mitochondrial DNA.

jellybeans

In your jar, there are 16,569 mitochondrial DNA locations, or jellybeans, more or less.  Sometimes the jelly bean counter slips up and adds an extra jellybean when filling the jar, called an insertion, and sometimes they omit one, called a deletion.

Your jellybeans come in 4 colors/flavors, coincidentally, the same colors as the 4 DNA nucleotides that make up our double helix segments.  T for tangerine, A for apricot, C for chocolate and G for grape.

Each of the 16,569 jellybeans has its own location in the jar.  So, in the position of address 1, an apricot jellybean is always found there.  If the jellybean jar filler makes a mistake, and puts a grape jellybean there instead, that is called a mutation.  Mistakes do happen – and so do mutations.  In fact, we count on them.  Without mutations, genetic genealogy would be impossible because we would all be exactly the same.

When you purchase a mitochondrial DNA test from Family Tree DNA, you have in the past been able to purchase one of three mitochondrial testing levels.  Today, on the website, I see only the full sequence test for $199, which is a great value.

However, regardless of whether you purchase the full mitochondrial sequence test today, which tests all of your 16,569 locations, or the earlier HVR1 or HVR1+HVR2 tests, which tested a subset of about 10% of those locations called the HyperVariable Region, Family Tree DNA looks at each individual location and sees what kind of a jellybean is lodged there.  In position 1, if they find the normal apricot jellybean, they move on to position 2.  If they find any kind of jellybean in position 1 other than the apricot that is supposed to be there, they record it as a mutation and record whether the mutation is a T, C or G.  So, Family Tree DNA reads every one of your mitochondrial DNA addresses individually.

Because they do read them individually, they can also discover insertions, where extra DNA is inserted, deletions, where some DNA dropped out of line, and an unusual condition called heteroplasmy, which is a mutation in process where you carry some of two kinds of jellybean in that location – kind of a half and half 2 flavor jellybean.  We’ll talk about heteroplasmic mutations another time.

So, at Family Tree DNA, the results you see are actually what you carry at each of your individual 16,569 mitochondrial addresses.  Your results, an example shown below, are the mutations that were found.  “Normal” is not shown.  The letter following the location number, 16069T, for example, is the mutation found in that location.  In this case, normal is C.  In the RSRS model of showing mitochondrial DNA mutations, this location/mutation combination would be written as C16069T so that you can immediately see what is normal and then the mutated state.  You can click on the images to enlarge.
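The position-by-position reading described above, and the two ways of writing a mutation, can be sketched in a few lines of Python. This is only an illustration, not Family Tree DNA's process: the five-base window and the function name `list_mutations` are made up for the example, and only the C at location 16069 comes from the text. It handles simple changes only, not insertions, deletions or heteroplasmies.

```python
def list_mutations(reference, sample, start=1):
    """Compare two equal-length sequences address by address.

    Returns one (short, full) pair per difference, e.g. ('16069T', 'C16069T').
    Only substitutions are handled in this sketch.
    """
    mutations = []
    for offset, (normal, found) in enumerate(zip(reference, sample)):
        if normal != found:
            position = start + offset
            mutations.append((f"{position}{found}", f"{normal}{position}{found}"))
    return mutations


# Hypothetical five-base window covering locations 16067 through 16071.
reference = "AGCTA"  # "normal" jellybeans; only the C at 16069 is from the text
sample = "AGTTA"     # the tester carries a T at 16069

print(list_mutations(reference, sample, start=16067))
```

Locations that match the reference produce no entry at all, which mirrors the fact that "normal" is not shown on the results page.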

ftdna mito results

Family Tree DNA gives you the option to see your results either in the traditional CRS (Cambridge Reference Sequence) model, above, or the more current Reconstructed Sapiens Reference Sequence (RSRS) model.  I am showing the CRS version because that is the version utilized by 23andMe and I want to compare apples and apples.  You can read about the difference between the two versions here.

Defining Haplogroups

Haplogroups are defined by specific mutations at certain addresses.

For example, the following mutations, cumulatively, define haplogroup J1c2f.  Each branch is defined by its own mutation(s).

Haplogroup   Required Mutations
J            C295T, T489C, A10398G!, A12612G, G13708A, C16069T
J1           C462T, G3010A
J1c          G185A, G228A, T14798C
J1c2         A188G
J1c2f        G9055A

You can see, below, that the results shown above do carry these mutations, which is how this individual was assigned to haplogroup J1c2f. You can read about how haplogroups are defined here.

ftdna J1c2f mutations

At 23andMe, they use chip based technology that scans only specifically programmed locations for specific values.  So, they would look at only the locations that would be haplogroup producing, and only those locations.  Better yet, if there is one location that is utilized in haplogroup J1c2f that is predictive of ONLY J1c2f, they would select and use that location.

This same individual at 23andMe is classified as haplogroup J1c2, not J1c2f.  This could be a function of two things.  First, the probes might not cover that final location, 9055, and second, 23andMe may not be utilizing the same version of the mitochondrial haplotree as Family Tree DNA.
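To see why a missing probe at location 9055 would stop the assignment at J1c2, here is a small Python sketch of the cumulative logic. The mutation lists are copied from the table above; the single-branch layout, the function name `deepest_haplogroup` and the "chip" result are assumptions made for illustration, not how either company actually computes haplogroups.

```python
# Defining mutations for one branch of the tree, copied from the table above.
HAPLOGROUP_PATH = [
    ("J", {"C295T", "T489C", "A10398G!", "A12612G", "G13708A", "C16069T"}),
    ("J1", {"C462T", "G3010A"}),
    ("J1c", {"G185A", "G228A", "T14798C"}),
    ("J1c2", {"A188G"}),
    ("J1c2f", {"G9055A"}),
]


def deepest_haplogroup(observed):
    """Walk down the branch and stop at the first level that is not fully observed."""
    assigned = None
    for name, required in HAPLOGROUP_PATH:
        if not required <= observed:
            break
        assigned = name
    return assigned


# A full sequence read sees every defining mutation on the branch.
full_sequence = set().union(*(required for _, required in HAPLOGROUP_PATH))

# Hypothetical chip result: same person, but location 9055 is never probed.
chip_result = full_sequence - {"G9055A"}

print(deepest_haplogroup(full_sequence))  # J1c2f
print(deepest_haplogroup(chip_result))    # J1c2
```

The person's DNA is identical in both cases. Only the amount of it that was read differs.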

By clicking on the 23andMe option for “Ancestry Tools,” then “Haplogroup Tree Mutation Mapper,” you can see which mutations were tested with the probes to determine a haplogroup assignment.  23andMe information for this haplogroup is shown below.  This is not personal information, meaning it is not specific to you, except that you know you have mutations at these locations based on the fact that they have assigned you to the specific haplogroup defined by these mutations.  What 23andMe is showing in their chart is the ancestral value, which is the value you DON’T have.  So your jelly bean is not chocolate at location 295, it’s tangerine, apricot or grape.

Notice that 23andMe does not test for J1c2f.  In addition, 23andMe cannot pick up on insertions, deletions or heteroplasmies.  Normally, since they aren’t reading each one of your locations and providing you with that report, missing insertions and deletions doesn’t affect anything, BUT, if a deletion or insertion is haplogroup defining, they will miss this call.  Haplogroup K comes to mind.

J defining mutations

J1 defining mutations

J1c defining mutations

23andMe never looks at any locations in the jelly bean jar other than the ones to assign a haplogroup, in this case, 17 locations.  Family Tree DNA reads every jelly bean in the jelly bean jar, all 16,569.  Different technology, different results.  You also receive your haplogroup at 23andMe as part of a $99 package, but of course the individual reading of your mitochondrial DNA at Family Tree DNA is more accurate.  Which is best for you depends on your personal testing goals, so long as you accurately understand the differences and therefore how to interpret results.  A haplogroup match does not mean you’re a genealogy match.  More than one person has told me that they are haplogroup J1c, for example, at Family Tree DNA and they match someone at 23andMe on the same haplogroup, so they KNOW they have a common ancestor in the past few generations.  That’s an incorrect interpretation.  Let’s take a look at why.

Matches Between the Two

23andMe provides the tester with a list of the people who match them at the haplogroup level.  Most people don’t actually find this information, because it is buried on the “My Results,” then “Maternal Line” page, then scrolling down until your haplogroup is displayed on the right hand side with a box around it.

Those who do find this are confused because they interpret this to mean they are a match, as in a genealogical match, like at Family Tree DNA, or like when you match someone at either company autosomally.  This is NOT the case.

For example, other than known family members, this individual matches two other people classified as haplogroup J1c2.  How close of a match is this really?  How long ago do they share a common ancestor?

Taking a look at Doron Behar’s paper, “A “Copernican” Reassessment of the Human Mitochondrial DNA Tree from its Root,” in the supplemental material we find that haplogroup J1c2 was born about 9762 years ago with a variance of plus or minus about 2010 years, so sometime between 7,752 and 11,772 years ago.  This means that these people are related sometime in the past, roughly, 10,000 years – maybe as little as 7000 years ago.  This is absolutely NOT the same as matching your individual 16,569 markers at Family Tree DNA.  Haplogroup matching only means you share a common ancestor many thousands of years ago.
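The arithmetic behind that window is simple enough to check yourself. A minimal Python sketch, using only the two figures quoted above (the function name `age_window` is my own):

```python
def age_window(estimate_years, plus_minus_years):
    """Return the (most recent, oldest) bounds for a haplogroup age estimate."""
    return estimate_years - plus_minus_years, estimate_years + plus_minus_years


# Haplogroup J1c2: about 9762 years old, plus or minus about 2010 years.
print(age_window(9762, 2010))  # (7752, 11772)
```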

For people who match each other on their individual mitochondrial DNA location markers, their haplotype, Family Tree DNA provides the following information in their FAQ:

    • Matching on HVR1 means that you have a 50% chance of sharing a common maternal ancestor within the last fifty-two generations. That is about 1,300 years.
    • Matching on HVR1 and HVR2 means that you have a 50% chance of sharing a common maternal ancestor within the last twenty-eight generations. That is about 700 years.
    • Matching exactly on the Mitochondrial DNA Full Sequence test brings your matches into more recent times. It means that you have a 50% chance of sharing a common maternal ancestor within the last 5 generations. That is about 125 years.
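The year figures in the FAQ follow from the generation counts if you assume 25 years per generation. That 25 is my inference from the numbers themselves (1,300 divided by 52), not something the FAQ text above states. A short Python sketch:

```python
# Assumption: 25 years per generation, inferred from the FAQ figures (1300 / 52).
YEARS_PER_GENERATION = 25

# Generations for a 50% chance of a shared maternal ancestor, from the FAQ above.
FAQ_GENERATIONS = {"HVR1": 52, "HVR1 + HVR2": 28, "Full Sequence": 5}


def generations_to_years(generations, years_per_generation=YEARS_PER_GENERATION):
    """Convert a number of generations into an approximate number of years."""
    return generations * years_per_generation


for level, generations in FAQ_GENERATIONS.items():
    print(f"{level}: {generations} generations, about {generations_to_years(generations)} years")
```

Changing the assumed generation length, say to 30 years, stretches every one of these estimates accordingly.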

I actually think these numbers are a bit generous, especially on the full sequence.  We all know that obtaining mitochondrial DNA matches that we can trace is more difficult than obtaining Y chromosome matches.  Of course, the surname changing in mitochondrial lines every generation doesn’t help one bit and often causes us to “lose” maternal lines before we “lose” paternal lines.

Autosomal and Haplogroups, Together

As long as we’re mythbusting here – I want to make one other point.  I have heard people say, more than once, that an autosomal match isn’t valid “because the haplogroups don’t match.”  Of course, this tells me immediately that someone doesn’t understand either autosomal matching, which covers all of your ancestral lines, or haplogroups, which cover ONLY either your matrilineal, meaning mitochondrial, or patrilineal, meaning Y DNA, line.  Now, if you match autosomally AND share a common haplogroup as well, at 23andMe, that might be a hint of where to look for a common ancestor.  But it’s only a hint.

At Family Tree DNA, it’s more than a hint.  You can tell for sure by selecting the “Advanced Matching” option under Y-DNA, mtDNA or Family Finder and selecting the options for both Family Finder (autosomal) and the other type of DNA you are inquiring about.  The results of this query tell you if your markers for both of these tests (or whatever tests are selected) match with any individuals on your match list.
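Conceptually, asking for people you match on all selected tests is an intersection of match lists. The Python sketch below shows only that idea. The kit names and the function `match_on_all_tests` are hypothetical, and this is not a description of how Family Tree DNA implements the feature.

```python
def match_on_all_tests(*match_lists):
    """Keep only the people who appear on every one of the supplied match lists."""
    if not match_lists:
        return []
    common = set(match_lists[0])
    for matches in match_lists[1:]:
        common &= set(matches)
    return sorted(common)


# Hypothetical match lists for one tester.
family_finder_matches = ["Kit A", "Kit B", "Kit C"]
mtdna_hvr1_matches = ["Kit B", "Kit D"]

print(match_on_all_tests(family_finder_matches, mtdna_hvr1_matches))  # ['Kit B']
```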

Advanced match options

Hint – for mitochondrial DNA, I never select “full sequence” or “all mtDNA” because I don’t want to miss someone who has only tested at the HVR1 level and also matches me autosomally.  I tend to try several combinations to make sure I cover every possibility, especially given that you may match someone at the full sequence level, which allows for mutations, that you don’t match at the HVR1 level.  Same situation for Y DNA as well.  Also note that you need to answer “yes” to “Show only people I match on all selected tests.”

Y-DNA at 23andMe

Y-DNA works pretty much the same at 23andMe as mitochondrial, meaning they probe certain haplogroup-defining locations.  They do utilize a different Y tree than Family Tree DNA, so the haplogroup names may be somewhat different, but will still be in the same base haplogroup.  Like mitochondrial DNA, by utilizing the haplogroup mapper, you can see which probes are utilized to determine the haplogroup.  The normal SNP name is given directly after the rs number.  The rs number is the address of the DNA on the chromosome.  The display of Y mutations is a bit different than the display for mitochondrial DNA.  While mitochondrial DNA at 23andMe shows you only the normal value, for Y DNA, they show you both the normal, or ancestral, value and the derived, or current, value as well.  So at SNP P44, grape is normal and you have apricot if you’ve been assigned to haplogroup C3.

C3 defining mutations

As we are all aware, many new haplogroups have been defined in the past several months, and continue to be discovered via the results of the Big Y and Full Y test results which are being returned on a daily basis.  Because 23andMe does not have the ability to change their probes without burning an entirely new chip, updates will not happen often.  In fact, their new V4 chip just introduced in December actually reduced the number of probes from 967,000 to 602,000, although CeCe Moore reported that the number of mtDNA and Y probes increased.

By way of comparison, the ISOGG tree is shown below.  Very recently C3 was renamed to C2, which isn’t really the point here.  You can see just how many haplogroups really exist below C3/C2 defined by SNP M217.  And if you think this is a lot, you should see haplogroup R – it goes on for days and days!

ISOGG C3-C2 cropped

How long ago do you share a common ancestor with that other person at 23andMe who is also assigned to haplogroup C3?  Well, we don’t have a handy dandy reference chart for Y DNA like we do for mitochondrial – partly because it’s a constantly moving target, but haplogroup C3 is about 12,000 years old, plus or minus about 5,000 years, and is found on both sides of the Bering Strait.  It is found in indigenous Native American populations along with Siberians and in some frequency, throughout all of Asia and in low frequencies, into Europe.

How do you find out more about your haplogroup, or if you really do match that other person who is C3?  Test at Family Tree DNA.  23andMe is not in the business of testing individual markers.  Their business focus is autosomal DNA and its various applications, medical and genealogical, and that’s it.

Y-DNA at Family Tree DNA

At Family Tree DNA, you can test STR markers at 12, 25, 37, 67 and 111 marker levels.  Most people, today, begin with either 37 or 67 markers.

Of course, you receive your results in several ways at Family Tree DNA, including Haplogroup Origins, Ancestral Origins, Matches Maps and Migration Maps, but what most people are most interested in are the individual matches to other people.  These STR markers are great for genealogical matching.  You can read about the difference between STR and SNP markers here.

When you take the Y test, Family Tree DNA also provides you with an estimated haplogroup.  That estimate has proven to be very accurate over the years.  They only estimate your haplogroup if you have a proven match to someone who has been SNP tested. Of course it’s not a deep haplogroup – in haplogroup R1b it will be something like R1b1a2.  So, while it’s not deep, it’s free and it’s accurate.  If they can’t predict your haplogroup using those criteria, they will test you for free.  It’s called their SNP assurance program and it has been in place for many years.  This is normally only necessary for unusual DNA, but, as a project administrator, I still see backbone tests being performed from time to time.

If you want to purchase SNP tests, in various formats, you can confirm your haplogroup and order deeper testing.

You can order individual SNP markers for about $39 each and do selective testing.  On the screen below you can see the SNPs available to purchase for haplogroup C3 a la carte.

FTDNA C3 SNPs

You can order the Geno 2.0 test for $199 and obtain a large number of SNPs tested, over 12,000, for the all-inclusive price.  New SNPs discovered since the release of their chip in July of 2012 won’t be included either, but you can then order those a la carte if you wish.

Or you can go all out and order the new Big Y for $695 where all of your Y jellybeans, all 13.5 million of them in your Y DNA jar are individually looked at and evaluated.  People who choose this new test are compared against a data base of more than 36,000 known SNPs and each person receives a list of “novel variants” which means individual SNPs never before discovered and not documented in the SNP data base of 36,000.

Don’t know which path to take?  I would suggest that you talk to the haplogroup project administrator for the haplogroup you fall into.  Need to know how to determine which project to join, and how to join? Click here.  Haplogroup project administrators are generally very knowledgeable and helpful.  Many of them are spearheading research into their haplogroup of interest and their knowledge of that haplogroup exceeds that of anyone else.  Of course you can also contact Family Tree DNA and ask for assistance, you can purchase a Quick Consult from me, and you can read this article about comparing your options.

______________________________________________________________

STRs vs SNPs, Multiple DNA Personalities

One of the questions I receive rather regularly is about the difference between STRs and SNPs.

Generally, what people really want to understand is the difference between the products, and a basic answer is really all they want.  I explain that an STR or Short Tandem Repeat is a different kind of a mutation than a SNP or a Single Nucleotide Polymorphism.  STRs are useful genealogically, to determine to whom you match within a recent timeframe, of say, the past 500 years or so, and SNPs define haplogroups which reach much further back in time.  Furthermore SNPs are considered “once in a lifetime,” or maybe better stated, “once in the lifetime of mankind” type of events, known as a UEP, Unique Event Polymorphism, where STRs happen “all the time,” in every haplogroup.  In fact, this is why you can check for the same STR markers in every haplogroup – those markers we all know and love.

STR

This was a pretty good explanation for a long time but as sequencing technology has improved and new tests have become available, such as the Full Y and Big Y tests, new mutations are being very rapidly discovered which blurs the line between the timeframes that had been used to separate these types of tests.  In fact, now they are overlapping in time, so SNPs are, in some cases becoming genealogically useful.  This also means that these newly discovered family SNPs are relatively new, meaning they only occurred between the current generation and 1000 years ago, so we should not expect to find huge numbers of these newly developed mutations in the population.  For example, if the SNP that defined haplogroup R1b1a2, M269, occurred 15,000 years ago in one man, his descendants have had 15,000 years to procreate and pass his M269 on down the line(s), something they have done very successfully since about half of Europe is either M269 or a subclade.

Each subclade has a SNP all its own.  In fact, each subclade is defined by a specific SNP that forms its own branch of the human Y haplotree.

So far, so good.

But what does a SNP or an STR really look like, I mean, in the raw data?  How do you know that you’re seeing one or the other?

Like Baseball – 4 Bases

The smallest units of DNA are made up of 4 base nucleotides, DNA words, that are represented by the following letters:

A = Adenine
C = Cytosine
G = Guanine
T = Thymine

TACG

These nucleotides combine in pairs to form the ladder rungs of DNA, shown at right, that connect the helix backbones.  T typically combines with A and C usually combines with G, reaching between the backbones of the double helix to connect with their companion base in the center.

You don’t need to remember the words or even the letters, just remember that we are looking for pattern matches of segments of DNA.

Point Mutations

Your DNA when represented on paper looks like a string of beads where there are 4 kinds of beads, each representing one of the nucleotides above.  One segment of your DNA might look like this:

Indel example 1

If this is what the standard or reference sequence for your haplotype (your personal DNA results) or your family haplogroup (ancestral clan) looks like, then a mutation would be defined as any change, addition, or deletion.  A change would be if the first A above were to change to T or G or C as in the example below:

Indel example 2

A deletion would be noticed if the leading A were simply gone.

Indel example 3

An addition of course would be if a new bead were inserted in the sequence at that location.

Indel example 4

All of the above changes involve only one location.  These are all known as Point Mutations, because they occur at one single point.
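The three kinds of point mutation described above can be told apart by comparing a result against the reference. Here is a minimal Python sketch. The six-bead sequences are hypothetical stand-ins for the images, and the function name `classify_point_mutation` is my own.

```python
def classify_point_mutation(reference, sample):
    """Classify a difference between two short sequences as a single-location event."""
    if len(sample) == len(reference):
        differences = sum(1 for r, s in zip(reference, sample) if r != s)
        if differences == 0:
            return "no mutation"
        return "change" if differences == 1 else "not a point mutation"
    if len(sample) == len(reference) - 1:
        # Deletion: removing one bead from the reference reproduces the sample.
        for i in range(len(reference)):
            if reference[:i] + reference[i + 1:] == sample:
                return "deletion"
    if len(sample) == len(reference) + 1:
        # Addition: removing one bead from the sample reproduces the reference.
        for i in range(len(sample)):
            if sample[:i] + sample[i + 1:] == reference:
                return "addition"
    return "not a point mutation"


reference = "ACTGCT"  # hypothetical reference string of beads
print(classify_point_mutation(reference, "TCTGCT"))   # change
print(classify_point_mutation(reference, "CTGCT"))    # deletion
print(classify_point_mutation(reference, "GACTGCT"))  # addition
```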

SNPs

A point mutation may or may not be a SNP.  A SNP is defined by geneticists as a point mutation that is found in more than 1% of the population.  This should tell you right away that when we say “we’ve discovered a new SNP,” we’re really mis-applying that term, because until we determine that the frequency at which it is found in the population is over the 1% threshold, it really isn’t a SNP, but is still considered a point mutation or binary polymorphism.

Today, when SNPs, or point mutations, are discovered, they are considered “private mutations” or “family mutations.”  There has been consternation for some time about how to handle these types of situations.  ISOGG has set forth their criteria on their website.  They currently have the most comprehensive tree, but they certainly have their work cut out for them with the incoming tsunami of new SNPs that will be discovered utilizing these next generation tests, hundreds of which are currently in process.

STRs

An STR, or Short Tandem Repeat, is analogous to a genetic stutter, or the copy machine getting stuck.  In the same situation as above, utilizing the same base for comparison, we see a group of inserted nucleotides that are all duplicates of each other.

STR example

In this case, we have a short tandem repeat that is 4 segments in length, meaning that CT is inserted 4 times.  To translate, if this is marker DYS390, you have a value of 5, meaning 5 repeats of CT.
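An STR marker value is just a count of back-to-back copies of the repeated unit. The Python sketch below counts the longest run of a motif in a sequence. The sequence is a hypothetical example built to contain 5 copies of CT, and `repeat_count` is my own helper, not a lab's calling method.

```python
import re


def repeat_count(sequence, motif):
    """Return the length, in repeats, of the longest back-to-back run of motif."""
    runs = re.findall(f"(?:{re.escape(motif)})+", sequence)
    return max((len(run) // len(motif) for run in runs), default=0)


# Hypothetical stretch of DNA: flanking bases around 5 tandem copies of CT.
sequence = "AG" + "CT" * 5 + "GA"

print(repeat_count(sequence, "CT"))  # 5
```

Two men who differ by one repeat at this marker would show values of 5 and 6, even though every individual base in the shared portion is the same.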

So I’ve been fat and happy with this now for years, well over a decade.

The Monkey Wrench

And then I saw this:

“The L69/L159 polymorphism is essentially a SNP/STR oxymoron.”

To the best of my knowledge, this is impossible – one type of mutation excludes the other.  I googled about this topic and found nothing, nor did I find additional discussion of L69, other than this.

L69 verbiage

My first reaction to this was “that’s impossible,” followed by “Bloody Hell,” and my next reaction was to find someone who knew.

I reached out to Dr. David Mittelman, geneticist and Chief Scientific Officer at Gene by Gene, parent company of Family Tree DNA.  I asked him about the SNP/STR oxymoron and he said:

“This is impossible. There is no such thing as a SNP/STR.”

Whew!  I must say, I’m relieved.  I thought for a minute there I had lost my mind.

I asked him what is really going on in this sequence, and he replied that, “This would be a complex variant — when multiple things are happening at once.”

Now, that I understand.  I have children, and grandchildren – I fully understand multiple things happening at once.  Let’s break this example apart and take a look at what is really happening.

HUGO is a reference standard, so let’s start there as our basis for comparison.

HUGO variant 1

In the L69 variant we have the following sequence.

HUGO variant 2

We see two distinct things happening in this sequence.  First, we have the deletion of two Gs, and secondly, we have the insertion of one additional TG.  According to Dr. Mittelman, both of these events are STRs, multiple insertions or deletions, and neither is a point mutation or SNP, so neither of these should really have SNP names; they should have STR type names.

Let’s look at the L159 variant.

HUGO variant 3

In this case, we have the GG insertion and then we have a TG deletion.

In both cases, L69 and L159, the actual length of the DNA sequence remains the same as the reference, but the contents are different.  Both had 2 nucleotides removed and 2 added.

The good news is, as a consumer, that you don’t really need to know this, not at this level.  The even better news is that with the new discoveries forthcoming, whether they be STRs or SNPs, at the leafy end of the branch, they are often now overlapping with SNPs becoming much more genealogically useful.  In the past, if you were looking at a genetics mutation timeline, you had STRs that covered current to 1000 years, then nothing, then beginning at 5,000 or 10,000 years, you have SNPs that were haplogroup defining.

That gap has been steadily shrinking, and today, there often is no gap, the chasm is gone, and we’re discovering freshly hatched recently-occurring SNPs on a daily basis.

The day is fast approaching when you’ll want the full Y sequence, not to further define your haplogroup, but to further delineate your genealogy lines.  You’ll have two tools to do that, SNPs and STRs both, not just one.

______________________________________________________________

That Unruly X….Chromosome That Is

Iceberg

Something is wrong with the X chromosome.  More specifically, something is amiss with trying to use it, the way we normally use recombinant chromosomes for genealogy.  In short, there’s a problem.

If you don’t understand how the X chromosome recombines and is passed from generation to generation, now would be a good time to read my article, “X Marks the Spot” about how this works.  You’ll need this basic information to understand what I’m about to discuss.

The first hint of this “problem” is apparent in Jim Owston’s “Phasing the X Chromosome” article.  Jim’s interest in phasing his X, or figuring out where it came from genealogically, was spurred by his lack of X matches with his brothers.  This is noteworthy, because men don’t inherit any X from their father, so Jim’s failure to share much of his X with his brothers meant that he had inherited most of his X from just one of his mother’s parents, and his brothers inherited theirs from the other parent.  Utilizing cousins, Jim was able to further phase his X, meaning to attribute portions to the various grandparents from whence it came.  After doing this work, Jim said the following:

“Since I can only confirm the originating grandparent of 51% my X-DNA, I tend to believe (but cannot confirm at the present) that my X-chromosome may be an exact copy of my mother’s inherited X from her mother. If this is the case, I would not have inherited any X-DNA from my grandfather. This would also indicate that my brother Chuck’s X-DNA is 97% from our grandfather and only 3% from our grandmother. My brother John would then have 77% of his X-DNA from our grandfather and 23% from our grandmother.”

As a genetic genealogist, at the time Jim wrote this piece, I was most interested in the fact that he had phased or attributed the pieces of the X to specific ancestors and the process he used to do that.  I found the very skewed inheritance “interesting” but basically attributed it to an anomaly.  It now appears that this is not an anomaly.  It was, instead, the tip of the iceberg and we didn’t recognize it as such.  Let’s look at what we would normally expect.

Recombination

The X chromosome does recombine when it can, or at least has the capacity to do so.  This means that a female who receives an X from both her father and mother receives a recombined X from her mother, but receives an X that is not recombined from her father.  That is because her father only receives one X, from his mother, so he has nothing to recombine with.  In the mother, the X recombines “in the normal way” meaning that parts of both her mother’s and her father’s X are given to her children, or at least that opportunity exists.  If you’re beginning to see some “weasel words” here or “hedge betting,” that’s because we’ve discovered that things aren’t always what they seem or could be.

The 50% Rule

In the statistical world of DNA, on the average, we believe that each generation receives roughly half of the DNA of the generations before them.  We know that each child absolutely receives 50% of the DNA of each parent, but how the grandparents’ DNA is divided up into that 50% that goes to each offspring differs.  It may not be 50%.  I am in the process of doing a generational inheritance study, which I will publish soon, which discusses this as a whole.

However, let’s use the 50% rule here, because it’s all we have and it’s what we’ve been working with forever.

In a normal autosomal, meaning non-X, situation, every generation provides to the current generation the following approximate % of DNA:

Autosomal % chart
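For readers who like to see the arithmetic, the halving described above can be sketched in a few lines of Python.  This is only an illustration of the 50% rule as an average; the function name is my own invention for this example, and real inheritance varies around these numbers.

```python
def expected_autosomal_share(generations_back: int) -> float:
    """Average fraction of your autosomal DNA expected from ONE ancestor
    who is the given number of generations back (1 = parent).

    Assumption: the idealized 50% rule, where each generation halves
    the contribution. Actual inherited amounts vary around this average.
    """
    if generations_back < 1:
        raise ValueError("generations_back must be 1 or greater")
    return 0.5 ** generations_back


labels = ["parent", "grandparent", "great-grandparent", "2x great-grandparent"]
for n, label in enumerate(labels, start=1):
    print(f"{label}: {expected_autosomal_share(n):.2%}")
```

Under this rule a parent averages 50%, a grandparent 25%, a great-grandparent 12.5%, and so on.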

Please note Blaine Bettinger’s X maternal inheritance chart percentages from his “More X-Chromosome Charts” article, and used with his kind permission in the X Marks the Spot article.

Blaine's maternal X %

I’m enlarging the inheritance percentage portion so you can see it better.

Blaine's maternal X % cropped

Taking a look at these percentages, it becomes evident that we cannot utilize the normal predictive methods of saying that if we share a certain percentage of DNA with an individual, then we are most likely a specific relationship.  This is because the percentage of X chromosome inherited varies based on the inheritance path, since men don’t receive an X from their fathers.  Not only does this mean that you receive no X from many ancestors, you receive a different percentage of the X from your maternal grandmother, 25%, because your mother inherited an X from both of her parents, versus from your paternal grandmother, 50%, because your father inherited an X from only his mother.
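The path dependence described above can be sketched in Python.  This is a minimal illustration, assuming the idealized averages (a man’s X is 100% from his mother, a woman’s X is 50% from each parent); the names `SHARE_FROM_PARENT` and `expected_x_share` are made up for this example and are not from any testing company’s tools.

```python
# Expected share of a person's X DNA that comes from each parent.
# Assumption: idealized averages only, not measured values.
SHARE_FROM_PARENT = {
    ("M", "mother"): 1.0,  # a man's only X comes from his mother
    ("M", "father"): 0.0,  # a man receives the Y, not an X, from his father
    ("F", "mother"): 0.5,  # a woman has one X from each parent
    ("F", "father"): 0.5,
}
PARENT_SEX = {"mother": "F", "father": "M"}


def expected_x_share(tester_sex, path):
    """Expected fraction of the tester's X DNA from the ancestor reached
    by following `path`, a list of 'mother'/'father' steps from the tester."""
    share = 1.0
    sex = tester_sex
    for step in path:
        share *= SHARE_FROM_PARENT[(sex, step)]
        sex = PARENT_SEX[step]  # move up one generation
    return share


# A woman's two grandmothers contribute different expected amounts.
print(expected_x_share("F", ["mother", "mother"]))  # maternal grandmother: 0.25
print(expected_x_share("F", ["father", "mother"]))  # paternal grandmother: 0.5
print(expected_x_share("F", ["father", "father"]))  # paternal grandfather: 0.0
```

Any path that passes from a father to his father returns zero, which is why so many ancestors contribute no X at all.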

The Genetic Kinship chart, below, from the ISOGG wiki, is the “Bible” that we use in terms of estimating relationships.  It doesn’t work for the X.

Mapping cousin chart

Let’s look at the normal autosomal inheritance model as compared to the maternal X fan chart percentages, above, and similar calculations for the paternal side.  Remember, the Maternal Only column applies only to men, because in the very first generation, men’s and women’s inheritance percentages diverge.  Men receive 100% of their X from their mothers, while women receive 50% from each parent.

Generational X %s

Recombination – The Next Problem

The genetic genealogy community has been hounding Family Tree DNA incessantly to add the X chromosome matching into their Family Finder matching calculations.

On January 2, 2014, they did exactly that.  What’s that old saying, “Be careful what you ask for….”  Well, we got it, but “it” doesn’t seem to be providing us with exactly what we expected.

First, there were many reports of women having many more matches than men.  That’s to be expected at some level because women have so many more ancestors in the “mix,” especially when matching other women.

23andMe takes this unique mixture into consideration, or at least attempts to compensate for it at some level.  I’m not sure if this is a good or bad thing or if it’s useful, truthfully.  While their normal autosomal SNP matching threshold is 7cM and 700 matching SNPs within that segment, for X, their thresholds are:

  • Male matched to male – 1cM/200 SNPs
  • Male matched to female – 6cM/600 SNPs
  • Female matched to female – 6cM/1200 SNPs
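To make the three thresholds above concrete, here is a small Python sketch.  The numbers come from the list above; the function name and the assumption that a segment qualifies when it meets or exceeds both minimums are mine, and this is not 23andMe’s actual code.

```python
# X matching thresholds as listed above: (minimum cM, minimum SNPs).
# Keys are the set of sexes in the pair, so order does not matter.
X_THRESHOLDS = {
    frozenset(["M"]): (1.0, 200),       # male matched to male
    frozenset(["M", "F"]): (6.0, 600),  # male matched to female
    frozenset(["F"]): (6.0, 1200),      # female matched to female
}


def meets_x_threshold(sex_a, sex_b, cm, snps):
    """Return True if an X segment of `cm` centimorgans and `snps` SNPs
    would qualify for this pair. Assumes 'meets or exceeds' comparisons."""
    min_cm, min_snps = X_THRESHOLDS[frozenset([sex_a, sex_b])]
    return cm >= min_cm and snps >= min_snps


print(meets_x_threshold("M", "M", 1.5, 250))  # True
print(meets_x_threshold("F", "F", 6.5, 900))  # False, too few SNPs
```

The same 6.5cM segment can count as a match for one pair and not for another, depending on who is being compared.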

Family Tree DNA does not use the X exclusively for matching.  This means that if you match someone utilizing their normal autosomal matching criteria of approximately 7.7cM and 500 SNPs, and you match them on the X chromosome, they will report your X as matching.  If you don’t match someone on any chromosome except the X, you will not be reported as a match.

The X matching criteria at Family Tree DNA is:

  • 1cM/500 SNPs
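The two-step rule described above, autosomal match first and X second, can be sketched as follows.  This is a simplification under stated assumptions: the 7.7cM/500 SNP figures are the approximate ones given above, the function and parameter names are hypothetical, and Family Tree DNA’s real matching criteria are more involved than a single comparison.

```python
def x_match_reported(autosomal_cm, autosomal_snps, x_cm, x_snps):
    """Return True if an X match would be shown under the rule described
    in the text: the pair must first qualify as an autosomal match, and
    only then is a shared X segment reported.

    Assumption: simple 'meets or exceeds' comparisons on one segment.
    """
    is_autosomal_match = autosomal_cm >= 7.7 and autosomal_snps >= 500
    meets_x_criteria = x_cm >= 1.0 and x_snps >= 500
    return is_autosomal_match and meets_x_criteria


# Shares a large X segment but nothing qualifying elsewhere: not reported.
print(x_match_reported(autosomal_cm=0.0, autosomal_snps=0, x_cm=20.0, x_snps=2000))
# Qualifies autosomally and shares a small X segment: reported.
print(x_match_reported(autosomal_cm=9.0, autosomal_snps=800, x_cm=1.2, x_snps=550))
```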

However, matching isn’t all of the story.

The X appears to not recombine normally.  By normally, I don’t mean something is medically wrong, I mean that it’s not what we are expecting to see in terms of the 50% rule.  In essence, we would expect to see approximately half of the X of each parent, grandfather and grandmother, passed on to the child from the mother in the maternal line where recombination is a possibility.  That appears to not be happening reliably.  Not only is this not happening in the nice neat 50% number, the X chromosome seems to be often not recombining at all.  If you think the percentages in the chart above threw a monkey wrench into genetic genealogy predictions, this information, if it holds up in a much larger test, in essence throws our predictive capability, at least as we know it today, out the window.

The X Doesn’t Recombine as Expected

In my generational study, I noticed that the X seemed not to be recombining.  Then I remembered something that Matt Dexter said at the Family Tree DNA Conference in November 2013 in Houston.  Matt has the benefit of having a full 3 generation pedigree chart where everyone has been tested, and he has 5 children, so he can clearly see who got the DNA from which of their grandparents.

I contacted Matt, and he provided me with his X chromosomal information about his family, giving me permission to share it with you.  I have taken the liberty of reformatting it in a spreadsheet so that we can view various aspects of this data.

Dexter table

First, note that I have sorted these by grandchild.  There are two females, who have the opportunity to inherit from 3 grandparents.  The females inherited one copy of the X from their mother, who had two copies herself, and one copy of the X from their father, who only had his mother’s copy.  Therefore, the paternal grandfather is listed above, but with the note “cannot inherit.”  This distinguishes this event from the circumstance with Grandson 1 where he could inherit some part of his maternal grandfather’s X, but did not.

For the three grandsons, I have listed all 4 grandparents and noted the paternal grandmother and grandfather as “cannot inherit.”  This is of course because the grandsons don’t inherit an X from their father.  Instead they inherit the Y, which is what makes them male.

According to the Rule of 50%, each child should receive approximately half of the DNA of each maternal grandparent that they can inherit from.  I added the columns, % Inherited cM and % Inherited SNP to illustrate whether or not this number comes close to the 50% we would expect.  The child MUST have a complete X chromosome which is comprised of 18092 SNPs and is 195.93cM in length, barring anomalies like read errors and such, which do periodically occur.  In these columns, 1=100%, so in the Granddaughter 1 column of % Inherited cM, we see 85% for the maternal grandfather and about 15% for the maternal grandmother.  That is hardly 50-50, and worse yet, it’s no place close to 50%.
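The % Inherited cM calculation is simple division, sketched below in Python.  The total X length is the figure given above; the segment lengths in the example are invented purely for illustration and are NOT Matt Dexter’s data.

```python
X_TOTAL_CM = 195.93  # full X chromosome length, as stated above


def share_of_x(segment_cms):
    """Fraction of the full X (in cM) covered by the given segments,
    where 1.0 means 100%."""
    return sum(segment_cms) / X_TOTAL_CM


# Hypothetical segments attributed to each maternal grandparent.
from_grandfather = [100.0, 66.5]
from_grandmother = [29.43]

for name, segments in [("maternal grandfather", from_grandfather),
                       ("maternal grandmother", from_grandmother)]:
    share = share_of_x(segments)
    print(f"{name}: {share:.1%} (off from 50% by {abs(share - 0.5):.1%})")
```

Because the child must have a complete X, the two maternal shares should add up to roughly 100%, which makes a handy sanity check on the segment data.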

Granddaughter 1 and 2 must inherit their paternal grandmother’s X intact, because there is nothing to recombine with.

Granddaughter 2 inherited even more unevenly, with about 90% and 10%, but in favor of the other grandparent.  So, statistically speaking, it’s about 50% for each grandparent between the two grandchildren, but it is widely variant when looking at them individually.
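To see how two lopsided results can still average out near 50%, here is a short Python sketch using the rounded figures from the text (85/15 and about 90/10 the other way).  The dictionary layout and function name are my own, and the rounded numbers are approximations, not the exact table values.

```python
from statistics import mean

# Rounded maternal X shares from the text (1.0 = 100%).
maternal_shares = {
    "Granddaughter 1": {"maternal grandfather": 0.85, "maternal grandmother": 0.15},
    "Granddaughter 2": {"maternal grandfather": 0.10, "maternal grandmother": 0.90},
}


def average_share(shares, grandparent):
    """Average share from one grandparent across all listed grandchildren."""
    return mean(child[grandparent] for child in shares.values())


for grandparent in ("maternal grandfather", "maternal grandmother"):
    print(f"{grandparent}: {average_share(maternal_shares, grandparent):.1%}")
```

The averages land within a few points of 50%, even though neither granddaughter individually is anywhere near it.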

Grandson 1, as mentioned, inherited his entire X from his maternal grandmother with absolutely no recombination.

Grandsons 2 and 3 fall much closer to the expected 50%.

The problem for most of us is that you need 3 or 4 consecutive generations to really see this happening, and most of us simply don’t have data that deep or robust.

A recent discussion on the DNA Genealogy Rootsweb mailing list revealed several more of these documented occurrences, among them, two separate examples where the X chromosome was unrecombined for 4 generations.

Robert Paine, a long-time genetic genealogy contributor and project administrator, reported that in his family medical/history project at 23andMe, 25% of his participants show no recombination on the X chromosome.  That’s a staggering percentage.  His project consists of 21 people, with 2 bloodlines tested 5 generations deep and 2 bloodlines tested 4 generations deep.

One woman’s X matches her great-great-grandmother’s X exactly.  That’s 4 separate inheritance events in a row where the X was not recombined at all.

The graphic below, provided by Robert,  shows the chromosome browser at 23andMe where you can see the X matches exactly for all three participants being compared.

The screen shot is of the gg-granddaughter Evelyn being compared to her gg-grandmother, Shevy, Evelyn’s g-grandfather Rich and Evelyn’s grandmother Cyndi. 23andMe only lets you compare 3 individuals at a time so Robert did not include Evelyn’s mother Shay, who is an exact match with Evelyn.

Paine X

Where Are We?

So what does this mean to genetic genealogy?  It certainly does not mean we should throw the baby out with the bath water.  What it is, is an iceberg warning that there is more lurking beneath the surface.  What and how big?  I can’t tell you.  I simply don’t know.

Here’s what I can tell you.

  • The X chromosome matching can tell you that you do share a common ancestor someplace back in time.
  • The amount of DNA shared is not a reliable predictor of how long ago you shared that ancestor.
  • The amount of DNA shared cannot predict your relationship with your match.  In fact, even a very large match can be many generations removed.
  • The absence of an X match, even with someone closely related whom you should match, does not disprove a descendant relationship/common ancestor.
  • The X appears to pass down without recombining more often than previously thought, the previous expectation being that this would almost never happen.
  • The X, when it does recombine appears to do so in a manner not governed by the 50% rule.  In fact, the 50% rule may not apply at all except as an average in large population studies, but may well be entirely irrelevant or even misleading to the understanding of X chromosome inheritance in genetic genealogy.

The X is still useful to genetic genealogists, just not in the same way that other autosomal data is utilized.  The X is more of an auxiliary chromosome that can provide information in addition to your other matches because of its unique inheritance pattern.

Unfortunately, this discovery leaves us with more questions than answers.  I found it incomprehensible that this phenomenon has never been studied in humans, or in animals, for that matter, at least not that I could find.  What few references I did find indicated that the X seems to recombine with the same frequency as the autosomes, which we are finding to be untrue.

What is needed is a comprehensive study of hundreds of X transmission events at least 3 generations deep.

As it turns out, we’re not the only ones confused by the behavior of the X chromosome.  Just yesterday, the New York Times had an article about Seeing the X Chromosome in a New Light.  It seems that either one copy of the X, or the other, is disabled cell by cell in the human body.  If you are interested in this aspect of science, it’s a very interesting read.  Indeed, our DNA continues to both amaze and amuse us.

A special thank you to Jim Owston, Matt Dexter, Blaine Bettinger and Robert Paine for sharing their information.

Additional sources:

Polymorphic Variation in Human Meiotic Recombination (2007)
Vivian G. Cheung
University of Pennsylvania
http://repository.upenn.edu/cgi/viewcontent.cgi?article=1102&context=be_papers

A Fine-Scale Map of Recombination Rates and Hotspots Across the Human Genome, Science October 2005, Myers et al
http://www.sciencemag.org/content/310/5746/321.full.pdf
Supplemental Material
http://www.sciencemag.org/content/suppl/2005/10/11/310.5746.321.DC1


2013’s Dynamic Dozen – Top Genetic Genealogy Happenings

dna 8 ball

Last year I wrote a column at the end of the year titled  “2012 Top 10 Genetic Genealogy Happenings.”  It’s amazing the changes in this industry in just one year.  It certainly makes me wonder what the landscape a year from now will look like.

I’ve done the same thing this year, except we have a dozen.  I couldn’t whittle it down to 10, partly because there has been so much more going on and so much change – or, in the case of Ancestry, so little positive movement that it is noteworthy in itself.

If I were to characterize this year of genetic genealogy, I would call it The Year of the SNP, because that applies to both Y DNA and autosomal.  Maybe I’d call it The Legal SNP, because it is also the year of law, court decisions, lawsuits and FDA intervention.  To say it has been interesting is like calling the Eiffel Tower an oversized coat hanger.

I’ll say one thing…it has kept those of us who work and play in this industry hopping busy!  I guarantee you, the words “I’m bored” have come out of the mouth of no one in this industry this past year.

I’ve put these events in what I consider to be relatively accurate order.  We could debate all day about whether the SNP Tsunami or the 23andMe mess is more important or relevant – and there would be lots of arguing points and counterpoints…see…I told you lawyers were involved….but in reality, we don’t know yet, and in the end….it doesn’t matter what order they are in on the list:)

Y Chromosome SNP Tsunami Begins

The SNP tsunami began as a ripple a few years ago with the introduction at Family Tree DNA of the Walk the Y program in 2007.  This was an intensively manual process of SNP discovery, but it was effective.

When the Geno 2.0 chip was introduced in 2012, it included 12,000+ SNPs, among them many that had always been presumed to be equivalent and were not regularly tested.  However, the Nat Geo chip tested them and indeed, the Y tree became massively shuffled.  The resolution to this tree shuffling hasn’t yet come out in the wash.  Family Tree DNA can’t really update their Y tree until a publication comes out with the new tree defined.  That publication has been discussed and anticipated for some time now, but it has yet to materialize.  In the meantime, the volunteers who maintain the ISOGG tree are swamped, to say the least.

Another similar test is the Chromo2 introduced this year by Britain’s DNA which scans 15,000 SNPs, many of them S SNPs not on the tree nor academically published, adding to the difficulty of figuring out where they fit on the Y tree.  While there are some very happy campers with their Chromo2 results, there is also a great deal of sloppy science, reporting and interpretation of “facts” through this company.  Kind of like Jekyll and Hyde.  See the Sloppy Science section.

But Walk the Y, Chromo2 and Geno 2.0 are only the tip of the iceberg.  The new “full Y” sequencing tests, brought into the marketspace quietly in early 2013 by Full Genomes and then with a bang by Family Tree DNA with their Big Y in November, promise to revolutionize what we know about the Y chromosome by discovering thousands of previously unknown SNPs.  This will in effect swamp the Y tree, whose branches we thought were already pretty robust, with thousands and thousands of leaves.

In essence, the promise of the “fully” sequenced Y is that what we might term personal or family SNPs will make SNP testing as useful as STR testing and give us yet another genealogy tool with which to separate various lines of one genetic family and to ratchet down on the time that the most recent common ancestor lived.

http://dna-explained.com/2013/03/31/new-y-dna-haplogroup-naming-convention/

http://dna-explained.com/2013/11/10/family-tree-dna-announces-the-big-y/

http://dna-explained.com/2013/11/16/what-about-the-big-y/

http://www.yourgeneticgenealogist.com/2013/11/first-look-at-full-genomes-y-sequencing.html

http://cruwys.blogspot.com/2013/12/a-first-look-at-britainsdna-chromo-2-y.html

http://cruwys.blogspot.com/2013/11/yseqnet-new-company-offering-single-snp.html

http://cruwys.blogspot.com/2013/11/the-y-chromosome-sequence.html

http://cruwys.blogspot.com/2013/11/a-confusion-of-snps.html

http://cruwys.blogspot.com/2013/11/a-simplified-y-tree-and-common-standard.html

23andMe Comes Unraveled

The story of 23andMe began as the consummate American dotcom fairy tale, but sadly, has deteriorated into a saga with all of the components of a soap opera.  A wealthy wife starts what could be viewed as an upscale hobby business, followed by a messy divorce and a mystery run-in with the powerful overlording evil-step-mother FDA.  One of the founders of 23andMe is/was married to a founder of Google, so funding, at least initially, wasn’t an issue, giving 23andMe the opportunity to make an unprecedented contribution in the genetic, health care and genetic genealogy world.

Another way of looking at this is that 23andMe is the epitome of the American Dream business, a startup, with altruism and good health, both thrown in for good measure, well intentioned, but poorly managed.  And as customers, be it for health or genealogy or both, we all bought into the altruistic “feel good” culture of helping find cures for dread diseases, like Parkinson’s, Alzheimer’s and cancer by contributing our DNA and responding to surveys.

The genetic genealogy community’s love affair with 23andMe began in 2009 when 23andMe started focusing on genealogy reporting for their tests, meaning cousin matches.  We, as a community, suddenly woke up and started ordering these tests in droves.  A few months later, Family Tree DNA began offering this type of testing as well.  The defining difference is that 23andMe’s primary focus has always been on health and medical information, with Family Tree DNA focused on genetic genealogy.  To 23andMe, the genetic genealogy community was an afterthought and genetic genealogy was just another marketing avenue to obtain more people for their health research data base.  For us, that wasn’t necessarily a bad thing.

For a while, this love affair went along swimmingly, but then, in 2012, 23andMe obtained a patent related to Parkinson’s Disease.  That act caused a lot of people to begin to question the corporate focus of 23andMe in the larger quagmire of the ethics of patenting genes as a whole.  Judy Russell, the Legal Genealogist, discussed this here.  It’s difficult to defend 23andMe’s Parkinson’s patent while flaying alive Myriad for their BRCA patent.  Was 23andMe really as altruistic as they would have us believe?

Personally, this event made me very nervous, but I withheld judgment.  But clearly, that was not the purpose for which I thought my DNA, and that of others, was being used.

But then came the Designer Baby patent in 2013.  This made me decidedly uncomfortable.  Yes, I know, some people said this really can’t be done, today, while others said that it’s being done anyway in some aspects…but the fact that this has been the corporate focus of 23andMe with their research, using our data, bothered me a great deal.  I have absolutely no issue with using this information to assure or select for healthy offspring – but I have a personal issue with technology to enable parents who would select a “beauty child,” one with blonde hair and blue eyes and who has the correct muscles to be a star athlete, or cheerleader, or whatever their vision of their as-yet-unconceived “perfect” child would be.  And clearly, based on 23andMe’s own patent submission, that is the focus of their patent.

Upon the issuance of the patent, 23andMe then said they have no intention of using it.  They did not say they won’t sell it.  This also makes absolutely no business sense, to focus valuable corporate resources on something you have no intention of using?  So either they weren’t being truthful, they lack effective management or they’ve changed their mind, but didn’t state such.

What came next, in late 2013 certainly points towards a lack of responsible management.

23andMe had been working with the FDA for approval of the health and medical aspect of their product (which they were already providing to consumers prior to the November 22nd cease and desist order) for several years.  The FDA wants assurances that what 23andMe is telling consumers is accurate.  Based on the letter issued to 23andMe on November 22nd, and subsequent commentary, it appears that both entities were jointly working towards that common goal…until earlier this year when 23andMe mysteriously “somehow forgot” about the FDA, the information they owed them, their submissions, etc.  They also apparently forgot their phone number and their e-mail addresses, because the FDA said they had heard nothing from them in 6 months, which backdates to May of 2013.

It may be relevant that 23andMe added the executive position of President and filled it in June of 2013, and there was a lot of corporate housecleaning that went on at that time.  However, regardless of who got housecleaned, the responsibility for working with the FDA falls squarely on the shoulders of the founders, owners and executives of the company.  Period.  No excuses.  Something that critically important should be on the agenda of every executive management meeting.   Why?  In terms of corporate risk, this was obviously a very high risk item, perhaps the highest risk item, because the FDA can literally shut their doors and destroy them.  There is little they can do to control or affect the FDA situation, except to work with the FDA, meet deadlines and engender goodwill and a spirit of cooperation.  The risk of not doing that is exactly what happened.

It’s unknown at this time if 23andMe is really that corporately arrogant to think they could simply ignore the FDA, or blatantly corporately negligent or maybe simply corporately stupid, but they surely betrayed the trust and confidence of their customers by failing to meet their commitments with and to the FDA, or even communicate with them.  I mean, really, what were they thinking?

There has been an outpouring of sympathy for 23andMe and negative backlash towards the FDA for their letter forcing 23andMe to stop selling their offending medical product, meaning the health portion of their testing.  However, in reality, the FDA was only meting out the consequences that 23andMe asked for.  My teenage kids knew this would happen.  If you do what you’re not supposed to….X, Y and Z will, or won’t, happen.  It’s called accountability.  Just ask my son about his prom….he remembers vividly.  Now why my kids, or 23andMe, would push an authority figure to that point, knowing full well the consequences, utterly mystifies me.  It did when my son was a teenager and it does with 23andMe as well.

Some people think that the FDA is trying to stand between consumers and their health information.  I don’t think so, at least not in this case.  I think that because the FDA left the raw data files alone and they left the genetic genealogy aspect alone.  The FDA knows full well you can download your raw data and for $5 process it at a third party site such as Promethease, obtaining health related genetic information.  The difference is that Promethease is not interpreting any data for you, only providing information.

There is some good news in this and that is that from a genetic genealogy perspective, we seem to be safe, at least for now, from government interference with the testing that has been so productive for genetic genealogy.  The FDA had the perfect opportunity to squish us like a bug (thanks to the opening provided by 23andMe,) and they didn’t.

The really frustrating aspect of this is that 23andMe was a company who, with their deep pockets in Silicon Valley and other investors, could actually afford to wage a fight with the FDA, if need be.  The other companies who received the original 2010 FDA letter all went elsewhere and focused on something else.  But 23andMe didn’t, they decided to fight the fight, and we all supported their decision.  But they let us all down.  The fight they are fighting now is not the battle we anticipated, but one brought upon themselves by their own negligence.  This battle didn’t have to happen, and it may impair them financially to such a degree that if they need to fight the big fight, they won’t be able to.

Right now, 23andMe is selling their kits, but only as an ancestry product as they work through whatever process they are working through with the FDA.  Unfortunately, 23andMe is currently having some difficulties where the majority of matches are disappearing from some testers’ records.  In other cases, segments that previously matched are disappearing.  One would think, with their only revenue stream for now being the genetic genealogy marketspace, that they would be wearing kid gloves and being extremely careful, but apparently not.  They might even consider making some of the changes and enhancements we’ve requested for so long that have fallen on deaf ears.

One thing is for sure, it will be extremely interesting to see where 23andMe is this time next year.  The soap opera continues.

I hope for the sake of all of the health consumers, both current and (potentially) future, that this dotcom fairy tale has a happy ending.

Also, see the Autosomal DNA Comes of Age section.

http://dna-explained.com/2013/10/05/23andme-patents-technology-for-designer-babies/

http://www.thegeneticgenealogist.com/2013/10/07/a-new-patent-for-23andme-creates-controversy/

http://dna-explained.com/2013/11/13/genomics-law-review-discusses-designing-children/

http://www.thegeneticgenealogist.com/2013/06/11/andy-page-fills-new-president-position-at-23andme/

http://dna-explained.com/2013/11/25/fda-orders-23andme-to-discontinue-testing/

http://dna-explained.com/2013/11/26/now-what-23andme-and-the-fda/

http://dna-explained.com/2013/12/06/23andme-suspends-health-related-genetic-tests/

http://www.legalgenealogist.com/blog/2013/11/26/fooling-with-fda/

Supreme Court Decision – Genes Can’t Be Patented – Followed by Lawsuits

In a landmark decision, the Supreme Court determined that genes cannot be patented.  Myriad Genetics held patents on two BRCA genes that predisposed people to cancer.  The cost for the tests through Myriad was about $3000.  Six hours after the Supreme Court decision, Gene By Gene announced that same test for $995.  Other firms followed suit, and all were subsequently sued by Myriad for patent infringement.  I was shocked by this, but as one of my lawyer friends clearly pointed out, you can sue anyone for anything.  Making it stick is yet another matter.  Many firms settle to avoid long and very expensive legal battles.  Clearly, this issue is not yet resolved, although one would think a Supreme Court decision would be pretty definitive.  It potentially won’t be settled for a long time.

http://dna-explained.com/2013/06/13/supreme-court-decision-genes-cant-be-patented/

http://www.legalgenealogist.com/blog/2013/06/14/our-dna-cant-be-patented/

http://dna-explained.com/2013/09/07/message-from-bennett-greenspan-free-my-genes/

http://www.thegeneticgenealogist.com/2013/06/13/new-press-release-from-dnatraits-regarding-the-supreme-courts-holding-in-myriad/

http://www.legalgenealogist.com/blog/2013/08/18/testing-firms-land-counterpunch/

http://www.legalgenealogist.com/blog/2013/07/11/myriad-sues-genetic-testing-firms/

Gene By Gene Steps Up, Ramps Up and Produces

As 23andMe comes unraveled and Ancestry languishes in its mediocrity, Gene By Gene, the parent company of Family Tree DNA, has stepped up to the plate, committed to do “whatever it takes,” ramped up the staff both through hiring and acquisitions, and is producing results.  This is, indeed, a breath of fresh air for genetic genealogists, as well as a welcome relief.

http://dna-explained.com/2013/08/07/gene-by-gene-acquires-arpeggi/

http://dna-explained.com/2013/12/05/family-tree-dna-listens-and-acts/

http://dna-explained.com/2013/12/10/family-tree-dnas-family-finder-match-matrix-released/

http://www.haplogroup.org/ftdna-family-finder-matches-get-new-look/

http://www.haplogroup.org/ftdna-family-finder-new-look-2/

http://www.haplogroup.org/ftdna-family-finder-matches-new-look-3/

Autosomal DNA Comes of Age

Autosomal DNA testing and analysis has simply exploded this past year.  More and more people are testing, in part, because Ancestry.com has a captive audience in their subscription data base and more than a quarter million of those subscribers have purchased autosomal DNA tests.  That’s a good thing, in general, but there are some negative aspects relative to Ancestry, which are in the Ancestry section.

Another boon to autosomal testing was the 23andMe push to obtain a million records.  Of course, the operative word here is “was” but that may revive when the FDA issue is resolved.  One of the down sides to the 23andMe data base, aside from the fact that it’s not genealogist friendly, is that so many people, about 90%, don’t communicate.  They aren’t interested in genealogy.

A third factor is that Family Tree DNA has provided transfer ability for files from both 23andMe and Ancestry into their data base.

Fourth is the site, GedMatch, at www.gedmatch.com which provides additional matching and admixture tools and the ability to match below thresholds set by the testing companies.  This is sometimes critically important, especially when comparing to known cousins who just don’t happen to match at the higher thresholds, for example.  Unfortunately, not enough people know about GedMatch, or are willing to download their files.  Also unfortunate is that GedMatch has struggled for the past few months to keep up with the demand placed on their site and resources.

A great deal of time this year has been spent by those of us in the education aspect of genetic genealogy, in whatever our capacity, teaching about how to utilize autosomal results. It’s not necessarily straightforward.  For example, I wrote a 9 part series titled “The Autosomal Me” which detailed how to utilize chromosome mapping for finding minority ethnic admixture, which was, in my case, both Native and African American.

As the year ends, we have Family Tree DNA, 23andMe and Ancestry who offer the autosomal test which includes the relative-matching aspect.  Fortunately, we also have third party tools like www.GedMatch.com and www.DNAGedcom.com, without which we would be significantly hamstrung.  In the case of DNAGedcom, we would be unable to perform chromosome segment matching and triangulation with 23andMe data without Rob Warthen’s invaluable tool.

http://dna-explained.com/2013/06/21/triangulation-for-autosomal-dna/

http://dna-explained.com/2013/07/13/combining-tools-autosomal-plus-y-dna-mtdna-and-the-x-chromosome/

http://dna-explained.com/2013/07/26/family-tree-dna-levels-the-playing-field-sort-of/

http://dna-explained.com/2013/08/03/kitty-coopers-chromsome-mapping-tool-released/

http://dna-explained.com/2013/09/29/why-dont-i-match-my-cousin/

http://dna-explained.com/2013/10/03/family-tree-dna-updates-family-finder-and-adds-triangulation/

http://dna-explained.com/2013/10/21/why-are-my-predicted-cousin-relationships-wrong/

http://dna-explained.com/2013/12/05/family-tree-dna-listens-and-acts/

http://dna-explained.com/2013/12/09/chromosome-mapping-aka-ancestor-mapping/

http://dna-explained.com/2013/12/10/family-tree-dnas-family-finder-match-matrix-released/

http://dna-explained.com/2013/12/15/one-chromosome-two-sides-no-zipper-icw-and-the-matrix/

http://dna-explained.com/2013/06/02/the-autosomal-me-summary-and-pdf-file/

DNAGedcom – Indispensable Third Party Tool

While this tool, www.dnagedcom.com, falls into the Autosomal grouping, I have separated it out for individual mention because without this tool, the progress made this year in autosomal DNA ancestor and chromosomal mapping would have been impossible.  Family Tree DNA has always provided segment matching boundaries through their chromosome browser tool, but until recently, you could only download 5 matches at a time.  This is no longer the case, but for most of the year, Rob’s tool saved us massive amounts of time.

23andMe does not provide those chromosome boundaries, but utilizing Rob’s tool, you can obtain each of your matches in one download, and then you can obtain the list of which people on your match list also match each of your matches by requesting each of those files separately.  Multiple steps?  Yes, but it’s the only way to obtain this information, and chromosome mapping without the segment data is impossible.
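To show why the segment boundaries matter so much, here is a minimal sketch in Python of the kind of comparison they make possible.  It is not DNAGedcom's code or file format.  The Segment fields, the in_common set and the function names are all my own assumptions for illustration.  The idea is simply that two matches are only worth a closer look when their segments overlap on the same chromosome and they are also known to match each other.

```python
from itertools import combinations
from typing import NamedTuple


class Segment(NamedTuple):
    """One shared segment with one match (hypothetical layout, not a vendor format)."""
    match: str
    chromosome: int
    start: int
    end: int


def overlap(a, b):
    """Length of the region shared by two segments, 0 if they do not overlap."""
    if a.chromosome != b.chromosome:
        return 0
    return max(0, min(a.end, b.end) - max(a.start, b.start))


def triangulation_candidates(segments, in_common, min_overlap=1):
    """Return pairs of matches whose segments overlap AND who match each other.

    in_common is a set of frozenset({name1, name2}) pairs, standing in for the
    "who do my matches match" lists that are downloaded separately.
    """
    found = []
    for a, b in combinations(segments, 2):
        if a.match == b.match:
            continue
        if overlap(a, b) >= min_overlap and frozenset((a.match, b.match)) in in_common:
            found.append((a.match, b.match, a.chromosome,
                          max(a.start, b.start), min(a.end, b.end)))
    return found


segments = [
    Segment("Ann", 1, 1000, 5000),
    Segment("Bob", 1, 3000, 8000),
    Segment("Cal", 2, 3000, 8000),
    Segment("Dee", 1, 4000, 6000),
]
in_common = {frozenset(("Ann", "Bob"))}
candidates = triangulation_candidates(segments, in_common)
```

In this made-up example, Dee overlaps both Ann and Bob on chromosome 1, but is not known to match either of them, so she could just as easily be on the other parent's side.  Only Ann and Bob come back as a candidate pair.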

A special hats off to Rob.  Please remember that Rob’s site is free, meaning it’s donation based.  So, please donate if you use the tool.

http://www.yourgeneticgenealogist.com/2013/01/brought-to-you-by-adoptiondna.html

I covered www.Gedmatch.com in the “Best of 2012” list, but they have struggled this year, beginning when Ancestry announced that raw data file downloads were available.  GedMatch consists of two individuals, volunteers, who are still struggling to keep up with the required processing and the tools.  They too are donation based, so don’t forget about them if you utilize their tools.

Ancestry – How Great Thou Aren’t

Ancestry is only on this list because of what they haven’t done.  When they initially introduced their autosomal product, they didn’t have any search capability, they didn’t have a chromosome browser and they didn’t have raw data file download capability, all of which their competitors had upon first release.  All they did have was a list of your matches, with their trees listed, with shakey leaves if you shared a common ancestor on your tree.  The implication was, and is, of course, that if you have a DNA match and a shakey leaf, that IS your link, your genetic link, to each other.  Unfortunately, that is NOT the case, as CeCe Moore documented in her blog from RootsTech (starting just below the pictures) as an illustration of WHY we so desperately need a chromosome browser tool.

In a nutshell, Ancestry showed the wrong shakey leaf as the DNA connection – as proven by the fact that both of CeCe’s parents have tested at Ancestry and the shakey leaf person doesn’t match the requisite parent.  And there wasn’t just one instance of this, or two, but three.  What this means is, of course, that the DNA match and the shakey leaf match are entirely independent of each other.  In fact, you could have several common ancestors, but the DNA at any particular location comes only from one of them, on either Mom’s or Dad’s side – and maybe not even the shakey leaf person.
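The check that exposed the problem is easy to express.  The sketch below, in Python, is only an illustration under my own assumptions: the function names and the idea of holding each parent's match list as a simple set of names are hypothetical, and Ancestry provides nothing like this.  When both parents have tested, a tree hint that points to one side of the family is only plausible as the genetic link if the match also appears on that parent's match list.

```python
def side_of_match(match_id, mother_matches, father_matches):
    """Report which tested parent, if either, also has this person as a match."""
    in_mother = match_id in mother_matches
    in_father = match_id in father_matches
    if in_mother and in_father:
        return "both"
    if in_mother:
        return "maternal"
    if in_father:
        return "paternal"
    return "neither"


def leaf_is_consistent(match_id, leaf_side, mother_matches, father_matches):
    """True if the tree hint's side ('maternal' or 'paternal') agrees with the DNA.

    A match shared with both parents cannot rule the hint out, so it counts
    as consistent here.
    """
    side = side_of_match(match_id, mother_matches, father_matches)
    return side == leaf_side or side == "both"


# Made-up names: the leaf says the common ancestor is on Dad's side,
# but only Mom has this person on her match list.
mother_matches = {"cousin_a", "cousin_b"}
father_matches = {"cousin_c"}
hint_ok = leaf_is_consistent("cousin_a", "paternal", mother_matches, father_matches)
```

Here hint_ok comes back False, which is the situation described above: a real DNA match and a real common ancestor in the tree, but not the same connection.  Even a True result only means the hint has not been ruled out, which is why the segment data is still needed.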

So what Ancestry customers are receiving is a list of people they match and possible links, but most of them have no idea that this is the case, and blissfully believe they have found their genetic connection.  They have found a genealogical cousin, and it MIGHT be the genetic connection.  But then again, they could have found that cousin simply by searching for the same ancestor in Ancestry’s data base.  No DNA needed.

Ancestry has added a search feature, allowed raw data file downloads (thank you) and they have updated their ethnicity predictions.  The ethnicity predictions are certainly different, dramatically different, but equally as unrealistic.  See the Ethnicity Makeovers section for more on this.  The search function helps, but what we really need is the chromosome browser, which they have steadfastly avoided promising.  Instead, they have said that they will give us “something better,” but nothing has materialized.

I want to take this opportunity to say, as loudly as possible, that TRUST ME IS NOT ACCEPTABLE in any way, shape or form when it comes to genetic matching.  I’m not sure what Ancestry has in mind by way of “better,” but if it’s anything like the mediocrity with which their existing DNA products have been rolled out, neither I nor any other serious genetic genealogist will be interested, satisfied or placated.

Regardless, it’s been nearly 2 years now.  Ancestry has the funds to do development.  They are not a small company.  This is obviously not a priority because they don’t need to develop this feature.  Why is this?  Because they can continue to sell tests and to give shakey leaves to customers, most of whom don’t understand the subtle “untruth” inherent in that leaf match – so are quite blissfully happy.

In years past, I worked in the computer industry when IBM was the Big Dog against whom everyone else competed.  I’m reminded of an old joke.  The IBM sales rep got married, and on his wedding night, he sat on the edge of the bed all night long regaling his bride in glorious detail with stories about just how good it was going to be….

You can sign a petition asking Ancestry to provide a chromosome browser here, and you can submit your request directly to Ancestry as well, although to date, this has not been effective.

The most frustrating aspect of this situation is that Ancestry, with their plethora of trees, savvy marketing and captive audience testers really was positioned to “do it right,” and hasn’t, at least not yet.  They seem to be more interested in selling kits and providing shakey leaves that are misleading in terms of what they mean than providing true tools.  One wonders if they are afraid that their customers will be “less happy” when they discover the truth and not developing a chromosome browser is a way to keep their customers blissfully in the dark.

http://dna-explained.com/2013/03/21/downloading-ancestrys-autosomal-dna-raw-data-file/

http://dna-explained.com/2013/03/24/ancestry-needs-another-push-chromosome-browser/

http://dna-explained.com/2013/10/17/ancestrys-updated-v2-ethnicity-summary/

http://www.thegeneticgenealogist.com/2013/06/21/new-search-features-at-ancestrydna-and-a-sneak-peek-at-new-ethnicity-estimates/

http://www.yourgeneticgenealogist.com/2013/03/ancestrydna-raw-data-and-rootstech.html

http://www.legalgenealogist.com/blog/2013/09/15/dna-disappointment/

http://www.legalgenealogist.com/blog/2013/09/13/ancestrydna-begins-rollout-of-update/

Ancient DNA

This has been a huge year for advances in sequencing ancient DNA, something once thought unachievable.  We have learned a great deal, and there are many more skeletal remains just begging to be sequenced.  One absolutely fascinating find is that all people not African (and some who are African through backmigration) carry Neanderthal and Denisovan DNA.  Just this week, evidence of yet another archaic hominid line has been found in Neanderthal DNA and on Christmas Day, yet another article stating that type 2 Diabetes found in Native Americans has roots in their Neanderthal ancestors. Wow!

Closer to home, by several thousand years, is the suggestion that haplogroup R did not exist in Europe immediately after the ice age and only later replaced most of the population, which, for males, appears to have been primarily haplogroup G.  It will be very interesting as the data bases of fully sequenced skeletons are built and compared.  The history of our ancestors is held in those precious bones.

http://dna-explained.com/2013/01/10/decoding-and-rethinking-neanderthals/

http://dna-explained.com/2013/07/04/ancient-dna-analysis-from-canada/

http://dna-explained.com/2013/07/10/5500-year-old-grandmother-found-using-dna/

http://dna-explained.com/2013/10/25/ancestor-of-native-americans-in-asia-was-30-western-eurasian/

http://dna-explained.com/2013/11/12/2013-family-tree-dna-conference-day-2/

http://dna-explained.com/2013/11/22/native-american-gene-flow-europe-asia-and-the-americas/

http://dna-explained.com/2013/12/05/400000-year-old-dna-from-spain-sequenced/

http://www.thegeneticgenealogist.com/2013/10/16/identifying-otzi-the-icemans-relatives/

http://cruwys.blogspot.com/2013/12/recordings-of-royal-societys-ancient.html

http://cruwys.blogspot.com/2013/02/richard-iii-king-is-found.html

http://dna-explained.com/2013/12/22/sequencing-of-neanderthal-toe-bone-reveals-unknown-hominin-line/

http://dna-explained.com/2013/12/26/native-americans-neanderthal-and-denisova-admixture/

http://dienekes.blogspot.com/2013/12/ancient-dna-what-2013-has-brought.html

Sloppy Science and Sensationalist Reporting

Unfortunately, as DNA becomes more mainstream, it becomes a target for either sloppy science or intentional misinterpretation, and possibly both.  Without academic publication, we can’t see results or have the sense of security that comes from the peer review process, so we don’t know if the science and conclusions pass muster.

The race to the buck in some instances is the catalyst for this. In other cases, and not in the links below, some people intentionally skew interpretations and results in order to either fulfill their own belief agenda or to sell “products and services” that invariably report specific findings.

It’s equally unfortunate that many of these misconstrued and sensationalized results are coming from a testing company that goes by the names of BritainsDNA, ScotlandsDNA, IrelandsDNA and YorkshiresDNA. It certainly does nothing for their credibility in the eyes of people who are familiar with the topics at hand, but it does garner a lot of press and probably sells a lot of kits to the unwary.

I hope they publish their findings so we can remove the “sloppy science” aspect of this.  Sensationalist reporting, while irritating, can be dealt with if the science is sound.  However, until the results are published in a peer-reviewed academic journal, we have no way of knowing.

Thankfully, Debbie Kennett has been keeping her thumb on this situation, occurring primarily in the British Isles.

http://dna-explained.com/2013/08/24/you-might-be-a-pict-if/

http://cruwys.blogspot.com/2013/12/the-british-genetic-muddle-by-alistair.html

http://cruwys.blogspot.com/2013/12/setting-record-straight-about-sara.html

http://cruwys.blogspot.com/2013/09/private-eye-on-britainsdna.html

http://cruwys.blogspot.com/2013/07/private-eye-on-prince-williams-indian.html

http://cruwys.blogspot.com/2013/06/britainsdna-times-and-prince-william.html

http://cruwys.blogspot.com/2013/03/sense-about-genealogical-dna-testing.html

http://cruwys.blogspot.com/2013/03/sense-about-genetic-ancestry-testing.html

Citizen Science is Coming of Age

Citizen science has been slowly coming of age over the past few years.  By this, I mean when citizen scientists work as part of a team on a significant discovery or paper.  Bill Hurst comes to mind with his work with Dr. Doron Behar on his paper, A “Copernican” Reassessment of the Human Mitochondrial DNA Tree from its Root, or what is known as the RSRS model.  As the years have progressed, more and more discoveries have been made or assisted by citizen scientists, sometimes through our projects and other times through individual research.  JOGG, the Journal of Genetic Genealogy, which is currently on hiatus waiting for Dr. Turi King, the new editor, to become available, was a great avenue for peer reviewed publication.  Recently, research projects have been set up by citizen scientists, sometimes crowd-funded, for specific areas of research.  This is a very new aspect to scientific research, and one not before utilized.

The first paper below includes the Family Tree DNA Lab, Thomas and Astrid Krahn, then with Family Tree DNA and Bonnie Schrack, genetic genealogist and citizen scientist, along with Dr. Michael Hammer from the University of Arizona and others.

http://dna-explained.com/2013/03/26/family-tree-dna-research-center-facilitates-discovery-of-ancient-root-to-y-tree/

http://dna-explained.com/2013/04/10/diy-dna-analysis-genomeweb-and-citizen-scientist-2-0/

http://dna-explained.com/2013/06/27/big-news-probable-native-american-haplogroup-breakthrough/

http://dna-explained.com/2013/07/22/citizen-science-strikes-again-this-time-in-cameroon/

http://dna-explained.com/2013/11/30/native-american-haplogroups-q-c-and-the-big-y-test/

http://www.yourgeneticgenealogist.com/2013/03/citizen-science-helps-to-rewrite-y.html

Ethnicity Makeovers – Still Not Soup

Unfortunately, ethnicity percentages, as provided by the major testing companies still disappoint more than thrill, at least for those who have either tested at more than one lab or who pretty well know their ethnicity via an extensive pedigree chart.

Ancestry.com is by far the worst example, swinging like a pendulum from one extreme to the other.  But I have to hand it to them, their marketing is amazing.  When I signed in, about to discover that my results had literally almost reversed, I was greeted with the banner “a new you.”  Yea, a new me, based on Ancestry’s erroneous interpretation.  And by reversed, I’m serious.  I went from 80% British Isles to 6% and then from 0% Western Europe to 79%. So now, I have an old wrong one and a new wrong one – and indeed they are very different.  Of course, neither one is correct…..but those are just pesky details…

23andMe updated their ethnicity product this year as well, and fine tuned it yet another time.  My results at 23andMe are relatively accurate.  I saw very little change, but others saw more.  Some were pleased, some not.

The bottom line is that ethnicity tools are not well understood by consumers in terms of the timeframe that is being revealed, and it’s not consistent between vendors, nor are the results.  In some cases, they are flat out wrong, as with Ancestry, and can be proven.  This does not engender a great deal of confidence.  I only view these results as “interesting” or utilize them in very specific situations and then only using the individual admixture tools at www.Gedmatch.com on individual chromosome segments.

As Judy Russell says, “it’s not soup yet.”  That doesn’t mean it’s not interesting though, so long as you understand the difference between interesting and gospel.

http://dna-explained.com/2013/08/05/autosomal-dna-ancient-ancestors-ethnicity-and-the-dandelion/

http://dna-explained.com/2013/10/04/ethnicity-results-true-or-not/

http://www.legalgenealogist.com/blog/2013/09/15/dna-disappointment/

http://cruwys.blogspot.com/2013/09/my-updated-ethnicity-results-from.html?utm_source=feedburner&utm_medium=email&utm_campaign=Feed%3A+Cruwysnews+%28Cruwys+news%29

http://dna-explained.com/2013/10/17/ancestrys-updated-v2-ethnicity-summary/

http://dna-explained.com/2013/10/19/determining-ethnicity-percentages/

http://www.thegeneticgenealogist.com/2013/09/12/ancestrydna-launches-new-ethnicity-estimate/

http://cruwys.blogspot.com/2013/12/a-first-look-at-chromo-2-all-my.html

Genetic Genealogy Education Goes Mainstream

With the explosion of genetic genealogy testing, as one might expect, the demand for education, and in particular, basic education has exploded as well.

I’ve written a 101 series, Kelly Wheaton wrote a series of lessons and CeCe Moore did as well.  Recently Family Tree DNA has also sponsored a series of free Webinars.  I know that at least one book is in process and very near publication, hopefully right after the first of the year.  We saw several conferences this year that provided a focus on Genetic Genealogy and I know several are planned for 2014.  Genetic genealogy is going mainstream!!!  Let’s hope that 2014 is equally as successful and that all these folks asking for training and education become avid genetic genealogists.

http://dna-explained.com/2013/08/10/ngs-series-on-dna-basics-all-4-parts/

https://sites.google.com/site/wheatonsurname/home

http://www.yourgeneticgenealogist.com/2012/08/getting-started-in-dna-testing-for.html

http://dna-explained.com/2013/12/17/free-webinars-from-family-tree-dna/

http://www.thegeneticgenealogist.com/2013/06/09/the-first-dna-day-at-the-southern-california-genealogy-society-jamboree/

http://www.yourgeneticgenealogist.com/2013/06/the-first-ever-independent-genetic.html

http://cruwys.blogspot.com/2013/10/genetic-genealogy-comes-to-ireland.html

http://cruwys.blogspot.com/2013/03/wdytya-live-day-3-part-2-new-ancient.html

http://cruwys.blogspot.com/2013/03/who-do-you-think-you-are-live-day-3.html

http://cruwys.blogspot.com/2013/03/who-do-you-think-you-are-live-2013-days.html

http://genealem-geneticgenealogy.blogspot.com/2013/03/the-surnames-handbook-guide-to-family.html

http://www.isogg.org/wiki/Beginners%27_guides_to_genetic_genealogy

A Thank You in Closing

I want to close by taking a minute to thank the thousands of volunteers who make such a difference.  All of the project administrators at Family Tree DNA are volunteers, and according to their website, there are 7829 projects, all of which have at least one administrator, and many have multiple administrators.  In addition, everyone who answers questions on a list or board or on Facebook is a volunteer.  Many donate their time to coordinate events, groups, or moderate online facilities.  Many speak at events or for groups.  Many more write articles for publications from blogs to family newsletters.  Additionally, there are countless websites today that include DNA results…all created and run by volunteers, not the least of which is the ISOGG site with the invaluable ISOGG wiki.  Without our volunteer army, there would be no genetic genealogy community.  Thank you, one and all.

2013 has been a banner year, and 2014 holds a great deal of promise, even without any surprises.  And if there is one thing this industry is well known for….it’s surprises.  I can’t wait to see what 2014 has in store for us!!!  All I can say is hold on tight….

______________________________________________________________

Disclosure

I receive a small contribution when you click on some of the links to vendors in my articles. This does NOT increase the price you pay but helps me to keep the lights on and this informational blog free for everyone. Please click on the links in the articles or to the vendors below if you are purchasing products or DNA testing.

Thank you so much.


Native American Gene Flow – Europe?, Asia and the Americas

Pre-release information from the paper, “Upper Palaeolithic Siberian genome reveals dual ancestry of Native Americans” which included results and analysis of DNA sequencing of 24,000 year old skeletal remains of a 4 year old Siberian boy caused quite a stir.  Unfortunately, it was also misconstrued and incorrectly extrapolated in some articles.  Some people misunderstood, either unintentionally or intentionally, and suggested that people with haplogroups U and R are Native American.  That is not what either the prerelease or the paper itself says.  Not only is that information and interpretation incorrect, the paper itself with the detailed information wasn’t published until November 20th, in Nature.

The paper is currently behind a paywall, so I’m going to discuss parts of it here, along with some additional information from other sources.  To help with geography, the Google map below shows these locations: A=the Altai Republic, in Russia, B=Mal’ta, the location of the 24,000 year old skeletal remains and C=Lake Baikal, the region from where the Native American population originated in Asia.

native flow map

Nature did publish an article preview.  That information is in bold, italics and I will be commenting in nonbold, nonitalics.

The origins of the First Americans remain contentious. Although Native Americans seem to be genetically most closely related to east Asians1, 2, 3, there is no consensus with regard to which specific Old World populations they are closest to4, 5, 6, 7, 8. Here we sequence the draft genome of an approximately 24,000-year-old individual (MA-1), from Mal’ta in south-central Siberia9, to an average depth of 1×. To our knowledge this is the oldest anatomically modern human genome reported to date.

Within the paper, the authors also compare the MA-1 sequence to that of another 40,000 year old individual from Tianyuan Cave, China whose genome has been partially sequenced.  This Chinese individual has been shown to be ancestral to both modern-day Asians and Native Americans.  This comparison was particularly useful, because it showed that MA-1 is not closely related to the Tianyuan Cave individual, and is more closely related to Native Americans.  This means that MA-1’s line and Tianyuan Cave’s line had not yet met and admixed into the population that would become the Native Americans.  That occurred sometime later than 24,000 years ago and probably before crossing Beringia into North America sometime between about 18,000 and 20,000 years ago.

The MA-1 mitochondrial genome belongs to haplogroup U, which has also been found at high frequency among Upper Palaeolithic and Mesolithic European hunter-gatherers10, 11, 12, and the Y chromosome of MA-1 is basal to modern-day western Eurasians and near the root of most Native American lineages5.

The paper goes on to say that MA-1 is a member of mitochondrial (maternal) haplogroup U, very near the base of that haplogroup, but without affiliation to any known subclade, implying either that the subclade is rare or extinct in modern populations.  In other words, this particular line of haplogroup U has NOT been found in any population, anyplace.  According to the landmark paper, “A ‘Copernican’ Reassessment of the Human Mitochondrial DNA Tree from its Root,” by Behar et al, 2012, haplogroup U itself was born about 46,500 years ago (plus or minus 3,200 years) and today has 9 major subclades (plus haplogroup K) and about 300 branching clades from those 9 subclades, excluding haplogroup K.

The map below, from the supplemental material included with the paper shows the distribution of haplogroup U, the black dots showing locations of haplogroup U comparison DNA.

Native flow Hap U map

In a recent paper, “Ancient DNA Reveals Key Stages in the Formation of Central European Mitochondrial Genetic Diversity” by Brandt et al (including the National Geographic Consortium) released in October 2013, the authors report that in the 198 ancient DNA samples collected from 25 German sites and compared to almost 68,000 current results, all of the ancient Hunter-Gatherer cultural results were haplogroup U, U4, U5 and U8.  No other haplogroups were represented.  In addition, those haplogroups disappeared from the region entirely with the advent of farming, shown on the chart below.

Native flow Brandt map

So, if someone who carries haplogroup U wants to say that they are distantly related to MA-1 who lived 24,000 years ago who was also related to their common ancestor who lived sometime prior to that, between 24,000 and 50,000 years ago, probably someplace between the Middle East where U was born, Mal’ta, Siberia and Western Europe, they would be correct.  They are also distantly related to every other person in the world who carries haplogroup U, and many much more closely than MA-1 whose mitochondrial DNA line is either rare as chicken’s teeth (i.e. never found) or has gone extinct.

Let me be very clear about this, there is no evidence, none, that mitochondrial haplogroup U is found in the Native American population today that is NOT a result of post-contact admixture.  In other words, in the burials that have been DNA tested, there is not one example in either North or South America of a burial carrying mitochondrial haplogroup U, or for that matter, male Y haplogroup R.  Native American haplogroups found in the Americas remain subsets of mitochondrial haplogroups A, B, C, D and X and Y DNA haplogroups C and Q.  Mitochondrial haplogroup M has potentially been found in one Canadian burial.  No other haplogroups have been found.  Until pre-contact remains are found with base haplogroups other than the ones listed above, no one can ethically claim that other haplogroups are of Native American origin.  Finding any haplogroup in a contemporary Native population does not mean that it was originally Native, or that it should be counted as such.  Admixture and adoption have been commonplace since Europeans first set foot on the soil of the Americas. 

Now let’s talk about the Y DNA of MA-1.

The authors state that MA-1’s results are found very near the base of haplogroup R.  They note that the sister lineage of haplogroup R, haplogroup Q, is the most common haplogroup in Native Americans and that the closest Eurasian Q results to Native Americans come from the Altai region.

The testing of the MA-1 Y chromosome was much more extensive than the typical STR genealogy tests taken by consumers today.  MA-1’s Y chromosome was sequenced at 5.8 million base pairs at a coverage of 1.5X.

The resulting haplotree is shown below, again from the supplementary material.

Native flow R tree

 native flow r tree text

The current haplogroup distribution range for haplogroup R is shown below, again with comparison points as black dots.

Native flow R map

The current distribution range for Eurasian haplogroup Q is shown on the map below.  Haplogroup Q is the most common haplogroup in Native Americans.

Native flow Q map

Similarly, we find autosomal evidence that MA-1 is basal to modern-day western Eurasians and genetically closely related to modern-day Native Americans, with no close affinity to east Asians. This suggests that populations related to contemporary western Eurasians had a more north-easterly distribution 24,000 years ago than commonly thought. Furthermore, we estimate that 14 to 38% of Native American ancestry may originate through gene flow from this ancient population. This is likely to have occurred after the divergence of Native American ancestors from east Asian ancestors, but before the diversification of Native American populations in the New World. Gene flow from the MA-1 lineage into Native American ancestors could explain why several crania from the First Americans have been reported as bearing morphological characteristics that do not resemble those of east Asians2, 13.

Kennewick Man is probably the most famous of the skeletal remains that don’t neatly fit into their preconceived box.  Kennewick Man was discovered on the bank of the Columbia River in Kennewick, Washington in 1996 and is believed to be from 7300 to 7600 years old.  His anatomical features were quite different from today’s Native Americans and his relationship to ancient people is unknown.  An initial evaluation and a 2010 reevaluation of Kennewick Man led to the conclusion by Doug Owsley, a forensic anthropologist, that Kennewick Man most closely resembles the Ainu people of Japan who themselves are a bit of an enigma, appearing much more Caucasoid than Asian.  Unfortunately, DNA sequencing of Kennewick Man originally was unsuccessful and now, due to ongoing legal issues, more technologically advanced DNA testing has not been allowed.  Nova sponsored a facial reconstruction of Kennewick Man which you can see here.

Sequencing of another south-central Siberian, Afontova Gora-2 dating to approximately 17,000 years ago14, revealed similar autosomal genetic signatures as MA-1, suggesting that the region was continuously occupied by humans throughout the Last Glacial Maximum. Our findings reveal that western Eurasian genetic signatures in modern-day Native Americans derive not only from post-Columbian admixture, as commonly thought, but also from a mixed ancestry of the First Americans.

In addition to the sequencing they set forth above, the authors compared the phenotype information obtainable from MA-1 to the Tyrolean Iceman, typically called Otzi.  You can see Otzi’s facial reconstruction along with more information here.  This is particularly interesting in light of the pigmentation change from darker skin in Africa to lighter skin in Eurasia, and the question of when this appearance change occurred.  MA-1 shows a genetic affinity with the contemporary people of northern Europe, the population today with the highest frequency of light pigmentation phenotypes.  The authors compared the DNA of MA-1 with a set of 124 SNPs identified in 2001 by Cerqueira as informative on skin, hair and eye pigmentation color, although they also caution that this method has limited prediction accuracy.  Given that, they say that MA-1 had dark hair, skin and eyes, but they were not able to sequence the full set of SNPs.  MA-1 also had the SNP value associated with a high risk of male pattern baldness, a trait seldom found in Native American people, and was not lactose tolerant, a trait found in western Eurasians.  MA-1 also does not carry the mutation associated with hair thickness and shovel shaped incisors in Asians.

The chart below from the supplemental material shows the comparison with MA-1 and the Tyrolean Iceman.

Native flow Otzi table

The Tarim Mummies, found in the Tarim Basin in present-day Xinjiang, China are another example of remains that seem out of place.  The earliest Tarim mummies, found at Qäwrighul and dated to 1800 BCE, are of a Europoid physical type whose closest affiliation is to the Bronze Age populations of southern Siberia, Kazakhstan, Central Asia, and the Lower Volga.

The cemetery at Yanbulaq contained 29 mummies which date from 1100–500 BCE, 21 of which are Mongoloid—the earliest Mongoloid mummies found in the Tarim Basin—and eight of which are of the same Europoid physical type found at Qäwrighul.

Notable mummies are the tall, red-haired “Chärchän man” or the “Ur-David” (1000 BCE); his son (1000 BCE), a small 1-year-old baby with brown hair protruding from under a red and blue felt cap, with two stones positioned over its eyes; the “Hami Mummy” (c. 1400–800 BCE), a “red-headed beauty” found in Qizilchoqa; and the “Witches of Subeshi” (4th or 3rd century BCE), who wore 2-foot-long (0.61 m) black felt conical hats with a flat brim. Also found at Subeshi was a man with traces of a surgical operation on his neck; the incision is sewn up with sutures made of horsehair.

Their costumes, and especially textiles, may indicate a common origin with Indo-European neolithic clothing techniques or a common low-level textile technology. Chärchän man wore a red twill tunic and tartan leggings. Textile expert Elizabeth Wayland Barber, who examined the tartan-style cloth, discusses similarities between it and fragments recovered from salt mines associated with the Hallstatt culture.

DNA testing revealed that the maternal lineages were predominantly East Eurasian haplogroup C with smaller numbers of H and K, while the paternal lines were all R1a1a. The geographic location of where this admixing took place is unknown, although south Siberia is likely.  You can view some photographs of the mummies here.

In closing, the authors of the MA-1 paper state that the study has four important implications.

First, we find evidence that contemporary Native Americans and western Eurasians share ancestry through gene flow from a Siberian Upper Palaeolithic population into First Americans.

Second, our findings may provide an explanation for the presence of mtDNA haplogroup X in Native Americans, which is related to western Eurasians but not found in east Asian populations.

Third, such an easterly presence in Asia of a population related to contemporary western Eurasians provides a possibility that non-east Asian cranial characteristics of the First Americans derived from the Old World via migration through Beringia, rather than by a trans-Atlantic voyage from Iberia as proposed by the Solutrean hypothesis.

Fourth, the presence of an ancient western Eurasian genomic signature in the Baikal area before and after the LGM suggests that parts of south-central Siberia were occupied by humans throughout the coldest stages of the last ice age.

The times, they are a changin’.

Dr. Michael Hammer’s presentation at the 9th Annual International Conference on Genetic Genealogy may shed some light on all of this seemingly confusing and somewhat conflicting information.

The graphic below shows the Y haplogroup base tree as documented by van Oven.

Native flow basic Y

You can see, in the lower right corner, that Y haplogroup K (not to be confused with mtDNA haplogroup K discussed in conjunction with mtDNA haplogroup U) was the parent of haplogroup P which is the parent of both haplogroups Q and R.
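For readers who think better in code, the parent and child relationships just described can be sketched as a small lookup table. This is only an illustration covering the four haplogroups named above, not the full Y tree, and the function names are my own invention.

```python
# Parent links for the branches named in the text (K -> P -> Q and R).
# This is a tiny illustrative subset, not the full Y tree.
PARENT = {"P": "K", "Q": "P", "R": "P"}


def lineage(haplogroup):
    """Return the path from a haplogroup up to the oldest ancestor listed here."""
    path = [haplogroup]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def common_ancestor(a, b):
    """Return the most recent haplogroup shared by both lineages, or None."""
    ancestors_of_b = set(lineage(b))
    for node in lineage(a):
        if node in ancestors_of_b:
            return node
    return None


print(lineage("Q"))               # ['Q', 'P', 'K']
print(common_ancestor("Q", "R"))  # P
```

Walking upward from Q and from R reaches P first, which is why the two haplogroups are described as siblings under the same parent.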

It has always been believed that haplogroup R made its way into Europe before the arrival of Neolithic farmers about 10,000 years ago.  However, that conclusion has been called into question, also by the use of Ancient DNA results.  You can view additional information about Hammer’s presentation here, but in a nutshell, he said that there is no early evidence in burials, at all, for haplogroup R being in Europe at an early age.  In about 40 burials from several locations, haplogroup R has never been found.  If it were present, especially in the numbers expected given that it represents more than half of the haplogroups of the men of Europe today, it should be represented in these burials, but it is not.  Hammer concludes that evidence supports a recent spread of haplogroup R into Europe about 5000 years ago.  Where was haplogroup R before spreading into Europe?  In Asia.

Native flow hammer dist

It appears that haplogroup K diversified in Southeast Asia, giving birth to haplogroups P, Q and R. Dr. Hammer said that this new information, combined with new cluster information and newly discovered SNP information over the past two years, requires that haplogroup K be significantly revised.  Between the revision of haplogroup K, the parent of both haplogroup R, previously believed to be European, and haplogroup Q, known to be Asian, European and Native, we may be in for a paradigm shift in terms of what we know about ancient migrations and who is whom.  This path for haplogroup R into Europe really shouldn’t be surprising.  It’s the exact same distribution as haplogroup Q, except haplogroup Q is much less frequently found in Europe than haplogroup R.

What Can We Say About MA-1?

In essence, we can’t label MA-1 as paternally European because of Y haplogroup R which now looks to have had an Asian genesis and was not known to have been in Europe 24,000 years ago, only arriving about 5,000 years ago.  We can’t label haplogroup R as Native American, because it has never been found in a pre-Columbian New World burial.

We can say that mitochondrial haplogroup U is found in Europe in Hunter-Gatherer groups six thousand years ago (R  was not) but we really don’t know if haplogroup U was in Europe 24,000 years ago.  We cannot label haplogroup U as Native because it has never been found in a pre-Columbian New World burial.

We can determine that MA-1 did have ancestors who eventually became European due to autosomal analysis, but we don’t know that those people lived in what is now Europe 24,000 years ago.  So the migration might have been into Europe, not out of Europe.  MA-1, his ancestors and descendants, may have lived in Asia and subsequently settled in Europe or lived someplace in between.  We can determine that MA-1’s line of people eventually admixed with people from East Asia, probably in Siberia, and became today’s First People of North and South America.

We can say that MA-1 appears to have been about 30% what is today Western Eurasian and that he is closely related to modern day Native Americans, but not eastern Asians.  The authors estimate that between 14% and 38% of Native American ancestry comes from MA-1’s ancient population.

Whoever thought we could learn so much from a 4 year old?

For anyone seriously interested in Native American population genetics, “Upper Palaeolithic Siberian genome reveals dual ancestry of Native Americans” is a must read.

It’s been a great month for ancient DNA.  Additional recent articles which pertain to this topic include:

http://www.nytimes.com/2013/11/21/science/two-surprises-in-dna-of-boy-found-buried-in-siberia.html?src=me&ref=general&_r=0

http://www.sciencedaily.com/releases/2013/11/131120143631.htm

http://dienekes.blogspot.com/2013/11/ancient-dna-from-upper-paleolithic-lake.html

http://blogs.discovermagazine.com/gnxp/2013/11/long-first-age-mankind/#.Uo0eOcSkrIU

http://cruwys.blogspot.com/2013/11/day-1-at-royal-societys-2013-ancient.html

http://cruwys.blogspot.co.uk/2013/11/day-2-at-royal-societys-2013-ancient.html

http://www.sciencedaily.com/releases/2013/11/131118081251.htm

______________________________________________________________

Disclosure

I receive a small contribution when you click on some of the links to vendors in my articles. This does NOT increase the price you pay but helps me to keep the lights on and this informational blog free for everyone. Please click on the links in the articles or to the vendors below if you are purchasing products or DNA testing.

Thank you so much.

DNA Purchases and Free Transfers

Genealogy Services

Genealogy Research

2013 Family Tree DNA Conference Day 1

This article is probably less polished than my normal articles.  I’d like to get this information out and to you sooner rather than later, and I’m still on the road the rest of this week with little time to write.  So you’re getting a spruced up version of my notes.  There are some topics here I’d like to write about more in depth later, after I’m back at home and have recovered a bit.

Max Blankfield and Bennett Greenspan, founders, opened the conference on the first day as they always do.  Max began with a bit of a story.

13 years ago Bennett started on a quest….

Indeed he did, and later, Bennett will be relating his own story of that journey.

Someone mentioned to Max that this must be a tough time in this industry.  Max thought about this and said, really, not.  Competition validates what you are doing.

For the competition, it’s just a business opportunity – it was not and is not approached with the passion and commitment that Family Tree DNA has and has always had.

He said this has been their best year ever and that there are great things in the pipeline.

One of the big moves is that Arpeggi merged into Family Tree DNA.

10th Anniversary Pioneer Awards

Quite unexpectedly, Max noted and thanked the early adopters and pioneers, some of whom are gone now but remain with us in spirit.

Max and Bennett recognized the administrators who have been with Family Tree DNA for more than 10 years.  The list included about 20 or so early adopters.  They provided plaques for us and many of us took a photo with Max as the plaques were handed out.

Plaque Max and Me 2013

I am always impressed by the personal humility and gratitude of Max and Bennett, both, to their administrators.  A good part of their success is attributed, I’m sure, to their personal commitment not only to this industry, but to the individual people involved.  When Max noted the admins who were leaders and are no longer with us, he could barely speak.  There were a lot of teary eyes in the room, because they were friends to all of us and we all have good memories.

Thank you, Max and Bennett.

The second day, we took a group photo of all of the recipients along with Max and Bennett.

With that, it was Bennett’s turn for a few remarks.

Bennett remarks

Bennett says that having their own lab provides a wonderful environment and allows them to benchmark and respond to an ever changing business environment.

Today, they are a College of American Pathologists certified lab and tomorrow, we will find out more about what is coming.  Tomorrow, David Mittleman will speak about next generation sequencing.

The handout booklet includes the information that Family Tree DNA now includes over 656,898 records in more than 8,700 group projects. These projects are all managed by volunteer administrators, which in and of itself, is a rather daunting number and amount of volunteer crowd-sourcing.

Session 1 – Amy McGuire, PhD, JD – Am I My Brother’s Keeper?

Dr. McGuire went to college for a very long time.  Her list of degrees would take a page or so.  She is the Director of the Center for Medical Ethics and Health Policy at Baylor College of Medicine.

Thirteen years ago, Amy’s husband was sitting next to Bennett’s wife on an airplane and she gave him a business card.  Then two months ago, Amy wound up sitting next to Max on another airplane.  It’s a very small world.

I will tell you that Amy said that her job is asking the difficult questions, not providing the answers.  You’ll see from what follows that she is quite good at that.

How is genetic genealogy different from clinical genetics in terms of ethics and privacy?  How responsible are we to other family members who share our DNA?

What obligations do we have to relatives in all areas of genetics – clinical testing, direct to consumer testing related to medical information, and genetic genealogy?

She referenced the article below, which I blogged about here at the time.  There was, unfortunately, a lot of fallout in the media.

Identifying Personal Genomes by Surname Inference – Science magazine in January 2013.

She spoke a bit about the history of this issue.

Mcguire

In 2004, a paper was published that stated that it took only 30 to 80 specifically selected SNPs to identify a person.

2008 – Can you identify an individual from pooled or aggregated DNA?  This is relevant to situations like 9/11 where the DNA of multiple individuals has been mixed together.  Can you identify individuals from that brew?

2005 – 15 year old boy identifies his biological father who was a sperm donor.  Is this a good thing or a bad thing?  Some feel that it’s unethical and an invasion of the privacy of the father.  But others feel that if the donor is concerned about that, they shouldn’t be selling their sperm.

Today, for children conceived from sperm donors, there are now websites available to identify half-siblings.

The movement today is towards making sure that people are informed that their anonymity may not be able to be preserved.  DNA is the ultimate identifier.

Genetic Privacy – individual perspectives vary widely.  Some individuals are quite concerned and some are not the least bit concerned.

Some of the concern is based in the eugenics movement stemming from the forced sterilization (against their will) of more than 60,000 Americans beginning in 1907.  These people were considered to be of no value or injurious to the general population – meaning those institutionalized for mental illness or in prison.

1927 – Buck vs Bell – The Supreme Court upheld forced sterilization of a woman who was the third generation institutionalized female for retardation.  “Three generations of imbeciles are enough.”  I must say, the question this leaves me with is how institutionalized retarded women got pregnant in what was supposed to be a “protected” environment.

Hitler, of course, followed and we all know about the Holocaust.

I will also note here that in my experience, concern is not rooted in Eugenics, but she deals more with medical testing and I deal with genetic genealogy.

The issues of privacy and informed consent have become more important because the technology has improved dramatically and the prices have fallen exponentially.

In 2012, the Nanopore USB Sequencer was introduced that can sequence an entire genome for about $1000.

Originally, DNA data was provided in open access data bases and was anonymized by removing names.  The data base from which the 2013 individuals were identified removed names, but included other identifying information including ages and where the individuals lived.  Therefore, using Y-STRs, you could identify these families just like an adoptee utilizes data bases like Y-Search to find their biological father.
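To make the Y-STR matching idea concrete, here is a minimal sketch of how an anonymous profile might be compared against a surname database. Everything in it is an assumption for illustration: the marker names are real Y-STR loci, but the repeat values, the surnames and the simple "sum of differences" distance are invented and far cruder than what a real database search would use.

```python
# Hypothetical Y-STR profiles: marker name -> repeat count.
# Marker names are real loci; every value and surname here is invented.
DATABASE = {
    "Smith":  {"DYS393": 13, "DYS390": 24, "DYS19": 14, "DYS391": 11},
    "Jones":  {"DYS393": 13, "DYS390": 23, "DYS19": 15, "DYS391": 10},
    "Miller": {"DYS393": 14, "DYS390": 22, "DYS19": 14, "DYS391": 10},
}


def genetic_distance(a, b):
    """Sum of absolute repeat-count differences over the markers both share."""
    shared = a.keys() & b.keys()
    return sum(abs(a[m] - b[m]) for m in shared)


def closest_surnames(profile, database, max_distance=1):
    """Return (surname, distance) pairs within max_distance, nearest first."""
    hits = [(name, genetic_distance(profile, ref)) for name, ref in database.items()]
    return sorted((h for h in hits if h[1] <= max_distance), key=lambda h: h[1])


# An "anonymized" research sample with no name attached.
anonymous = {"DYS393": 13, "DYS390": 24, "DYS19": 14, "DYS391": 10}
print(closest_surnames(anonymous, DATABASE))  # [('Smith', 1)]
```

A close match suggests a surname; combined with an age and a place of residence, that can be enough to narrow the field to one family, which is the privacy concern being described.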

Today, research data bases have moved to controlled access, meaning other researchers must apply to have access so that their motivations and purposes can be evaluated.

In a recent medical study, a group of people in a research study were informed and educated about the utility of public data bases and why they are needed versus the tradeoffs, and then they were given a release form providing various options.  53% wanted their info in public domain, 33% in restricted access data bases and 13% wanted no data release.  She notes that these were highly motivated people enrolled in a clinical study.  Other groups such as Native Americans are much more skeptical.

People who did not release their data were concerned with uncertainty of what might occur in the future.

People want to be respected as a research participant.  Most people said they would participate if they were simply asked.  So often it’s less about the data and more about how they are treated.

I would concur with Dr. McGuire on this.  I know several people who refused to participate in a research study because their results would not be returned to them personally.  All they wanted was information and to be treated respectfully.

What  the new genetic privacy issues are really all about is whether or not you are releasing data not just about yourself, but about your family as well.  What rights or issues do the other family members have relative to your DNA?

Jim Watson, one of the discoverers of the structure of DNA, wanted to release his data publicly…except for his inherited Alzheimer’s status.  It was redacted, but you can infer the “answer” from surrounding (flanking regions) DNA.  He has two children.  How does this affect his children?  Should his children sign a consent and release before their father’s genome is published, since part of it is their sequence as well? The academic community was concerned and did not publish this information.  Jim Watson published his own.

There is no concrete policy about this within the academic community.

Dr. McGuire then referenced the book, “The Immortal Life of Henrietta Lacks”.  Henrietta Lacks was a poor African-American woman with cervical cancer.  At that time, in the 1950s, her cancer tissue was considered “waste” and no release was needed as waste could be utilized for research.  She was never informed and never signed a release, but then they were following the protocols of the time.  From her cells, the HeLa cell line, the first immortal human cell line, was created, which ultimately generated a great deal of revenue for research institutes. The family, however, remained impoverished.  The genome was eventually fully sequenced and published.  Henrietta Lacks’ granddaughter said that this was private family information and should never have been published without permission, even though all of the institutions followed all of the protocols in place.

So, aside from the original ethics issues stemming from the 1950s – who is relevant family?  And how does or should this affect policy?

How does this affect genetic genealogy?  Should the rules be different for genetic genealogy, assuming there are (will be) standard policies in place for medical genetics?  Should you have to talk to family members before anyone DNA tests?  Is genetic information different than other types of information?

Should biological relatives be consulted before someone participates in a medical research study as opposed to genetic genealogy?  How about when the original tester dies?  Who has what rights and interests?  What about the unborn?  What about when people need DNA sequencing due to cancer or another immediate and severe health condition which has hereditary components?  Whose rights trump whose?

Today, the data protections are primarily via data base access restrictions.

Dr. McGuire feels the way to protect people is through laws like GINA (Genetic Information Nondiscrimination Act) which protects people from discrimination, but does not reach to all industries like life insurance.

Is this different than people posting photos of family members or other private information without permission on public sites?

While much of Dr. McGuire’s focus is on medical testing and ethics, the topic surely is applicable to genetic genealogy as well and will eventually spill over.  However, I shudder to think that someone would have to get permission from their relatives before they can have a Y-line DNA test.  Yes, there is information that becomes available from these tests, including haplogroup information which has the potential to make people uncomfortable if they expected a different ethnicity than what they receive or an undocumented adoption is involved.  However, doesn’t the DNA carrier have the right to know, and does their right to know what is in their body override the concerns about relatives who should (but might not) share the same haplogroup and paternal line information?

And as one person submitted as a question at the end of the session, isn’t that cat already out of the bag?

Session 2 – Dr. Miguel Vilar – Geno 2.0 Update and 2014 Tree

Dr. Vilar is the Science manager for the National Geographic’s Genographic Project.

“The greatest book written is inside of us.”

Miguel is a molecular anthropologist and science writer at the University of Pennsylvania. He has a special interest in Puerto Rico which has 60% Native mitochondrial DNA – the highest percentage of Native American DNA of any Caribbean Island.

The Genographic project has 3 parts, the indigenous population testing, the Legacy project which provides grants back to the indigenous community and the public participation portion which is the part where we purchase kits and test.

Below, Dr. Vilar discusses the Legacy portion of the project.

Villars

The indigenous population aspect focuses both on modern indigenous and ancient DNA as well.  This information, cumulatively, is used to reconstruct human population migratory routes.

These include 72,000 samples collected 2005-2012 in 12 research centers on 6 continents.  Many of these are working with indigenous samples, including Africa and Australia.

42 academic manuscripts and >80 conference presentations have come forth from the project.  More are in the pipeline.

Most recently, a Science paper was published about the spread of mtDNA throughout Europe across the past 5000 years.  More than 360 ancient samples were collected across several different time periods.  There seems to be a divide in the record about 7000 years ago when several disappear and some of the more well known haplogroups today appear on the scene.

Nat Geo has funded 7 new scientific grants since the Geno 2.0 portion began for autosomal including locations in Australia, Puerto Rico and others.

Public participants – Geno 1.0 went over 500,000 participants, Geno 2.0 has over 80,000 participants to date.

Dr. Vilar mentioned that between 2008 and today, the Y tree has grown exponentially.  That’s for sure.  “We are reshaping the tree in an enormous way.”  What was once believed to be very homogeneous is in reality, as it drills down to the tips, very heterogeneous – a great deal of diversity.

As anyone who works with this information on a daily basis knows, that is probably the understatement of the year.  The Geno 2.0 project, the Walk the Y along with various other private labs are discovering new SNPs more rapidly than they can be placed on the Y tree.  Unfortunately, this has led to multiple trees, none of which are either “official” or “up to date.”  This isn’t meant as a criticism, but more a testimony of just how fast this part of the field is emerging.  I’m hopeful that we will see a tree in 2014, even if it is an interim tree. In fact, Dr. Vilar referred to the 2014 tree.

Next week, the Nat Geo team goes to Ireland and will be looking for the first migrants and settlers in Ireland – both for Y DNA and mitochondrial DNA.  Dr. Vilar says “something happened” about 4000 years ago that changed the frequency of the various haplogroups found in the population.  This “something” is not well understood today but he feels it may be a cultural movement of some sort and is still being studied.

Nat Geo is also focused on haplogroup Q in regions from the Arctic to South America.  Q-M3 has also been found in the Caribbean for the first time, marking a migration up the chain of islands from Mexico and South America within the past 5,000 years.  Papers are coming within the next year about this.

They anticipate that interest will double within the next year.  They expect that based on recent discoveries, the 2015 Y tree will be much larger yet.  Dr. Michael Hammer will speak tomorrow on the Y tree.

Nat Geo will introduce a “new chip by next year.”  The new Ireland data should be available on the National Geographic website within a couple of weeks.

They are also in the process of updating the website with new heat maps and stories.

Session 3 – Matt Dexter – Autosomal Analyses

Matt is a surname administrator, an adoptee and has a BS in Computer Science.  Matt is a relatively new admin, as these things go, beginning his adoptive search in 2008.

Matt found out as a child that he was adopted through a family arrangement.  He contacted his birth mother as an adult.  She told him who his father was, and that man subsequently took a paternity test, which showed that he was not Matt’s biological father.  Unfortunately, his ‘father’ had been very excited to be contacted by Matt, and then, of course, was very disappointed to discover that Matt was not his biological child.

Matt asked his mother about this, and she indicated that yes, “there was another guy, but I told him that the other guy was your father.”  With that, Matt began the search for his biological father.

In order to narrow the candidates, his mother agreed to test, so by process of elimination, Matt now knows which side of his family his autosomal results are from.

Matt covers how autosomal DNA works.

This search has led Matt to an interest in how DNA is passed in general, and specifically from grandparents to grandchildren.

One advantage he has is that he has five children whose DNA he can then compare to his wife and three of their grandparents, inferring of course, the 4th grandparent by process of elimination.  While his children’s DNA doesn’t help him identify his father, it did give him a lot of data to work with to learn about how to use and interpret autosomal DNA.    Here, Matt is discussing his children’s inheritance.

Matt dexter
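The "process of elimination" Matt described can be sketched in a few lines. This is my own simplified illustration, not Matt's actual method: it assumes we are looking at the single chromosome copy a child inherited from one parent, that the segments matching the tested grandparent on that side are already known, and that everything left over came from the untested grandparent. The positions are made up, and real data would also need phasing and some tolerance for errors.

```python
def complement(segments, chrom_length):
    """Return the gaps left on [0, chrom_length) after removing segments.

    segments: list of (start, end) tuples, half-open, in any order.
    """
    gaps, cursor = [], 0
    for start, end in sorted(segments):
        if start > cursor:
            gaps.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < chrom_length:
        gaps.append((cursor, chrom_length))
    return gaps


# Hypothetical positions (in Mb) on the copy of one chromosome that a child
# inherited from their father.
matches_tested_grandmother = [(0, 40), (95, 150)]

# Whatever does not match the tested grandmother is attributed to the
# untested grandfather.
inferred_from_untested_grandfather = complement(matches_tested_grandmother, 150)
print(inferred_from_untested_grandfather)  # [(40, 95)]
```

With five children, each inheriting a different mix, the inferred pieces from all of them together would cover a good deal of the missing grandparent's DNA.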

Session 4 – Jeffrey Mark Paul – Differences in Autosomal DNA Characteristics between Jewish and Non-Jewish Populations and Implications for the Family Finder Test

Dr. Jeffrey Paul, who has a doctorate in Public Health from Johns Hopkins, noticed that his and his wife’s Family Finder results were quite different, and he wanted to know why.  Why did he, Jewish, have so many more matches?

There are 84 participants in the Jewish project that he used for the autosomal comparison.

What factors make Ashkenazi Jews endogamous?  The Ashkenazi represent 80% of the world’s Jewish population.

Arranged marriages based on family backgrounds.  Rabbinical lineages are highly esteemed and they became very inbred with cousins marrying cousins for generations.

Cultural and legal restrictions limited Jewish movements and who they could marry.

Overprediction, meaning people being listed as being cousins more closely than they are, is one of the problems resulting from the endogamous population issue.  Some labs “correct” for this issue, but the actual accuracy of the correction is unknown.

Jeffrey compared his FTDNA Family Finder test with the expected results for known relatives and he finds the results linear – meaning that the results line up with the expected match percentages for those relationships.  This means that FTDNA’s Jewish “correction” seems to be working quite well.  Of course, they do have a great family group with which to calibrate their product.  Bennett’s family is Jewish.

Jeffrey has downloaded the results of group participants into MSAccess and generates queries to test the hypothesis that Jewish participants have more matches than a non-Jewish control group.

The Jewish group had a total of approximately 7% non-Ashkenazi Jewish in their Population Finder results, meaning European and Middle Eastern Jewish.  The non-Jewish group had almost exactly the opposite results.

  • Jewish people have from 1500-2100 matches.
  • Interfaith 700-1100 (Jewish and non)
  • NonJewish 60-616

Jewish people match almost 33% of the other Jewish people in the project.  Jewish people match both Jewish and Interfaith families.  NonJewish families match NonJewish and interfaith matches.

Jeffrey mentioned that many people have Jewish ancestry that they are unaware of.

This session was quite interesting.  This study, while conducted on the Jewish population, still applies to other endogamous populations that are heavily intermarried.  One of the differences between Jewish populations and other groups, such as Amish, Brethren, Mennonite and Native American groups is that there are many Jewish populations that are still unmixed, where most of these other groups are currently intermixed, although of course there are some exceptions.  Furthermore, the Jewish community has been endogamous longer than some of the other groups.  Between both of those factors, length of endogamy and current mixture level, the Jewish population is probably much more highly endogamous than any other group that could be readily studied.

Due to this constant redistribution of Jewish DNA within the same population, many Jewish people have a very high percentage of distant cousin relationships.

For non-Jewish people, if you are finding that your match number is in the endogamous range, with a very high number of distant cousins, proportionally, you might want to consider the possibility that some of your ancestors descend from an endogamous population.
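As a rough way to see where a given match count falls, the ranges Dr. Paul reported can be put into a small lookup. This is only a sketch: the ranges come from one project of 84 participants at one point in time, the group labels and function name are mine, and a match count alone proves nothing about ancestry.

```python
# Match-count ranges as reported in the session for this one project.
RANGES = {
    "Jewish": (1500, 2100),
    "Interfaith": (700, 1100),
    "NonJewish": (60, 616),
}


def reported_groups(match_count):
    """Return the group labels whose reported range contains match_count."""
    return [g for g, (low, high) in RANGES.items() if low <= match_count <= high]


for count in (450, 900, 1800, 1300):
    print(count, reported_groups(count) or "outside every reported range")
```

Notice that the ranges do not touch, so a count such as 1300 falls in a gap. That is a reminder that these are observations from a small project and not cutoffs.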

Unfortunately, the photo of Dr. Paul was unusable.  I knew I should have taken my “real camera.”

Session 5 – Finding Your Indian Prince(ss) Without Having to Kiss Too Many Frogs

This was my session, and I’ll write about it later.

Someone did get a photo, which I’ve lifted from Jennifer Zinck’s great blog (thank you Jennifer), Ancestor Central.  In fact, you can see her writeup for Day 1 here and she is probably writing Day 2’s article as I type this, so watch for it too.

 Estes Indian Princess photo

Session 6 – Roundtable – Y-SNPs, hosted by Roberta Estes, Rebekah Canada and Marie Rundquist

At the end of the day, after the breakout sessions, roundtable discussions were held.  There were several topics.  Rebekah Canada, Marie Rundquist and I together “hostessed” the Y DNA and SNP discussion group, which was quite well attended.  We had a wide range of expertise in the group and answered many questions.  One really good aspect of these types of arrangements is that they are really set up for the participants to interact as well.  In our group, for example, we got the question about what is a public versus a private SNP, and Terry Barton who was attending the session answered the question by telling about his “private” Barton SNPs which are no longer considered private because they have now been found in three other surname individuals/groups.  This means they are listed on the “tree.”  So sometimes public and private can simply be a matter of timing and discovery.

FTDNA roundtable 2013

Here’s Bennett leading another roundtable discussion.

roundtable bennett

Session 7 – Dr. David Mittleman

Mittleman

Dr. Mittleman has a PhD in genetics, is a professor as well as an entrepreneur.  He was one of the partners in Arpeggi and came along to Gene by Gene with the acquisition.  He seems to be the perfect mixture of techie geek, scientist and businessman.

He began his session by talking a bit about the history of DNA sequencing, next generation sequencing and a discussion about the expectation of privacy and how that has changed in the past few years with Google, which was launched in 1998, and Facebook in 2004.

David also discussed how the prices have dropped exponentially in the past few years based on the increase in the sophistication of technology.  Today, Y SNPs individually cost $39 to test, but for $199 at Nat Geo you can test 12,000 Y SNPs.

The WTY test, now discontinued, tested about 300,000 SNPs on the Y.  It cost between $950 (if you were willing to make your results public) and $1500 (if the results were private).

Today, the Y chromosome can be sequenced on the Illumina chip which is the same chip that Nat Geo used and that the autosomal testing uses as well.  Family Tree DNA announced their new Big Y product that will sequence 10 million positions and 25,000 known SNPs for an introductory sale price of $495 for existing customers.  This is not a test that a new customer would ever order.  The test will normally cost $695.
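To put the price drop in perspective, here is a quick back of the envelope calculation using only the prices and counts quoted above. Treat it as a rough sketch: the tests are not strictly comparable (the Nat Geo price also covers mtDNA and autosomal results, and "SNPs tested" is not the same thing as "positions sequenced"), and the labels and function name are mine.

```python
# (price in dollars, number of Y positions or SNPs covered), as quoted above.
OFFERS = {
    "Single Y SNP": (39.00, 1),
    "Nat Geo Geno 2.0": (199.00, 12_000),
    "WTY (public results)": (950.00, 300_000),
    "WTY (private results)": (1500.00, 300_000),
    "Big Y (introductory)": (495.00, 10_000_000),
    "Big Y (regular)": (695.00, 10_000_000),
}


def cost_per_position(price, positions):
    """Dollars paid per Y position or SNP covered by the test."""
    return price / positions


for name, (price, positions) in OFFERS.items():
    print(f"{name:24s} ${cost_per_position(price, positions):.6f} per position")
```

Even at the regular price, the Big Y works out to a tiny fraction of a cent per position, compared with $39 for a single SNP ordered individually.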

Candid Shots

Tech row in the back of the room – Elliott Greenspan at left seated at the table.

tech row

ISOGG Reception

The ISOGG reception is one of my favorite parts of the conference because everyone comes together, can sit in groups and chat, and the “arrival” adrenaline has worn off a bit.  We tend to strategize, share success stories, help each other with sticky problems and otherwise have a great time.  We all bring food or drink and sometimes pitch in to rent the room.  We also spill out into the hallways where our impromptu “meetings” generally happen.  And we do terribly, terribly geeky things like passing our iPhones around with our chromosome painting for everyone to see.  Do we know how to party or what???

Here’s Linda Magellan working hard during the reception.  I think she’s ordering the Big Y actually.  We had several orders placed by admins during the conference.

magellan.jpg

We stayed up way too late visiting and the ISOGG meeting starts at 8 AM tomorrow!
