Native American Haplogroups Q, C and the Big Y Test

Sicangu man c 1900

I’m writing this to provide an update about Native American paternal research, and to ask for your help and support, but first, let me tell you why.  It’s a very exciting time.

If you don’t want the details, but you know you want to help now….and we have to pay for these tests by the end of the day December 1 to take advantage of the sale price…you can click below to help fund the Big Y testing for Native American haplogroups Q and C.  Both projects need approximately $990.  Everything contributed goes directly to testing.

To donate to the American Indian project, in memory of someone, a family member perhaps, or maybe in honor of an ancestor, or anonymously, click this link:

https://www.familytreedna.com/group-general-fund-contribution.aspx?g=AIP

To donate to the haplogroup C-P39 project, please click this link:

http://www.familytreedna.com/group-general-fund-contribution.aspx?g=Y-DNAC-P39

Now for the story…

As many of you know, haplogroups Q and C are the two Native American male haplogroups.  To date, every individual with direct paternal Native American ancestors descends from a subgroup of either haplogroup Q or C, Q being by far the most prevalent.  Both of these haplogroups are also found to some extent in Asia and Europe, but there are distinct and specific lineages found in the Americas that represent only Native Americans.  These subgroups are not found in either Europe or Asia.

In December, 2010, we found the first SNP (single nucleotide polymorphism) marker that separated the European and the Native American subclades of haplogroup Q.  Since that time, additional markers have been found through the Walk the Y program and other research.

How did this happen?  A collaborative research approach between individual testers and project administrators.  In this case, Lenny Trujillo was a member of the haplogroup Q project and he agreed to take the WTY (Walk the Y) test, which indeed, discovered a very unique SNP marker that defines Native American haplogroup Q, as opposed to European haplogroup Q.

Much has changed in three years.  The WTY test which was focused solely on research is entirely obsolete, being replaced by a new much more powerful test called the Big Y, and at a reduced cost.  The Big Y sequences a much larger portion of the Y chromosome, which will allow us to discover even more markers.

Why is this important?  Because today, in haplogroups Q and C, we are learning through standard STR (short tandem repeat) surname marker tests who is related to whom, and how distantly, but it’s not enough.  For example, we have a group of haplogroup Q men in Canada who match each other, but then another group with a different SNP marker that is located in the Southwest, Mexico, and then in the North Carolina/Virginia border area.  Oh yes, and one more from Charleston, SC.  Most Native American men who carry haplogroup C are found in Northeastern Canada….but then there is one in the Southwest. What do these people have in common?  Is their relationship “old” or relatively new?  Do they perhaps share a common historical language group?  We don’t know, and we’d like to.  In order to do that, we need to further refine their genetic relationship.  Hence, the new tool, the Big Y.

The Big Y sequences almost all of the Y chromosome – over 10 million base pairs and nearly 25,000 known SNPs.  But the good news is that the Big Y, like its predecessor, the WTY, has the ability to find new SNPs.  And they are being found by the buckets – so fast that the haplogroup trees can’t even keep up.  For example, the haplogroup project page still lists most Native people as Q1a3a, but in reality many new SNPs have been discovered.

That’s the good news – that the Big Y represents a huge research opportunity for us to make major discoveries that may well divide the Native groups in the Haplogroup C and Q projects into either language groups, or maybe, if we are lucky, into tribal “confederacies,” for lack of a better word.  I hate to use the word tribes, because the definition of a tribe has changed so much.  What we would like to be able to do is to tell someone from their test results that they are Iroquoian, for example, or Athabascan, or Siouan.  This has been our overarching goal for years, and now we’re actually getting close.  That potential rests with the Big Y.

The bad news is that the test costs $495, and that’s the sale price good only through Dec. 1, and we need funding.  In the haplogroup Q project, we do have a few people who are testing.  Everyone who did the WTY has been sent a $50 coupon to apply towards the Big Y test.  I hope everyone who did do the WTY will indeed order the Big Y as well.  If not, then the coupon can be donated to us, as project administrators, to apply towards the Big Y test of someone else in the group who is testing.  If you’re not going to test, please donate your coupon.

In haplogroup Q, we have two additional men who we desperately want to take the Big Y test, and 2 in haplogroup C as well.  We’re asking for two things.  First, for unused $50 coupons and second, for contributions against the $495 price.  We’d certainly welcome large contributions, or a sponsor for an entire test, but we’d also welcome $5, $10, $25 or whatever you’d like to contribute.  Every little bit helps.
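For anyone who wants to see how the numbers work, here is a rough sketch in Python of the shortfall for one project.  The $495 sale price, the two tests per project and the $50 coupons come from the figures above; the function name and the assumption that at most one coupon can be applied per test are mine, purely for illustration.

```python
SALE_PRICE = 495   # Big Y sale price in dollars, from the post
COUPON_VALUE = 50  # value of one donated WTY coupon, from the post


def remaining_need(tests, coupons=0, cash_donations=0):
    """Dollars still needed to fund `tests` Big Y tests.

    Assumes (for illustration only) that at most one coupon
    can be applied to each test.
    """
    usable_coupons = min(coupons, tests)
    total = tests * SALE_PRICE - usable_coupons * COUPON_VALUE
    return max(total - cash_donations, 0)


print(remaining_need(2))                                 # 990
print(remaining_need(2, coupons=1, cash_donations=100))  # 840
```

So two tests with no help come to $990 per project, and every coupon or small contribution chips away at that.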

To donate to the American Indian project and to help fund this critical research, click this link:

https://www.familytreedna.com/group-general-fund-contribution.aspx?g=AIP

To donate to the haplogroup C-P39 project for this research, please click this link:

http://www.familytreedna.com/group-general-fund-contribution.aspx?g=Y-DNAC-P39

Thank you everyone, in advance, for your help.  We can’t do this without you.  This is what collaborative citizen science is all about.  Of course, we’ll report findings as we receive them and can process the information.

______________________________________________________________

Disclosure

I receive a small contribution when you click on some of the links to vendors in my articles. This does NOT increase the price you pay but helps me to keep the lights on and this informational blog free for everyone. Please click on the links in the articles or to the vendors below if you are purchasing products or DNA testing.

Thank you so much.

DNA Purchases and Free Transfers

Genealogy Services

Genealogy Research

Be Still my H(e)art…

You’re not going to believe this.  I’m not sure I believe it.

Remember, I closed my article on the Younger family yesterday by saying that I was hopeful that I might solve the mystery of who Marcus Younger’s wife, Susanna, was?  Well, I said that, but I had no real expectation that it would really happen, not after one already huge breakthrough.  I began working through cousin Larry’s matches, sending e-mails, and within six hours or so, I had several replies, one of which was this:

“Hello my name is Andrea. Thank you for sending me this email. I am new to genealogy and have a large interest in my family history. Younger is not a known surname for me, although Hart is. My oldest known Hart ancestor is Anthony Hart born in Oct 1755 in King and Queen, Virginia. He was my 5th great grandfather. He lived in Halifax Virginia in 1840 with his children and grandchildren. How is the surname Hart related to Younger?”

Oh Andrea, let me tell you.  You have made my day, my decade, my 30 years, and yes, indeed, this is the second jackpot hit in two days in the same family line.  I shoulda bought a lottery ticket but I think I’d rather have this:)

It has always been speculated that Marcus Younger’s wife, Susanna, was a Hart.  In fact, it was speculated that she was the possible sister of that one and the same Anthony Hart in Halifax County, Virginia, based on this tax record from King and Queen County, Va. just before Marcus Younger moved to Halifax County.  Robert Hart is believed to be Anthony’s father, but that is unproven.

1785

Alterations of land in King and Queen County

Proprietor’s Name                     QT Land                     of whom had

Anthony Hart                               190a                         Robert Hart

Anthony Hart                                94a                          Marcus Younger

There are a couple of other records in which they appear together too.

Unfortunately, King and Queen County is a burned county.

Now, we have a couple of pretzel twists that need to be considered.  In Larry’s line, Marcus’s son John married Lucy Hart who is mentioned in Anthony Hart’s Revolutionary War pension application in 1832.  So Larry could be expected to match Andrea regardless of who Marcus’s wife was.

However, I don’t descend from the same line as Larry and Andrea matches me as well.  I descend from Marcus through his daughter, Mary, sister to John who married Lucy Hart.  So, I should NOT match Andrea unless I too carry some Hart DNA.  But I do, in two distinct places where I also match Larry.  On the chromosome browser below, Andrea is orange, I am blue and we are being compared to Larry.  You can see that we all 3 match on the same segments on chromosomes 1 and 8.

younger hart 1

Additionally, Andrea matches other cousins descended from my Younger line.

Furthermore, Andrea and David (from the previous article whose pedigree proved that Marcus and Thomas Younger are related) both match Larry, but they don’t match each other.  This makes perfect sense.  David descends from Thomas Younger, who has no known Hart connection.  So David matches Larry because of the Younger line and Andrea matches Larry because of the Hart line.

You can see in the chromosome browser view below that indeed, both Andrea, orange, and David, blue match Larry, but in no location do they match each other in addition to matching Larry.  No place does their DNA show one under the other, overlapping, when compared to Larry.

younger hart 2
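For readers who like to see the logic spelled out, the overlap check I’m doing by eye in the chromosome browser can be sketched in a few lines of Python.  The segment positions below are invented purely for illustration (they are not anyone’s real results), and the function name and tuple layout are my own assumptions, not anything Family Tree DNA provides.

```python
def shared_segments(segs_a, segs_b):
    """Return regions where two people's matching segments overlap.

    Each segment is (chromosome, start, end), with all positions
    measured against the same reference person (here, Larry).
    """
    overlaps = []
    for chrom_a, start_a, end_a in segs_a:
        for chrom_b, start_b, end_b in segs_b:
            if chrom_a != chrom_b:
                continue
            start = max(start_a, start_b)
            end = min(end_a, end_b)
            if start < end:  # the two segments overlap
                overlaps.append((chrom_a, start, end))
    return sorted(overlaps)


# Hypothetical segments, each compared to Larry
andrea = [(1, 10_000_000, 25_000_000), (8, 40_000_000, 52_000_000)]
me = [(1, 12_000_000, 30_000_000), (8, 38_000_000, 50_000_000)]
david = [(3, 5_000_000, 15_000_000), (11, 60_000_000, 70_000_000)]

print(shared_segments(andrea, me))     # overlaps on chromosomes 1 and 8
print(shared_segments(andrea, david))  # [] means no overlap anywhere
```

An empty list is what the Andrea and David comparison looks like.  Keep in mind that an overlap against Larry is only a strong hint, since two people can match the same stretch of Larry’s DNA through different sides of his family.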

Turning now to the spreadsheet where I can see all of the people who match both Larry and David together, I want to know who else Andrea matches.

First, I checked whether Andrea matches anyone else from the Alexander Younger line through sons Thomas and James, and she does not.  If she had, that would have put a very big fly in the ointment and prevented any conclusion about Marcus’s wife.  But since she doesn’t, that obstacle is removed.

Andrea does match the following people on several segments:

  • Me
  • Loujean, our newly found adoptee cousin whose closest autosomal match is Larry
  • Larry
  • Buster, my cousin, who also descends through Marcus’s daughter, Mary

All four of us descend from the Marcus line, and she doesn’t match anyone who descends from the Thomas or Alexander lines.  That makes perfect sense, since Anthony Hart looks to be the probable brother of Marcus Younger’s wife, Susanna, based on the historical records, and some relationship is now confirmed by the DNA.
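The reasoning here is really just set logic, and it can be sketched in Python.  The four Marcus-line names come from the list above and David is the Thomas-line cousin from the previous article; the “James-line cousin” entry is a placeholder I made up so the example has someone from that line, and the function name is my own.

```python
# Which line each tested cousin descends from. The Marcus-line names are
# from the list above; "James-line cousin" is a placeholder, not a real kit.
line_of = {
    "Me": "Marcus",
    "Loujean": "Marcus",
    "Larry": "Marcus",
    "Buster": "Marcus",
    "David": "Thomas",
    "James-line cousin": "James",
}

andrea_matches = {"Me", "Loujean", "Larry", "Buster"}


def lines_matched(matches, line_of):
    """Return the set of family lines represented among a person's matches."""
    return {line_of[name] for name in matches if name in line_of}


matched = lines_matched(andrea_matches, line_of)
print(matched)                # {'Marcus'}
print(matched <= {"Marcus"})  # True: no Thomas or James matches
```

If a Thomas or James descendant ever shows up in Andrea’s match list, the second check flips to False and the Hart explanation would need another look.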

Am I ready to call this a positive match yet and Susanna a Hart?  Technically, I probably could, but I’m rather conservative and I’m just not quite ready to give an unconditional thumbs up.  To make myself feel entirely warm and fuzzy, I’d love to see another Hart match for me or my cousins not descended through John’s line. I’d also love to be able to reconstruct the Hart family back in King and Queen and Essex Counties and have some additional paper documentation to go along with the results.  That would certainly be easier to accomplish were the King and Queen records not burned.  This family lived on the border between the two and had records in both counties.

Truly, I’m left speechless about my good fortune this weekend.  I’m happy dancing a hole in the floor.

happy dance 2

But I’m also left wondering how many other answers are really there, in the DNA of the people we match and I just haven’t worked with the matches effectively.  Maybe those walls are just waiting to fall….waiting for me to notice them.  Maybe yours are too.

Update: Please note that as of August 2019, this connection is still not proven. Still hoping!

______________________________________________________________


Gene by Gene Genomics Research Center Lab Tour

 ftdna inside sign cropped

Both before and after the 9th Annual Family Tree DNA International Conference for Genetic Genealogy this past weekend, Max Blankfeld and Bennett Greenspan were gracious enough to allow interested administrators to visit and tour their labs.  I’ve toured other DNA labs, but their lab has very cool leading edge equipment.  It was a wonderful treat to see it in action.

What I didn’t have was my “good” camera, so I’m sharing my iPhone photos.

I went on the last tour available and there were only a few of us, so it was an excellent opportunity to see things up close and personal.

ftdna genomics research center

This lab is much larger than I expected.  Gene by Gene, in addition to doing all of the DNA processing for Family Tree DNA, DNA Traits and the National Geographic Genographic project, is doing a significant amount of processing for research institutions such as medical schools. While we were there, they were getting ready to prep to run a large order of several hundred exome samples.

But come along with me and you can see for yourself.  Bennett gave the tour personally.  The bad news is that you’re going to have to rely on my memory, because nothing was allowed in the lab other than our cameras.  This was to prevent contamination.

ftdna lisa footies

There are other contamination prevention methods as well.  Anyone with open toed shoes had to put on booties.  Here’s my friend Lisa, who comments periodically on my blog, suiting up for the tour.  Next, we were given lab coats to wear inside the facility which we then took off and left by the door, but inside the lab, as we left.

ftdna lisa lab coat

The first stop inside is where they prepare the kits for shipping to customers when an order is placed.  They purchase the empty vials, prepare the formula and fill and cap the vials, all automatically.

ftdna vials for kit

The “capping” process is the most interesting part and caused them the most consternation in trying to figure out the best way to do this.  Bennett said they worried that a non-tethered lid might be dropped by the customer and contaminated, a worry that turned out to be needless.

After the kits come back, all but one of the vials go into storage, shown below, beside the lab, for future testing.  This environment does not have to be specially controlled outside of a normal office environment.

ftdna sample storage

The vial that gets opened for the testing undergoes a different process that begins with removing the DNA from the vial and mixing it with a chemical solution that shakes the DNA out of the cells.

ftdna lab

This is done overnight in a shaker machine.  Reminded me of a paint shaker.

ftdna shaker

Have you ever seen a custom $600,000 freezer with a robot to retrieve the frozen goods?  No?  Well, you’re about to.  If you have ever tested with Family Tree DNA and there is any DNA left in a vial that has been opened, it’s in this freezer which took the vendor 7 weeks to assemble on site.  Capacity is over 550,000 vials and it’s about half full currently.

After the DNA is shaken out of the cells, that mixture has to be handled differently.  It has been barcoded during the entire process and the prepared DNA mixture is then put into storage plates which are robotically stored.  This retrieval process is initiated when an order is received by the robotic software.  Keep in mind that the unit holds more samples than Family Tree DNA has today, in a very regulated deep freeze environment.  Depending on what this robotic arm is doing, meaning moving plates around or extracting a specific vial, it changes its own tool on the end of its arm.  It knows where every vial is in the freezer.  I must admit, my Mom who has been gone since 2006 has DNA there and it made me feel kind of funny to know I was visiting “her.” But my DNA is with hers, along with a whole lot of other family members, so I guess it’s just one big family reunion in there.

After the correct vial is retrieved and the DNA mixture is extracted, the liquid is put onto a “chip” for the autosomal testing.  The chip itself is about an inch by maybe 3 inches and holds 12 tests.

ftdna chip 12

The DNA is pipetted into the side and then it is wicked into the chip itself.

ftdna loading dna on chip

Here is a set of two chips loaded and ready to be processed.  This means that a total of 24 individual samples are being sequenced.   Notice the little grey square to the side of each larger grey square.  That tiny grey square is where the DNA mixture is placed and it’s wicked into the larger grey square for processing.  We asked how that is done and were told that the technique is part of Illumina’s trade secrets.

ftdna chip loaded

Gene by Gene owns several sequencing machines.  I know they have at least two Sanger sequencing machines and 4 different sizes and types of Illumina sequencing machines that run chip based tests like the Geno 2, the Family Finder and now the Big Y tests, in addition to the exome and full genome tests.  These machines are incredible given that they can run hundreds of tests at a time, which is also how they have dropped the test costs exponentially in the past few years.  Some equipment is optimized for running many samples but more slowly and some for running fewer samples but more quickly.

ftdna sequencer

After reading and being automatically scored, the DNA results are reported to the client.

At the end of the lab tour, just outside, is the Customer Service area where the Customer Service Reps work.  I’ll tell you what, they had their hands full this week and weekend with their regular call load, a conference and an office full of nosey and interested project administrators.

ftdna csr area

Of course, during the course of the day, I had to visit the restroom.  I’ve always loved Max and Bennett’s sense of humor.

ftdna men cropped

In case you don’t know, the Y chromosome is much smaller than the X, hence, the difference in the signs.

 ftdna women

Let’s just say that in light of their new product announcement, the “Big Y,” I did a bit of a structural modification for them:)

Thanks again to Max and Bennett for their hospitality.

Jennifer Zinck also wrote about the Friday lab tour on her blog, Ancestor Central.

______________________________________________________________


2013 Family Tree DNA Conference Day 2

ISOGG Meeting

The International Society of Genetic Genealogy always meets at 8 AM on Sunday morning.  I personally think that 8 AM meetings should be illegal, but then I generally work till 2 or 3 AM (it’s 1:51 AM now), so 8 is the middle of my night.

Katherine Borges, the Director, spoke about current and future activities, and Alice Fairhurst spoke about the many updates to the Y tree that have happened and those coming as well.  It has been a huge challenge for her group to keep things even remotely current and they deserve a huge round of virtual applause from all of us for the Y tree and their efforts.

Bennett opened the second day after the ISOGG meeting.

“The fact that you are here is a testament to citizen science” and that we are pushing or sometimes pulling academia along to where we are.

Bennett told the story of the beginning of Family Tree DNA.  “Fourteen years ago when the hair that I have wasn’t grey,” he began, “I was unemployed and tried to reorganize my wife’s kitchen and she sent me away to do genealogy.”  Smart woman, and thankfully for us, he went.  But he had a roadblock.  He felt there was a possibility that he could use the Y chromosome to solve the roadblock.  Bennett called Michael Hammer, the author of one of the two papers published at that time, at home on a Sunday morning, but Michael was running out the door to the airport.  He declined Bennett’s request, told him that’s not what universities do, and that he didn’t know of anyplace a Y test could be done commercially.  Bennett, having run out of persuasive arguments, started mumbling about “us little people providing money for universities.”  Michael said to him, “Someone should start a company to do that because I get phone calls from crazy genealogists like you all the time.”  Let’s just say Bennett was no longer unemployed and the rest, as they say, is history.  With that, Bennett introduced one of our favorite speakers, Dr. Michael Hammer from the Hammer Lab at the University of Arizona.

Bennett day 2 intro

Session 1 – Michael Hammer – Origins of R-M269 Diversity in Europe

Michael has been at all of the conferences.  He says he doesn’t think we’re crazy.  I personally think we’ve confirmed it for him, several times over, so he KNOWS we’re crazy.  But it obviously has rubbed off on him, because today, he had a real shocker for us.

I want to preface this by saying that I was frantically taking notes and photos, and I may have missed something.  He will have his slides posted and they will be available through a link on the GAP page at FTDNA by the end of the week, according to Elliott.

Michael started by saying that this is a really exciting opportunity to begin breaking family groups up with SNPs, which are coming faster than we can type them.

Michael rolled out the Y tree for R and the new tree looks like a vellum scroll.

Hammer scroll

Today, he is going to focus on the basic branches of the Y tree because the history of R is held there.

The first anatomically modern humans migrated from Africa about 45,000 years ago.

After the last glacial maximum 17,000 years ago, there was a significant expansion into Europe.

Neolithic farmers arrived from the near east beginning 10,000 years ago.

Farmers had an advantage over hunter gatherers in terms of population density.  People moved into Northwestern Europe about 5,000 years ago.

What did the various expansions contribute to the population today?

Previous studies indicate that haplogroup R has a Paleolithic origin, but 2 recent studies agree that this haplogroup has a more recent origin in Europe, the Neolithic, but disagree about the timing of the expansion.

The first study, Jobling’s study in 2010, argued that geographic diversity is explained by a single Near East source via Anatolia.

It concluded that the Y chromosomes of Mesolithic hunter-gatherers were nearly replaced by those of incoming farmers.

The most recent study, by Busby in 2012, is the largest and concludes that there is no diversity in the mapping of R SNP markers, so they could not date lineage and expansion.  They did find that the most basic structure of the R tree did come from the Near East.  They looked at P311 as a marker for expansion into Europe, wherever it was.  Here is a summary page of Neolithic Europe that includes these studies.

Hammer says that he thought that if P311 is so frequent and widespread in Europe, it must have been there a long time.  However, it appears that he, and most everyone else, was wrong.

The hypothesis to be tested is that if P311 originated prior to the Neolithic wave, it would predict higher diversity in the Near East, closer to the origins of agriculture.  If P311 originated after the expansion, we would be able to see it migrate across Europe, and it would have had to replace an existing population.

Because we have now sequenced the DNA of about 40 ancient specimens, Michael turned to the ancient DNA literature.  There were 4 primary locations with skeletal remains.  There were caves in France, Spain, Germany and then there’s Otzi, found in the Alps.

hammer ancient y

All of these remains are between 6000-7000 years old, so prior to the agricultural expansion into Europe.

In France, the study of 22 remains produced 20 that were G2a and 2 that were I2a.

In Spain, 5 G2a and 1 E1b.

In Germany, 1 G2a and 2 F*.

Otzi is haplogroup G2a2b.

There was absolutely 0, no, haplogroup R of any flavor.

In modern samples, of 172 samples, 94 are R1b.
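To put the ancient and modern numbers side by side, here is a small Python tally of the counts I jotted down above.  It only uses the counts listed in my notes (which add up to fewer than the “about 40” specimens mentioned), it reads the Germany G2a count as 1, and it lumps Otzi’s G2a2b in with G2a; those groupings are my own choices for the sketch, not Dr. Hammer’s.

```python
from collections import Counter

# Ancient Y results as listed in my notes above (site -> haplogroup counts).
# The Germany G2a count is read as 1; Otzi's G2a2b is counted under G2a.
ancient = {
    "France": {"G2a": 20, "I2a": 2},
    "Spain": {"G2a": 5, "E1b": 1},
    "Germany": {"G2a": 1, "F*": 2},
    "Otzi": {"G2a": 1},
}

totals = Counter()
for counts in ancient.values():
    totals.update(counts)

n_ancient = sum(totals.values())
for hap, n in totals.most_common():
    print(f"{hap}: {n} of {n_ancient} ({n / n_ancient:.0%})")
print(f"R (any subclade): {totals['R']} of {n_ancient}")

# Modern comparison from the same slide: 94 of 172 samples are R1b
print(f"Modern R1b: {94 / 172:.0%}")
```

On these counts, G2a is 27 of the 32 listed ancient samples, roughly 84 percent, with no R at all, while R1b is about 55 percent of the modern set.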

To evaluate this, he is dropping back to the backbone of haplogroup R.

hammer backbone

This evidence supports a recent spread of haplogroup R lineages in western Europe about 5K years ago.  This also supports evidence that P311 moved into Europe after the Neolithic agricultural transition and nearly displaced the previously existing western European Neolithic Y, which appears to be G2a.

This same pattern does not extrapolate to mitochondrial DNA where there is continuity.

What conferred advantage to these post Neolithic men?  What was that advantage?

Dr. Hammer then grouped the major subgroups of haplogroup R-P311 and found the following clusters.

  • U106 is clustered in Germany
  • L21 clustered in the British Isles
  • U152 has an Alps epicenter

hammer post neolithic epicenters

This suggests multiple centers of re-expansion for subgroups of haplogroup R, a stepwise process leading to different pockets of subhaplogroup density.

Archaeological studies produce patterns similar to the haplogroup epicenters.

What kind of model is going on for this expansion?

Ancestral origin of haplogroup R is in the near east, with U106, P312 and L21 which are then found in 3 European locations.

This research also suggests that G2a is the Neolithic version of R1b – it was the most commonly found haplogroup before the R invasion.

To make things even more interesting, the base tree that includes R has also been shifted, dramatically.

Haplogroup K has been significantly revised and is the parent of haplogroups P, R and Q.

It has been broken into 4 major branches from several individual lineages – widely shifted clades.

hammer hap k

Haps R and Q are the only groups that are not restricted to Oceania and Southeast Asia.

Rapid splitting of lineages in Southeast Asia to P, R and Q, the last two of which then appear in western Europe.

hammer r and q in europe

R then, populated Europe in the last 4000 years.

How did these Asians get to Europe and why?

Asian R1b overtook Neolithic G2a about 4000 years ago in Europe which means that R1b, after migrating from Africa, went to Asia as haplogroup K and then divided into P, Q and R before R and Q returned westward and entered Europe.  If you are shaking your head right about now and saying “huh?”…so were we.

Hammer hap r dist

Here is Dr. Hammer’s revised map of haplogroup dispersion.

hammer haplogroup dispersion map

Moving away from the base tree and looking at more recent SNPs, Dr. Hammer started talking about some of the findings from the advanced SNP testing done through the Nat Geo project and some of what it looks like and what it is telling us.

For example, the R1bs of the British Isles.

There are many clades under L21.  For example, there is something going on in Scotland with one particular SNP (CTS11722?), as it comprises one third of the population in Scotland but is very rare in Ireland, England and Wales.

New Geno 2.0 SNP data is being utilized to learn more about these downstream SNPs and what they had to say about the populations in certain geographies.

For example, there are 32 new SNPs under M222 which will help at a genealogical level.

These SNPs must have arisen in the past couple thousand years.

Michael wants to work with people who have significant numbers of individuals who can’t be broken out with STRs any further and would like to test the group to break down further with SNPs.  The Big Y is one option but so is Nat Geo and traditional SNP testing, depending on the circumstance.

G2a is currently 4-5% of the population in Europe today and R is more than 40%.

Therefore, P312 split in western Eurasia and very rapidly came to dominate Europe.

Session 2 – Dr. Marja Pirttivaara – Bridging Social Media and DNA

Dr. Pirttivaara has her PhD in Physics and is passionate about genetic genealogy, history and maps.  She is an administrator for DNA projects related to Finland and haplogroup N1c1, found in Finland, of course.

marja

Finland has the population of Minnesota and is the size of New Mexico.

There are 3750 Finland project members and of them 614 are haplogroup N1c1.

Combining the N1c1 and the Uralic map, we find a correlation between the distribution of the two.

Turku, the old capital, was full of foreigners in medieval times, which is reflected today in the far-reaching DNA matches to Finnish people.

Some of the interest in Finland’s DNA comes from migration which occurred to the United States.

Facebook and other social media have changed the rules of communication and allow people from wide geographies to collaborate.  The administrator’s role has also changed on social media as opposed to just a FTDNA project admin.  Now, the administrator becomes a negotiator and a moderator as well as the DNA “expert.”

Marja has done an excellent job of motivating her project members.  They are very active within the project but also on Facebook, comparing notes, posting historical information and more.

Session 3 – Jason Wang – Engineering Roadmap and IT Update

Jason is the Chief Technology Officer at Family Tree DNA.  He recently joined with the Arpeggi merger and has an MS in Computer Engineering.

Regarding the Gene by Gene/FTDNA partnership, “The sum of the parts is greater than the whole.”  He notes that they have added people since last year in addition to the Arpeggi acquisition.

Jason introduced Elliott Greenspan, who, to most of us, needed no introduction at all.

Elliott began manually scoring mitochondrial DNA tests at age 15.  He joined FTDNA in 2006 officially.

Year in review and What’s Coming

4 times the data processed in the past year.

Uploads run 10 times faster.  With 23andMe and Ancestry autosomal uploads, processing will start in about 5 minutes, and matches will start then.

FTDNA reinvented Family Finder with the goal of making the user experience easier and more modern.   They added photos, profiles and the new comparison bars along with an advanced section and added push to chromosome browser.

Focus on users uploading the family tree.  Tools don’t matter if the data isn’t there.  In order to utilize the genealogy aspect, the genealogy info needs to be there.   Will be enhancing the GEDCOM viewer.  New GEDCOMs replace old GEDCOMs so as you update yours, upload it again.

They are now adding a SNP request form so that you can request a SNP not currently available.  This is not to be confused with ordering an existing SNP.

They currently utilize build 14 for mitochondrial DNA.  They are skipping build 15 entirely and moving forward with 16.

They added steps to the full sequence matches so that you can see your step-wise mutations and decide whether and if you are related in a genealogical timeframe.
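I don’t know exactly how FTDNA counts steps internally, so treat this as a sketch only.  One simple way to think about a “step” is a mutation that one person has and the other doesn’t.  The mutation lists and kit names below are invented for illustration, and the function is my own, not part of any FTDNA tool.

```python
def mutation_steps(person_a, person_b):
    """Count mutations present in one person's list but not the other's.

    This is a simplified illustration, not FTDNA's actual matching rule.
    """
    return len(set(person_a) ^ set(person_b))


# Hypothetical full sequence differences from the reference sequence
kit_1 = {"A263G", "T16189C", "C16223T"}
kit_2 = {"A263G", "T16189C", "C16223T", "G8251A"}
kit_3 = {"A263G", "T16519C"}

print(mutation_steps(kit_1, kit_2))  # 1
print(mutation_steps(kit_1, kit_3))  # 3
```

The fewer the steps, the more likely the match falls within a genealogical timeframe, though mitochondrial DNA mutates slowly enough that even an exact match can be quite old.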

New Y tree will be released shortly as a result of the Geno 2.0 testing.  Some of the SNPs have mutated as much as 7 times, which raises the question of what that means in terms of the tree and in terms of genealogical usefulness.  This tree has taken much longer to produce than they expected due to these types of issues, which had to be revised individually.

New 2014 tree has 6200 SNPS and 1000 branches.

  • Commitment to take genetic genealogy to the next level
  • Y draft tree
  • Constant updates to official tree
  • Commitment to accurate science

If a single sample comes back as positive for a SNP, they will put it on the tree and will constantly update this.

If 3 or 4 unrelated people have the same SNP, it will go directly to the tree.  This is the reason for the new SNP request form.

Part of the reason that the tree has taken so long is that not every SNP is public and it has been a huge problem.

When they find a new SNP, where does it go on the tree?  When one SNP is found or a SNP fails, they have run over 6000 individual SNPs on Nat Geo samples to verify the accuracy of the placement.  For example, if a new SNP is found in a particular location, or one is found not to be equivalent that was believed to be so previously, they will then test other samples to see where the SNP actually belongs.

X Matching

Matching differential is huge in early testing.  One child may inherit as little as 20% of the X and another 90%.  Some first cousins carry none.

X matching will be an advanced feature and will have its own chromosome browser.

End of the year – January 1.  Happy New Year!!!

Population Finder

It’s definitely in need of an upgrade and they have assigned one person full time to this product.

There are a few contention points that can be explained through standard history.

It’s going to get a new look as well and will be easily upgradeable in the future.

They cannot utilize the National Geographic data because it’s private to Nat Geo.

Bennett – “Committed to an engineering team of any size it takes to get it done.  New things will be rolling out in first and second quarter of next year.”  Then Bennett kind of sighed and said “I can’t believe I just said that.”

Session 4 – Dr. Connie Bormans – Laboratory Update

The Gene by Gene lab, which of course processes all of the FTDNA samples is now a regulated lab which allows them to offer certain regulated medical tests.

  • CLIA
  • CAP
  • AABB
  • NYSDOH

Between these various accreditations, they are inspected and accredited once yearly.

Working to decrease turn-around time.

SNP request pipeline is an online form and is in place to request a new SNP be added to their testing menu.

Raised the bar for all of their tests even though genetic genealogy isn’t medical testing because it’s good for customers and increases quality and throughput.

New customer support software and new procedures to triage customer requests.

Implement new scoring software that can score twice as many tests in half the time.  This decreases turn-around time to the customer as well.

New projects include improved method of mtDNA analysis, new lab techniques and equipment and there are also new products in development.

Ancient DNA (meaning DNA from deceased people) is being considered as an offering if there is enough demand.

Session 5 – Maurice Gleeson – Back to Our Past, Ireland

Maurice Gleeson coordinated a world class genealogy event in Dublin, Ireland Oct. 18-20, 2013.  Family Tree DNA and ISOGG volunteers attended to educate attendees about genetic genealogy and DNA. It was a great success and the DNA kits from the conference were checked in last week and are in process now.  Hopefully this will help people with Irish ancestry.

12% of Americans have Irish ancestry, but a show of hands here was nearly 100% – so maybe Irish descendants carry the crazy genealogist gene!

They developed a website titled Genetic Genealogy Ireland 2013.  Their target audience was twofold, genetic genealogy in general and also the Irish people.  They posted things periodically to keep people interested.  They also created a Facebook page.  They announced free (sponsored) DNA tests and the traffic increased a great deal.  Today ISOGG has a free DNA wiki page too.  They also had a prize draw sponsored by the Ireland DNA and mtdna projects. Maurice said that the sessions and the booth proximity were quite symbiotic because when you came out of the DNA session, the booth was right there.

2000-5000 people passed by the booth

500 people in the booth

Sold 99 kits – 119 tests

45 took Y 37 marker tests

56 FF, 20 male, 36 female

18 mito tests

They passed out a lot of educational material the first two days.  It appeared that the attendees were thinking about things and they came back the last day which is when half of the kits were sold, literally up until they threatened to turn the lights out on them.

They have uploaded all of the lectures to a YouTube channel and they have had over 2000 views.  Of all of the presentations, which looked to be a list of maybe 10-15, the autosomal DNA lecture has received 25% of the total hits for all of the videos.

This is a wonderful resource, so be sure to watch these videos and publicize them in your projects.

Session 6 – Brad Larkin – Introducing Surname DNA Journal

Brad Larkin is the person in the FTDNA video that shows how to appropriately scrape for a DNA test.  That’s his minute or two of fame!  I knew he looked familiar.

Brad began a peer reviewed genetic genealogy journal in order to help people get their project stories published.  It’s free, open access, web based and the author retains the copyright.  www.surnamedna.com

Conceived in 2012, the first article was published in January 2013.  Three papers published to date.

Encourage administrators to write and publish their research.  This helps the publication withstand the test of time.

Most other journals are not free, except for JOGG which is now inactive.  Author fees typically are $1320 (PLOS) to $5000 (Nature) and some also have subscription or reader fees.

Peer review is important.  It is a critical review, a keen eye and an encouraging tone.  This ensures that the information is evidence based, correct and replicable.

Session 7 – mtdna Roundtable – Roberta Estes and Marie Rundquist

This roundtable was a much smaller group than yesterday’s Y DNA and SNP session, but much more productive for the attendees since we could give individual attention to each person.  We discussed how to effectively use mtdna results and what they really mean.  And you just never know what you’re going to discover.  Marie was using one of her ancestors whose mtDNA was not the haplogroup expected and when she mentioned the name, I realized that Marie and I share yet another ancestral line.  WooHoo!!

Q&A

FTDNA kits can now be tested for the Nat Geo test without having to submit a new sample.

After the new Y tree is defined, FTDNA will offer another version of the Deep Clade test.

Illumina chip, most of the time, does not cover STRs because it measures DNA in very small fragments.  As they work with the Big Y chip, if the STRs are there, then they will be reported.

80% of FTDNA orders are from the US.

Microalleles from the Houston lab are being added to results as produced, but they do not have the data from the older tests at the University of Arizona.

Holiday sale starts now, runs through December 31 and includes a restaurant.com $100 gift card for anyone who purchases any test or combination of tests that includes Family Finder.

That’s it folks.  We took a few more photos with our friends and left looking forward to next year’s conference.  Below, left to right in rear, Marja Pirttivaara, Marie Rundquist and David Pike.  Front row, left to right, me and Bennett Greenspan.

Goodbyes

See y’all next year!!!

______________________________________________________________

Disclosure

I receive a small contribution when you click on some of the links to vendors in my articles. This does NOT increase the price you pay but helps me to keep the lights on and this informational blog free for everyone. Please click on the links in the articles or to the vendors below if you are purchasing products or DNA testing.

Thank you so much.

DNA Purchases and Free Transfers

Genealogy Services

Genealogy Research

10 Year Pioneers Recognized by Family Tree DNA

ftdna 10 year

Family Tree DNA awarded plaques to their project administrators who have surpassed the 10 year mark.  Bennett mentioned that this group is a testament to citizen science.  I’m very pleased to be included, of course.  We’ve all been in this foxhole together for a decade now.  Thank you to Family Tree DNA for recognizing these folks.  The group is shown here and the list of individuals is:

  • Leo Baca
  • Mic Barnette
  • Janet Baker Burks
  • Roberta Estes
  • Robert Noles
  • Dyann Hersey Noles
  • Nora Probasco
  • Whitney Keen
  • Jim Barnett
  • Michael DeWitt McCown
  • James Rader
  • Steven Perkins
  • Ken Graves
  • Linda Magellan
  • Allan Grant
  • Katherine Hope Borges
  • Phillip Crow
  • George Valko
  • Therese Bucker
  • Nancy Custer
  • Peter Roberts
  • Louise Rorer Rosett
  • Jerry Cole

Of course, Max and Bennett are with us, Max on the far left and Bennett on the far right.  I think that Bennett is officially the first project administrator!

Here’s to another wonderful decade!!!


Family Tree DNA Announces “The Big Y”

Day 2 of the conference began early this morning and is just now ending and it’s after midnight.  I do have a lot to tell you, but most of it is going to have to wait a bit.

Today’s big news is that David Mittelman with Family Tree DNA late this afternoon announced the Big Y DNA test which would be known as a “full Y sequence” test. The test will provide results on 10,000,000 base-pairs and approximately 25,000 SNPs on the Y chromosome.

The regular price is $695, but it is being initially offered to current clients only for $495 through the end of November.  A current vial can be used if one exists, otherwise a new one will be sent.

Delivery will be in 10-12 weeks and it will be accompanied by comparison tools.

Bennett says, “If the WTY (Walk the Y) was the moon shot, then this is the mission to Mars.”

Debbie Kennett compiled information from several folks who were tweeting and posting today and you can read more information at the link below.

http://cruwys.blogspot.com/2013/11/the-new-big-y-test-from-family-tree-dna.html


Determining Ethnicity Percentages

Recently, as a comment to one of my blog postings, someone asked how the testing companies can reach so far back in time and tell you about your ancestors.  Great question.

The tests that reliably reach the furthest back, of course, are the direct line Y-Line and mitochondrial DNA tests, but the commenter was really asking about the ethnicity predictions.  Those tests are known as BGA, or biogeographical ancestry tests, but most people just think of them or refer to them as the ethnicity tests.

Currently, Family Tree DNA, 23andMe and Ancestry.com all provide this function as a part of their autosomal product along with the Genographic 2.0 test.  In addition, third party tools available at www.gedmatch.com don’t provide testing, but allow you to expand what you can learn with their admixture tools if you upload your raw data files to their site.  I wrote about how to use these ethnicity tools in “The Autosomal Me” series.  I’ve also written about how accurate ethnicity predictions from testing companies are, or aren’t, here, here and here.

But today, I’d like to just briefly review the 3 steps in ethnicity prediction, and how those steps are accomplished.  It’s simple, really, in concept, but like everything else, the devil is in the details.

There are three fundamental steps.

  • Creation of the underlying population data base.
  • Individual DNA extraction.
  • Comparison to the underlying population data base.

Step 1:  Creation of the underlying population data base.

Don’t we wish this was as simple as it sounds.  It isn’t.  In fact, this step is the underpinnings of the accuracy of the ethnicity predictions.  The old GIGO (garbage in, garbage out) concept applies here.

How do researchers today obtain samples of what ancestral populations looked like, genetically?  Of course, the evident answer is through burials, but burials are not only few and far between, the DNA often does not amplify, or isn’t obtainable at all, and when it is, we really don’t have any way to know if we have a representative sample of the indigenous population (at that point in time) or a group of travelers passing through.  So, by and large, with few exceptions, ancient DNA isn’t a readily available option.

The second way to obtain this type of information is to sample current populations, preferably ones in isolated regions, not prone to in-movement, like small villages in mountain valleys, for example, that have been stable “forever.”  This is the approach the National Geographic Society takes and a good part of what the Genographic Geno 2.0 project funding does.  Indigenous populations are in most cases our most reliable link to the past.  These resources, combined with what we know about population movement and history are very telling.  In fact, National Geographic included over 75,000 AIMs (Ancestrally Informative Markers) on the Geno 2.0 chip when it was released.

The third way to obtain this type of information is by inference.  Both Ancestry.com and 23andMe do some of this.  Ancestry released its V2 ethnicity updates this week, and as a part of that update, they included a white paper available to DNA participants.  In that paper, Ancestry discusses their process for utilizing contributed pedigree charts and states that, aside from immigrant locations, such as the United States and Canada, a common location for 4 grandparents is sufficient information to include that individual’s DNA as “native” to that location.  Ancestry used 3000 samples in their new ethnicity predictions to cover 26 geographic locations.  That’s only 115 samples, on average, per location to represent all of that population.  That’s pretty slim pickins.  Their most highly represented area is Eastern Europe with 432 samples and the least represented is Mali with 16.  The regions they cover are shown below.

ancestry v2 8

Survey Monkey, a widely utilized web survey company, in their FAQ about Survey Size For Accuracy provides guidelines for obtaining a representative sample.  Take a look.  No matter which calculations you use relative to acceptable Margin of Error and Confidence Level, Ancestry’s sample size is extremely light.
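To put rough numbers on "extremely light," here is a minimal sketch of the textbook margin-of-error formula applied to the sample counts quoted above.  It assumes each reference panel behaves like a simple random sample used to estimate a single proportion (an allele frequency, say) at 95% confidence.  That is my simplification for illustration, not something taken from Ancestry's white paper or from the Survey Monkey FAQ, and the function name is made up.

```python
import math

def margin_of_error(sample_size, proportion=0.5, z=1.96):
    """Normal-approximation margin of error for an estimated proportion.

    proportion=0.5 is the worst case; z=1.96 corresponds to 95% confidence.
    """
    return z * math.sqrt(proportion * (1 - proportion) / sample_size)

# Sample counts quoted in the text: smallest panel, average panel, largest panel.
panels = [("Mali", 16), ("average location", 3000 // 26), ("Eastern Europe", 432)]

for label, n in panels:
    print(f"{label}: n={n}, margin of error about {margin_of_error(n):.1%}")
```

The point is only the trend: the smaller the panel, the wider the uncertainty around any frequency estimated from it.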

23andMe states in their FAQ that their ethnicity prediction, called Ancestry Composition, covers 22 reference populations and that they utilize public reference datasets in addition to data from their clients with known ancestry.

23andMe asks geographic ancestry questions of their customers in the “where are you from” survey, then incorporates the results of individuals with all 4 grandparents from a particular country.  One of the ways they utilize this data is to show you where on your chromosomes you match people whose 4 grandparents are from the same country.  In their tutorial, they do caution that just because a grandparent was born in a particular location doesn’t necessarily mean that they were originally from that location.  This is particularly true in the past few generations, since the industrial revolution.  However, it may still be a useful tool, when taken with the requisite grain of salt.

23andme 4 grandparents

The fourth way of creating the underlying population data base is to utilize academically published information or information otherwise available.  For example, the Human Genome Diversity Project (HGDP) information which represents 1050 individuals from 52 world populations is available for scrutiny.  Ancestry, in their paper, states that they utilized the HGDP data in addition to their own customer database as well as the Sorenson data, which they recently purchased.

Academically published articles are available as well.  Family Tree DNA utilizes 52 different populations in their reference data base.  They utilize published academic papers and the specific list is provided in their FAQ.

As you can see, there are different approaches and tools.  Depending on which of these tools are utilized, the underlying data base may look dramatically different, and the information held in the underlying data base will assuredly affect the results.

Step 2:  Your Individual DNA Extraction

This is actually the easy part – where you send your swab or spit off to the lab and have it processed.  All three of the main players utilize chip technology today.  For example, 23andMe focuses on and therefore utilizes medical SNPs, where Family Tree DNA actively avoids anything that reports medical information, and does not utilize those SNPs.

In Ancestry’s white paper, they provide an excellent graphic of how, at the molecular level, your DNA begins to provide information about the geographic location of your ancestors.  At each DNA location, or address, you have two alleles, one from each parent.  These alleles can have one of 4 values, or nucleotides, at each location, represented by the abbreviations T, A, C and G, short for Thymine, Adenine, Cytosine and Guanine.  Based on their values, and how frequently those values are found in comparison populations, we begin to find correlations in geography, which takes us to the next step.

ancestry allele snps

Step 3:  Comparison to Underlying Population Data Base

Now that we have the two individual components in our recipe for ethnicity, a population reference set and your DNA results, we need to combine them.

After DNA extraction, your individual results are compared to the underlying data base.  Of course, the accuracy will depend on the quality, diversity, coverage and quantity of the underlying data base, and it will also depend on how many markers are being utilized or compared.

For example, Family Tree DNA utilizes about 295,000 out of 710,000 autosomal SNPs tested for ethnicity prediction.  Ancestry’s V1 product utilized about 30,000, but that has increased now to about 300,000 in the 2.0 version.

When comparing your alleles to the underlying data set one by one, patterns emerge, and it’s the patterns that are important.  To begin with, T, A, C and G are not absent entirely in any population, so looking at the results, it then becomes a statistics game.  This means that, as Ancestry’s graphic, above, shows, it becomes a matter of relativity (pardon the pun), and a matter of percentages.

For example, if the A allele above is shown in high frequencies in Eastern Europe, but in lower frequencies elsewhere, that’s good data, but may not by itself be relevant.  However if an entire segment of locations, like a street of DNA addresses, are found in high percentages in Eastern Europe, then that begins to be a pattern.  If you have several streets in the city of You that are from Eastern Europe, then that suggests strongly that some of your ancestors were from that region.
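The "statistics game" can be sketched in a few lines.  This is a toy illustration only: the two populations, the three DNA addresses and every frequency below are invented, the markers are treated as independent, and none of the vendors has said this is how their software works.  It simply asks which population's allele frequencies make a person's observed alleles most probable.

```python
import math

# Hypothetical frequencies of one allele at three DNA addresses.
# These numbers are invented for illustration, not real reference data.
REFERENCE = {
    "Eastern Europe": [0.80, 0.70, 0.75],
    "West Africa": [0.20, 0.30, 0.25],
}

def log_likelihood(copies_per_address, frequencies):
    """Log-probability of carrying 0, 1 or 2 copies of the allele at each address."""
    total = 0.0
    for copies, freq in zip(copies_per_address, frequencies):
        # Two alleles per address (one from each parent), so a binomial with n=2.
        probability = math.comb(2, copies) * freq**copies * (1 - freq) ** (2 - copies)
        total += math.log(probability)
    return total

def best_population(copies_per_address):
    """Population whose frequencies best explain the observed alleles."""
    return max(
        REFERENCE,
        key=lambda population: log_likelihood(copies_per_address, REFERENCE[population]),
    )

print(best_population([2, 2, 1]))
```

One address on its own barely moves the needle, but a run of addresses all leaning the same way is what makes a pattern.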

To show this in more detailed format, I’m shifting to the third party tool, GedMatch and one of their admixture tools.  I utilized this when writing the series, “The Autosomal Me” and in Part 2, “The Ancestor’s Speak,” I showed this example segment of DNA.

On the graph below, which is my chromosome painting of a small part of one of my chromosomes on the top, and my mother’s showing the exact same segment on the bottom, the various types of ethnicity are colored, or painted.

The grid shows location, or address, 120 on the chromosome and each tick mark is another number, so 121, 122, etc.   It’s numbered so we can keep track of where we are on the chromosome.

You can readily see that both of us have a primary ethnicity of North European, shown by the teal.  This means that for this entire segment, the results are that our alleles are found in the highest frequencies in that region.

Gedmatch me mom

However, notice the South Asian, East Asian, Caucasus, and North Amerindian. The important part to notice here, other than that I didn’t inherit much of that segment at 123-127 from her, except for a small part of East Asian, is that these minority ethnicities tend to nest together.  Of course, this makes sense if you think about it.  Native Americans would carry Asian DNA, because that is where their ancestors lived.  By the same token, so would Germans and Polish people, given the history of invasion by the Mongols. Well, now, that’s kind of a monkey-wrench isn’t it???

This illustrates why the results may sometimes be confusing as well as how difficult it is to “identify” an ethnicity.  Furthermore, small segments such as this are often “not reported” by the testing companies because they fall under the “noise” threshold of between about 5 and 7cM, depending on the company, unless there are a lot of them and together they add up to be substantial.

In Summary

In an ideal world, we would have one resource that combines all of these tools.  Of course, these companies are “for profit,” except for National Geographic, and they are not going to be sharing their resources anytime soon.

I think it’s clear that the underlying data bases need to be expanded substantially.  The reliability of utilizing contributed pedigrees as representative of a population indigenous to an area is also questionable, especially pedigrees that only reach back two generations.

All of these tools are still in their infancy.  Both Ancestry and Family Tree DNA’s ethnicity tools are labeled as Beta.  There is useful information to be gleaned, but don’t take the results too seriously.  Look at them more as establishing a pattern.  If you want to take a deeper dive by utilizing your raw data and uploading it to GedMatch, you can certainly do so. The Autosomal Me series shows you how.

Just keep in mind that with ethnicity predictions, with all of the vendors, as is particularly evident when comparing results from multiple vendors, “your mileage may vary.”  Now you know why!


Autosomal DNA, Ancient Ancestors, Ethnicity and the Dandelion

 dandelion 1

Understanding our own ancient DNA is a little different than contemporary DNA that we use for genealogy, but it’s a continuum between the two with a very long umbilical cord between them, then, and now.  And just when you think you’re about to understand autosomal DNA transmission and how it works, the subject of ancient DNA comes up.  This is particularly perplexing when all you wanted in the first place was a simple answer to the question, “who am I and who were my ancestors?”  Well, as you’ve probably figured out by now, there is no simple answer.

Inheritance

In a nutshell – we know that every generation gets divided by 50% when we’re talking about autosomal DNA transmission.

So you inherit 50% of the DNA of each of your parents.  They inherited 50% of the DNA of each of their parents, so you inherit ABOUT 25% of the DNA of each of your grandparents.

Did you see that word, about?  It’s important, because while you do inherit exactly 50% of the DNA of each parent, you don’t inherit exactly 25% of the DNA of each grandparent.  You can inherit a little less or a little more from either grandparent as your parent’s 50% that you’re going to receive is in the mixer.

This is also true for the 12.5% of each of your great-grandparents, and the 6.25% of each of your great-great-grandparents, and so forth, on up the line.

The chart below shows the percentages that you share from each generation.

Relationship to You                      Approximate % of Their DNA You Share
Parents                                  Exactly 50%
Grandparents                             25%
Great-grandparents                       12.5%
Great-great-grandparents                 6.25%
Great-great-great-grandparents           3.125%
Great-great-great-great-grandparents     1.5625%
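If you'd like to extend the chart yourself, the averages are just repeated halving.  Here is a minimal sketch; the function name is mine, and remember that beyond your parents these are averages, not guarantees.

```python
def expected_share(generations_back):
    """Average fraction of one ancestor's DNA you carry after this many halvings."""
    return 0.5 ** generations_back

labels = [
    "Parents",
    "Grandparents",
    "Great-grandparents",
    "Great-great-grandparents",
    "Great-great-great-grandparents",
    "Great-great-great-great-grandparents",
]

# Parents are 1 generation back, grandparents 2, and so on.
for generations_back, label in enumerate(labels, start=1):
    print(f"{label}: {expected_share(generations_back) * 100:g}%")
```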

Ethnicity

So, here’s the question posed by people trying to understand their ethnicity.

If I have 3% Melanesian (or Middle Eastern, Indo-Tibetan or fill-in-the-blank ethnicity), doesn’t that mean that one of my great-great-great-grandparents was Melanesian?

There are really two answers to this question.  (I can hear you groaning!!!)

If the amount is 25% (for example) and not very small amounts, then the answer would be yes, that is very likely what this is telling you.  Or maybe it’s telling you that you have two different great-grandparents who have 12.5% each – but those relatives are fairly close in time due to the amount of DNA that came from that region.  See, that was easy.
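Running the halving rule backwards gives the back-of-the-envelope estimate people usually have in mind when they ask this question.  This sketch assumes the whole percentage came from one single ancestor, which is exactly the assumption that stops holding up once the percentages get small.  The function name is hypothetical.

```python
import math

def generations_for_share(percent):
    """Generations back at which a single ancestor contributes this share on average."""
    return math.log2(100 / percent)

print(generations_for_share(25))     # 2 generations back, the grandparent level
print(generations_for_share(12.5))   # 3 generations back, the great-grandparent level
print(generations_for_share(3))      # a little over 5 generations back
```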

However, the answer changes when we’re down in the very small percentages, below 5%, often in the 1 and 2% range.  This answer isn’t nearly as straightforward.

The Dandelion – Your Ancestor

The answer is the dandelion.

dandelion 2

The dandelion is one of your ancestors who lived in the Middle East, let’s say, 20,000 years ago, maybe 30,000 years ago.  In case you’re counting generations, that is 800 to 1200 generations ago.  The percentage of DNA you would carry from a single ancestor who lived 20,000 years ago, assuming you only descended from that ancestor 1 time, is infinitesimally small.  There are more zeroes following that decimal point than I have patience to type.  Let’s call that ancestor Xenia and let’s say she is a female.

However, you did inherit DNA from many of your ancestors who lived 20,000 years ago, thousands of them, because all of them, through their descendants, make up the DNA you carry today.  So infinitesimally small or not, you do carry some of the DNA of some of those ancestors.  It’s just broken into extremely small pieces today and their individual contributions to you may be extremely small.  You don’t carry any DNA from some of them, actually, probably most of them, due to the recombination event, dividing their DNA in half, happening 800 times, give or take.

Now, given that your ancestors’ DNA is divided in every generation by approximately half, and we know there are about 3 billion base pairs on all of your chromosomes combined, this means that by generation 32 or 33, on average, you carry 1 segment from this ancestor.  By generation 45, you carry, on average, .00017 segments of this ancestor’s DNA.  And for those math aficionados among us, this is the mathematical notation for how much of our ancestor’s DNA we carry after 800 generations: 4.4991E-232.
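For anyone who wants to check the arithmetic, here is a small sketch of the halving calculation.  It uses the same round figure of 3 billion base pairs.  As far as I can tell, the generation 32, 33 and 45 figures above line up if you count yourself as generation 1 (so generation 45 is 44 halvings), while the 800 generation figure corresponds to 800 halvings.  That reading of the counting is my assumption, and the function name is made up.

```python
GENOME_BASE_PAIRS = 3_000_000_000  # round figure used in the text

def expected_base_pairs(halvings):
    """Average base pairs inherited from one ancestor after this many halvings."""
    return GENOME_BASE_PAIRS / 2 ** halvings

# 31 and 32 halvings straddle a single base pair; 44 and 800 match the figures quoted above.
for halvings in (31, 32, 44, 800):
    print(f"{halvings} halvings: {expected_base_pairs(halvings):.4E}")

# The 800 figure itself comes from 20,000 years at an assumed 25 years per generation.
print(20_000 // 25)
```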

But, we also know that this dividing in half, on the average, doesn’t always work exactly that way in reality, because some of those ancestors from 20,000 years ago did in fact pass their DNA to you, despite the infinitesimal odds against that happening.  Some of their DNA was passed intact generation after generation, to you, and you carry it today.  The DNA contributed by any one ancestor from 800 generations ago is probably limited to one or two locations, or bases, but still, it’s there, and it’s the combined DNA of those ancient ancestors that make us who we are today.

The autosomal DNA of any specific ancestor from long ago is probably too small and fragmented to recognize as “theirs” and attribute to them.  Of course, the beauty of Y DNA and mitochondrial is that it is passed intact for all of those generations.  But for autosomal DNA and genealogy, we need hundreds of thousands of DNA pieces in a row from a particular ancestor to be recognizable as “theirs.”  When we measure DNA for genealogy, what we are measuring is both centiMorgans, a measure of distance between chromosome positions (length) and the number of contiguous SNP (Single Nucleotide Polymorphism) base locations that match (quantity).  The values from these calculations tell us how closely we are related to people, because remember, DNA is divided in each generation so there is a mathematically predictable amount we will share with specific relatives.

Here is an example from a Family Finder comparison table showing both centiMorgans and matching SNPs with a second cousin.

family finder table

The matching threshold for genealogical significance is either 5 or 7 cM depending on which of the major companies you are using.  At Family Tree DNA, if you match above the threshold, then you can view down to 1cM, which is the case above.  Another match criterion is the number of SNPs, or locations, matching contiguously.  Anything below about 500-800 is considered to be a population match, not a genealogical match, unless you also have a significant number of genealogical matches at higher cMs and segments with this person.
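As a rough sketch, those two criteria can be written as a simple rule of thumb.  The function, its defaults and its labels are my own illustration using the approximate thresholds mentioned above.  It deliberately ignores the "unless you also have other significant matches" exception, and it is not how any vendor's matching software actually works.

```python
def classify_segment(length_cm, contiguous_snps, cm_threshold=7.0, snp_threshold=500):
    """Label one shared segment using the approximate thresholds from the text."""
    if length_cm >= cm_threshold and contiguous_snps >= snp_threshold:
        return "genealogical"
    return "population"

print(classify_segment(12.3, 2500))                   # long segment, many SNPs
print(classify_segment(1.8, 400))                     # short segment, few SNPs
print(classify_segment(5.5, 700, cm_threshold=5.0))   # using the lower 5 cM cutoff
```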

OK, where is all of this going?

Dispersion

Think of your ancestor 20,000 years ago as the dandelion.  Now, blow.

dandelion 3

Xenia lived in the Middle East.  Where might her descendants land, over time, with every new generation?  In Europe?  In Asia?  In India?  In America via the Native Americans through Asia?  In North Africa?  Where?

So let’s say that groups of descendants settle across the globe.  Let’s say that her mitochondrial haplogroup is X.  Yes, haplogroup X is found both in Europe and in Asia and in the Native Americans, so this is actually a good example.  So Xenia carried mitochondrial haplogroup X and we know for sure via mitochondrial DNA testing that indeed, Xenia’s seeds were scattered to all of the winds.  The only places we haven’t found Xenia’s children are sub-Saharan Africa and the Australian archipelago, at least not yet.

Ok, so now that we know where her children and their children went, let’s go back to ancient DNA.

Predictive DNA

The way ethnicity is determined is by studying the frequency with which a specific allele or group of alleles is found in any particular population.  Two “pure” examples come to mind.

The first example is the Duffy Null allele that is only found in the sub-Saharan African populations.  Currently this marker is found in about 68% of American blacks and in 88-100% of African blacks.  If you have the Duffy Null allele, you have African heritage.  Of course, you don’t know which line or which ancestor it came from, or how far back in time, but it assures you that you do in fact have African heritage.  It could have been from an ancestor long ago.  It could have been very recent.  This is one of the factors considered when determining percentage of ethnicity.

A second example is the STR marker known as D9S919 which is present in about 30% of the Native American people.  The value of 9 at this marker is not known to be present in any other ethnic group, so this mutation occurred after the Native people migrated across Beringia into the Americas, but long enough ago to be present in many descendants.  There is also no other known marker that is found only among Native Americans, although I expect as we move into full genome sequencing we will discover more.  You can test this marker individually at Family Tree DNA, which is the only lab that offers this test.  If you have the value of 9 at this marker, it confirms Native heritage, but if you don’t carry 9, it does NOT disprove Native heritage.  After all, many Native people don’t carry it.  Again, you don’t know how long ago this marker was introduced into your ancestry.

These two examples are very unique because the markers are found only in certain groups.  Generally, with the rest of the DNA values, they are found in different amounts, or frequencies, in different parts of the world and ethnic groups.

So, if you’re trying to determine the ethnicity of an individual, you’re going to compile a huge data base of the percentages at which the DNA values of Ancestry Informative Markers (AIMs) are found in different parts of the world.

So, you would compare the participant’s values against your data base and you will come up with those regions or ethnicities that are present most often in your comparison.  This is exactly what the products and services that provide you with your ethnicity percentages do – and how accurate the results are depends highly on the data base itself, the amount of data, and the quality of data.  Dare I mention Ancestry’s issue that they’ve had since they first began offering their autosomal product over a year ago where everyone seems to have Scandinavian ancestry?  Ancestry doesn’t share with us their sources, so as a community we have no idea how they have come up with these numbers.
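For readers who like to see the idea in code, here is a minimal sketch of that kind of frequency comparison.  Everything in it is an assumption made for illustration: the marker names, the two reference populations and their frequencies are invented, and no vendor has told us this is how their calculation works.  It simply scores how well a participant's values fit each population's frequencies and ranks the populations.

```python
import math

# Hypothetical reference data base: for each population, the fraction of
# people who carry a given marker value. All numbers are made up.
REFERENCE = {
    "Population A": {"marker1": 0.90, "marker2": 0.10, "marker3": 0.50},
    "Population B": {"marker1": 0.20, "marker2": 0.70, "marker3": 0.50},
}


def log_likelihood(carried, freqs):
    """Score how well one population's frequencies fit the participant.

    `carried` maps marker name -> True/False (does the participant carry it).
    Frequencies must be strictly between 0 and 1 so the logarithm is defined.
    """
    total = 0.0
    for marker, has_value in carried.items():
        p = freqs[marker]
        total += math.log(p if has_value else 1.0 - p)
    return total


def rank_populations(carried, reference=REFERENCE):
    """Return population names, best fit first."""
    scores = {pop: log_likelihood(carried, freqs) for pop, freqs in reference.items()}
    return sorted(scores, key=scores.get, reverse=True)


participant = {"marker1": True, "marker2": False, "marker3": True}
print(rank_populations(participant))
```

Notice that marker3, which is equally common in both invented populations, contributes nothing to the ranking.  That is the same reason the quality and size of the data base matter so much.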

You can easily compare your autosomal results in nauseating detail at both 23andMe and Family Tree DNA by testing with both companies, or by testing with either 23andMe or Ancestry and transferring your autosomal results to Family Tree DNA.  All 3 of these companies will give you a somewhat different result, but they should be in the same ballpark.  You can also then download your raw data file from any of those vendors and upload it to www.gedmatch.com where you can then do ethnicity comparisons using a variety of tools.  These tools, an example shown below, will have much more variance and detail than the vendor’s tools or results.  And because of that, they tend to be more confusing as well.

gedmatch example

Many people with small amounts of minority admixture are disappointed with the results through the vendors, especially if their Native American admixture doesn’t show.  I wrote extensively about this in my series, The Autosomal Me, so I won’t rehash it here, but using the GedMatch tools is very enlightening, as you can see above with my results.  And do I really have Indo-Tibetan and Indo-Iranian ancestors?

Where’s Xenia?

Back to Xenia and her descendants.  Let’s say that Xenia’s descendants settled in four primary locations.  One is in the Middle East – they never left home.  One is in Asia and, from there, one went on to the Americas to become the Native Americans, and lastly, one is in Europe.  Now let’s say there is a pocket of them in the Altai region of Asia and a pocket in France.  The Altai is the ancestral home of the Native Americans and could explain the Indo-Tibet result, above.  We’ll call that Central Asia.  And France is where my Acadian ancestors were from.  Hmmm….this is getting confusing.  To make matters even more confusing, I might well descend from both groups, who originally descended from Xenia.

Let’s say that I do in fact carry small segments of Xenia’s DNA.  Now let’s say that this same DNA is found in a group of people in Central Asia, maybe in Tibet, it’s published in an obscure journal someplace, and it finds its way into a data base.  Voila – there you go – I now have a match in Central Asia in a place called Indo-Tibet.  But do I really?

Does this mean that my ancestor was from Central Asia?  Not necessarily.  And if so, maybe not recently, but the people from that location for some reason share some of the DNA that I carry.  The question of course is why, how and when?

What this really means to you is a matter of degrees.  If you have a few matches from obscure regions, along with very small percentages, it is likely a result of the dandelion’s dispersion.  If you have a lot of matches, meaning a high percentage hit rate, from a particular region, pay attention, it probably has some genealogical significance.

It’s no wonder people are confused by this!  Now, just think how many dandelions you have.  In 15 generations, you have 32,768 ancestors.  In fact, this is how we know for sure that we all descend from the same ancestor multiple times.  Our number of ancestors quickly exceeds the world population.  In 30 generations (at 25 years each), in about the year 1263, we reach about 1 billion ancestors.  In 1750, there were 791 million people on Earth, in 1600, 580 million, in 1500, 458 million and in 1000, 310 million.

Ancestors - Years

We know that we very likely descend several times from a much smaller group of ancestors from isolated local populations.  However, just looking at the 32,000+ ancestors in 15 generations, it’s still an entire dandelion field!!!
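The doubling arithmetic behind those numbers can be sketched in a few lines.  This is only an illustration: the 25-year generation comes from the text above, while the starting year of 2013 is my assumption, inferred from the fact that 30 generations back lands on 1263.

```python
YEARS_PER_GENERATION = 25  # generation length used in the text
START_YEAR = 2013          # assumed starting year (1263 + 30 * 25)


def ancestors_in_generation(n):
    """Number of ancestor slots in generation n (2 parents, 4 grandparents, ...)."""
    return 2 ** n


def approximate_year(n, start_year=START_YEAR, years_per_generation=YEARS_PER_GENERATION):
    """Rough calendar year in which generation n lived."""
    return start_year - n * years_per_generation


for n in (15, 30):
    print(n, ancestors_in_generation(n), approximate_year(n))
```

These are ancestor slots, not distinct people.  Because the count at 30 generations is larger than the world population figures quoted above, many of those slots must be filled by the same individuals.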

______________________________________________________________

Disclosure

I receive a small contribution when you click on some of the links to vendors in my articles. This does NOT increase the price you pay but helps me to keep the lights on and this informational blog free for everyone. Please click on the links in the articles or to the vendors below if you are purchasing products or DNA testing.

Thank you so much.

DNA Purchases and Free Transfers

Genealogy Services

Genealogy Research

Mitochondrial DNA Smartmatching – The Rest of the Story

Sometimes, a match is not a match.  I know, now I’ve gone and ruined your day…

One of the questions that everyone wants the answer to when looking at matches, regardless of what kind of DNA testing we’re talking about, is “how long ago?”  How long ago did I share a common ancestor with my match?  Seems like a pretty simple question doesn’t it?

The answer, especially with mitochondrial DNA, is not terribly straightforward.  A perfect example of this fell into my lap this week, and I’m sharing it with you.

Mitochondrial DNA – A Short Primer

There are three regions that are tested in mitochondrial DNA testing for genealogy.  The HVR1 and HVR2 regions are tested at most testing companies, and at Family Tree DNA, the rest of the mitochondria, called the coding region, is tested as well with the full mitochondrial sequence test.  This is the mitochondrial equivalent of Paul Harvey’s “the rest of the story,” and of course we all know that the real story is always in “the rest of the story” or he wouldn’t be telling us about it!

Many times, the rest of the story is critically important.  In mitochondrial DNA, it’s the only way to obtain your full haplogroup designation.  If you don’t want to just be haplogroup J or A or H, you can test the coding region by taking the full sequence test and find out that you’re J1c2 or A2 or H21, and discover the story that goes with that haplogroup.  Guaranteed, it’s a lot more specific than the one that goes with simple J, A or H.  Often it’s the difference between where your ancestor was 2000 years ago and 20,000 years ago – and they probably covered a lot of territory in 18,000 years!

Let’s take a quick look at mitochondrial DNA.

To begin with, the HVR1 and HVR2 regions are called HVR for a reason – it’s short for hypervariable.  And of course, that means they vary, or mutate, a lot more rapidly, as compared to the coding region of the mitochondrial DNA.

In layman’s terms, think of a clock.  No, not a digital clock, an old-fashioned alarm clock.

alarm clock

The entire mitochondrial DNA has 16,569 locations.  The HVR1 and HVR2 regions take up the space on the clock face from 5 till the hour until 5 after the hour.   The rest is the coding region – the mitochondrial “rest of the story.”  The coding region mutates much more slowly than the two HVR regions.

Just to be sure we’re on the same page, let’s talk for just a minute about how mitochondrial haplogroup assignments work.  For a detailed discussion of haplogroup assignments and how they are done, see Bill Hurst’s discussion here.

Generally a base haplogroup can be reasonably assigned by HVR1 region testing, but not always.  Sometimes the assignment changes with full sequence testing – so what you think you know may not be the end result.

My full haplogroup is J1c2f.  My base haplogroup is J.  I’m on the first branch of J, J1.  On branch J1, I’m on the third stick, c, J1c.  On the third stick J1c, I’m on the second twig, J1c2.  On the second twig, J1c2, I’m leaf f, or J1c2f.  Each of these branches of haplogroup J is determined by a specific mutation that happened long ago and was then passed to all of that person’s offspring, between them and me today.  The question is always, how long ago?
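The branch, stick, twig and leaf idea can be sketched as a small helper that splits a haplogroup name into its levels.  This is a simplified illustration of the naming pattern only, assuming names that alternate capital letters, numbers and single lowercase letters like the ones in this article.  The function names are my own, and names that use other punctuation are not handled.

```python
import re
from itertools import accumulate


def haplogroup_lineage(haplogroup):
    """Return every level of a haplogroup name, from the base outward.

    Example: "J1c2f" -> ["J", "J1", "J1c", "J1c2", "J1c2f"]
    """
    # One level per token: the base letter(s), then each number or lowercase letter.
    tokens = re.findall(r"[A-Z]+|\d+|[a-z]", haplogroup)
    if not tokens or "".join(tokens) != haplogroup or not tokens[0].isupper():
        raise ValueError(f"unsupported haplogroup name: {haplogroup!r}")
    # accumulate() concatenates the tokens step by step.
    return list(accumulate(tokens))


def is_on_branch(haplogroup, branch):
    """True if `branch` is the haplogroup itself or one of its parent levels."""
    return branch in haplogroup_lineage(haplogroup)


print(haplogroup_lineage("J1c2f"))
print(is_on_branch("J1c2f", "J1c2"))
```

Each step in the returned list corresponds to one haplogroup-defining mutation (or set of mutations) further out on the tree.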

Mutation Rates – How Long Ago is Long Ago?

While we have a tip calculator at Family Tree DNA for Y-line DNA to predict how long ago 2 Y-line matches shared a most recent common ancestor, we don’t have anything similar for mitochondrial DNA, partly because of the great variation in the mutation rates for the various regions of mitochondrial DNA.  Family Tree DNA does provide guidelines for the HVR1 region, but they are so broad as to be relatively useless genealogically.  For example, at the 50th percentile, you are likely to have a common ancestor with someone whom you match exactly on the HVR1 mutations in 52 generations, or about 1300 years ago, in the year 713.  Wait, I know just who that is in my family tree!

These estimates do not take into account the HVR2 or coding regions.

I did some research jointly with another researcher not long ago attempting to determine the mutation rate for those regions, and we found estimates that ranged from 500 years to several thousand years per mutation occurrence and it wasn’t always clear in the publications whether they were referring to the entire mitochondria or just certain portions.  And then there are those pesky hot-spots that for some reason mutate a whole lot faster than other locations.  We’re not even going there.  Suffice it to say there is a wide divergence in opinion among academics, so we probably won’t be seeing any type of mito-tip calculator anytime soon.

Enter SmartMatching

Family Tree DNA does their best to make our matches useful to us and to eliminate matches that we know aren’t genealogically relevant.

For example, this week, I was working on a client’s DNA Report.  Let’s call him Joe.  Joe is haplogroup J1c2.  I am haplogroup J1c2f.  J1c2f has one additional haplogroup defining mutation, in the coding region, that J1c2 does not have.

Joe and I did not show as matches at Family Tree DNA, even though our HVR1 and HVR2 regions are exact matches.  Now, for a minute, that gave me a bit of a start.  In fact, I didn’t even realize that we were exact matches until I was working with his results at MitoSearch and recognized my own User ID.

I had to think for a minute about why we would not be considered matches at Family Tree DNA, and I was just about ready to submit a bug report, when I realized the answer was my extended haplogroup.  This, by the way, is the picture-perfect example of why you need full sequence testing.

Family Tree DNA knows that we both tested at the full sequence level.  They know that with a different haplogroup, we don’t share a common ancestor in hundreds to thousands of years, so it doesn’t matter if we match exactly on the HVR1 and HVR2 levels, we DON’T match on a haplogroup defining mutation, which, in this case, happens to be in the coding region, found only with full sequence testing.  Even if we have only one mismatch at the full sequence level, if it’s a haplogroup defining marker, we are not considered matches.  Said a different way, if our only difference was location 9055 and 9055 was NOT a haplogroup defining mutation, we would have been considered a match on all three levels – exact matches at the HVR1 and HVR2 levels and a 1 mutation difference at the full sequence level.  So how a mutation is identified, whether it’s haplogroup defining or not, is critical.

In our case, I carry a mutation at marker 9055 in the coding region that defines haplogroup J1c2f.  Joe doesn’t have this mutation, so he is not J1c2f, just J1c2.  So we don’t match.
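The reasoning can be sketched as a simple rule.  To be clear, this is my own toy version of the logic described above, not Family Tree DNA's actual code, and the class name, function name and mutation labels are all invented for the example.  It only covers the case discussed here: two people whose HVR1 and HVR2 results are identical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MitoResult:
    haplogroup: str             # e.g. "J1c2f"
    hvr_mutations: frozenset    # HVR1 + HVR2 mutations (placeholder labels here)
    full_sequence_tested: bool  # True if the coding region was tested too


def shown_as_match(a, b):
    """Toy rule: exact HVR matches are hidden when full sequence testing
    places the two people in different haplogroups."""
    if a.hvr_mutations != b.hvr_mutations:
        return False  # not an exact HVR1+HVR2 match to begin with
    if a.full_sequence_tested and b.full_sequence_tested:
        return a.haplogroup == b.haplogroup
    return True  # without full sequence results the match cannot be ruled out


hvr = frozenset({"hvr-mutation-1", "hvr-mutation-2"})  # placeholder values
me = MitoResult("J1c2f", hvr, True)
joe = MitoResult("J1c2", hvr, True)
mom = MitoResult("J1c2f", hvr, True)

print(shown_as_match(me, joe))  # different haplogroups
print(shown_as_match(me, mom))  # same haplogroup
```

If either person had tested only at the HVR levels, this rule would still show the match, because there would be no coding region result to rule it out.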

So – How Long Ago for Me and Joe?

Dr. Behar in his “Copernican Reassessment of the Mitochondrial DNA Tree,” which has become the virtual Bible of mitochondrial DNA, estimates that the J1c2f haplogroup defining mutation at location 9055 occurred about 2000 years ago, plus or minus another 3000 years, which means my ancestor who had that mutation could have lived as long ago as 5000 years.

The mutations that define haplogroup J1c2 occurred about 9800 years ago, plus or minus another 2000.  So we know that Joe and I share a common ancestor about 7,800 – 11,800 years ago and our lines diverged sometime between then and 2,000 – 5,000 years ago.  So, in round numbers our common ancestor lived between 2,000 and 9,800 years ago.  Not much chance of identifying that person!

The ability to eliminate “near-misses” where the HVR1+HVR2 matches but the people aren’t in the same haplogroup, which is extremely common in haplogroup H, is actually a very useful feature that Family Tree DNA nicknamed SmartMatching.  With over 1000 matches at the HVR1 level, more than 200 at the HVR1+HVR2 level and another 50+ at the full sequence level, Joe certainly didn’t need to have any “misleading” matches included that could have been eliminated by a logic process.

So while Joe and I match, technically, if you only look at the HVR1 and HVR2 levels, we don’t really match, and that’s not evident at MitoSearch or at Ancestry or anyplace else that does not take into consideration both full sequence AND haplogroup defining mutations.  Family Tree DNA is the only company that does this.

It’s interesting to think about the fact that 2 people can match exactly at the HVR1+HVR2 levels, but the distance of the relationship can be vastly different.  I also match my mother on the HVR1+HVR2 levels, exactly, and our common ancestor is her.  So the distance to a common ancestor with an exact HVR1+HVR2 match can be anyplace from one generation (Mom) to thousands of years (Joe), and there is no way to tell the difference without full sequence testing and in this case, SmartMatching.

And that, my friends, is the rest of the story!

The Warrior Gene

warrior 1

In sports, business or your personal life, how you respond to stress and aggression may be in your genes, or at least partly so.  Let’s take a look at a great documentary and the science behind it.

Human behavior is complex and influenced by our genes, our environment, and our circumstances. One of the most provocative and often controversial of genetic variants has been dubbed the “Warrior Gene.”

Studies have linked the “Warrior Gene” to increased risk-taking and to retaliatory behavior. Men with the “Warrior Gene” are not necessarily more aggressive, but they are more likely to respond aggressively to perceived conflict.

On December 14, 2010, National Geographic Channel’s Explorer: “Born to Rage?” documentary investigated the discovery behind a single “warrior gene” directly associated with violent behavior.

warrior 2

With bullying and violent crime making headlines, this controversial finding stirs up the nature-versus-nurture debate. Now, former Grammy-winning rocker, author and radio/television broadcaster Henry Rollins goes in search of carriers from diverse, sometimes violent backgrounds who agree to be tested for the genetic mutation. Who has the warrior gene? And are all violent people carriers? The results turn assumptions upside down.

warrior 3

A rock band front man. A bullet-scarred Harley rider. A former gang member from East L.A. Even a Buddhist monk with a far-from-peaceful past. Which one carries the gene associated with violence? An extraordinary discovery suggests that some men are born with impulsive, aggressive behavior … but it’s not always who you think.

It’s a hotly debated topic: nature versus nurture. Many experts believe our upbringing and environment are the primary influences on our behavior, but how much are we predisposed by our DNA? The discovery of a single gene variation affecting only men, which appears to play a crucial role in managing anger, argues that nature may have a far bigger influence on behavior. It’s this low-functioning, shortened gene linked to violent behavior that has become known as the “warrior gene,” and one-third of the male population has it.

One of those men, who describes himself as “fairly furious all the time” and agrees to be tested for the gene with a simple cheek swab, is Henry Rollins — a former poster boy of youthful rebellion and the American punk scene.  Some of his tattoos are too provocative and socially offensive to show. 

warrior 4

In this special Explorer episode, he dives into his own history of rage and searches out others with aggressive behavior from a range of different backgrounds. “If you can think of a stove, and the pilot light is always on, always ready to light all four burners, that is me, all the time,” he says. “I’m always ready to go there.”

Follow Rollins as he meets with former foot soldiers in one of the most violent street gangs in East Los Angeles; fighters in the ultraviolent sport of mixed martial arts, and Harley Davidson bikers. He’ll also talk to a Navy SEAL veteran and Buddhist monks whose lives weren’t always so tranquil.

After learning more about the warrior gene, many of the men believe they have it, which could offer an explanation of their past behavior. Their sentiment mimics Rollins as he says, “If I find out that I have the warrior gene, that would be interesting. If I find out I don’t, I must say, I would feel a bit of disappointment.” As the anticipation builds, be there when they receive the surprising outcome of the test.

Then, Explorer takes a look at the original study — on one family with generations of men displaying patterns of extreme physical aggression — that led Dutch geneticist Dr. Han Brunner to the revolutionary discovery of this rare genetic dysfunction. We’ll also take a look at new revelations that warrior gene carriers are significantly more likely to punish when provoked. In one study attempting to demonstrate this, subjects are given permission to administer punishment to their partner (who was secretly instructed to make a nuisance of himself), with unexpected results.

For any man questioning his inner warrior, a simple cheek swab test is available at Family Tree DNA.

So wanna know who, in the documentary, had the warrior gene?  Well, hint….it wasn’t the biker…although his lady assured him he would always be her warrior.  But I’m not going to tell you who does have it.  All I’ll say is that you’ll be amazed at the outcome.  The link to watch the video is below.  Enjoy!

http://topdocumentaryfilms.com/born-rage-inside-warrior-gene/

The Science

Let’s take a look at the actual science behind this most interesting and controversial mutation.

The Warrior Gene is a variant of the gene MAO-A on the X chromosome and is one of many genes that play a part in our behavioral responses. The “Warrior Gene” variant reduces function in the MAOA gene. Because men have one copy of the X-chromosome, a variant that reduces the function of this gene has more of an influence on them. Women, having two X-chromosomes, are more likely to have at least one normally functioning gene copy, and scientists have not studied variants in women as extensively.

Recent studies have linked the Warrior Gene to increased risk-taking and aggressive behavior. Whether in sports, business, or other activities, scientists found that individuals with the Warrior Gene variant were more likely to be combative than those with the normal MAO-A gene. However, human behavior is complex and influenced by many factors, including genetics and our environment. Individuals with the Warrior Gene are not necessarily more aggressive, but according to scientific studies, are more likely to be aggressive than those without the Warrior Gene variant.

This test is available for both men and women, however, there is limited research about the Warrior Gene variant amongst females. Additional details about the Warrior Gene genetic variant of MAO-A can be found in the paper titled “A functional polymorphism in the monoamine oxidase A gene promoter” by Sabol et al, 1998.

When testing for the Warrior Gene, we are looking for an absence of MAOA (monoamine oxidase A) on the X chromosomes. Based on how many times we see the repeat of a certain pattern on the X or Xs we can tell if the MAOA is present or absent (depleted). Three repeats of the pattern indicate that the X chromosome is deficient in MAOA and therefore you have the Warrior Gene. If we see 3.5, 4 or 5 repeats of the pattern, MAOA is present and this is a normal variant of the gene on your X chromosome.

warrior 6

However, women have 2 X chromosomes where men have 1 X and 1 Y. As mentioned above, the gene is carried on the X chromosome, so women can either have it 1) not at all, 2) on only 1 X (therefore making them a carrier), or 3) on both Xs (exhibiting the trait).

Looking at results, with one X-chromosome, men with the “Warrior Gene” will show a value of 3. Other men will have normal variants: 3.5, 4, 4.5 or 5. With two X-chromosomes, women will have two results. For example, a woman might have 3 and 3, 3 and 5, or 4.5 and 5.

This first example is of a female with one copy of the normal variant and one copy of the Warrior Gene indicated by a value of 3.

warrior 7

In the second example, shown below, this female has the Warrior Gene trait, because she carries the Warrior Gene depletion, shown as a value of 3, on both of her chromosomes, the one contributed to her by her father and the one contributed to her by her mother.  This also tells us that her father has the Warrior Gene, since he carries only the X chromosome contributed by his mother, which he gave to his daughter.  It also tells us that her mother was either a carrier, if she had only the one copy she gave to her daughter, or had the Warrior Gene herself if she carried two copies.

warrior 8

A male’s results would have only one result listed.  If he has a value of 3, he has the Warrior Gene.  Any other value is NOT indicative of the Warrior Gene.
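The way these values are read can be sketched as a small lookup.  This is only an illustration of the rules described above, with a function name and wording of my own, and it is not the lab's software.

```python
WARRIOR_REPEATS = 3  # a value of 3 indicates the Warrior Gene variant


def interpret_maoa(repeat_values):
    """Interpret one result (male, one X) or two results (female, two Xs)."""
    if len(repeat_values) not in (1, 2):
        raise ValueError("expected one value (male) or two values (female)")
    copies = sum(1 for value in repeat_values if value == WARRIOR_REPEATS)
    if len(repeat_values) == 1:
        return "has the Warrior Gene" if copies == 1 else "normal variant"
    # Two X chromosomes: none, one copy (carrier), or both copies (trait).
    return {
        0: "does not carry the Warrior Gene",
        1: "carrier",
        2: "has the Warrior Gene trait",
    }[copies]


print(interpret_maoa([3]))       # a man with a value of 3
print(interpret_maoa([3, 5]))    # a woman with one copy
print(interpret_maoa([3, 3]))    # a woman with two copies
print(interpret_maoa([4.5, 5]))  # a woman with two normal variants
```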

Happiness Gene in Women

In an unexpected turn of events, in August 2012, another study in the journal Progress in Neuro-Psychopharmacology & Biological Psychiatry indicates that while this gene may express as aggression in men, it may be the happiness gene in women.  Even women with only one copy of the gene were shown to be happier than women who carry no copies. A study of 193 women and 152 men evaluated their happiness level and women who carried this mutation on one or both X chromosomes rated themselves as significantly happier than women who did not carry this trait.  There was no difference in the male participants.

http://www.livescience.com/22789-gene-linked-to-happiness-in-women.html

Caveat

Among the many advances and discoveries of modern DNA and genetics are ‘scientific’ oddities. These genetic wonders make it into popular culture and sometimes develop a life there that far outpaces their academic worth.  But they are interesting. These factoids are best used as ‘cocktail party conversation’ starters or maybe a good way to tease Uncle Leo at the family picnic. Family Tree DNA, where you can find out if you have the Warrior Gene, portrays it to their customers as just that, a novelty.
