Haplogroups, SNPs and Family Group Confusion

The transition at Family Tree DNA from the old haplogroup naming convention to the new SNP-only naming convention has generated a great deal of confusion.  It’s like surgery – had to be done – but it has been painful.

I’ve received several questions, many that are similar, so I’d like to attempt to resolve some of the confusing points here.

First, just a little background.

Ancient History

Remember, in 2008, when Michael Hammer et al rewrote the Y tree?  If you do, then count yourself as an old-timer.  Names such as R1b1c became R1b1a2.  E3a became E1b1a and E3b became E1b1b1.  We thought we were all going to die.  But we didn’t – and now, if I hadn’t just told you, you wouldn’t even be able to remember the previous name of R1b1a2.

Why did this happen?  Because when you have a step-wise tree where each step is given a number and letter, like this, you have no room for expansion.

R

R1

R1a

R1a1

Each of these haplogroup names is assigned a SNP, and when a new SNP is discovered between R and R1, for example, the name R1 gets assigned to the new SNP and everyone downstream gets renamed and/or a new SNP assigned.  If you think this is confusing, it is and was – terribly so.  In fact, as testimony to this, the last version of the FTDNA tree, the ISOGG tree and the tree used by 23andMe are entirely out of sync with each other.

With the shift from about 800 SNPs to 12,000 SNPs with the Geno2.0 chip, it was definitely time to redo and rethink how haplogroup names are assigned.  The step-wise names, which seemed initially like a great idea, turned out not to be when the magnitude of the number of SNPs that actually exist was realized.  In reality, the longhand names needed to be obsoleted, but the familiar cadence of the letter-number path will forever be gone – with the exception of the fact that the SNP is prefaced with the haplogroup name.  We will no longer have our signposts, sadly, but our signposts were becoming overwhelmingly long.  Here’s one example I copied from the ISOGG tree.  R1b1a2a1a1c2b2a1a1b2a1a – seriously – I can’t remember that.

So, today, and forever more, R1b1a2 will be R-M269.  It will not be shifted or “become” anything else.  Moving a SNP to a new location becomes painless, because it will not affect anything upstream or downstream.
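To make the renaming problem concrete, here is a minimal Python sketch.  The SNP labels (`snp_R`, `snp_1` and so on), the tree shape and the helper `longhand_names` are all invented for illustration; this is not how any testing company actually builds its tree, just a toy model of step-wise naming.

```python
LETTERS = "abcdefghijklmnopqrstuvwxyz"

def longhand_names(children, root_snp, root_name):
    """Walk the tree and build step-wise names that alternate digits and letters."""
    names = {root_snp: root_name}
    stack = [root_snp]
    while stack:
        snp = stack.pop()
        parent_name = names[snp]
        # After a letter comes a digit, after a digit comes a letter.
        use_digit = parent_name[-1].isalpha()
        for i, child in enumerate(children.get(snp, [])):
            step = str(i + 1) if use_digit else LETTERS[i]
            names[child] = parent_name + step
            stack.append(child)
    return names

# Hypothetical SNP labels; each key lists the SNPs directly below it.
old_tree = {"snp_R": ["snp_1"], "snp_1": ["snp_a"], "snp_a": ["snp_x"]}
# The same tree after a new SNP is discovered between R and R1.
new_tree = {
    "snp_R": ["snp_new"],
    "snp_new": ["snp_1"],
    "snp_1": ["snp_a"],
    "snp_a": ["snp_x"],
}

before = longhand_names(old_tree, "snp_R", "R")
after = longhand_names(new_tree, "snp_R", "R")

for snp in ["snp_1", "snp_a", "snp_x"]:
    # The longhand name shifts, while the shorthand label is tied to the SNP itself.
    print(f"{snp}: longhand {before[snp]} -> {after[snp]}, shorthand stays R-{snp}")
```

In this toy model, one inserted SNP renames every branch below it in longhand, while the shorthand label never moves because it is simply the haplogroup letter plus the SNP name.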

However, as you get used to this new beast, you’re going to want to refer to “what something was” before.  You’ll find that articles, papers and who knows what else will refer to the haplogroup name – and you’ll need a conversion reference.

Here’s a link to that reference.  I don’t know about you, but I copied this and created a .pdf file in case this reference disappears – not that that ever happens in the electronic world.

Why the Confusion?

Within projects, men with the same surname now have different haplogroups assigned, and the SNP names look entirely different.  Before, if most of the surname group was R1b1a2, and one person had SNP tested at a deeper level and showed R1b1a2a1a1b4, it was easy to tell by looking that R1b1a2a1a1b4 fell underneath R1b1a2, and was a subclade.  Today, with the new tree, everyone that was R1b1a2 is now shown as R-M269 and the lone R1b1a2a1a1b4 person is shown as R-L21.  You can’t tell by looking if R-L21 is a subclade of R-M269 or the other way around.  Add another few SNP tests at different levels into the mix, and you have one confused administrator.
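For the programmers among us, the difference looks something like the sketch below.  With longhand names the relationship can be read off the name itself; with shorthand names you need the tree.  The `PARENT` table and both function names are my own illustration, and the table is deliberately simplified: the intermediate branches that sit between R-M269 and R-L21 on the real tree are left out.

```python
# Simplified, illustrative parent links. The real tree has intermediate
# branches between R-M269 and R-L21 that are omitted here.
PARENT = {"R-M269": None, "R-L21": "R-M269"}

def longhand_is_subclade(child, ancestor):
    """Old style: a subclade's name simply extends its ancestor's name.

    This is a naive prefix test; it could be fooled by two-digit steps
    (for example, a hypothetical "R1b10" would look like it sits under "R1b1").
    """
    return child != ancestor and child.startswith(ancestor)

def shorthand_is_subclade(child, ancestor, parent=PARENT):
    """New style: the name tells you nothing, so walk up the tree instead."""
    node = parent.get(child)
    while node is not None:
        if node == ancestor:
            return True
        node = parent.get(node)
    return False

print(longhand_is_subclade("R1b1a2a1a1b4", "R1b1a2"))  # readable from the names
print(shorthand_is_subclade("R-L21", "R-M269"))        # needs the lookup table
print(shorthand_is_subclade("R-M269", "R-L21"))        # the reverse does not hold
```

The trade-off is the one described above: shorthand names stay put when the tree changes, but you give up being able to see the relationship at a glance.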

One thing hasn’t changed.  Notice the haplogroup I-M253 individual in the purple group below.  There is a note that their parentage is uncertain.  Given the completely different haplogroup – this individual does not fit into any groups of Estes males biologically.  So completely different haplogroups are still exclusive, meaning you can tell at a glance that these folks do not share a common ancestor, even though their genealogy says that they should.

estes project cropped

Ok, got that now?  Good, because it gets more confusing.

Family Tree DNA did not do a one to one conversion, meaning they did not create a conversion table where R1b1a2=R-M269.  They did an entirely new prediction routine.  This makes sense, because they don’t hard code the haplogroup – it’s fluid and based on either a hard and fast SNP test or a prediction routine. This also allows for easy future improvements, and they utilize 37 markers for haplogroup predictions now instead of just 12, in most cases.

Unfortunately, or fortunately, the prediction routine produces different results for people within the same family group, based on STR marker results and how many STRs are tested.

What this means is that different people in the same family line will have different haplogroup predictions, as you can see in the groups above of individuals all descended from one male, Abraham Estes.

This isn’t wrong, as in incorrect, but it is confusing, especially when you’re used to seeing everyone who has not been SNP tested have a matching haplogroup within families.

Enter the Terminal SNP

The terminal SNP is your SNP that is furthest down the tree based on the SNPs that you have tested.  That second part is really important – based on the SNPs that you have tested.

When you’re looking at your matches, you can see their terminal SNP in the column below to the right, but what you can’t tell is if they have tested for any downstream SNPs and were found negative.

Estes match cropped

For example, if you are tested positive for R-M269 (formerly R1b1a2) and someone else that you match is R-L21, which is downstream of R-M269 – this does not exclude them as valid matches, UNLESS the first R-M269+ gentleman has actually tested for R-L21 and is negative.  You, of course, have no way of knowing this without asking the other participant.

Also, testing “negative” is a bit subjective, because there are known no-calls in the Geno 2.0 results.  If the Geno 2.0 result did not include the terminal haplogroup you expected, and the outcome is truly important to you, meaning family defining, check whether that defining SNP is absent in the Geno 2.0 raw data results.  If it is, have it tested individually through regular Sanger sequencing, meaning purchase it separately through Family Tree DNA.  A non-positive result in the Geno 2.0 results is typically interpreted to mean negative, but that is not always the case.  In most situations, if everything else matches, meaning surname, STRs and other SNPs, it’s not necessary to test the SNP separately – but it is available if you need to know, positively.
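The logic in this section can be sketched in a few lines of Python.  Everything here is an assumption made for illustration: the `Call` enum, the tiny `PARENT` table (again with intermediate branches omitted) and the function names are mine, not anything Family Tree DNA publishes.  The point is only that a terminal SNP depends on what was tested, and that a no-call is not the same thing as a negative.

```python
from enum import Enum

class Call(Enum):
    POSITIVE = "+"
    NEGATIVE = "-"
    NO_CALL = "?"   # the test produced no usable result for this SNP

# Simplified parent links; intermediate branches are omitted.
PARENT = {"M269": None, "L21": "M269"}

def depth(snp):
    """Number of steps from this SNP up to the top of the toy tree."""
    steps = 0
    while PARENT[snp] is not None:
        snp = PARENT[snp]
        steps += 1
    return steps

def terminal_snp(results):
    """Deepest SNP with a positive call, among the SNPs actually tested."""
    positives = [snp for snp, call in results.items() if call is Call.POSITIVE]
    return max(positives, key=depth) if positives else None

def excludes_match(results, other_terminal):
    """A match is ruled out only by a real negative for the other person's SNP."""
    return results.get(other_terminal) is Call.NEGATIVE

untested = {"M269": Call.POSITIVE}
tested_negative = {"M269": Call.POSITIVE, "L21": Call.NEGATIVE}
no_call = {"M269": Call.POSITIVE, "L21": Call.NO_CALL}

for label, results in [("untested", untested), ("negative", tested_negative), ("no-call", no_call)]:
    print(label, terminal_snp(results), excludes_match(results, "L21"))
```

All three testers show M269 as their terminal SNP, but only the one with a genuine negative for L21 can be excluded as a match for an R-L21 person.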

Secondly, the terminal SNP on the new Family Tree DNA haplotree and in your results, if you have taken the Big Y, the Walk Through the Y or purchased individual SNPs, may be different.  Why, and how would you know?

The why is because Family Tree DNA has synced to the Geno 2.0 tree at this point, and there have been many new SNPs discovered since the Geno 2.0 tree was developed in 2012.  The ISOGG tree is more current, but keep in mind that it is a provisional tree.  However, you still need to have a way to determine your terminal SNP beyond the Geno 2.0 criteria if you have had advanced testing.

There were originally some tools created by individuals to help with this dilemma, but both tools appear to no longer work.  Kitty Cooper blogged about this, and was apparently recently successful, but I was not.  I downloaded the updated version of the Big Y Chromosome extension that I wrote about and was using the Morley tree but that no longer functions either.  Let’s just say that the word frustrated doesn’t even begin to apply….

My suggestion is to work closely with your haplogroup and surname project administrator(s).  Many of the administrators have put together provisional charts and the haplogroup project pages are grouped by SNP groupings with suggestions for additional relevant testing.

The U106 project is a great example of proactive administrators.  Individual participants are clearly categorized and the categories suggest an appropriate “next step.”  Looking at their home page, the administrators make themselves readily available to project members for consulting about how to proceed.

u106 project

Yes, all of this change is a bit fuzzy right now, but give it a bit of time and the fog will clear.  It did in 2008 and we all survived.

Tree Updates

Family Tree DNA has committed to at least one more tree update this year, and let’s hope that it includes all of the SNPs in the reference data base they are using for the Big Y.

I’ll be talking about Big Y comparisons in a future article.

2014 Y Tree Released by Family Tree DNA

On April 25th, DNA Day and Arbor Day, Family Tree DNA updated and released their 2014 Y haplotree created in partnership with the Genographic project.  This has been a massive project, expanding the tree from about 850 SNPs to over 6200, of which about 1200 are “terminal,” meaning the end of a branch, and the rest being proven to be duplicates.

If you’re a newbie, this would be a good place perhaps to read about what a haplogroup is and the new Y naming convention which replaces the well-known group names like R1b1a2 with the SNP shorthand version of the same haplogroup name, R-M269.  From this time forward, the haplogroups will be known by their SNP names and the longhand version is obsolete, although you will always see it in older documents, articles and papers.  In fact, this entire tree has been made possible by SNP testing by both academic organizations and consumers.  To understand the difference between regular STR marker testing and SNP testing, click here.

I’ve divided this article into two parts.  The first part is the “what did they do and why” part and the second is the “what does it mean to you” portion.

This tree update has been widely anticipated for some time now.  We knew that Family Tree DNA was calibrating the tree in partnership with the Genographic project, but we didn’t know what else would be included until the tree was released.

What Did Family Tree DNA Do, and Why?

Janine Cloud, the liaison at Family Tree DNA for Project Administrators has provided some information as to the big picture.

“First, we’re committed to the next iteration of the tree and it will be more comprehensive, but we’re going to be really careful about the data we use from other sources. It HAS to be from raw data, not interpreted data. Second, I’ve italicized what I think is really the mission statement for all the work that’s been done on this tree and that will be done in the future.”

Janine interviewed Elliott Greenspan of Family Tree DNA about the new tree, and here are some of the salient points from that discussion.

“This year we’re committing to launching another tree. This tree will be more comprehensive, utilizing data from external sources: known Sanger data, as well as data such as Big Y, and if we have direct access to the raw data to make the proof (from large companies, such as the Chromo2) or a publication, or something of that nature. That is our intention that it be added into the data.

We’re definitely committed to update at least once per year. Our intention is to use data from other sources, as well as any SNPs we can, but it must be well-vetted. NGS and SNP technology inherently has errors. You must curate for those errors otherwise you’re just putting slop out to customers. There are some SNPs that may bind to the X chromosome that you didn’t know. There are some low coverages that you didn’t know.

With technology such as this you’re able to overcome the urge to test only what you’re likely to be positive for, and instead use the shotgun method and test everything. This allows us to make the discovery that SNPs are not nearly as stable as we thought, and they have a larger potential use in that sense.

Not only does the raw data need to be vetted but it needs to make sense.  Using Geno 2.0, I only accepted samples that had the highest call rate, not just because it was the best quality but because it was the most data. I don’t want to be looking at data where I’m missing potential information A, or I may become confused by potential information B.  That is something that will bog us down. When you’re looking at large data sets, I’d much rather throw out 20% of them because they’re going to take 90% of the time than to do my best to get 1 extra SNP on the tree or 1 extra branch modified, that is not worth all of our time and effort. What is, is figuring out what the broader scope of people are, because that is how you break down origins. Figuring one single branch for one group of three people is not truly interesting until it’s 50 people, because 50 people is a population. Three people may be a family unit.  You have to have enough people to determine relevance. That’s why using large datasets and using complete datasets are very, very important.

I want it to be the most accurate tree it can be, but I also want it to be interesting. That’s the key. Historical relevance is what we’re to discover. Anthropological relevance. It’s not just who has the largest tree, it’s who can make the most sense out of what you have is important.”

Thanks to both Janine and Elliott for providing this information.

What is Provided in the Update?

The genetic genealogy community was hopeful that the new 2014 tree would be comprehensive, meaning that it would include not only the Genographic SNPs, but ones from Walk the Y, perhaps some Chromo2, Full Genomes results and the Big Y.  Perhaps we were being overly optimistic, especially given the huge influx of new SNPs, the SNP tsunami as we call it, over the past few months.  Family Tree DNA clearly had to put a stake in the sand and draw the line someplace.  So, what is actually included, how did they select the SNPs for the new tree and how does this integrate with the Genographic information?  This information was provided by Family Tree DNA.

Family Tree DNA created the 2014 Y-DNA Haplotree in partnership with the National Geographic Genographic Project using the proprietary GenoChip. Launched publicly in late 2012, the chip tests approximately 10,000 Y-DNA SNPs that had not, at the time, been phylogenetically classified.

The team used the first 50,000 male samples with the highest quality results to determine SNP positions. Using only tests with the highest possible “call rate” meant more available data, since those samples had the highest percentage of SNPs that produced results, or “calls.”
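As a rough illustration of what “call rate” means, here is a small Python sketch.  The kit names, the sample data and both function names are hypothetical, and the real selection criteria used by the team are not described in any more detail than the paragraph above.

```python
def call_rate(calls):
    """Fraction of SNP positions that produced a result; None marks a no-call.

    Assumes the list of calls is not empty.
    """
    return sum(call is not None for call in calls) / len(calls)

def best_samples(samples, keep):
    """Return the kit names with the highest call rates, best first."""
    ranked = sorted(samples.items(), key=lambda item: call_rate(item[1]), reverse=True)
    return [kit for kit, _ in ranked[:keep]]

# Hypothetical kits, each with four SNP positions tested.
samples = {
    "kit1": ["+", "-", None, "+"],   # one no-call
    "kit2": ["+", "-", "-", "+"],    # every position called
    "kit3": [None, None, "+", "-"],  # two no-calls
}

for kit, calls in samples.items():
    print(kit, call_rate(calls))
print(best_samples(samples, keep=2))
```

A sample with more calls contributes more usable positions, which is why a high call rate means more data and not just cleaner data.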

In some cases, SNPs that were on the 2010 Y-DNA Haplotree didn’t work well on the GenoChip, so the team used Sanger sequencing on anonymous samples to test those SNPs and to confirm ambiguous locations.

For example, if it wasn’t clear if a clade was a brother (parallel) clade, or a downstream clade, they tested for it.

The scope of the project did not include going farther than SNPs currently on the GenoChip in order to base the tree on the most data available at the time, with the cutoff for inclusion being about November of 2013.

Where data were clearly missing or underrepresented, the team curated additional data from the chip where it was available in later samples. For example, there were very few Haplogroup M samples in the original dataset of 50,000, so to ensure coverage, the team went through eligible Geno 2.0 samples submitted after November, 2013, to pull additional Haplogroup M data. That additional research was not necessary on, for example, the robust Haplogroup R dataset, for which they had a significant number of samples.

Family Tree DNA, again in partnership with the Genographic Project, is committed to releasing at least one update to the tree this year. The next iteration will be more comprehensive, including data from external sources such as known Sanger data, Big Y testing, and publications. If the team gets direct access to raw data from other large companies’ tests, then that information will be included as well. We are also committed to at least one update per year in the future.

Known SNPs will not intentionally be renamed. Their original names will be used since they represent the original discoverers of the SNP. If there are two names, one will be chosen to be displayed and the additional name will be available in the additional data, but the team is taking care not to make synonymous SNPs seem as if they are two separate SNPs. Some examples of that may exist initially, but as more SNPs are vetted, and as the team learns more, those examples will be removed.

In addition, positions or markers within STRs, as they are discovered, or large insertion/deletion events inside homopolymers, potentially may also be curated from additional data because the event cannot accurately be proven. A homopolymer is a sequence of identical bases, such as AAAAAAAAA or TTTTTTTTT. In such cases it’s impossible to tell which of the bases the insertion is, or if/where one was deleted. With technology such as Next Generation Sequencing, trying to get SNPs in regions such as STRs or homopolymers doesn’t make sense because we’re discovering non-ambiguous SNPs that define the same branches, so we can use the non-ambiguous SNPs instead.
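The homopolymer problem is easy to demonstrate.  In the Python sketch below, the flanking bases `GC` and `TG` are made up; the run of nine A bases is the example from the paragraph above.  Deleting any one of the nine A bases gives exactly the same sequence, so the data cannot say which base was lost.

```python
def delete_at(seq, i):
    """Return the sequence with the base at position i removed."""
    return seq[:i] + seq[i + 1:]

# Made-up flanking bases around a run of nine identical bases.
flank_left, run, flank_right = "GC", "AAAAAAAAA", "TG"
reference = flank_left + run + flank_right

start = len(flank_left)
# Delete each base of the run in turn and collect the distinct results.
outcomes = {delete_at(reference, i) for i in range(start, start + len(run))}

print(len(run), "possible deletion positions")
print(len(outcomes), "distinct resulting sequence(s):", outcomes)
```

Nine different events, one observable outcome, which is why such positions are hard to prove.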

Some SNPs from the 2010 tree have been intentionally removed. In some cases, those were SNPs for which the team never saw a positive result, so while it may be a legitimate SNP, even haplogroup defining, it was outside of the current scope of the tree. In other cases, the SNP was found in so many locations that it could cause the orientation of the tree to be drawn in more than one way. If the SNP could legitimately be positioned in more than one haplogroup, the team deemed that SNP to not be haplogroup defining, but rather a high polymorphic location.

To that end, SNPs no longer have .1, .2, or .3 designations. For example, J-L147.1 is simply J-L147, and I-L147.2 is simply I-L147.  Those SNPs are positioned in the same place, but back-end programming will assign the appropriate haplogroup using other available information such as additional SNPs tested or haplogroup origins listed. If other SNPs have been tested and can unambiguously prove the location of the multi-locus SNP for the sample, then that data is used. If not, matching haplogroup origin information is used.
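The fallback order described here (other tested SNPs first, haplogroup origins second) might be sketched as follows.  This is only a guess at the shape of the logic, not Family Tree DNA’s actual back-end code; the `MULTI_LOCUS` table, the `origin_hint` parameter and the function name are hypothetical.

```python
# Haplogroups in which a multi-locus SNP has been placed (illustrative table).
MULTI_LOCUS = {"L147": {"J", "I"}}

def place_multilocus(snp, confirmed_haplogroups, origin_hint=None):
    """Pick the haplogroup for a SNP that occurs in more than one place.

    confirmed_haplogroups: shorthand names already proven by other SNP
    tests for this sample, such as "I-M253".
    origin_hint: haplogroup letter suggested by matching haplogroup
    origins data, used only when the SNP tests cannot decide.
    """
    candidates = MULTI_LOCUS[snp]
    # Top-level letter of each confirmed haplogroup, e.g. "I-M253" -> "I".
    proven = {name.split("-")[0] for name in confirmed_haplogroups} & candidates
    if len(proven) == 1:
        return f"{proven.pop()}-{snp}"
    if origin_hint in candidates:
        return f"{origin_hint}-{snp}"
    return None  # not enough information to place the SNP

print(place_multilocus("L147", ["I-M253"]))          # decided by another tested SNP
print(place_multilocus("L147", [], origin_hint="J"))  # decided by origins data
print(place_multilocus("L147", []))                   # cannot be placed
```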

We will also move to shorthand haplogroup designations exclusively. Since we’re committing to at least one iteration of the tree per year, using longhand that could change with each update would be too confusing.  For example, Haplogroup O used to have three branches: O1, O2, and O3. A SNP was discovered that combined O1 and O2, so they became O1a and O1b.

There are over 1200 branches on the 2014 Y Haplogroup tree, as compared to about 400 on the 2010 tree. Those branches contain over 6200 SNPs, so we’ve chosen to display select SNPs as “active” with an adjacent “More” button to show the synonymous SNPs if you choose.

In addition to the Family Tree DNA updates, any sample tested with the Genographic Project’s Geno 2.0 DNA Ancestry Kit, then transferred to FTDNA will automatically be re-synched on the Geno side. The Genographic Project is currently integrating the new data into their system and will announce on their website when the process is complete in the coming weeks.  At that time, all Geno 2.0 participants’ results will be updated accordingly and will be accessible via the Genographic Project website.

In summary:

  • Created in partnership with National Geographic’s Genographic Project
  • Used GenoChip containing ~10,000 previously unclassified Y-SNPs
  • Some of those SNPs came from Walk Through the Y and the 1000 Genome Project
  • Used first 50,000 high-quality male Geno 2.0 samples
  • Verified positions from 2010 YCC by Sanger sequencing additional anonymous samples
  • Filled in data on rare haplogroups using later Geno 2.0 samples

Statistics

  • Expanded from approximately 400 to over 1200 terminal branches
  • Increased from around 850 SNPs to over 6200 SNPs
  • Cut-off date for inclusion for most haplogroups was November 2013

Total number of SNPs broken down by haplogroup

A        406   DE        16   IJ        29   LT        12   P         81
B         69   E       1028   IJK        2   M         17   Q        198
BT         8   F         90   J        707   N        168   R        724
C        371   G        401   K         11   NO        16   S          5
CT        64   H         18   K(xLT)     1   O        936   T        148
D        208   I        455   L        129

myFTDNA Interface

  • Existing customers receive free update to predictions and confirmed branches based on existing SNP test results.
  • Haplogroup badge updated if new terminal branch is available
  • Updated haplotree design displays new SNPs and branches for your haplogroup
  • Branch names now listed in shorthand using terminal SNPs
  • For SNPs with more than one name, in most cases the original name for SNP was used, with synonymous SNPs listed when you click “More…”
  • No longer using SNP names with .1, .2, .3 suffixes. Back-end programming will place SNP in correct haplogroup using available data.
  • SNPs recommended for additional testing are pre-populated in the cart for your convenience. Just click to remove those you don’t want to test.
  • SNPs recommended for additional testing are based on 37-marker haplogroup origins data where possible, 25- or 12-marker data where 37 markers weren’t available.
  • Once you’ve tested additional SNPs, that information will be used to automatically recommend additional SNPs for you if they’re available.
  • If you remove those prepopulated SNPs from the cart, but want to re-add them, just refresh your page or close the page and return.
  • Only one SNP per branch can be ordered at one time – synonymous SNPs can possibly be ordered from the Advanced Orders section on the Upgrade Order page.
  • Tests taken have moved to the bottom of the haplogroup page.

Coming attractions

  • Group Administrator Pages will have longhand removed.
  • At least one update to the tree to be released this year.
  • Update will include: data from Big Y, relevant publications, other companies’ tests from raw data.
  • We’ll set up a system for those who have tested with other big data companies to contribute their raw data file to future versions of the tree.
  • We’re committed to releasing at least one update per year.
  • The Genographic Project is currently integrating the new data into their system and will announce on their website when the process is complete in the coming weeks. At that time, all Geno 2.0 participants’ results will be updated accordingly and accessible via the Genographic Project website.

What Does This Mean to You?

Your Badge

On your welcome page, your badges are listed.  Your badge previously would have included the longhand form of the haplogroup, such as R1b1a2, but now it shows R-M269.

2014 y 1

Please note that badges are not yet showing on all participants’ pages.  If yours aren’t yet showing, click on the Haplotree and SNP page under the YDNA option on the blue options bar, where your more detailed information is shown, below.

Your Haplogroup Name

Your haplogroup is now noted only as the SNP designation, R-M269, not the older longhand names.

2014 y 2 v2

Haplogroup R is a huge haplogroup, so you’ll need to scroll down to see your confirmed or predicted haplogroup, shown in green below.

2014 y 3

Redesigned Page

The redesigned haplotree page includes an option to order SNPs downstream of your confirmed or predicted haplogroup.  This refines your haplogroup and helps isolate your branch on the tree.  You may or may not want to do this.  In some cases, this does help your genealogy, especially in cases where you’re dealing with haplogroup R.  For the most part, haplogroups are more historical in nature.  For example, they will help you determine whether your ancestors are Native American, African, Anglo Saxon or maybe Viking.  Haplogroups help us reach back before the advent of surnames.

The new page shows which SNPs are available for you to order from the SNPs on the tree today, shown above, in blue to the right of the SNP branch.

SNPs not on the Tree

Not all known SNPs are on the tree.  Like I said, a line in the sand had to be drawn.  There are SNPs, many recently discovered, that are not on the tree.

To put this in perspective, the new tree incorporates 6200 SNPs (up from 850), but the Big Y “pool” of known SNPs against which Family Tree DNA is comparing those results was 36,562 when the first results were initially released at the end of February.

If you have taken advanced SNP testing, such as the Walk the Y, the Big Y, or tested individual SNPs, your terminal SNP may not be on the tree, which means that your terminal SNP shown on your page, such as R-M269 above, MAY NOT BE ACCURATE in light of that testing.  Why?  Because these newly discovered SNPs are not yet on the tree. This only affects people who have done advanced testing which means it does not affect most people.

Ordering SNPs

You can order relevant SNPs for your haplogroup on the tree by clicking on the “Add” button beside the SNP.

You can order SNPs not on the tree by clicking on the “Advanced Order Form” link available at the bottom of the haplotree page.

2014 y 4

If you’re not sure of what you want to do, or why, you might want to touch base with your project administrators.  Depending on your testing goal, it might be much more advantageous, both scientifically and financially, for you to take either the Geno2 test or the Big Y.

At this point, in light of some of the issues with the new release, I would suggest maybe holding tight for a bit in terms of ordering new SNPs unless you’re positive that your haplogroup is correct and that the SNP selection you want to order would actually be beneficial to you.

Words of Caution

There are some bugs in this massive update.  You might want to check your haplogroup assignment to be sure it is reflected accurately based on any SNP testing you have had done, of course, excepting the very advanced tests mentioned above.

If you discover something that is inaccurate or questionable, please notify Family Tree DNA.  This is especially relevant for project administrators who are familiar with family groups and know that people who are in the same surname group should share a common base haplogroup, although some people who have taken further SNP testing will be shown with a downstream haplogroup, further down that particular branch of the tree.

What kind of result might you find suspicious or questionable?  For example, if in your surname project, your matching surname cousins are all listed at R-M269 and you were too previously, but now you’re suddenly in a different haplogroup, like E, there is clearly an error.
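For administrators who keep their project data in a spreadsheet or script, a quick sanity check along these lines is easy to automate.  The sketch below is my own illustration with hypothetical kit names; it flags anyone whose top-level haplogroup letter differs from the majority of their family group.

```python
from collections import Counter

def top_level(haplogroup):
    """Top-level haplogroup letter of a shorthand name, e.g. "R-M269" -> "R"."""
    return haplogroup.split("-")[0]

def flag_outliers(family_group):
    """Return kits whose top-level haplogroup differs from the group's majority.

    A flagged kit is only a candidate for review: it may be an error in
    the update, or it may be a genuinely different paternal line.
    """
    counts = Counter(top_level(h) for h in family_group.values())
    majority, _ = counts.most_common(1)[0]
    return sorted(kit for kit, h in family_group.items() if top_level(h) != majority)

# Hypothetical family group after the tree update.
family_group = {
    "kit1": "R-M269",
    "kit2": "R-L21",    # deeper SNP testing, same top-level haplogroup
    "kit3": "R-M269",
    "kit4": "E-M2",     # suddenly in a different haplogroup
}

print(flag_outliers(family_group))
```

Note that the deeper-tested R-L21 kit is not flagged, since it is still in haplogroup R; only the kit that jumped to a completely different haplogroup stands out.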

Any suspected or confirmed errors should be reported to Family Tree DNA.

They have made it very easy by providing a “Feedback” button on the top of the page and there is a “Y tree” option in the dropdown box.

2014 y 5

For administrators providing reports that involve more than one participant, please send to Groups@familytreedna.com and include the kit numbers, the participants’ names and the nature of the issue.

Additional Information

Family Tree DNA provides a free webinar that can be viewed about the 2014 Y Tree release.  You can see all of the webinars that are archived and available for viewing at:  https://www.familytreedna.com/learn/ftdna/webinars/

What’s Next?

The Genographic Project is in the process of updating to the same tree so their results can be synchronized with the 2014 tree.  A date for this has not yet been released.

Family Tree DNA has committed to at least one more update this year.

I know that this update was massive and required extensive reprogramming that affected almost every aspect of their webpage.  If you think about it, nearly every page had to be updated from the main page to the order page.  The tree is the backbone of everything.  I want to thank the Family Tree DNA and Genographic combined team for their efforts and Bennett Greenspan for making sure this did happen, just as he committed to do in November at the last conference.

Like everyone else, I want everything NOW, not tomorrow.  We’re all passionate about this hobby – although I think it is more of a life mission for many – and surpassed hobby status long ago.

I know there are issues with the tree and they frustrate me, like everyone else.  Those issues will be resolved.  Family Tree DNA is actively working on reported issues and many have already been fixed.

There is some amount of disappointment in the genetic genealogy community about the SNPs not included on the tree, especially the SNPs recently discovered in advanced tests like the Big Y.  Other trees, like the ISOGG tree, do in fact reflect many of these newly discovered SNPs.

There are a couple of major differences.  First, ISOGG has a virtual army of volunteers who are focused on maintaining this tree.  We are all very lucky that they do, and that Alice Fairhurst coordinates this effort and has done so now for many years.  I would be lost without the ISOGG tree.

However, when a change is made to the ISOGG tree, and there have been thousands of changes, adds and moves over the years, nothing else is affected.  No one’s personal page, no one’s personal tree, no projects, no maps, no matches and no order pages.  ISOGG has no “responsibility” to anyone – in other words – it’s widely known and accepted that they are a volunteer organization without clients.

Family Tree DNA, on the other hand has half a million (or so) paying customers.  Tree changes have a huge domino ripple effect there – not only on their customers’ personal pages, but to their entire website, projects, support and orders.  A change at Family Tree DNA is much more significant than on the ISOGG page – not to mention – they don’t have the same army of volunteers and they have to rely on the raw science, not interpretation, as they said in the information they provided.  A tree update at Family Tree DNA is a very different animal than updating a stand-alone tree, especially considering their collaboration with various scientific organizations, including the National Geographic Society.

I commend Family Tree DNA for this update and thank them for the update and the educational materials.  I’m also glad to see that they do indeed rely only on science, not interpretation.  Frustrating to the genetic genealogist in me?  Sure.  But in the long run, it’s worth it to be sure the results are accurate.

Could this release have been smoother and more accurate?  Certainly.  Hopefully this is the big speed bump and future releases will be much more graceful.  It’s easy to see why there aren’t any other companies providing this type of comprehensive testing.  It’s gone from an easy 12 marker “do we match” scenario to the forefront of pioneering population genetics.  And all within a decade.  It’s amazing that any company can keep up.

 

2013’s Dynamic Dozen – Top Genetic Genealogy Happenings

dna 8 ball

Last year I wrote a column at the end of the year titled  “2012 Top 10 Genetic Genealogy Happenings.”  It’s amazing the changes in this industry in just one year.  It certainly makes me wonder what the landscape a year from now will look like.

I’ve done the same thing this year, except we have a dozen.  I couldn’t whittle it down to 10, partly because there has been so much more going on and so much change – or, in the case of Ancestry, so little positive movement that it is noteworthy in its own right.

If I were to characterize this year of genetic genealogy, I would call it The Year of the SNP, because that applies to both Y DNA and autosomal.  Maybe I’d call it The Legal SNP, because it is also the year of law, court decisions, lawsuits and FDA intervention.  To say it has been interesting is like calling the Eiffel Tower an oversized coat hanger.

I’ll say one thing…it has kept those of us who work and play in this industry hopping busy!  I guarantee you, the words “I’m bored” have come out of the mouth of no one in this industry this past year.

I’ve put these events in what I consider to be relatively accurate order.  We could debate all day about whether the SNP Tsunami or the 23andMe mess is more important or relevant – and there would be lots of arguing points and counterpoints…see…I told you lawyers were involved….but in reality, we don’t know yet, and in the end….it doesn’t matter what order they are in on the list:)

Y Chromosome SNP Tsunami Begins

The SNP tsunami began as a ripple a few years ago with the introduction at Family Tree DNA of the Walk the Y program in 2007.  This was an intensively manual process of SNP discovery, but it was effective.

By the time the Geno 2.0 chip was introduced in 2012, it included 12,000+ SNPs, among them many that had always been presumed to be equivalent and were not regularly tested.  However, the Nat Geo chip tested them and indeed, the Y tree became massively shuffled.  The resolution to this tree shuffling hasn’t yet come out in the wash.  Family Tree DNA can’t really update their Y tree until a publication comes out with the new tree defined.  That publication has been discussed and anticipated for some time now, but it has yet to materialize.  In the meantime, the volunteers who maintain the ISOGG tree are swamped, to say the least.

Another similar test is the Chromo2, introduced this year by BritainsDNA, which scans 15,000 SNPs, many of them S SNPs not on the tree nor academically published, adding to the difficulty of figuring out where they fit on the Y tree.  While there are some very happy campers with their Chromo2 results, there is also a great deal of sloppy science, reporting and interpretation of “facts” through this company.  Kind of like Jekyll and Hyde.  See the Sloppy Science section.

But Walk the Y, Chromo2 and Geno 2.0 are only the tip of the iceberg.  The new “full Y” sequencing tests brought into the marketspace quietly in early 2013 by Full Genomes and then with a bang by Family Tree DNA with their Big Y in November promise to revolutionize what we know about the Y chromosome by discovering thousands of previously unknown SNPs.  This will in effect swamp the Y tree, whose branches we thought were already pretty robust, with thousands and thousands of leaves.

In essence, the promise of the “fully” sequenced Y is that what we might term personal or family SNPs will make SNP testing as useful as STR testing and give us yet another genealogy tool with which to separate various lines of one genetic family and to ratchet down on the time that the most recent common ancestor lived.

http://dna-explained.com/2013/03/31/new-y-dna-haplogroup-naming-convention/

http://dna-explained.com/2013/11/10/family-tree-dna-announces-the-big-y/

http://dna-explained.com/2013/11/16/what-about-the-big-y/

http://www.yourgeneticgenealogist.com/2013/11/first-look-at-full-genomes-y-sequencing.html

http://cruwys.blogspot.com/2013/12/a-first-look-at-britainsdna-chromo-2-y.html

http://cruwys.blogspot.com/2013/11/yseqnet-new-company-offering-single-snp.html

http://cruwys.blogspot.com/2013/11/the-y-chromosome-sequence.html

http://cruwys.blogspot.com/2013/11/a-confusion-of-snps.html

http://cruwys.blogspot.com/2013/11/a-simplified-y-tree-and-common-standard.html

23andMe Comes Unraveled

The story of 23andMe began as the consummate American dotcom fairy tale, but sadly, has deteriorated into a saga with all of the components of a soap opera.  A wealthy wife starts what could be viewed as an upscale hobby business, followed by a messy divorce and a mystery run-in with the powerful overlording evil-step-mother FDA.  One of the founders of 23andMe is/was married to a founder of Google, so funding, at least initially, wasn’t an issue, giving 23andMe the opportunity to make an unprecedented contribution in the genetic, health care and genetic genealogy world.

Another way of looking at this is that 23andMe is the epitome of the American Dream business, a startup, with altruism and good health, both thrown in for good measure, well intentioned, but poorly managed.  And as customers, be it for health or genealogy or both, we all bought into the altruistic “feel good” culture of helping find cures for dread diseases, like Parkinson’s, Alzheimer’s and cancer by contributing our DNA and responding to surveys.

The genetic genealogy community’s love affair with 23andMe began in 2009 when 23andMe started focusing on genealogy reporting for their tests, meaning cousin matches.  We, as a community, suddenly woke up and started ordering these tests in droves.  A few months later, Family Tree DNA began offering this type of testing as well.  The defining difference is that 23andMe’s primary focus has always been on health and medical information, while Family Tree DNA is focused on genetic genealogy.  To 23andMe, the genetic genealogy community was an afterthought and genetic genealogy was just another marketing avenue to obtain more people for their health research data base.  For us, that wasn’t necessarily a bad thing.

For a while, this love affair went along swimmingly, but then, in 2012, 23andMe obtained a patent related to Parkinson’s Disease.  That act caused a lot of people to begin to question the corporate focus of 23andMe in the larger quagmire of the ethics of patenting genes as a whole.  Judy Russell, the Legal Genealogist, discussed this here.  It’s difficult to defend 23andMe’s Parkinson’s patent while flaying alive Myriad for their BRCA patent.  Was 23andMe really as altruistic as they would have us believe?

Personally, this event made me very nervous, but I withheld judgment.  Clearly, though, that was not the purpose for which I thought my DNA, and others’, was being used.

But then came the Designer Baby patent in 2013.  This made me decidedly uncomfortable.  Yes, I know, some people said this really can’t be done, today, while others said that it’s being done anyway in some aspects…but the fact that this has been the corporate focus of 23andMe with their research, using our data, bothered me a great deal.  I have absolutely no issue with using this information to assure or select for healthy offspring – but I have a personal issue with technology to enable parents who would select a “beauty child,” one with blonde hair and blue eyes and who has the correct muscles to be a star athlete, or cheerleader, or whatever their vision of their as-yet-unconceived “perfect” child would be.  And clearly, based on 23andMe’s own patent submission, that is the focus of their patent.

Upon the issuance of the patent, 23andMe then said they have no intention of using it.  They did not say they won’t sell it.  This also makes absolutely no business sense, to focus valuable corporate resources on something you have no intention of using?  So either they weren’t being truthful, they lack effective management or they’ve changed their mind, but didn’t state such.

What came next, in late 2013 certainly points towards a lack of responsible management.

23andMe had been working with the FDA for several years for approval of the health and medical aspect of their product (which they were already providing to consumers prior to the November 22nd cease and desist order).  The FDA wants assurances that what 23andMe is telling consumers is accurate.  Based on the letter issued to 23andMe on November 22nd, and subsequent commentary, it appears that both entities were jointly working towards that common goal…until earlier this year when 23andMe mysteriously “somehow forgot” about the FDA, the information they owed them, their submissions, etc.  They apparently forgot their phone number and their e-mail addresses as well, because the FDA said they had heard nothing from them in 6 months, which backdates to May of 2013.

It may be relevant that 23andMe added the executive position of President and filled it in June of 2013, and there was a lot of corporate housecleaning that went on at that time.  However, regardless of who got housecleaned, the responsibility for working with the FDA falls squarely on the shoulders of the founders, owners and executives of the company.  Period.  No excuses.  Something that critically important should be on the agenda of every executive management meeting.   Why?  In terms of corporate risk, this was obviously a very high risk item, perhaps the highest risk item, because the FDA can literally shut their doors and destroy them.  There is little they can do to control or affect the FDA situation, except to work with the FDA, meet deadlines and engender goodwill and a spirit of cooperation.  The risk of not doing that is exactly what happened.

It’s unknown at this time if 23andMe is really that corporately arrogant to think they could simply ignore the FDA, or blatantly corporately negligent or maybe simply corporately stupid, but they surely betrayed the trust and confidence of their customers by failing to meet their commitments with and to the FDA, or even communicate with them.  I mean, really, what were they thinking?

There has been an outpouring of sympathy for 23andMe and negative backlash towards the FDA for their letter forcing 23andMe to stop selling their offending medical product, meaning the health portion of their testing.  However, in reality, the FDA was only meting out the consequences that 23andMe asked for.  My teenage kids knew this would happen.  If you do what you’re not supposed to….X, Y and Z will, or won’t, happen.  It’s called accountability.  Just ask my son about his prom….he remembers vividly.  Now why my kids, or 23andMe, would push an authority figure to that point, knowing full well the consequences, utterly mystifies me.  It did when my son was a teenager and it does with 23andMe as well.

Some people think that the FDA is trying to stand between consumers and their health information.  I don’t think so, at least not in this case.  I think that because the FDA left the raw data files alone and they left the genetic genealogy aspect alone.  The FDA knows full well you can download your raw data and for $5 process it at a third party site such as Promethease, obtaining health related genetic information.  The difference is that Promethease is not interpreting any data for you, only providing information.

There is some good news in this and that is that from a genetic genealogy perspective, we seem to be safe, at least for now, from government interference with the testing that has been so productive for genetic genealogy.  The FDA had the perfect opportunity to squish us like a bug (thanks to the opening provided by 23andMe,) and they didn’t.

The really frustrating aspect of this is that 23andMe was a company who, with their deep pockets in Silicon Valley and other investors, could actually afford to wage a fight with the FDA, if need be.  The other companies who received the original 2010 FDA letter all went elsewhere and focused on something else.  But 23andMe didn’t, they decided to fight the fight, and we all supported their decision.  But they let us all down.  The fight they are fighting now is not the battle we anticipated, but one brought upon themselves by their own negligence.  This battle didn’t have to happen, and it may impair them financially to such a degree that if they need to fight the big fight, they won’t be able to.

Right now, 23andMe is selling their kits, but only as an ancestry product as they work through whatever process they are working through with the FDA.  Unfortunately, 23andMe is currently having some difficulties where the majority of matches are disappearing from some testers’ records.  In other cases, segments that previously matched are disappearing.  One would think, with their only revenue stream for now being the genetic genealogy marketspace, that they would be wearing kid gloves and being extremely careful, but apparently not.  They might even consider making some of the changes and enhancements we’ve requested for so long that have fallen on deaf ears.

One thing is for sure, it will be extremely interesting to see where 23andMe is this time next year.  The soap opera continues.

I hope for the sake of all of the health consumers, both current and (potentially) future, that this dotcom fairy tale has a happy ending.

Also, see the Autosomal DNA Comes of Age section.

http://dna-explained.com/2013/10/05/23andme-patents-technology-for-designer-babies/

http://www.thegeneticgenealogist.com/2013/10/07/a-new-patent-for-23andme-creates-controversy/

http://dna-explained.com/2013/11/13/genomics-law-review-discusses-designing-children/

http://www.thegeneticgenealogist.com/2013/06/11/andy-page-fills-new-president-position-at-23andme/

http://dna-explained.com/2013/11/25/fda-orders-23andme-to-discontinue-testing/

http://dna-explained.com/2013/11/26/now-what-23andme-and-the-fda/

http://dna-explained.com/2013/12/06/23andme-suspends-health-related-genetic-tests/

http://www.legalgenealogist.com/blog/2013/11/26/fooling-with-fda/

Supreme Court Decision – Genes Can’t Be Patented – Followed by Lawsuits

In a landmark decision, the Supreme Court determined that genes cannot be patented.  Myriad Genetics held patents on two BRCA genes that predisposed people to cancer.  The cost for the tests through Myriad was about $3000.  Six hours after the Supreme Court decision, Gene By Gene announced that same test for $995.  Other firms followed suit, and all were subsequently sued by Myriad for patent infringement.  I was shocked by this, but as one of my lawyer friends clearly pointed out, you can sue anyone for anything.  Making it stick is yet another matter.  Many firms settle to avoid long and very expensive legal battles.  Clearly, this issue is not yet resolved, although one would think a Supreme Court decision would be pretty definitive.  It potentially won’t be settled for a long time.

http://dna-explained.com/2013/06/13/supreme-court-decision-genes-cant-be-patented/

http://www.legalgenealogist.com/blog/2013/06/14/our-dna-cant-be-patented/

http://dna-explained.com/2013/09/07/message-from-bennett-greenspan-free-my-genes/

http://www.thegeneticgenealogist.com/2013/06/13/new-press-release-from-dnatraits-regarding-the-supreme-courts-holding-in-myriad/

http://www.legalgenealogist.com/blog/2013/08/18/testing-firms-land-counterpunch/

http://www.legalgenealogist.com/blog/2013/07/11/myriad-sues-genetic-testing-firms/

Gene By Gene Steps Up, Ramps Up and Produces

As 23andMe comes unraveled and Ancestry languishes in its mediocrity, Gene by Gene, the parent company of Family Tree DNA has stepped up to the plate, committed to do “whatever it takes,” ramped up the staff both through hiring and acquisitions, and is producing results.  This is, indeed, a breath of fresh air for genetic genealogists, as well as a welcome relief.

http://dna-explained.com/2013/08/07/gene-by-gene-acquires-arpeggi/

http://dna-explained.com/2013/12/05/family-tree-dna-listens-and-acts/

http://dna-explained.com/2013/12/10/family-tree-dnas-family-finder-match-matrix-released/

http://www.haplogroup.org/ftdna-family-finder-matches-get-new-look/

http://www.haplogroup.org/ftdna-family-finder-new-look-2/

http://www.haplogroup.org/ftdna-family-finder-matches-new-look-3/

Autosomal DNA Comes of Age

Autosomal DNA testing and analysis has simply exploded this past year.  More and more people are testing, in part, because Ancestry.com has a captive audience in their subscription data base and more than a quarter million of those subscribers have purchased autosomal DNA tests.  That’s a good thing, in general, but there are some negative aspects relative to Ancestry, which are in the Ancestry section.

Another boon to autosomal testing was the 23andMe push to obtain a million records.  Of course, the operative word here is “was” but that may revive when the FDA issue is resolved.  One of the down sides to the 23andMe data base, aside from the fact that it’s not genealogist friendly, is that so many people, about 90%, don’t communicate.  They aren’t interested in genealogy.

A third factor is that Family Tree DNA has provided transfer ability for files from both 23andMe and Ancestry into their data base.

Fourth is the site, GedMatch, at www.gedmatch.com which provides additional matching and admixture tools and the ability to match below thresholds set by the testing companies.  This is sometimes critically important, especially when comparing to known cousins who just don’t happen to match at the higher thresholds, for example.  Unfortunately, not enough people know about GedMatch, or are willing to download their files.  Also unfortunate is that GedMatch has struggled for the past few months to keep up with the demand placed on their site and resources.

A great deal of time this year has been spent by those of us in the education aspect of genetic genealogy, in whatever our capacity, teaching about how to utilize autosomal results. It’s not necessarily straightforward.  For example, I wrote a 9 part series titled “The Autosomal Me” which detailed how to utilize chromosome mapping for finding minority ethnic admixture, which was, in my case, both Native and African American.

As the year ends, we have Family Tree DNA, 23andMe and Ancestry who offer the autosomal test which includes the relative-matching aspect.  Fortunately, we also have third party tools like www.GedMatch.com and www.DNAGedcom.com, without which we would be significantly hamstrung.  In the case of DNAGedcom, we would be unable to perform chromosome segment matching and triangulation with 23andMe data without Rob Warthen’s invaluable tool.

http://dna-explained.com/2013/06/21/triangulation-for-autosomal-dna/

http://dna-explained.com/2013/07/13/combining-tools-autosomal-plus-y-dna-mtdna-and-the-x-chromosome/

http://dna-explained.com/2013/07/26/family-tree-dna-levels-the-playing-field-sort-of/

http://dna-explained.com/2013/08/03/kitty-coopers-chromsome-mapping-tool-released/

http://dna-explained.com/2013/09/29/why-dont-i-match-my-cousin/

http://dna-explained.com/2013/10/03/family-tree-dna-updates-family-finder-and-adds-triangulation/

http://dna-explained.com/2013/10/21/why-are-my-predicted-cousin-relationships-wrong/

http://dna-explained.com/2013/12/05/family-tree-dna-listens-and-acts/

http://dna-explained.com/2013/12/09/chromosome-mapping-aka-ancestor-mapping/

http://dna-explained.com/2013/12/10/family-tree-dnas-family-finder-match-matrix-released/

http://dna-explained.com/2013/12/15/one-chromosome-two-sides-no-zipper-icw-and-the-matrix/

http://dna-explained.com/2013/06/02/the-autosomal-me-summary-and-pdf-file/

DNAGedcom – Indispensable Third Party Tool

While this tool, www.dnagedcom.com, falls into the Autosomal grouping, I have separated it out for individual mention because without this tool, the progress made this year in autosomal DNA ancestor and chromosomal mapping would have been impossible.  Family Tree DNA has always provided segment matching boundaries through their chromosome browser tool, but until recently, you could only download 5 matches at a time.  This is no longer the case, but for most of the year, Rob’s tool saved us massive amounts of time.

23andMe does not provide those chromosome boundaries, but utilizing Rob’s tool, you can obtain each of your matches in one download, and then you can obtain the list of which of your matches also match each other by requesting each of those files separately.  Multiple steps?  Yes, but it’s the only way to obtain this information, and chromosome mapping without the segment data is impossible.
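To make the need for segment data concrete, here is a minimal sketch in Python of the kind of comparison that chromosome mapping depends on.  It is not DNAGedcom’s code or file format; the Segment fields, the sample names and the in_common set are all hypothetical, made up for illustration.  It reports two matches only when their segments overlap on the same chromosome and they are also known to match each other, since an overlap by itself could come from opposite sides of your tree.

```python
from collections import namedtuple

# Hypothetical record: one shared segment between you and one match.
Segment = namedtuple("Segment", "match chromosome start end")


def triangulation_candidates(segments, in_common, min_overlap=1):
    """Return (match_a, match_b, chromosome, start, end) for overlapping
    segments whose owners are also known to match each other."""
    results = []
    # Sort so that segments on the same chromosome sit together by start.
    ordered = sorted(segments, key=lambda s: (s.chromosome, s.start))
    for i, a in enumerate(ordered):
        for b in ordered[i + 1:]:
            # Later segments start even further right, so stop looking.
            if b.chromosome != a.chromosome or b.start >= a.end:
                break
            start, end = b.start, min(a.end, b.end)
            if a.match == b.match or end - start < min_overlap:
                continue
            # Overlap alone is not enough; the two matches must match each other.
            if frozenset((a.match, b.match)) in in_common:
                results.append((a.match, b.match, a.chromosome, start, end))
    return results


# Made-up example data (positions are base pair coordinates).
segments = [
    Segment("Ann", "7", 10_000_000, 25_000_000),
    Segment("Bob", "7", 20_000_000, 40_000_000),
    Segment("Cal", "7", 30_000_000, 35_000_000),
    Segment("Ann", "12", 5_000_000, 9_000_000),
]
in_common = {frozenset(("Ann", "Bob"))}

candidates = triangulation_candidates(segments, in_common)
```

In this made-up data, Bob and Cal overlap on chromosome 7 but are not known to match each other, so only the Ann and Bob overlap is reported.  Without start and end positions for each segment, none of this comparison can be done at all.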

A special hats off to Rob.  Please remember that Rob’s site is free, meaning it’s donation based.  So, please donate if you use the tool.

http://www.yourgeneticgenealogist.com/2013/01/brought-to-you-by-adoptiondna.html

I covered www.Gedmatch.com in the “Best of 2012” list, but they have struggled this year, beginning when Ancestry announced that raw data file downloads were available.  GedMatch consists of two individuals, volunteers, who are still struggling to keep up with the required processing and the tools.  They too are donation based, so don’t forget about them if you utilize their tools.

Ancestry – How Great Thou Aren’t

Ancestry is only on this list because of what they haven’t done.  When they initially introduced their autosomal product, they didn’t have any search capability, they didn’t have a chromosome browser and they didn’t have raw data file download capability, all of which their competitors had upon first release.  All they did have was a list of your matches, with their trees listed, with shakey leaves if you shared a common ancestor on your tree.  The implication was, and is, of course, that if you have a DNA match and a shakey leaf, that IS your link, your genetic link, to each other.  Unfortunately, that is NOT the case, as CeCe Moore documented in her blog from Rootstech (starting just below the pictures) as an illustration of WHY we so desperately need a chromosome browser tool.

In a nutshell, Ancestry showed the wrong shakey leaf as the DNA connection – as proven by the fact that both of CeCe’s parents have tested at Ancestry and the shakey leaf person doesn’t match the requisite parent.  And there wasn’t just one, or two, but three instances of this.  What this means is, of course, that the DNA match and the shakey leaf match are entirely independent of each other.  In fact, you could have several common ancestors, but the DNA at any particular location comes only from one on either Mom or Dad’s side – and maybe not even the shakey leaf person.
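The check described above is simple enough to sketch in a few lines of Python.  This is only an illustration under assumptions: it presumes both parents have tested, and the match names and the leaf_side labels are hypothetical, not anything Ancestry provides in this form.

```python
def leaf_side_confirmed(match_name, leaf_side, mother_matches, father_matches):
    """Return True if the match also appears on the match list of the
    parent through whom the tree hint (the leaf) says you are related."""
    if leaf_side == "maternal":
        return match_name in mother_matches
    if leaf_side == "paternal":
        return match_name in father_matches
    raise ValueError("leaf_side must be 'maternal' or 'paternal'")


# Made-up match lists for two tested parents.
mother_matches = {"cousin_a", "cousin_b"}
father_matches = {"cousin_c"}

# The tree hint says cousin_c is related through Mom, but only Dad matches.
hint_holds = leaf_side_confirmed("cousin_c", "maternal", mother_matches, father_matches)
```

Here hint_holds is False, which is the situation described above: the tree connection may be perfectly real, but the shared DNA came from the other side of the family.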

So what Ancestry customers are receiving is a list of people they match and possible links, but most of them have no idea that this is the case, and blissfully believe they have found their genetic connection.  They have found a genealogical cousin, and it MIGHT be the genetic connection.  But then again, they could have found that cousin simply by searching for the same ancestor in Ancestry’s data base.  No DNA needed.

Ancestry has added a search feature, allowed raw data file downloads (thank you) and they have updated their ethnicity predictions.  The ethnicity predictions are certainly different, dramatically different, but equally as unrealistic.  See the Ethnicity Makeovers section for more on this.  The search function helps, but what we really need is the chromosome browser, which they have steadfastly avoided promising.  Instead, they have said that they will give us “something better,” but nothing has materialized.

I want to take this opportunity to say, as loudly as possible, that TRUST ME IS NOT ACCEPTABLE in any way, shape or form when it comes to genetic matching.  I’m not sure what Ancestry has in mind by the way of “better,” but if it’s anything like the mediocrity with which their existing DNA products have been rolled out, neither I nor any other serious genetic genealogist will be interested, satisfied or placated.

Regardless, it’s been nearly 2 years now.  Ancestry has the funds to do development.  They are not a small company.  This is obviously not a priority because they don’t need to develop this feature.  Why is this?  Because they can continue to sell tests and to give shakey leaves to customers, most of whom don’t understand the subtle “untruth” inherent in that leaf match – so are quite blissfully happy.

In years past, I worked in the computer industry when IBM was the Big Dog against whom everyone else competed.  I’m reminded of an old joke.  The IBM sales rep got married, and on his wedding night, he sat on the edge of the bed all night long regaling his bride in glorious detail with stories about just how good it was going to be….

You can sign a petition asking Ancestry to provide a chromosome browser here, and you can submit your request directly to Ancestry as well, although to date, this has not been effective.

The most frustrating aspect of this situation is that Ancestry, with their plethora of trees, savvy marketing and captive audience testers really was positioned to “do it right,” and hasn’t, at least not yet.  They seem to be more interested in selling kits and providing shakey leaves that are misleading in terms of what they mean than providing true tools.  One wonders if they are afraid that their customers will be “less happy” when they discover the truth and not developing a chromosome browser is a way to keep their customers blissfully in the dark.

http://dna-explained.com/2013/03/21/downloading-ancestrys-autosomal-dna-raw-data-file/

http://dna-explained.com/2013/03/24/ancestry-needs-another-push-chromosome-browser/

http://dna-explained.com/2013/10/17/ancestrys-updated-v2-ethnicity-summary/

http://www.thegeneticgenealogist.com/2013/06/21/new-search-features-at-ancestrydna-and-a-sneak-peek-at-new-ethnicity-estimates/

http://www.yourgeneticgenealogist.com/2013/03/ancestrydna-raw-data-and-rootstech.html

http://www.legalgenealogist.com/blog/2013/09/15/dna-disappointment/

http://www.legalgenealogist.com/blog/2013/09/13/ancestrydna-begins-rollout-of-update/

Ancient DNA

This has been a huge year for advances in sequencing ancient DNA, something once thought unachievable.  We have learned a great deal, and there are many more skeletal remains just begging to be sequenced.  One absolutely fascinating find is that all people not African (and some who are African through backmigration) carry Neanderthal and Denisovan DNA.  Just this week, evidence of yet another archaic hominid line has been found in Neanderthal DNA and on Christmas Day, yet another article appeared stating that type 2 Diabetes found in Native Americans has roots in their Neanderthal ancestors. Wow!

Closer to home, by several thousand years, is the suggestion that haplogroup R was not present in Europe immediately after the ice age and only later replaced most of the population, which, for males, appears to have been primarily haplogroup G.  It will be very interesting as the data bases of fully sequenced skeletons are built and compared.  The history of our ancestors is held in those precious bones.

http://dna-explained.com/2013/01/10/decoding-and-rethinking-neanderthals/

http://dna-explained.com/2013/07/04/ancient-dna-analysis-from-canada/

http://dna-explained.com/2013/07/10/5500-year-old-grandmother-found-using-dna/

http://dna-explained.com/2013/10/25/ancestor-of-native-americans-in-asia-was-30-western-eurasian/

http://dna-explained.com/2013/11/12/2013-family-tree-dna-conference-day-2/

http://dna-explained.com/2013/11/22/native-american-gene-flow-europe-asia-and-the-americas/

http://dna-explained.com/2013/12/05/400000-year-old-dna-from-spain-sequenced/

http://www.thegeneticgenealogist.com/2013/10/16/identifying-otzi-the-icemans-relatives/

http://cruwys.blogspot.com/2013/12/recordings-of-royal-societys-ancient.html

http://cruwys.blogspot.com/2013/02/richard-iii-king-is-found.html

http://dna-explained.com/2013/12/22/sequencing-of-neanderthal-toe-bone-reveals-unknown-hominin-line/

http://dna-explained.com/2013/12/26/native-americans-neanderthal-and-denisova-admixture/

http://dienekes.blogspot.com/2013/12/ancient-dna-what-2013-has-brought.html

Sloppy Science and Sensationalist Reporting

Unfortunately, as DNA becomes more mainstream, it becomes a target for sloppy science or intentional misinterpretation, and possibly both.  Without academic publication, we can’t see results or have the sense of security that comes from the peer review process, so we don’t know if the science and conclusions pass muster.

The race to the buck in some instances is the catalyst for this. In other cases, and not in the links below, some people intentionally skew interpretations and results in order to either fulfill their own belief agenda or to sell “products and services” that invariably report specific findings.

It’s equally unfortunate that many of these misconstrued and sensationalized results are coming from a testing company that goes by the names of BritainsDNA, ScotlandsDNA, IrelandsDNA and YorkshiresDNA. It certainly does nothing for their credibility in the eyes of people who are familiar with the topics at hand, but it does garner a lot of press and probably sells a lot of kits to the unwary.

I hope they publish their findings so we can remove the “sloppy science” aspect of this.  Sensationalist reporting, while irritating, can be dealt with if the science is sound.  However, until the results are published in a peer-reviewed academic journal, we have no way of knowing.

Thankfully, Debbie Kennett has been keeping her thumb on this situation, occurring primarily in the British Isles.

http://dna-explained.com/2013/08/24/you-might-be-a-pict-if/

http://cruwys.blogspot.com/2013/12/the-british-genetic-muddle-by-alistair.html

http://cruwys.blogspot.com/2013/12/setting-record-straight-about-sara.html

http://cruwys.blogspot.com/2013/09/private-eye-on-britainsdna.html

http://cruwys.blogspot.com/2013/07/private-eye-on-prince-williams-indian.html

http://cruwys.blogspot.com/2013/06/britainsdna-times-and-prince-william.html

http://cruwys.blogspot.com/2013/03/sense-about-genealogical-dna-testing.html

http://cruwys.blogspot.com/2013/03/sense-about-genetic-ancestry-testing.html

Citizen Science is Coming of Age

Citizen science has been slowly coming of age over the past few years.  By this, I mean when citizen scientists work as part of a team on a significant discovery or paper.  Bill Hurst comes to mind with his work with Dr. Doron Behar on his paper, A Copernican Reassessment of the Human Mitochondrial DNA Tree from its Root, or what we know as the RSRS model.  As the years have progressed, more and more discoveries have been made or assisted by citizen scientists, sometimes through our projects and other times through individual research.  JOGG, the Journal of Genetic Genealogy, which is currently on hiatus waiting for Dr. Turi King, the new editor, to become available, was a great avenue for peer reviewed publication.  Recently, research projects have been set up by citizen scientists, sometimes crowd-funded, for specific areas of research.  This is a very new aspect to scientific research, and one not before utilized.

The first paper below includes the Family Tree DNA Lab, Thomas and Astrid Krahn, then with Family Tree DNA and Bonnie Schrack, genetic genealogist and citizen scientist, along with Dr. Michael Hammer from the University of Arizona and others.

http://dna-explained.com/2013/03/26/family-tree-dna-research-center-facilitates-discovery-of-ancient-root-to-y-tree/

http://dna-explained.com/2013/04/10/diy-dna-analysis-genomeweb-and-citizen-scientist-2-0/

http://dna-explained.com/2013/06/27/big-news-probable-native-american-haplogroup-breakthrough/

http://dna-explained.com/2013/07/22/citizen-science-strikes-again-this-time-in-cameroon/

http://dna-explained.com/2013/11/30/native-american-haplogroups-q-c-and-the-big-y-test/

http://www.yourgeneticgenealogist.com/2013/03/citizen-science-helps-to-rewrite-y.html

Ethnicity Makeovers – Still Not Soup

Unfortunately, ethnicity percentages, as provided by the major testing companies still disappoint more than thrill, at least for those who have either tested at more than one lab or who pretty well know their ethnicity via an extensive pedigree chart.

Ancestry.com is by far the worst example, swinging like a pendulum from one extreme to the other.  But I have to hand it to them, their marketing is amazing.  When I signed in, about to discover that my results had literally almost reversed, I was greeted with the banner “a new you.”  Yea, a new me, based on Ancestry’s erroneous interpretation.  And by reversed, I’m serious.  I went from 80% British Isles to 6% and then from 0% Western Europe to 79%. So now, I have an old wrong one and a new wrong one – and indeed they are very different.  Of course, neither one is correct…..but those are just pesky details…

23andMe updated their ethnicity product this year as well, and fine tuned it yet another time.  My results at 23andMe are relatively accurate.  I saw very little change, but others saw more.  Some were pleased, some not.

The bottom line is that ethnicity tools are not well understood by consumers in terms of the timeframe that is being revealed, and it’s not consistent between vendors, nor are the results.  In some cases, they are flat out wrong, as with Ancestry, and can be proven.  This does not engender a great deal of confidence.  I only view these results as “interesting” or utilize them in very specific situations and then only using the individual admixture tools at www.Gedmatch.com on individual chromosome segments.

As Judy Russell says, “it’s not soup yet.”  That doesn’t mean it’s not interesting though, so long as you understand the difference between interesting and gospel.

http://dna-explained.com/2013/08/05/autosomal-dna-ancient-ancestors-ethnicity-and-the-dandelion/

http://dna-explained.com/2013/10/04/ethnicity-results-true-or-not/

http://www.legalgenealogist.com/blog/2013/09/15/dna-disappointment/

http://cruwys.blogspot.com/2013/09/my-updated-ethnicity-results-from.html?utm_source=feedburner&utm_medium=email&utm_campaign=Feed%3A+Cruwysnews+%28Cruwys+news%29

http://dna-explained.com/2013/10/17/ancestrys-updated-v2-ethnicity-summary/

http://dna-explained.com/2013/10/19/determining-ethnicity-percentages/

http://www.thegeneticgenealogist.com/2013/09/12/ancestrydna-launches-new-ethnicity-estimate/

http://cruwys.blogspot.com/2013/12/a-first-look-at-chromo-2-all-my.html

Genetic Genealogy Education Goes Mainstream

With the explosion of genetic genealogy testing, as one might expect, the demand for education, and in particular, basic education has exploded as well.

I’ve written a 101 series, Kelly Wheaton wrote a series of lessons and CeCe Moore did as well.  Recently Family Tree DNA has also sponsored a series of free Webinars.  I know that at least one book is in process and very near publication, hopefully right after the first of the year.  We saw several conferences this year that provided a focus on Genetic Genealogy and I know several are planned for 2014.  Genetic genealogy is going mainstream!!!  Let’s hope that 2014 is equally successful and that all these folks asking for training and education become avid genetic genealogists.

http://dna-explained.com/2013/08/10/ngs-series-on-dna-basics-all-4-parts/

https://sites.google.com/site/wheatonsurname/home

http://www.yourgeneticgenealogist.com/2012/08/getting-started-in-dna-testing-for.html

http://dna-explained.com/2013/12/17/free-webinars-from-family-tree-dna/

http://www.thegeneticgenealogist.com/2013/06/09/the-first-dna-day-at-the-southern-california-genealogy-society-jamboree/

http://www.yourgeneticgenealogist.com/2013/06/the-first-ever-independent-genetic.html

http://cruwys.blogspot.com/2013/10/genetic-genealogy-comes-to-ireland.html

http://cruwys.blogspot.com/2013/03/wdytya-live-day-3-part-2-new-ancient.html

http://cruwys.blogspot.com/2013/03/who-do-you-think-you-are-live-day-3.html

http://cruwys.blogspot.com/2013/03/who-do-you-think-you-are-live-2013-days.html

http://genealem-geneticgenealogy.blogspot.com/2013/03/the-surnames-handbook-guide-to-family.html

http://www.isogg.org/wiki/Beginners%27_guides_to_genetic_genealogy

A Thank You in Closing

I want to close by taking a minute to thank the thousands of volunteers who make such a difference.  All of the project administrators at Family Tree DNA are volunteers, and according to their website, there are 7829 projects, all of which have at least one administrator, and many have multiple administrators.  In addition, everyone who answers questions on a list or board or on Facebook is a volunteer.  Many donate their time to coordinate events, groups, or moderate online facilities.  Many speak at events or for groups.  Many more write articles for publications from blogs to family newsletters.  Additionally, there are countless websites today that include DNA results…all created and run by volunteers, not the least of which is the ISOGG site with the invaluable ISOGG wiki.  Without our volunteer army, there would be no genetic genealogy community.  Thank you, one and all.

2013 has been a banner year, and 2014 holds a great deal of promise, even without any surprises.  And if there is one thing this industry is well known for….it’s surprises.  I can’t wait to see what 2014 has in store for us!!!  All I can say is hold on tight….

Native American Haplogroups Q, C and the Big Y Test

Sicangu man c 1900

I’m writing this to provide an update about Native American paternal research, and to ask for your help and support, but first, let me tell you why.  It’s a very exciting time.

If you don’t want the details, but you know you want to help now….and we have to pay for these tests by the end of the day December 1 to take advantage of the sale price…you can click below to help fund the Big Y testing for Native American haplogroups Q and C.  Both projects need approximately $990.  Everything contributed goes directly to testing.

To donate to the American Indian project, in memory of someone, a family member perhaps, or maybe in honor of an ancestor, or anonymously, click this link:

https://www.familytreedna.com/group-general-fund-contribution.aspx?g=AIP

In order to donate to haplogroup C-P39 project, please click this link:

http://www.familytreedna.com/group-general-fund-contribution.aspx?g=Y-DNAC-P39

Now for the story…

As many of you know, haplogroups Q and C are the two Native American male haplogroups.  To date, every individual with direct paternal Native American ancestors descends from a subgroup of either haplogroup Q or C, Q being by far the most prevalent.  Both of these haplogroups are also found to some extent in Asia and Europe, but there are distinct and specific lineages found in the Americas that represent only Native Americans.  These subgroups are not found in either Europe or Asia.

In December, 2010, we found the first SNP (single nucleotide polymorphism) marker that separated the European and the Native American subclades of haplogroup Q.  Since that time, additional markers have been found through the Walk the Y program and other research.

How did this happen?  A collaborative research approach between individual testers and project administrators.  In this case, Lenny Trujillo was a member of the haplogroup Q project and he agreed to take the WTY (Walk the Y) test, which indeed, discovered a very unique SNP marker that defines Native American haplogroup Q, as opposed to European haplogroup Q.

Much has changed in three years.  The WTY test which was focused solely on research is entirely obsolete, being replaced by a new much more powerful test called the Big Y, and at a reduced cost.  The Big Y sequences a much larger portion of the Y chromosome, which will allow us to discover even more markers.

Why is this important?  Because today, in haplogroups Q and C, we are learning through standard STR (short tandem repeat) surname marker tests who is related to whom, and how distantly, but it’s not enough.  For example, we have a group of haplogroup Q men in Canada who match each other, but then another group with a different SNP marker that is located in the Southwest, Mexico, and then in the North Carolina/Virginia border area.  Oh yes, and one more from Charleston, SC.  Most Native American men who carry haplogroup C are found in Northeastern Canada….but then there is one in the Southwest.  What do these people have in common?  Is their relationship “old” or relatively new?  Do they perhaps share a common historical language group?  We don’t know, and we’d like to.  In order to do that, we need to further refine their genetic relationship.  Hence, the new tool, the Big Y.

The Big Y sequences almost all of the Y chromosome – over 10 million base pairs and nearly 25,000 known SNPs.  But the good news is that the Big Y, like its predecessor, the WTY, has the ability to find new SNPs.  And they are being found by the buckets – so fast that the haplogroup trees can’t even keep up.  For example, the haplogroup project page still lists most Native people as Q1a3a, but in reality many new SNPs have been discovered.  The official haplogroup tree is still under construction, but you can see an updated version on the front page of the haplogroup Q project.

That’s the good news – that the Big Y represents a huge research opportunity for us to make major discoveries that may well divide the Native groups in the Haplogroup C and Q projects into either language groups, or maybe, if we are lucky, into tribal “confederacies,” for lack of a better word.  I hate to use the word tribes, because the definition of a tribe has changed so much.  What we would like to be able to do is to tell someone from their test results that they are Iroquoian, for example, or Athabascan, or Siouan.  This has been our overarching goal for years, and now we’re actually getting close.  That potential rests with the Big Y.

The bad news is that the test costs $495, and that’s the sale price good only through Dec. 1, and we need funding.  In the haplogroup Q project, we do have a few people who are testing.  Everyone who did the WTY has been sent a $50 coupon to apply towards the Big Y test.  I hope everyone who did do the WTY will indeed order the Big Y as well.  If not, then the coupon can be donated to us, as project administrators, to apply towards the Big Y test of someone else in the group who is testing.  If you’re not going to test, please donate your coupon.

In haplogroup Q, we have two additional men who we desperately want to take the Big Y test, and 2 in haplogroup C as well.  We’re asking for two things.  First, for unused $50 coupons and second, for contributions against the $495 price.  We’d certainly welcome large contributions, or a sponsor for an entire test, but we’d also welcome $5, $10, $25 or whatever you’d like to contribute.  Every little bit helps.

To donate to the American Indian project and to help fund this critical research, click this link:

https://www.familytreedna.com/group-general-fund-contribution.aspx?g=AIP

In order to donate to haplogroup C-P39 project for this research, please click this link:

http://www.familytreedna.com/group-general-fund-contribution.aspx?g=Y-DNAC-P39

Thank you everyone, in advance, for your help.  We can’t do this without you.  This is what collaborative citizen science is all about.  Of course, we’ll report findings as we receive them and can process the information.

2013 Family Tree DNA Conference Day 2

ISOGG Meeting

The International Society of Genetic Genealogy always meets at 8 AM on Sunday morning.  I personally think that 8AM meeting should be illegal, but then I generally work till 2 or 3 AM (it’s 1:51 AM now), so 8 is the middle of my night.

Katherine Borges, the Director, spoke about current and future activities, and Alice Fairhurst spoke about the many updates to the Y tree that have happened and those coming as well.  It has been a huge challenge to her group to keep things even remotely current and they deserve a huge round of virtual applause from all of us for the Y tree and their efforts.

Bennett opened the second day after the ISOGG meeting.

“The fact that you are here is a testament to citizen science” and that we are pushing or sometimes pulling academia along to where we are.

Bennett told the story of the beginning of Family Tree DNA.  “Fourteen years ago when the hair that I have wasn’t grey,” he began, “I was unemployed and tried to reorganize my wife’s kitchen and she sent me away to do genealogy.”  Smart woman, and thankfully for us, he went.  But he had a roadblock.  He felt there was a possibility that he could use the Y chromosome to solve the roadblock.  Bennett called the author of one of the two papers published at that time, Michael Hammer.  He called Michael Hammer on Sunday morning at his home, but Michael was running out the door to the airport.  He declined Bennett’s request, told him that’s not what universities do, and that he didn’t know of anyplace a Y test could commercially be done.  Bennett, having run out of persuasive arguments, started mumbling about “us little people providing money for universities.”  Michael said to him, “Someone should start a company to do that because I get phone calls from crazy genealogists like you all the time.”  Let’s just say Bennett was no longer unemployed and the rest, as they say, is history.  With that, Bennett introduced one of our favorite speakers, Dr. Michael Hammer from the Hammer Lab at the University of Arizona.

Bennett day 2 intro

Session 1 – Michael Hammer – Origins of R-M269 Diversity in Europe

Michael has been at all of the conferences.  He says he doesn’t think we’re crazy.  I personally think we’ve confirmed it for him, several times over, so he KNOWS we’re crazy.  But it obviously has rubbed off on him, because today, he had a real shocker for us.

I want to preface this by saying that I was frantically taking notes and photos, and I may have missed something.  He will have his slides posted and they will be available through a link on the GAP page at FTDNA by the end of the week, according to Elliott.

Michael started by saying that this is a really exciting opportunity to begin breaking family groups up with SNPs, which are coming faster than we can type them.

Michael rolled out the Y tree for R and the new tree looks like a vellum scroll.

Hammer scroll

Today, he is going to focus on the basic branches of the Y tree because the history of R is held there.

The first anatomically modern humans migrated from Africa about 45,000 years ago.

After last glacial maximum 17,000 years ago, there was a significant expansion into Europe.

Neolithic farmers arrived from the near east beginning 10,000 years ago.

Farmers had an advantage over hunter gatherers in terms of population density.  People moved into Northwestern Europe about 5,000 years ago.

What did the various expansions contribute to the population today?

Previous studies indicate that haplogroup R has a Paleolithic origin, but 2 recent studies agree that this haplogroup has a more recent origin in Europe – the Neolithic – but disagree about the timing of the expansion.

The first study, Jobling’s study in 2010, argued that geographic diversity is explained by a single Near East source via Anatolia.

It concluded that the Y chromosomes of Mesolithic hunter-gatherers were nearly replaced by those of incoming farmers.

The most recent study, by Busby in 2012, is the largest and concludes that there is no diversity in the mapping of R SNP markers, so they could not date lineage and expansion.  They did find that the most basic structure of the R tree did come from the Near East.  They looked at P311 as a marker for expansion into Europe, wherever it was.  Here is a summary page of Neolithic Europe that includes these studies.

Hammer says that in his opinion, he thought that if P311 is so frequent and widespread in Europe it must have been there a long time.  However, it appears that he and most everyone else, was wrong.

The hypothesis to be tested is that if P311 originated prior to the Neolithic wave, it would predict higher diversity in the Near East, closer to the origins of agriculture.  If P311 originated after the expansion, we would be able to see it migrate across Europe and it would have had to replace an existing population.

Because we have now sequenced the DNA of about 40 ancient specimens, Michael turned to the ancient DNA literature.  There were 4 primary locations with skeletal remains.  There were caves in France, Spain, Germany and then there’s Otzi, found in the Alps.

hammer ancient y

All of these remains are between 6000-7000 years old, so prior to the agricultural expansion into Europe.

In France, the study of 22 remains produced 20 that were G2a and 2 that were I2a.

In Spain, 5 G2a and 1 E1b.

In Germany, 1I G2a and 2 F*.

Otzi is haplogroup G2a2b.

There was absolutely 0, no, haplogroup R of any flavor.

In modern samples, of 172 samples, 94 are R1b.

To evaluate this, he is dropping back to the backbone of haplogroup R.

hammer backbone

This evidence supports a recent spread of haplogroup R lineages in western Europe about 5K years ago.  This also supports evidence that P311 moved into Europe after the Neolithic agricultural transition and nearly displaced the previously existing western European Neolithic Y, which appears to be G2a.

This same pattern does not extrapolate to mitochondrial DNA where there is continuity.

What conferred advantage to these post Neolithic men?  What was that advantage?

Dr. Hammer then grouped the major subgroups of haplogroup R-P311 and found the following clusters.

  • U106 is clustered in Germany
  • L21 clustered in the British Isles
  • U152 has an Alps epicenter

hammer post neolithic epicenters

This suggests multiple centers of re-expansion for subgroups of haplogroup R, a stepwise process leading to different pockets of subhaplogroup density.

Archaeological studies produce patterns similar to the hap epicenters.

What kind of model is going on for this expansion?

Ancestral origin of haplogroup R is in the near east, with U106, P312 and L21 which are then found in 3 European locations.

This research also suggests that G2a is the Neolithic version of R1b – it was the most commonly found haplogroup before the R invasion.

To make things even more interesting, the base tree that includes R has also been shifted, dramatically.

Haplogroup K has been significantly revised and is the parent of haplogroups P, R and Q.

It has been broken into 4 major branches from several individual lineages – widely shifted clades.

hammer hap k

Haps R and Q are the only groups that are not restricted to Oceania and Southeast Asia.

Rapid splitting of lineages in Southeast Asia to P, R and Q, the last two of which then appear in western Europe.

hammer r and q in europe

R then, populated Europe in the last 4000 years.

How did these Asians get to Europe and why?

Asian R1b overtook Neolithic G2a about 4000 years ago in Europe which means that R1b, after migrating from Africa, went to Asia as haplogroup K and then divided into P, Q and R before R and Q returned westward and entered Europe.  If you are shaking your head right about now and saying “huh?”…so were we.

Hammer hap r dist

Here is Dr. Hammer’s revised map of haplogroup dispersion.

hammer haplogroup dispersion map

Moving away from the base tree and looking at more recent SNPs, Dr. Hammer started talking about some of the findings from the advanced SNP testing done through the Nat Geo project and some of what it looks like and what it is telling us.

For example, the R1bs of the British Isles.

There are many clades under L21.  For example, there is something going on in Scotland with one particular SNP (CTS11722?) as it comprises one third of the population in Scotland, but is very rare in Ireland, England and Wales.

New Geno 2.0 SNP data is being utilized to learn more about these downstream SNPs and what they had to say about the populations in certain geographies.

For example, there are 32 new SNPs under M222 which will help at a genealogical level.

These SNPs must have arisen in the past couple thousand years.

Michael wants to work with people who have significant numbers of individuals who can’t be broken out with STRs any further and would like to test the group to break down further with SNPs.  The Big Y is one option but so is Nat Geo and traditional SNP testing, depending on the circumstance.

G2a is currently 4-5% of the population in Europe today and R is more than 40%.

Therefore, P312 split in western Eurasia and very rapidly came to dominate Europe.

Session 2 – Dr. Marja Pirttivaara – Bridging Social Media and DNA

Dr. Pirttivaara has her PhD in Physics and is passionate about genetic genealogy, history and maps.  She is an administrator for DNA projects related to Finland and haplogroup N1c1, found in Finland, of course.

marja

Finland has the population of Minnesota and is the size of New Mexico.

There are 3750 Finland project members and of them 614 are haplogroup N1c1.

Combining the N1c1 and the Uralic map, we find a correlation between the distribution of the two.

Turku, the old capital, was full of foreigners in medieval times, which is today reflected in the far reaching DNA matches to Finnish people.

Some of the interest in Finland’s DNA comes from migration which occurred to the United States.

Facebook and other social media has changed the rules of communication and allows the people from wide geographies to collaborate.  The administrator’s role has also changed on social media as opposed to just a FTDNA project admin.  Now, the administrator becomes a negotiator and a moderator as well as the DNA “expert.”

Marja has done an excellent job of motivating her project members.  They are very active within the project but also on Facebook, comparing notes, posting historical information and more.

Session 3 – Jason Wang – Engineering Roadmap and IT Update

Jason is the Chief Technology Officer at Family Tree DNA.  He recently joined through the Arpeggi merger and has an MS in Computer Engineering.

Regarding the Gene by Gene/FTDNA partnership, “The sum of the parts is greater than the whole.”  He notes that they have added people since last year in addition to the Arpeggi acquisition.

Jason introduced Elliott Greenspan, who, to most of us, needed no introduction at all.

Elliott began manually scoring mitochondrial DNA tests at age 15.  He joined FTDNA in 2006 officially.

Year in review and What’s Coming

4 times the data processed in the past year.

Uploads run 10 times faster.  With 23andMe and Ancestry autosomal uploads, processing will start in about 5 minutes, and matches will start then.

FTDNA reinvented Family Finder with the goal of making the user experience easier and more modern.   They added photos, profiles and the new comparison bars along with an advanced section and added push to chromosome browser.

Focus on users uploading the family tree.  Tools don’t matter if the data isn’t there.  In order to utilize the genealogy aspect, the genealogy info needs to be there.   Will be enhancing the GEDCOM viewer.  New GEDCOMs replace old GEDCOMs so as you update yours, upload it again.

They are now adding a SNP request form so that you can request a SNP not currently available.  This is not to be confused with ordering an existing SNP.

They currently utilize build 14 for mitochondrial DNA.  They are skipping build 15 entirely and moving forward with 16.

They added steps to the full sequence matches so that you can see your step-wise mutations and decide whether and if you are related in a genealogical timeframe.

New Y tree will be released shortly as a result of the Geno 2.0 testing.  Some of the SNPs have mutated as much as 7 times, and what does that mean in terms of the tree and in terms of genealogical usefulness.  This tree has taken much longer to produce than they expected due to these types of issues which had to be revised individually.

New 2014 tree has 6200 SNPS and 1000 branches.

  • Commitment to take genetic genealogy to the next level
  • Y draft tree
  • Constant updates to official tree
  • Commitment to accurate science

If a single sample comes back as positive for a SNP, they will put it on the tree and will constantly update this.

If 3 or 4 people have the same SNP that are not related it will go directly to the tree.  This is the reason for the new SNP request form.

Part of the reason that the tree has taken so long is that not every SNP is public and it has been a huge problem.

When they find a new SNP, where does it go on the tree?  When one SNP is found or a SNP fails, they have run over 6000 individual SNPs on Nat Geo samples to vet and verify the accuracy of the placement.  For example, if a new SNP is found in a particular location, or one is found not to be equivalent that was believed to be so previously, they will then test other samples to see where the SNP actually belongs.

X Matching

Matching differential is huge in early testing.  One child may inherit as little as 20% of the X and another 90%.  Some first cousins carry none.

X matching will be an advanced feature and will have its own chromosome browser.

End of the year – January 1.  Happy New Year!!!

Population Finder

It’s definitely in need of an upgrade and they have assigned one person full time to this product.

There are a few contention points that can be explained through standard history.

It’s going to get a new look as well and will be easily upgradeable in the future.

They cannot utilize the National Geographic data because it’s private to Nat Geo.

Bennett – “Committed to an engineering team of any size it takes to get it done.  New things will be rolling out in first and second quarter of next year.”  Then Bennett kind of sighed and said “I can’t believe I just said that.”

Session 4 – Dr. Connie Bormans – Laboratory Update

The Gene by Gene lab, which of course processes all of the FTDNA samples is now a regulated lab which allows them to offer certain regulated medical tests.

  • CLIA
  • CAP
  • AABB
  • NYSDOH

Between these various accreditations, they are inspected and accredited once yearly.

Working to decrease turn-around time.

SNP request pipeline is an online form and is in place to request a new SNP be added to their testing menu.

Raised the bar for all of their tests even though genetic genealogy isn’t medical testing because it’s good for customers and increases quality and throughput.

New customer support software and new procedures to triage customer requests.

Implemented new scoring software that can score twice as many tests in half the time.  This decreases turn-around time to the customer as well.

New projects include improved method of mtDNA analysis, new lab techniques and equipment and there are also new products in development.

Ancient DNA (meaning DNA from deceased people) is being considered as an offering if there is enough demand.

Session 5 – Maurice Gleeson – Back to Our Past, Ireland

Maurice Gleeson coordinated a world class genealogy event in Dublin, Ireland Oct. 18-20, 2013.  Family Tree DNA and ISOGG volunteers attended to educate attendees about genetic genealogy and DNA. It was a great success and the DNA kits from the conference were checked in last week and are in process now.  Hopefully this will help people with Irish ancestry.

12% of the Americans have Irish ancestry, but a show of hands here was nearly 100% – so maybe Irish descendants carry the crazy genealogist gene!

They developed a website titled Genetic Genealogy Ireland 2013.  Their target audience was twofold, genetic genealogy in general and also the Irish people.  They posted things periodically to keep people interested.  They also created a Facebook page.  They announced free (sponsored) DNA tests and the traffic increased a great deal.  Today ISOGG has a free DNA wiki page too.  They also had a prize draw sponsored by the Ireland DNA and mtdna projects.  Maurice said that the sessions and the booth proximity were quite symbiotic because when you came out of the DNA session, the booth was right there.

2000-5000 people passed by the booth

500 people in the booth

Sold 99 kits – 119 tests

45 took Y 37 marker tests

56 FF, 20 male, 36 female

18 mito tests
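As a quick sanity check on the booth figures above, the per-test counts can be tallied in a few lines of Python.  This is only an illustrative sketch: the variable names are made up for this example, and the numbers are simply the ones reported in the session.

```python
# Figures as reported in the session; the names below are hypothetical labels.
kits_sold = 99
tests_by_type = {
    "Y-DNA 37 marker": 45,
    "Family Finder": 56,
    "mtDNA": 18,
}
family_finder_by_sex = {"male": 20, "female": 36}

# Total tests ordered across all kits.
total_tests = sum(tests_by_type.values())

# Some kits carried more than one test, so tests outnumber kits.
tests_per_kit = total_tests / kits_sold

print(f"Total tests: {total_tests}")
print(f"Family Finder split adds up: "
      f"{sum(family_finder_by_sex.values()) == tests_by_type['Family Finder']}")
print(f"Average tests per kit: {tests_per_kit:.2f}")
```

The three test types sum to the 119 tests reported, and the male and female Family Finder counts sum to 56, so the figures are internally consistent.  119 tests on 99 kits works out to about 1.2 tests per kit.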

They passed out a lot of educational material the first two days.  It appeared that the attendees were thinking about things and they came back the last day which is when half of the kits were sold, literally up until they threatened to turn the lights out on them.

They have uploaded all of the lectures to a YouTube channel and they have had over 2000 views.  Of all of the presentations, which looked to be a list of maybe 10-15, the autosomal DNA lecture has received 25% of the total hits for all of the videos.

This is a wonderful resource, so be sure to watch these videos and publicize them in your projects.

Session 6 – Brad Larkin – Introducing Surname DNA Journal

Brad Larkin is the person in the FTDNA video that shows how to appropriately scrape for a DNA test.  That’s his minute or two of fame!  I knew he looked familiar.

Brad began a peer reviewed genetic genealogy journal in order to help people get their project stories published.  It’s free, open access, web based and the author retains the copyright.  www.surnamedna.com

Conceived in 2012, the first article was published in January 2013.  Three papers published to date.

Encourage administrators to write and publish their research.  This helps the publication withstand the test of time.

Most other journals are not free, except for JOGG which is now inactive.  Author fees typically are $1320 (PLOS) to $5000 (Nature) and some also have subscription or reader fees.

Peer review is important.  It is a critical review, a keen eye and an encouraging tone.  This ensures that the information is evidence based, correct and replicable.

Session 7 – mtdna Roundtable – Roberta Estes and Marie Rundquist

This roundtable was a much smaller group than yesterday’s Y DNA and SNP session, but much more productive for the attendees since we could give individual attention to each person.  We discussed how to effectively use mtdna results and what they really mean.  And you just never know what you’re going to discover.  Marie was using one of her ancestors whose mtDNA was not the haplogroup expected and when she mentioned the name, I realized that Marie and I share yet another ancestral line.  WooHoo!!

Q&A

FTDNA kits can now be tested for the Nat Geo test without having to submit a new sample.

After the new Y tree is defined, FTDNA will offer another version of the Deep Clade test.

Illumina chip, most of the time, does not cover STRs because it measures DNA in very small fragments.  As they work with the Big Y chip, if the STRs are there, then they will be reported.

80% of FTDNA orders are from the US.

Microalleles from the Houston lab are being added to results as produced, but they do not have the data from the older tests at the University of Arizona.

Holiday sale starts now, runs through December 31 and includes a restaurant.com $100 gift card for anyone who purchases any test or combination of tests that includes Family Finder.

That’s it folks.  We took a few more photos with our friends and left looking forward to next year’s conference.  Below, left to right in rear, Marja Pirttivaara, Marie Rundquist and David Pike.  Front row, left to right, me and Bennett Greenspan.

Goodbyes

See y’all next year!!!

Family Tree DNA Research Center Facilitates Discovery of Ancient Root to Y Tree

The genetic genealogy community has been abuzz for months now with the discovery of the new Root of the Y tree.  First announced last fall at the conference for DNA administrators hosted by Family Tree DNA, this discovery has literally changed the landscape of early genetic genealogy and our understanding of the timeframe of the origins of mankind.  While it doesn’t make much difference in genetic genealogy in the past few generations, since the adoption of surnames, it certainly makes a difference to all of us in terms of our ancestors and where we came from – our origins.  After all, the only difference between current genetic genealogy and the journey of mankind is a matter of generations – and all of our ancestors were there, and survived to reproduce, or we wouldn’t be here.

One of the important aspects of this discovery is the collaboration of citizen scientists with academic institutions and corporations.  In this case, the collaborators were a citizen scientist, Bonnie Schrack, a volunteer haplogroup project administrator; Dr. Michael Hammer of the University of Arizona; National Geographic’s Genographic Project; and Drs. Thomas Krahn and Astrid Krahn, both with the Gene by Gene Genomics Research Center.  Without any one of these players, and Family Tree DNA’s support of projects, this discovery would not have been made.  This discovery of the “new root” legitimizes citizen science in the field of genetic genealogy and ushers in a new day in scientific research in which crowd sourced samples, in this case, through Family Tree DNA projects, provide clues and resources for important scientific discoveries.

Today Gene by Gene released a press release about the discovery of the new root.  In conjunction, Family Tree DNA has lowered their Y DNA test price to $39 for the introductory 12 marker panel for the month of March, hoping to attract new participants and to eliminate price as a factor.  On April 1, the price will increase to $49, still a 50% discount from the previous $99.  Who knows where that next discovery lies.  Could it be in your DNA?

Family Tree DNA’s Genomics Research Center Facilitates Discovery of Extremely Ancient Root to the Human Y Chromosome Phylogenetic Tree

HOUSTON, March 26, 2013 /PRNewswire/ – Gene By Gene, Ltd., the Houston-based genomics and genetics testing company, announced that a unique DNA sample submitted via National Geographic’s Genographic Project to its genetic genealogy subsidiary, Family Tree DNA, led to the discovery that the most recent common ancestor for the Y chromosome lineage tree is potentially as old as 338,000 years.  This new information indicates that the last common ancestor of all modern Y chromosomes is 70 percent older than previously thought.

The surprising findings were published in the report “An African American Paternal Lineage Adds an Extremely Ancient Root to the Human Y Chromosome Phylogenetic Tree” in The American Journal of Human Genetics earlier this month.  The study was conducted by a team of top research scientists, including lead scientist Dr. Michael F. Hammer of the University of Arizona, who currently serves on Gene By Gene’s advisory board, and two of the company’s staff scientists, Drs. Thomas and Astrid-Maria Krahn.

The DNA sample had originally been submitted to National Geographic’s Genographic Project, the world’s largest “citizen science” genetic research effort with more than 500,000 public participants to date, and was later transferred to Family Tree DNA’s database for genealogical research.  Once in Family Tree DNA’s database, long-time project administrator Bonnie Schrack noticed that the sample was very unique and advocated for further testing to be done.

“This whole discovery began, really, with a citizen scientist – someone very similar to our many customers who are interested in learning more about their family roots using one of our genealogy products,” said Gene By Gene President Bennett Greenspan.  “While reviewing samples in our database, she recognized that this specific sample was unique and brought it to the attention of our scientists to do further testing.  The results were astounding and show the value of individuals undergoing DNA testing so that we can continue to grow our databases and discover additional critical information about human origins and evolution.”

The discovery took place at Family Tree DNA’s Genomic Research Center, a CLIA registered lab in Houston which has processed more than 5 million discrete DNA tests from more than 700,000 individuals and organizations, including participants in the Genographic Project.  Drs. Thomas and Astrid-Maria Krahn of Family Tree DNA conducted the company’s Walk-Through-Y test on the sample and during the scoring process, quickly realized the unique nature of the sample, given the vast number of mutations.  Following their initial findings, Dr. Hammer and others joined to conduct a formal study, sequencing ~240 kb of the chromosome sample to identify private, derived mutations on this lineage, which has been named A00.

“Our findings indicate that the last common Y chromosome ancestor may have lived long before the first anatomically modern humans appeared in Africa about 195,000 years ago,” said Dr. Michael Hammer.  “Furthermore, the sample, which came from an African American man living in South Carolina, matched Y chromosome DNA of males from a very small area in western Cameroon, indicating that the lineage is extremely rare in Africa today, and its presence in the US is likely due to the Atlantic slave trade.  This is a huge discovery for our field and shows the critical role direct-to-consumer DNA testing companies can play in science; this might not have been known otherwise.”

Family Tree DNA recently dramatically reduced the price of its basic Y-DNA test by approximately 50%.  By offering the lowest-cost DNA test available on the market today, Gene By Gene and Family Tree DNA are working to eliminate cost as a barrier to individuals introducing themselves to personal genetic and genomic research.  They hope that expanding the pool of DNA samples in their database will lead to future important scientific discoveries.

About Gene By Gene, Ltd.
Founded in 2000, Gene By Gene, Ltd. provides reliable DNA testing to a wide range of consumer and institutional customers through its four divisions focusing on ancestry, health, research and paternity.  Gene By Gene provides DNA tests through its Family Tree DNA division, which pioneered the concept of direct-to-consumer testing in the field of genetic genealogy more than a decade ago.  Gene by Gene is CLIA registered and through its clinical-health division DNA Traits offers regulated diagnostic tests.  DNA DTC is the Research Use Only (RUO) division serving both direct-to-consumer and institutional clients worldwide.  Gene By Gene offers AABB certified relationship tests through its paternity testing division, DNA Findings.  The privately held company is headquartered in Houston, which is also home to its state-of-the-art Genomics Research Center.

SOURCE Gene By Gene, Ltd.

The New Root – Haplogroup A00

Now that things have calmed down a bit from the whirlwind of the Family Tree DNA Conference, I’d like to write in a little more comprehensive and sane manner about the revelation that we have a new root on the human tree.

I’m referring to the session given by Bonnie Schrack, Thomas Krahn and Michael Hammer titled “In Search of the Root: Discovery of a Highly Divergent Y Chromosome Lineage.”

Bonnie has posted her slides from the presentation as well as her speaking notes on her new haplogroup A webpage.  She contacted me with some corrections to my original blog posting about that session at the conference and provided additional information as well.  Thank you Bonnie, not just for this info, but for your work with haplogroup A that has been such a key part of this momentous discovery.  This isn’t just a once-in-a-lifetime event, it’s a once-in-the-history-of-mankind event.  Watch the haplogroup A website for more information from Bonnie about this exciting discovery and project.

Understandably, Bonnie, Thomas and Michael are somewhat restricted in what they can say until such time as the resulting academic paper in the works is published.

We all know that male humans arise from a person we call Y-line Adam, just like we call the first woman Mitochondrial Eve.  Before a 2011 paper, it was believed that shortly after Adam, haplogroup A and B were formed about the same time and were brother haplogroups.  Fulvio Cruciani’s 2011 paper, “A Revised Root for the Human Y Chromosomal Phylogenetic Tree: The Origin of Patrilineal Diversity in Africa” reorganized that tree and showed that indeed, haplogroup A formed from the root of all humanity with B forming from haplogroup A.

Cruciani showed his newly organized tree with haplogroup A1b, A1a and then A2, A3 and BT as brother haplogroups.  Cruciani did not use STR data, only SNP data in his study.

A second recent study, also in 2011, “Signatures of the pre-agricultural peopling processes in sub-Saharan Africa as revealed by the phylogeography of early Y chromosome lineages” by Chiara Batini et al, did include some STR markers that matched some of the haplogroup A samples.  Batini did not use SNP testing, so did not realize the potential of these STR samples.  These did not match the new A00 root, but did match other rare haplogroup A samples in subgroups.  Other STR matching samples can be found in the Sorenson database at www.smgf.org.

The 7-marker STR samples that did match the new A00 sample were from a private database at the Center for Genetic Anthropology, which very graciously worked with Michael Hammer and provided small amounts of those samples for further analysis.

In my conference blog posting, I asked how this discovery was previously missed, and Bonnie Schrack responded as follows:

“The reasons we had never heard about A00 before would be:

  • Very scanty research and sample collection in Africa, in proportion to the size and diversity of the population, compared to Europe and other more developed countries
  • Only recently has large-scale Y sequencing become practical and affordable; Cruciani’s 2011 paper was a breakthrough precisely because for the first time they were able to sequence a few samples on the scale of a WTY, resulting in a lot of new SNPs, and we’ve been able to make even more progress because we had a larger pool of (customer) samples from which I could cherry-pick the most divergent samples, and then our genetic genealogy/anthropology community made it possible to raise enough funds for us to sequence the most important three of them (after that point, Hammer and FTDNA found the other samples and funds).”

Before the WTY program, this type of analysis simply wasn’t being done.  This monumental discovery was a combination of citizen science (the haplogroup A project), an innovative scientific program (the WTY at Family Tree DNA) and academic partnership (Michael Hammer’s lab at the University of Arizona and other institutions), along with that crucial public participation.  Without the public participation aspect, the rest would be a moot point.

Haplogroup A research at Family Tree DNA discovered not just one, but two new branches of haplogroup A, one of which was actually a new base root that needed to be inserted before, or upstream of, the current root.  The locations where these new branches/roots needed to be inserted required the renaming of the current branches.  Hence, the newly discovered branch is A00, and Cruciani’s branch, formerly A1b, is now A0.

Thomas Krahn’s A00 discovery presentation slides are also available online.  You can tell he’s a scientist from the nature of his presentation.  You can see the actual process of discovery, in essence, what he saw as this new root was unearthed.  It’s fun to walk along with him, even if you don’t understand everything you see.

As part of this process, Thomas also sequenced the DNA of a chimp and a gorilla.  You can see the results at www.ysearch.org for the chimp at 6RCUU, the gorilla at 9ED3A and the new root, A00, at 6M5JA.  You can breathe easy, humans are far distant from chimps and gorillas, but maybe closer to Neanderthals or other archaic humans than we thought.

At the end of Thomas’s presentation, he included the image of a tree with a new root and lots of interesting branches.

Zooming in on the branches, you can see all of the DNA sequencing paraphernalia, microplates, readouts and results.  Maybe there is a little artist buried someplace in Thomas amid those scientific genes!

This work was no small feat, and the significance is mind-boggling.  This new discovery pushed the date of Y-Adam back a whopping 67% in one fell swoop.  Cruciani’s birth age for haplogroup A1b was 140,000 years ago and A00, compared to Cruciani’s sample, falls at 237,000 years ago.

Dr. Michael Hammer at the University of Arizona reanalyzed the haplogroup A tree and root with the new information available, and his new ages are even more amazing.  Cruciani’s A1b/A0 sample is now at 200,000 years old and A00 is at 338,000, with a 98% confidence level.
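For anyone who wants to check the arithmetic behind those "how much older" percentages, here is a minimal sketch in Python.  The function name percent_older is my own invention for illustration, and the only inputs are the ages quoted above.  Both pairs of dates work out to a bit over 69%, which lines up with the roughly 70 percent quoted in the press release.

```python
def percent_older(old_age_years, new_age_years):
    """Return how much older new_age_years is than old_age_years, in percent.

    Hypothetical helper for illustration only; it is plain percent-increase
    arithmetic, not a dating method used by any of the researchers.
    """
    return (new_age_years - old_age_years) * 100 / old_age_years


# Ages quoted in the text, in years before present.
cruciani_based = percent_older(140_000, 237_000)  # A1b vs. A00 on Cruciani's scale
hammer_based = percent_older(200_000, 338_000)    # A1b/A0 vs. A00 on Hammer's scale

print(round(cruciani_based, 1))  # 69.3
print(round(hammer_based, 1))    # 69.0
```

Note that the two sets of dates differ a great deal in absolute terms, but the relative jump from the old root to A00 is nearly the same in both.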

These dates pre-date all human fossils, although there are some archaic fossils that have been found and dated after this time in neighboring Nigeria.  This new information provides us with glimpses through the keyhole of time into ancient human origins, and raises even more questions that will be answered in time, with more genetic and anthropological research.  We all descend from this common root, and we may all be more closely related to archaic man than we knew.

The A00 participant descends from a former slave family in South Carolina.  The closest matches are found in western Cameroon near the Gulf of Guinea, a prime location in the slave trade.

There appear to be about 500 years between the participant and the samples from Cameroon, an age that speaks to the beginning of the slave trade.

Having worked closely with Lenny Trujillo, the man whose WTY sample provided us with haplogroup-changing and defining information for haplogroup Q, and understanding what a moving experience this journey has been for Lenny, I wondered about how the family involved with this revolutionary discovery must feel.

As luck would have it, I have worked with this family in one of my projects as well, and they contacted me after seeing my blog about the conference.

I asked how they felt, how they were reacting to this history-changing event in which their family was the keystone.  I have extracted pieces from e-mails back and forth and, with the family’s permission, am sharing what they had to say.  Clearly, without them and their active and supportive participation, this discovery would not have been made.  We all owe them a debt of gratitude.

“I have a B.S. in Mathematics. I love science and learning. I recently retired, but I spent a lot of that time working with research scientists on cutting edge technology and methods so it is very exciting to me to be a part of such a scientific discovery. My family would say I was the right one chosen.  This is the family line I know the most about so I am glad it was this part of my family.

I don’t yet have the formal results from Family Tree DNA concerning the Y-DNA sample they tested in the Walk Through the Y, I did know that the discovery was monumental from some preliminary results from Thomas.

I wanted to see the tie back to Africa, looks like GOD did exceedingly, abundantly more than I could ever ask or think. Just think of how long HE has preserved this Y-lineage just for such a time as this.”