The Dreaded “Middle East” Autosomal Result

One of our blog followers, Ron, asked this question:

“My late father and his brother were born and raised on Hatteras Island which was a very isolated community until relatively recent times. Curious about their genetic ancestry, I had my uncle do the Family Tree DNA Family Finder test. His results for the Family (Population) Finder were:

Europe (Western European) – Orcadian 91.37% ±2.82%

Middle East – Palestinian, Bedouin, Bedouin South, Druze, Jewish, Mozabite 8.63% ±2.82%

The 8.63% Middle East was surprising since most if not all of his ancestors, going back 4 or more generations, were born on the OBX (Outer Banks). Most of the original families on Hatteras Island trace their roots back to the British Isles and western Europe.

Since my mother’s parents were immigrants from eastern Europe, I thought it would be interesting to know what contributions my maternal grandparents added to my genetic ancestry, so I submitted my DNA samples for the same test.  The Population Finder test showed that I was Europe Orcadian 100.00% ±0.00%. I was shocked that some other population did not show in the results.

Can you help me understand how the representative populations are determined and why Middle East didn’t show in my sample?”

Yes, indeed, the dreaded “Middle Eastern” result.  I’ve seen this over and over again.  Let’s talk about what this is and why it might happen.  As it happens, the fact that Ron is from Hatteras Island provides us with a wonderful research opportunity, because it’s a population I’m quite familiar with.

Given that Dawn Taylor and I administer the Hatteras Families DNA Projects (Y-line, mtDNA and autosomal), I have a good handle on the genealogy of the Hatteras Island Families.  They are of particular interest because Hatteras Island is where Sir Walter Raleigh’s Lost Colonists are rumored to have gone and amalgamated with the Hatteras Indians.  The Hatteras Indians in turn appear to have partly died off, and partly married into the European Island population.  Both the Lost Colony Project and the Hatteras DNA Projects at  http://www.familytreedna.com/public/HatterasFathers and http://www.rootsweb.ancestry.com/~molcgdrg/hatteras/hifr-index.htm are ongoing and all Hatteras families are included.

As part of the Hatteras families endeavor, Dawn and I have assembled a data base of the Hatteras families with over 5000 early settlers and their descendants to about the year 1900 included.  What Ron says is accurate.  Most of the Hatteras Island families settled on the island quite early, beginning about 1710.  Nearly all of them came from Virginia, some directly and others after having settled on the NC mainland first for a generation or so in surrounding counties.  By 1750, almost all of the families found there in 1900 were present.  So indeed, this isolated island was settled by a group of people from the British Isles and a few of them intermarried with the local population of Hatteras Indians.

Once on the island, it was unusual to marry outside of the island population, so we have the situation known as endogamy, which is where an isolated population marries repeatedly within itself.  Other examples of this are the Amish and Jewish populations.  When this happens, the founding group of people’s DNA gets passed around in circles, so to speak, and no new DNA is introduced.

Typically what happens is that in each generation, 50% “new” DNA is introduced by the other parent.  When the new DNA is from someone unrelated, it’s relatively easy to sort out using today’s DNA phasing tools.  But when the “new” DNA isn’t new at all, but comes from the same ancestral stock as the other parent, it has the effect of making relationships look “closer” in time.

Let’s look at an example.

You carry the following average percentages of DNA from these relatives:

  • Parents 50% from each parent
  • Grandparents 25%
  • Great-grandparents 12.5%
  • Great-great-grandparents 6.25%

As you can see, the percentage is divided in each generation.  However, if two of your great-grandparents are the same person, then you actually carry 25% of the DNA from that person, not 12.5.  When you’re looking at matches to other people in an endogamous community, nearly everyone looks more closely related than they are on paper due to the cumulative effect of shared ancestors.  In essence, genetically, they are much closer than they look to be on a genealogy pedigree chart.
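For anyone who wants to check the arithmetic, here is a minimal sketch of those averages in Python.  The function and parameter names are illustrative assumptions, not part of any testing company's tools, and real inheritance varies around these averages.

```python
def expected_share(generations_back, times_in_tree=1):
    """Average fraction of your DNA expected from one ancestor.

    generations_back: 1 = parent, 2 = grandparent, 3 = great-grandparent, ...
    times_in_tree: how many slots that same person fills in your pedigree
                   (more than 1 when lines of descent converge).
    """
    return times_in_tree * 0.5 ** generations_back


for label, gen in [("Parent", 1), ("Grandparent", 2),
                   ("Great-grandparent", 3), ("Great-great-grandparent", 4)]:
    print(f"{label}: {expected_share(gen):.2%}")

# The same person appearing twice among your great-grandparents
print(f"Doubled great-grandparent: {expected_share(3, times_in_tree=2):.2%}")
```

The `times_in_tree` argument is the whole point: each extra appearance of the same ancestor adds another full share, which is why endogamous relatives look genetically closer than the paper pedigree suggests.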

Ok, back to the question at hand.  Where did the Middle Eastern come from?

Looking at the percentages above, you can see that if Ron’s Uncle was in fact 8% (plus or minus about 2%, so we’ll just call it 8%) Middle Eastern, his Middle Eastern relative would be either a great-grandparent or a great-great-grandparent.  Given that generational length is typically 25 to 30 years, assuming Ron’s birth in 1960 and his uncle’s in 1940, this means that this Middle Eastern person would have been living on Hatteras Island between 1835 and 1860 using 25 year generations and between 1810 and 1840 using 30 year generations.  Having worked with the original records extensively, I can assure you that there were no Middle Eastern people on Hatteras Island at that time.  Furthermore, there were no Middle Eastern people on Hatteras earlier in the 1800s or in the 1700s that are reflected in the records.  This includes all extant records: deeds, marriages, court, tax, census, etc.
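To see why the percentage points to a great-grandparent or great-great-grandparent, you can run the halving rule backwards.  The sketch below assumes, purely for illustration, that the entire percentage comes from one single ancestor and that each generation contributes exactly half; the function name is hypothetical.

```python
import math


def generations_back(share):
    """Generations back at which one ancestor contributes `share` on average."""
    return -math.log2(share)


reported = 0.0863   # the uncle's "Middle East" percentage
margin = 0.0282     # the reported plus-or-minus

for s in (reported - margin, reported, reported + margin):
    print(f"{s:.2%} -> about {generations_back(s):.1f} generations back")
```

A result between 3 and 4 falls between the great-grandparent and great-great-grandparent generations.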

What we do find, however, are Native Americans, slaves and free people of color who may be an admixture of either or both with Europeans.  In fact, we find an entire community adjacent to the Indian village that is admixed.

We published an article in the Lost Colony Research Group Newsletter that discusses this mixed community when we identified the families involved.  It’s titled, “Will the Real Scarborough, Basnett and Whidbee Please Stand Up” and details our findings.

These families were present on the island and were recorded as being “of color” before 1790, so the intermarriage occurred early in the history of the island.

Furthermore, these families continued to intermarry and they continued to live in the same community as before.  In fact, in May and June of 2012, we visited with a woman who still owns the Indian land sold by the Indians to her family members in 1788!  And yes, Ron’s surname is one of those that intermarried with these families.  In fact, it was someone with his family surname who bought the land that included the Indian village in 1788 from a Hatteras Indian woman.

So what does this tell us?

Having worked with the autosomal results of people who are looking for small amounts of Native American ancestry, I often see this “Middle Eastern” admixture.  I’ve actually come to expect it.  I don’t believe it’s accurate.  I believe, for some reason, tri-racial admixture is being measured as “Middle Eastern.”  If you look at the non-Jewish Middle East, this actually makes some sense.  There is no other place in the world as highly admixed with a combination of African, European (Caucasian) and Asian.  I’m not surprised that early admixture in the US that includes white, African and Native American looks somewhat the same as Middle Eastern in terms of the population as a whole.  Regardless of why, this is what we are seeing on a regular basis.

New technology is on the horizon which will, hopefully, resolve some of this ambiguous minority admixture identification.  As new discoveries are made, as we discussed when we talked about “Ethnicity Finders” in the blog a few days ago, we learn more and will be able to more accurately refine these minority amounts of trace admixture.

If Ron’s ancestor in 1750 was a Hatteras Indian, and if there was no Lost Colonist European admixture already in the genetic mix, then using a 25 year generation, and assuming marriage to a 100% Caucasian in each generation, we would see the following percentages of ethnicity in subsequent generations:

  • 1750 – 100% Indian
  • 1775 – next generation, married white settler – 50% Indian
  • 1800 – 25% Indian
  • 1825 – 12.5% Indian
  • 1850 – 6.25% Indian
  • 1875 – 3.12% Indian
  • 1900 – 1.56% Indian
  • 1925 – 0.78% Indian
  • 1950 – 0.39% Indian

Remember, however, about endogamy.  This group of people were neighbors and lived in a relatively isolated community.  They married each other.  Every time they married someone else who descended from someone who was a Hatteras Indian in 1750, their percentage of Native Heritage in the subsequent generation doubled as compared to what it would have been without double inheritance.  So if Ron’s Uncle is descended several times from Hatteras Indians due to intermarriage within that community, it’s certainly possible that he would carry 6-10% Native admixture.  There are also records that suggest possible African admixture early in the Native community.
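The dilution table and the effect of multiple lines of descent can be sketched together.  This is only an illustration under the same assumptions as the list above (a fully Native ancestor in 1750, 25 year generations, average halving); the function name and the numbers of lines of descent are hypothetical and are not taken from the Hatteras genealogy.

```python
def share_from_paths(path_lengths):
    """Sum the average contribution of every line of descent.

    path_lengths: one entry per line of descent back to a 1750 Hatteras
    Indian ancestor, giving the number of generations along that line.
    """
    return sum(0.5 ** n for n in path_lengths)


# Reproduce the single-line table: one generation every 25 years from 1750.
for n, year in enumerate(range(1750, 1951, 25)):
    print(f"{year}: {share_from_paths([n]):.2%}")

# A hypothetical person born about 1950 with several lines of descent.
for lines in (1, 2, 4, 8):
    print(f"{lines} line(s) of 8 generations: {share_from_paths([8] * lines):.2%}")
```

Each additional line of descent adds its own share, so shorter lines and more of them both push the total up.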

So now to answer Ron’s last question about inheritance.

Ron wanted to know why he didn’t show any “Middle Eastern” admixture when his uncle did.

Remember that Ron’s Uncle has two “genetic transmission events” that differ from Ron’s line.  Ron’s Uncle, even though he had the same parents as Ron’s father, inherited differently from his parents.  Children inherit half of their DNA from each parent, but not necessarily the same half.  Maybe Ron’s father inherited little or none of the Native admixture.  In the next generation, Ron inherited half of his father’s DNA and half of his mother’s.  We have no way of knowing in which of these two transmission events Ron lost the Native admixture, or whether it’s there, but in such small pieces that the technology today can’t detect it.
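A very simplified way to picture this is to treat the admixture as a handful of separate DNA segments, each with an even chance of being passed on at every parent-to-child transmission.  That model, the segment counts and the function name below are all assumptions made for illustration; real recombination is messier.

```python
def chance_all_lost(segments, transmissions):
    """Probability that none of `segments` independent DNA segments
    survives `transmissions` parent-to-child handoffs, if each segment
    is passed on with probability 1/2 each time."""
    survive_one = 0.5 ** transmissions
    return (1 - survive_one) ** segments


for segments in (1, 3, 6):
    print(segments, "segment(s):",
          f"{chance_all_lost(segments, 1):.1%} lost after one transmission,",
          f"{chance_all_lost(segments, 2):.1%} after two")
```

Counting from a grandparent who carried the segments, the uncle is one transmission away and his nephew is two, so the nephew is always the more likely of the two to have inherited none of them.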

Hopefully the new technology on the horizon will improve all aspects of autosomal admixture analysis and ethnicity detection.  But for today, if you see the dreaded “Middle East” result appear as one of your autosomal geographic locations and your family isn’t Jewish and has been in the states since colonial times, think to yourself ‘racial admixture’ and revisit this topic as the technology improves.  In other words, as far as I’m concerned, the jury is still out!

______________________________________________________________

Disclosure

I receive a small contribution when you click on some of the links to vendors in my articles. This does NOT increase the price you pay but helps me to keep the lights on and this informational blog free for everyone. Please click on the links in the articles or to the vendors below if you are purchasing products or DNA testing.

Thank you so much.

DNA Purchases and Free Transfers

Genealogy Services

Genealogy Research

Racial Admixture in Elizabethan London

We typically don’t think of Africans in London in the 1500s, but they were there, as proven in parish and other records.  Thankfully, they were rare enough that when there was a record pertaining to them, their ethnicity was recorded.  But by 1600, after the Queen’s legendary decades-long conflict with Spain where galley slaves from Spanish ships were “rescued” when the ships were captured, the number of Africans and other “Moorish” people was becoming problematic, at least to the Queen, and she sought to repatriate at least some of them to “Barbary.”

Recently, the BBC ran a wonderful story about this which you can find at this link:  http://www.bbc.co.uk/news/magazine-18903391

In the haplogroup E1b1a project, it’s not uncommon for a person who knows their family to be “white” to discover their haplogroup is of African origin.  Many times, one can account for this by more fully researching the early colonial records of America, but not always.  Perhaps we need to extend the research net a bit wider to include both London and Bristol records.

______________________________________________________________


Ethnicity Finders

It’s no secret in the genetic genealogy community that one of my special areas of interest is Native and mixed race heritage.  Both are obscured in the history of this country and this continent, and hampered by the lack of records.

Descendants are left to attempt to piece the history of their family together, many times with nothing more concrete than oral family history, faintly remembered.  For these people, and there are many, genetic genealogy is the best and final hope they have of discovering IF the family rumor is true.  If it is true, then perhaps by the judicious use of these new DNA tools, we can begin to get some idea of where to look on the family tree, as well as in historical records.

Someone asked a question on the blog the other day about how to interpret these results, and I do want to answer that question specifically in a future blog, but first, we need to talk about the tools themselves.

There are three kinds of tests or tools out there in the marketplace today.

Y-line and Mitochondrial DNA Tests

Why, you ask, are we talking about these tests when we’re supposed to be talking about ethnicity finders?  Well, simply put, because these are the old, proven gold standards, and people tend to forget about using them.  These tests DO prove ethnicity, but only for that one specific line.  But that’s also the beauty of these tests: we know exactly which line the ethnicity pertains to.  The Y-line of course is the direct paternal line and mitochondrial DNA is the direct maternal line only.  What does that tell us about their spouses?  Not a darned thing.

To discover ethnicity information about the spouse, you need to find someone directly descended from the spouse in the proper manner and have them test.  What you need to do is to build yourself a DNA pedigree chart so that you can determine, to the best of your ability, the ethnicity of your family, member by member.

If you can obtain the Y-line and mtDNA of your great-grandparents (through descendants of course), you’ll know about 8 of your ancestors.  If you can obtain the DNA of your great-great-grandparents, you know the ethnicity of 16 of them.  That’s a lot of good information.

However, sometimes obtaining this information just isn’t possible.  Some people are adopted, some don’t know the identity of a parent for other reasons, sometimes couples don’t have children of the right genders for their descendants to take these kinds of DNA tests, and sometimes, you simply have relatives who aren’t interested or refuse to test.  Enter, autosomal testing.

CODIS Type Tests

The first entries into this field of autosomal testing were tests that used few markers.  I am grouping them here together, even though there were some differences and at the time, there was significant debate about which ones were better, more accurate and such.  But today, with the advent of what I’m calling the Wide Spectrum Chip Tests, they are all obsolete.

CODIS stands for the Combined DNA Index System and was developed by police to differentiate between people, not to find their ethnic similarities.  Most of these tests used either 15 or 21 markers that were standardized for police work.  One test specifically for genealogy used about 150 markers.

These tests were also used for early paternity testing and were fairly reliable for one generation, but beyond that, it was difficult to draw any conclusions.  My alleged half-brother and I took three of these tests to determine if we were in fact half-siblings.  One test came back inconclusive.  One test said “probably not” and one said “probably cousins, not half-siblings.”  Later, we both took two of the Wide Spectrum Chip Tests, and we are neither half-siblings nor cousins.  The results of both of the wide spectrum tests, taken at different companies, matched each other, so all doubt was removed.

I took several of these tests as they were released, and you can read about the differences in results in my paper on my website titled Revealing American Indian and Minority Heritage Using Y-line, Mitochondrial, Autosomal and X Chromosomal Testing Data Combined with Pedigree Analysis.  This paper was published in JoGG, the Journal of Genetic Genealogy, in the Fall of 2010.

Wide Spectrum Chip Tests

In one large step, we went from 21 markers to half a million, give or take 100,000 or so.  It was kind of like moving from trying to find scant evidence under a microscope to a panoramic view of the galaxy.

Altogether, there have been 4 players in this field.  One of the first was DeCodeMe.  They have pretty well eliminated themselves.  With an impending bankruptcy a few years ago, they raised their prices into the $2000 range.  That, combined with having no comparative data base like 23andMe had at the time, in essence killed them as a player.  Unfortunately, their ethnicity test was the only one that was able to classify my African heritage with a group of tribes.  I hated to see them leave the scene.

23andMe was the next player.  They introduced the concept of matching your cousins.  Genetic genealogy went crazy and we couldn’t order those tests quickly enough.  Unfortunately, their ethnicity comparison is disappointingly vague and is limited to 3 categories, European, Asian and African.  No updates or improvements have been offered in several years.  Genealogy is not their priority or focus.  People looking for Native American heritage must extrapolate that Asian is Native.

The other unfortunate part for genetic genealogists is that most of their customer base takes this test for health information.  While that means we’re fishing in a different pool than the normal genealogy group of people who test, it also means that many or most of them don’t reply to inquiries about their family history, and those that do often have no information.

Family Tree DNA was the next player to enter this space.  In addition to the cousin matches provided, their ethnic breakdown is far more detailed than any of the others, actually breaking down continents into several population categories.  While this detail is most welcome, it can also be confusing in some cases, especially if you receive an unexpected grouping.  They are the first company to bring us this level of detail, and we’ll talk in a minute about how this is done.  As with any new technology, there are pitfalls and this entire field is and has been a learning experience.

Ancestry.com recently entered this market as well.  They initially gave away thousands of kits, about 10,000 I believe, so that they would have something in their data base to compare results to when they began to sell the kits.  They did begin to sell the kits in the spring of 2012 by invitation only to customers, and now the early results are coming in.  They seem to have had some early issues with unwarranted Scandinavian results being reported, but as they fully develop the product, I would expect they would get this corrected.

So, as of today, we have three players using this Wide Spectrum Chip Technology.

There are two things you need to understand about this technology and how it is used to generate the results you’re seeing relative to ethnicity.

Chip Technology Itself

Technology has been a good friend to genetic genealogy, but most of us don’t know it.  New diagnostic technology has been developed in the medical field that we’ve been able to leverage.  Instead of manually looking for the results of 21 markers in the lab, new chips have been developed that are scanned for between 500,000 and 700,000 locations, and for about the same price.  This allows detailed analysis on the level that was previously not only impossible, but undreamed of.

Do you remember the videotape format war in the 1980s – VHS vs Beta?  If so, you’re probably groaning now.  Well, there was a similar DNA chip war too and you didn’t even know it happened.  As a result, today we use the Illumina chip.

If you were a Family Tree DNA customer and bought the early Family Finder test, you received a free upgrade when Family Tree DNA replaced their previous sequencer with the new Illumina model.  I’m sure that set them back a pretty penny, both the replacement sequencer and all of those free upgrades.  In any event, now that both 23andMe and Family Tree DNA use the same technology, their results can be compared.  You can upload 23andMe results to Family Tree DNA and you can upload both results to GedMatch for private comparisons.

We don’t know for sure what technology Ancestry is using, but it’s believed to be the Illumina platform.  However, it’s a moot point at this juncture, because they do not provide customers with their data files to download.  Genetic genealogists are hoping to change their minds in the future.  Without this capability, all of the advanced analysis is impossible.

(Update – Sept. 2013 – Ancestry does use the Illumina platform, does now provide raw data files, but still does not provide any comparison tools like a chromosome browser so that you can see if and where you actually do match the person you’re paired with through their system.)

Ok, all of this said, how is this technology used to determine ethnicity?

Determining Ethnicity

Whew, I bet you thought we’d never get to this part.  Ethnicity is really not determined by smoke and mirrors with the assistance of a fortuneteller and a crystal ball.  And no, you do not just pick up the Magic 8 Ball and look for the answer on the bottom.  If you remember the VHS wars, you’re probably laughing now.  If you aren’t, well, then, never mind.

Different marker values in our DNA are found in different proportions in differing populations.  We are all familiar with this relative to haplogroups – where they are found, originated and spread.  We know that African haplogroups are much more likely to be found in Africa than in Siberia, for instance.

Ancestry Informative Markers, called AIMS, aren’t any different.  What is different is that there is no centralized data base to compile them for research purposes.

Back to the CODIS markers, information about these markers was mined, for the most part, from forensic law enforcement publications.  The problem there was that there was no standardization or quality control.  For example, if you were being booked into the jail and someone asked you your ethnicity, how reliable was the answer?  Or did the jailer just look at you and write down what they thought?  Furthermore, results were very spotty and tended to be from high crime areas, not really representative of a world-wide population.    But it was all we had at the time and it was a baby step along the way.  This problem as a whole is known as data base normalization.

Relative to the CODIS type tests, they were pretty good at determining your primary ethnicity, something very important to law enforcement looking for an unknown suspect, but not useful to genealogists.  They were much less reliable looking for minority admixture and very unreliable looking for trace amounts of admixture.  These data bases were also easy to skew based on what data the researcher in question entered for comparison.  In other words, if you were interested in Native American ancestry, your data base would likely contain disproportionately more Native data than would proportionately be warranted.

As newer technology has become available and research has advanced, new information has become available.  For example, there are two DNA marker values that are known only to exist in the African and the Native American populations, respectively.  So, if you have one of these two values, then you unquestionably DO carry that heritage.  Of course, figuring out which ancestor or even which line it came from is another matter entirely.

No longer in the law enforcement and forensics arena, most AIMS now are discovered in academic settings.  In my paper, I do discuss the reference populations used for each of the testing companies.  The biggest challenge to all of them is finding and compiling the data.  It is buried in many academic papers and is not compiled centrally anyplace.  After the papers are read, the values are amassed, then the computer crunching needs to be done to determine which of these markers are really “ancestrally informative” and if so, how.  Unlike the one African and one Native marker, most markers are found in a range of populations in varying frequencies.  This means that you’re now dealing with statistical probabilities.  Did your eyes just glaze over?

In a nutshell, what has to be done is to look at all of the AIM values that you carry, look at where they are most likely to be found, and put all of that together to come up with a composite picture of you.  Let’s say for example, you have that African marker, but very few others found in high frequencies in Africa, that Native marker plus several more found in Asia and a whole bunch found in Europe but seldom in Asia or Africa.  This person would obviously have European, Native and African heritage, but it’s up to the statistics to determine what percentage of which type and from where.
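As a rough sketch of that “put all of that together” step, the toy example below estimates ancestry proportions from a dozen markers with a simple expectation-maximization loop.  Every frequency in it is invented for illustration, the population labels and the function name are hypothetical, and nothing here is meant to describe how any testing company actually performs its calculation.

```python
def estimate_admixture(allele_freqs, iterations=200):
    """Estimate ancestry proportions by expectation-maximization.

    allele_freqs: one tuple per marker value the person carries, giving
    how common that value is in each reference population.
    Returns one proportion per population; the proportions sum to 1.
    """
    k = len(allele_freqs[0])
    q = [1.0 / k] * k                      # start with equal proportions
    for _ in range(iterations):
        totals = [0.0] * k
        for freqs in allele_freqs:
            # How plausibly each population explains this marker value,
            # given the current proportion estimates.
            weighted = [q[j] * freqs[j] for j in range(k)]
            norm = sum(weighted)
            for j in range(k):
                totals[j] += weighted[j] / norm
        q = [t / len(allele_freqs) for t in totals]
    return q


POPULATIONS = ["European", "African", "Native/Asian"]

# Invented frequencies: (European, African, Native/Asian)
observed = [
    (0.00, 0.30, 0.00),   # value found only in the African population
    (0.00, 0.00, 0.25),   # value found only in the Native population
    (0.05, 0.02, 0.40),   # values more common in Asia
    (0.10, 0.05, 0.45),
    (0.70, 0.10, 0.15),   # values common in Europe, seldom elsewhere
    (0.65, 0.05, 0.10),
    (0.80, 0.20, 0.10),
    (0.75, 0.15, 0.20),
    (0.60, 0.10, 0.05),
    (0.85, 0.10, 0.10),
    (0.70, 0.05, 0.15),
    (0.90, 0.20, 0.25),
]

proportions = estimate_admixture(observed)
for name, p in zip(POPULATIONS, proportions):
    print(f"{name}: {p:.1%}")
```

Notice that the marker values found in only one population guarantee that population a non-zero share, while everything else is settled by the frequencies.  Change the reference frequencies and the percentages change with them, which is exactly why results shift as more data is published.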

This is obviously a new field, actually, a new field within a new field.  Genetic genealogy itself is only 12 years old.  As more papers are published and more information is found, this affects the statistics and will affect the ethnicity percentages shown.  Keep in mind also that the African value, for example, could have been passed from many generations ago, from a long forgotten and otherwise genetically “absent” ancestor.

Blaine Bettinger had a great blog about this very topic.  You can see it at http://www.thegeneticgenealogist.com/2012/06/19/problems-with-ancestrydnas-genetic-ethnicity-prediction/.  While he is actually talking about the problem with Ancestry.com’s ethnicity predictions, he discusses a very important concept, and that is that you actually have two family trees.  The genealogy one we all know and love, and a genetic family tree that we are just now getting to know.

Of course, the gift box with the big beautiful bow holds for us, one by one, the branches of our genetic tree….and that gift may look nothing at all like the package wrapping suggests.

______________________________________________________________


What Project do I Join?

You wouldn’t believe how often I receive this question.  It seems evident to those of us who work with this information, but it’s obviously not to others.  So this blog is for those who ask, and also for project administrators who want to make sure their projects are useful and friendly and reaching the people they want to reach.

This is referring to the projects at Family Tree DNA.  Ancestry also has surname projects, but they tend to be more like study groups because you don’t have to DNA test to join them.

At Family Tree DNA, there are three kinds of projects; surname projects, haplogroup projects and geographic projects.  Let’s look at all 3.

Surname Projects

Most males will want to join a surname project.  Since the Y chromosome follows the surname, unless we’re looking at cases of adoption (documented or otherwise), you’ll want to join the surname project most similar to your surname.

To find the surname project best suited for you, simply go to Family Tree DNA and enter your surname into the surname search box.

Project administrators – be sure that all variants of the surname are listed in your project profile.

Ladies, surname projects are much less useful to you directly, since surnames changed with every generation.  In my own personal case, I “keep” people who have tested for particular surnames in that project, but that’s so I can find them easily.  For example, I have two women who tested to prove who their ancestor was, that she was the wife of one William Crumley, and so they are in the Crumley project.  However, that is as much for my convenience as anything.  There are 5 surnames between their generation and the Crumley connection, so any of those surnames would be as appropriate as any other.  Generally, women should focus more on the other project types.  Some Y-DNA project administrators don’t accept mitochondrial results into the project.

Be sure to look through the mtDNA Lineage projects on the project search page.  They are similar to surname projects for males.  Don’t know how to find the lineage projects?  Keep reading to discover how to find different kinds of projects.

Haplogroup Projects

I encourage everyone to join appropriate haplogroup projects.  There may be more than one for you.

Often there is a primary haplogroup project, for example, haplogroup H, then subprojects.  You can find these projects by going to the Projects tab at the top of your personal page and clicking on the “join” option.

You will see the following selections.

If you’re looking for mitochondrial haplogroup H, scroll down to the mitochondrial haplogroup section and click on H.  You will then see the following options.

In this case, I would suggest joining both the haplogroup H main project and the subproject appropriate for you. If you are haplogroup H1, then join that project as well.  So in this case, you would join two haplogroup projects.

What is the benefit of joining a haplogroup project?

First, you can help science along its way.  This is one way you can be a citizen scientist, contributing to the greater good.  Haplogroup projects group people so we can discover new haplogroup subgroups and learn about migration patterns, which brings me to the second reason, which isn’t so altruistic.

You can learn about where your ancestors lived and settled before the advent of surnames.  Do you want to know where they lived 1000, 2000 or 5000 years ago?  Well, by looking at the haplogroup maps, you can see where they and their descendants settled.

Many haplogroup administrators group participants within the haplogroup project by either haplogroup subgroups, common mutation patterns (which lead to new haplogroup subgroups) or other criteria.  Here’s an example of a subgroup from the haplogroup H1 project.  If you don’t know where your ancestors were from in Europe, wouldn’t a map like this showing where others with similar DNA patterns lived be useful?

If you’re not sure about which projects apply to you then click on the project link and read what the administrator has to say about the project.  Still not sure?  Most of the time the administrator’s name and e-mail is shown.

Project administrators, be sure that your project description in the project profile and on your project public website background page is current and useful.  If you’re receiving the same question over and over, put the answer where people can see it.  Be sure your name and e-mail are listed so that people can contact you with questions.  Please, enable mapping.  It’s free and it’s a wonderful resource for your participants.  If any of these things are causing you problems, the helpdesk at Family Tree DNA is a god-send for project administrators and you can reach them at helpdesk@ftdna.com.

Geographic Projects

Named “geographic projects”, these really fall into the “all other” category, meaning those that aren’t surname projects and aren’t haplogroup projects.

I think these are the most interesting and most fun.  They group people by specific interests.  Sometimes that means geography, like the Cumberland Gap Project, sometimes ethnicity, like the Native American projects, and sometimes something else that someone wants to study.  They are also the most difficult to name appropriately so that people can find them, especially if they don’t know to look for them.

There are two ways to find these kinds of projects.  Go to the project tab on your personal page and click on join.

You will then see a listing of projects, a search box, and the index to projects.

Many people think that the projects shown are recommendations by Family Tree DNA and they join all of the projects.  This is NOT what this is.  This is a list of projects where the administrator has entered your surname, the one on your account, as a surname of interest to that project.  To see why, click on the project links. These projects may or may not be appropriate for your situation.

However, there may be other projects that are of interest to you.  You can begin by putting key words into the search box.  For example, putting the word “Indian” in the search box returned the following list of projects.

There are several projects shown, but I happen to know there are several more that aren’t.  Let’s say you’re interested in the Shawnee.  Try that word in the search box.  Still didn’t find anything?  Then resort to browsing.

Look through the various Y-line, mtDNA and dual (Y+mtDNA) projects to see what is listed.  You may be surprised at what you find that is interesting to you.  While looking for your Shawnee, you may also discover the Cumberland Gap project, the North Carolina Native project and others that might be relevant.  So take some time and look at what is available to you.

Hey look, I found the Shawnee project under PiquaShawnee in the P section of the Dual (Y+mtDNA) geographic projects.  I surely am glad I was browsing, because I would never have thought to look for that project name or to look under P!!

Project administrators, you want your project to be able to be found by those who need to find it.  If you have a Native American project, for example, you might add the names of tribes, the word “Indian” and the words “native” and “American” and “Native American” in the surname list on the project profile page.  Why?  Because those are things people might enter in that search box to find a relevant project.  In the above example, list the word “Shawnee” as well.

Once a project is named, the name can’t be changed, so think about how the project can most easily be found by a novice and name it appropriately.

______________________________________________________________

Disclosure

I receive a small contribution when you click on some of the links to vendors in my articles. This does NOT increase the price you pay but helps me to keep the lights on and this informational blog free for everyone. Please click on the links in the articles or to the vendors below if you are purchasing products or DNA testing.

Thank you so much.

DNA Purchases and Free Transfers

Genealogy Services

Genealogy Research

The Trouble with Ancestry.com Matches

Update: Ancestry no longer provides Y and mitochondrial DNA testing, but I’m leaving this article for historical context.

While working on a client’s mitochondrial DNA report, I came across the worst case I’ve seen in a long time of mismatches being shown as matches at Ancestry.com.  This has been a pervasive problem for a long time.

10 Point Question – If you match another person exactly on every location, HVR1&HVR2, must you have the exact same haplogroup?

Answer:  Most of the time.

You didn’t think this was going to be easy did you?

Because Family Tree DNA is the only company to test to the full sequence level, their clients are going to have far more advanced, detailed and accurate haplogroup assignments than people who test at companies who only offer the HVR1+HVR2 regions.

Therefore, as in this case, we see a client whose haplogroup is H1.  The “1” part of H1 is determined by location 3010A, a position found in the coding region that can only be read by full sequence testing.  So, at Ancestry, and in other data bases outside of Family Tree DNA, we would expect to see matches to both haplogroup H and H1 (assuming the data base allows outside results to be input), and possibly some other H haplogroups as well, if the HVR1+HVR2 region mutations match those of our H1 person.
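
To make the H versus H1 point concrete, here is a minimal Python sketch.  It assumes the person is already known to belong to haplogroup H and models only the single rule described above (3010A in the coding region defines H1).  The function name and the sample mutation lists are made up for illustration, and this is not how any company actually assigns haplogroups.

```python
def reported_haplogroup(mutations, coding_region_tested):
    """Toy rule for one branch only: haplogroup H versus H1.

    Assumes the person is already known to be in haplogroup H.
    3010A sits in the coding region, so it can only be seen when
    the coding region (full sequence) has been tested.
    """
    if coding_region_tested and "3010A" in mutations:
        return "H1"
    return "H"


# Hypothetical results for the same person from two different tests.
hvr_only_result = {"16519C", "263G"}             # HVR1+HVR2 test
full_sequence_result = {"16519C", "263G", "3010A"}  # full sequence test

print(reported_haplogroup(hvr_only_result, coding_region_tested=False))
print(reported_haplogroup(full_sequence_result, coding_region_tested=True))
```

The same person is reported as H by the HVR1+HVR2 test and as H1 by the full sequence test, which is why an H1 person can legitimately match people labeled simply H.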

OK – next 10 point question.  Will someone who is haplogroup H match someone who is haplogroup M or N or some other haplogroup?

Answer: No, not an exact match, but they may share some common mutations.

Then why does Ancestry show them as matches when a simple comparison would eliminate them?

The answer is two-fold.  First, part of the issue could be how Ancestry assigns haplogroups.  We really don’t know how they do it, and they aren’t as forthcoming about these things as Family Tree DNA is.  Second, and probably the biggest issue, Ancestry allows people to enter their own data from other labs into their data base, including their haplogroup, apparently without any verification process.  So, in essence, Ancestry has muddied their own waters.

My client’s 251 matches at Ancestry were all shown with “0” differences, which is supposed to mean they are exact matches.  That’s exciting to see, except it isn’t real.

I clicked on the “download matches” button, which dumps everything into a spreadsheet, a wonderfully handy feature.  As we talk about this, keep in mind that my client had a total of 5 mutations in the HVR1+HVR2 regions, so based on “0” differences, everyone on that list should share all of those mutations with no additional mutations.

Here’s what I found after sorting the spreadsheet.

Exact matches = 32, hardly the 251 displayed on the match page.

Of the 251 “exact” matches shown, the haplogroup breakdown is shown below:

A – 10 (Native American)

B – 7 (Native American)

C – 3 (Native American)

D – 2 (Native American)

H – 154, over half with no matching markers at all to client

HV – 10

I – 5

J – 5

K – 4

L – 12 (African)

M – 4

N – 5

R – 6

T – 7

U – 11

V – 3

W – 1

X – 1

Z – 1

But even this isn’t the worst part.  Of the 251 matches shown with “0” differences, 32 are actually exact matches.  Of those exact matches, we find 4 different haplogroups, including 3 in haplogroup M, a generally Asian haplogroup which is rare as hen’s teeth here in the US.  Hmmm….anyone spot a problem?

Of the remaining 219, 162 have no mutations whatsoever that match the client’s, so they not only shouldn’t be shown with “0” differences, they shouldn’t be shown at all.  That leaves the balance of the matches, 57 in number, which do share at least one marker but aren’t exact matches, and they too are shown incorrectly with “0” differences.
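
For readers who want to repeat this kind of check on their own downloaded match list, the three categories above (exact, shares at least one marker, shares nothing) can be sketched in a few lines of Python.  This is only an illustration of the “simple comparison” idea: the function name and the mutation values are hypothetical placeholders, not my client’s data and not Ancestry’s matching method.  The counts at the end are the ones reported in this article.

```python
from collections import Counter


def classify_match(client_mutations, other_mutations):
    """Compare two people's HVR1+HVR2 mutation lists.

    "exact"   - identical lists, which is what "0" differences should mean
    "partial" - at least one shared mutation, but not identical
    "none"    - no shared mutations at all
    """
    client, other = set(client_mutations), set(other_mutations)
    if client == other:
        return "exact"
    if client & other:
        return "partial"
    return "none"


# Hypothetical client with 5 mutations, plus three made-up "matches".
client = {"16189C", "16519C", "152C", "263G", "315.1C"}
match_list = [
    {"16189C", "16519C", "152C", "263G", "315.1C"},  # identical
    {"16519C", "263G"},                              # shares two
    {"16223T", "73G"},                               # shares none
]
print(Counter(classify_match(client, m) for m in match_list))

# Sanity check on the counts reported in the article.
reported = {"exact": 32, "partial": 57, "none": 162}
total = sum(reported.values())
percent_correct = round(100 * reported["exact"] / total)
print(total, percent_correct)
```

The three reported categories add up to the 251 matches displayed, and the exact matches work out to about 13% of the total.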

So let’s give Ancestry a report card on this.  32 out of 251 correct equals 13% correct.

Last 10 point question – What letter grade do you get for 13% right, which is 87% wrong?

In my book, and in any school I ever attended, that was a big fat F!

And no, this is not just a recently introduced software bug.  It’s been like this forever.

So now that we know how well Ancestry does on basic things like mitochondrial DNA matches, which are exceedingly easy, anyone feel good about how they’ll do with autosomal DNA?  Comparatively speaking, that’s the tough stuff.

______________________________________________________________

Citizen Science

My husband, Jim, who is kind of a geeky guy in the best of ways and really is interested in genetic genealogy from a technologist’s perspective, asked me a question about the new mitochondrial comparative sequence, the RSRS (Reconstructed Sapiens Reference Sequence).  We’ve been talking about it on the blog and on the various DNA lists for days now.  So it stands to reason we’re talking about it at the dinner table too.

He asked, “Why now?  Why not before when the transition would have been easier?”  That’s a great question!  The answer isn’t nearly as short as the question.  I hate it when he does this to me!

The answer is Citizen Science – that means you and me – lots of us actually.  How is that possible?  Let’s take a look at some history.  It’s actually quite interesting!

In 1981 when the Cambridge Reference Sequence was published as a comparative model, the science of genetics was functionally brand new.  The anonymous person whose DNA was sequenced at Cambridge University was the first person to have all 16569 bases of their mitochondria sequenced, something anyone can have today for a couple of hundred dollars.  But back then in the not so distant past, it was groundbreaking.  The Y DNA hadn’t even been mapped yet, so this was the very beginning.  At that point in time, there was no concept of mitochondrial Eve or Y-line Adam.  So the CRS became the norm because we had no other basis for comparison.

In 1999, the CRS was resequenced, and surprisingly, 11 errors were found in the original sequence.  Today that is called the Revised Cambridge Reference Sequence, or rCRS, technically, and that is the sequence that is used for both academia and genetic genealogy.  Most people just refer to it as the Cambridge Reference Sequence because no one would use the older sequence today.

1999 was also the first year that genetic genealogy tests were commercially available to the public.  They were sold by Oxford Ancestors and were prohibitively expensive, but that didn’t stop many of us from ordering one.  If you bought the book, “Seven Daughters of Eve” you could send in the form in the back of the book, with a hefty check, and you too could discover which of the 7 daughters you descended from.

What you received was one piece of paper in the mail, months later, with a gold attendance star (like from Sunday School when you were a kid) placed on your haplogroup name.  So for several hundred dollars, significantly more than a full sequence test today, I got a gold star on a J.  I still have that certificate and I was unbelievably excited to know I was a member of Jasmine’s clan.  Of course, in order to justify my DNA test, I had to test my husband’s too, so it cost me twice as much!

In the year 2000, Family Tree DNA opened their doors and began selling genetic genealogy testing kits. They also began surname projects.  I don’t know if that was a stroke of genius or a stroke of luck.  Soon thereafter, they added both haplogroup projects and geographic projects.  These various project types allowed people with specific interests to focus on those areas of genetic genealogy.  Little did we know that projects would eventually provide a huge pool of people who have been DNA tested for research areas, such as determining new haplogroups.  In the past all sequencing had been done at academic institutions and often did not use full sequences initially due to the prohibitive cost.  Many of the early academic papers were written with far fewer samples than today’s projects have members.  Full sequence commercial testing has fostered exponential change in this industry.

By 2006, Family Tree DNA was offering the full mitochondrial sequence for genealogists, something still not offered today by any of the other major commercial testing companies.  This not only enabled genealogists to determine who was actually a close match, but it also enabled the haplogroup projects to collect many samples of full sequence data.  The coding region (meaning not the HVR1, HVR2 and HVR3 regions) is not shown in the public projects because of the possibility that it may carry medical information, but it is available for project administrators to see, if the individual participant authorizes administrator view access.

Haplogroups aren’t just determined by the hypervariable (HVR) regions, but by mutations found in the entire mitochondrial sequence, including the coding region.  Never before had groupings of participants this size been available outside of academia, and often, not even within academia.

Many of the project administrators began discovering new haplogroups in a flurry of activity.  Two who come immediately to mind are Jim Logan and Bill Hurst.  Bill began publishing about haplogroup K in the Fall 2007 JoGG issue, as did Ian Logan with a discussion of what the mitochondrial DNA of “mitochondrial Eve” might look like.  In Spring of 2008, Jim Logan published a groundbreaking paper for haplogroup J, still in use today.  Indeed, citizen science came into its own in the spring of 2005 when the Journal of Genetic Genealogy (JoGG) was launched to facilitate exactly this type of academic publishing effort.  The more traditional publications weren’t quite ready to deal with citizen scientists making discoveries.  Clearly, citizen scientists didn’t fit well into the academic publishing “box.”

Bill Hurst has been collaborating with Dr. Doron Behar for several years now and is recognized in his most recent paper.  They presented a joint session at the 5th International Conference on Genetic Genealogy for DNA Administrators in Houston, Texas in March of 2009.

During this time, Family Tree DNA implemented an authorization system for people to make their full sequence DNA results, if they wanted, available to Dr. Behar for research.

Dr. Behar’s paper (along with several other authors), “A “Copernican” Reassessment of the Human Mitochondrial DNA Tree from its Root” was published earlier this year, defining the RSRS (Reconstructed Sapiens Reference Sequence) revealing the genetic fingerprint of Mitochondrial Eve, the original mother of us all.  He was able to do this, in part, as a result of the many full sequence test results made available by Family Tree DNA customers, you and me, and by the hard work of haplogroup administrators like Bill Hurst and Jim Logan.  Of course, there are many other hard-working administrators too, and I don’t mean to slight anyone.

So, this is a long-winded way to answer Jim’s question, which, in case you’ve forgotten, was “why now for the RSRS and why not before?”  The answer is quite simply, Citizen Scientists were needed.  People like you and me.  Until the stars aligned where haplogroup projects existed, full sequence mitochondrial data became affordable and widely available, and there was a way for genealogists to contribute their results for scientific research, it couldn’t have been done – at least not yet.  It’s been a long way from the gold star on haplogroup J to the beautifully elegant RSRS, the mitochondrial map of Eve, the common ancestor of everyone living today – the entire trip made in just a dozen years.  Congratulations and thank you to everyone involved.  Indeed, it’s really quite a remarkable story!

______________________________________________________________

The mtDNA Community

When you look at your mitochondrial DNA results on your personal page at Family Tree DNA, the third tab, after rCRS and RSRS is the mtDNA Community. You will only see this tab if you have taken the full sequence test.

The mtDNA Community software was developed to facilitate easy donation of your full sequence mitochondrial DNA sequences for scientific research purposes.  You can read about it here:  http://www.mtdnacommunity.org/default.aspx and here: http://www.mtdnacommunity.org/about.aspx

You too can be a part of science research by uploading your mitochondrial DNA sequence so that it can be included in the sequences studied by scientists.

Many of the leaps and bounds in genetic genealogy, including the discovery of new haplogroups and learning how the people who carried them lived and where they settled, have come through the volunteer efforts of genetic genealogists, just like yourself.

Let’s talk a minute though about what this means.  First, we don’t yet have a FAQ about the mtDNA Community from Family Tree DNA.  Much of what is known now comes from working with the product personally and from Rebekah Canada and Bill Hurst, both of whom have been rather intimately involved in the research and rollout process, and Max Blankfeld, the President of Family Tree DNA, all of whom made themselves available over the weekend to sort through this.

There are really two levels of research here, but one leads to the other.  If you authorize your full sequence results to be uploaded to mtDNA Community you are authorizing your results to be included in scientific research.  In the mtDNA Community, you are not anonymous.  This means that your sequence can be tracked back to you.  This is neither a bad thing nor a good thing, it’s just the way it works. Of course, there are benefits to you, other than being altruistic, for uploading your information.  We’ll discuss those in a minute.

The second part of the research quotient is that when papers are written using mitochondrial DNA sequences, most of the time those sequences are uploaded, anonymously, to GenBank.  At GenBank, the contact information is the submitting researcher and paper that the sequence is associated with.  This is done, at least in part, so that this research can be corroborated by others.  So if you upload your results to mtDNA Community, you are in essence granting permission for your results to be uploaded anonymously at some point in the future to GenBank.

What is GenBank?

The GenBank sequence database is an open access collection of all publicly available nucleotide sequences and their protein translations. This database is produced and maintained by the National Center for Biotechnology Information (NCBI) as part of the International Nucleotide Sequence Database Collaboration (INSDC). The National Center for Biotechnology Information is a part of the National Institutes of Health in the United States. GenBank and its collaborators receive sequences produced in laboratories throughout the world from more than 100,000 distinct organisms. In more than 20 years since its establishment, GenBank has become the most important and most influential database for research in almost all biological fields, whose data were accessed and cited by millions of researchers around the world.

You can read more about GenBank here:  http://en.wikipedia.org/wiki/GenBank and here:  http://www.ncbi.nlm.nih.gov/genbank/

Why is this important?

The science of genetic genealogy has grown by leaps and bounds in the past few years as a result, at least in part, of vast amounts of data becoming readily available through the genetic genealogy community and citizen scientists.  Without these sequences to study, scientific advances like the RSRS model wouldn’t have happened, at least not yet.

Is providing your mitochondrial DNA sequence for research the right choice for you?

For me, it was.  I provided my sequence to GenBank some time ago.  For everyone, it might not be.  It’s a personal decision.  But once it’s uploaded and in the scientific “stream” so to speak, there is no recalling it.  Even if the mtDNA Community administrators, or GenBank, were willing to remove your sequence, that doesn’t mean that someone hasn’t already downloaded it for study.

Is there a downside?

I can’t tell you that there is not.  What I can say is that I don’t know of any.  The mtDNA Community is new software released in conjunction with Dr. Behar’s paper in order to facilitate the study of mitochondrial DNA.  Before, submitting your results to GenBank was not straightforward and took quite a bit of effort on your part.  Now it’s as easy as clicking….and there are some benefits to you too.  So whether you do or not, follow along as I upload my results to the mtDNA Community.

Uploading your Results

Uploading your results is easy.  Just click on the mtDNA Community tab, shown above, on your personal page.  Fill in the blanks and click on the orange Upload button which you will see to the right of the blanks (not shown here).

You will then see the screen, above.  Click on the words “mtDNA Community” which will take you to the website to create your account.

Once you’re on the mtDNA Community website, you’ll need to do some setup.  It’s minimal, but do complete the profile questions, because that process leads you to the good stuff.  And yes, there is a bug in the year selection for your oldest ancestor, but I’m sure that will be fixed shortly.

The important part of this is the information in the Results box, shown at the bottom, above, and shown enlarged below.  You will notice that these are not all of your mutations.  The mutations you carry that are part of the haplogroup designation are not shown here. 

This information displays your new, extended haplogroup under the RSRS model, but even more important, it shows you any “private mutations.” These are important, because they are your family mutations, meaning those not found in the haplogroup as a whole.  These have developed in your family line, and everyone you are related to in a genealogically relevant timeframe will carry these as well.  These are your personal filters that differentiate your family from everyone else in the larger haplogroup, or your extended clan.
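
One way to picture private mutations is as simple subtraction: take all of your mutations and remove the ones that define your haplogroup.  The short Python sketch below illustrates that idea only.  The mutation values and the haplogroup-defining list are invented for the example, and this is an assumption about the concept, not a description of how the mtDNA Community software actually computes its results.

```python
def private_mutations(all_mutations, haplogroup_defining):
    """Mutations a person carries that are not part of the haplogroup definition."""
    return sorted(set(all_mutations) - set(haplogroup_defining))


# Invented example values, for illustration only.
haplogroup_defining = {"A1G", "C2T", "G3A"}
me = {"A1G", "C2T", "G3A", "T400C", "A500G"}
my_cousin = {"A1G", "C2T", "G3A", "T400C", "A500G"}
distant_clan_member = {"A1G", "C2T", "G3A", "C600T"}

print(private_mutations(me, haplogroup_defining))
print(private_mutations(my_cousin, haplogroup_defining))
print(private_mutations(distant_clan_member, haplogroup_defining))
```

In this made-up example my cousin and I share the same private mutations, while the distant member of the same haplogroup carries a different one, which is how the private mutations act as a family filter.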

There are also other matching features, but it’s unclear how these would differ, at least today, from your matches at Family Tree DNA.  Maybe eventually this data base will hold sequences other than those donated from Family Tree DNA.  If so, this would provide us with new avenues to find matches.

We will know more when the FAQ is released and as we use this tool a bit more.

______________________________________________________________

The CRS and the RSRS

Before we talk about the new Reconstructed Sapiens Reference Sequence, RSRS, let’s talk for a minute about the current comparison model, the Cambridge Reference Sequence, also known as the CRS or rCRS.

When analyzing mitochondrial DNA, your results are compared to the results of an anonymous individual whose DNA was sequenced in 1981 at Cambridge University.  This set of results which has become the standard is called the Cambridge Reference Sequence, or CRS.  Everyone else’s DNA is compared against theirs, and the differences (mutations) duly noted.

What this means is that for comparison purposes, the state of their mitochondrial DNA in 1981 is considered “normal” and any differences are then considered “mutations.”

All DNA testing companies as well as academia use this model, but this is changing.  Enter the RSRS.  What is the RSRS?

The Reconstructed Sapiens Reference Sequence or RSRS

In April 2012, a groundbreaking, watershed paper was published by Dr. Doron Behar and 8 other authors titled “A “Copernican” Reassessment of the Human Mitochondrial DNA Tree from its Root.”

You can read the paper and download the supplementary data at this link:  http://www.cell.com/AJHG/abstract/S0002-9297(12)00146-2

Before this new paper, mitochondrial DNA results were always reported by comparing your mtDNA to the Cambridge Reference Sequence.  This has been problematic for a number of reasons, but let’s just look at one example.

Mutation 16519C is extremely common.  In fact, it is found in more than half of the people tested.  So what this really means is that it’s not a mutation in the people who carry 16519C at all; the mutation is really in the anonymous person who is the Cambridge Reference Sequence.  But since they did not carry 16519C, it’s reported as a mutation in the rest of us.  However, it’s really the “normal” state of the DNA, or what we call the ancestral state.  And it’s relatively useless when comparing your results to others because so many people have it.

What Dr. Behar has spent years doing is going back in time, genetically, and reconstructing what we believe the original “mitochondrial Eve” looked like, at least in terms of her mitochondrial DNA.  He could do that because he took the time to sort through each haplogroup, taking him back in time to the ancestral state of all of the mutations, in other words, before they happened.

The result is something called the Reconstructed Sapiens Reference Sequence, or RSRS.

Why does this matter to you?

Today, when people at other companies are still using the older CRS, it doesn’t matter much, but it will in time as other companies adopt the new model too. It means that your reported mutations change. The RSRS is much more accurate and allows for a uniform naming of the various haplogroups from an ancestral base.  Your haplogroup name may and probably will change between the two.
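
Why do your reported mutations change when nothing about your DNA has changed?  Because a “mutation” is only a difference from whichever reference you are compared against.  The Python sketch below is a toy illustration of that idea.  It uses just three positions; position 16519 follows the example discussed above, while positions 101 and 202 and all of their values are invented, and neither toy reference is the real rCRS or RSRS.

```python
def differences(sample, reference):
    """List the positions where a sample differs from a reference, e.g. '16519C'."""
    return [
        f"{position}{base}"
        for position, base in sorted(sample.items())
        if reference.get(position) != base
    ]


# Toy data: position -> base.  Only three positions, mostly invented.
cambridge_like_reference = {101: "A", 202: "A", 16519: "T"}
ancestral_like_reference = {101: "G", 202: "G", 16519: "C"}
my_sample = {101: "G", 202: "A", 16519: "C"}

print(differences(my_sample, cambridge_like_reference))
print(differences(my_sample, ancestral_like_reference))
```

Against the Cambridge-like reference this sample reports 101G and 16519C, but against the ancestral-like reference it reports only 202A.  The sample is identical in both cases; only the yardstick moved.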

There may be a time during the transition where you’ll need to know if you’re using CRS or RSRS numbers.  Fortunately, for Family Tree DNA clients, you’re being provided with both, so you’ll be able to use either one or switch back and forth, as needed.

Haplogroup Name Changes

Whether or not the new reference sequence becomes widely accepted, this project by Dr. Behar has very successfully found many new subgroups of haplogroups.  In some cases, the haplogroups got shuffled a bit as to where their branch lives on the tree, based on new discoveries.  In my case, I have a new letter appended to my former haplogroup name, J1c2 to J1c2f, but for others, the change is significant.  On the Family Tree DNA pages, RSRS haplogroup names are displayed.

______________________________________________________________

What Happened to My Mitochondrial DNA???

Did you notice?  If you tested at Family Tree DNA, something happened to your mitochondrial DNA while you were sleeping.  Go ahead….quick….go and take a look.  There is something new….very new.

Family Tree DNA rolled out the very new RSRS results and has positioned them right beside your Cambridge Reference Sequence (CRS) values.  The CRS values are the mutations you’ve always seen, are familiar with, know and love.  They are on the first tab shown, so still there and very much visible.  See below.

So you have the old and you have the new, side by side, but what is the RSRS and what does it mean to you?  Why are they so different?  Tune in tomorrow when we’ll talk about the brand new, watershed RSRS, the Reconstructed Sapiens Reference Sequence, what it is and why it’s important to you.  This is cutting edge science, and you get to be a part of it!!!

And congratulations to Family Tree DNA for being the first company to bring us this new science, up close and personal.

______________________________________________________________

I’ve Never Met a DNA Test I Wouldn’t Take….

Ok, so maybe that is a bit of an exaggeration, but not much.

Someone commented recently that they were surprised that I had taken tests with other companies since I am affiliated with Family Tree DNA.  I’d like to talk about that.

First, I am affiliated with Family Tree DNA.  I provide the Personalized DNA Reports that they sell.  We teamed up several years ago to offer these.  I am not an employee, but a contractor to them.  Having said that, I was a customer long before that.  I’ve established several projects there, and for many reasons, I believe they are the best in this industry.

However, that has never kept me from testing at other companies, for several reasons.

First, I feel an obligation to my clients to be well versed in what the industry has to offer, and how can you be well versed if you don’t take the tests?  At least, that’s what I tell my husband when he asks why all those DNA testing bills:) So that’s my story and I’m sticking to it!

Secondly, I believe in fishing in different ponds.  Your DNA is fishing for you 24x7x365.

Third, there are lots of people in lots of places conducting research today.  I’m involved with a number of those projects as well, as a volunteer.

Fourth, if one company has a better tool for DNA analysis, I’m all for it, and for them.  For example, 23andMe was the first to offer the full spectrum autosomal tests, and I tested there and so did many of my family members.  I have also benefitted from the health information.

Fifth, I like to compare similar information between companies.  You can see an example of this and how I used it in my genealogy in the paper I wrote (published in JoGG), Revealing American Indian and Minority Heritage Using Y-line, Mitochondrial, Autosomal and X Chromosomal Testing Data Combined with Pedigree Analysis.

So, I’m by no means a DNA snob or in an exclusive relationship with Family Tree DNA relative to testing.  In fact, I recently ordered Ancestry.com’s autosomal DNA test.  I want to see what they say, find cousins in that data base who may help to break down genealogical brick walls, and see how my percentages of ethnicity are calculated there.  I’ll let you know as the results come in and how they stack up with similar tests at Family Tree DNA and 23andMe.

______________________________________________________________
