Using Spousal Surnames and DNA to Unravel Male Lines

When Y DNA matching at Family Tree DNA, it’s not uncommon for men to match other males of the same surname who share the same ancestor. In fact, that’s what we hope for, fervently!

However, if you’re stuck downstream, you may need to figure out which of several male children you descend from.

If you’re staring at a brick wall working your way back in time, you may need to try working forward, utilizing various types of information, including wives’ surnames.

For all intents and purposes, this is my Vannoy line, in Wilkes County, NC, so let’s use it as an example, because it embodies both the promise and the peril of this approach.

So, there you sit, disconnected from the Vannoy line. That little yellow box is just so depressing. So close, but yet so far. And yes, we’ve already exhausted the available paper trail records, years ago.

We know the lineage back through Elijah Vannoy, who was born between 1784 and 1786 in Wilkes County, or vicinity. We know my Vannoy cousin’s Y DNA matches other men from the Vannoy line upstream of John Francis Vannoy, the known father of four sons in Wilkes County, NC and the first (and only) Vannoy to move from New Jersey to that part of North Carolina.

Therefore, we know who the candidates are to be Elijah’s father, but the connection in the yellow box is missing. Many Wilkes County records have gone missing over the years and births were not recorded in that timeframe.  The records from neighboring Ashe County where Daniel Vannoy lived burned during the Civil War, although some records did survive. In other words, the records are rather like Swiss cheese. Welcome to genealogy in the south.

Which of John Francis Vannoy’s four sons does Elijah descend from?

Let’s see what we can discover.

Contact Matches and Ask for Help

The first thing I would do is to ask for assistance from your surname matches.

Let’s say that you match a known descendant of each of these four men, meaning each of John Francis Vannoy’s sons. Ask each person if they know where the male Vannoy descendants of each son went along with any documentation they might have. If your ancestor, Elijah in this case, is not found in the same location as the sons, geography may be your friend.

In our case, we know that Francis Vannoy migrated to Knox County, Kentucky, but that was after he signed for his daughter’s marriage in Wilkes Co., NC in 1812. It was also about this time that Elijah Vannoy migrated to Claiborne County, TN, in the same direction, but not the same location. The two locations are an hour away by car today, separated by mountains and the Cumberland Gap, a nontrivial barrier.

We also know that Nathaniel Vannoy left a Bible that did not list Elijah as one of his children, but with a gap large enough to possibly encompass another child. If you’re thinking to yourself, “Who would leave a child’s birth out of the Bible?” I thought the same thing until I encountered it myself in another line. However, the Bible record does make Nathaniel a less likely father candidate, despite a persistent rumor that Nathaniel was Elijah’s father.

Our only other clues are some tax records recording the number of children in the household of various ages, but none are conclusive. None of these men had wills.

Y DNA Genetic Distance

Your Y DNA matches will show how many mutations you are from them at a particular marker level.


The number of mutations between two men is called the genetic distance.

The rule of thumb is that the more mutations, the further back in time the common ancestor. The problem is, the rule of thumb doesn’t always work. DNA mutates when it darned well pleases, not on any clock that we can measure with that degree of accuracy – at least not accurately enough to tell which of 4 sons a man descends from – unless that line has incurred a defining mutation between the ancestor and the current generation. We call those line marker mutations. To determine the mutation history, you need multiple men from each line to have tested.
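
For readers who like to see the arithmetic, here is a minimal sketch of counting genetic distance, assuming a simple stepwise model where every one-repeat difference at a marker counts as one mutation. Family Tree DNA’s own calculation handles some markers differently, so treat this as an illustration only. The function name and the marker values are made up for the example.

```python
def genetic_distance(haplotype_a, haplotype_b):
    """Sum the absolute repeat-count differences over markers both men tested."""
    shared = haplotype_a.keys() & haplotype_b.keys()
    return sum(abs(haplotype_a[m] - haplotype_b[m]) for m in shared)


# Hypothetical marker values, for illustration only.
cousin = {"DYS393": 13, "DYS390": 24, "DYS19": 14, "DYS391": 11}
match_a = {"DYS393": 13, "DYS390": 24, "DYS19": 14, "DYS391": 10}
match_d = {"DYS393": 13, "DYS390": 22, "DYS19": 15, "DYS391": 11}

print(genetic_distance(cousin, match_a))  # 1
print(genetic_distance(cousin, match_d))  # 3
```

A small number like this tells you the men are related, but by itself it can’t tell you which of four brothers is your ancestor.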

You can read more about Y DNA matching in the article, Concepts – Y DNA Matching and Connecting with your Paternal Ancestor.

Check Autosomal DNA Tests

Next, check to see if your Y DNA matches from all Vannoy lines have also taken the autosomal Family Finder test, noted as FF, which shows matches from all ancestral lines, not just the paternal line.

You can see in the match list above that not many have taken the Family Finder test. Ask if they would be willing to upgrade. Be prepared to pay if need be – because you are, after all, the one with the “problem” to solve.

Generally, I simply offer to pay. It’s well worth it to me, and given that paper records don’t exist to answer the question – a DNA test under $100 is cheap. Right now, Family Finder tests are on sale for $69 until the end of the month.

Check for Intermarriage

While you’re waiting for autosomal DNA results, check the pedigrees for all four lines involved to see if you are otherwise related to these men or their wives.

For example, in Andrew Vannoy’s wife’s line and Elijah Vannoy’s wife’s line, we have a common ancestor. George Shepherd and Elizabeth Mary Angelique Daye are common to both lines, and John Shepherd’s wife is unknown, so we have one known problem and one unknown surname.

You can tell already that this could be messy, because we can’t really use Andrew Vannoy’s wife’s line to search for matches because Elijah’s line is likely to match through Andrew’s wife since Susannah Shepherd and Lois McNiel share a common lineage. Rats!

We’ll mark these in red to remind ourselves.

Check Advanced Matching

Family Tree DNA provides a wonderful tool that allows you to compare matches of different kinds of DNA. The Advanced Matching tab is found under “Tools and Apps” under the myFTDNA tab at the upper left.

In this case, I’m going to use the Advanced Match feature to see which of my Vannoy cousin’s Y matches at 37 markers, within the Vannoy DNA project, also match him autosomally.

This report is particularly nice, because it shows number of Y mutations, often indicating distance to a common ancestor, as well as the estimated autosomal relationship range.

You can see in this case that the first Vannoy male, “A,” is a close match both on Y DNA and autosomally, with 1 mutation difference and falling in the 2nd to 4th cousin range, as compared to the second Vannoy male, “D,” who is 3 mutations different and falls into the 4th to remote cousin range.

Not every Vannoy male may have joined the Vannoy project, so you’ll want to run this report a second time, replacing the Vannoy project search criteria with “The Entire Database.”
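
If you keep your own notes on matches, the same cross-check can be sketched in a few lines of Python. This is only an illustration of the idea, not how Advanced Matching works behind the scenes. The function names are mine, and testers B, C and X are invented. Only A and D, with their values, come from the report described above.

```python
def both_y_and_autosomal(y_matches, ff_matches):
    """Return (kit, Y distance, autosomal range) for kits on both lists, closest first."""
    common = y_matches.keys() & ff_matches.keys()
    rows = [(kit, y_matches[kit], ff_matches[kit]) for kit in common]
    return sorted(rows, key=lambda row: (row[1], row[0]))


def needs_upgrade(y_matches, ff_matches):
    """Return Y matches with no autosomal result on file, i.e. people to ask about upgrading."""
    return sorted(y_matches.keys() - ff_matches.keys())


# Y matches: kit -> genetic distance. B and C are hypothetical.
y_matches = {"A": 1, "B": 2, "C": 2, "D": 3}
# Autosomal matches: kit -> estimated relationship range. X is hypothetical.
ff_matches = {
    "A": "2nd - 4th cousin",
    "D": "4th - remote cousin",
    "X": "3rd - 5th cousin",
}

print(both_y_and_autosomal(y_matches, ff_matches))
print(needs_upgrade(y_matches, ff_matches))  # ['B', 'C']
```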

Unfortunately, not everyone that I need has taken the Family Finder test, so I’ll be contacting a few men, asking if I can sponsor their upgrades.

Let’s move on to our next tactic, using the wives’ surnames.

Search Utilizing the Wife’s Surname

We already know that we can’t rely on the Shepherd surname, so we’ll have to utilize the surnames of the other three wives:

  • Millicent Henderson – parents Thomas Henderson born circa 1730 Virginia, died 1806 Laurens, SC, wife Frances, surname unknown
  • Elizabeth Ray (Raye) – parents William Ray born circa 1725/1730 Herdford, England, died 1783 Wilkes Co., NC (the portion now Ashe Co.), wife Elizabeth Gordon born circa 1783 Amherst Co., VA and died 1804 Surry Co., NC
  • Sarah Hickerson – parents Charles Hickerson born circa 1725 Stafford Co., VA, died before 1793 Wilkes Co., NC, wife Mary Lytle

Utilizing the Family Finder match search function, I’m going to search for matches that include the wives’ surnames, but are NOT descended from the Vannoy line.

Hickerson produced no non-Vannoy matches utilizing the matches of my first Vannoy cousin, but Henderson is another matter entirely.

Since the Henderson line would be on my cousin’s father’s side, the matches that are most relevant are the ones phased to his paternal line, those showing the blue person icon.

The surname that you have entered as the search criteria will show as blue in the Ancestral Surname list, at far right, and other matching surnames will show as black. Please note that this includes surnames from ANY person in the match’s tree if they have uploaded a Gedcom file, not just surnames of direct ancestral lines. Therefore, if the match has a tree, it’s important to click on the pedigree icon and search for the surname in question. Don’t assume.

Altogether, there are 76 Henderson matches, of which 17 are phased to his paternal line. You’ll need to review each one of at least the 17. Personally, I would painstakingly review each one of the 76. You never know where a shred of information will be found.

Please note, finding a match with a common surname DOES NOT MEAN THAT YOU MATCH THIS PERSON THROUGH THAT SURNAME. Even finding a person with a common ancestor doesn’t mean that you both descend from that ancestor. You may have a second common ancestor. It means that you have more work to do, as proof, but it’s the beginning you need.

Of course, the first thing we need to do is eliminate any matches who also descend from a Vannoy, because there is no way to know if the matching DNA is through the Vannoy or Henderson lines. However, first, take note of how that person descends from the Vannoy line.

You can see your match’s entire surname list by clicking on their profile picture.

The surname, Ray, is more difficult, because the search for Ray also returns names like Bray and Wray, as well as Ray.
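
If you have copied your matches’ surname lists into a file of your own, one way around the Bray and Wray problem is to compare whole surnames instead of searching for the letters “ray” anywhere in the name. Here is a minimal sketch, assuming a simple dictionary of match names and surname lists. All of the match data and both function names are hypothetical.

```python
def has_surname(surnames, target):
    """True only when target equals a whole surname, ignoring case and stray spaces."""
    wanted = target.strip().lower()
    return any(name.strip().lower() == wanted for name in surnames)


def candidate_matches(matches, target, exclude):
    """Names of matches who list target as a whole surname but do not list exclude."""
    return [
        name
        for name, surnames in matches.items()
        if has_surname(surnames, target) and not has_surname(surnames, exclude)
    ]


# Hypothetical matches and their ancestral surname lists.
matches = {
    "Match 1": ["Ray", "Gordon"],
    "Match 2": ["Bray", "Smith"],
    "Match 3": ["Wray"],
    "Match 4": ["Ray", "Vannoy"],
}

print(candidate_matches(matches, "Ray", "Vannoy"))  # ['Match 1']
```

Spelling variants such as Raye would need their own pass, and anyone set aside for also having Vannoy in their list still deserves a look to see how they descend from the Vannoy line.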

But Wait – There’s a Happy Ending!

If you’re thinking, “this is a lot of work,” yes, it is.

Yes, you are absolutely going to do the genealogy of the wives’ lines so you can recognize if and how your matches might connect.

I enter the wives’ lines into my genealogy software and then I search for the ancestors found in my matches’ trees to see if they descend from that line.

One tip to make this easier is to test multiple people in the same line – regardless of whether they are males or carry the desired surname. They simply need to be descendants – that’s the beauty of autosomal DNA and why I carry kits with me wherever I go.  And yes, I’m really serious about that!

When you have multiple testers from the same line, you can utilize each test independently, searching for each surname in the Family Finder results. Then, from the surname match list, select a sibling or other close relative with that same surname in their list, then choose the ICW feature. This allows you to see who both of those people match who also carry the Henderson surname in their surname list.

Not successful with that initial cousin’s match results – like I wasn’t with Hickerson?

Rinse and repeat, with every single person who you can find who has descended from the line in question. I started the process over again with a second cousin and a Hickerson search.

About the time you’re getting really, really tired of looking at all of those trees, extending the branches of other people’s lines, and are about to give up and go to bed because it’s 3 AM and you’re discouraged, you see something like this:

Yep, it’s good old Charles Hickerson and Mary Lytle.  I could hardly believe my eyes!!! This Hickerson match to a cousin in my Vannoy line descends from Charles Hickerson’s son, Joshua.

All of a sudden…it’s all worthwhile! Your fatigue is gone, replaced by adrenalin and you couldn’t sleep now if your life depended on it!

Using the ICW (in common with) feature to find additional known cousins who match the person with Charles Hickerson and Mary Lytle in their tree, I found a total of three Vannoy cousins with significant matches.

Using the chromosome browser to compare, I’ve confirmed that one segment is a triangulated match of 12.69 cM (blue) on chromosome 2.
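
The core of a triangulation check is simple: every pair in the group has to match on the same chromosome, and those matching stretches have to overlap. Here is a rough sketch of the overlap test. The start and end positions are invented, and only the chromosome number comes from my results. A base-pair overlap also doesn’t tell you the size in cM, so this is a first screen and not a substitute for the chromosome browser.

```python
def common_overlap(segments):
    """Return (chromosome, start, end) shared by every segment, or None if there is none."""
    chromosomes = {seg[0] for seg in segments}
    if len(chromosomes) != 1:
        return None  # no segments at all, or segments on different chromosomes
    start = max(seg[1] for seg in segments)
    end = min(seg[2] for seg in segments)
    if start >= end:
        return None  # same chromosome, but the stretches do not overlap
    return (chromosomes.pop(), start, end)


# Hypothetical pairwise matching segments: (chromosome, start position, end position).
pairwise = {
    ("Cousin 1", "Cousin 2"): (2, 10_000_000, 25_000_000),
    ("Cousin 1", "Hickerson match"): (2, 12_000_000, 24_000_000),
    ("Cousin 2", "Hickerson match"): (2, 11_000_000, 26_000_000),
}

print(common_overlap(list(pairwise.values())))  # (2, 12000000, 24000000)
```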

You can read more about triangulation in the article, Concepts – Why Genetic Genealogy and Triangulation? as well as the article, Concepts – Match Groups and Triangulation.

Do I wish I had more than three people in my triangulation group? Yes, of course, but with a match of this size triangulated between cousins and a Hickerson descendant who is a 30 year genealogist, sporting a relatively complete tree and no other common lines, it’s a great place to begin digging deeper! This isn’t the end, but a new beginning!

After obsessively digging through the matches of every Elijah Vannoy descended cousin I can find (sleep is overrated anyway) and whose account I have access to, I have now discovered matches with four additional people who have no other common lines with the Vannoy cousins and who descend from Charles Hickerson and Mary Lytle through sons David and Joseph Hickerson. I can’t tell if they triangulate because I don’t have access to their accounts, so I’ve sent e-mails requesting additional information.

WooHoo Happy Day!!! There’s a really big crack in the brick wall and I’ve just witnessed the sunrise of a beautiful, amazing day.

I think Elijah’s parents are…drum roll…Daniel Vannoy and Sarah Hickerson!

Which walls do you need to fall and how can you use this technique?

______________________________________________________________________

Standard Disclosure

This standard disclosure will now appear at the bottom of every article in compliance with the FTC Guidelines.

Hot links are provided to Family Tree DNA, where appropriate. If you wish to purchase one of their products, and you click through one of the links in an article to Family Tree DNA, or on the sidebar of this blog, I receive a small contribution if you make a purchase. Clicking through the link does not affect the price you pay. This affiliate relationship helps to keep this publication, with more than 850 articles about all aspects of genetic genealogy, free for everyone.

I do not accept sponsorship for this blog, nor do I write paid articles, nor do I accept contributions of any type from any vendor in order to review any product, etc. In fact, I pay a premium price to prevent ads from appearing on this blog.

When reviewing products, in most cases, I pay the same price and order in the same way as any other consumer. If not, I state very clearly in the article any special consideration received. In other words, you are reading my opinions as a long-time consumer and consultant in the genetic genealogy field.

I will never link to a product about which I have reservations or qualms, either about the product or about the company offering the product. I only recommend products that I use myself and bring value to the genetic genealogy community. If you wonder why there aren’t more links, that’s why and that’s my commitment to you.

Thank you for your readership, your ongoing support and for purchasing through the affiliate link if you are interested in making a purchase at Family Tree DNA.

Concepts – Mirror Trees

What are mirror trees, and why would I ever want to use one?

Great question.

You’ll hear genealogists, especially adoptees or persons trying to find a missing parent, mention using mirror trees.

Mirror trees are a technique that genealogists use to help identify a missing common ancestor by recreating the tree of a match and strategically attaching your DNA to their tree to see who you match that descends from which line in their tree.

I have used mirror trees to attempt to determine the common line of a close cousin whose common ancestor (with me) I simply CANNOT discover. Notice the words “attempt to.”  Mirror trees are not a sure-fire answer, and they can sometimes lead you astray.

Foundation Concept

The foundation concept of a mirror tree is very straightforward.

Let’s say you match Susie as a second cousin. This means that you should share a great-grandparent with Susie. A relationship this close OUGHT to be relatively simple to figure out – except sometimes it isn’t.

Note that vendor relationship estimates are just that, estimates of relatedness based on total and longest cM, and they can be off in either direction.

In the case of third cousins or closer, vendor estimates are generally pretty accurate.

You can view the ranges of cMs and relationships in this chart.

Of course, when you match someone, you don’t know who the common ancestor is, nor do you necessarily have access to their pedigree chart or tree. If you do, and you can easily see the identity of the common ancestral couple, that’s great – but life isn’t always that simple.

In Practice

In my case, I match Susie, and no place in our trees, at ALL, is a common ancestor, let alone three generations back in time. Furthermore, her entire line and my father’s line were all from Appalachia, so common geography doesn’t help.

We matched at Ancestry, so we both uploaded to GedMatch, where we match almost exactly the same, and the relationship prediction is the same as well. Someplace, in one of our trees, is an NPE, a misattributed parentage – because both of our trees are complete back beyond those generations.

Uh oh.

So, I created a tree in my Ancestry account, duplicating Susie’s tree, and making it private – at least one generation beyond great-grandparents – just in case the estimate is wrong. Then, I connected my DNA to her tree, as her.

In my case, I have two DNA tests at Ancestry, my V1 results and my V2 results. I never really thought about this as a way to keep one set of results working for me, connected to my own tree, and to have a second set of results to connect to mirror trees – but that’s exactly what I’ve done. I utilize the second set of results as my “working on a problem” results while the first set of results just stays connected to my own tree.

After connecting my DNA results to the mirror tree and giving Ancestry a couple of days to cycle through, creating connections and green leaf “shared ancestor” hints, I checked to see who my DNA attached to her tree says I match, and which line in her tree “lights up” with match hints. If I can’t tell by connecting my DNA as her, I can also connect my DNA to her parents and grandparents, one at a time – again – looking for green leaf shared ancestor hints in those lines. No hints = wrong line.

This process shows me in which of her lines our common lineage is found – even if I can’t exactly pinpoint the common ancestors just yet.

Instructions

I had planned to provide step by step directions for how to create a mirror tree and then how to utilize the results, but then I discovered that someone else has done an absolutely wonderful job of writing mirror tree instructions. There is absolutely no reason to recreate the wheel, so I’m linking to two articles from the blog, Resurrecting Roots, as follows:

After building a mirror tree, their next article explains what to do next.

Now, if I could just figure out that common ancestor with my second cousin match. You may encounter the same type of challenge.

If the right people haven’t tested yet, you may not be able to achieve your goal on the first try. Or, in my case, it appears that we may have more than one common ancestor – complicating matters a bit. If this happens to you, wait a few weeks/months and connect the tree again, or build it out another generation to increase your chances of a green leaf hint.

The great thing about genetic genealogy is that more people are testing every single day. Give mirror trees a try if you’re an adoptee, trying to find an unidentified family member in a relatively close generation, or are being driven absolutely batty with a relatively close match that you can’t solve!

If you need help solving these types of problems, I suggest contacting dnaadoption and taking one of their classes.  They aren’t just for adoptees.

Concepts – Segment Survival – 3 and 4 Generation Phasing

Have you ever had something you need to refer back to and can’t find it? I do this more often than I care to admit.

About a year ago, I did a study when I was writing the “Concepts – Parental Phasing” article where I tracked segment matches from generation to generation through three generations.

I wanted to see how small versus large segments fared during the phasing process with a known relative. In other words, if a known relative matches a child and a parent on the same segment, does that known relative also match the relevant grandparent on that same segment, or is that match “lost” in the older generation?

This first example shows the tester matching all 4 generations of the Curtis lineage.

The second example, below, shows the Tester matching only the two youngest generations, but not the Grandparent or Great-grandparent.

Obviously, the tester cannot match the child and parent without also matching the grandparent and great-grandparents, who have also tested, for the segment to be genealogically relevant, meaning passed from the common ancestor to both the tester and the descendants in the Curtis line.  For the match between the tester and the parent/child to be valid, meaning the DNA descended from the common ancestor, the DNA segment MUST also be carried by the Grandparent and Great-grandmother.

If the segment matches all four people, then it phases through all generations and is a solid phased match.

If the segment matches only two contiguous generations, and not the older generation, as shown above, the segment is identical by chance in the younger generations, and is not genealogically relevant.

A third situation is clearly possible, where the tester matches the older generation or generations, but not the younger. In this case, the DNA simply did not get passed on down to the younger generations. In the example shown below, the segment still phases between the Grandparent and the Great-grandmother.

I’ve extracted the results from the original article and am showing them here, along with a 4 generation study utilizing 5 different examples.

The results are important because they were unexpected, as far as I was concerned.

Let’s take a look at the original results first.

Original Study – 3 Generations – 2 Meiosis

In the first study comparing three generations, I compared four different groups of people to a known relative in their family line. None of the family groups included any of the same people.

If the known relative matches the youngest generations, meaning the child and the parent, both, the location was colored green. This means the match phased through one generation. If the known relative also matched the third generation, the grandparent, on that same location, the location remained green. If the known relative did not match the oldest generation in addition to the child and the parent, then the location was changed to red, because the phasing was lost.

Green means that the matches did phase in all three generations and red means they either did not phase or the phasing was “lost” in the older generation.  Lost, in this instance, means the DNA match never happened and it was “lost” during the analysis process.

I followed this same process for 4 separate groups of three individuals, resulting in the following distribution of matching segments through all three generations (green), versus segments that matched the younger two generations but not the older generation (red) or don’t phase at all, meaning they match only one of the two younger relatives.

I marked what appears to be a threshold with a black line.

As you can see, the phasing threshold appears to fall someplace between 2.46 and 3.16 cM. These matches are through Family Tree DNA, so every segment contains 500 or more SNPs. In other words, almost all segments below that line in the chart, the larger ones, phased to all three generations. Many or most segments above that line, the smaller ones, were lost in upstream generations. This means they were false matches, or identical by chance (IBC).
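
If you want to run the same tally on your own family, here is a minimal sketch that counts how many segments phased on either side of a size boundary. The segment sizes and results below are invented for illustration and are not my study data. The 3.0 cM boundary is simply a round number inside the 2.46 to 3.16 cM range, and the function name is mine.

```python
def phasing_rate_by_size(segments, boundary_cm):
    """Return {band: (phased count, total count)} for segments below and at/above boundary_cm."""
    tally = {"below": [0, 0], "at_or_above": [0, 0]}
    for size_cm, phased in segments:
        band = "below" if size_cm < boundary_cm else "at_or_above"
        tally[band][1] += 1
        if phased:
            tally[band][0] += 1
    return {band: tuple(counts) for band, counts in tally.items()}


# Hypothetical (size in cM, phased to the oldest generation?) pairs.
segments = [(1.2, False), (1.9, False), (2.4, True), (3.2, True), (5.8, True), (12.7, True)]

print(phasing_rate_by_size(segments, 3.0))
# {'below': (1, 3), 'at_or_above': (3, 3)}
```

Only include segments that had the opportunity to phase, meaning the next generation up has actually tested.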

More segments phased to earlier generations than I expected. I was especially surprised at the number of small segments and the low threshold, so I was anxious to see if the pattern held when utilizing 4 generations, which involves 3 meioses.

New Study – 4 Generations – 3 Meiosis

In any one generation, a match can occur by chance, but once the match has phased through the parent’s generation, meaning the cousin matches the child AND the parent on the same segment, it’s easy to assume that they would, logically, match through the next two generations upwards as well. But do they? Let’s take a look.

Instead of just the summary information provided in the 3 generation study, I’m going to be showing you the three steps in the evaluation process for each example we discuss. I think it will help to answer questions, as well as to enable you to follow these same steps for your own family.

In total, I did 5 separate 4 generation comparisons, labeled as Examples 1-5, below.

Example 1 – 4 Generation – 3 Meiosis (DL)

A known cousin was compared up the tree on the relevant line through 4 generations. The relationship of the testers is shown in the chart above, with the blue arrows.

On the Curtis line, 4 individuals in descending generations were tested:

  • Child
  • Parent
  • Grandparent
  • Great-grandparent

In the Solomon line, one descendant was tested.

The results show the DNA segments that phased for 2, 3 and 4 generations, which is a total of 3 meioses, meaning three times that the DNA was passed from generation to generation between the Great-grandparent and the Child.

The individual whose matches are tracked below is a third cousin to the Great-grandparent of the group. The relationship of the cousin to the descendants of the great-grandparent is shown below.

In reality, the distance of the cousin relationship isn’t really relevant. The relevant aspect is that the cousin DOES match all 4 relatives that tested, and we can track the segments that the cousin matches to the child, parent or grandparent back through the great-grandparent to see if they phase, meaning to see if the match is legitimate or not. In other words, was the segment passed from the Great-grandparent to the Grandparent to the Parent to the Child?

This first chart shows the cousin’s matches to all 4 of the family members. I’ve colored them green if they have phased matches, meaning adjacent generations on the same segment. In the comment column, I’ve explained what you are seeing.

This chart is a little more complex than previously, because we are dealing with 4 generations instead of 3. Therefore, I’m showing the cousin’s matches to all 4 individuals.

  • For a location to have no color and be labeled “No Phased Match” means that there was a match to one family member, but not to the adjacent generation upstream, so it’s not a genealogically relevant match. In other words, it’s a false match.
  • For a location to have no color and be labeled “Oldest Gen Only” means that the cousin matches the great-grandmother only. Those matches may be genealogically relevant, but because we don’t have a generation upstream of her, we can’t phase them and can’t tell if they are relevant or not based only on the information we have here. Obviously you’ll want to evaluate each match individually to see if it is a legitimate or false match using additional criteria.
  • For a location to be colored green, it must phase entirely for all the generations from where it begins upwards in the tree. For some matches, that means all 4 generations. Some matches that do phase only phase for 2 or 3 generations, meaning that the segment did not get passed on to younger generations. The two shades of green are only to differentiate the match groups when they are adjacent on the spreadsheet.
  • If the cell is green and says “4 Gen Match,” it means that the match appeared in all 4 generations and matched (or at least overlapped.)
  • If the cell is green and says “3 Gen Match,” it means that the match appeared in the oldest 3 generations and matched. The match did NOT appear in the child’s generation, so what we know about this segment is that it did not get passed to the child, but in the three generations in which it does appear, it phased.
  • If the cell is green and says “2 Gen Match,” it means that it appeared in the oldest two generations and phased, but did NOT get passed to the parent, so it could not have been passed to the child.
  • Matches to any single generation (but not the immediate upstream generation) are labeled “No Phased Match.”
  • If the cell is red and says “Lost Phasing” it means that the segment phased in at least two generations but did NOT match the adjacent generation upstream. Therefore, this is an example of a segment that did phase in one generation, but that was actually identical by chance (IBC) further upstream. In the case of the red segments above, they phased in all three of the younger generations, only to become irrelevant in the oldest generation when the tester did not match the Great-grandmother.
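
For anyone who would rather let a script do the labeling, here is a sketch of my reading of the rules above. I did the original work by hand in a spreadsheet, so the function below is an illustration and not the tool I used. It assumes you have already grouped the cousin’s matches by location and know which of the four generations match at that location.

```python
GENERATIONS = ["Child", "Parent", "Grandparent", "Great-grandparent"]  # youngest to oldest


def label_rows(matched):
    """Label each generation the cousin matches at one segment location."""
    labels = {}
    run = []  # current stretch of adjacent generations that all match
    for generation in GENERATIONS + [None]:  # the trailing None flushes the last run
        if generation in matched:
            run.append(generation)
            continue
        if run:
            reaches_oldest = run[-1] == GENERATIONS[-1]
            if reaches_oldest:
                label = f"{len(run)} Gen Match" if len(run) > 1 else "Oldest Gen Only"
            else:
                label = "Lost Phasing" if len(run) > 1 else "No Phased Match"
            for member in run:
                labels[member] = label
            run = []
    return labels


print(label_rows({"Child", "Parent", "Grandparent"}))
print(label_rows({"Child", "Grandparent", "Great-grandparent"}))
```

The first call labels all three rows “Lost Phasing” because the run never reaches the Great-grandparent. The second labels the Child “No Phased Match” and the two oldest generations “2 Gen Match,” because the Parent is missing in between.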

Now, looking at the same segment chart sorted by centiMorgan size.

Sorted by centiMorgan size gives you the opportunity to note that the larger segments are much more likely to phase, when given the opportunity. Translated, this means they are much more likely to be legitimate segments.

Formatted in the same way as the 3 generation groups, we see the following chart of only the segments, with the matches that were to the oldest generation only removed because they did not have the opportunity to phase. What we have below are the results for the matches that did have the opportunity to phase:

  • Green means the segment did phase
  • Red means the segment did not phase and/or lost phasing.
  • White rows that did NOT phase are red above, along with rows that lost phasing.
  • White rows that are labeled “Oldest Gen Only” were removed because they are the oldest generation and did not have the opportunity to phase with an older generation.
  • For details, refer to the original charts, above.

Example 2 – 4 Generation – 3 Meiosis (CF-SV)

A second 4 generation comparison with a first cousin to the Great-grandmother results in more matches due to the closeness of the relationship, yielding additional information.

The 4 individuals in this and the following 3 examples are related in the following fashion:

Child 1 and Child 2 are siblings and Cousin 1 and Cousin 2 are siblings.

The two cousins are first cousins to the great-grandmother, so related to the matching individuals in the following fashion:

Because first cousins are significantly closer than third cousins, we have a lot more matching segments to work with.

It’s worth noting in the above chart that the two groups colored with gold in the right column both look like they phase, but when you look at the relationships of the people involved, you quickly realize that an intermediate generation is missing.

In the first example, the Grandparent and Great-grandmother do phase, but the child does not, because the cousin doesn’t also match the parent on that segment, so the parent could NOT have passed that segment to the child.  Therefore, the child does not phase.

In the second example, the cousin matches the Parent and Great-Grandmother, but the parent is missing in the match sequence, so these people don’t phase at all.

Sorted by centiMorgan size, we see the following.

Formatted by phased segment size, where red means did not phase or lost phasing and green means phased, we see the following pattern emerge.

Example 3 – 4 Generation – 3 Meiosis (CF-PV)

The next comparison is still Cousin 1, but compared to Child 2.

In this case, three segments lost phasing when compared to older generations. They look like they phased when comparing the cousin to the Parent and Child, but we know they don’t because they don’t match the Grandparent, the next adjacent generation upstream.

Sorted by centiMorgan size, we see the following:

It’s interesting that all of the segments that lost phasing were quite small.

Formatted by segment size where red equals segments that did not phase or lost phasing and green equals segments that did phase.

Example 4 – 4 Generations – 3 Meiosis (DF-SV)

The fourth example utilizes Cousin 2 and Child 1.

In this comparison, no segments lost phasing, so there are no red segments.

Sorted by centiMorgan size, above and phased versus unphased segments, below.

Example 5 – 4 Generations – 3 Meiosis (DF-PV)

This last example utilizes the results of Cousin 2 matching to Child 2.

Again we have a group identified by gold in the last column that looks like a phased group if you’re just looking at the chromosome start and end locations, until you notice that the Grandparent is missing. The Parent and Child do share an overlapping segment mathematically, and it appears that this is part of the Great-grandmother’s segment, but it isn’t because the segment did not pass through the Grandparent. Of course, there is always a small possibility that there is a read issue with the grandparent’s file in this location, but as it stands, the parent and child’s matching segment loses phasing because it does not phase to the grandparent.

Again, three segments lost phasing.

Above, the spreadsheet sorted by centiMorgan value and below, by phased and unphased segments.

Side By Side Comparison

This side by side comparison shows the 5 different comparisons of 4 generations and 3 meiosis.

The pattern looks very similar and is almost identical in terms of the threshold to the original 3 generation study. The 3 gen study thresholds varied from 2.46 to 3.16. The largest 3 generation unphased segments were 3.36, 4.16, 4.75 and 6.05.

This suggests that your results with a 3 generation study are probably nearly as reliable as those of a 4 generation study, although we did see one instance where phasing was lost after three matching generations. However, evaluating that match itself reveals that it was certainly highly questionable, with the Parent carrying more of the “matching” segment to the Child than the Grandparent carried. While it was technically a 3 generation match before losing phasing, it wasn’t a solid match by any means.

With more test data, this could also mean that off-shifted matches or questionable matches are more likely to not phase or fail in higher generations.  I wrote here about methodologies for determining legitimate and false matches.

Discussion

I assembled a summary of the pertinent information from the five different 4 generation charts.

  • As expected, very small segments often did not phase. Around the 3.5 cM region, however, they began to phase, and reliably so. Even so, some larger segments, one as large as 7.13 cM, did not phase.
  • It appears from the small number of segments that lost phasing that most of the time, if a segment does phase with the next generation upstream, it’s a valid segment and will continue to phase upwards.
  • Occasionally, phased segments are not valid and fail a “test” further up the tree. These are the segments that “lost phasing.”
  • The segments that did lose phasing were smaller segments with the largest at 3.68 cM.
  • Phasing, even in small segments, seems to be a relatively good predictor of a segment that is identical by descent, as determined by continuing to match ancestral segments on up the tree.
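If you want to put a number on the size pattern described in these bullets for your own family, a minimal sketch follows. It assumes you have already marked each segment in your spreadsheet as phased or not. The function name `phase_rates`, the 3.5 cM default cutoff and the sample values are all made up for illustration and are not taken from my charts.

```python
def phase_rates(segments, cutoff_cm=3.5):
    """Return (rate_below, rate_at_or_above) for (cM, phased) pairs.

    Each rate is the share of segments on that side of the cutoff that
    phased. A rate is None when no segments fall on that side.
    """
    below = [phased for cm, phased in segments if cm < cutoff_cm]
    at_or_above = [phased for cm, phased in segments if cm >= cutoff_cm]

    def rate(flags):
        # True counts as 1 and False as 0, so sum() counts phased segments.
        return sum(flags) / len(flags) if flags else None

    return rate(below), rate(at_or_above)


# Hypothetical (cM, phased) pairs, NOT data from this study.
toy_segments = [
    (1.0, False), (2.0, False), (3.0, True),
    (4.0, True), (5.0, True), (8.0, False),
]
rate_below, rate_at_or_above = phase_rates(toy_segments)
```

Trying a few different cutoffs on your own segments is a quick way to see where phasing starts to become reliable in your family.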

Of course, additional matches with cousins on the same segments would strengthen the argument as well, with or without phasing. Genetic genealogists are always looking for more information and ways to strengthen our evidence of connections with our cousins and family members. After all, that’s how we positively identify segments attributable to specific ancestors.

Testing Your Own Family

If you have either 3 or 4 individuals in descending generations, you can reproduce these same kinds of results for yourself. It’s actually easy and you can use the charts, methodology and color coding above as a guide.

You will need a relative that matches on the side of the oldest generation. In this case, the relatives were cousins of the great-grandmother. The relative will need to match the other two or three downstream people as well, meaning the direct descendants of the oldest relative. By copying the cousin’s entire match list from the Family Finder chromosome browser, you will be able to delete all matches other than to the people in your family group and compare the results using the same methodology I have shown.

If you don’t have access to the cousin’s match list, you can copy the matches to the cousin from the family member’s match lists and combine them into one spreadsheet.  The outcome is the same, but it’s easier if you have access to the cousin’s matches because you only have to download one file instead of 4.
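For readers who would rather script the comparison than color cells by hand, here is a minimal sketch of the idea. It assumes the cousin’s matches to your family group have already been filtered and typed in as rows of person, chromosome, start, end and cM. The names `Match`, `upstream_depth` and `phases_to_oldest`, the generation labels and the sample positions are all hypothetical. The simple overlap test is a stand-in for the by-eye evaluation I did in the charts, so treat the output as a starting point, not a verdict.

```python
from dataclasses import dataclass

# Youngest to oldest. These labels are assumptions for this sketch.
GENERATIONS = ["Child", "Parent", "Grandparent", "Great-grandmother"]


@dataclass(frozen=True)
class Match:
    person: str       # family member the cousin matches
    chromosome: int
    start: int
    end: int
    cm: float


def overlaps(a: Match, b: Match) -> bool:
    """True when two segments share any stretch of the same chromosome."""
    return a.chromosome == b.chromosome and a.start < b.end and b.start < a.end


def upstream_depth(segment: Match, matches: list[Match]) -> int:
    """Count consecutive older generations that also match on an overlapping segment.

    Counting stops at the first missing generation, because a segment
    cannot skip a generation on its way down the tree.
    """
    position = GENERATIONS.index(segment.person)
    depth = 0
    for older in GENERATIONS[position + 1:]:
        if any(m.person == older and overlaps(segment, m) for m in matches):
            depth += 1
        else:
            break
    return depth


def phases_to_oldest(segment: Match, matches: list[Match]) -> bool:
    """True when every generation above this person matches as well.

    Oldest-generation rows have nothing above them, so they trivially
    return True and should be set aside as "Oldest Gen Only".
    """
    remaining = len(GENERATIONS) - 1 - GENERATIONS.index(segment.person)
    return upstream_depth(segment, matches) == remaining


# Hypothetical rows, NOT data from the charts in this article.
rows = [
    Match("Child", 1, 100, 200, 4.0),
    Match("Parent", 1, 90, 210, 4.5),
    Match("Great-grandmother", 1, 80, 220, 5.0),   # Grandparent is missing
    Match("Child", 2, 500, 900, 8.0),
    Match("Parent", 2, 500, 950, 8.5),
    Match("Grandparent", 2, 450, 950, 9.0),
    Match("Great-grandmother", 2, 400, 1000, 10.0),
]
```

In this toy data, the Child’s chromosome 1 segment matches the Parent but stops at the missing Grandparent, which is the “lost phasing” pattern, while the chromosome 2 segment is carried by all four generations.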

What Can I Do With This Information?

Based on identifying segments as legitimate or false matches, you can label your DNA Master Spreadsheet with the information you’ve gleaned from the process. I’ve done that with just phasing to my mother. Studies such as this give me confidence that the larger phased segments with my mother are legitimate; even some segments below 5 cM and as low as 3.5 cM that DO phase.

These results and this article are NOT a suggestion that people should assume that ALL smaller segment matches are legitimate, because they aren’t. These studies are attempts to figure out HOW to discern which segments are valid and how to go about that process, including small segments. We now have four tools that can be utilized either together or individually:

  • Parental phasing
  • Multi-generation phasing, utilizing the parental phasing tools
  • Cousin Matching to phased segments, which is what we did in this article
  • Family Tree DNA’s Family Phasing which in essence does this sort of matching for you, labeling your matches as to the side they descend from.

From the phasing information we’ve discovered, it appears that most segments below 3.5 cM aren’t going to phase and the majority are NOT legitimate matches.

This is a limited study.  Additional information could change and would certainly add to this information.

More is Better

As always, more data is better. Additional examples of results using this same phasing/cousin matching technique would allow quantification of the reliability of phased results as compared to unphased results. In other words, we already know that phased results are much better and more reliable than unphased results, but how much more, and what are the functional limits of phased results?

There really is no question about the reliability of phased results in regard to larger segments, but additional information would help immensely in understanding how to successfully utilize smaller phased segments, in the range of 3.5 to 8 cM.

I would also suspect that in endogamous families, the thresholds observed here will move, probably with the phasing threshold moving even lower. People from fully endogamous cultures have many legitimate common small segments from sharing ancient ancestors. It would be interesting to observe the effects of endogamy on the observations made here.

I’m not Jewish and don’t have access to Jewish family information, but if several Jewish readers have tested multi-generational family and have a cousin from that side to test against, I would be glad to publish a followup article similar to this one with endogamous information.

It’s so exciting to be on the forefront of this wonderful genetic genealogy frontier together and to be able to experiment and learn.

I hope you use this methodology to explore, have fun and discover new information about your family.

Revisiting AncestryDNA Matches – Methods and Hints

I think all too often we make the presumption about businesses like Ancestry that “our” information that is on their site, in our account, will always be there. That’s not necessarily true – for Ancestry or any other business. Additionally, at Ancestry, being a subscription site, the information may be there, but inaccessible if your subscription lapses.

For a long time, I didn’t keep a spreadsheet of my matches at Ancestry, and when I began, not all of the information available today was available then – so my records are incomplete. Conversely, some of the matches that were there then are gone now. A spreadsheet or other type of record that you keep separately from Ancestry preserves all of your match information.

I was recently working on a particular line, and I couldn’t find some of the DNA Shared Ancestor Hints (aka green leaves) that were previously shown as matches. That’s because they aren’t there anymore. They’ve disappeared.

Granted, Ancestry has been through a few generations of their software and has made changes more than once, but these matches remained through those. However, they are unquestionably gone now. I would never have noticed if I hadn’t been keeping a spreadsheet.

Now, I have a confession to make. At Ancestry, the ONLY matches that I really work with are the DNA matches where I ALSO have a leaf hint – the Shared Ancestor Hint Matches.

ancestry-ancestor-hint

That’s not to say that this approach is right or wrong, but it’s what works best for me.  The only real exception is close matches, 3rd cousins or closer.  Those I “should” be able to unravel.

I’m not interested in trying to unravel the rest. About 50% of my matches have trees, and those trees do the work for me, telling me the common ancestor we match if one can be identified. For me, those 367 green Ancestor Hints DNA+tree-matches are the most productive.

So I’m not interested in utilizing the third party tools that download all of my Ancestry matches. I also don’t really want all of that information either – just certain fields.

Adding the match to my spreadsheet gives me the opportunity to review the match information and assures that I don’t get in a hurry and skim over or skip something.

So, when some of my matches came up missing, I knew it because I HAVE the spreadsheet, and I still have their information because I entered it on the spreadsheet.

Here’s an example. In a chart where I worked with the descendants of George Dodson, I realized that three of my sixteen matches (19%) to descendants of George Dodson are gone. That’s really not trivial.

ancestry-match-information

If you’re wondering how I could not notice that my matches dropped, I asked the same question. After all, Ancestry clearly shows how many Shared Ancestor hints I have.

Ancestry matches periodically have a habit of coming and going, so I’ve never been too concerned about a drop of 1 in the total matches – especially given adoptee shadow trees and such. Generally, my match numbers increase, slowly. What I think has actually been happening is that when I appear to have 3 new matches, I have really lost 2 and gained 5, so the net looks like 3 and I never realized what was happening.

ancestry-dna-main-page

Because I’m only interested in the Shared Ancestor Hint matches, that’s also the only number I monitor – and it’s easy because it’s dead center in the middle of my page.
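The lost-two-gained-five scenario is easy to catch if you keep dated copies of your match names. Here is a minimal sketch that compares two such lists. The function name `compare_snapshots` and the user names are hypothetical, and the names would have to come from your own spreadsheet, since Ancestry does not provide a match download.

```python
def compare_snapshots(earlier, current):
    """Return (lost, gained, net_change) between two lists of match names."""
    earlier_names, current_names = set(earlier), set(current)
    lost = earlier_names - current_names      # in the old list, gone now
    gained = current_names - earlier_names    # new since the old list
    return lost, gained, len(gained) - len(lost)


# Hypothetical user names, for illustration only.
earlier = ["match_a", "match_b", "match_c", "match_d"]
current = ["match_c", "match_d", "match_e", "match_f",
           "match_g", "match_h", "match_i"]

lost, gained, net_change = compare_snapshots(earlier, current)
```

The net change here is 3, which is all the counter on the main page would show, while the two lost matches only surface when the names themselves are compared. Keep in mind that people change their user names, so a “lost” name deserves a second look before you assume the match is gone.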

When I realized I have missing matches, I also realized that I had better go back and enter the information that is missing in my spreadsheet for my early matches – such as the total segment match size, the number of matching segments and the confidence level. That’s the best we can do without a chromosome browser. It would be so nice if Ancestry provided a match download, like the other vendors do, so we don’t have to create this spreadsheet manually.

The silk purse in this sow’s ear is that in the process of reviewing my Ancestry matches, I learned some things I didn’t know.

Why Revisit Your Matches?

So, let’s take a look at why it’s a good idea to go back and revisit your Ancestry Shared Ancestor Hints from time to time.

  • People change their user name.
  • People change their ancestors.
  • You may now share more than one ancestral line, where you didn’t originally. I’ve had this happen several times.
  • People change their tree from public to private.
  • People change their tree from private to public.
  • Your matches may not be there later.
  • Circles come, and Circles go, and come, and go, and come and go…
  • If you contacted someone in the past about a private tree, requesting access, they may have never replied to you (or you didn’t receive their correspondence,) but they may have granted you access to their tree. Who knew!!!
  • Check, and recheck Shared Surnames, because trees change. You can see the Shared Surnames in the box directly below the pedigree lineage to the common ancestor for you and your match.

ancestry-shared-surnames

  • Ancestry sometimes changes relationship ranges. For example, all of the range formerly titled “Distant Cousin” appears to be 5th – 8th cousins now.
  • When people have private trees, you’re not entirely out of luck. You can utilize the Shared Matches function to see which matches you and they both match that have leaf hints. Originally, there were seldom enough people in the data base to make this worthwhile, but now I can tell which family line they match for about half of my Shared Ancestor Hint matches (leaf matches) that are private.

This is also my first step if I do happen to be working with someone who doesn’t have a tree posted or linked to their DNA.

Click on the “View Match” link on your main match page for the match you want to see, then on the “Shared Matches” in the middle of the gray bar.

ancestry-shared-matches

The hints that you are looking for in the shared matches are those leaf hints, because you can look at that person’s tree and see your common ancestor with them, which should (might, may) provide a hint as to why the person you match is also matching them. It’s not foolproof, but it’s a hint.

ancestry-shared-matches-leaf

Of course, if you find 3 or 4 of those leaf hints, all pointing to the same ancestral couple, that’s a mega-hint.

Unfortunately, that’s the best sleuthing we can do for private matches with no tree to view and no chromosome browser.

  • You may have forgotten to record a match, or made an error.
  • Take the opportunity to make a note on your Ancestry match. The “Add Note” button is just above the “Pedigree and Surnames” button and just below the DNA Circle Connection.

ancestry-note

On your main match page, you can then click on the little note icon and see what you’ve recorded – which is an easy way to view your common ancestor with a match without having to click through to their match page. When the person has a private tree, I enter the day that I sent a message, along with any common tree leaf hint shared matches that might indicate a common ancestor.

ancestry-note-n-match-page

Tracked Information

Part of the information I track in my spreadsheet is provided directly by Ancestry, and some is not. However, the matching lines back to a common ancestor make other information easy to retrieve. The spreadsheet headings are shown below.

ancestry-spreadsheet-headings

I utilize the following columns, thus:

  • Name – Ancestry’s user name for the match. If their account is handled by someone else, I enter the information as “C. T. by johndoe.”
  • Est Relationship – Ancestry’s estimated relationship range of the match.
  • Generation – how many generations from me through the common ancestor with my match. Hint – it’s always two more than the relationship under the common ancestor. So if the identification of the common ancestor says 5th great-grandfather, then the person (or couple) is 7 generations back from me.
  • Ancestor – the common ancestor or couple with the match.
  • Child – the child of that couple that the match descends from.
  • Relationship – my relationship to the match. This information is available in the box showing the match in the shared ancestor hint. In this case, EHVannoy (below) and I are third cousins.
  • Common Lines – meaning whether we have additional lines that are NOT shown in Ancestor Hints. You’ll need to look through the Shared Surnames below the Shared Ancestor Hint box. I often say things in this field like, “probably Campbell” or “possibly Anderson” when it seems likely because either I’ve hit a dead end, or the family is found in the same geographic location.

ancestry-common-lines

  • Shared cMs – available in the little “i” to the right of the Confidence bar, shown below.

ancestry-shared-cms

Click on the “i” to show the amount of shared DNA, and the number of shared segments.

  • Confidence – the confidence level shown, above.
  • MtDNA – whether or not this person is a direct mitochondrial line descendant from the female of the ancestral couple. If they are, or if they aren’t but their father is, I note it as such.
  • Y DNA – whether this person (or, if the match is female, her father or grandfather) is a direct Y line descendant of this couple.

I’m sure you’ve figured out by now that if they are mtDNA or Y descendants, and I don’t already have that haplogroup information, I’m going to be contacting them and asking if they have taken that test at Family Tree DNA. If they have not, I’m going to ask if they would be willing. And yes, I’ll probably be offering to pay for it too. It’s worth it to me to obtain that information which can’t be otherwise obtained.

  • Comments – where I record anything else I might have to say – like their tree isn’t displaying correctly, or there is an error in their tree, or they contacted me via e-mail, etc. I may make these same types of notes in the notes field on the match at Ancestry.
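If you keep this record as a CSV file instead of typing into a spreadsheet program, the columns described in this section can be sketched as follows. The column names follow my list, but the helper names `generations_back` and `to_csv` and the sample row are hypothetical, so adjust them to suit your own records.

```python
import csv
import io

# Column headings as described in this section.
COLUMNS = [
    "Name", "Est Relationship", "Generation", "Ancestor", "Child",
    "Relationship", "Common Lines", "Shared cMs", "Confidence",
    "MtDNA", "Y DNA", "Comments",
]


def generations_back(greats: int) -> int:
    """Generations from you to the common ancestor.

    'greats' is the number in front of great-grandparent in the hint,
    so a 5th great-grandfather is 5 + 2 = 7 generations back.
    """
    return greats + 2


def to_csv(rows) -> str:
    """Write match records (dicts keyed by COLUMNS) to CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# A made-up record, for illustration only.
example_row = {
    "Name": "C. T. by johndoe",
    "Est Relationship": "5th - 8th cousin",
    "Generation": generations_back(5),
    "Ancestor": "Example Ancestor and wife",
    "Child": "Example Child",
    "Relationship": "6th cousin",
    "Common Lines": "possibly Anderson",
    "Shared cMs": 6.0,
    "Confidence": "Moderate",
    "MtDNA": "No",
    "Y DNA": "No",
    "Comments": "Sent message via profile page",
}
csv_text = to_csv([example_row])
```

A plain CSV file opens in any spreadsheet program, which keeps the record independent of Ancestry and of any one piece of software.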

Musings

It’s interesting that at least one of my matches that was removed when Ancestry introduced their Timber phasing is back now.

However, and this is the bad news, 82 previous leaf hint matches are now gone. Some disappeared in the adjustment done back in May 2016, but not all disappearances can be attributed to that house-cleaning. I noted the matches that disappeared at that time.

If you look at my current 367 matches and add 82, that means I’ve had a total of 449 Ancestor Hint matches since the Timber introduction – not counting the matches removed because of Timber. That means I’ve lost 18% of my matches since Timber, or said another way, if those 82 remained, I’d have 22% more Ancestor Hint matches than I have today.
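The two percentages use different denominators, which is why they differ. A quick check of the arithmetic, using only the counts given in this section (the variable names are mine):

```python
current_matches = 367   # Ancestor Hint matches today
lost_matches = 82       # leaf hint matches that are now gone

total_ever = current_matches + lost_matches      # every match since Timber
share_lost = lost_matches / total_ever           # lost, out of all I ever had
extra_if_kept = lost_matches / current_matches   # lost, relative to today

print(f"Total since Timber: {total_ever}")
print(f"Lost: {share_lost:.0%} of all matches")
print(f"Would have {extra_if_kept:.0%} more than today")
```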

Suffice it to say I wish I had more information about the matches that are gone now. I’d also like to know why I lost them. It’s not that they have private trees, they are simply gone.

As you may recall, I took the Ancestry V2 test when it became available to compare against the V1 version of the Ancestry test that I had taken originally.

ancestry-v2-match

It’s interesting that my own V2 second test doesn’t show as a shared match in several instances, example above and below.

ancestry-no-v2-match

It should show, since I’m my own “identical twin.” The fact that it does not show in several individuals’ shared matches with my V1 kit indicates that my match to that individual (E.B. in this case) was on the 300,000 or so SNPs that Ancestry replaced on their V2 chip with other locations that are more medically friendly. All or part of that V1 match was on the now obsolete portion of the V1 chip, so my V2 test, on the newer chip, isn’t shown as a match. That’s 44% of the DNA that was available for matching on the V1 chip that isn’t on the V2 chip now.

My smallest match was 6cM. Based on the original white paper, Ancestry was utilizing 5cM for matches. Apparently that changed at some point. Frankly, without a chromosome browser, I’m fine with 6cM. There’s nothing I can do with that information beyond tree matching anyway – and Ancestry already does tree matching for us.

Frustrations and Hints

Aside from the lack of a chromosome browser, which is a perpetual thorn in my side, I have two really big frustrations with Ancestry’s DNA implementation.

My first frustration is the search function, or lack thereof. If I turn up bald one day, this is why.

Here’s the search function for DNA matches.

ancestry-search

I can’t search for a user ID that I’ve recorded in my notes that I know matches me.

I can’t narrow searches beyond just a surname. For example, I’d like to search for that surname ONLY in trees with Shared Ancestor Hints, or maybe only in trees without hints, or only people in my matches with that surname, or only people who have this surname in their direct line, not just someplace in their tree. Just try searching for the surname Smith and you’ll get an idea of the magnitude of the problem. Not to mention that Ancestry searches do not reliably return the correct or even the same information. Ancestry lives and dies on searching, so I know darned good and well they can do better. I don’t know of any way around this search issue, so if you do, PLEASE DO TELL!!!

My second frustration is the messaging system, but I do have a couple hints for you to circumvent this issue.

I have discovered that there are two ways to contact your matches, and those two methodologies are by far NOT equal.

On your DNA match page, there is a green “Send Message” button in the upper right. Don’t use this button.

ancestry-messaging-green-button

The problem with using this button is that Ancestry does NOT send the recipient an e-mail telling them they received a message. Users have to both know and remember to look for the little grey envelope at the top of their task bar by their user name. Most don’t. It’s tiny and many people have no idea it’s there, especially if they are receiving e-mails when other people contact them through Ancestry. They assume that they’ll receive an e-mail anytime anyone wants to contact them. Reasonable, but not true.

I’m embarrassed to tell you that by the time I realized that envelope was there, I had over 100 messages waiting for me, all from people who thought I was willfully disregarding them, and I wasn’t.

So, if you use the green button, you’ve sent the message, but they have no idea they received a message. And you’re waiting, with your hopes dropping every day, or every hour if it’s an important match.

If you click on your little gray envelope, you’ll see any messages you’ve sent or received through the green contact button on the DNA page.

You can remedy this notification problem by utilizing the regular Ancestry contact button. Click on the user name beside their member profile on this same DNA page. In this case, EHVannoy.

You’ll then see their profile page, with a tan “Contact EHVannoy” button, EHVannoy being the user name.

ancestry-messaging-brown-button

Use this tan contact button to contact your matches, because it generates an e-mail. However, the tan button does NOT add the message to your gray envelope, and I don’t know of any way to track messages sent through the tan button. I note in my spreadsheet the date I send messages and a summary of the content. I also put this information in the Ancestry note field.

What’s Next?

Now, I know what you’re going to be doing next. You’re going to be going to look at your grey envelope and resend all of those messages using the tan button. There is an easy way to do this.

First, click on the grey envelope, then on the “Sent” box on the left hand side. You will then see all the messages you’ve sent.

ancestry-sent

Then, just click on the user name of any of your matches and that will take you to their profile page with the tan button!!! You can even copy/paste your original message to them. Do be sure to check your inbox to be sure they didn’t answer before you send them a new message.

ancestry-sent-to-profile

Hopefully some of the people who didn’t answer when you sent green button messages will answer with tan button messages. Fingers crossed!!!

23andMe’s New Ancestry Composition (Ethnicity) Chromosome Segments

I was excited to see 23andMe’s latest feature that provides customers with Ancestry Composition (ethnicity) chromosome segment information by location. This means I can compare my triangulation groups to these segments and potentially identify which ancestors’ DNA that I inherited carries which ethnicity – right?? Another potential way to help discern whether I should ask Santa for lederhosen or a kilt?

Not so fast…

Theoretically yes, but as it turns out, after working with the results, this tool doesn’t fulfill its potential and has some very significant issues, or maybe this new tool just unveiled underlying issues.

Rats, I guess Santa is off the hook.

Let’s take a look and step through the process.

Ancestry Composition Chromosome Painting

To see your Ancestry Composition ethnicity chromosome painting, sign into 23andMe, then go to the Reports tab at the top of your page and click on Ancestry. Please note that you can click on any of the graphics in this article to enlarge.

23andme-eth-seg-1

Then click on Ancestry Composition, which shows you the following:

23andme-eth-seg-2

Scrolling down shows you your chromosomes, painted with your ethnicity. This isn’t new and it’s a great visual.

You may note that 23andMe paints both “sides” of each chromosome separately, the side you received from your mother and the side you received from your father. However, there is no way to determine which is which, and they are not necessarily the same side on each chromosome.

If one or both of your parents tested at 23andMe, you can connect your parents to your results and you can then see which ethnicity you received from which parent.

Let’s work through an example.

23andme-eth-seg-3

This person, we’ll call her Jasmine, received two segments of Native ancestry, one on chromosome 1 and one on chromosome 2, both on the first (top) strands or copies. She also received one segment of African on DNA strand (copy) 1 of chromosome 7.

Caveat

Words of warning.

JUST BECAUSE THESE ETHNICITIES APPEAR ON THE SAME STRANDS OF DIFFERENT CHROMOSOMES, STRAND ONE IN THIS CASE, DOES NOT MEAN THEY ARE INHERITED FROM THE SAME PARENT.

Each chromosome recombines separately and without a parent to compare to, there is no way to know which strand is mother’s or father’s on any chromosome. And figuring out which strand is which for one chromosome does NOT mean it’s the same for other chromosomes.

In fact, Jasmine’s mother has tested, and she has NO African on chromosome 7. However, Jasmine and her mother both have Native American on chromosomes 1 and 2 in the same location, so we know absolutely that Jasmine’s strand 1 on chromosome 7 is not from the same parent as strand 1 on chromosome 1 and 2, because Jasmine’s mother doesn’t have any African DNA in that location.

If you’re a seasoned 23andMe user and you’re saying to yourself, “That’s not right, the chromosome sides should be aligned if a parent tests,” you’re right, or at least that’s what we’ve all thought. Keep reading.

Let’s dig a bit further.

Connecting Up

23andMe encourages everyone to connect their parents, if their parents have tested.

Jasmine’s mother has tested and is connected to Jasmine at 23andMe.

23andme-eth-seg-4

Even though the button says “Connect Mother,” which makes it appear that Jasmine’s mother isn’t connected, she is. Clicking on Jasmine’s “Connect Mother” button shows the following:

23andme-eth-seg-5

Furthermore, if the parent isn’t connected, you don’t see any parental side ethnicity breakdown – and we clearly see those results for Jasmine.  Below is an example of the same page of someone whose parents aren’t connected – and you can see the verbiage at the bottom saying that a parent must be connected to see how much ancestry composition was inherited from each parent.

23andme-eth-seg-not-connect

If a child is connected to at least one parent, 23andMe uses that parent’s test to tell the child which pieces of their ethnicity they inherited from which side, shown for Jasmine, below.

23andme-eth-seg-6

In this case, the mother is connected to Jasmine and the father’s ethnicity results are imputed by subtracting the results where Jasmine matches her mother. The balance of Jasmine’s DNA ethnicity results that don’t match her mother in that location are clearly from her father.

23andMe may sort the results into the correct buckets, but they do not correctly rearrange the chromosome “copies” or “sides” on the chromosome browser display based on the parents’ DNA, as seen from the African example on chromosome 7. Either that, or the ethnicity phasing is inaccurate, or both.

You can see that 23andMe tells Jasmine that all of her Native is from her mother’s side, which is correct.

23andMe tells Jasmine that part of her North African and Sub-Saharan African are from her mother, but some North African is also from her father. You can see Jasmine’s African on her chromosome 7, below.

23andme-eth-seg-7

There is no African on Jasmine’s mother’s chromosome 7, below.

23andme-eth-seg-8

So if African exists on chromosome 7, it MUST come from Jasmine’s father’s side. Therefore, side one of chromosome 7 cannot be Jasmine’s mother’s side, because that’s where Jasmine’s African resides.

This indicates that either the results are incorrect, or the “sides” showing have not been corrected or realigned by 23andMe after parental ethnicity phasing, or both.

Here’s another example. Jasmine shows Middle East and North Africa on chromosomes 12 and 13 on sides one and two, respectively.

23andme-eth-seg-9

Jasmine’s mother shows Middle East and North Africa on chromosome 14, only, with none showing on chromosome 12 or 13.

23andme-eth-seg-10

Yet, 23andMe shows Jasmine receiving Middle East and North African DNA from her mother.

23andme-eth-seg-11

Jasmine is also shown as receiving Sub-Saharan African and West African from her mother, but Jasmine’s mother has no Sub-Saharan or West African, at all.

Interestingly, when you highlight both West African and Sub-Saharan African, shown below, it highlights the same segment of Jasmine’s DNA, so apparently these are not different categories, but subsets of each other, at least in this case, and reflect the same segment.

23andme-eth-seg-12

23andme-eth-seg-13

Jasmine’s mother shows this region of chromosome 7 to be “European” with no further breakdown.

Clearly Jasmine’s sides 1 and 2 have not been consistently assigned to her mother, because Jasmine’s African shows on both sides 1 and 2 of chromosomes 12 and 13 and Jasmine’s mother has no African on either of those chromosomes, so those segments should be assigned consistently to Jasmine’s father’s side. Based on Jasmine’s match to her mother on chromosome 1, side 1, Jasmine’s father’s “copy” should be Jasmine’s side 2. This tool is not functioning correctly.

Jasmine’s father is deceased, so there is no way to test him.

The information provided by 23andMe contradicts itself.

Either the ethnicity assignment itself or the parental ethnicity phasing is inaccurate, or both. Additionally, we now know that the chromosome “sides,” meaning “copies” are inaccurately displayed, even when one parent’s DNA is available and connected, and the sides could and should be portrayed accurately.

This discrepancy has to be evident to 23andMe, if they are checking for consistency in assigning child to parent segments.  You can’t assign a child’s segment to a parent who doesn’t carry any of that ethnicity in a common location.  That situation should result in a big red neon sign flashing “STOP” in quality assurance.  Inaccurate results should never be delivered to testers, especially when there are easy ways to determine that something isn’t right.
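The consistency check described above can be sketched in a few lines of Python. This is only a minimal illustration, not 23andMe's actual process: the `Segment` structure, the function names, the simplified "African" label and every position below are my own assumptions, made up to mirror the chromosome 7 and 14 situation.

```python
from typing import List, NamedTuple


class Segment(NamedTuple):
    ancestry: str      # ethnicity label (simplified here for illustration)
    chromosome: str
    start: int         # start position
    end: int           # end position


def overlaps(a: Segment, b: Segment) -> bool:
    """True when two segments sit on the same chromosome and share any positions."""
    return a.chromosome == b.chromosome and a.start <= b.end and b.start <= a.end


def unsupported(child: List[Segment], parent: List[Segment], ancestry: str) -> List[Segment]:
    """Child segments of `ancestry` that no parent segment of the same ancestry overlaps."""
    parent_hits = [p for p in parent if p.ancestry == ancestry]
    return [
        c for c in child
        if c.ancestry == ancestry and not any(overlaps(c, p) for p in parent_hits)
    ]


# Hypothetical positions: the child has African on chromosome 7,
# while the parent's only African is on chromosome 14.
child_segments = [
    Segment("African", "7", 10_000_000, 20_000_000),
    Segment("European", "14", 30_000_000, 60_000_000),
]
mother_segments = [
    Segment("European", "7", 1, 150_000_000),
    Segment("African", "14", 40_000_000, 50_000_000),
]

flagged = unsupported(child_segments, mother_segments, "African")
for seg in flagged:
    print(f"chr{seg.chromosome} {seg.start}-{seg.end}: cannot have come from this parent")
```

Any segment this flags cannot have been inherited from the tested parent, so attributing it to that parent should fail quality assurance. A real check would also need to handle nested categories, such as West African within Sub-Saharan African, which this sketch ignores by matching labels exactly.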

The New Feature – Ethnicity Segments

Like I said, I was initially quite excited about this new feature, at least until I did the analysis. Now, I’m not excited at all, because if the results are flawed, so is the underlying segment data.

My original intention was to download the ethnicity segment information into my master spreadsheet so that I could potentially match the ethnicity segments against ancestors when I’ve identified an ancestral segment as belonging to a particular ancestral line.

This would have been an absolutely wonderful benefit.

Let’s walk through these steps so you can find your results and do your own analysis.

When you are on the Ancestry Composition page, you will be, by default, on the Summary page.

23andme-eth-seg-14

Click on the Scientific Details tab, at the top, and scroll down to the bottom of the page where you will see the following:

23andme-eth-seg-15

You will be able to select a confidence level, ranging from 50% to 90%, where 50% is speculative and 90% is the highest confidence. Hint – at the highest confidence level, many of the areas broken out in the speculative level are rolled up into general regions, like “European.”  Default is 50%.

23andme-eth-seg-16

Click on download raw data and you can then open or save a .csv file. I suggest then saving that file as an Excel file so you can do some comparisons without losing features like color.

In my case, I saved a 50% confidence file and a 90% confidence file to compare to each other.
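If you would rather compare the two downloads with a script than by eye, here is a rough Python sketch. I am assuming the file has the column headers Ancestry, Copy, Chromosome, Start Point and End Point, so check them against your own download. The sample rows are invented for illustration (only the Scandinavian positions come from my chromosome 1), and the function names are mine.

```python
import csv
import io
from collections import defaultdict


def load_segments(handle):
    """Read ethnicity segments from an open CSV file (column names are assumed)."""
    segments = []
    for row in csv.DictReader(handle):
        segments.append({
            "ancestry": row["Ancestry"].strip(),
            "copy": row["Copy"].strip(),
            "chromosome": row["Chromosome"].strip(),
            "start": int(row["Start Point"]),
            "end": int(row["End Point"]),
        })
    return segments


def labels_by_copy(segments):
    """Collect the set of ethnicity labels seen on each (chromosome, copy) pair."""
    labels = defaultdict(set)
    for seg in segments:
        labels[(seg["chromosome"], seg["copy"])].add(seg["ancestry"])
    return labels


# Invented sample data standing in for the 50% and 90% confidence downloads.
SAMPLE_50 = """Ancestry,Copy,Chromosome,Start Point,End Point
European,1,chr1,1,240000000
Northwestern European,1,chr1,1,240000000
Scandinavian,1,chr1,79380466,230560900
"""
SAMPLE_90 = """Ancestry,Copy,Chromosome,Start Point,End Point
European,1,chr1,1,240000000
"""

labels_50 = labels_by_copy(load_segments(io.StringIO(SAMPLE_50)))
labels_90 = labels_by_copy(load_segments(io.StringIO(SAMPLE_90)))

# Labels that appear at 50% confidence but are rolled up at 90% confidence.
for key in sorted(labels_50):
    rolled_up = labels_50[key] - labels_90.get(key, set())
    print(key, "only at 50%:", sorted(rolled_up))
```

With your own files, replace the `io.StringIO(...)` calls with `open("your_file.csv", newline="")`.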

I began my analysis with both strands of chromosome 1:

Strand 1 was easy.  (Click on graphic to enlarge.)

23andme-eth-seg-17

At the 50% confidence level, on the left, three segments are identified, but when you really look at the start and end positions, rows one and two overlap entirely. Looking back at the chromosome browser painting, this looks to be because that segment will show up in both of those categories, so this isn’t an either-or situation. Row 3 shows Scandinavian beginning at 79,380,466 and continuing through 230,560,900, which is a partial embedded segment of row 2.

At the 90% confidence level, on the right, above, this entire segment, meaning all of chromosome 1 on side 1, is simply called European.

You can see how this might get complex very quickly when trying to utilize this information in a Master DNA Spreadsheet with your matches, especially since individual segments can have 2 or 3 different labels.  However, I’d love to know where my mystery Scandinavian is coming from – assuming it’s real.
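To see why one stretch of DNA can carry 2 or 3 labels, it helps to classify how any two rows relate to each other. The sketch below is my own illustration. The Scandinavian start and end positions come from row 3 above, but the positions for rows 1 and 2 are placeholders, since only the fact that they overlap entirely matters here.

```python
def relate(a, b):
    """Describe how segment a relates to segment b.

    Each segment is a (start, end) tuple on the same chromosome copy.
    """
    a_start, a_end = a
    b_start, b_end = b
    if a_end < b_start or b_end < a_start:
        return "disjoint"
    if a == b:
        return "identical"
    if a_start <= b_start and b_end <= a_end:
        return "a contains b"
    if b_start <= a_start and a_end <= b_end:
        return "a within b"
    return "partial overlap"


row1 = (1, 240_000_000)              # placeholder positions
row2 = (1, 240_000_000)              # placeholder positions, same as row 1
row3 = (79_380_466, 230_560_900)     # Scandinavian segment from the 50% file

print(relate(row1, row2))   # identical
print(relate(row2, row3))   # a contains b
print(relate(row3, row2))   # a within b
```

Rows that come back as identical or contained are the same DNA under a broader and a narrower label, not additional DNA, which is worth knowing before loading them into a spreadsheet as separate segments.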

Now, let’s look at strand 2 of chromosome one. It’s a little more complex.

23andme-eth-seg-18

I’ve tried to color code identical, or partially-overlapping segments.

The red, green and apricot segments overlap or partially overlap at the 50% level, on the left, indicating that they show up in different categories.

The red segments are partially the same, with some overlapping, but are grouped differently within Europe.

The green Native/East Asian segments at the 90% level are interrupted by the blue unassigned segments in the middle of the green segments, while at the 50% confidence level, they remain contiguous.

All of the segment start and end positions change, even if the categories stay the same or generally the same. The grey example at the bottom is the easiest to see – the category changes to the more general “European” at the 90% level and the start position is slightly different.

Jasmine and Her Mother

As one last example, let’s look at the segments at the 50% confidence level, which should be the least restrictive, that we were comparing when discussing Jasmine and her mother.

You can see, below, that Jasmine’s Native portions of chromosomes 1 and 2 are either equal to or a subset of her mother’s Native portions, so these match accurately and are shown in green.

This tells us that Jasmine’s mother’s side of chromosomes 1 and 2 is Jasmine’s “copy 1” and given that we can identify Jasmine’s mother’s DNA, all of Jasmine’s “copy 1” should now be displayed as her mother’s DNA, but it isn’t.

23andme-eth-seg-19

On chromosomes 7 and 12, where Jasmine’s copy 1 shows African DNA, her mother has none. All African DNA segments are shown in red, above.

Furthermore, 23andMe attributes at least some portion of Jasmine’s African to Jasmine’s mother, but Jasmine’s mother’s only African DNA appears on chromosome 14, a location where Jasmine has none. There is no common African segment or segments between Jasmine and her mother, in spite of the fact that 23andMe indicates that Jasmine inherited part of her African DNA from her mother.  It’s true that Jasmine and her mother both carry African DNA, but not on any of the same segments, so Jasmine did not inherit her mother’s African DNA.  Jasmine’s African DNA had to have come from her father – and that’s evident if you compare Jasmine and her mother’s segment data.

Where Jasmine has African DNA segments, above, I’ve shown her mother’s corresponding DNA segments on both strands for comparison. I have not colored these segments. Conversely, where Jasmine’s mother has African, on chromosome 14, I have shown Jasmine’s corresponding DNA segments covering that segment.  There are no matches.

Clearly Jasmine did not inherit her African segments from her mother, or the segments have been incorrectly assigned as African or European, or multiple problems exist.

Summary

I initially thought the Ancestry Composition segments were a great addition to the genealogist’s toolset, but unfortunately, they have proven to be otherwise, highlighting deficiencies in more than one of the following areas:

  • Potentially, the ancestry composition ethnicity breakdown itself.  Is the underlying ethnicity assignment incorrect?  Even if it is, that would not explain the balance of the issues we encountered.
  • The chromosome “sides” or “copy” shown after the parental phasing – in other words, the child’s chromosome copies can be assigned to a particular parent with either or both parents’ DNA. Therefore, after parental phasing, all of the same parent’s DNA should consistently be assigned to either copy 1 or copy 2 for the child on all of their chromosomes.  It isn’t.
  • The child’s ethnicity source (parent) assignment based on the parent’s or parents’ ethnicity assignment(s).  Hence, the African segment assignment issues above.
  • The ethnicity phasing itself.  The assigning of the source of Jasmine’s African DNA to her mother when they share no common African segments.  Clearly this is incorrect, calling into question the validity of the rest of the parental ethnicity phasing.

Unfortunately, we really don’t have adequate tools to determine exactly where the problem or problems lie, but problems clearly do exist. This is very disappointing.

As a result, I won’t be adding this information to my Master DNA spreadsheet, and I’m surely glad I took the time to do the analysis BEFORE I copied the segment data into my spreadsheet.  In my excitement, I almost skipped the analysis step, trusting that 23andMe had this right.

All ethnicity results need to be taken with a large grain of salt, especially at the intra-continent level, because the reference populations and technology just haven’t been perfected.  It’s very difficult to discern between countries and regions of Europe, for example.  I discussed this in the article, “Ethnicity Testing – A Conundrum.”

However, it appears that adding parental phasing on top means that instead of a grain of salt, we’re looking at the entire shaker, at least at 23andMe – even at the continent level – in this case, Africa, which should be easily discernable from European. Parental phasing by its very nature should be able to help refine our results, not make them less reliable.

Is this new segment information just showing us the problems with the original ethnicity information?  I hate to even think about this or ask these difficult questions, but we must, because testers often rely on minority (to them) ethnicity admixture information to help confirm the ethnicity of distant ancestors. Are the display tools or 23andMe’s programs not working correctly, or is there a deeper problem, or both?

I think I just received a big lump of coal, or maybe a chunk of salt, in my stocking for Christmas.

Bah, humbug.

New Family Tree DNA Holiday Coupons – And Why the Big Y

holiday-lights

Each week during the holiday season, Family Tree DNA issues new coupons on Monday. These coupons are redeemable on top of the holiday sale prices, already in effect.

As I’ll be doing each week, I’ve listed my coupons available to redeem from kits that I manage.

But first, I want to talk briefly about one particular type of DNA that is tested, and why one might want to order that particular test.

I’ve seen questions this past week about the Big Y test, so let’s talk about this test today.

The Big Y Test

The questions I’ve seen recently about the Big Y mostly revolve around why the test isn’t listed among the sale prices shown on the Family Tree DNA main page.

The Big Y test is not an entry level test. The tests shown on the Family Tree DNA main page are entry level and can be ordered by anyone, at least so long as the Y DNA tests are ordered for males. (Females don’t have a Y chromosome, so Y tests won’t work for them.)

The Big Y test is an upgrade for a male who has already taken the regular 37, 67 or 111 STR (short tandem repeat) marker test. For those who are unfamiliar, STR markers are used in a genealogically relevant timeframe to match other men to search for a common recent ancestor and are the type of markers used for 37, 67 and 111 marker tests.

SNPs (single nucleotide polymorphisms) are used to determine haplogroups, which reflect deep ancestry and reach significantly further back in time.

Haplogroups are predicted for each participant based on the STR test results, and Family Tree DNA’s prediction routines are very accurate, but the haplogroup can only be confirmed by SNP testing. These two tests are testing different types of DNA mutations. I wrote about the difference here.

Different SNPs are tested to confirm different haplogroups, so you must have your STR results back with the prediction before you can order SNP tests.

The Big Y is the granddaddy of SNP testing, because it doesn’t directly test each SNP location, and there are thousands, but scans virtually the entire Y chromosome to cover in essence all known SNPs. Better yet, the Big Y looks for previously unknown or unnamed SNPs. In other words, this test is a test of discovery, not just a test of confirmation.

Many SNPs are either unknown or as yet unnamed and unplaced on the haplotree, meaning the Y DNA tree of mankind for the Y chromosome. The only way we discover new SNPs is to run a test of discovery. Hence, the Big Y.

It’s fun to be on the frontier of this wonderfully personal science.

Applying the Big Y to Genealogy

In addition to defining and confirming the haplogroup, the Big Y test can be immensely informative in terms of ancestral roots. For example, we know that our Lentz line, found in Germany in the 1600s, matches the contemporary results of Burzyan Bashkir men, descendants of the Yamnaya. I wrote about this here, near the end of the article.

Even more amazing, we then discovered that our Lentz line actually shares mutations with ancient DNA recovered from Yamnaya culture burials from 3500 years ago from along the Volga River. You can read about that here, near the end of the article. This discovery, of course, could never have been made if the Big Y test had not been taken, and it was made by working with the haplogroup project administrators. I am eternally grateful to Dr. Sergey Malyshev for this discovery and the following tree documenting our genetic lineage.

JakobLenz Malyshev chart

Our family heritage now extends back into Russia, 3500 years ago, instead of stopping in Germany, 400 or 500 years ago. This huge historical leap could NEVER have been made without the Big Y test in conjunction with the projects and administrators at Family Tree DNA.

And I must say, I’m incredibly glad we didn’t wait to order this test, because Mr. Lentz, my cousin who tested, died unexpectedly just a couple of months later. When his daughter informed me of his death, she expressed her gratitude for the test and the articles, and shared with me that he had taken both articles to Staples and had them printed and bound as gifts for family members this Christmas.

These gifts will be quite bittersweet for those family members, but his DNA legacy lives on, just as the DNA of our ancestors does inside each and every one of us.  He gave all Lentz descendants an incredible gift.

Purchasing the Big Y

If you or a kit you manage has already tested to 37 markers, you can order the Big Y test as an upgrade.  If they haven’t yet tested to 37 markers, you’ll need to order that test or upgrade first.

Every kit has an upgrade link that you can see in two places on your personal page.

upgrade-link

Click either of these links and you’ll be able to see which tests are available for you to purchase including upgrades.

upgrades-available

The sale prices are reflected on this page. Just click on the Big Y or whatever tests you wish to purchase.

If you have a coupon code, type it into this field where I’ve typed “Coupon Code” and then click on Apply.

upgrade-big-y-checkout

It’s worth noting that there are a couple $100 off coupons for the Big Y and some $75s and $50s too.

Coupons

Now, for this week’s list of coupons. As always, first come, first served. These coupons expire on 12-4-2016 unless otherwise noted. Dates before 12-4 are a result of bonus coupons issued during the past week as coupons were used.

Please list any coupons you wish to share in the comments to this article.

Please note that these coupons, with the exception of the Big Y test, are for new kit orders only, not upgrades.

Remember to be cognizant of the number 1 versus the capital letter I, and the number zero versus the capital letter O.

Click here to redeem coupon codes below or to see what coupon codes await you on your account!!! Enjoy!

Coupon # Good for What
R186H23O1CJY $10 Off MTDNA
R18UFAYP9YP1 $10 Off MTDNA
R18CM684KFTG $10 Off MTDNA
R18QQOEDDC2W $10 Off MTDNA
R18B6EQTQNZO $10 Off MTDNA
R18N16ONSWUM $10 Off MTDNA
R18T3EGHSFSJ $10 Off MTDNA
R18DK57J883L $10 Off MTDNA
R18ZAODYZ5OS $10 Off MTDNA
R18G3OZQCHBR $10 Off MTDNA
R1859WUSWKWO $10 Off Y37, Y67 or Y111
R18P6S4FJWOM $10 Off Y37, Y67 or Y111
R18KOGLXRX7O $10 Off Y37, Y67 or Y111
R185G17XWT3R $10 Off Y37, Y67 or Y111
R18RJ37YR49M $10 Off Y37, Y67 or Y111
R18KDQDDADVB $10 Off Y37, Y67 or Y111
R186LQRI8DS2 $10 Off Y37, Y67 or Y111
R18QSZB7A86T $10 Off Y37, Y67 or Y111
R18IU4DK5NGW $10 Off Y37, Y67 or Y111
R18IK8GMDD8C $10 Off Y37, Y67 or Y111
R18U9XCYU1HO $10 Off Y37, Y67 or Y111
R18OM4SXOL16 $10 Off Y37, Y67 or Y111
R18AWCHIW45H $10 Off Y37, Y67 or Y111
R188VCTO38WC $10 Off Y37, Y67 or Y111
R18AJXZEZEXC $10 Off Y37, Y67 or Y111
R155WBEMG99 $100 Off Big Y
R18HMGLKL4KG $100 Off Big Y
R1834VTG4CIF $20 Off MTDNA
R18TRKWO2MY9 $20 Off MTDNA
R18OUBCTA2KI $20 Off Y37, Y67 or Y111
R18ZXDH7TAX7 $20 Off Y37, Y67 or Y111
R18OX18NFXJE $20 Off Y37, Y67 or Y111
R18AB7JDZ73O $20 Off Y37, Y67 or Y111
R18XEKCN8GPH $20 Off Y37, Y67 or Y111
R18UUAEIVMG9 $20 Off Y37, Y67 or Y111
R1813Q24LQA7 $30 Off Y-DNA 67
R1853SS3IIQP $30 Off Y-DNA 67
R18BQFEFNWSL $40 Off MTFULL
R18M96WZ4X5F $40 Off MTFULL
R18O73U6Y51O $40 Off MTFULL
R18S53W9HXBC $40 Off MTFULL
R157Y5N3USEH $40 Off MTFULL (until 12-3 only)
R189ZHFFPSU3 $40 Off Y-DNA 111
R18XO6Q76XP{N $40 Off Y-DNA 67
R187Y9BO9ODH $40 Off Y-DNA 67
R18OFGORCM7E $40 Off Y-DNA 67
R189HMHY3N9D $40 Off Y-DNA 67
R18DMEO59OVO $40 Off Y-DNA 67
R15QHJMX45W7 $50 off Big Y
R18MKLR7L32P $50 off Big Y
R15GVYGX51MI $50 Off Big Y (Until 12-1 only)
R18H467ILEKD $60 Off Y-DNA 111
R18AOZQU4XZG $60 Off Y-DNA 111
R18QO8WNQNOZ $60 Off Y-DNA 111
R186Z9BJDZEC $60 Off Y-DNA 111
R18HOPBNDKIL $60 Off Y-DNA 111
R188ODYMOO5P $75 Off Big Y
R15VBANUACFW 20% Off Y37, Y67 or Y111
R154JXYQPK6F 20% Off Y37, Y67 or Y111

Building Your Personal Mitochondrial Tree

People who test at Family Tree DNA and receive mitochondrial DNA full sequence results often have questions about how they can use their results to further their understanding of their ancestors.

One of the things you can do is to build a mitochondrial DNA haplotree of your own, showing how various people that you match are or are not descended from common ancestors. To do this, you’ll need to contact your matches and share your mutations.

Your results at Family Tree DNA tell you how many mutations separate you from each match, shown below, in the genetic distance column.  For more information on genetic distance, how it is calculated and what it means, click here.

GD my results

Your results at MitoSearch, if you upload, or within projects at Family Tree DNA, show you the HVR1+HVR2 region mutations, but the only way to compare the coding region, or full sequence matches is for the people involved to share them directly with each other.

How can mutations help identify your common ancestors with your matches, or if not the ancestor themselves, at least where they were from?

Let’s look at reconstructing a DNA tree based on both your common mutations and mutations you don’t share with your matches.

When building a DNA tree, remember that once a mutation enters the mitochondrial DNA, unless there is a back-mutation, which is exceedingly rare, that mutation will be found in all descendants.

This discussion excludes heteroplasmic mutations, which can be easily identified as any mutation that ends with any letter other than T, A, C or G. For example, 16519Y would be heteroplasmic, indicated by the Y. The simple explanation for heteroplasmic mutations is that they are mutations in progress, and therefore relatively recent. They don’t pertain to deeper ancestry, so we are ignoring them for this discussion. Most people don’t have heteroplasmic mutations.
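If you keep your mutations in a list, setting the heteroplasmic ones aside can be automated. The sketch below is only an illustration of the rule of thumb above. I am assuming the final letter is one of the standard two-base ambiguity codes (R, Y, K, M, S or W), and the function names and sample list are my own.

```python
# Ambiguity codes that stand for a mixture of two bases, e.g. Y = C or T.
TWO_BASE_CODES = set("RYKMSW")


def is_heteroplasmic(mutation: str) -> bool:
    """True when the mutation ends in a two-base ambiguity code instead of T, A, C or G."""
    return bool(mutation) and mutation[-1] in TWO_BASE_CODES


def drop_heteroplasmies(mutations):
    """Return the mutations that remain once heteroplasmic ones are set aside."""
    return [m for m in mutations if not is_heteroplasmic(m)]


# Sample list for illustration only.
sample = ["16519Y", "C16189T", "315.1C"]
print(drop_heteroplasmies(sample))   # ['C16189T', '315.1C']
```

Checking against an explicit list of codes, and not simply "anything other than T, A, C or G," avoids misreading other notations, such as a deletion written with a trailing lowercase d.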

Building Your Tree

Let’s look at an example of how to build a mitochondrial mutation tree.

A common ancestor, at the top of the tree, has 2 mutations that they pass to all of their descendants.

Ancestor B and C have those 2 mutations, so they match ancestor A and each other.

Ancestors B and C have each developed a mutation that the other does not have. In real life, it would be very rare for mitochondrial DNA to develop mutations in every generation, so just view this as a rather time-compressed example.

In ancestor B’s line, there are two contemporary individuals, D and E, who have all 3 of the mutations that Ancestor B carried.

So, you have a tree that looks like this.  You can click to enlarge.

mito-tree

Ancestor C also has two descendants, F and G, who both carry all of Ancestor C’s mutations, plus both F and G each have a mutation that doesn’t match each other.

So, now let’s say Person I comes along as a match. You can tell which line they belong to, and which lines they don’t, by which mutation(s) person I carries, as compared to your tree. For example, if person I carries mutations 1, 2 and 4, then you know that they are a descendant of Ancestor C, not B.  If they carry 1, 2, 4 and 5, then they descend from Person G’s line.
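The same placement logic can be written as a short Python sketch. The mutation numbers follow the example tree, except that F's private mutation is not numbered in the text, so calling it 6 is my assumption. D and E carry exactly Ancestor B's mutations, so they sit on B's branch and aren't listed separately. The function name is mine as well.

```python
# Mutations carried by each branch of the example tree.
TREE = {
    "A": {1, 2},
    "B": {1, 2, 3},
    "C": {1, 2, 4},
    "F": {1, 2, 4, 6},   # mutation number 6 is assumed for illustration
    "G": {1, 2, 4, 5},
}


def place_match(match_mutations, tree=TREE):
    """Return the most specific branch or branches whose mutations the match carries in full."""
    candidates = {
        name: muts for name, muts in tree.items() if muts <= set(match_mutations)
    }
    if not candidates:
        return []
    depth = max(len(muts) for muts in candidates.values())
    return sorted(name for name, muts in candidates.items() if len(muts) == depth)


print(place_match({1, 2, 4}))      # ['C']
print(place_match({1, 2, 4, 5}))   # ['G']
print(place_match({1, 2, 3}))      # ['B']
```

If the function ever returns more than one branch, the match carries mutations from lines that should be mutually exclusive, which is a signal to recheck the data or consider a rare back-mutation.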

I suggest that you work with your full sequence matches to build this type of mitochondrial descendancy tree. You must work with your matches, because you cannot see your matches’ coding region results, not even in projects, so you’ll have to ask each one to share with you. Be prepared, some people won’t answer, but often, based on who the people match that do respond to you, and are willing to share, you can figure out the missing blanks.

For example, let’s say John matches you with one mutation, and so does Joe, but Joe doesn’t answer your e-mail. However, John wants to work with you and John matches Joe exactly. Now you know which mutation Joe has as well – the same one as John.

You know that each of your full sequence matches is within a maximum of 3 mutations difference from you, because that’s the maximum that Family Tree DNA allows to be considered a match at the full sequence level.

Of course, not all of your matches will have the same 3 mutations, which is why you’ll need to work with them to see how your tree fleshes out. Who knows what surprises you may find.

The first question I ask each of my matches, after explaining what I’m trying to do, is whether they share any of my extra or missing mutations, with the exception of the insertions at 309, 315 or 522 and/or any mutation at 16519. These mutations are extremely common. Sometimes people are more comfortable sharing specific mutations than sending you their results. Other people will be glad to send results. In rare instances, the coding region may hold mutations that have medical significance, which is why Family Tree DNA doesn’t show specific mutations, only whether you match or not.
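Once a match does share their list, a few lines of Python can do the comparison while skipping the very common locations. This is just a sketch of my own comparison step, not how Family Tree DNA calculates genetic distance. For simplicity it ignores any mutation at 309, 315, 522 or 16519, not only insertions, and the sample mutation lists are made up.

```python
import re

# Locations whose mutations are too common to be useful for building the tree.
COMMON_POSITIONS = {309, 315, 522, 16519}


def position(mutation):
    """Extract the numeric location from a mutation such as 'C16189T' or '315.1C'."""
    found = re.search(r"\d+", mutation)
    return int(found.group()) if found else None


def informative(mutations):
    """Keep only mutations that are not at one of the very common locations."""
    return {m for m in mutations if position(m) not in COMMON_POSITIONS}


def compare(mine, theirs):
    """Split two mutation lists into shared and unshared informative mutations."""
    mine, theirs = informative(mine), informative(theirs)
    return {
        "shared": mine & theirs,
        "only_mine": mine - theirs,
        "only_theirs": theirs - mine,
    }


# Made-up lists for illustration.
my_mutations = ["315.1C", "T16519C", "C16189T", "A4500G"]
match_mutations = ["309.1C", "315.1C", "C16189T"]

result = compare(my_mutations, match_mutations)
print(result)
```

The "only_mine" and "only_theirs" groups are the mutations that define where the two lines split on your tree.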

mito-extra-and-missing

In the example above, you can see that C16189T is normally present in this mitochondrial sequence, but it is missing from this person’s results.

The mitochondrial tree you build may well shed light on your common ancestor. Based on the location of the oldest ancestor of the person at the top of your tree, it may also shed light on where your common ancestor lived and the migration path she took to where your most distant ancestor in this line was found.

My own mitochondrial DNA tree begins in Scandinavia and only my line winds up in Germany before 1700.  Another branch is found in Poland.

mitomatches

Ironically, my exact matches are to the lines in Norway (red), not to the line in Poland (orange). The rest of the lines that I match and that also descend from my Scandinavian ancestor are still found in Scandinavia, with one exception found in southern Russia, which could be a result of migration to this region from the Germanic region of Europe in the 1700s and 1800s. This tells me that I’m closer, genetically, to the Scandinavian branches than the Polish branch, which is not at all what I would have expected. The Polish branch apparently migrated separately from mine.

My mitochondrial tree also tells me that the common ancestor of all of the matches likely originated in Scandinavia, possibly Norway, also not something I would have expected, given that my most distant ancestor is very clearly German, based on church records.

Give building your mitochondrial tree a try and see what kinds of surprises it may hold!  If you haven’t yet tested your full sequence mitochondrial DNA, order that test today.  You have ancestors waiting for you!