Crossovers: Frequency and Inheritance Statistics – Male Versus Female Matters

Recently, a reader asked if I had any crossover statistics.

They were asking about the number of crossovers, meaning divisions on each chromosome, of the parent’s DNA when a child is created. In other words, how many segments of your maternal and paternal grandparents’ DNA do you inherit from your mother and father – and are those numbers somehow different?

Why would someone ask that question, and how is it relevant for genealogists?

What is a Crossover and Why is it Important?

We know that every child receives half of their autosomal DNA from their father, and half from their mother. Conversely that means that each parent can only give their child half of their own DNA that they received from their parents. Therefore, each parent has to combine some of the DNA from their father’s chromosome and their mother’s chromosome into a new chromosome that they contribute to their child.

Crossovers are breakpoints that are created when the DNA of the person’s parents is divided into pieces before being recombined into a new chromosome and passed on to the person’s child.

I’m going to use the following real-life scenario to illustrate.

Crossover pedigree.png

The colors of the people above are reflected on the chromosome below where the DNA of the blue daughter, and her red and green parents are compared to the DNA of the tester. The tester is shown as the gray background chromosomes in the chromosome browser. The background person is the one whose results we are looking at.

My granddaughter has tested her DNA, as have her parents and 3 of her 4 grandparents along with 2 great-grandparents, shown as red and green in the diagram above.

Here’s an example utilizing the FamilyTreeDNA chromosome browser.

Crossover example chr 1.png

On my granddaughter’s chromosome 1, on the chromosome browser above, we see two perfect examples of crossovers.

There’s no need to compare her DNA against that of her parent, the son in the chart above, because we already know she matches the full length of every chromosome with both of her parents.

However, when comparing my granddaughter’s DNA against the grandmother (blue) and her grandmother’s parents, the great-grandmother shown in red and great-grandfather shown in green, we can see that the granddaughter received her blue segments from the grandmother.

The grandmother had to receive that entire blue segment from either her mother, in red, or her father, in green. So, every blue segment must have an exactly matching red segment, green segment or combination of both.

The first red box at left shows that the blue segment was inherited partially from the grandmother’s red mother and partially from her green father. We know that because the tester matches the red great-grandmother on part of that blue segment and the green great-grandfather on a different part of the entire blue segment that the tester inherited from her blue grandmother.

The middle colored region, not boxed, shows the entire blue segment was inherited from the red great-grandmother and the blue grandmother passed that intact through her son to her granddaughter.

The third larger red boxed area encompassing the entire tested region to the right of the centromere was inherited by the granddaughter from her grandmother (blue segment) but it was originally from the blue grandmother’s red mother and green father.

The Crossover

The areas on this chromosome where the blue is divided between the red and green, meaning where the red and green butt up against each other, are called crossovers. A crossover is literally where the DNA of the blue daughter crosses over between DNA contributed by her red mother and green father.

Crossover segments.png

In other words, the crossover where the DNA divided between the blue grandmother’s parents when the grandmother’s son was created is shown by the dark arrows above. The son gave his daughter that exact same segment from his mother and it’s only by comparing the tester’s DNA against her great-grandparents that we can see the crossover.

Crossover 4 generations.png

What we’re really seeing is that the segments inherited by the grandmother from her parents’ two different chromosomes were combined into one segment that the grandmother gave to her son. The son inherited the green piece and the red piece on his maternal chromosome, which he gave intact to his daughter, which is why the daughter matches her grandmother on that entire blue segment and matches her great-grandparents on the red and green pieces of their individual DNA.
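As a rough sketch of the idea, crossover points inside a grandmother-matched region can be located by finding where a segment matched to one great-grandparent ends and a segment matched to the other begins. The positions, the tolerance and the function name below are invented for illustration and are not taken from the chromosome browser.

```python
def find_crossovers(segments, tolerance=1.0):
    """Return positions where adjacent segments switch from one
    great-grandparent to the other.

    segments: list of (start, end, source) tuples on one chromosome,
    in any consistent unit (the values used below are hypothetical).
    """
    ordered = sorted(segments)
    crossovers = []
    for (s1, e1, src1), (s2, e2, src2) in zip(ordered, ordered[1:]):
        # A crossover is inferred where the source changes and the two
        # segments butt up against each other (within the tolerance).
        if src1 != src2 and abs(s2 - e1) <= tolerance:
            crossovers.append((e1 + s2) / 2)
    return crossovers


# Hypothetical layout: red then green, a gap where the tester does not
# match the grandmother at all, then green followed by red.
example = [
    (0.0, 20.0, "red"),
    (20.0, 45.0, "green"),
    (120.0, 180.0, "green"),
    (180.0, 240.0, "red"),
]
crossover_points = find_crossovers(example)
print(crossover_points)  # [20.0, 180.0]
```

The gap between 45 and 120 is not reported as a crossover here because the segments on either side of it do not abut.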

Inferred Matching Segments

Crossover untested grandfather.png

The entirely uncolored regions are where the tester does not match her blue grandmother and where she would match her grandfather, who has not tested, instead of her blue grandmother.

The tester’s father only received his DNA from his mother and father, and if his daughter does not match his mother, then she must match his untested father on that segment.
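The inference above amounts to taking the complement of the grandmother's matching segments on each chromosome. Here is a minimal sketch, assuming segments are given as simple start and end positions; the positions and the function name are hypothetical, and 267 is the chromosome 1 length in centimorgans quoted later in this article.

```python
def infer_unmatched(chrom_length, matched):
    """Return the intervals of a chromosome NOT covered by `matched`.

    matched: list of (start, end) intervals where the grandchild
    matches the tested grandparent.
    """
    gaps = []
    cursor = 0.0
    for start, end in sorted(matched):
        if start > cursor:
            gaps.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < chrom_length:
        gaps.append((cursor, chrom_length))
    return gaps


# Hypothetical segments where the tester matches her tested grandmother.
grandmother = [(0.0, 45.0), (120.0, 240.0)]

# Everything else on the paternal copy must come from the untested grandfather.
grandfather_inferred = infer_unmatched(267.0, grandmother)
print(grandfather_inferred)  # [(45.0, 120.0), (240.0, 267.0)]
```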

Looking at the Big Inheritance Picture

The tester’s full autosomal match between the blue grandmother, red great-grandmother and green great-grandfather is shown below.

Crossover autosomes.png

In light of the discussion that follows, it’s worth noting that chromosomes 4 and 20 (orange arrows) were passed intact from the blue grandmother to the tester through two meiosis (inheritance) events. We know this because the tester matches the green great-grandfather’s DNA entirely on these two chromosomes that he passed to his blue daughter, her son and then the tester.

Let’s track this for chromosomes 4 and 20:

  • Meiosis 1 – The tester matches her blue grandmother along the entire chromosome, so we know that there was no crossover on that chromosome between the father and the tester.
  • Meiosis 2 – The tester matches her green great-grandfather along the entire chromosome, proving that it was passed intact from the grandmother to the tester’s father, her son.
  • What we don’t know is whether there were any crossovers when the green great-grandfather passed his parents’ DNA to the blue grandmother, his daughter. In order to determine that, we would need at least one of the green great-grandfather’s parents, which we don’t have. We don’t know if the green great-grandfather passed on his maternal or paternal copy of his chromosome, or parts of each, to the blue grandmother, his daughter.

Meiosis Events and the Tree

So let’s look at these meiosis or inheritance events in a different way, beginning at the bottom with the pink tester and counting backwards, or up the tree.

Crossover meiosis events.png

By inference, we know that chromosomes 11, 16 and 22 (purple arrows) were also passed intact, but not from the blue grandmother. The tester’s father passed his father’s chromosome intact to his daughter. That’s the untested grandfather again. We know this because the tester does not match her blue grandmother at all on any of these three chromosomes, so the tester must match her untested grandfather instead, because those are the only two sources of DNA for the tester’s father.

A Blip, or Not?

If you’ve noticed that chromosome 14 looks unusual, in that the tester matches her grandmother’s blue segment, but not either of her great-grandparents, which is impossible, give yourself extra points for your good eye.

In this case, the green great-grandfather’s kit was a transfer kit in which that portion of chromosome 14 was not included or did not read accurately. Given that the red great-grandmother’s kit DID read in that region and does not match the tester, we know that chromosome 14 would actually have a matching green segment exactly the size of the blue segment.

However, in another situation where we didn’t know of an issue with the transfer kit, it is also possible that the granddaughter matched a small segment of the blue grandmother’s DNA where they were identical by chance. In that case, chromosome 14 would actually have been passed to the tester intact from her father’s father, who is untested.

Every Segment has a Story

Looking at this matching pattern and our ability to determine the source of the DNA back several generations, originating from great-grandparents, I hope you’re beginning to get a sense of why understanding crossovers better is important to genealogists.

Every single segment has a story and that story is comprised of crossovers where the DNA of our ancestors is combined in their offspring. Today, we see the evidence of these historical genetic meiosis or division/recombination events in the start and end points of matches to our genetic cousins. Every start and end point represents a crossover sometime in the past.

What else can we tell about these events and how often they occur?

Of the 22 autosomes, not counting the X chromosome which has a unique inheritance pattern, 17 chromosomes experienced at least one crossover.

What does this mean to me as a genealogist and how can I interpret this type of information?

Philip Gammon

You may remember our statistician friend Philip Gammon. Philip and I have collaborated before authoring the following articles where Philip did the heavy lifting.

I discussed crossovers in the article Concepts – DNA Recombination and Crossovers, also in collaboration with Philip, and showed several examples in a Four Generation Inheritance Study.

If you haven’t read those articles, now might be a good time to do so, as they set the stage for understanding the rest of this article.

The frequency of chromosome segment divisions and their resulting crossovers is key to understanding how recombination occurs, which in turn is key to understanding how far back in time a common ancestor between you and a match can expect to be found.

In other words, everything we think we know about relationships, especially more distant relationships, is predicated on the rate that crossovers occur.

The Concepts article references the Chowdhury paper and revealed that females average about 42 crossovers per child and males average about 27 but these quantities refer to the total number of crossovers on all 22 autosomes and reveal nothing about the distribution of the number of crossovers at the individual chromosome level.

Philip Gammon has been taking a closer look at this particular issue and has done some very interesting crossover simulations by chromosome, since the chromosomes are different sizes, as he reports beginning here.

Crossover Statistics by Philip Gammon

For chromosomes there is surprisingly little information available regarding the variation in the number of crossovers experienced during meiosis, the process of cell division that results in the production of ova and sperm cells. In the scientific literature I have been able to find only one reference that provides a table showing a frequency distribution for the number of crossovers by chromosome.

The paper Broad-Scale Recombination Patterns Underlying Proper Disjunction in Humans by Fledel-Alon et al in 2009 contains this information tucked away at the back of the “Supplementary methods, figures, and tables” section. It was likely not produced with genetic genealogists in mind but could be of great interest to some. The columns X0 to X8 refer to the number of crossovers on each chromosome that were measured in parental transmissions. Separate tables are shown for male and female transmissions because the rates between the two sexes differ significantly. Note that it’s the gender of the parent that matters, not the child. The sample size is quite small, containing only 288 occurrences for each gender.

A few years ago I stumbled across a paper titled Escape from crossover interference increases with maternal age by Campbell et al 2015. This study investigated the properties of crossover placement utilising family groups contained within the database of the direct-to-consumer genetic testing company 23andMe. In total more than 645,000 well-supported crossover events were able to be identified. Although this study didn’t directly report the observed frequency distribution of crossovers per chromosome, it did produce a table of parameters that accurately described the distribution of inter-crossover distances for each chromosome.

By introducing these parameters into a model that I had developed to implement the equations described by Housworth and Stahl in their 2003 paper Crossover Interference in Humans I was able to derive tables depicting the frequency of crossovers. The following results were produced for each chromosome by running 100,000 simulations in my crossover model:

Crossover transmissions from female to child.png

Transmissions from female parent to child, above.

Crossover transmissions male to child.png

Transmissions from male parent to child.

To be sure that we understand what these tables are revealing let’s look at the first row of the female table. The most frequent outcome for chromosome #1 is that there will be three crossovers and this occurs 27% of the time. There were instances when up to 10 crossovers were observed in a single meiosis but these were extremely rare. Cells that are blank recorded no observations in the 100,000 simulations. On average there are 3.36 crossovers observed on chromosome #1 in female to child transmissions i.e. the female chromosome #1 is 3.36 Morgans (336 centimorgans) in genetic length.

Blaine Bettinger has since examined crossover statistics using crowdsourced data in The Recombination Project: Analyzing Recombination Frequencies Using Crowdsourced Data, but only for females. His sample size was 250 maternal transmissions and Table 2 in the report presents the results in the same format as the tables above. There is a remarkable degree of conformity between Blaine’s measurements and the output from my simulation model and also to the earlier Fledel-Alon et al study.

The diagrams below are a typical representation of the chromosomes inherited by a child.

Crossovers inherited from mother.jpg

The red and orange (above) are the set of chromosomes inherited from the mother and the aqua and green (below) from the father. The locations where the colours change identify the crossover points.

It’s worth noting that all chromosomes have a chance of being passed from parent to child without recombination. These probabilities are found in the column for zero crossovers.

In the picture above the mother has passed on two red chromosomes (#14 and #20) without recombination from one of the maternal grandparents. No yellow chromosomes were passed intact.

Similarly, below, the father has passed on a total of five chromosomes that have no crossover points. Blue chromosomes #15, #18 and #21 were passed on intact from one paternal grandparent and green chromosomes #4 and #20 from the other.

Crossovers inherited from father.jpg

It’s quite a rare event for one of the larger chromosomes to be passed on without recombination (only a 1.4% probability for chromosome #1 in female transmissions) but occurs far more frequently in the smaller chromosomes. In fact, the male chromosome #21 is passed on intact more often (50.6% of the time) than containing DNA from both of the father’s parents.

However, there is nothing especially significant about chromosome #21.

The same could be said for any region of similar genetic length on any of the autosomes i.e. the first 52 cM of chromosome #1 or the middle 52 cM of chromosome #10 etc. From my simulations I have observed that on average 2.8 autosomes are passed down from a mother to child without a crossover and an average of 5.1 autosomes from a father to child.

In total (from both parents), 94% of offspring will inherit between 4 and 12 chromosomes containing DNA exclusively from a single grandparent. In the 100,000 simulations the child always inherited at least one chromosome without recombination.

Back to Roberta

If you have 3 generations who have tested, you can view the crossovers in the grandchild as compared to either one or two grandparents.

If the child doesn’t match one grandparent, even if their other grandparent through that parent hasn’t tested, you can certainly infer that any DNA where the grandchild doesn’t match the available grandparent comes from the non-tested “other” grandparent on that side.

Let’s Look at Real-Life Examples

Using the example of my 2 granddaughters, both of their parents and 3 of their 4 grandparents have tested, so I was able to measure the crossovers that my granddaughters experienced from all 4 of their grandparents.

| Maternal Crossovers | Granddaughter 1 | Granddaughter 2 | Average |
|---|---|---|---|
| Chromosome 1 | 6 | 2 | 3.36 |
| Chromosome 2 | 4 | 2 | 3.17 |
| Chromosome 3 | 3 | 2 | 2.71 |
| Chromosome 4 | 2 | 2 | 2.59 |
| Chromosome 5 | 2 | 1 | 2.49 |
| Chromosome 6 | 4 | 2 | 2.36 |
| Chromosome 7 | 3 | 1 | 2.23 |
| Chromosome 8 | 2 | 2 | 2.11 |
| Chromosome 9 | 3 | 1 | 1.95 |
| Chromosome 10 | 4 | 2 | 2.08 |
| Chromosome 11 | 3 | 0 | 1.93 |
| Chromosome 12 | 3 | 3 | 2.00 |
| Chromosome 13 | 1 | 1 | 1.52 |
| Chromosome 14 | 3 | 1 | 1.38 |
| Chromosome 15 | 4 | 1 | 1.44 |
| Chromosome 16 | 2 | 2 | 1.58 |
| Chromosome 17 | 2 | 2 | 1.53 |
| Chromosome 18 | 2 | 0 | 1.40 |
| Chromosome 19 | 2 | 1 | 1.18 |
| Chromosome 20 | 0 | 1 | 1.19 |
| Chromosome 21 | 0 | 1 | 0.74 |
| Chromosome 22 | 1 | 0 | 0.78 |
| Total | 56 | 30 | 41.71 |

Looking at these results, it’s easy to see just how different inheritance between two full siblings can be. Granddaughter 1 has 56 crossovers through her mother, significantly more than the average of 41.71. Granddaughter 2 has 30, significantly less than average.

The average of the 2 girls is 43, very close to the total average of 41.71.

Note that one child received 2 chromosomes intact from her mother, and the other received 3.

| Paternal Crossovers | Granddaughter 1 | Granddaughter 2 | Average |
|---|---|---|---|
| Chromosome 1 | 2 | 2 | 1.98 |
| Chromosome 2 | 3 | 2 | 1.85 |
| Chromosome 3 | 2 | 2 | 1.64 |
| Chromosome 4 | 0 | 1 | 1.46 |
| Chromosome 5 | 1 | 2 | 1.46 |
| Chromosome 6 | 2 | 1 | 1.41 |
| Chromosome 7 | 1 | 2 | 1.36 |
| Chromosome 8 | 1 | 1 | 1.23 |
| Chromosome 9 | 1 | 3 | 1.26 |
| Chromosome 10 | 3 | 2 | 1.30 |
| Chromosome 11 | 0 | 1 | 1.20 |
| Chromosome 12 | 1 | 1 | 1.32 |
| Chromosome 13 | 2 | 1 | 1.02 |
| Chromosome 14 | 1 | 0 | 0.97 |
| Chromosome 15 | 1 | 2 | 1.01 |
| Chromosome 16 | 0 | 1 | 1.02 |
| Chromosome 17 | 0 | 0 | 1.06 |
| Chromosome 18 | 1 | 1 | 0.98 |
| Chromosome 19 | 1 | 1 | 1.00 |
| Chromosome 20 | 0 | 0 | 0.99 |
| Chromosome 21 | 0 | 0 | 0.52 |
| Chromosome 22 | 0 | 0 | 0.63 |
| Total | 23 | 26 | 26.65 |

Granddaughter 2 had slightly more paternal crossovers than did granddaughter 1.

One child received 7 chromosomes intact from her father, and the other received 5.
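The totals and the counts of intact chromosomes quoted for the maternal and paternal tables can be checked with a few lines of code. This is only a sketch: the per-chromosome counts are copied from the tables, while the variable and function names are my own. "Intact" here simply means zero crossovers in that single transmission.

```python
# Crossovers per chromosome (1 through 22), copied from the tables.
maternal = {
    "Granddaughter 1": [6, 4, 3, 2, 2, 4, 3, 2, 3, 4, 3, 3, 1, 3, 4, 2, 2, 2, 2, 0, 0, 1],
    "Granddaughter 2": [2, 2, 2, 2, 1, 2, 1, 2, 1, 2, 0, 3, 1, 1, 1, 2, 2, 0, 1, 1, 1, 0],
}
paternal = {
    "Granddaughter 1": [2, 3, 2, 0, 1, 2, 1, 1, 1, 3, 0, 1, 2, 1, 1, 0, 0, 1, 1, 0, 0, 0],
    "Granddaughter 2": [2, 2, 2, 1, 2, 1, 2, 1, 3, 2, 1, 1, 1, 0, 2, 1, 0, 1, 1, 0, 0, 0],
}


def summarize(counts):
    """Return (total crossovers, list of chromosomes with no crossover)."""
    total = sum(counts)
    intact = [chrom for chrom, n in enumerate(counts, start=1) if n == 0]
    return total, intact


for side, table in (("maternal", maternal), ("paternal", paternal)):
    for child, counts in table.items():
        total, intact = summarize(counts)
        print(f"{child} {side}: {total} crossovers, intact chromosomes {intact}")
```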

| Chromosome | Granddaughter 1 Maternal | Granddaughter 1 Paternal |
|---|---|---|
| Chromosome 1 | 6 | 2 |
| Chromosome 2 | 4 | 3 |
| Chromosome 3 | 3 | 2 |
| Chromosome 4 | 2 | 0 |
| Chromosome 5 | 2 | 1 |
| Chromosome 6 | 4 | 2 |
| Chromosome 7 | 3 | 1 |
| Chromosome 8 | 2 | 1 |
| Chromosome 9 | 3 | 1 |
| Chromosome 10 | 4 | 3 |
| Chromosome 11 | 3 | 0 |
| Chromosome 12 | 3 | 1 |
| Chromosome 13 | 1 | 2 |
| Chromosome 14 | 3 | 1 |
| Chromosome 15 | 4 | 1 |
| Chromosome 16 | 2 | 0 |
| Chromosome 17 | 2 | 0 |
| Chromosome 18 | 2 | 1 |
| Chromosome 19 | 2 | 1 |
| Chromosome 20 | 0 | 0 |
| Chromosome 21 | 0 | 0 |
| Chromosome 22 | 1 | 0 |
| Total | 56 | 23 |

Comparing each child’s maternal and paternal crossovers side by side, we can see that Granddaughter 1 has more than double the number of maternal as compared to paternal crossovers, while Granddaughter 2 only had slightly more.

| Chromosome | Granddaughter 2 Maternal | Granddaughter 2 Paternal |
|---|---|---|
| Chromosome 1 | 2 | 2 |
| Chromosome 2 | 2 | 2 |
| Chromosome 3 | 2 | 2 |
| Chromosome 4 | 2 | 1 |
| Chromosome 5 | 1 | 2 |
| Chromosome 6 | 2 | 1 |
| Chromosome 7 | 1 | 2 |
| Chromosome 8 | 2 | 1 |
| Chromosome 9 | 1 | 3 |
| Chromosome 10 | 2 | 2 |
| Chromosome 11 | 0 | 1 |
| Chromosome 12 | 3 | 1 |
| Chromosome 13 | 1 | 1 |
| Chromosome 14 | 1 | 0 |
| Chromosome 15 | 1 | 2 |
| Chromosome 16 | 2 | 1 |
| Chromosome 17 | 2 | 0 |
| Chromosome 18 | 0 | 1 |
| Chromosome 19 | 1 | 1 |
| Chromosome 20 | 1 | 0 |
| Chromosome 21 | 1 | 0 |
| Chromosome 22 | 0 | 0 |
| Total | 30 | 26 |

Granddaughter 2 has closer to the same number of maternal and paternal crossovers, but about 15% more maternal (30 versus 26).

Comparing Maternal and Paternal Crossover Rates

Given that males clearly have a much, much lower crossover rate, according to Philip’s charts as well as the evidence in just these two individual cases, over time, we would expect to see the DNA segments significantly LESS broken up in male to male transmissions, especially an entire line of male to male transmissions, as compared to female to female linear transmissions. This means we can expect to see larger intact shared segments in a male to male transmission line as compared to a female to female transmission line.

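| Generation | G1 Mat | G2 Mat | Mat Avg | G1 Pat | G2 Pat | Pat Avg |
|---|---|---|---|---|---|---|
| Gen 1 | 56 | 30 | 41.71 | 23 | 26 | 26.65 |
| Gen 2 | 112 | 60 | 83.42 | 46 | 52 | 53.30 |
| Gen 3 | 168 | 90 | 125.13 | 69 | 78 | 79.95 |
| Gen 4 | 224 | 120 | 166.84 | 92 | 104 | 106.60 |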

Using the transmission rates for Granddaughter 1, Granddaughter 2, and the average calculated by Philip, it’s easy to see that the cumulative expected number of crossovers varies dramatically in every generation.

By the 4th generation, the maternal crossovers seen in someone entirely maternally descended at the rate of Granddaughter 1 would equal 224 crossovers, meaning that the descendant’s DNA would be divided that many times, while the same number of paternal linear divisions at 4 generations would only equal 92.
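The cumulative figures are simply the per-generation rate multiplied by the number of generations. The sketch below reproduces that arithmetic; it assumes, as the table does, that the same rate applies in every generation, which is a simplification. The dictionary keys and the function name are my own.

```python
# Crossovers per transmission: the two granddaughters' observed totals
# and Philip's simulated averages.
rates = {
    "G1 Mat": 56, "G2 Mat": 30, "Mat Avg": 41.71,
    "G1 Pat": 23, "G2 Pat": 26, "Pat Avg": 26.65,
}


def cumulative(rate, generations):
    """Cumulative crossovers after a number of same-sex transmissions."""
    return round(rate * generations, 2)


table = {
    gen: {name: cumulative(rate, gen) for name, rate in rates.items()}
    for gen in range(1, 5)
}

for gen, row in table.items():
    print(f"Gen {gen}:", row)
```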

Yet today, we would never look at 2 people’s DNA, one with 224 crossovers compared to one with 92 crossovers and even consider the possibility that they are both only three generations descended from an ancestor, counting the parents as generation 1.

What Does This Mean?

The number of males and females in a specific line clearly has a direct influence on the number of crossovers experienced, and what we can expect to see as a result in terms of average segment size of inherited segments in a specific number of generations.

Using Granddaughter 1’s maternal crossover rate as an example, in 4 generations, chromosome 1 would have incurred a total of 24 crossovers, so the DNA would be divided into 25 pieces. At the paternal rate, there would be only 8 crossovers, so the DNA would be in 9 pieces.

Chromosome 1 is a total of 267 centimorgans in length, so dividing 267 cM by 25 would mean the average segment would only be 10.68 cM for the maternal transmission, while the average segment divided by 9 would be 29.67 cM in length for the paternal transmission.
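The chromosome 1 calculation can be written out as a short sketch. It assumes, as the text does, that every crossover adds one more piece and that pieces are averaged evenly over the 267 cM length; the function name is hypothetical.

```python
CHR1_CM = 267  # chromosome 1 length in centimorgans, as quoted above


def average_segment_cm(length_cm, crossovers_per_generation, generations):
    """Average piece size after repeated transmissions at a fixed rate."""
    total_crossovers = crossovers_per_generation * generations
    pieces = total_crossovers + 1  # n cuts leave n + 1 pieces
    return length_cm / pieces


# Granddaughter 1: 6 maternal and 2 paternal crossovers on chromosome 1.
maternal_cm = round(average_segment_cm(CHR1_CM, 6, 4), 2)
paternal_cm = round(average_segment_cm(CHR1_CM, 2, 4), 2)
print(maternal_cm, paternal_cm)  # 10.68 29.67
```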

Given that the longest matching segment is a portion of the estimated relationship calculation, the difference between a 10.68 cM maternal linear segment match and a 29.67 cM paternal linear segment match is significant.

While I used the highest and lowest maternal and paternal rates of the granddaughters, the average segment sizes using Philip’s average rates would be about 19 cM and 29 cM, respectively – still a significant difference.

Maternal and Paternal Crossover Average Segment Size

Each person has an autosomal total of 3374 cM on chromosomes 1-22, excluding the X chromosome, that is being compared to other testers. Applying these calculations to all 22 autosomes using the maternal and paternal averages for 4 generations, dividing into the 3374 total we find the following average segment centiMorgan matches:

Crossovers average segment size.png

Keep in mind, of course, that the chart above represents 3 generations in a row of either maternal or paternal crossovers, but even one generation is significant.

The average size segment of a grandparent’s DNA that a child receives from their mother is 80.89 cM, whereas the average segment of a grandparent’s DNA inherited from their father is 1.57 times larger at 126.6 cM.
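For readers who want to check these single-generation figures, here is a minimal sketch. It follows the arithmetic as stated in this article, dividing the 3374 cM autosomal total by the average number of crossovers per transmission; the variable names are my own.

```python
TOTAL_AUTOSOMAL_CM = 3374  # chromosomes 1-22, as quoted above

maternal_avg_crossovers = 41.71
paternal_avg_crossovers = 26.65

# Average segment size, computed the way the article states it.
maternal_segment = TOTAL_AUTOSOMAL_CM / maternal_avg_crossovers
paternal_segment = TOTAL_AUTOSOMAL_CM / paternal_avg_crossovers
ratio = paternal_segment / maternal_segment

print(f"maternal: {maternal_segment:.2f} cM")
print(f"paternal: {paternal_segment:.2f} cM")
print(f"paternal segments are {ratio:.2f} times larger")
```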

Keep the maternal versus paternal inheritance path in mind as you evaluate matches to cousins with identified common ancestors, especially if the path is entirely or mostly maternal or paternal which would skew the cumulative average. You can easily tell, for example, that matches who descend paternally from a common ancestor and carry the surname are likely to carry more DNA from that common male ancestor than someone who descends from a mixed or directly maternal line.

For unknown matches, just keep in mind that the average that vendors calculate and use to predict relationships, because they can’t and don’t have “inside knowledge” about the inheritance path, may or may not be either accurate or average. They do the best they can do with the information they have at hand.

______________________________________________________________

Disclosure

I receive a small contribution when you click on some of the links to vendors in my articles. This does NOT increase the price you pay but helps me to keep the lights on and this informational blog free for everyone. Please click on the links in the articles or to the vendors below if you are purchasing products or DNA testing.

Thank you so much.

First Steps When Your DNA Results are Ready – Sticking Your Toe in the Genealogy Water

First steps helix

Recently someone asked me what the first steps would be for a person who wasn’t terribly familiar with genealogy and had just received their DNA test results.

I wrote an article called DNA Results – First Glances at Ethnicity and Matching which was meant to show new folks what the various vendor interfaces look like. I was hoping this might whet their appetites for more, meaning that the tester might, just might, stick their toe into the genealogy waters😊

I’m hoping this article will help them get hooked! Maybe that’s you!

A Guide

This article can be read in one of two ways – as an overview, or, if you click the links, as a pretty thorough lesson. If you’re new, I strongly suggest reading it as an overview first, then a second time as a deeper dive. Use it as a guide to navigate your results as you get your feet wet.

I’ll be hotlinking to various articles I’ve written on lots of topics, so please take a look at details (eventually) by clicking on those links!

This article is meant as a guideline for what to do, and how to get started with your DNA matching results!

If you’re looking for ethnicity information, check out the First Glances article, plus here and here and here.

Concepts – Calculating Ethnicity Percentages provides you with guidelines for how to estimate your own ethnicity percentages based on your known genealogy and Ethnicity Testing – A Conundrum explains how ethnicity testing is done.

OK, let’s get started. Fun awaits!

The Goal

The goal for using DNA matching in genealogy depends on your interests.

  1. To discover cousins and family members that you don’t know. Some people are interested in finding and meeting relatives who might have known their grandparents or great-grandparents in the hope of discovering new family information or photos they didn’t know existed previously. I’ve been gifted with my great-grandparent’s pictures, so this strategy definitely works!
  2. To confirm ancestors. This approach presumes that you’ve done at least a little genealogy, enough to construct at least a rudimentary tree. Ancestors are “confirmed” when you DNA match multiple other people who descend from the same ancestor through multiple children. I wrote an article, Ancestors: What Constitutes Proof?, discussing how much evidence is enough to actually confirm an ancestor. Confirmation is based on a combination of both genealogical records and DNA matching and it varies depending on the circumstances.
  3. Adoptees and people with unknown parents seeking to discover the identities of those people aren’t initially looking at their own family tree – because they don’t have one yet. The genealogy of others can help them figure out the identity of those mystery people. I wrote about that technique in the article, Identifying Unknown Parents and Individuals Using DNA Matching.

DNAAdoption for Everyone

Educational resources for adoptees and non-adoptees alike can be found at www.dnaadoption.org. DNAAdoption is not just for adoptees and provides first rate education for everyone. They also provide trained and mentored search angels for adoptees who understand the search process along with the intricacies of navigating the emotional minefield of adoption and unknown parent searches.

“First Look” classes for each vendor are free for everyone at DNAAdoption and are self-paced, downloadable onto your computer as a pdf file. Intro to DNA, Applied Autosomal DNA and Y DNA Basics classes are nominally priced at between $29 and $49 and I strongly recommend these. DNAAdoption is entirely non-profit, so your class fee or contribution supports their work. Additional resources can be found here and their 12 adoptee search steps here.

Ok, now let’s look at your results.

Matches are the Key

Regardless of your goal, your DNA matches are the key to finding answers, whether you want to make contact with close relatives, prove your more distant ancestors or you’re involved in an adoptee or unknown parent search.

Your DNA matches that of other people because each of you inherited a piece of DNA, called a segment, where many locations are identical. The length of that DNA segment is measured in centiMorgans and those locations are called SNPs, or single nucleotide polymorphisms. You can read about the definition of a centiMorgan and how they are used in the article Concepts – CentiMorgans, SNPs and Pickin’ Crab.

While the scientific details are great, they aren’t important initially. What is important is to understand that the more closely you match someone, the more closely you are related to them. You share more DNA with close relatives than more distant relatives.

For example, I share exactly half of my mother’s DNA, but only about 25% of each of my grandparents’ DNA. As the relationships move further back in time, I share less and less DNA with other people who descend from those same ancestors.
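The halving pattern can be shown with a tiny sketch. These are expected averages only: the 50% from a parent is exact, but the amounts from grandparents and earlier ancestors vary from person to person. The function name is my own.

```python
def expected_share(generations_back):
    """Expected fraction of DNA shared with a direct ancestor."""
    return 0.5 ** generations_back


for label, generations in [
    ("parent", 1),
    ("grandparent", 2),
    ("great-grandparent", 3),
    ("great-great-grandparent", 4),
]:
    print(f"{label}: about {expected_share(generations):.2%}")
```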

Informational Tools

Every vendor’s match page looks different, as was illustrated in the First Glances article, but regardless, you are looking for four basic pieces of information:

  • Who you match
  • How much DNA you share with your match
  • Who else you and your match share that DNA with, which suggests that you all share a common ancestor
  • Family trees to reveal the common ancestor between people who match each other

Every vendor has different ways of displaying this information, and not all vendors provide everything. For example, 23andMe does not support trees, although they allow you to link to one elsewhere. Ancestry does not provide a tool called a chromosome browser which allows you to see if you and others match on the same segment of DNA. Ancestry only tells you THAT you match, not HOW you match.

Each vendor has their strengths and shortcomings. As genealogists, we simply need to understand how to utilize the information available.

I’ll be using examples from all 4 major vendors:

Your matches are the most important information and everything else is based on those matches.

Family Tree DNA

I have tested many family members from both sides of my family at Family Tree DNA using the Family Finder autosomal test which makes my matches there incredibly useful because I can see which family members, in addition to me, my matches match.

Family Tree DNA assigns matches to maternal and paternal sides in a unique way, even if your parents haven’t tested, so long as some close relatives have tested. Let’s take a look.

First Steps Family Tree DNA matches.png

Sign on to your account and click to see your matches.

At the top of your Family Finder matches page, you’ll see three groups of things, shown below.

First Steps Family Tree DNA bucketing

A row of tools at the top titled Chromosome Browser, In Common With and Not in Common With.

A second row of tabs that include All, Paternal, Maternal and Both. These are the maternal and paternal tabs I mentioned, meaning that I have a total of 4645 matches, 988 of which are from my paternal side and 847 of which are from my maternal side.

Family Tree DNA assigns people to these “buckets” based on matches with third cousins or closer if you have them attached in your tree. This is why it’s critical to have a tree and test close relatives, especially people from earlier generations like aunts, uncles, great-aunts/uncles and their children if they are no longer living.

If you have one or both parents that can test, that’s a wonderful boon because anyone who matches you and one of your parents is automatically bucketed, or phased (scientific term) to that parent’s side of the tree. However, at Family Tree DNA, it’s not required to have a parent test to have some matches assigned to maternal or paternal sides. You just need to test third cousins or closer and attach them to the proper place in your tree.

How does bucketing work?

Maternal or Paternal “Side” Assignment, aka Bucketing

If I match a maternal first cousin, Cheryl, for example, and we both match John Doe on the same segment, John Doe is automatically assigned to my maternal bucket with a little maternal icon placed beside the match.

First Steps Family Tree DNA match info


Every vendor provides an estimated or predicted relationship based on a combination of total centiMorgans and the longest contiguous matching segment. The actual “linked relationship” is calculated based on where this person resides in your tree.

The common surnames at far right are a very nice feature, but not every tester provides that information. When the testers do include surnames at Family Tree DNA, common surnames are bolded. Other vendors have similar features.

People with trees are shown near their profile picture with a blue pedigree icon. Clicking on the pedigree icon will show you their ancestors. Your match’s estimated relationship to you indicates how far back you should expect to share an ancestor.

For example, first cousins share grandparents. Second cousins share great-grandparents. In general, the further back in time your common ancestor, the less DNA you can be expected to share.
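To put rough numbers on that, here is a small Python sketch of the theoretical average. It assumes full cousins who descend from a shared ancestral couple, and the function name is just mine for illustration. Real matches vary quite a bit around these averages because recombination is random.

```python
def expected_shared_fraction(cousin_degree: int) -> float:
    """Average fraction of autosomal DNA shared by full cousins of a given degree.

    Assumption: both cousins descend from the same ancestral couple.
    Each cousin is (degree + 1) generations below that couple, the DNA is
    halved at every generation, and having two shared ancestors doubles the total.
    """
    generations = cousin_degree + 1
    return 2 * 0.5 ** (2 * generations)


for degree in (1, 2, 3, 4):
    fraction = expected_shared_fraction(degree)
    print(f"Cousin degree {degree}: about {fraction:.2%} shared on average")
```

First cousins come out at 12.5% and second cousins at just over 3%, which is why more distant cousins sometimes share no detectable DNA at all.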

You can view relationship information in chart form in my article here or utilize DNAPainter tools, here, to see the various possibilities for the different match levels.

Clicking on the pedigree chart of your match will show you their tree. In my tree, I’ve connected my parents in their proper places, along with Cheryl and Don, mother’s first cousins. (Yes, they’ve given permission for me to utilize their results, so they aren’t always blurred in images.)

Cheryl and Don are my first cousins once removed, meaning my mother is their first cousin and I’m one generation further down the tree. I’m showing the amount of DNA that I share with each of them in red in the format of total DNA shared and longest unbroken segment, taken from the match list. So 382-53 means I share a total of 382 cM and 53 cM is the longest matching block.

First Steps Family Tree DNA tree.png

The Chromosome Browser

Utilizing the chromosome browser, I can see exactly where I match both Don and Cheryl. It’s obvious that I match them on at least some different pieces of my DNA, because the total and longest segment amounts are different.

It’s important to test lots of close relatives because even siblings inherit different pieces of DNA from their parents, and they don’t pass the same DNA to their offspring either – so in each generation the amount of shared DNA is probably reduced. I say probably because sometimes segments are passed entirely and sometimes not at all, which is how we “lose” our ancestors’ DNA over the generations.

Here’s a matching example utilizing a chromosome browser.

First Steps Family Tree DNA chromosome browser.png

I clicked the checkboxes to the left of both Cheryl and Don on the match page, then the Chromosome Browser button, and now you can see, above, on chromosomes 1-16 where I match Cheryl (blue) and Don (red.)

In this view, both Don and Cheryl are being compared to me, since I’m the one signed in to my account and viewing my DNA matches. Therefore, one of the bars at each chromosome represents Don’s DNA match to me and one represents Cheryl’s. Cheryl is the first person and Don is the second. Person match colors (red and blue) are assigned arbitrarily by the system.

My grandfather and Cheryl/Don’s father, Roscoe, were siblings.

You can see that on some segments, my grandfather and Roscoe inherited the same segment of DNA from their parents, because today, my mother gave me that exact same segment that I share with both Don and Cheryl. Those segments are exactly identical and shown in the black boxes.

The only way for us to share this DNA today is for us to have shared a common ancestor who gave it to two of their children who passed it on to their descendants who DNA tested today.

On other segments, in red boxes, I share part of the same segments of DNA with Cheryl and Don, but someone along the line didn’t inherit all of that segment. For example on chromosome 3, in the red box, you can see that I share more with Cheryl (blue) than Don (red.)

In other cases, I share with either Don or Cheryl, but Don and Cheryl didn’t inherit that same segment of DNA from their father, so I don’t share with both of them. Those are the areas where you see only blue or only red.

On chromosome 12, you can see where it looks like Don’s and Cheryl’s segments butt up against each other. The DNA was clearly divided there. Don received one piece and Cheryl got the other. That’s known as a crossover and you can read about crossovers here, if you’d like.

It’s important to be able to view segment information to be able to see how others match in order to identify which common ancestor that DNA came from.

In Common With

You can use the “In Common With” tool to see who you match in common with any match. My first 6 matches in common with Cheryl are shown below. Note that they are already all bucketed to my maternal side.

First Steps Family Tree DNA in common with


You can click on up to 7 individuals in the check box at left to show them on the chromosome browser at once to see if they match you on common segments.

Each matching segment has its own history and may descend from a different ancestor in your common tree.

First Steps 7 match chromosome browser


If combinations of people do match me on a common segment, because these matches are all on my maternal side, they are triangulated and we know they have to descend from a common ancestor, assuming the segment is large enough. You can read about the concept of triangulation here. Triangulation occurs when 3 or more people (who aren’t extremely closely related like parents or siblings) all match each other on the same reasonably sized segment of DNA.

If you want to download your matches and work through this process in a spreadsheet, that’s an option too.

Size Matters

Small segments can be identical by chance instead of identical by descent.

  • “Identical by chance” means that you accidentally match someone because your DNA on that segment has been combined from both parents and causes it to match another person, making the segment “look like” it comes from a common ancestor, when it really doesn’t. When DNA is sequenced, both your mother’s and father’s strands are sequenced, meaning that there’s no way to determine which came from whom. Think of a street with Mom’s side and Dad’s side with identical addresses on the houses on both sides. I wrote about that here.
  • “Identical by descent” means that the DNA is identical because it actually descends from a common ancestor. I discussed that concept in the article, We Match, But Are We Related.

Generally, we only utilize 7cM (centiMorgan) segments and above because at that level, about half of the segments are identical by descent and about half are identical by chance, known as false positives. By the time we move above 15 cM, most, but not all, matches are legitimate. You can read about segment size and accuracy here.
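If you work with downloaded segment data, those rules of thumb can be written as a tiny Python helper. This is only a sketch of the guideline above. The function name and the wording of the labels are mine, and the cutoffs are rough guides rather than hard limits.

```python
def describe_segment(cm: float) -> str:
    """Rough reliability note for a matching segment, by size in centiMorgans."""
    if cm < 7:
        return "below 7 cM: generally not used"
    if cm <= 15:
        return "7-15 cM: may be identical by chance; look for supporting evidence"
    return "above 15 cM: most, but not all, matches are legitimate"


# Example sizes are made up, apart from 53 cM, the longest block mentioned earlier.
for size in (4.2, 7.0, 12.5, 53.0):
    print(size, "->", describe_segment(size))
```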

Using “In Common With” and the Matrix

“In Common With” is about who shares DNA. You can select someone you match to see who else you BOTH match. Just because you match two other people doesn’t necessarily mean that it’s on the same segment of DNA. In fact, you could match one person from your mother’s side and the other person from your father’s side.

First Steps match matrix.png

In this example, you match Person B due to ancestor John Doe and Person C due to ancestor Susie Smith. However, Person B also matches person C, but due to ancestor William West that they share and you don’t.

This example shows you THAT they match, but not HOW they match.

The only way to assure that the matches between the three people above are due to the same ancestor is to look at the segments with a chromosome browser and compare all 3 people to each other. Finding 3 people who match on the same segment, from the same side of your tree means that (assuming a reasonably large segment) you share a common ancestor.
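For readers who like to see the logic spelled out, here is a minimal Python sketch of the difference between “everyone matches everyone” and true triangulation. The people, chromosomes and positions are invented, the function names are mine, and the 7-unit minimum simply echoes the segment size guideline above. No vendor’s actual tool is being reproduced here.

```python
from itertools import combinations, product

# Pairwise matching segments as (chromosome, start, end); all values are made up.
pairwise = {
    frozenset({"me", "B"}): [(5, 10.0, 40.0)],
    frozenset({"me", "C"}): [(9, 60.0, 95.0)],
    frozenset({"B", "C"}): [(2, 15.0, 30.0)],
}


def all_match_each_other(people, pairwise):
    """True if every pair in the group shares at least one segment."""
    return all(pairwise.get(frozenset(pair)) for pair in combinations(people, 2))


def triangulated_regions(people, pairwise, min_length=7.0):
    """Regions where every pair in the group matches on the same stretch of DNA."""
    pairs = [frozenset(pair) for pair in combinations(people, 2)]
    segment_lists = [pairwise.get(pair, []) for pair in pairs]
    regions = []
    for combo in product(*segment_lists):
        if len({segment[0] for segment in combo}) != 1:
            continue  # the segments are on different chromosomes
        start = max(segment[1] for segment in combo)
        end = min(segment[2] for segment in combo)
        if end - start >= min_length:
            regions.append((combo[0][0], start, end))
    return regions


group = ["me", "B", "C"]
print(all_match_each_other(group, pairwise))   # they all match each other...
print(triangulated_regions(group, pairwise))   # ...but on no common segment
```

In this made-up data, all three people match each other, yet there is no common segment, which is exactly the John Doe, Susie Smith and William West situation.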

Family Tree DNA has a nice matrix function that allows you to see which of your matches also match each other.

First steps matrix link


The important distinction between the matrix and the chromosome browser is that the chromosome browser shows you where your matches match you, but those matches could be from both sides of your tree, unless they are bucketed. The matrix shows you if your matches also match each other, which is a huge clue that they are probably from the same side of your tree.

First Steps Family Tree DNA matrix.png

A matrix match is a significant clue in terms of who descends from which ancestors. For example, I know, based on who Amy matches, and who she doesn’t match, that she descends from the Ferverda side and that Charles, Rex and Maxine descend from ancestors on the Miller side.
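If you download your matches, the same kind of grouping can be roughed out in Python. This sketch assumes you have a list of which of your matches also match each other. The placeholder names P1 through P5 and the pairs are invented, not taken from my matrix. The groups it produces are clues to be checked against trees and segments, not proof of a common ancestor.

```python
def cluster_matches(names, match_pairs):
    """Group matches into clusters linked by matching each other."""
    neighbors = {name: set() for name in names}
    for a, b in match_pairs:
        neighbors[a].add(b)
        neighbors[b].add(a)

    clusters, seen = [], set()
    for name in names:
        if name in seen:
            continue
        stack, cluster = [name], set()
        while stack:
            person = stack.pop()
            if person in cluster:
                continue
            cluster.add(person)
            stack.extend(neighbors[person] - cluster)
        seen |= cluster
        clusters.append(sorted(cluster))
    return clusters


# Hypothetical matrix results: which of my matches also match each other.
names = ["P1", "P2", "P3", "P4", "P5"]
match_pairs = [("P1", "P2"), ("P3", "P4"), ("P4", "P5")]
print(cluster_matches(names, match_pairs))
```

Here P1 and P2 fall into one group and P3, P4 and P5 into another, suggesting, but not proving, two different sides of the tree.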

Looking in the chromosome browser, I can tell that Cheryl, Don, Amy and I match on some common segments.

Matching multiple people on the same segment that descends from a common ancestor is called triangulation.

Let’s take a look at the MyHeritage triangulation tool.

MyHeritage

Moving now to MyHeritage, which provides us with an easy-to-use triangulation tool, we see the following when clicking on DNA matches on the DNA tab on the toolbar.

First Steps MyHeritage matches


Cousin Cheryl is at MyHeritage too. By clicking on Review DNA Match, the purple button on the right, I can see who else I match in common with Cheryl, plus triangulation.

The list of people Cheryl and I both match is shown below, along with our relationships to each person.

First Steps MyHeritage triangulation


I’ve selected 2 matches to illustrate.

The first match has a little purple icon to the right which means that Amy triangulates with me and Cheryl.

The second match, Rex, has no icon, which means that while we both match Rex, it’s not on the same segment. I know that without looking further because there is no triangulation button. We both match Rex, but Cheryl matches Rex on a different segment than I do.

Without additional genealogy work, using DNA alone, I can’t say whether or not Cheryl, Rex and I all share a common ancestor. As it turns out, we do. Rex is a known cousin who I tested. However, in an unknown situation, I would have to view the trees of those matches to make that determination.

Triangulation

Clicking on the purple triangulation icon for Amy shows me the segments that all 3 of us, me, Amy and Cheryl share in common as compared to me.

First Steps MyHeritage triangulation chromosome browser.png

Cheryl is red and Amy is yellow. The one segment bracketed with the rounded rectangle is the segment shared by all 3 of us.

Do we have a common ancestor? I know Cheryl and I do, but maybe I don’t know who Amy is. Let’s look at Amy’s tree which is also shown if I scroll down.

First Steps MyHeritage common ancestor.png

Amy didn’t have her tree built out far enough to show our common ancestor, but I immediately recognized the surname Ferverda found in her tree a couple of generations back. Darlene was the daughter of Donald Ferverda who was the son of Hiram Ferverda, my great-grandfather.

Hiram was the father of Cheryl’s father, Roscoe and my grandfather, John Ferverda.

First Steps Hiram Ferverda pedigree.png

Amy is my first cousin twice removed and that segment of DNA that I share with her is from either Hiram Ferverda or his wife Eva Miller.

Now, based on who else Amy matches, I can probably tell whether that segment descends from Hiram or Eva.

Viva triangulation!

Theory of Family Relativity

MyHeritage’s Theory of Family Relativity provides theories about the common ancestor to people whose DNA matches, if MyHeritage can calculate how the 2 people are potentially related.

MyHeritage uses a combination of tools to make that connection, including:

  • DNA matches
  • Your tree
  • Your match’s tree
  • Other people’s trees at MyHeritage, FamilySearch and Geni if the common ancestor cannot be found in your tree compared against your DNA match’s MyHeritage tree
  • Documents in the MyHeritage data collection, such as census records, for example.

MyHeritage theory update

To view the Theories, click on the purple “View Theories” banner or “View theory” under the DNA match.

First Steps MyHeritage theory of relativity


The theory is displayed in summary format first.

MyHeritage view full theory


You can click on the “View Full Theory” to see the detail and sources about how MyHeritage calculated various paths. I have up to 5 different theories that utilize separate resources.

MyHeritage review match


A wonderful aspect of this feature is that MyHeritage shows you exactly the information they utilized and calculates a confidence factor as well.

All theories should be viewed as exactly that and should be evaluated critically for accuracy, taking into consideration sources and documentation.

I wrote about using Theories of Relativity, with instructions, here and here.

I love this tool and find the Theories mostly accurate.

AncestryDNA

Ancestry doesn’t offer a chromosome browser or triangulation but does offer a tree view for people that you match, so long as you have a subscription. In the past, a special “Light” subscription for DNA only was available for approximately $49 per year that provided access to the trees of your DNA matches and other DNA-related features. You could not order online and had to call support, sometimes asking for a supervisor in order to purchase that reduced-cost subscription. The “Light” subscription did not provide access to anything outside of DNA results, meaning documents, etc. I don’t know if this is still available.

After signing on, click on DNA matches on the DNA tab on the toolbar.

You’ll see the following match list.

First Steps Ancestry matches


I’ve tested twice at Ancestry, the second time when they moved to their new chip, so I’m my own highest match. Click on any match name to view more.

First Steps Ancestry shared matches


You’ll see information about common ancestors if you have some in your trees, plus the amount of shared DNA along with a link to Shared Matches.

I found one of the same cousins at Ancestry whose match we were viewing at MyHeritage, so let’s see what her match to me at Ancestry looks like.

Below are my shared matches with that cousin. The notes to the right are mine, not provided by Ancestry. I make extensive use of the notes fields provided by the vendors.

First Steps Ancestry shared matches with cousin


On your match list, you can click on any match, then on Shared Matches to see who you both match in common. While Ancestry provides no chromosome browser, you can see the amount of DNA that you share and trees, if any exist.

Let’s look at a tree comparison when a common ancestor can be detected in a tree within the past 7 generations.

First Steps Ancestry view ThruLines.png

What’s missing of course is that I can’t see how we match because there’s no chromosome browser, nor can I see if my matches match each other.

Stitched Trees

What I can see, if I click on “View ThruLines” above or ThruLines on the DNA Summary page on the main DNA tab, is all of the people I match who Ancestry THINKS descend from a common ancestor with me. This ancestor information isn’t always taken from either person’s tree.

For example, if my match hadn’t included Hiram Ferverda in her tree, Ancestry would use other people’s trees to “stitch them together” such that the tester is shown to be descended from a common ancestor with me. Sometimes these stitched trees are accurate and sometimes they are not, although they have improved since they were first released. I wrote about ThruLines here.

First Steps Ancestry ThruLines tree


In closer generations, especially if you are looking to connect with cousins, tree matching is a very valuable tool. In the graphic above, you can see all of the cousins who descend from Hiram Ferverda who have tested and DNA match to me. These DNA matches to me either descend from Hiram according to their trees, or Ancestry believes they descend from Hiram based on other people’s trees.

With more distant ancestors, other people’s trees are increasingly likely to be copied with no sources, so take them with a very large grain of salt (perchance the entire salt lick.) I use ThruLines as hints, not gospel, especially the further back in time the common ancestor. I wish they reached back another couple of generations. They are great hints and they end with the 7th generation where my brick walls tend to begin!

23andMe

I haven’t mentioned 23andMe yet in this article. Genealogists do test there, especially adoptees who need to fish in every pond.

23andMe is often the 4th choice of the major 4 vendors for genealogy due to the following challenges:

  • No tree support, other than allowing you to link to a tree at FamilySearch or elsewhere. This means no tree matching.
  • Fewer than 2000 matches, meaning that every person is limited to a maximum of 2000 matches, minus however many of those 2000 don’t opt-in for genealogical matching. Given that 23andMe’s focus is increasingly health, my number of matches continues to decrease and is currently just over 1500. The good news is that those 1500 are my highest, meaning closest matches. The bad news is that genealogy is not 23andMe’s focus.

If you are an adoptee, a die-hard genealogist or specifically interested in ethnicity, then test at 23andMe. Otherwise all three of the other vendors would be better choices.

However, like the other vendors, 23andMe does have some features that are unique.

Their ethnicity predictions are acknowledged to be excellent. Ethnicity at 23andMe is called Ancestry Composition, and you’ll see that immediately when you sign in to your account.

First Steps 23andMe DNA Relatives.png

Your matches at 23andMe are found under DNA Relatives.

First Steps 23andMe tools


At left, you’ll find filters and the search box.

The Mom’s side and Dad’s side filters sort your matches if you’ve tested your parents, but it’s not like the Family Tree DNA bucketing, which provides maternal and paternal side assignments by utilizing relatives through third cousins if your parents aren’t available for testing.

Family names aren’t your family names, but the top family names that match to you. Guess what my highest name is? Smith.

However, Ancestor Birthplaces are quite useful because you can sort by country. For example, my mother’s grandfather Ferverda was born in the Netherlands.

First Steps 23andMe country.png

If I click on Netherlands, I can see my 5 matches with ancestors born in the Netherlands. Of course, this doesn’t mean that I match because of my match’s Dutch ancestors, but it does provide me with a place to look for a common ancestor and I can proceed by seeing who I match in common with those matches. Unfortunately, without trees we’re left to rely on ancestor birthplaces and family surnames, if my matches have entered that information.

One of my Dutch matches also matches my Ferverda cousin. Given that connection, and that the Ferverda family immigrated from Holland in 1868, that’s a starting point.

MyHeritage has similar features and is much more prevalent in Europe.

By clicking on my Ferverda cousin, I can view the DNA we share, who we match in common, our common ethnicity and more. I have the option of comparing multiple people in the chromosome browser by clicking on “View DNA Comparison” and then selecting who I wish to compare.

First Steps 23andMe view DNA Comparison.png

By scrolling down instead of clicking on View DNA Comparison, I can view where my Ferverda cousin matches me on my chromosomes, shown below.

First STeps 23andMe chromosome browser.png

23andMe identifies completely identical segments, which would be painted in dark purple, as shown in the legend at bottom left.

Adoptees love this feature because it would immediately differentiate between half and full siblings. Full siblings share approximately 25% of the exact DNA on both their maternal and paternal strands of DNA, while half siblings only share the DNA from one parent – assuming their parents aren’t closely related. I share no completely identical DNA with my Ferverda cousin, so no segments are painted dark purple.
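The 25% figure falls out of simple counting, which this short Python sketch illustrates. It looks at a single location, labels each parent’s two copies “A” and “B” purely for illustration, and assumes each copy is equally likely to be passed on and that the parents aren’t related to each other.

```python
from collections import Counter
from itertools import product


def sibling_sharing_fractions():
    """Chance that two full siblings share 0, 1 or 2 copies at one location."""
    copies = ("A", "B")  # labels for each parent's two copies (illustrative only)
    counts = Counter()
    # What Mom and Dad passed to sibling 1, and what they passed to sibling 2.
    for mom1, dad1, mom2, dad2 in product(copies, repeat=4):
        shared = (mom1 == mom2) + (dad1 == dad2)
        counts[shared] += 1
    total = sum(counts.values())
    return {shared: count / total for shared, count in counts.items()}


fractions = sibling_sharing_fractions()
print("Completely identical (both copies):", fractions[2])
print("Half identical (one copy):", fractions[1])
print("Not identical:", fractions[0])
```

Half siblings share only one parent, so the second comparison never applies and they have no completely identical regions, which is what makes the dark purple so telling.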

23andMe and Ancestry Maps Show Where Your Matches Live

Another reason that adoptees and people searching for birth parents or unknown relatives like 23andMe is because of the map function.

After clicking on DNA Relatives, click on the Map function at the top of the page which displays the following map.

First Steps 23andMe map


This isn’t a map of where your matches’ ancestors lived, but is where your matches THEMSELVES live. Furthermore, you can zoom in, click on the button and it displays the name of the individual and the city where they live or whatever they entered in the location field.

First Steps 23andMe your location on map.png

I entered a location in my profile and confirmed that the location indeed displays on my matches’ maps by signing on to another family member’s account. What I saw is the display above. I’d wager that most testers don’t realize that their home location and photo, if entered, are being displayed to their matches.

I think sharing my ancestors’ locations is a wonderful, helpful, idea, but there is absolutely no reason whatsoever for anyone to know where I live and I feel it’s stalker-creepy and a safety risk.

First Steps 23andMe questions.png

If you enter a location in this field in your profile, it displays on the map.

If you test with 23andMe and you don’t want your location to display on this map to your matches, don’t answer any question that asks you where you call home or anything similar. I never answer any questions at 23andMe. They are known for asking you the same question repeatedly, in multiple locations and ways, until you relent and answer.

Ancestry has a similar map feature and they’ve also begun to ask you questions that are unrelated to genealogy.

Ancestry Map Shows Where Your Matches Live

At Ancestry, when you click to see your DNA matches, look to the right at the map link.

First Steps Ancestry map link.png

By clicking on this link, you can see the locations that people have entered into their profile.

First Steps Ancestry match map.png

As you can see, above, I don’t have a location entered and I am prompted for one. Note that Ancestry does specifically say that this location will be shown to your matches.

You can click on the Ancestry Profile link here, or go to your Personal Profile by clicking the dropdown under your user name in the upper right hand corner of any page.

This is important because if you DON’T want your location to show, you need to be sure there is nothing entered in the location field.

First Steps Ancestry profile.png

Under your profile, click “Edit.”

First Steps Ancestry edit profile.png

After clicking edit, complete the information you wish to have public or remove the information you do not.

First Steps Ancestry location in profile.png

Sometimes Your Answer is a Little More Complicated

This is a First Steps article. Sometimes the answer you seek might be a little more complicated. That’s why there are specialists who deal with this all day, every day.

What issues might be more complex?

If you’re just starting out, don’t worry about these things for now. Just know when you run into something more complex or that doesn’t make sense, I’m here and so are others. Here’s a link to my Help page.

Getting Started

What do you need to get started?

  • You need to take a DNA test, or more specifically, multiple DNA tests. You can test at Ancestry or 23andMe and transfer your results to both Family Tree DNA and MyHeritage, or you can test directly at all vendors.

Neither Ancestry nor 23andMe accepts uploads, meaning other vendors’ tests, but both MyHeritage and Family Tree DNA accept most file versions. Instructions for how to download and upload your DNA results are found below, by vendor:

Both MyHeritage and Family Tree DNA charge a minimal fee to unlock their advanced features such as chromosome browsers and ethnicity if you upload transfer files, but it’s less costly in both cases than testing directly. However, if you want the MyHeritage DNA plus Health or the Family Tree DNA Y DNA or Mitochondrial DNA tests, you must test directly at those companies for those tests.

  • It’s not required, but it would be in your best interest to build as much of a tree at all three vendors as you can. Every little bit helps.

Your first tree-building step should be to record what your family knows about your grandparents and great-grandparents, aunts and uncles. Here’s what my first step attempt looked like. It’s cringe-worthy now, but everyone has to start someplace. Just do it!

You can build a tree at either Ancestry or MyHeritage and download your tree for uploading at the other vendors. Or, you can build the tree using genealogy software on your computer and upload to all 3 places. I maintain my primary tree on my computer using RootsMagic. There are many options. MyHeritage even provides free tree builder software.

Both Ancestry and MyHeritage offer research/data subscriptions that provide you with hints to historical documents that increase what you know about your ancestors. The MyHeritage subscription can be tried for free. I have full subscriptions to both Ancestry and MyHeritage because they both include documents in their collections that the other does not.

Please be aware that document suggestions are hints and each one needs to be evaluated in the context of what you know and what’s reasonable. For example, if your ancestor was born in 1750, they are not included in the 1900 census, nor do women have children at age 70. People do have exactly the same names. FindAGrave information is entered by humans and is not always accurate. Just sayin’…

Evaluate critically and skeptically.

Ok, Let’s Go!

When your DNA results are ready, sign on to each vendor, look at your matches and use this article to begin to feel your way around. It’s exciting and the promise is immense. Feel free to share the link to this article on social media or with anyone else who might need help.

You are the cumulative product of your ancestors. What better way to get to know them than through their DNA that’s shared between you and your cousins!

What can you discover today?

______________________________________________________________

Disclosure

I receive a small contribution when you click on some of the links to vendors in my articles. This does NOT increase the price you pay but helps me to keep the lights on and this informational blog free for everyone. Please click on the links in the articles or to the vendors below if you are purchasing products or DNA testing.

Thank you so much.

DNA Purchases and Free Transfers

Genealogy Services

Genealogy Research

MyHeritage LIVE Conference Day 2 – The Science Behind DNA Matching    

The MyHeritage LIVE Oslo conference is but a fond memory now, and I would count it as a resounding success.

Perhaps one of the reasons I enjoyed it so much is the scientific aspect and because the content is very focused on a topic I enjoy without being the size and complexity of Rootstech. The smaller, more intimate venue also provides access to the “right” people as well as the ability to meet other attendees and not be overwhelmed by the sheer size.

Here are some stats:

  • 401 registered guests
  • 28 countries represented including distant places like Australia and South America
  • More than 20 speakers plus the hands-on workshops where specialist teams worked with students
  • 38 sessions and workshops, plus the party
  • 60,000 livestream participants, in spite of the time differences around the world

I was blown away by the number of livestream attendees.

I don’t know what criteria Gilad Japhet will be using to determine “success” but I can’t imagine this conference being judged as anything but.

Let’s take a look at the second day. I spent part of the time talking to people and drifting in and out of the rear of several sessions for a few minutes. I meant to visit some of the workshops, but there was just too much good, distracting content elsewhere.

I began Sunday in Mike Mansfield’s presentation about SuperSearch. Yes, I really did attend a few sessions not about DNA, but my favorite was the session on Improved DNA Matching.

Improved DNA Matching

I’m sure it won’t surprise any of my readers that my favorite presentations were about the actual science of genetic genealogy.

Consumers don’t really need to understand the science behind autosomal results to reap the benefits, but the underlying science is part of what I love – and it’s important for me to understand the underpinnings to be able to unravel the fine points of what the resulting matches are and are not revealing. Misinterpretation of DNA results leading to faulty conclusions is a real issue in genetic genealogy today. Consequently, I feel that anyone working with other people’s results and providing advice really needs to understand how the science and technology work together.

Dr. Daphna Weissglas-Volkov, a population geneticist by training, although she clearly functions far beyond that scope today, gave a very interesting presentation about how MyHeritage handles (their greatly improved) DNA Matching. I’m hitting the high points here, but I would strongly encourage you to watch the video of this session when it is made available online.

In addition to Dr. Weissglas-Volkov’s slides, I’ve added some additional explanations and examples in various places. You can easily tell that the slides are hers and the graphics that aren’t MyHeritage slides are mine.

Dr. Weissglas-Volkov began the session by introducing the MyHeritage science team and then explaining terminology to set the stage.

A match is when two people match each other on a fairly long piece of DNA. Of course, “fairly long” is defined differently by each vendor.

Your genetic map (of your chromosomes) is comprised of the DNA you inherit from different ancestors by the process of recombination when DNA is transferred from the parents to the child. A centiMorgan is a measure of the relative likelihood that a recombination will occur in a single generation. On average, 36 recombinations occur in each generation, each one being a point where the DNA on a chromosome is divided. However, women, for reasons unknown, have about 1.5 times as many recombinations as men.

You can’t see that when looking at an example of a person compared to their parents, of course, because each individual is a full match to each parent, but you can see this visually when comparing a grandchild to their maternal grandmother and their paternal grandmother on a chromosome browser.

The above illustration is the same female grandchild compared to her maternal grandmother, at left, and her paternal grandmother at right. Therefore the number of crossovers at left is through a female child (her mother), and the number at right is through a male child (her father.)

# of Crossovers
Through female child – left 57
Through male child – right 22

There are more segments at left, through the mother, and the segments are generally shorter, because they have been divided into more pieces.

At right, fewer and larger segments through the father.

Keep in mind that because you have a strand of DNA from each parent, with exactly the same “street addresses,” what is produced by DNA sequencing is two columns of data – but your Mom’s and Dad’s DNA is intermixed.

The information in the two columns can’t be identified as Mom’s or Dad’s DNA or strand at this point.

That interspersed raw data is called a genotype. A haplotype is when Mom’s and Dad’s DNA can be reassembled into “sides” so you can attribute the two letters at each address to either Mom or Dad.

Here’s a quick example.

The goal, of course, is to figure out how to reassemble your DNA into Mom’s side and Dad’s side so that we know that someone matching you is actually matching on all As (Mom) or all Gs (Dad,) in this example, and not a false match that zigzags back and forth between Mom and Dad.

The best way to accomplish that goal of course is trio phasing, when the child and both parents are available, so by comparing the child’s DNA with the parents you can assign the two strands of the child’s DNA.
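To make trio phasing a bit more concrete, here is a minimal Python sketch of the logic at a single address. The function name phase_site and the two-letter genotype strings are my own assumptions for illustration, not anything MyHeritage has published.

```python
def phase_site(child, mom, dad):
    """Return (from_mom, from_dad) for one address, or None if ambiguous.

    Each argument is an unordered two-letter genotype such as "AG".
    This is a hypothetical helper, not a vendor's algorithm.
    """
    a, b = child[0], child[1]
    options = set()
    # Try both ways of assigning the child's two letters to the parents.
    for from_mom, from_dad in ((a, b), (b, a)):
        if from_mom in mom and from_dad in dad:
            options.add((from_mom, from_dad))
    # Exactly one consistent assignment means the address is phased.
    if len(options) == 1:
        return options.pop()
    return None


print(phase_site("AG", "AA", "GG"))  # ('A', 'G')
print(phase_site("AG", "AG", "AG"))  # None
print(phase_site("AA", "AG", "AA"))  # ('A', 'A')
```

The second call returns None because the child and both parents all carry A and G at that address, so the parents alone can't tell you which letter came from whom. Addresses like that are one reason statistical phasing is still useful even when parents have tested.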

Unfortunately, few people have both or even one parent available in order to actually divide their DNA into “sides,” so the next best avenue is statistical phasing. I’ve called this academic phasing in the past, as compared to parental phasing which MyHeritage refers to as trio phasing.

There’s a huge amount of confusion about phasing, with few people understanding there are two distinct types.

Statistical phasing is a type of machine learning where a large number of reference populations are studied. Since we know that DNA travels together in blocks when inherited, statistical phasing learns which DNA travels with which buddy DNA – and creates probabilities. Your DNA is then compared to these models and your DNA is reshuffled in order to assemble your DNA into two groups – one representing your Mom’s DNA and one representing your Dad’s DNA, according to statistical probability.

Looking at your genotype, if we know that As group together at those 6 addresses in my example 95% of the time, then we know that the most likely scenario to create a haplotype is that all of the As came from one parent and all of the Gs from the other parent – although without additional information, there is no way to yet assign the maternal and paternal identifier. At this point, we only know parent 1 and parent 2.

In order to train the computers (machine learning) to properly statistically phase testers’ results, MyHeritage uses known relationships of people to teach the machines. In other words, their reference panels of proven haplotypes grow all of the time as parent/child trios test.

Dr. Weissglas-Volkov then moved on to imputation.

When sequencing DNA, not every location reads accurately, so the missing values can be imputed, or “put back” using imputation.

Initially imputation was a hot mess. Not just for MyHeritage, but for all vendors, imputation having been forced upon them (and therefore us) by Illumina’s change to the GSA chip.

However, machine learning means that imputation models improve constantly, and matching using imputation is greatly improved at MyHeritage today.

Imputation can do more than just fill in blanks left by sequencing read errors.

The benefit of imputation to the genetic genealogy community is that, because vendors use disparate chips, vendors that want to allow uploads have been forced to utilize imputation to create a global template that incorporates all of the locations from each vendor, then impute the values they don’t actually test for themselves to complete the full template for each person.

In the example below, you can see that no vendor tests all available locations, but when imputation extends the sequences of all testers to the full 1-500 locations, the results can easily be compared to every other tester because every tester now has values in locations 1-500, regardless of which vendor/chip was utilized in their actual testing.
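Here is a small Python sketch of the template idea only. It builds the combined template from each chip's tested locations and lists which locations would have to be imputed for each chip. The chip names and location numbers are made up, and the sketch deliberately does not do the imputation itself, which requires statistical models trained on reference data.

```python
def template_gaps(vendor_locations):
    """Map each vendor to the template locations it does not test.

    vendor_locations: dict of vendor name -> set of tested locations.
    The template is simply the union of every vendor's locations.
    """
    template = set().union(*vendor_locations.values())
    return {vendor: sorted(template - tested)
            for vendor, tested in vendor_locations.items()}


# Hypothetical chips testing a handful of numbered locations.
chips = {"chip_a": {1, 2, 3, 5}, "chip_b": {2, 4, 5, 6}}
print(template_gaps(chips))  # {'chip_a': [4, 6], 'chip_b': [1, 3]}
```

Once those gaps are filled by imputation, every tester has a value at every template location, which is what makes comparison across chips possible.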

Therefore, using imputation, MyHeritage is able to match between quite disparate chips, such as the traditional Illumina chips (OmniExpress), the custom Ancestry chip and the new GSA chip utilized by 23andMe and LivingDNA.

So, how are matches determined?

Matching

First your DNA and that of another person are scanned for nearly identical seed sequences.

A minimum segment length of 6cM must be identified for further match processing to occur. Anything below 6cM is discarded at this point.

The match is then further evaluated to see if the seed match is of a high enough quality that it should be perfected and should count as a match. Other segments continue to be evaluated as well. If the total matching segment(s) is 8 total cM or greater, it’s considered a valid match. MyHeritage has taken the position that they would rather give you a few accidental false matches than to miss good matches. I appreciate that position.

Window cleaning is how they refer to the process of removing pileup regions known to occur in the human genome. This is NOT the same as Ancestry’s routine that removes areas they determine to be “too matchy” for you individually.

The difference is that in humans, for example, there is a segment of chromosome 6 where, for some reason, almost all humans match. Matching across that segment is not informative for genetic genealogy, so that region along with several others similar in nature are removed. At Ancestry, those genome-wide pileup segments are removed, along with other regions where Ancestry decides that you personally have too many matches. The problem is that for me, these “too matchy” segments are many of my Acadian matches. Acadians are endogamous, so lots of them match each other because as a small intermarried population, they share a great deal of the same DNA. However, to me, because I have one great-grandfather that’s Acadian, that “too matchy” information IS valuable although I understand that it wouldn’t be for someone that is 100% Acadian or Jewish.

In situations such as Ashkenazi Jewish matching, which is highly endogamous, MyHeritage uses a higher matching threshold. Otherwise every Ashkenazi person would match every other Ashkenazi person because they all descend from a small founder population, and for genealogy, that’s not useful.

The last step in processing matches is to establish the confidence level that the match is accurately predicted at the correct level – meaning the relationship range based on the amount of matching DNA and other criteria.

For example, does this match cluster with other proven matches of the same known relationship level?

From several confidence ascertainment steps, a confidence score is assigned to the predicted relationship.

Of course, you as a customer see none of this background processing, just the fact that you do match, the size of the match and the confidence score. That’s what genealogists need!

Matching Versus Triangulation Thresholds

Confusion exists about matching thresholds versus triangulation thresholds.

While any single segment must be over 6 cM in length for the matching process to begin, the actual match threshold at MyHeritage is a total of 8 cM.

I took a look at my lowest match at MyHeritage.

I have two segments, one 6.1 cM segment, and one 6 cM segment that match. It would appear that if I only had one 6 cM segment, it would not show as a match because I didn’t have the minimum 8 cM total.
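The two thresholds can be expressed as a short Python sketch. The function name is_match is my own, and I'm assuming a segment of exactly 6 cM is kept, since my lowest match includes one. The sketch also skips the quality evaluation of the seed match described earlier, so treat it as an approximation of the rule, not MyHeritage's actual code.

```python
MIN_SEGMENT_CM = 6.0  # segments shorter than this are discarded
MIN_TOTAL_CM = 8.0    # total of remaining segments needed for a match


def is_match(segment_cms):
    """Apply the per-segment and total thresholds to a list of cM sizes."""
    kept = [cm for cm in segment_cms if cm >= MIN_SEGMENT_CM]
    return sum(kept) >= MIN_TOTAL_CM


print(is_match([6.1, 6.0]))  # True
print(is_match([6.0]))       # False
print(is_match([5.9, 5.9]))  # False
```

The last call shows why two small segments don't add up to a match: both fall below the 6 cM segment threshold, so nothing is left to count toward the 8 cM total.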

Triangulation Threshold

However, after you pass that matching criteria and move on to triangulation with a matching individual, you have the option of selecting the triangulation threshold, which is not the same thing as the match threshold. The match threshold does not change, but you can change the triangulation threshold from 2 cM to 8 cM and selections in-between.

In the example below, I’m comparing myself against two known relatives.

You won’t be shown any matches below the 6 cM individual segment threshold, BUT you can view triangulated segments of different sizes. This is because matching segments often don’t line up exactly and the triangulated overlap between several individuals may be very small, but may still be useful information.

Flying your mouse over the location in the bubble, which is the triangulated segment, tells you the size of the triangulated portion. If you selected the 2 cM triangulation, you would see smaller triangulated portions of matches.
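The triangulated portion is just the stretch where all three matching segments overlap. Here is a minimal Python sketch of that calculation, using made-up start and end positions. The function name and the (chromosome, start, end) layout are assumptions for illustration. Converting the overlap to cM requires a genetic map, which this sketch does not attempt.

```python
def triangulated_overlap(*segments):
    """Return the region shared by all segments, or None if there is none.

    Each segment is a (chromosome, start, end) tuple.
    """
    chroms = {seg[0] for seg in segments}
    if len(chroms) != 1:
        return None  # segments on different chromosomes cannot overlap
    start = max(seg[1] for seg in segments)
    end = min(seg[2] for seg in segments)
    if start >= end:
        return None
    return (chroms.pop(), start, end)


# Hypothetical positions: me vs. A, me vs. B, and A vs. B.
print(triangulated_overlap(
    (15, 20_000_000, 45_000_000),
    (15, 30_000_000, 60_000_000),
    (15, 25_000_000, 50_000_000),
))  # (15, 30000000, 45000000)
```

Notice that the shared region is smaller than any of the three individual matches, which is why a lower triangulation threshold can still show useful information.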

Closing Session

The conference was closed by Aaron Godfrey, a super-nice MyHeritage employee from the UK. The closing session is worth watching on the recorded livestream when it becomes available, in part because there are feel good moments.

However, the piece of information I was looking for was whether there will be a MyHeritage LIVE conference in 2019, and if so, where.

I asked Gilad afterwards and he said that they will be evaluating the feedback from attendees and others when making that decision.

So, if you attended or joined the livestream sessions and found value, please let MyHeritage know so that they can factor your feedback into their decision. If there are topics you’d like to see as sessions, I’m sure they’d love to hear about that too. Me, I’m always voting for more DNA😊

I hope to hear about MyHeritage LIVE 2019, and I’m voting for any of the following locations:

  • Australia
  • New Zealand
  • Israel
  • Germany
  • Switzerland

What do you think?

______________________________________________________________

Disclosure

I receive a small contribution when you click on some of the links to vendors in my articles. This does NOT increase the price you pay but helps me to keep the lights on and this informational blog free for everyone. Please click on the links in the articles or to the vendors below if you are purchasing products or DNA testing.

Thank you so much.

DNA Purchases and Free Transfers

Genealogy Services

Genealogy Research

DNA Painter – Touring the Chromosome Garden

This is the third article in a series about DNA Painter. To know DNA Painter is to love DNA Painter! Trust me!

The first two articles are:

The Chromosome Sudoku article introduces you to DNA Painter, its purpose and how to use the tool. The Mining Vendor Data article illustrates exactly how to find the segments you can paint from each of the main autosomal testing vendors and GedMatch.

This article is a leisurely tour through my colorful chromosome garden so that, together, we can see examples of how to utilize the information that chromosome painting unveils.

Chromosome painting can do amazing things: walk you back generations, show visual phasing…and reveal that there’s a mistake someplace, too.

If you’re not willing to be wrong and reconsider, this might not be the field for you😊

Automatic Triangulation

Chromosome painting automatically mathematically triangulates your DNA and in a much easier way than the old spreadsheet method. In fact, triangulation just happens, effortlessly IF you can determine which side is maternal and which side is paternal. Of course, you’ll always want to check to be sure that your matches also match each other. If not, then that’s an indication that maybe one or both are identical by chance.

The definition of triangulation in this context means:

  • To find a common segment
  • Of reasonable size (generally 7cM or over)
  • That is confirmed to a common ancestor with at least two other individuals
  • Who are not close family

Close family generally means parents, siblings, sometimes grandparents, although parents and grandparents can certainly be used to verify that the match is valid. The best triangulation situation is when you match those two other people through a second child, meaning siblings of your ancestor.

Different matches, depending on the circumstances, have a different level of value to you as a genealogist. In other words, some are more solid than others.

The X chromosome has special matching and triangulation rules, so we’ll talk about that when we get to that section.

Don’t think of chromosome painting as “doing” triangulation, because triangulation is a bonus of chromosome painting, and it just happens, automatically, so long as you can confirm that the segment is from either your maternal or paternal line.

What does triangulation look like in DNA Painter?

Here’s what my painted chromosome 15 looks like.

Here, I’ve drawn boxes around the areas that are triangulated. Actually, I made a small mistake and omitted one grey bar that’s also part of a second triangulation group. Can you spot it? Hint – look at the grey bars at far right in the overlapping triangulation group boxes where the red arrow is pointing. The box below should extend upwards to incorporate part of that top grey bar too.

Triangulated segments are those several segments piled up on top of each other. It means they match you at the same address on either the maternal or paternal chromosome. That’s good, but it’s not the same as an official “pileup area.”

Ok, so what’s a pileup area?

Pileup Areas

Certain locations in the human genome have been designated as pileup regions based on the fact that many people will match on these segments, not necessarily because they share a common relatively recent ancestor, but instead because a particular segment has a very high frequency in the general human population, or in the population of a specific region. Translated, this means that the segment might not be relevant to genealogy.

But before going too far with this discussion, it doesn’t mean that matches in pileup regions aren’t relevant to genealogy – just consider it a caution sign.

Aside from chromosome 6, which includes the HLA region, I’ve always been rather suspicious of pileup regions, because they don’t seem to hold true for me. You can view a chart that I assembled of the known pileup regions here.

DNA Painter generously includes pileup region warnings, in essence, along a chromosome bar at the top indicating “shared” or “both.”

Pileup regions are indicated by the grey hashed region at right. In my case, on chromosome 1, the pileup region isn’t piled up at all, on either the paternal (blue) chromosome or the maternal (pink) chromosome.

As you can see, I have exactly one match on the maternal side (green) and one (gold) on the paternal side (with a smidgen of a second grey match) as well, with both extending significantly beyond the pileup region. There is no reason to suspect that these gold and green matches aren’t valid.

If I saw many more matches in a pileup region than elsewhere, or many small matches, or DNA that was supposed to be from multiple ancestors not in the same line, then I’d have to question whether a pileup region was responsible.

Stacked Segments

DNA Painter provides you with the opportunity to see which of your ancestors’ segments stack. Stacking is a very important concept of DNA painting.

Before we talk about stacking, notice that the legend for which segments are color coded to specific ancestors is located at right. You can also click on the little grey box beside “Shared or Both,” at left, to show the match names beside the segments.  This is very useful when trying to analyze the accuracy of the match.

I wish DNA Painter offered an option to paint the ancestor’s names beside the segments. Maybe in V2. It’s really difficult to complain about anything because this tool is both free and awesome.

I’m using PowerPoint to label this group of stacked matches for this example.

This is a situation where I know my pedigree chart really well, so I know immediately upon looking at this stacked segment group who this piece of DNA descends from.

Here’s my pedigree chart that corresponds to the stacked segment.

We attribute each DNA segment to a couple initially based on who we match. In this case, that’s William George Estes and Ollie Bolton, my grandparents. The DNA remains attributed to them until we have evidence of which individual person in the couple received that DNA from their ancestors and passed it on to their descendant.

Therefore, the pink people are the half of the couple who we now know (thanks to DNA Painter) did NOT contribute that DNA segment, because we can track the DNA directly through the yellow line until we’re once again to another genetic brick wall couple.

My father is listed at left, and the DNA path runs back to William Crumley the second and his unknown wife who is haplogroup H2a1, the yellow couple at far right. How cool is this? One of those ancestors (or a combined segment from both) has been passed intact to me today. This is not a trivial segment either at 23.3 cM. I would not expect a segment passed to 5th cousins to be that large, but it is!

Also, note that the grey segment of DNA from Lazarus Estes (1848-1918) and Elizabeth Vannoy (1847-1918) is sitting slightly to the left of the dark blue segment from William Crumley III, so part or all of the grey or blue segment may originate with a different ancestor. Perhaps we’ll know more when additional people test and match on this same segment.

Double Related

I have one person who is related to me through two different lines. I need a way to determine which line (or both) our common DNA segment descends from.

I painted the segment for both of our common ancestor couples. The pink is George Dodson (1702-1770) & Margaret Dagord. The bright blue segment is William Crumley III (1788-1859) & Lydia Brown.

Those two lines don’t converge, at least not that we know of.

Now, as I map additional people, I’ll watch this segment for a tie breaker match between the two ancestors. The gold is not a tie breaker because that’s my grandparents who are downstream of both the pink and blue ancestors.

Painted Ethnicity

23andMe does us the favor of painting our ethnicity segments and allowing us to download a file with those segments. Conversely, DNA Painter does us the favor of allowing us to paint that entire file at once.

I already know my two Native segments on chromosome 1 and 2 descend through my mother, because her DNA is Native in exactly the same location. In other words, in this case, my ethnicity segment does in fact phase to my mother, although that’s not always the case with ethnicity.

Multiple Acadian ancestors are also proven to be Native by both genealogical records and maternal and/or paternal haplogroups.

Therefore, I’ve painted my Native segments on my mother’s side in order to determine exactly from which ancestor(s) those Native segments descend.

Confirming Questionable Ancestors

One very long-standing mystery that seemed almost unsolvable was the identity of the parents of Elijah Vannoy (1784->1850). We know he was the son of one of 4 Vannoy brothers living in Wilkes County, NC. Two were eliminated by existing Bibles and other records, but the other two remained candidates in spite of sifting through every available record and resource. We were out of luck unless DNA came to the rescue. Y DNA confirmed that Elijah was descended from one of the Vannoy males, but didn’t shed light on which one.

I decided that the wives would be the key, since we knew the identity of all four wives, thankfully. Of course, that means we’d be using autosomal DNA to attempt to gather more information.

I entered one candidate couple at Ancestry as Elijah’s parents – the one I felt most likely based on tax records and other criteria – Daniel Vannoy and Sarah Hickerson.  I also entered Sarah’s parents, Charles Hickerson (c 1725-<1793) and Mary Lytle.

I began getting matches to people who descend from Charles Hickerson and Mary Lytle through children other than Sarah.

The grey segment is from a descendant of Lazarus Estes & Elizabeth Vannoy. The salmon segments are from descendants of Charles Hickerson and Mary Lytle.

These segments aren’t small, 12.8 and 16.1 cM, so I’m fairly confident that these multiple segments in combination with the Elizabeth Vannoy segment do indeed descend from Charles Hickerson and Mary Lytle.

At Ancestry, I have 5 matches to Charles Hickerson and Mary Lytle through three of their children. However, only two of the individuals have transferred their results to either Family Tree DNA, MyHeritage or GedMatch where segment information is available to customers.

Finally, the thirty year old mystery is solved!

Shifting, Sliding, Offset or Staggered Segment Groups

Occasionally, you can prove an entire large segment by groups of shifting or sliding segments, sometimes referred to as offset or staggered segments.

The entire bright pink region is inherited from Jacob Lentz (1783-1870) and Fredericka Reuhl (1788-1863.) However, it’s not proven by one individual but by a combination of 6 people whose segments don’t all overlap with each other.  The top two do match very closely with me and each other, then the third spans the two groups. The bottom 3 and part of the middle segment match very closely as well.

I can conclude that the entire dark pink region from left to right descends from Jacob and Fredericka.
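The reasoning behind staggered segments is that overlapping pieces chain together to cover one continuous region. Here is a minimal Python sketch that merges overlapping segments to show the covered region. The six (start, end) pairs are hypothetical numbers, not my actual matches, and the function name is my own.

```python
def merge_segments(segments):
    """Merge overlapping (start, end) segments into continuous regions."""
    merged = []
    for start, end in sorted(segments):
        if merged and start <= merged[-1][1]:
            # Overlaps the previous region, so extend it if needed.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(region) for region in merged]


# Two close matches, one spanning segment, then three more close matches.
staggered = [(10, 30), (11, 31), (25, 50), (45, 70), (46, 72), (47, 71)]
print(merge_segments(staggered))  # [(10, 72)]
```

A single merged region means there are no gaps in coverage. Overlap by itself doesn't prove the ancestor, of course. You still need to confirm that the people in each overlapping group match each other and share the same ancestral line.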

Two Matches – 7 Generations

Two matches is all it took to identify this segment back to George Dodson and Margaret Dagord.

The mustard match is to my grandparents (22cM), and the pink match is to George Dodson (1702-1770) and his wife (22cM) – 7 generations. These people also match each other.

Additional matches would make this evidence stronger, although a 22cM triangulated match is very significant alone. Future matches might also suggest ancestors further back in time.

First Chromosome Fully Mapped

I actually have chromosome 5 entirely mapped to confirmed ancestors. I’m so excited.

Uh Oh – Something’s Wrong

I found a stack that clearly indicates something is wrong.  The question is, what?

The mustard represents my paternal grandparents, so these segments could have come through either of them, although on the pedigree chart below, we can see that this came through my grandfather’s line.

There is only a small overlap with the magenta (Nicholas Speak 1782-1852 and Sarah Faires 1786-1865) and green (James Crumley 1711-1764 and Catherine c1712-c1790,) which could be by chance given that the Nicholas segment is 7.5 cM, so I’m leaving the magenta out of the analysis.

However, the rest of these segments overlap each other significantly, even though they are stepped or staggered.

As you can see from the colors on the pedigree chart, it’s impossible for the green segment to descend from the same ancestor as the purple segment. The purple and orange confirm that branch of the tree, but the red cannot be from the same ancestor or the same line as the green ancestor.

I suspect that the purple and orange line is correct, because there are 4 segments from different people with the same ancestral line.

This means that we have one of the following situations with the red and green segments:

  • The smaller segments are incorrect, false positives, meaning matching by chance. The green segment is 14 cM, so quite large to match by chance. The red segment is 10 cM. Possible, but not probable.
  • The segments are population-based matches, so appear in all 3 lines. Possible, technically, but also not probable due to the segment size.
  • The segments are genuine matches, and one of the lines is also found in one of the other lines, upstream. This is possible, but this would have to be the case with both the red and green lines. To continue to weigh this possibility, I’ll be watching for similar situations with these same ancestors.
  • Some combination of the above.

I need more matches on this segment for further clarity.

Visual Phasing – Crossovers

A crossover point is where the DNA on one side of a demarcation line is descended from one ancestor and the DNA on the other side is descended from another ancestor, represented by the pink and blue halves of the segment, below.

Crossovers occur when the DNA is combined from two different ancestors when it is passed to the child. In other words, a chunk of mom’s ancestors’ DNA is contributed by mom and a chunk of dad’s ancestors’ DNA is contributed as well. The seam between different ancestor’s DNA pieces is called a crossover.

In this example, the brown lines, confirmed by several testers to be from Henry Bolton (c1759-1846) and Nancy Mann (c1780-1841), are shown with a very specific left starting point, all in a vertical line. It looks for all the world like this is a crossover point. The DNA to the left would have been contributed by another, as yet unidentified, ancestor.

The gold lines above are matches from more recent generations.

Naming Those Unnamed Acadians

My Acadian ancestry is hopelessly intertwined, but chromosome painting may in fact provide me with some prayer of unraveling this ball of twine. Eventually.

When I know that someone is Acadian, but I can’t tell which of many lines I connect through, I add them as “Acadian Undetermined.”

There’s a lot of Acadian DNA, because it’s an endogamous population and they just keep passing the same segments around and around in a very limited population.

On my maternal chromosome, all of the olive green is “Acadian Undetermined.”  However, that blue segment in the stack is Rene de Forest (1670-1751) and Francoise Dugas (1678->1751).

In essence, this one match identified all of the DNA of the other people who are now simply a row in the Acadian Undetermined stack. Now I need to go back and peruse the trees of these individuals to determine if they descend from this line, or a common ancestor of this line, or if (some of) these matches are a matter of endogamy.

Endogamous matches can be population based, meaning that you do match each other, but it’s because you share so much of the same DNA because you have small pieces of many common ancestors – not because a particular segment comes from one specific ancestor. You can also share part of your DNA from Mom’s side and part from Dad’s side, because both of your parents descend from a common population and not because the entire segment comes from any particular ancestor.

On some long cold winter weekend, I’ll go through and map all of the trees of my Acadian matches to see what I can unravel. I just love matches with trees. You just can’t do something like this otherwise.

Of course, those Acadians (and other endogamous populations) can be tricky, no matter what, one click up from a needle in a haystack.

Acadian Endogamy Haystack on Steroids

At first, our haystack looks like we’ve solved the mystery of the identity of the stack.  However, we soon discover that maybe things aren’t as neat and tidy as we think.

Of course, the olive green is Acadian Undetermined, but the three other colored segments are:

  • Pink – Guillaume Blanchard (1650-1715/17) & Huguette Goujon (c1647-1717)
  • Brown/Pink – Francois Broussard (c1653-1716) & Catherine Richard (c1663-1748)
  • Coffee – Daniel Garceau (1707-1772) & Anne Doucet (1713-1791)

Looking at the pedigree chart, we find two of these couples in the same lineage, so all is good, until we find the third, pink, couple, at the bottom.

Clearly, this segment can’t be in two different lines at once, so we have a problem.  Or do we?

Working the pink troublesome lines on back, we make a discovery.

We find a Blanchard line consisting of Guillaume Blanchard born circa 1590 and Huguette Poirier also born circa 1590.

Interesting. Let’s compare the Guillaume Blanchard and Huguette Goujon line. Is this the same couple, but with a different surname for her?

No, as it turns out, the Guillaume Blanchard who married Huguette Goujon was the grandson of Guillaume Blanchard and Huguette Poirier. That haystack segment of DNA was passed down through two different lines, it appears, to converge in three descendants – me, the descendant of the pink segment couple and the descendant of the brown/burgundy segment couple. This segment reaches back in time to the birth of either Guillaume Blanchard or Huguette Poirier in 1590, someplace in France, rode over on the ship to Port Royal in the very early 1600s, probably before Jamestown was settled, and has been kicking around in my ancestors and their descendants ever since.

This 18 or so cM ancestral segment is buried someplace at Port Royal, Nova Scotia, but lives on in me and several other people through at least two divergent lines.

The X Chromosome

Several vendors don’t report the X chromosome segments. I do use X segments from those who do, but I utilize a different threshold because the SNP density is about half of that on the other chromosomes. In essence, you need a match twice as large to be equivalent to a match on another chromosome.

Generally, I don’t rely on segments below 10 cM for anyone, and I generally only use segments over 14cM and no less than 500 SNPs.

Having just said that, I have painted a few smaller segments, because I know that if they are inaccurate, they are very easy to delete. They can remain in speculative mode. The default for DNAPainter and that’s what I use.

The great thing about the X chromosome is that because of its special inheritance path, you can sometimes push these segments another 2 generations back in time.

Let’s use an X chromosome match in conjunction with my X fan chart printed through Charting Companion.

On the paternal X, I inherited the gold segment from the couple, William George Estes (1873-1971) & Ollie Bolton (1874-1955.) However, since my father didn’t inherit an X from William George Estes (because my father inherited the Y from his father,) that X segment has to be from Ollie Bolton, and therefore from her parents Joseph Bolton (1853-1920) and Margaret Claxton (1851-1920.)

The segment from Lazarus Estes (1848-1918) and Elizabeth Vannoy (1847-1918) that’s 14 cM is false. It can’t descend from that couple. Same for the 7.5 cM from Jotham Brown (c1740-c1799) & Phoebe unk (c1747-c1803.) That segment’s false too. The green 48 cM segment from Samuel Claxton (1827-1876) and Elizabeth Speak (1832-1907)?  That segment’s good to go!

On my mother’s side, there’s a 7.8 cM Acadian Undetermined, which must be false, because Curtis Benjamin Lore (1856-1909) did not inherit an X chromosome from his Acadian father, Antoine Lore (1805-1862/67.)  Therefore, my X chromosome has no Acadian at all. I never realized that before, and it makes my X chromosome MUCH easier.

How about that light green 33cM segment from Antoine Lore (1805-1862/67) & Rachel Hill (1814/15-1870/80)? That segment must come from Rachel Hill, so it’s pushed back another generation to Joseph Hill (1790-1871) and Nabby Hall (1792-1874.)

I love the X chromosome because when you find a male in the line, you automatically get bumped two more generations back to his mother’s parents. It’s like the X prize for genetic genealogy, pardon the pun!
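The X inheritance rule is simple enough to sketch in Python: a female inherits an X from both parents, while a male inherits his X only from his mother. The sketch below uses ahnentafel numbering, which is my own choice for illustration (the tester is 1, the father of person n is 2n, and the mother is 2n + 1). The function name is hypothetical.

```python
def x_ancestors(tester_is_male, generations):
    """Ahnentafel numbers of ancestors who could have contributed X DNA."""
    result = []
    current = [(1, tester_is_male)]
    for _ in range(generations):
        parents = []
        for number, is_male in current:
            if not is_male:
                # Only females receive an X from their father.
                parents.append((2 * number, True))
            # Everyone receives an X from their mother.
            parents.append((2 * number + 1, False))
        result.extend(number for number, _ in parents)
        current = parents
    return result


print(x_ancestors(False, 3))  # [2, 3, 5, 6, 7, 10, 11, 13, 14, 15]
print(x_ancestors(True, 2))   # [3, 6, 7]
```

For a female tester, the paternal grandfather (number 4) is missing from the list, but the paternal grandmother (5) and both of her parents (10 and 11) are present. That's the same logic that moves my gold segment from my grandparents as a couple to Ollie Bolton and then on to her parents.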

Adoptees

Some adoptees are lucky and receive close matches immediately. Others, not so much and the search is a long process.

If you’re an adoptee trying to figure out how your matches connect together, use in-common-match groupings to cluster matches together, then paint them in groups.  Utilize the overlapping segments in order to view their trees, looking for common surnames. Always start with the groups with the longest segments and the most matches. The larger the match, the more likely you are to be able to find a connection in a more recent generation. The more matches, the more likely you are to be able to spot a common surname (or two.)

Painting can speed this process significantly.

Much More Than Painting

I hope this tour through my colorful chromosomes has illustrated how much fun analysis can be. You’ll have so much fun that you won’t even realize you’re triangulating, phasing and all of those other difficult words.

If you have something you absolutely have to do, set an alarm – or you’ll forget all about it. Voice of experience here!

So, go and find some segments to paint so all of these exciting things can happen to you too!

How far back will you be able to identify a segment to a specific ancestor?  How about a triangulated segment? An X segment?

Have fun!!! Don’t forget to eat!

Concepts – DNA Recombination and Crossovers

What is a crossover anyway, and why do I, as a genetic genealogist, care?

A crossover is the place on a chromosome where the chromosome is cut and DNA from two different ancestors is spliced together. This happens during meiosis, when a parent’s two copies of each chromosome are recombined into the single copy that is passed to the offspring.

Identifying crossover locations, and who the DNA that we received came from, is the first step in identifying the ancestor further back in our tree who contributed that segment of DNA to us.

Crossovers are easier to see than conceptualize.

Viewing Crossovers

The crossover is the location on each chromosome where the orange and black DNA butt up against each other – like a splice or seam.

In this example, utilizing the Family Tree DNA chromosome browser, the DNA of a grandchild is compared to the DNA of a grandparent. The grandchild received exactly 50 percent of her DNA from her father, but only an average of 25% from each of her 4 grandparents. Comparing this child’s DNA to one grandmother shows that about half of the DNA she received from her father came from this grandmother, with the other half coming from the grandfather.

  • The orange segments above show the locations where the grandchild matches the grandmother.
  • The black sections (with the exception of the very tips of the chromosomes) show locations where the grandchild does not match the grandmother, so by definition, the grandchild must match the grandfather in those black locations (except chromosome tips).
  • The crossover location is the dividing line between the orange and black. Please note that the ends of chromosomes are notoriously difficult and inconsistent, so I tend to ignore what appear to be crossovers at the tips of chromosomes unless I can prove one way or the other. Of the 22 chromosomes, 16 have at least one black tip. In some cases, like chromosome 16, you can’t tell since the entire chromosome is black.
  • Ignore the grey areas – those regions are untested because they are SNP poor.

We know that the grandchild has her grandmother’s entire X chromosome, because the parent is a male who only inherited an X chromosome from his mother, so that’s all he had to give his daughter. The tips of the X chromosome are black, which would normally indicate no match to the grandmother. Since we know the entire X came from her, those tip regions must simply be unstable and not reported.

It’s also interesting to note that in 6 cases, other than the X chromosome, the entire chromosome is passed intact from grandparent to grandchild: chromosomes 4, 11, 16, 20, 21 and 22.

Twenty-six crossovers occurred between mother and son, at the 5 cM threshold. I determined this by first comparing the DNA of mother to son to find the actual beginning and end of the reported matching region on each chromosome. Knowing those boundaries tells me, when comparing the grandchild’s DNA to the grandmother, whether a black tip is a genuine crossover or just an unreported region.

For more about this, you might want to read Concepts – Segment Survival – Three and Four Generation Phasing.

Before going on, let’s look at what a match between a parent and child looks like, and why.

Parent/Child Match

If you’re wondering why I showed a match between a grandchild and a grandparent, above, instead of showing a match between a child and a parent, the chromosome browser below provides the answer.

It’s a solid orange mass for each chromosome indicating that the child matches the parent at every location.

How can this be if the child only inherits half of the parent’s DNA?

Remember – the parent has two copies of each chromosome that mix to give the child one chromosome. When comparing the child to the parent, the child’s single chromosome inherited from that parent matches one of the parent’s two chromosomes at every address location, so it shows as a complete match to the parent even though the child is matching only one of the parent’s two chromosomes at each location. This isn’t a bug; it’s just how chromosome browsers work. In other words, at each location, the “other” chromosome that your parent carries is the one you don’t match.

The diagram below shows the mother’s two copies of chromosome 1 she inherited from her father and mother and which section she gave to her child.

You can see that the mother’s father’s chromosome is blue in this illustration, and the mother’s mother’s chromosome is pink. The crossover points in the child are between parts B and C, and between parts C and D. You can clearly see that the child, when compared to the mother, does in fact match the mother in all locations, or parts, 3 blue and 1 pink, even though the matching DNA comes from the mother’s two different parents.

This example shows the child compared to both parents, so you can see that the child does in fact match both parents on every single location.

This is exactly why two different matches may match us on the same location, but may not match each other because they are from different sides of our family – one from Mom’s side and one from Dad’s.

You can read more about this in the article, One Chromosome, Two Sides, No Zipper – ICW and the Matrix.

The only way to tell which “sides” or pieces of the parent’s DNA that the child inherited is to compare to other people who descend from the same line as one of the parents.  In essence, you can compare the child to the grandparents to identify the locations that the child received from each of the 4 grandparents – and by genetic subtraction, which segments were NOT inherited from each grandparent as well, if one grandparent happens to be missing.
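The "genetic subtraction" idea can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the chromosome length and segment positions are made-up numbers, both grandparents are on the same parent's side, and the unreported chromosome tips are ignored. The function name `subtract_segments` is hypothetical, not part of any vendor's tool.

```python
def subtract_segments(chrom_start, chrom_end, matched):
    """Return the stretches of a chromosome NOT covered by the matched segments.

    If the child matches one grandparent on `matched`, the uncovered
    stretches must have come from that grandparent's spouse.
    """
    gaps = []
    cursor = chrom_start
    for start, end in sorted(matched):
        if start > cursor:
            gaps.append((cursor, start))   # uncovered stretch before this match
        cursor = max(cursor, end)
    if cursor < chrom_end:
        gaps.append((cursor, chrom_end))   # uncovered stretch at the far end
    return gaps

# Hypothetical positions on a chromosome that runs from 0 to 200.
grandmother_matches = [(0, 40), (95, 150)]
grandfather_inferred = subtract_segments(0, 200, grandmother_matches)
print(grandfather_inferred)  # [(40, 95), (150, 200)]
```

Each boundary between a grandmother segment and an inferred grandfather segment (here at 40, 95 and 150) is a candidate crossover location.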

In our Parental Chromosome pink and blue diagram illustration above, the child did NOT inherit the pink parts A, B and D, and did not inherit the blue part C – but did inherit something from the parent at every single location. They also didn’t inherit an equal amount of their grandparents pink and blue DNA. If they inherited the pink part, then they didn’t inherit the blue part, and vice versa for that particular location.
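The pink and blue example can also be written out as a small Python sketch. The part labels A through D and the colors follow the illustration described above; the function names are hypothetical and exist only for this example.

```python
PARTS = ["A", "B", "C", "D"]
COLORS = {"blue", "pink"}  # blue = mother's father, pink = mother's mother

# The single chromosome the child received from the mother in the illustration.
child_from_mother = {"A": "blue", "B": "blue", "C": "pink", "D": "blue"}

def matches_mother(child):
    """Parts where the child's copy matches one of the mother's two copies."""
    return [part for part in PARTS if child[part] in COLORS]

def not_inherited(child):
    """For each part, the grandparent color the child did NOT receive."""
    return {part: (COLORS - {child[part]}).pop() for part in PARTS}

def crossover_points(child):
    """Adjacent parts where the source switches from one grandparent to the other."""
    return [(a, b) for a, b in zip(PARTS, PARTS[1:]) if child[a] != child[b]]

print(matches_mother(child_from_mother))    # ['A', 'B', 'C', 'D']
print(not_inherited(child_from_mother))     # {'A': 'pink', 'B': 'pink', 'C': 'blue', 'D': 'pink'}
print(crossover_points(child_from_mother))  # [('B', 'C'), ('C', 'D')]
```

The child matches the mother on every part, yet at each part exactly one of the two grandparent colors is missing, which is the pattern a chromosome browser cannot show when you compare only parent to child.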

The parent to child chromosome browser view also shows us that the very tip ends of the chromosomes are not included in the matching reports – because we know that the child MUST match the parent on one of their two chromosomes, end to end. The download or chart view provides us with the exact locations.

This brings us to the question of whether crossovers occur at the same rate in males and females. We already know that the X chromosome has a distinctive inheritance pattern – meaning that males only inherit an X from their mothers. A father and son will NEVER match on the X chromosome. You can read more about X chromosome inheritance patterns in the article, X Marks the Spot.

Crossovers Differ Between Males and Females

In the paper Genetic Analysis of Variation in Human Meiotic Recombination by Chowdhury et al., we learn that males and females experience a different average number of crossovers.

The authors say the following:

The number of recombination events per meiosis varies extensively among individuals. This recombination phenotype differs between female and male, and also among individuals of each gender.

Notably, we found different sequence variants associated with female and male recombination phenotypes, suggesting that they are regulated by different genes.

Meiotic recombination is essential for the formation of human gametes and is a key process that generates genetic diversity. Given its importance, we would expect the number and location of exchanges to be tightly regulated. However, studies show significant gender and inter-individual variation in genome-wide recombination rates. The genetic basis for this variation is poorly understood.

The Chowdhury paper provides the following graphs. These graphs show the average number of recombinations, or crossovers, per meiosis for each of two different studies, the AGRE and the FHS study, discussed in the paper.

The bottom line of this paper, for genetic genealogists, is that males average about 27 crossovers per child and females average about 42, with the AGRE study families reporting 41.1 and the FHS study families reporting 42.8.

I have been collaborating with statistician, Philip Gammon, and he points out the following:

Male, 22 chromosomes plus the average of 27 crossovers = an average of 49 segments of his parents’ DNA that he will pass on to his children. Roughly half will be from each of his parents. Not exactly half. If there is an odd number of crossovers on a chromosome it will contain an even number of segments and half will be from each parent. But if there is an even number of crossovers (0, 2, 4, 6 etc.) there will be an odd number of segments on the chromosome, one more from one parent than the other.
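The odd versus even rule is easy to check with a short Python sketch. It assumes only what the paragraph above states: a chromosome with n crossovers is made of n + 1 segments that alternate between the two parents. The function name is hypothetical.

```python
def segments_per_parent(crossovers):
    """Split the segments on one chromosome between the two contributing parents.

    Segments alternate between the parents, so the parent whose DNA starts
    the chromosome gets the extra segment when the total is odd.
    """
    segments = crossovers + 1
    first_parent = (segments + 1) // 2
    second_parent = segments // 2
    return first_parent, second_parent

for n in range(5):
    print(n, "crossovers ->", segments_per_parent(n))
# 0 crossovers -> (1, 0)
# 1 crossovers -> (1, 1)
# 2 crossovers -> (2, 1)
# 3 crossovers -> (2, 2)
# 4 crossovers -> (3, 2)
```

With zero crossovers the whole chromosome comes from a single parent, which is the "passed intact" case.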

The average size of segments will be approximately:

  • Males, 22 + 27 = 49 segments at an average size of 3400 / 49 = 69 cM
  • Females, 22 + 42 = 64 segments at an average size of 3400 / 64 = 53 cM
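The arithmetic behind those two averages can be reproduced with a short Python sketch. It uses only the figures given above (22 autosomes, roughly 3400 cM in total, and the average crossover counts of 27 and 42); the function name is hypothetical.

```python
AUTOSOMES = 22
GENOME_CM = 3400  # approximate total length used in the text

def average_segment(crossovers):
    """Return (number of segments, average segment size in cM) for one meiosis."""
    segments = AUTOSOMES + crossovers  # each crossover adds one more segment
    return segments, GENOME_CM / segments

for label, crossovers in [("Males", 27), ("Females", 42)]:
    segments, size = average_segment(crossovers)
    print(f"{label}: {segments} segments, about {size:.0f} cM each")
# Males: 49 segments, about 69 cM each
# Females: 64 segments, about 53 cM each
```

Keep in mind these are averages across many meioses, so any individual child can fall well above or below them.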

This means that cumulatively, over time, you’re going to see bigger chunks of DNA preserved (and lost) in a line of entirely males than in a line of entirely females, because the DNA divides fewer times. Bigger chunks of DNA mean better matching more generations back in time. When a match does occur through an all-male line, it is likely to be on a larger segment.

The article, First Cousin Match Simulations, speaks to this as well.

Practically Speaking

What does this mean, practically speaking, to genetic genealogists?

Few lines actually descend from all males or all females. Most of our connections to distant ancestors are through mixtures of male and female ancestors, so this variation in crossover rates really doesn’t affect us much – at least not on the average.

It’s difficult to discern why we match some cousins and we don’t match others. In some cases, rather than random recombination being a factor, the actual crossover rate may be at play. However, since we only know who we do match, and not who tested and we don’t match, it’s difficult to even speculate as to how recombination affected or affects our matches. And truthfully, for the application of genetic genealogy, we really don’t care – we (generally) only care who we do match – unless we don’t match anyone (or a second cousin or closer) in a particular line, especially a relatively close line – and that’s a horse of an entirely different color.

To me, the burning question to be answered, which still has not been unraveled, is why a difference in recombination rates exists between males and females. What processes are in play here that we don’t understand? What else might this not-yet-understood phenomenon affect?

Until we figure those things out, I note whether or not my match occurred through primarily men or women, and simply add that information into the other data that I use to determine match quality and possible distance.  In other words, information that informs me as to how close and reasonable a match is likely to be includes the following information:

  • Total amount of shared DNA
  • Largest segment size
  • Number of matching segments
  • Number of SNPs in matching segment
  • Shared matches
  • X chromosome
  • mtDNA or Y DNA match
  • Trees – presence, absence, accuracy, depth and completeness
  • Primarily male or female individuals in path to common ancestor
  • Who else they match, particularly known close relatives
  • Does triangulation occur

It would be very interesting to see how the instances of matches to a certain specific cousin level – say 3rd cousins (for example), fare differently in terms of the average amount of shared DNA, the largest segment size and the number of segments in people descended from entirely female and entirely male lines. Blaine Bettinger, are you listening? This would be a wonderful study for the Shared cM Project which measures actual data.

Isn’t the science of genetics absolutely fascinating???!!!
