There has been recent discussion and confusion about the difference between pedigree collapse and endogamy.
Let’s take a look at the similarities and differences and what it means to genealogists.
Pedigree collapse occurs when the same person/people appear in your tree multiple times as ancestors.
In this example, you can see that John Smith and Mary Johnson appear twice, which of course means the ancestors further back in time in those lines all appear at least twice too.
Genetically speaking, our tester, Tester Smith, could be expected to inherit more of the DNA of John Smith and Mary Johnson because they are receiving an infusion of their DNA from both sides of their tree.
Each parent provides 50% of their respective DNA to each child, but contributes different pieces of their DNA to different children.
Each grandparent normally contributes approximately 25% to each grandchild, although it may be slightly more or less. Each great-grandparent contributes about 12.5% to each great-grandchild.
However, since John Smith appears twice in Tester Smith’s tree as a great-grandparent, John Smith would be expected to contribute approximately 12.5% of his DNA times 2 to Tester Smith. This means that approximately 25% of Tester Smith’s DNA descends from John Smith. The same is true for John Smith’s wife, Mary Johnson.
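The arithmetic above can be written as a small function. This is only a sketch of the averages described here: the function name and parameters are made up for illustration, and real inheritance varies around these figures.

```python
def expected_share(generations_back: int, occurrences: int = 1) -> float:
    """Average fraction of a tester's DNA from one ancestor.

    generations_back: 1 = parent, 2 = grandparent, 3 = great-grandparent.
    occurrences: how many times that ancestor appears at that depth.
    Assumes an exact 50% split in every generation (an average, not a rule).
    """
    return occurrences * 0.5 ** generations_back


print(expected_share(2))      # grandparent: 0.25
print(expected_share(3))      # great-grandparent, once: 0.125
print(expected_share(3, 2))   # John Smith, twice a great-grandparent: 0.25
```

The last line shows why John Smith, a double great-grandparent, looks like a grandparent on paper.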
Let’s look at how this affects our chromosomes and matching.
Chromosome Perspective with No Pedigree Collapse
First, let’s look at a situation where there’s no pedigree collapse. Chromosome 22 has about 72 cM of DNA that is being compared for genealogy, so let’s use that chromosome for our example, with chromosome 22 being representative of all other chromosomes (except the X chromosome.)
If each grandparent contributes one fourth of each person’s DNA, then our tester’s mother would have received approximately 25% of her DNA, or 18 cM of chromosome 22, from each of her 4 grandparents.
In this example with no pedigree collapse, you can see that our tester received 9 cM or 12.5% of each of their great-grandparents’ DNA. The great-grandparents’ DNA combined in the grandparents and then Tester’s parents such that Tester received 18 cM or 25% of their DNA from each grandparent and 9 cM or 12.5% from each great-grandparent.
Note that Tester Smith received one 9 cM piece of each color of his 8 great-grandparents’ colored DNA. It’s easy to visualize inheritance this way.
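For anyone who wants to check the centimorgan figures, here is a minimal sketch using the approximate 72 cM length of chromosome 22 quoted above. The names are illustrative, and the results are averages only.

```python
CHR22_CM = 72.0  # approximate length of chromosome 22 used in this article


def expected_cm(generations_back: int, chromosome_cm: float = CHR22_CM) -> float:
    """Average cM on one chromosome traced to a single ancestor."""
    return chromosome_cm * 0.5 ** generations_back


for label, generations in [("grandparent", 2), ("great-grandparent", 3)]:
    ancestors = 2 ** generations  # 4 grandparents, 8 great-grandparents
    print(f"{label}: {ancestors} ancestors x {expected_cm(generations)} cM each")
# grandparent: 4 ancestors x 18.0 cM each
# great-grandparent: 8 ancestors x 9.0 cM each
```

In both rows, the number of ancestors times the cM from each adds back up to the full 72 cM.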
Chromosome Perspective with Pedigree Collapse
Our second example shows pedigree collapse with John Smith and Mary Johnson being present as great-grandparents twice.
Note that Tester Smith inherited a segment of John Smith’s red DNA from their mother and one from their father. Tester also inherited one segment of Mary Johnson’s yellow DNA from each parent.
In this situation, the red DNA segment inherited by Tester Smith’s father from John Smith and the red DNA segment inherited by Tester’s mother from John Smith could potentially be:
- The same DNA segment contributed by John Smith to both of his children, George Smith and Fred Smith, meaning those segments will match entirely.
- Partially the same DNA segments meaning that some of John Smith’s DNA that Tester Smith inherited from his parents will match each other and some won’t.
- Entirely different DNA segments meaning that although the DNA was inherited from John Smith in both cases, his children, George Smith and Fred Smith, inherited different pieces of John Smith’s DNA. That DNA was passed through George Smith’s and Fred Smith’s children to Tester Smith. Even though both segments inherited by Tester descended from John Smith originally, they don’t match because they were different segments to begin with.
Tester Smith will inherit approximately half of the DNA from John Smith that his parents received, so their red segments of DNA could be exactly the same, partially the same, or completely different.
Since I am showing the red segments in different positions on the chromosomes, we’ll presume that the positions shown indicate chromosome location (addresses.) Since the red DNA is not in the same location on the mother’s and father’s chromosomes, the two pieces of John Smith’s DNA that Tester inherited from his parents are different segments.
Tester will have inherited 18 cM total from John Smith and 18 cM total from Mary Johnson (using averages). In this illustration, the red and yellow segments, respectively, are two separate 9 cM segments. If by chance those two red (or yellow) segments had been inherited in adjacent locations, they would match as one 18 cM segment – even though they were really two separate segments inherited through two different parents. This phenomenon, where segments from common ancestors join each other again in descendants, causes relationship predictions to be closer than the actual relationships.
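Here is a simplified sketch of how two separate 9 cM segments can read as one 18 cM segment. It assumes segments are (start, end) positions in cM along a chromosome and, as a simplification, ignores which parent each piece came through. The positions are made up for illustration, and this is not any vendor's actual matching algorithm.

```python
def merge_adjacent(segments):
    """Join segments that touch or overlap; leave separated ones alone."""
    merged = []
    for start, end in sorted(segments):
        if merged and start <= merged[-1][1]:
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    return merged


def lengths(segments):
    return [end - start for start, end in segments]


separate = [(0.0, 9.0), (40.0, 49.0)]   # two red pieces in different places
adjacent = [(0.0, 9.0), (9.0, 18.0)]    # two red pieces side by side

print(lengths(merge_adjacent(separate)))  # [9.0, 9.0]
print(lengths(merge_adjacent(adjacent)))  # [18.0]
```

The total is 18 cM either way, but the single 18 cM segment is the one that looks like a closer relationship.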
Said another way, even though Tester Smith inherited 25% of John Smith’s DNA, John Smith is still a great-grandparent, albeit twice, not a grandparent even though vendors would predict someone with 25% shared DNA as a grandparent.
Of course, each generation further back in the tree means that the amount of DNA inherited from each ancestor is cut in half, so the effects of pedigree collapse become less pronounced the further back in time the collapse occurs.
Looking at our example, if John Smith and Mary Johnson were duplicated in Tester’s tree another generation further removed, Tester would inherit 6.25% times two from John Smith, or a total of 12.5% of his DNA, and the same from Mary Johnson. Another generation back in time, 6.25% total. Eventually, many of those segments will disappear entirely due to loss during recombination, so distant pedigree collapse is not necessarily discernable in this way.
To summarize, pedigree collapse occurs in a genealogical timeframe, meaning that you could at least potentially identify the ancestors who are duplicated in your tree. If you know where in your tree the duplication occurs, you can calculate the expected amount of DNA that you will inherit (assuming an exact 50% inheritance/recombination rate in each generation) from each of those ancestors.
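If you know where the duplication sits in your tree, the calculation can be sketched by walking every path back to the ancestor and halving at each step. The tree below follows the Tester Smith example. The names "Father", "Mother", "George's wife" and "Fred's wife" are placeholders of mine, as is the choice of which brother is on which side. The exact 50% per generation is the same averaging assumption as above.

```python
# child -> (parent, parent); anyone not listed is treated as a tree's end.
PEDIGREE = {
    "Tester Smith": ("Father", "Mother"),
    "Father": ("George Smith", "George's wife"),
    "Mother": ("Fred Smith", "Fred's wife"),
    "George Smith": ("John Smith", "Mary Johnson"),
    "Fred Smith": ("John Smith", "Mary Johnson"),
}


def pedigree_share(person, ancestor, pedigree):
    """Expected fraction of person's DNA from ancestor, summed over all paths."""
    if person == ancestor:
        return 1.0
    parents = pedigree.get(person)
    if parents is None:
        return 0.0
    return sum(0.5 * pedigree_share(p, ancestor, pedigree) for p in parents)


print(pedigree_share("Tester Smith", "John Smith", PEDIGREE))     # 0.25
print(pedigree_share("Tester Smith", "George's wife", PEDIGREE))  # 0.25
```

John Smith, a great-grandparent twice over, comes out at the same expected share as an ordinary grandparent.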
Endogamy is different. Instead of one person or a pair of ancestors who are duplicated, testers will have no immediate ancestors who are the same in their tree, but they will have many historical ancestors who are identical.
Endogamy most often occurs in closed communities where out-marriage is either highly discouraged or impossible. Common examples include Jewish populations, especially in Europe with the Ashkenazi, Native Americans, Finnish people, Acadians, Amish, Mennonite and Brethren communities. Of course, there are many more.
These communities often married only within their own community for many generations. Each community member shares the DNA of many common ancestors from long ago.
In this example, the DNA from distant common ancestors is handed down to the parents from the grandparents, but the ancestral segments are shown in small pieces. I used 4.5 cM as the segment size, but endogamous samples have many small, fragmented segments below that threshold.
“Small” segments for purposes of this discussion are those below the 6 cM minimum vendor matching/viewing threshold of FamilyTreeDNA, MyHeritage and 23andMe. Ancestry’s minimum match threshold is 8 cM. The take-away here is that none of those individual 4.5 cM segments would match between testers at any of those vendors because they are below all vendors’ thresholds.
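As a quick check, the thresholds quoted above can be put in a lookup. This sketch only compares a single segment size against each minimum. Actual vendor matching involves more than one number, so treat it as an illustration.

```python
# Minimum segment sizes in cM as quoted in this article.
VENDOR_MIN_CM = {
    "FamilyTreeDNA": 6.0,
    "MyHeritage": 6.0,
    "23andMe": 6.0,
    "Ancestry": 8.0,
}


def vendors_reporting(segment_cm):
    """Vendors whose minimum threshold a segment of this size would meet."""
    return [name for name, minimum in VENDOR_MIN_CM.items() if segment_cm >= minimum]


print(vendors_reporting(4.5))  # []
print(vendors_reporting(7.0))  # ['FamilyTreeDNA', 'MyHeritage', '23andMe']
print(vendors_reporting(9.0))  # ['FamilyTreeDNA', 'MyHeritage', '23andMe', 'Ancestry']
```

A lone 4.5 cM segment clears no threshold, while a 9 cM segment clears all four.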
The red arrows point to small segments where the mother and father both inherited small pieces of the same identical DNA from the same distant ancestors. Our tester will receive the pink and red DNA segment from both parents, because there is nothing else in that location for them to inherit.
The green arrows show examples of identical by chance matches where the yellow and red DNA, respectively, is not handed down from one parent. Instead, the two yellow and two red segments abut and are joined to form one 9 cM segment where two individual 4.5 cM segments converge – one inherited from the mother and one from the father.
This, of course, is the definition of identical by chance (IBC) where the DNA from two parents just happens to align in such a way that the tester matches another person. However, in ADDITION to being IBC, those two smaller segments just happened to be from common earlier ancestors in an endogamous population. Because endogamous populations have a limited amount of available DNA, it’s much more common to have small segments that match in descendants – and sometimes recombine to match in larger segments too.
In this case, the DNA of unknown distant ancestors just happened to be handed down and aligned adjacent to each other.
Our tester will match to anyone else who just happens to have inherited those two small ancestral DNA segments in the same location from that same population. When the original number of ancestors is limited, so are the number of DNA segments available for inheritance, and it’s very common for random ancestral segments to align in this way. Think of each ancestor’s tiny DNA segments salting a bowl of soup. You’re going to get some in every spoonful.
If those two adjacent 4.5 cM yellow segments are passed down together to the next generation, they add up to 9 cM, so will be considered a match to another person who inherited those same two adjacent 4.5 cM segments from that yellow ancestor – even though that unknown yellow ancestor could have lived ages ago – long before the possibility of genealogical matches. When no new DNA is introduced into populations, the only DNA available to be passed to the next generation is the ancestral DNA that has been salting the same pot of soup for generations.
This is exactly why we see the following situations in highly endogamous populations:
- Many matches at lower cM levels due to identical by chance recombination
- Many small segments in common below vendor match thresholds
- Significantly more smaller segment matches than non-endogamous individuals due to the historical ancestral DNA being passed and recombined from descendant to descendant.
A fully endogamous individual from the Ashkenazi population often has 4 or 5 times as many matches, or more, than non-endogamous individuals.
Conversely, some fully endogamous individuals from populations that have not tested many people will have very few matches, and may not be able to identify their genealogical relationship with any of them.
In the article Concepts – Endogamy and DNA Segments, I provided several real-life examples of how endogamy affects DNA matches.
FamilyTreeDNA’s most recent matching update, among other things, has:
- Removed the segments below 6 cM from the DNA match totals
- Developed a new technique to determine and remove many identical by chance (IBC) matches
- Fully imputed all transfer kits from other vendors (yay!,) meaning that early transfers who did not previously have distant matches now do
- Recalculated everyone’s matches based on all of the above
- Developed an improved relationship prediction algorithm
- Re-predicted everyone’s relationships
While these changes benefit everyone, they provide huge benefits to people with high numbers of matches due to endogamy.
In this chart from the earlier article, you can see individuals predicted to the same relationship level, with segments as small as 1 cM showing, although matching never occurred at this level:
- Non-endogamous matches at left
- Jewish matches in the center
- Native American matches at right.
The chromosomes of the Jewish and Native people look polka dotted by comparison to the non-endogamous people. All of their matching segments are shorter than the non-endogamous group at the same predicted level, because all of the small segments were included in the relationship prediction calculations.
The removal of segments below 6 cM at FamilyTreeDNA improves accuracy and relationship predictions for everyone. A white paper will be available soon describing their new techniques.
While endogamous matches are frustrating for genealogists due to both the high number of matches and the difficulty identifying common pedigree ancestors, endogamous matches are very useful in another way.
Looking at our endogamous example again, let’s say our tester is entirely Jewish, with no admixture.
Our tester has a child with a partner who is entirely Asian, with no admixture. The DNA of these two populations does not fit the same genetic pattern.
In this example, the Asian person’s DNA is chartreuse green (for simplicity.) The Jewish DNA in the child has been divided in half, losing all of the army green, bright blue, and light blue segments, along with part of the tan, grey and yellow segments. Notice that the child still has two yellow segments and two red ones.
Population geneticists look for distinct patterns among populations of people who have lived exclusively together, in close proximity, or mixed often for tell-tale genetic signals where high frequencies of certain DNA patterns, or colors here, appear. Think of an island like Australia or New Zealand where there were no new populations available.
Those telltale small DNA segments, below matching thresholds, signal membership in or a genetic affiliation with that population. Of course, not all populations are quite as distinctive as the Jewish or Aussie/NZ populations. Some populations have not been isolated as long or more admixture has taken place. Think about Europe and those fluid borders.
Still, the signal of the founding populations is present for several generations, and sometimes longer if the tester’s ancestors were from the same population or region of the world and those identifying segments have been preserved during genetic recombination.
This individual’s ethnicity or populations would likely be predicted at or near 50% Jewish and 50% Asian. Those populations have been separated for tens of thousands of years and are relatively easy, genetically, to tell apart.
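The 50/50 expectation is the same halving arithmetic applied to populations. A minimal sketch follows, assuming each parent passes on half of their own mix on average. The function name is mine, and vendors' actual ethnicity estimates are produced very differently and will vary.

```python
def child_ancestry(parent_a, parent_b):
    """Expected population percentages for a child: the average of both parents."""
    populations = set(parent_a) | set(parent_b)
    return {
        pop: 0.5 * parent_a.get(pop, 0.0) + 0.5 * parent_b.get(pop, 0.0)
        for pop in sorted(populations)
    }


tester = {"Jewish": 100.0}
partner = {"Asian": 100.0}
print(child_ancestry(tester, partner))  # {'Asian': 50.0, 'Jewish': 50.0}
```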
However, distinguishing between France and Germany is another matter altogether.
Real Life Examples
In the “picture is worth a thousand words” category, let’s take a look at some visuals.
Genetic Affairs has developed autocluster technology which I’ve written about several times. In the introductory article AutoClustering by Genetic Affairs, I provided examples of “normal” non-endogamous autoclusters.
A non-endogamous individual where other people from their family lines have tested would show several separate clusters. Individuals included in the same colored cluster match each other. Those clusters represent different ancestors or ancestral couples. For the most part, the people in individual clusters don’t match other clusters, although some will as the smaller clusters tend to represent generations further back in time. The people who match two clusters are shown by grey cells.
On the other hand, people who descend from an entirely endogamous population pretty much have one large interrelated square, not neatly arranged descending colored blocks.
My mother’s great-grandfather is Acadian, a highly endogamous population. She has no known pedigree collapse outside of the Acadian population. However, the Acadian population has substantial pedigree collapse meaning that most of her matches would have substantial pedigree collapse. All Acadians share the same founding ancestors from the early 1600s.
As researchers, we are fortunate to have meticulous Catholic church and tax records maintained by the Acadians. Other genealogists aren’t nearly as fortunate and therefore can’t necessarily differentiate between endogamy and pedigree collapse or a combination of both.
Mother would have inherited about 12.5% of her DNA from Antoine (Anthony Lord) Lore.
Mom’s orange Acadian cluster at upper left is oversized, much larger than her next cluster, and you can see that many orange-cluster people are related to each other. Mom has more Acadian matches than would normally account for 12.5% of her matches at the threshold used to generate the autocluster. These proportionally oversized autoclusters are the hallmark of endogamy.
One generation further downstream, my Acadian cluster, which accounts for 6.25% of my DNA is still my largest cluster, shown below, NOT clusters from my four grandparents as might be expected.
However, my Acadian cluster isn’t nearly as large as my mother’s, illustrating just how much was lost through recombination in one generation.
My friend and professional Dutch genealogist, Yvette Hoitink was gracious enough to provide an example of endogamy from an individual whose ancestors were from Winterswijk, a small village in the Netherlands.
Yvette tells us that in Winterswijk, people were serfs, some until 1795, and were required to pay a fine if they married a serf belonging to a different landlord.
Now, in addition to being a small village, we understand why so many people were related to each other, and why the other clusters are so tiny. Do note that many of the people in the red cluster also match people in the other colored clusters too, as identified by the grey cells. Truly, everyone does seem to be related to (at least) some of this person’s other ancestors.
Just so you don’t think all Dutch people are endogamous, Yvette also provided this autocluster of an individual from Friesland where people weren’t serfs during that timeframe.
Regional differences and population history, on both a large and a small scale, really do make a HUGE difference.
Pedigree Collapse, Endogamy and Their Cousin, Population Genetics
I hope you have a better idea how pedigree collapse is different than endogamy and why endogamy is useful in population genetics.
- Pedigree collapse means you have the same ancestor(s) present in your tree, but other than those lines, it does NOT mean that everyone in your tree is related to each other.
- When everyone within a group is related somehow to everyone else, that’s endogamy.
- Of course, like many things in life, these “states of being” are not exclusive and entirely separate. You can have pedigree collapse without endogamy, but long-term community pedigree collapse within a group of people, such as the Acadians, defines endogamy.
- When endogamy is present, literally everyone is somehow related to everyone else, one way or another – especially distantly.
- You can have endogamy without any known recent ancestors.
- You can also have both pedigree collapse and endogamy, together, like my Acadian family line. If you do, I’m sorry😊!
With pedigree collapse, you have duplicate ancestors but you know who they are.
With endogamy, you’ll have a huge distant family, but it’s difficult or sometimes impossible to determine which ancestors, even if you DO know who they are, contributed specific DNA segments. Lots of matches with smaller matching DNA segments are prevalent and likely result from distant population-based matches.