Concepts – CentiMorgans, SNPs and Pickin’ Crab

Posted on March 30, 2016 by Roberta Estes

In autosomal DNA testing, you’ll see two terms used together: centiMorgans, abbreviated cM, and SNPs, which stands for single nucleotide polymorphisms.

These are two terms that are used to discuss thresholds and measurements of matching amounts of autosomal DNA segments.

These two terms, relative to autosomal DNA, are two parts of a whole, kind of like the left and right hand.

CentiMorgans are units of recombination used to measure genetic distance. You can read a scientific definition here.

For our conceptual purposes, think of centiMorgans as lines on a football field. They represent distance.

SNPs are locations that are compared to each other to see if mutations have occurred. Think of them as addresses on a street where an expected value occurs. If values at that address are different, then they don’t match. If they are the same, then they do match. For autosomal DNA matching, we look for long runs of SNPs to match between two people to confirm a common ancestor.

Think of SNPs as blades of grass growing between the lines on the football field. In some areas, especially in my yard, there will be many fewer blades of grass between those lines than there would be on either a well-maintained football field, or maybe a manicured golf course. You can think of the lighter green bands as sparse growth and darker green bands as dense growth.

Suppose the distance between two marks on the football field is 5 cM and 550 blades of grass grow there. If the match threshold is 5 cM and 500 SNPs, you’ll match another person when all of your blades of grass between those two lines match theirs.

So, for purposes of autosomal DNA, the combination of distance, measured in centiMorgans, and the number of SNPs within that distance determines whether someone is considered a match to you. In other words, a match is shown only if it is over the threshold, meaning the party setting the threshold deems it relevant. Think of track and field hurdles. To get to the end (match), you have to get over all of the hurdles!

By Ragnar Singsaas – Exxon Mobil ÅF Golden League Bislett Games 2008, CC BY 2.0, https://commons.wikimedia.org/w/index.php?curid=5288962

For example, a threshold of 7 cM and 700 SNPs means that anyone who matches you OVER BOTH of these thresholds will be displayed as a match. So centiMorgans and SNPs work together to assure valid matches.
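
The two-hurdle rule can be sketched in a few lines of Python. This is only an illustration: the function name and the example segment values are made up, and whether a vendor treats a segment sitting exactly on the threshold as a match is an assumption here (the sketch counts it).

```python
def is_match(segment_cm, segment_snps, min_cm=7.0, min_snps=700):
    """Return True only when a segment clears BOTH hurdles.

    Defaults mirror the 7 cM / 700 SNP example in the text.
    Treating "exactly on the threshold" as a pass is an assumption.
    """
    return segment_cm >= min_cm and segment_snps >= min_snps


# Hypothetical segments, for illustration only.
print(is_match(7.5, 820))   # long enough and dense enough
print(is_match(9.0, 450))   # long, but sits in a sparse patch of grass
print(is_match(5.0, 900))   # dense, but too short
```

Only the first segment clears both hurdles; the other two each trip over one of them.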

Thresholds

These two numbers, cMs and SNPs, are used in conjunction with each other. Why? Because the distribution of SNPs within cM boundaries is not uniform. Some areas of the human genome have concentrations of SNPs, and some areas are known as “SNP deserts.” So distance alone is not the only relevant factor. How many blades of grass growing between the lines matters.

Each of the vendors selects a default threshold that they feel will give you the best mix of not too many false positives, meaning matches that are identical by chance, and not too many false negatives, meaning people who do actually match you genealogically but are eliminated because they share only small amounts of matching DNA. Unfortunately, there is no line in the sand, so no matter where the vendor sets that threshold, you’re probably going to miss something in either or both directions. It’s the nature of the beast.

Company	Min cMs	Min SNPs	Comment
Family Tree DNA	7cM for any one segment + 20cM total	500	After the initial match, you can view down to 6 cM and 500 SNPs to people you match
23andMe	7cM	700
Ancestry	8cM after Timber and associated phasing routines	Unknown	Timber population based phasing removes matches they determine to be “too matchy” or population based
GedMatch	User selectable – default is 7	User selectable – default is 700
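
For readers who like to see the table as data, here is a rough Python sketch. The numbers come straight from the table above, but everything else is an assumption for illustration: the class and function names are hypothetical, the total is summed over every listed segment, and Ancestry’s Timber step is not modeled at all.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Threshold:
    min_segment_cm: float
    min_snps: Optional[int]      # None where the table says "Unknown"
    min_total_cm: float = 0.0    # only Family Tree DNA lists a total


# Values copied from the table; GedMatch entries are its defaults.
THRESHOLDS = {
    "Family Tree DNA": Threshold(7.0, 500, min_total_cm=20.0),
    "23andMe": Threshold(7.0, 700),
    "Ancestry": Threshold(8.0, None),
    "GedMatch": Threshold(7.0, 700),
}


def passes(vendor, segments):
    """segments: list of (cM, SNP count) tuples for one pair of testers.

    A pair passes when at least one segment clears the per-segment
    hurdles and the summed cM reaches the vendor's total, if any.
    """
    t = THRESHOLDS[vendor]
    has_qualifying_segment = any(
        cm >= t.min_segment_cm and (t.min_snps is None or snps >= t.min_snps)
        for cm, snps in segments
    )
    total_cm = sum(cm for cm, _ in segments)
    return has_qualifying_segment and total_cm >= t.min_total_cm


# Hypothetical pair: one 7.5 cM segment plus two small ones (14.5 cM total).
pair = [(7.5, 800), (4.0, 600), (3.0, 550)]
for vendor in THRESHOLDS:
    print(vendor, passes(vendor, pair))
```

The same pair of people can show as a match at one company and not at another, which is exactly why the thresholds matter.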

2022 Update: MyHeritage began offering DNA testing and matching after this original article was published. Matches must have at least one 8 cM matching segment, but they show additional segments to 6 cM. There is no specified number of SNPs. Note that their imputation calculations sometimes cause the reported number of cM to be larger than for the same two people at other vendors.

As you might guess, there are many opinions about the optimum threshold combinations to use – just about as many opinions as people!

These are important values, because the combined size of those matches to an individual allows you to roughly estimate the relationship range to the person you match.

As a general rule, the vendors do a relatively good job, with some exceptions that I’ve covered elsewhere and amount to beating a dead horse (Ancestry’s Timber, no chromosome browser). Of course, one of the big draws of GedMatch is that you can set your own cM and SNP matching thresholds.

Having said that, if you come from an endogamous population, you may want to raise your threshold to 10cM or even higher, depending on what you’re trying to accomplish.

Effectively Using cMs and SNPs

Your personal goals have a lot to do with the thresholds you’ll want to select.

If you are new at genetic genealogy, you will first want to pursue your best matches, meaning the highest number of matching centiMorgans/SNPs, because they will be the low-hanging fruit and the easiest matches to connect genealogically. Said another way, you’ll match your closer relatives on bigger chunks of DNA, so concentrate on those first. Successes are encouraging and rewarding!

Your match to a second cousin, for example, will have a significant amount of shared DNA, and second cousins share common great-grandparents – 2 of 8 people in that generation on your tree – so relatively easy to identify – as these things go.

The chart below shows the expected percentage of shared DNA in a given match pair, in this case, first and second cousins with a first-cousin-once-removed thrown in for good measure. Also shown is the expected amount of shared centiMorgans for the given relationship, the average amount of shared DNA from a crowd-sourced project titled The Shared cM Project by Blaine Bettinger, and the range of shared DNA found in that same project.
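
The expected percentages follow from simple halving: each generation passes on half, and full cousins share two common ancestors. The sketch below works that arithmetic out. The genome-wide total of 6800 cM is an assumed round figure (vendors and projects use slightly different totals), and the function names are hypothetical.

```python
TOTAL_CM = 6800.0  # assumed genome-wide total; actual figures vary by vendor


def expected_shared_fraction(degree, removed=0):
    """Expected autosomal sharing for full cousins.

    degree=1 is first cousins, degree=2 second cousins, and so on.
    Each cousin sits (degree + 1) generations below the common couple,
    so the path between them crosses 2 * (degree + 1) + removed
    parent-to-child steps, and there are two common ancestors.
    """
    steps = 2 * (degree + 1) + removed
    return 2 * 0.5 ** steps


def expected_shared_cm(degree, removed=0, total_cm=TOTAL_CM):
    return expected_shared_fraction(degree, removed) * total_cm


for label, degree, removed in [("1C", 1, 0), ("1C1R", 1, 1), ("2C", 2, 0)]:
    pct = expected_shared_fraction(degree, removed) * 100
    print(label, f"{pct}%", expected_shared_cm(degree, removed), "cM")
```

These are averages only. Actual sharing between real cousins varies quite a bit around the expected figure, which is why the crowd-sourced ranges are so useful.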

A pedigree chart of my family members fitting those categories is shown below, plus the actual amount of shared cMs of DNA to the right.

The chart below shows my DNA matches to my first-cousin-once-removed (1C1R), Cheryl.

Since we do match at Family Tree DNA above the match threshold, I can view all of my matching segments to Cheryl down to 1cM and 500 SNPs.

Just as a matter of interest, I’ve color coded the cM segments:

>10 cM = green
7-10 cM = yellow
<7 = red

The colors show which segments, if they were your largest matching segment, would or would not be visible at thresholds of 7 and 10 cM.

If the matching threshold is at the default of 7cM, the green and yellow segments would be displayed.

If the matching threshold were set at 10 cM, only the green segments would be shown.
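
If you keep your segments in a spreadsheet or script, the color bands and the effect of a display threshold can be sketched like this. The segment sizes are hypothetical, and how a segment of exactly 10 cM is handled is an assumption (here it lands in the yellow band, following the list above as written).

```python
def color_for(cm):
    """Color bands from the list above: >10 green, 7-10 yellow, <7 red."""
    if cm > 10:
        return "green"
    if cm >= 7:
        return "yellow"
    return "red"


def visible(segments_cm, threshold_cm):
    """Segments that would still be displayed at a given cM threshold."""
    return [cm for cm in segments_cm if cm >= threshold_cm]


# Hypothetical segment sizes in cM, for illustration only.
segments = [34.2, 12.6, 8.1, 6.14, 1.38]

print([color_for(cm) for cm in segments])
print(visible(segments, 7))    # green and yellow segments remain
print(visible(segments, 10))   # only the green segments remain
```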

At Family Tree DNA, you can select various threshold display options when using the chromosome browser tool, but not for initial matching. In other words, you have to match at their default threshold before you can see your smaller segments or alter your threshold display.

Some people want to see all of their DNA that matches, and some only want to see the large and compelling pieces, those green segments. Neither choice is wrong, simply a matter of personal preference and individual goals.

The “large and compelling” part of that statement brings me back to why you’re participating in genetic genealogy in the first place, those individual goals. The larger segments are going to lead to common ancestors who are generally easier to find and identify, unless you have an unidentified parent or a misattributed parental event.

You would never start with smaller segments in terms of matching, but that does not mean those smaller segments are never useful. In fact, after you’ve managed to analyze all of your low hanging fruit, and you’re ready to research or concentrate on those ugly brick walls, groupings of those smaller segments in descendants may just be your lifesaver.

Surviving Phasing

However, now I’m curious. How many of those smaller segments do stand up to the test of parental phasing, meaning they match both me and my parent? If my match (Cheryl) matches both me and my parent, then Cheryl does not match me by chance on that segment, so the match is genealogical in nature, the matching DNA proven to have descended to me from my mother.

Let’s see.

In order to phase my results with Cheryl against my mother, I copied Mother’s results into the same spreadsheet, above, color coding our rows so you can see them easier. “Cheryl matching Mom” rows are apricot and “Cheryl matching me” rows are yellow.

You can see that in some cases, like the first two rows, the two rows are identical which means I inherited all of Mom’s DNA in that segment and Cheryl inherited the same segment from her father, matching both Mom and me.

In other cases, I inherited part of Mom’s DNA on a particular segment. I could also have inherited none of a particular segment.

In fact, of the 27 segments where I match Mom on any part of the segment, I match her on the entire segment 18 times, or 66.7%, and on part of the segment 9 times, or 33.3%.

I left the color coding in the cM column the same as it was before, in my rows, to indicate small, medium and large segments. The small segments are red, which would be the most likely NOT to phase with my mother, in other words, the most likely to be Identical by Chance, not descent. If Cheryl and I are Identical by Chance on these segments, it means that the reason I’m matching Cheryl is NOT because I inherited that chunk of DNA from mother. If Mom and I both match Cheryl, then Cheryl and I are Identical by Descent, meaning I inherited that piece of DNA from my mother, so the match is not because Cheryl’s DNA is randomly matching that of both of my parents.
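
The phasing test I did by eye in the spreadsheet can be roughed out in Python. This is a simplified sketch under stated assumptions: it only checks whether my segment with Cheryl overlaps a segment where Mom also matches Cheryl on the same chromosome, it does not look at the underlying SNP values, and all names and positions below are hypothetical.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    chrom: str
    start: int
    end: int
    cm: float
    snps: int


def overlaps(a, b):
    """True when two segments share any stretch of the same chromosome."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def phase_against_parent(child_segments, parent_segments):
    """Split the child's matching segments into (phased, not_phased).

    A child segment counts as phased when the same cousin also matches
    the parent somewhere on that stretch of DNA.
    """
    phased, not_phased = [], []
    for seg in child_segments:
        if any(overlaps(seg, p) for p in parent_segments):
            phased.append(seg)
        else:
            not_phased.append(seg)
    return phased, not_phased


# Hypothetical rows: cousin vs. Mom, and cousin vs. me.
cousin_vs_mom = [Segment("1", 1000, 5000, 12.0, 1500),
                 Segment("3", 200, 900, 4.0, 600)]
cousin_vs_me = [Segment("1", 1500, 4000, 8.0, 900),
                Segment("7", 100, 700, 2.5, 520)]

phased, not_phased = phase_against_parent(cousin_vs_me, cousin_vs_mom)
print(len(phased), "phased;", len(not_phased), "did not phase")
```

Keep in mind that a segment failing this test against one parent only points to identical by chance when the cousin is known to be related through that parent, as Cheryl is through my mother.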

In the spreadsheet below, I removed mother’s rows to eliminate clutter, but I color-coded mine. The rows that show red in the CHR and SNP columns BOTH are rows that did NOT phase with my mother, meaning these matches were indeed identical to Cheryl by chance. The rows that are red ONLY in the cM column (and not in the CHR column) are small segments that DID phase with my mother, so those are identical by descent (IBD).

Here’s the interesting part.

All of the large segments, 10cM and over passed phasing. They are legitimate IBD matches.
One of 2 of the medium cM matches passed phasing.
Of the 15 smaller segments, ranging in size from 1.38 cM to 6.14 cM, more than half, 8, passed phasing. Seven did not. The smallest segment to pass phasing was 1.38 cM. I suspect that part of the reason that the smaller cM segments are passing phasing is that the SNP threshold is held steady at 500 SNPs. In another (unpublished) study, dropping the SNP threshold below 500 results in a dramatic increase in matches (roughly fourfold) and a very small percentage of those matches phase with parents.

Small Segments Guidelines

There has been a lot of spirited debate about the usage, or not, of small segments, so I’m going to provide some guidelines. Let me preface this by saying that none of this is worth getting your knickers in a knot, so please don’t. If you don’t want to include or utilize small segments, then just don’t.

What is and is not a small segment can vary depending on who you are talking to and the context of the conversation.
Small segments CAN and do survive parental phasing, as shown above.
Small segments CAN be triangulated to a particular ancestor. Triangulated in this sense means that this segment is found in the descendants of a group of people (3 or more) proven to descend from the same ancestor AND who all match each other on the same segment.
Not all small segments can be triangulated to a common ancestor. But then again, the same can be said for larger segments too. It’s more difficult and unlikely to be successful with smaller segments unless you are starting with a group of people who descend from a common ancestor and are looking for “ancestral DNA.”
Small segments, even after triangulation, can be found matching a different lineage. This is an indicator that while the descendants of the first group share this DNA segment from a specific ancestor, it may also be prevalent in a population in general, which would cause the same segment to show up matching in a second lineage from the same region as well. I have an example where my Acadian line also matches a different German line on a particular segment – which really isn’t surprising given the geography and history of Germany and France.
Small segments without the benefit of other tools such as parental phasing, triangulation and match groups are, at this time, a waste of time genealogically. This may not always be the case.
Never start with small segments.
Never draw conclusions from small segments alone, meaning without corroborating evidence.
Use small segments only in context of a combination of parental phasing, triangulation and match groups.
Just because you match a group of people, out of context, on a segment (small or otherwise) doesn’t mean that you share a common ancestor. The smaller the segment, the more likely it is to be either IBC or IBP. Situations where the DNA is exactly the same from both parents, meaning everyone has all As in that location, for example, are called runs of homozygosity and the smaller the segment, the more likely you are to encounter ROH segments which appear as phased matches. Yes, another cruel joke of nature.
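
The triangulation rule in the guidelines above (three or more people who all match each other on the same segment) can be sketched as a simple check. Everything here is illustrative: the names, positions and data layout are hypothetical, and the sketch only tests the DNA side. Proving that the group descends from the same ancestor is still genealogy work.

```python
from itertools import combinations


def triangulates(people, pair_segments, chrom, start, end):
    """True when every pair in the group matches across the whole region.

    pair_segments maps frozenset({person_a, person_b}) to a list of
    (chromosome, start, end) tuples where that pair matches.
    """
    if len(people) < 3:
        return False  # triangulation needs at least three people
    for a, b in combinations(people, 2):
        segments = pair_segments.get(frozenset((a, b)), [])
        covers_region = any(
            c == chrom and s <= start and e >= end for c, s, e in segments
        )
        if not covers_region:
            return False
    return True


# Hypothetical pairwise matches on chromosome 5.
matches = {
    frozenset(("Ann", "Bob")): [("5", 100, 900)],
    frozenset(("Ann", "Cal")): [("5", 200, 950)],
    frozenset(("Bob", "Cal")): [("5", 150, 800)],
}

print(triangulates(["Ann", "Bob", "Cal"], matches, "5", 200, 800))
```

If any one pair in the group fails to match on that stretch, the group does not triangulate there, no matter how many of the others match.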

As a proof point relative to how deceptive small segment matching out of context can be, I ran my kit against my friend who is unquestionably 100% Jewish. I have no Jewish ancestry. At 7cM/700 SNPs we have no matches, at 3cM/300SNPs we have 7 matching segments.

However, when I match this individual against my phased parent kits, none of these segments match both me and either of my phased parents. Phased parent kits at GEDMatch are kits reflecting the half of each parent’s DNA that I received from that parent. If you have one or both parents who have tested, you can create phased kits with instructions from this article.

Lowering the match threshold even further to 100 SNPs and 1cM, my Jewish friend and I match on a whopping 714 tiny matching segments, over 1100 cM total, but all very small pieces of DNA. Because of the absolute known 100% Jewish heritage of my friend, and my known non-Jewish heritage, these matches must be either IBC, identical by chance or perhaps some small segments of IBP, identical by population from a very long time ago when both of our ancestors lived in the Middle East, meaning thousands of years ago. Bottom line, they are not genealogically relevant to either of us. I repeated this same experiment with someone that is 100% Asian, with the same type of results. You will match everyone at this threshold, including ancient DNA matches tens of thousands of years old.

The message here is that you can work from the “top down” with small segments, meaning in a known relationship situation like with my cousin and other relatives, but you cannot work from the bottom up with small segments as you have no way to differentiate the wheat from the chaff.

In the Crumley study, there are groups of small segments (greater than 3cM/300SNPs) that persist in multiple descendants of James Crumley, born in 1712. In this case, because you can separate the wheat from the chaff with more than 50 participants, others who triangulate with those small segments and match the group of Crumley descendants may well share a common ancestor at some point in time, especially if they can phase with their parents on those segments to prove the match is not IBC.

Remember, your match on any segment to one person can be IBD, meaning you have identified the common ancestor, your match to another person on that same segment IBC, and yet to a third person, IBP where your match survives generational phasing, but you may never find the common ancestor due to the age of the segment or endogamy.
When utilizing small segments, I generally don’t drop the SNP threshold below 500, because the number of matches increases dramatically while the proportion of valid matches drops. I’ll be publishing more on this shortly.
I do fully believe, within this set of cautionary criteria, that small segments can be useful. I also believe that small segments can be very easily misinterpreted. The use of matching segments has a lot to do with combining different pieces of evidence to build confidence in what the “match” is telling you. I wrote about the Autosomal DNA Matching Confidence Spectrum here.
Small segments should only be utilized after one has a good grasp of how genetic genealogy works and by utilizing the tools available to restrict those segments to genealogically descended DNA. In other words, small segments are for the advanced user. However, maintain those small segment groupings and triangulations in your spreadsheet, because when you have the level of experience needed to work with those small segments, they’ll be available for you to work with. You may discover that most of your DNA triangulates by using large segments and you don’t need to utilize those small segments at all.
If you send me a list of matches from GedMatch with the cM set to 1 and the SNPs set to 100 and ask me what I think, I would simply refer you to this article. But if I did reply, I would tell you that unless you have corroborating evidence, I think you’re wasting your time, but it’s your time and you’re welcome to do what you want with it. Life is about learning.
If you tell me you’ve drawn any conclusions from those types of matches (1cM and 100 SNPs), I’m going to be inconvincible without other tools such as genealogical proof, parental phasing and triangulation groups that prove the segments to be valid to a specific ancestor for the people about whom you’re drawing conclusions. I might even suggest you look at the raw data in those segments to see if you’re dealing with runs of homozygosity.

Netting It Out

The net-net of this is that small segments can be useful, but it takes a lot more work because of the inherent questionable nature of small segment matches. This goes along with that old adage of “extraordinary claims require extraordinary evidence.” Just be ready to roll up your shirt sleeves, because small segments are a lot more work!

Now having said all of that, I very much encourage continuing to triangulate your small segments and pay attention to them. You may notice patterns very relevant to your own genealogy, or you may learn that those patterns were somewhat deceptive – like IBD that turned into IBP. Still useful and interesting, but perhaps not as originally intended.

Without continuing and ongoing research, we’ll never learn how to best utilize small segments nor develop the tools and techniques to sort the wheat from the chaff. Just be appropriately paranoid about conclusions based on small segments, especially small segments alone, and the smaller the segment, the more paranoid you should be!

There is a very big difference between working with small segments along with larger matching data and genealogy, which I encourage, and drawing conclusions based on small segment data alone and out of context, which I highly discourage.

Let’s hope that all of your matches come with large segments and matching ancestors in their trees!!!

Pickin’ Crab

You know, working with different cM levels and SNPs, especially as segments get smaller and more challenging, I’m reminded of “picking crab” at a good old North Carolina crab bake. You would never start out with a crab bake for breakfast. You kind of have to work your way up to pickin’ crab – the same as small segments. And you never pick crab alone. It’s a group activity, shared with friends and kin. So is genetic genealogy.

You’ll need lessons, at first, in how to “pick crab” effectively. There’s a particular technique to it. Friends teach friends. You’ll find cousins you didn’t know you had, like Dawn in the brown shirt below, giving lessons to Anne.

A little practice and you’ll get it.

Just because it’s not easy doesn’t mean it’s not productive, especially when everyone works together! And the results are “very good,” if you just have patience and work through the process. If you decide that you “can’t pick crab,” then you’re right, you can’t pick crab, and you’ll just have to go hungry and miss out on all the fun! Don’t let that happen. Hint – sometimes the fun is in the pickin’!

Here’s hoping you can solve all of your brick walls with large cMs and large SNP counts, and if not, here’s hoping you enjoy “picking crab” with a group of friends and cousins who will contribute to the ongoing research.

Pickin’ crab, or working on identifying difficult ancestors, is always better when collaborating with others! Find cousins and fellow collaborators and enjoy!!! Genetic genealogy is not something you can do alone – it’s dependent on sharing.

Sometimes it’s as much about the friends and cousins you meet on the journey and the adventures along the way as it is about the answer at the end.

______________________________________________________________

Disclosure

I receive a small contribution when you click on some of the links to vendors in my articles. This does NOT increase the price you pay but helps me to keep the lights on and this informational blog free for everyone. Please click on the links in the articles or to the vendors below if you are purchasing products or DNA testing.

Thank you so much.


82 thoughts on “Concepts – CentiMorgans, SNPs and Pickin’ Crab”

Mike on March 30, 2016 at 6:40 pm said:

Thanks Roberta — a very timely article for me. I had someone send me an e-mail yesterday telling me we matched on GEDmatch, and when I checked using the default thresholds I didn’t see him in my list of matches. We exchanged a couple more e-mails and I found out that he had dropped the cM threshold down to 3 to find our match. I was literally trying to figure out how to answer him when your post arrived in my inbox — now I can just refer him to this excellent post of small segments. As always you gave a great explanation of a topic that could be difficult to understand.

- robertajestes on March 30, 2016 at 6:45 pm said:
  
  Hi Mike, You’re right, it is difficult to understand. I wish we could come up with hard and fast rules, using words like “never” and “always,” but we can’t.
  
Kalani on March 30, 2016 at 6:47 pm said:

Interesting analogy with the football field. That should be helpful.

With raising the threshold higher for endogamous matches, not sure about other endogamous groups but I’ve been saying this the past few years now, that if I were to do that, it would eliminate my true 2nd – 4th or even 5th cousin matches. Since the ability to adjust the threshold seems to rely on TOTAL shared.

Between my 1/2 1C and my brothers, we have 17 comparisons with 4th cousins. The TOTAL ranges from as low as 0cM to 117cM. We seem to fall in the range of 40cM – 60cM for many of us.

On GEDmatch, my highest (non-relative) match shares 198cM of whom we haven’t seen a connection. The next highest, 185cM and that person definitely is not related to me at least not in the last 8 centuries. Same with the others whom I share 188cM, 168cM, 171cM and many more above 100cM. My mother easily shares more than 100cM for many of these 8+ century IBP matches.

So trying to increase the threshold won’t weed out the correct matches. And this is probably where people say I’m the exception, as always. At least until they see a similar situation happening.

fmowry on March 30, 2016 at 7:19 pm said:

Thank you! I’m thrilled to see this topic addressed. I’m also excited to realize how much I have learned since getting my DNA results a year ago. I don’t think I would have understood most of this back in early 2015.

Barbara Peterson on March 30, 2016 at 7:30 pm said:

I have a sorta-kinda related question. Right now I am in the process of matching larger segments of closest matches by chromosome on my direct paternal line. I noticed that there is a pattern of “hiccup” or discontinuous cM segment inheritance on chromosome 19 between my brother Rick and 3 of our matches. I match for the whole segment with these three while he has a “hiccup” in about the middle of it:

Me & Del 19 46749865 to 57726410 27.14 3211
Me & Oma 19 51492068 to 57726410 18.84 2034
Me & Les 19 51492068 to 57726410 18.84 2034

Here’s Rick’s:

Rick & Del 19 46749865 to 55007510 16.54 2187
19 55418271 to 57726410 9.66 924
Rick & Oma 19 51492068 to 55007510 8.24 1010
19 55418271 to 57726410 9.66 924
Rick & Les 19 51492068 to 55007510 8.24 1010
19 55972568 to 57726410 6.72 748

The discontinuity for all three happens after position 55007510 and all the segments end at 57726410.

This is the only cluster of ICWs on any chromosome that I’ve seen this pattern. Does it have a name and is there any significance to it? Is it more likely to occur on 19? Or is it just an error in the data that’s being reported?

Thanks…

- robertajestes on March 30, 2016 at 7:41 pm said:
  
  It’s possibly a read error at that location in your brother’s kit. Does he match other cousins on this same segment portion? Does he have this same hiccup when matching against you?
  
  - Barbara Peterson on March 31, 2016 at 3:14 pm said:
    
    Roberta…no, he shares an 83.64cM block on 19 and when viewed in the table it is reported as continuous. But I don’t understand the position numbering system. My half-brother Frank and I share almost the same amount and the bar chart data says my section is from 6752953 to 63788972 while Frank’s, which totally matches/overlaps, is reported as 4408650 to 62527724 for 85.19cM.
    
caith on March 30, 2016 at 7:48 pm said:

Sometimes when working with cMs under 10, I ask the age of my dna match if we have a tree match. I.e., I am 72, and if my match who is 30 years old and we share only 5cMs, obviously had her parent been tested, the parent and I would have shared ~ 10 cMs.

So if you analyze it that way, that 5cM has morphed into 10cMs because the age/generation difference represents an additional recombination.

Shelley Hallman on March 30, 2016 at 8:24 pm said:

Thanks Roberta, I understand the cm’s and snps much better now. I may never make it to the crab feast though. I seem to assimilate “mathlike” information rather slowly so understanding the info in the first portion of this article was a major leap for me. The football yardage locations vs the blades of grass did it!

Carolyn Lea, PhD on March 30, 2016 at 9:17 pm said:

Roberta,

I was wondering why you don’t mention DNAGedCom.com? I think I learned about it from you!

Unrelated question – looking at the time on the posts the latest post was at 8:24 pm on Mar 30. In Oklahoma it is currently 4:12 pm, NY 3:12. Why is the time so much later than here? Just curious.

I am sending this to my friend who just received her results. I think it will be very helpful to her in understanding what she is looking at. Thanks for the post.

- robertajestes on March 30, 2016 at 9:29 pm said:
  
  I think the time that WordPress uses is Greenwich Mean Time.
  
  I like DNAGedcom.com, but their tools are different than the GedMatch tools. For doing the kinds of comparisons I did in this article, I needed my cousin’s results at FTDNA and I needed the GedMatch comparison of my Jewish friend. DNAGedcom does not archive people’s raw data files to be compared to each other – they provide tools for individuals to use – in particular the adoption community. I used to use their 23andMe tool but I’ve pretty much stopped using 23andMe altogether. So, nothing negative about DNAGedcom.com, just different tools for different uses.
  
Bonnie on March 30, 2016 at 9:50 pm said:

Excellent explanation for this non-scientific person! Best explanation and analogy of cMs and SNPs I’ve seen, thanks!

Trish Schmig on March 30, 2016 at 9:55 pm said:

Hi Roberta: If I were looking for 7 X Great Grandparents and took some ICW matches from FTDNA that automatically come up lowered to 1 CM to see all the segments. Then went to GEDMATCH and leaving the threshold at 500 SNP and 3 CM’s added them to a spreadsheet of my FTDNA matches and got many overlapping segments on multiple chromosomes including “X”. In your opinion, would we be looking for a common ancestor at or past our 7X Greats ?

- robertajestes on March 30, 2016 at 10:02 pm said:
  
  If they are matching at FTDNA, then you have to be over the 20 total and 7 cM longest segment threshold already. I would use those longest segments as the basis for comparisons. The further out in time and the more distant, the wider the range for the relationship spans. In other words, I don’t know.
  
Eric S Johnson on March 31, 2016 at 3:00 am said:

Roberta, one of the tests you used to determine the IBDness of a small segment is whether it also matches one of your parents.

I agree that’s a good test, but I think it’s not a 100%-accurate test. At a particular chromosomal address I have a number of (~9-cM) shared HIRs which match both parent and child, and even siblings too, which are demonstrably IBC.

I’m not dissing the parent-child-IBD test, just cautioning that it’s not 100%.

Best,

Eric

- robertajestes on March 31, 2016 at 2:17 pm said:
  
  Hi Eric. I’m curious, if those segments matched parent, child and other siblings, how did you demonstrate that those segments are IBC?
  
- Ann Turner on April 4, 2016 at 11:02 pm said:
  
  Eric, I agree that matching a parent makes a good screening test, because the parent can also be identical by chance. Using the GEDmatch Matching Segment Utility down to the 5 cM / 500 SNP threshold, my son’s genotype data has 6123 / 9767 matching segments not found in either parent. He has 1689 that are listed for his father. But if I run my son’s phased kit, 523 of those drop out.
  
thegenealogistdotca on March 31, 2016 at 7:59 pm said:

This is a good general overview of the concept of cM lengths and what they mean matching, and I agree on most of your points, but I have a few bones (if not crabs) to pick.

For starters, the FTDNA threshold appears anecdotally to be around 7.69 cM, not 7.0.

Also, it’s not intuitive, but the probability of a possible match being IBD is notably smaller for 7.0 as opposed to 7.69 cM, as I explained here.

Another caveat, 23andme will be nearly useless at finding non-Ashkenazi relatives beyond about third cousins for anyone with three-quarters, one-half, one-quarter, or even one-eighth Ashkenazi ancestry, because the 7.0 cM cutoff yields way too many IBP segments, which crowd out non-Ashkenazi matches, because of the cap on kits that show up in Relative Finder, as I explain in the link above.

This is quite ironic, considering company founder Anne Wojcicki is herself half-Ashkenazi, and must be getting very few matches from her paternal Polish side in her own RF.

People with significant colonial New England or early Virginia ancestry, or origins from other groups such as French Canadians/Acadians, may suffer a similar effect.

One solution to dealing with vast numbers of IBP matches is a kind of “slide” the user can apply to change the cM threshold, as at Gedmatch. The FTDNA chromosome browser has this too, though only in the set increments of 1.0, 3.0, 5.0, and 10.0 cM. Still, a useful feature.

While 23andme used to allow a sliding scale for matches in increments of 1.0 cM, from 5.0 to 15.0 cM, through Countries of Ancestry – in fact this was quite interesting, since most IBP matches drop off even by 8.0 cM – they abolished this and other useful features in the past months, making 23andme ever-less useful for ancestry analysis.

- Kalani on March 31, 2016 at 8:20 pm said:
  
  It seems like many of us have seen a different minimum longest block segment. Mine is way below 7.0cM.
  
- robertajestes on March 31, 2016 at 10:33 pm said:
  
  Mine is 7.0.
  
  - thegenealogistdotca on April 1, 2016 at 1:34 am said:
    
    Interesting — someone should let ISOGG know that the setting for smallest long matches is adjustable.
    
- Kalani on March 31, 2016 at 11:18 pm said:
  
  What would work best is if they allowed an adjustment for the largest segment, which is also critical in determining a true 2nd, 3rd or even a 4th cousin relationship or anything in between. Obviously a match with a largest segment of 7cM but more than 100cM total is a distant match for me. I have a 7.7cM largest segment, total 105.5cM. Our ancestors were in the same geographic area (somewhat) but just ONE intermingling beyond a genealogical time frame wouldn’t have caused this type of match. It should be multiple if not the same ancestor multiple times. So these types of matches are useless.
  
  - thegenealogistdotca on April 1, 2016 at 1:39 am said:
    
    Yes, this is my main issue with 23andme in particular. In the end it’s a larger part of a conceptual problem for genetic genealogy, which is, When do “genealogical” markers become “population” markers?
    
thegenealogistdotca on March 31, 2016 at 9:57 pm said:

Interesting. Possibly FTDNA have different settings depending on your population(s) of origin?

- Kalani on March 31, 2016 at 10:15 pm said:
  
  I was actually told that was the case for me. Not sure about others.
  
  When I sort my matches by LONGEST BLOCK & look at the smallest size, it says 5.11cM LONGEST BLOCK while the total is 134.14cM. Next largest one above that is 5.68cM LONGEST BLOCK, total shared 141.83cM.
  
  On that last page, there are only 8 matches with the lowest at 5.11cM (longest block) as I said, and at the top, 6.86cM longest block but total 211.13cM.
  
  - thegenealogistdotca on April 1, 2016 at 1:48 am said:
    
    Hm. If I look at the kit of someone ~23.5% Ashkenazi that I manage, their lowest long block is 7.69, the same amount quoted in the ISOGG page. Perhaps they make manual adjustments depending on ancestrally informative markers.
    
    - robertajestes on April 1, 2016 at 2:02 am said:
      
      Jewish kits are handled differently but typically that’s people that are a majority Jewish.
      
  - Ann Turner on April 4, 2016 at 10:53 pm said:
    
    Kalani — I have a hunch that a very large number of small segments overpowers the typical number we see for shortest segment. When I uploaded my experimental 23andMe v4 kit to FTDNA (with lots of no-calls), it generated a lot of small pseudo-segments, and I matched another person who had a huge number of no-calls from a regular FTDNA kit. (We later found out that was the reason he was so “matchy” at GEDmatch.) The v4 kit performed very nicely on long segments, though.
    
  - Kalani on April 4, 2016 at 11:00 pm said:
    
    Ann…..I never thought of dropping the threshold from the 23andme kits compared to our (my mother, mine, brothers, aunt & cousins) matches to see if there are the same tiny segments. But at FTDNA, this is what some of the other distant people (one we have no known connection within a genealogy time frame while the other we know we share common ancestors from 8 centuries ago) look like when compared to just my mother.
    https://hawaiiandna.files.wordpress.com/2015/02/tinyseg-mom.png
    
- Kalani on April 1, 2016 at 2:41 am said:
  
  I was told that they have a “script” that they run for AJ, so I can see why the settings are adjusted.
  
  - Gary Anderson on June 24, 2017 at 4:04 pm said:
    
    I am 2 percent AJ on Ancestry and have 25 full Ashkenazi relatives on Gedmatch. Those are on my nat mom’s side who had UK roots. My nat father was Sephardic. Question for author, can he be certain that he does not possess any DNA from Jacob?
    
    - robertajestes on June 24, 2017 at 4:12 pm said:
      
      That is not possible to answer.
      
Karen Kent on November 12, 2016 at 9:12 am said:

Very interesting read, I love how you made it a little easier to understand. I still have a question though. I have a match where on chromosome 17 we share 70.6 cM, which seems like a lot for one chromosome. We also match on chromosome 5 with 10 cM. That’s the extent of our matching. She has a surname (Honeycutt) that is on my dad’s side, but my mom’s half brother matches her in the same segment I do on chromosome 17. According to gedmatch.com my mom and dad are not related. Could it be that my mom’s brother’s dad is related to my dad? Or is this just a case of IBC? Thanks,
Karen

- robertajestes on November 12, 2016 at 12:14 pm said:
  
  That’s awfully high for IBC.
  
  - Karen Kent on November 12, 2016 at 4:58 pm said:
    
    Thank you for your reply 🙂 would you think that she’s a close enough match to contact? I’m only interested in contacting back to 2nd cousins, or great aunts and uncles.
    
Joan on February 11, 2017 at 9:37 pm said:

Hi Roberta, it’s your Acadian cousin, Joan. This article was so helpful for me as I recently had my DNA tested and now I’m trying to use all the tools on GEDmatch and make sense of things. I have a French Canadian distant cousin with whom I have one match that is 10.1cM. We also have a second match on chromosome 6 that is 3.8cM and 2,553 SNPs. Is that an unusually high SNP number for a short segment, or does that have to do with the particular location on chromosome 6? I wasn’t sure what to make of this. I have read that the DNA in chromosome 6 has coding for our immune system and is usually passed down in larger blocks. Thanks

- robertajestes on February 11, 2017 at 11:15 pm said:
  
  You need to read this article about pile-up areas regarding chromosome 6. https://dna-explained.com/2016/09/08/concepts-managing-autosomal-dna-matches-step-2-updating-match-spreadsheets-bucketed-family-finder-matches-and-pileups/
  
Kats Rita on May 8, 2017 at 6:50 am said:

I first wanted to say this is the most informative and easy to read article I have come across in my recent dive into genetics. Thank you (!)
That being said… help! I’m still confused on what to actually look for. Example: I match (GEDmatch) another human on four chr (chromosomes?) and the ratios are set to their standards. Here is a c/p of that ‘one-on-one’ match (this is my #1 match on GEDmatch)
Could this be an actual biological relation(?!) And if so, how does one decipher how close/distant a relation?
••••••••••••••
Minimum threshold size to be included in total = 500 SNPs
Mismatch-bunching Limit = 250 SNPs
Minimum segment cM to be included in total = 7.0 cM

Chr. Start. End. cM. SNPs
2 2,211,750 12,118,203 25.8 1,541
3 188,421,650 195,401,596 17.3 933
5 91,139 4,694,160 13.9 807
11 69,392,754 95,070,864 23.1 2,556
Largest segment = 25.8 cM
Total of segments > 7 cM = 80.1 cM
4 matching segments
Estimated number of generations to MRCA = 3.7
•••••••••••••
Thank you so much for your time, I appreciate any help that comes my way 🙂

-kats
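
The totals in the one-to-one output above can be checked by hand. What follows is a minimal Python sketch, not GEDmatch's actual code: the `passes_threshold` helper is a hypothetical name, and treating both thresholds as "greater than or equal to" is an assumption made for illustration.

```python
# Segments copied from the one-to-one comparison quoted above:
# (chromosome, start, end, cM, SNPs)
segments = [
    (2, 2_211_750, 12_118_203, 25.8, 1541),
    (3, 188_421_650, 195_401_596, 17.3, 933),
    (5, 91_139, 4_694_160, 13.9, 807),
    (11, 69_392_754, 95_070_864, 23.1, 2556),
]

MIN_CM = 7.0     # "Minimum segment cM to be included in total"
MIN_SNPS = 500   # "Minimum threshold size to be included in total"


def passes_threshold(segment, min_cm=MIN_CM, min_snps=MIN_SNPS):
    """Hypothetical check: a segment counts only if it clears BOTH hurdles."""
    _, _, _, cm, snps = segment
    return cm >= min_cm and snps >= min_snps


# Keep the segments that clear both thresholds, then summarize them.
kept = [s for s in segments if passes_threshold(s)]
total_cm = round(sum(s[3] for s in kept), 1)
largest_cm = max(s[3] for s in kept)

print(f"{len(kept)} matching segments")
print(f"Largest segment = {largest_cm} cM")
print(f"Total of segments = {total_cm} cM")
```

All four segments clear both hurdles, so each one counts toward the 80.1 cM total and the largest is the 25.8 cM segment on chromosome 2.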

- robertajestes on May 9, 2017 at 9:21 pm said:
  
  The best reference in terms of correlating segments to relationships is the chart in this article: https://dna-explained.com/2016/08/04/concepts-relationship-predictions/
  
  I use this all the time.
  
Margaret Kolesar on June 25, 2017 at 4:00 pm said:

Hi Roberta,
Thank you for this article…it has helped me immensely!
As my parents are both deceased, can I use my sibling’s results for phasing to identify IBD vs IBC?
I am trying to identify whether a match is IBD or IBC for a distant relative. I have the common surname and also common townland in Ireland.
At the default threshold in Gedmatch, there is no match, but when I lower the SNPs to 400, we have the following match, from this person to both me and my sister…

Chr Start Location End Location Centimorgans (cM) SNPs
9 10,671,522 14,010,982 4.7 451

This is exactly the same for both of us. Would I consider this IBD or IBC?
Thank you for your time!
Margie

- robertajestes on June 25, 2017 at 5:55 pm said:
  
  Siblings can’t be used in this manner, because both you and your sibling inherited from your parents – and you could both have inherited the exact same segment from your parents that is reading as IBC.
  
  - Margaret Kolesar on June 25, 2017 at 7:07 pm said:
    
    Thank you!
    Because I can’t use my parents, then it would be wise to just stick to the default numbers, correct?
    Would there be any other way of checking to see if a match like the above would be an IBD?
    Thanks again Roberta…..I am such a novice at this….trying to wrap my mind around it.
    
    - robertajestes on June 25, 2017 at 8:08 pm said:
      
      If you have known cousins who match on the same segment from one side or the other, then that’s a good indication that the segment is IBD.
      
Claricia on July 27, 2017 at 12:40 am said:

This is a great article. I think there are a lot of people who write off anything under 7 or even 10 or higher cM, and some get quite shirty that you even contacted them. You do indicate that reasons for looking are very different. Many people are looking for cousins within their own community and in a place like America, they will have more than enough to keep them occupied. I have NZ and Australian family for the last 3 to 8 generations. I am highly unlikely to ever get an 8th Generation American showing up at 10cM or more. We are relatively small population countries with very few years of testing availability so VERY few identifiable 2nd-4th cousins within the databases. My brick walls are around 1800-1860 in rural Scotland, England and Ireland. While we might never find the matching ancestor, finding a cluster of people with 4 to 8 shared segments of 3+cM does give some clues as to what area of Rosshire my Ross Great Grandfather actually DID come from. I have been doing a one place genealogical study on Strathdon Scotland and so far almost every person links to every other one in multiple pathways – I EXPECT to see lots of small links when I do a GEDMATCH on someone who thinks they may come from there. I actually start worrying when I find nothing at all down to 3cM as it is possible that their tree is totally off.
I just wish people would respect my purpose and not step in to cut off conversations on group forums with “Matches under 7 are false and useless” especially when the person shares an uncommon name AND a small geographical location. These forum experts may have just cemented a brick-wall in place for both of us.

ROBERT DAVIS on September 11, 2017 at 6:58 pm said:

In the last 18 months, have the numbers or caveats changed in your table of values from the companies that you give at the beginning of the Thresholds section? Also, it seems the criteria may differ between what’s used to determine a match and what’s used to calculate the total shared cM number between two people.

BETTY J HILL on September 29, 2017 at 8:18 am said:

Chr Start Location End Location Centimorgans (cM) SNPs
22 23,433,024 25,987,507 8.8 717
What does this mean? Can someone please tell me? Thanks.

- BETTY J HILL on September 29, 2017 at 8:20 am said:
  
  Chr Start Location End Location Centimorgans (cM) SNPs
  22 23,433,024 25,987,507 8.8 717
  Largest segment = 8.8 cM
  Total of segments > 7 cM = 8.8 cM
  1 matching segments
  Estimated number of generations to MRCA = 6.5
  
  - Roberta Estes on September 29, 2017 at 12:54 pm said:
    
    That is the location where you match someone. It’s now up to you to see if you can figure out a common ancestor.
    
Margaret on October 7, 2017 at 6:16 am said:

I have documented ancestry to my 4G grandparents, as does another contact, to the same 4G grandparents. We do not appear as a match at any of the sites. Lowering the threshold at gedmatch to 1cM, we come up as a 1.15 cM match. What is happening here?

- Roberta Estes on October 7, 2017 at 6:42 am said:
  
  It’s not uncommon for people that distantly related to not share DNA.
  
Robert Moore on February 20, 2018 at 6:58 pm said:

I’m glad I came across this post.

I’ve been running Gedmatch for a large number of my Moore kin (we’ve got a results pool of over 60 participants as of now, and all are traceable on paper as related) and, with some matches at low cMs, began to get skeptical. I did work in some triangulations, and found consistency with locations on certain segments. So, I figured I’d throw a monkey wrench in to see if I’d pull some false positives. I added my mother-in-law in the line, and to my surprise, she had four broken segments on chromosome 6 (where our Moore autosomal DNA is based) that match my dad (1.6 cM at 985 SNP, 1.1 at 577, 1.7 at 542, and 2.3 at 828). Now, I’ve traced her tree, and maybe I haven’t traced far enough back, but I haven’t found any Moores, or evidence to show that she had ancestors in the locale in which my Moores were located in the 17th and 18th centuries. While I don’t match her on these same fragments, I match others who I know to be relatives.

In short, this strikes me as random, but the 4-pt segment on chromosome 6 seems to make me think I might be wrong. Your thoughts?

- Roberta Estes on February 20, 2018 at 7:52 pm said:
  
  Check the location on chromosome 6 first. It’s notorious for a pileup in the HLA region. There’s a pileup chart in this article: https://dna-explained.com/2016/09/08/concepts-managing-autosomal-dna-matches-step-2-updating-match-spreadsheets-bucketed-family-finder-matches-and-pileups/
  
- Gary Anderson on July 6, 2018 at 1:03 pm said:
  
  Did you do one to one match at Gedmatch for your mom and your Moore line? I have two cousins one on each side of my family who match segment numerically on the same chromosome number, but one is on the first chromosome and the other is on the second chromosome same number.
  
- Gary Anderson on July 29, 2020 at 12:14 am said:
  
  That is above my pay grade. Did you do one to one to make sure they are all on the same side of the chromosome? If they do not match one to one they are on opposite chromosomes but with the same numbers.
  
Hallie on May 19, 2018 at 11:54 pm said:

Hi. Thank you for this excellent introductory article. I am a beginner and I am working in the ultimate “out of context” context, which is that I’m comparing my DNA with that of my adopted daughter and that of my partner (and theirs to each other). Since none of us is known to be genetically related, I was very excited to find an aggregate total of approximately 20-30 cM between each of us on GEDMatch (with Ancestry data) —however it is based on dropping the threshold to 3 cM. I left the SNP threshold at default. So, we think we are all distant cousins, but understand from this article that we need to phase where we can and Triangulate (among ourselves and other relatives) to determine whether the matches are IBD.

- Roberta Estes on May 20, 2018 at 12:39 am said:
  
  I don’t recommend dropping the threshold below 7cM. Below that point, about half of the matches are by chance and there’s currently no way to tell the difference. Work with your largest matches first. At the rate people are testing, you’ll never run out:)
  
  - Hallie on May 20, 2018 at 1:45 pm said:
    
    I guess I didn’t make myself clear—my adopted daughter, my partner (who is also our daughter’s adoptive mom), and I are comparing our DNA results to each other. There are no larger segments that match, but there are several segments around 3-4 Centimorgans adding up to 20-30 cM. So I hear you saying half of these matches could be IBC. Still, that indicates we are distantly related. This makes us happy because we are a family.
    
- Kalani on May 20, 2018 at 2:10 pm said:
  
  Dropping the threshold down to 3cM ANYONE can match. It’s not a real match. I’ve easily done that with my aunt compared to people whose ancestors were not from the same continent as my aunt. Impossible. Definitely IBS (IBC).
  
  - Roberta Estes on May 20, 2018 at 2:15 pm said:
    
    At 3 cM almost everyone will match everyone else.
    
Hermi14 on May 20, 2018 at 4:13 pm said:

OK. That does add to the content of the article since I was interpreting your “guidelines for small segments” to apply down to the 3-4 cM range, meaning they can be significant if they are backed up by phasing and triangulation (which are our next steps). So you are saying that throughout this article, “small segment” only applies to segments larger than 7 cM. Just trying to clarify. Thanks.

- Roberta Estes on May 20, 2018 at 10:32 pm said:
  
  Small segments are generally considered to be anything below 7, which is also where the 50% IBS/IBD falls too.
  
  - Hermi14 on May 20, 2018 at 11:07 pm said:
    
    Yes, that was my understanding. So, this article seems to contradict the idea that at “3 cM almost everyone will match everyone else.” I’m confused by that statement. I know you are busy with the European changes so I’m sorry to string this out, but it’s not clear to me.
    
    - Roberta Estes on May 21, 2018 at 3:48 am said:
      
      You can look at triangulation, and that may help. When we had fewer matches, we had less to work with. Today there is almost no reason to look at these smaller matching segments. I rarely look below 7cM today.
      
caith on May 20, 2018 at 6:47 pm said:

Caveat Emptor: When we drop that threshold, we need to be prepared. I dropped it and found I match my husband with 4.7 cMs. It is okay, we married late and had no children together. LOL We were born in adjoining states.

- Hermi14 on May 20, 2018 at 11:11 pm said:
  
  Right —my partner and our adopted daughter (whose ancestry we know) could easily share ancestors with each other and with me based on our histories. Even if you did have children, the inferred relationship is so distant as to not be a problem, I think!
  
debsong2013 on July 2, 2018 at 2:27 am said:

Roberta,

My brother has a Y-DNA match (Haplogroup T1a2n/T-L446) on FamilyTreeDNA.com (FTDNA) who is also a Family Finder Autosomal DNA Match on FTDNA. I used FTDNA’s In-Common-With Tool to create a list of their shared Family Finder Autosomal DNA matches.

Next, I began comparing their DNA in FTDNA’s Chromosome Browser tool.

One of these shared Family Finder matches with the surname Stamps has a matching DNA segment on Chromosome 6 that is 1.18cMs and 3000 SNPs. This segment has more SNPs than another larger matching segment with that same person on Chromosome 13 that is 9.19cMs and 2500 SNPs.

What accounts for such a tiny DNA segment (1.18 cMs) having such a large number of matching SNPs (3,000)?

— Deborah
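
One way to put numbers on the contrast between those two segments is SNP density, meaning SNPs per centiMorgan. This is a small Python sketch for illustration only: `snps_per_cm` is a hypothetical helper name, and a plain ratio is an assumed measure, not something any testing company is known to report.

```python
def snps_per_cm(cm, snps):
    """Hypothetical helper: SNP density of a matching segment."""
    return snps / cm


# Figures taken from the two segments described in the comment above.
chr6_density = snps_per_cm(1.18, 3000)    # tiny segment on chromosome 6
chr13_density = snps_per_cm(9.19, 2500)   # larger segment on chromosome 13

print(f"Chr 6:  {chr6_density:.0f} SNPs per cM")
print(f"Chr 13: {chr13_density:.0f} SNPs per cM")
print(f"Chr 6 is about {chr6_density / chr13_density:.1f}x denser")
```

In the football field analogy, the chromosome 6 segment is a very short stretch of field with unusually thick grass, which is why its SNP count is high even though its distance in cM is small.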

- Roberta Estes on July 5, 2018 at 11:58 pm said:
  
  That’s probably the HLA region of chromosome 6 which is extremely SNP dense. Good eye though!
  
sunshine1978 on July 27, 2020 at 11:50 am said:

Thank you for the explanation and article! I agree, 1 cM/100 SNP matches would be a huge waste of time in my case. I wonder more about small matches that would show up on Gedmatch without changing any of the parameters. I have a possible genetic cousin discovery. We have several small matches. The largest, in terms of cM: 7.3 cM/234 SNPs, 7.3 cM/598 SNPs, and then 8.8 cM/247 SNPs. When I adjust the settings and take the cM down to 4 and raise the SNPs to 500, I see that those two lower SNP strands disappear, but I also find: 6.3 cM/693 SNPs and 4.1 cM/454 SNPs. Although we come from different countries, we both share a background of a specific ethnic group, and furthermore, my ancestors come from his same country. I wonder what are the chances that we are really related and not just by chance? I am thinking it may be hard to trace for certain due to lack of records and probably a good 200-300 years away from finding a common ancestor. I’m just trying to determine if this is a real match or just a match by chance or match by common population (our ethnic background is only one part of our overall backgrounds… he says he’s only partially Rusyn and I’m probably about 1/4-1/8th Rusyn).

- Roberta Estes on July 27, 2020 at 2:24 pm said:
  
  If either of your parents has tested, or another close relative, see if they match too.
  