Concepts – CentiMorgans, SNPs and Pickin’ Crab

In autosomal DNA testing, you’ll see two terms used together: centiMorgans, abbreviated cM, and SNPs, which stands for single nucleotide polymorphisms.

These are two terms that are used to discuss thresholds and measurements of matching amounts of autosomal DNA segments.

These two terms, relative to autosomal DNA, are two parts of a whole, kind of like the left and right hand.

CentiMorgans are units of recombination used to measure genetic distance. You can read a scientific definition here.

For our conceptual purposes, think of centiMorgans as lines on a football field. They represent distance.

football fabric 2

SNPs are locations that are compared to each other to see if mutations have occurred.  Think of them as addresses on a street where an expected value occurs. If values at that address are different, then they don’t match.  If they are the same, then they do match.  For autosomal DNA matching, we look for long runs of SNPs to match between two people to confirm a common ancestor.

Think of SNPs as blades of grass growing between the lines on the football field.  In some areas, especially in my yard, there will be many fewer blades of grass between those lines than there would be on either a well maintained football field, or maybe a manicured golf course.  You can think of the lighter green bands as sparse growth and darker green bands as dense growth.

Suppose the match threshold is 5cM and 500 SNPs. If the distance between 2 marks on the football field is 5cM and there are 550 blades of grass growing there, you’ll be a match to another person if all of your blades of grass between those 2 lines match theirs.

So, for purposes of autosomal DNA, the combination of distance (centiMorgans) and the number of SNPs within that distance determines whether someone is considered a match to you. In other words, someone is a match if the DNA they share with you is over the threshold, meaning the match is deemed to be relevant by the party setting the threshold. Think of track and field hurdles. To get to the end (match), you have to get over all of the hurdles!

hurdles

By Ragnar Singsaas – Exxon Mobil ÅF Golden League Bislett Games 2008, CC BY 2.0, https://commons.wikimedia.org/w/index.php?curid=5288962

For example, a threshold of 7 cM and 700 SNPs means that anyone who matches you OVER BOTH of these thresholds will be displayed as a match.  So centiMorgans and SNPs work together to assure valid matches.
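
The "both hurdles" rule can be sketched in a few lines of Python. This is only an illustration, not any vendor's actual code: the `Segment` class and `passes_threshold` function are names I made up, the sample segments are hypothetical, and I'm assuming a segment that lands exactly on a threshold counts as clearing it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One matching segment between two people (illustrative structure)."""
    chromosome: str
    cm: float   # genetic distance covered by the segment
    snps: int   # number of matching SNPs inside the segment


def passes_threshold(segment: Segment, min_cm: float = 7.0, min_snps: int = 700) -> bool:
    """Return True only if the segment clears BOTH hurdles."""
    return segment.cm >= min_cm and segment.snps >= min_snps


# Hypothetical segments, not real test results.
long_and_dense = Segment("1", cm=12.3, snps=2500)
long_but_sparse = Segment("2", cm=8.1, snps=450)   # long enough, too few SNPs
dense_but_short = Segment("3", cm=4.2, snps=900)   # plenty of SNPs, too short

for seg in (long_and_dense, long_but_sparse, dense_but_short):
    print(seg.chromosome, passes_threshold(seg))
```

Only the first segment clears both hurdles. The other two each fail one, which is why neither number is enough on its own.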

Thresholds

These two numbers, cMs and SNPs, are used in conjunction with each other. Why? Because the distribution of SNPs within cM boundaries is not uniform. Some areas of the human genome have concentrations of SNPs and some areas are known as “SNP deserts.” So distance is not the only relevant factor. How many blades of grass are growing between the lines matters too.

Each of the vendors selects a default threshold that they feel will give you the best mix of not too many false positives, meaning matches that are identical by chance, and not too many false negatives, meaning people who do actually match you genealogically but are eliminated because they share only small amounts of matching DNA. Unfortunately, there is no line in the sand, so no matter where the vendor sets that threshold, you’re probably going to miss something in either or both directions. It’s the nature of the beast.

| Company | Min cMs | Min SNPs | Comment |
|---|---|---|---|
| Family Tree DNA | 7cM for any one segment + 20cM total | 500 | After the initial match, you can view down to 1cM and 500 SNPs to people you match |
| 23andMe | 7cM | 700 | |
| Ancestry | 5cM after Timber and associated phasing routines | Unknown | Timber population based phasing removes matches they determine to be “too matchy” or population based |
| GedMatch | User selectable – default is 7 | User selectable – default is 700 | |
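
A two-part rule like the Family Tree DNA row (a minimum longest segment plus a minimum total) might be expressed as in the sketch below. The function name is hypothetical, and how segments are counted toward the total is my assumption for illustration; the table only gives the numbers, not the vendor's actual procedure. The SNP minimum is left out to keep the sketch short.

```python
def is_longest_plus_total_match(segment_cms, min_longest=7.0, min_total=20.0):
    """Apply a 'longest segment + total' rule to a list of segment sizes in cM.

    Assumption: every segment in the list counts toward the total.
    """
    if not segment_cms:
        return False
    return max(segment_cms) >= min_longest and sum(segment_cms) >= min_total


# Hypothetical segment lists in cM.
print(is_longest_plus_total_match([9.5, 6.0, 5.2]))   # longest 9.5, total 20.7
print(is_longest_plus_total_match([6.9, 6.8, 6.7]))   # total is fine, longest too short
print(is_longest_plus_total_match([12.0, 3.0]))       # longest is fine, total too small
```

Only the first list satisfies both parts of the rule.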

As you might guess, there are many opinions about the optimum threshold combinations to use – just about as many opinions as people!

These are important values, because the combined size of those matches to an individual allows you to roughly estimate the relationship range to the person you match.

As a general rule, the vendors do a relatively good job, with some exceptions that I’ve covered elsewhere and amount to beating a dead horse (Ancestry’s Timber, no chromosome browser). Of course, one of the big draws of GedMatch is that you can set your own cM and SNP matching thresholds.

Having said that, if you come from an endogamous population, you may want to raise your threshold to 10cM or even higher, depending on what you’re trying to accomplish.

Effectively Using cMs and SNPs

Your personal goals have a lot to do with the thresholds you’ll want to select.

If you are new at genetic genealogy, you will first want to pursue your best matches, meaning the highest number of matching centiMorgans/SNPs, because they will be the low hanging fruit and the easiest matches to connect genealogically. Said another way, you’ll match your closer relatives on bigger chunks of DNA, so concentrate on those first.  Successes are encouraging and rewarding!

Your match to a second cousin, for example, will have a significant amount of shared DNA. Second cousins share common great-grandparents – 2 of 8 people in that generation on your tree – so they are relatively easy to identify, as these things go.

The chart below shows the expected percentage of shared DNA in a given match pair, in this case, first and second cousins with a first cousin once removed thrown in for good measure. Also shown is the expected amount of shared centiMorgans for the given relationship, the average amount of shared DNA from a crowd sourced project titled The Shared cM Project by Blaine Bettinger and the range of shared DNA found in that same project.

A pedigree chart of my family members fitting those categories is shown below, plus the actual amount of shared cMs of DNA to the right.

shared cM table

The chart below shows my DNA matches to my first cousin once removed, Cheryl.

Since we do match at Family Tree DNA above the match threshold, I can view all of my matching segments to Cheryl down to 1cM and 500 SNPs.

Cheryl chart

Just as a matter of interest, I’ve color coded the cM segments:

  • >10 cM = green
  • 7-10 cM = yellow
  • <7 = red

The colors show whether each segment, if it were the largest matching segment, would be visible at thresholds of 7 and 10 cM.

If the matching threshold is at the default of 7cM, the green and yellow segments would be displayed.

If the matching threshold is set at 10cM, only the green segments would be displayed.
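
Here is a small sketch of the color bands and of what a display threshold would show. The function names and the list of segment sizes are hypothetical (this is not Cheryl's actual data), and I'm assuming that exactly 10 cM counts as green and exactly 7 cM as yellow, since the bands don't say which side the boundaries fall on.

```python
def color_for(cm: float) -> str:
    """Color band for a segment size.

    Assumption: exactly 10 cM is treated as green, exactly 7 cM as yellow.
    """
    if cm >= 10:
        return "green"
    if cm >= 7:
        return "yellow"
    return "red"


def visible_segments(segment_cms, threshold_cm):
    """Segments that would be displayed at a given cM threshold."""
    return [cm for cm in segment_cms if cm >= threshold_cm]


# Hypothetical segment sizes in cM.
sizes = [25.4, 11.0, 8.3, 6.14, 1.38]
print([color_for(cm) for cm in sizes])
print(visible_segments(sizes, 7))    # green and yellow segments
print(visible_segments(sizes, 10))   # green segments only
```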

At Family Tree DNA, you can select various threshold display options when using the chromosome browser tool, but not for initial matching. In other words, you have to match at their default threshold before you can see your smaller segments or alter your threshold display.

Some people want to see all of their DNA that matches, and some only want to see the large and compelling pieces, those green segments.  Neither choice is wrong, simply a matter of personal preference and individual goals.

The “large and compelling” part of that statement brings me back to why you’re participating in genetic genealogy in the first place, those individual goals.  The larger segments are going to lead to common ancestors who are generally easier to find and identify, unless you have an unidentified parent or a misattributed parental event.

You would never start with smaller segments in terms of matching, but that does not mean those smaller segments are never useful.  In fact, after you’ve managed to analyze all of your low hanging fruit, and you’re ready to research or concentrate on those ugly brick walls, groupings of those smaller segments in descendants may just be your lifesaver.

Surviving Phasing

However, now I’m curious. How many of those smaller segments do stand up to the test of parental phasing, meaning they match both me and my parent?  If my match (Cheryl) matches both me and my parent, then Cheryl does not match me by chance on that segment so the match is genealogical in nature, the matching DNA proven to have descended to me from my mother.

Let’s see.

Cheryl Mom me chart

In order to phase my results with Cheryl against my mother, I copied Mother’s results into the same spreadsheet, above, color coding our rows so you can see them easier. “Cheryl matching Mom” rows are apricot and “Cheryl matching me” rows are yellow.

You can see that in some cases, like the first two rows, the two rows are identical which means I inherited all of Mom’s DNA in that segment and Cheryl inherited the same segment from her father, matching both Mom and me.

In other cases, I inherited part of Mom’s DNA on a particular segment.  I could also have inherited none of a particular segment.

In fact, of the 27 segments where I match Mom on any part of the segment, I match her on the entire segment 18 times, or 66.7%, and on part of the segment 9 times, or 33.3%.

I left the color coding in the cM column the same as it was before, in my rows, to indicate small, medium and large segments. The small segments are red, which would be the most likely NOT to phase with my mother, in other words, the most likely to be Identical by Chance, not descent. If Cheryl and I are Identical by Chance on these segments, it means that the reason I’m matching Cheryl is NOT because I inherited that chunk of DNA from mother. If Mom and I both match Cheryl, then Cheryl and I are Identical by Descent, meaning I inherited that piece of DNA from my mother, so the match is not because Cheryl’s DNA is randomly matching that of both of my parents.

In the spreadsheet below, I removed mother’s rows to eliminate clutter, but I color coded mine. The rows that show red in the CHR and SNP columns BOTH are rows that did NOT phase with my mother, meaning these matches were indeed identical to Cheryl by chance.  The rows that are red ONLY in the cM column (and not in the CHR column) are small segments that DID phase with my mother, so those are identical by descent (IBD).
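
The phasing comparison I did by hand in the spreadsheet can be sketched in code. This is a simplified illustration under stated assumptions: the `Seg` structure, the function names and all of the positions are hypothetical, and a child's segment is counted as phased if it overlaps any part of a segment where the same cousin matches the parent. A stricter version could require the child's segment to sit entirely inside the parent's.

```python
from typing import NamedTuple


class Seg(NamedTuple):
    chromosome: str
    start: int   # start position
    end: int     # end position


def overlaps(a: Seg, b: Seg) -> bool:
    """True if two segments share any positions on the same chromosome."""
    return a.chromosome == b.chromosome and a.start <= b.end and b.start <= a.end


def phase_against_parent(child_segs, parent_segs):
    """Split the child's segments into those that also appear in the parent's
    matches to the same cousin (phased) and those that don't (not phased)."""
    phased, not_phased = [], []
    for seg in child_segs:
        if any(overlaps(seg, p) for p in parent_segs):
            phased.append(seg)
        else:
            not_phased.append(seg)
    return phased, not_phased


# Hypothetical positions, for illustration only.
cousin_vs_parent = [Seg("1", 1_000_000, 9_000_000), Seg("5", 20_000_000, 31_000_000)]
cousin_vs_child = [
    Seg("1", 1_000_000, 9_000_000),    # child inherited the whole segment
    Seg("5", 24_000_000, 31_000_000),  # child inherited part of the segment
    Seg("8", 3_000_000, 4_500_000),    # no parental match at this location
]

phased, not_phased = phase_against_parent(cousin_vs_child, cousin_vs_parent)
print("phased:", phased)
print("not phased:", not_phased)
```

Keep in mind this only tests one parent. It works for Cheryl because she is known to be related through my mother's side. With an unknown match, a segment that fails against one parent might still phase against the other.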

Cheryl Me phased chart

Here’s the interesting part.

  • All of the large segments, 10cM and over, passed phasing. They are legitimate IBD matches.
  • One of the 2 medium cM matches passed phasing.
  • Of the 15 smaller segments, ranging in size from 1.38 cM to 6.14 cM, more than half, 8, passed phasing. Seven did not. The smallest segment to pass phasing was 1.38 cM. I suspect that part of the reason that the smaller cM segments are passing phasing is that the SNP threshold is held steady at 500 SNPs. In another (unpublished) study, dropping the SNP threshold below 500 resulted in a dramatic increase in matches (roughly fourfold), and only a very small percentage of those matches phased with parents.
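
For anyone who wants the percentages, here is the arithmetic for the two size bands where the counts are given in the list above. The number of large segments isn't stated there, so that band is left out rather than guessed. The dictionary and function names are just for illustration.

```python
# Counts taken from the list above: (passed, total) per size band.
phasing_results = {
    "medium (7-10 cM)": (1, 2),
    "small (<7 cM)": (8, 15),
}


def pass_rate(passed: int, total: int) -> float:
    """Percentage of segments that phased with the parent."""
    return 100.0 * passed / total


for band, (passed, total) in phasing_results.items():
    print(f"{band}: {passed} of {total} phased ({pass_rate(passed, total):.1f}%)")
```

That works out to 50.0% for the medium segments and 53.3% for the small ones, in this one comparison with a known relative.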

Small Segments Guidelines

There has been a lot of spirited debate about the usage, or not, of small segments, so I’m going to provide some guidelines.  Let me preface this by saying that none of this is worth getting your knickers in a knot, so please don’t.  If you don’t want to include or utilize small segments, then just don’t.

  • What is and is not a small segment can vary depending on who you are talking to and the context of the conversation.
  • Small segments CAN and do survive parental phasing, as shown above.
  • Small segments CAN be triangulated to a particular ancestor. Triangulated in this sense means that this segment is found in the descendants of a group of people (3 or more) proven to descend from the same ancestor AND who all match each other on the same segment.
  • Not all small segments can be triangulated to a common ancestor.  But then again, the same can be said for larger segments too.  It’s more difficult and unlikely to be successful with smaller segments unless you are starting with a group of people who descend from a common ancestor and are looking for “ancestral DNA.”
  • Small segments, even after triangulation, can be found matching a different lineage. This is an indicator that while the descendants of the first group share this DNA segment from a specific ancestor, it may also be prevalent in a population in general, which would cause the same segment to show up matching in a second lineage from the same region as well. I have an example where my Acadian line also matches a different German line on a particular segment – which really isn’t surprising given the geography and history of Germany and France.
  • Small segments without the benefit of other tools such as parental phasing, triangulation and match groups are, at this time, a waste of time genealogically. This may not always be the case.
  • Never start with small segments.
  • Never draw conclusions from small segments alone, meaning without corroborating evidence.
  • Use small segments only in context of a combination of parental phasing, triangulation and match groups.
  • Just because you match a group of people, out of context, on a segment (small or otherwise) doesn’t mean that you share a common ancestor. The smaller the segment, the more likely it is to be either IBC (identical by chance) or IBP (identical by population). Situations where the DNA is exactly the same from both parents, meaning everyone has all As in that location, for example, are called runs of homozygosity (ROH), and the smaller the segment, the more likely you are to encounter ROH segments which appear as phased matches. Yes, another cruel joke of nature.
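
The "all match each other on the same segment" part of triangulation, as described in the list above, can be sketched as an interval intersection. This is a simplified illustration: the function names, the people A, B and C, and the positions are all hypothetical, and I'm assuming at most one segment per pair per chromosome. The genealogical half of triangulation, proving that everyone descends from the same ancestor, is not something code like this can check.

```python
from itertools import combinations


def common_region(intervals):
    """Intersection of (start, end) intervals, or None if they don't all overlap."""
    start = max(s for s, _ in intervals)
    end = min(e for _, e in intervals)
    return (start, end) if start < end else None


def triangulates(people, pair_matches, chromosome):
    """Return the region where every pair in the group matches, or None.

    pair_matches maps frozenset({a, b}) -> {chromosome: (start, end)}.
    Simplification: at most one segment per pair per chromosome.
    """
    if len(people) < 3:
        return None              # triangulation needs 3 or more people
    intervals = []
    for a, b in combinations(people, 2):
        seg = pair_matches.get(frozenset((a, b)), {}).get(chromosome)
        if seg is None:
            return None          # one pair doesn't match here at all
        intervals.append(seg)
    return common_region(intervals)


# Hypothetical pairwise matches on chromosome 7.
pair_matches = {
    frozenset(("A", "B")): {"7": (10_000_000, 18_000_000)},
    frozenset(("A", "C")): {"7": (12_000_000, 20_000_000)},
    frozenset(("B", "C")): {"7": (11_000_000, 17_000_000)},
}

print(triangulates(["A", "B", "C"], pair_matches, "7"))
```

In this made-up example the three pairwise matches share the stretch from 12,000,000 to 17,000,000, so that is the candidate triangulated segment.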

As a proof point relative to how deceptive small segment matching out of context can be, I ran my kit against my friend who is unquestionably 100% Jewish. I have no Jewish ancestry. At 7cM/700 SNPs we have no matches, but at 3cM/300 SNPs we have 7 matching segments.

Me to Jewish match

However, matching this individual to my phased parents, none of these segments match both me and either one of my phased parents. Phased parent kits at GedMatch are kits reflecting the half of each parent’s DNA that I received from that parent. If you have one or both parents who have tested, you can create phased kits with instructions from this article.

Lowering the match threshold even further to 100 SNPs and 1cM, my Jewish friend and I match on a whopping 714 tiny matching segments, over 1100 cM total, but all very small pieces of DNA. Because of the absolute known 100% Jewish heritage of my friend, and my known non-Jewish heritage, these matches must be either IBC, identical by chance, or perhaps some small segments of IBP, identical by population, from a very long time ago when both of our ancestors lived in the Middle East, meaning thousands of years ago. Bottom line, they are not genealogically relevant to either of us. I repeated this same experiment with someone who is 100% Asian, with the same type of results. You will match everyone at this threshold, including ancient DNA matches tens of thousands of years old.

The message here is that you can work from the “top down” with small segments, meaning in a known relationship situation like with my cousin and other relatives, but you cannot work from the bottom up with small segments as you have no way to differentiate the wheat from the chaff.

In the Crumley study, there are groups of small segments (greater than 3cM/300SNPs) that persist in multiple descendants of James Crumley born in 1712.  In this case, because you can separate the wheat from the chaff with more than 50 participants, others who triangulate with those small segments and match the group of Crumley descendants may well share a common ancestor at some point in time, especially if they can phase with their parents on those segments to prove the match is not IBC.

  • Remember, your match on any segment to one person can be IBD, meaning you have identified the common ancestor, your match to another person on that same segment can be IBC, and your match to yet a third person can be IBP, where your match survives generational phasing, but you may never find the common ancestor due to the age of the segment or endogamy.
  • When utilizing small segments, I generally don’t drop the SNP threshold below 500, as the number of matches increases dramatically while the proportion of valid matches drops. I’ll be publishing more on this shortly.
  • I do fully believe, within this set of cautionary criteria, that small segments can be useful. I also believe that small segments can be very easily misinterpreted. The use of matching segments has a lot to do with combining different pieces of evidence to build confidence in what the “match” is telling you. I wrote about the Autosomal DNA Matching Confidence Spectrum here.
  • Small segments should only be utilized after one has a good grasp of how genetic genealogy works and by utilizing the tools available to restrict those segments to genealogically descended DNA. In other words, small segments are for the advanced user. However, maintain those small segment groupings and triangulations in your spreadsheet, because when you have the level of experience needed to work with those small segments, they’ll be available for you to work with.  You may discover that most of your DNA triangulates by using large segments and you don’t need to utilize those small segments at all.
  • If you send me a list of matches from GedMatch with the cM set to 1 and the SNPs set to 100 and ask me what I think, I would simply refer you to this article. But if I did reply, I would tell you that unless you have corroborating evidence, I think you’re wasting your time, but it’s your time and you’re welcome to do what you want with it. Life is about learning.
  • If you tell me you’ve drawn any conclusions from those types of matches (1cM and 100 SNPs), I’m going to be inconvincible without other tools such as genealogical proof,  parental phasing and triangulation groups that prove the segments to be valid to a specific ancestor for the people about whom you’re drawing conclusions. I might even suggest you look at the raw data in those segments to see if you’re dealing with runs of homozygosity.
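
If you do look at the raw data for runs of homozygosity, the basic idea is to scan for stretches where both letters of each call are the same. The sketch below is a toy version under stated assumptions: the function name and sample calls are hypothetical, the raw data is assumed to be an ordered list of two-letter calls for one chromosome, and the minimum run length of 3 is only for demonstration. A meaningful run would span far more SNPs.

```python
def homozygous_runs(genotypes, min_run=3):
    """Find runs of consecutive homozygous calls (e.g. 'AA', 'GG').

    genotypes is an ordered list of two-letter calls along one chromosome.
    No-calls such as '--' break a run. Returns (start_index, end_index)
    pairs, with the end index inclusive.
    """
    runs = []
    run_start = None
    for i, call in enumerate(genotypes):
        is_hom = len(call) == 2 and call[0] == call[1] and call[0] in "ACGT"
        if is_hom:
            if run_start is None:
                run_start = i
        else:
            if run_start is not None and i - run_start >= min_run:
                runs.append((run_start, i - 1))
            run_start = None
    # Close a run that reaches the end of the list.
    if run_start is not None and len(genotypes) - run_start >= min_run:
        runs.append((run_start, len(genotypes) - 1))
    return runs


# Hypothetical calls, for illustration only.
calls = ["AA", "AG", "CC", "CC", "TT", "GG", "CT", "AA", "AA"]
print(homozygous_runs(calls))
```

Here the only run long enough to report covers positions 2 through 5 in the list (CC, CC, TT, GG).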

Netting It Out

The net-net of this is that small segments can be useful, but it takes a lot more work because of the inherent questionable nature of small segment matches. This goes along with that old adage of “extraordinary claims require extraordinary evidence.”  Just be ready to roll up your shirt sleeves, because small segments are a lot more work!

Now having said all of that, I very much encourage continuing to triangulate your small segments and pay attention to them. You may notice patterns very relevant to your own genealogy, or you may learn that those patterns were somewhat deceptive – like IBD that turned into IBP.  Still useful and interesting, but perhaps not as originally intended.

Without continuing and ongoing research, we’ll never learn how to best utilize small segments nor develop the tools and techniques to sort the wheat from the chaff. Just be appropriately paranoid about conclusions based on small segments, especially small segments alone, and the smaller the segment, the more paranoid you should be!

There is a very big difference between working with small segments along with larger matching data and genealogy, which I encourage, and drawing conclusions based on small segment data alone and out of context, which I highly discourage.

Let’s hope that all of your matches come with large segments and matching ancestors in their trees!!!

Pickin’ Crab

You know, working with different cM levels and SNPs, especially as segments get smaller and more challenging, I’m reminded of “picking crab” at a good old North Carolina crab bake. You would never start out with a crab bake for breakfast.  You kind of have to work your way up to pickin’ crab – the same as small segments.  And you never pick crab alone. It’s a group activity, shared with friends and kin.  So is genetic genealogy.

You’ll need lessons, at first, in how to “pick crab” effectively. There’s a particular technique to it.  Friends teach friends.  You’ll find cousins you didn’t know you had, like Dawn in the brown shirt below, giving lessons to Anne.

Dawn lessons

A little practice and you’ll get it.

Just because it’s not easy doesn’t mean it’s not productive, especially when everyone works together!  And the results are “very good,” if you just have patience and work through the process.  If you decide that you “can’t pick crab,” then you’re right, you can’t pick crab, and you’ll just have to go hungry and miss out on all the fun!  Don’t let that happen.  Hint – sometimes the fun is in the pickin’!

Here’s hoping you can solve all of your brick walls with large cMs and large SNP counts, and if not, here’s hoping you enjoy “picking crab” with a group of friends and cousins who will contribute to the ongoing research.

Pickin’ crab, or working on identifying difficult ancestors, is always better when collaborating with others! Find cousins and fellow collaborators and enjoy!!! Genetic genealogy is not something you can do alone – it’s dependent on sharing.

crab pickin

Sometimes it’s as much about the friends and cousins you meet on the journey and the adventures along the way as it is about the answer at the end.

60 thoughts on “Concepts – CentiMorgans, SNPs and Pickin’ Crab”

  1. Thanks Roberta — a very timely article for me. I had someone send me an e-mail yesterday telling me we matched on GEDmatch, and when I checked using the default thresholds I didn’t see him in my list of matches. We exchanged a couple more e-mails and I found out that he had dropped the cM threshold down to 3 to find our match. I was literally trying to figure out how to answer him when your post arrived in my inbox — now I can just refer him to this excellent post on small segments. As always you gave a great explanation of a topic that could be difficult to understand.

  2. Interesting analogy with the football field. That should be helpful.

    With raising the threshold higher for endogamous matches, not sure about other endogamous groups but I’ve been saying this the past few years now, that if I were to do that, it would eliminate my true 2nd – 4th or even 5th cousin matches. Since the ability to adjust the threshold seems to rely on TOTAL shared.

    Between my 1/2 1C and my brothers, we have between us all, we have 17 comparisons with 4th cousins. The TOTAL range from as low as 0cM to 117cM. We seem to fall in the range of 40cM – 60cM for many of us.

    On GEDmatch, my highest (non-relative) match shares 198cM of whom we haven’t seen a connection. The next highest, 185cM and that person definitely is not related to me at least not in the last 8 centuries. Same with the others whom I share 188cM, 168cM, 171cM and many more above 100cM. My mother easily shares more than 100cM for many of these 8+ century IBP matches.

    So trying to increase the threshold won’t weed out the correct matches. And this is probably where people say I’m the exception, as always. At least until they see a similar situation happening.

  3. Thank you! I’m thrilled to see this topic addressed. I’m also excited to realize how much I have learned since getting my DNA results a year ago. I don’t think I would have understood most of this back in early 2015.

  4. I have a sorta-kinda related question. Right now I am in the process of matching larger segments of closest matches by chromosome on my direct paternal line. I noticed that there is a pattern of “hiccup” or discontinuous cM segment inheritance on chromosome 19 between my brother Rick and 3 of our matches. I match for the whole segment with these three while he has a “hiccup” in about the middle of it:

    Me & Del 19 46749865 to 57726410 27.14 3211
    Me & Oma 19 51492068 to 57726410 18.84 2034
    Me & Les 19 51492068 to 57726410 18.84 2034

    Here’s Rick’s:

    Rick & Del 19 46749865 to 55007510 16.54 2187
    19 55418271 to 57726410 9.66 924
    Rick & Oma 19 51492068 to 55007510 8.24 1010
    19 55418271 to 57726410 9.66 924
    Rick & Les 19 51492068 to 55007510 8.24 1010
    19 55972568 to 57726410 6.72 748

    The discontinuity for all three happens after position 55007510 and all the segments end at 57726410.

    This is the only cluster of ICWs on any chromosome that I’ve seen this pattern. Does it have a name and is there any significance to it? Is it more likely to occur on 19? Or is it just an error in the data that’s being reported?

    Thanks…

    • It’s possibly a read error at that location in your brother’s kit. Does he match other cousins on this same segment portion? Does he have this same hiccup when matching against you?

      • Robert…no he shares an 83.64cM block on 19 and when viewed in the table it is reported as continuous. But I don’t understand the position numbering system. My half-brother Frank and I share almost the same amount and the bar chart data says my section is from 6752953 to 63788972 while Frank’s, which totally matches/overlaps, is reported as 4408650 to 62527724 for 85.19cM.

  5. Sometimes when working with cMs under 10, I ask the age of my dna match if we have a tree match. I.e., I am 72, and if my match is 30 years old and we share only 5cMs, obviously had her parent been tested, the parent and I would have shared ~ 10 cMs.

    So if you analyze it that way, that 5cM has morphed into 10cMs because the age/generation difference represents an additional recombination.

  6. Thanks Roberta, I understand the cm’s and snps much better now. I may never make it to the crab feast though. I seem to assimilate “mathlike” information rather slowly so understanding the info in the first portion of this article was a major leap for me. The football yardage locations vs the blades of grass did it!

  7. Roberta,

    I was wondering why you don’t mention DNAGedCom.com? I think I learned about it from you!

    Unrelated question – looking at the time on the posts the latest post was at 8:24 pm on Mar 30. In Oklahoma it is currently 4:12 pm, NY 3:12. Why is the time so much later than here? Just curious.

    I am sending this to my friend who just received her results. I think it will be very helpful to her in understanding what she is looking at. Thanks for the post.

    • I think the time that WordPress uses is Greenwich Mean Time.

      I like DNAGedcom.com, but their tools are different than the GedMatch tools. For doing the kinds of comparisons I did in this article, I needed my cousin’s results at FTDNA and I needed the GedMatch comparison of my Jewish friend. DNAGedcom does not archive people’s raw data files to be compared to each other – they provide tools for individuals to use – in particular the adoption community. I used to use their 23andMe tool but I’ve pretty much stopped using 23andMe altogether. So, nothing negative about DNAGedcom.com, just different tools for different uses.

  8. Excellent explanation for this non-scientific person! Best explanation and analogy of cMs and SNPs I’ve seen, thanks!

  9. Hi Roberta: If I were looking for 7 X Great Grandparents and took some ICW matches from FTDNA that automatically come up lowered to 1 CM to see all the segments. Then went to GEDMATCH and leaving the threshold at 500 SNP and 3 CM’s added them to a spreadsheet of my FTDNA matches and got many overlapping segments on multiple chromosomes including “X”. In your opinion, would we be looking for a common ancestor at or past our 7X Greats ?

    • If they are matching at FTDNA, then you have to be over the 20 total and 7 cM longest segment threshold already. I would use those longest segments as the basis for comparisons. The further out in time and the more distant, the wider the range for the relationship spans. In other words, I don’t know.

  10. Roberta, one of the tests you used to determine the IBDness of a small segment is whether it also matches one of your parents.

    I agree that’s a good test, but I think it’s not a 100%-accurate test. At a particular chromosomal address I have a number of (~9-cM) shared HIRs which match both parent and child, and even siblings too, which are demonstrably IBC.

    I’m not dissing the parent-child-IBD test, just cautioning that it’s not 100%.

    Best,

    Eric

    • Eric, I agree that matching a parent makes a good screening test, because the parent can also be identical by chance. Using the GEDmatch Matching Segment Utility down to the 5 cM / 500 SNP threshold, my son’s genotype data has 6123 / 9767 matching segments not found in either parent. He has 1689 that are listed for his father. But if I run my son’s phased kit, 523 of those drop out.

  11. This is a good general overview of the concept of cM lengths and what they mean matching, and I agree on most of your points, but I have a few bones (if not crabs) to pick.

    For starters, the FTDNA threshold appears anecdotally to be around 7.69 cM, not 7.0.

    Also, it’s not intuitive, but the probability of a possible match being IBD is notably smaller for 7.0 as opposed to 7.69 cM, as I explained here.

    Another caveat, 23andme will be nearly useless at finding non-Ashkenazi relatives beyond about third cousins for anyone with three-quarters, one-half, one-quarter, or even one-eighth Ashkenazi ancestry, because the 7.0 cM cutoff yields way too many IBP segments, which crowd out non-Ashkenazi matches, because of the cap on kits that show up in Relative Finder, as I explain in the link above.

    This is quite ironic, considering company founder Anne Wojcicki is herself half-Ashkenazi, and must be getting very few matches from her paternal Polish side in her own RF.

    People with significant colonial New England or early Virginia ancestry, or origins from other groups such as French Canadians/Acadians, may suffer a similar effect.

    One solution to dealing with vast numbers of IBP matches is a kind of “slide” the user can apply to change the cM threshold, as at Gedmatch. The FTDNA chromosome browser has this too, though only in the set increments of 1.0, 3.0, 5.0, and 10.0 cM. Still, a useful feature.

    While 23andme used to allow a sliding scale for matches in increments of 1.0 cM, from 5.0 to 15.0 cM, through Countries of Ancestry – in fact this was quite interesting, since most IBP matches drop off even by 8.0 cM – they abolished this and other useful features in the past months, making 23andme ever-less useful for ancestry analysis.

    • What would work best if they allowed an adjustment for the largest segment which is also critical in determining a true 2nd, 3rd or even a 4th cousin relationship or anything in between that. Obviously getting largest segment at 7cM but more than 100cM total is a distant match for me. I have a 7.7cM largest segment, total 105.5cM. Our ancestors were in the same geographic area (somewhat) but just ONE intermingling beyond a genealogical time frame wouldn’t have caused this type of match. It should be multiple if not the same ancestor multiple times. So these type of matches are useless.

      • Yes, this is my main issue with 23andme in particular. In the end it’s a larger part of a conceptual problem for genetic genealogy, which is, When do “genealogical” markers become “population” markers?

    • I was actually told that was the case for me. Not sure about others.

      When I sort my matches by LONGEST BLOCK & look at the smallest size, it says 5.11cM LONGEST BLOCK while the total is 134.14cM. Next largest one above that is 5.68cM LONGEST BLOCK, total shared 141.83cM.

      On that last page, there are only 8 matches with the lowest at 5.11cM (longest block) as I said, and at the top, 6.86cM longest block but total 211.13cM.

      • Hm. If I look at the kit of someone ~23.5% Ashkenazi that I manage, their lowest long block is 7.69, the same amount quoted in the ISOGG page. Perhaps they make manual adjustments depending on ancestrally informative markers.

      • Kalani — I have a hunch that a very large number of small segments overpowers the typical number we see for shortest segment. When I uploaded my experimental 23andMe v4 kit to FTDNA (with lots of no-calls), it generated a lot of small pseudo-segments, and I matched another person who had a huge number of no-calls from a regular FTDNA kit. (We later found out that was the reason he was so “matchy” at GEDmatch.) The v4 kit performed very nicely on long segments, though.

      • Ann…..I never thought of dropping the threshold from the 23andme kits compared to our (my mother, mine, brothers, aunt & cousins) matches to see if there are the same tiny segments. But at FTDNA, this is what some of the other distant people (one we have no known connection within a genealogy time frame while the other we know we share common ancestors from 8 centuries ago) look like when compared to just my mother.

      • I am 2 percent AJ on Ancestry and have 25 full Ashkenazi relatives on GEDmatch. Those are on my nat mom’s side, who had UK roots. My nat father was Sephardic. Question for the author: can he be certain that he does not possess any DNA from Jacob?

  12. Pingback: Concepts – Parental Phasing | DNAeXplained – Genetic Genealogy

  13. Pingback: Concepts – Managing Autosomal DNA Matches – Step 1 – Assigning Parental Sides | DNAeXplained – Genetic Genealogy

  14. Pingback: The Concepts Series | DNAeXplained – Genetic Genealogy

  15. Pingback: Nine Autosomal Tools at Family Tree DNA | DNAeXplained – Genetic Genealogy

  16. Pingback: Concepts – Match Groups and Triangulation | DNAeXplained – Genetic Genealogy

  17. Pingback: Double Match Triangulator (DMT) | DNAeXplained – Genetic Genealogy

  18. Very interesting read, I love how you made it a little easier to understand. I still have a question though. I have a match with whom I share 70.6 cM on chromosome 17, which seems like a lot for one chromosome. We also match on chromosome 5 with 10 cM. That’s the extent of our matching. She has a surname (Honeycutt) that is on my dad’s side, but my mom’s half brother matches her in the same segment I do on chromosome 17. According to gedmatch.com my mom and dad are not related. Could it be that my mom’s brother’s dad is related to my dad? Or is this just a case of IBC? Thanks,
    Karen

  19. Pingback: 2016 Genetic Genealogy Retrospective | DNAeXplained – Genetic Genealogy

  20. Pingback: Concepts – Segment Size, Legitimate and False Matches | DNAeXplained – Genetic Genealogy

  21. Hi Roberta, it’s your Acadian cousin, Joan. This article was so helpful for me as I recently had my DNA tested and now I’m trying to use all the tools on GEDmatch and make sense of things. I have a French Canadian distant cousin with whom I have one match that is 10.1cM. We also have a second match on chromosome 6 that is 3.8cM and 2,553 SNPs. Is that an unusually high SNP number for a short segment, or does that have to do with the particular location on chromosome 6? I wasn’t sure what to make of this. I have read that the DNA in chromosome 6 has coding for our immune system and is usually passed down in larger blocks. Thanks
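One way to put numbers on the question above is to look at SNP density, meaning SNPs per centiMorgan, which is the "blades of grass between the lines" idea from the article. The sketch below is only an illustration: `snps_per_cm` is a hypothetical helper, not a GEDmatch function. It compares the 3.8 cM segment quoted in the comment with the article's example threshold of 500 SNPs over 5 cM.

```python
def snps_per_cm(snp_count, centimorgans):
    """Return SNP density (SNPs per cM) for one segment.

    Hypothetical helper for illustration only; no testing company
    is known to report this exact figure.
    """
    if centimorgans <= 0:
        raise ValueError("segment length in cM must be positive")
    return snp_count / centimorgans


# Segment quoted in the comment: 3.8 cM with 2,553 SNPs on chromosome 6.
comment_density = snps_per_cm(2553, 3.8)

# The article's example threshold: 500 SNPs over 5 cM.
example_threshold_density = snps_per_cm(500, 5.0)

print(f"Comment segment:   {comment_density:.1f} SNPs per cM")
print(f"Example threshold: {example_threshold_density:.1f} SNPs per cM")
```

A much higher density than the threshold example only says that this stretch holds many tested SNPs relative to its genetic length. It does not by itself say whether the segment is IBD or IBC.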

  22. I first wanted to say this is the most informative and easy to read article I have come across in my recent dive into genetics. Thank you (!)
    That being said… help! I’m still confused on what to actually look for. Example: I match (GEDmatch) another human on four chr (chromosomes?) and the ratios are set to their standards. Here is a c/p of that ‘one-on-one’ match (this is my #1 match on GEDmatch)
    Could this be an actual biological relation(?!) And if so, how does one decipher how close/distant a relation?
    ••••••••••••••
    Minimum threshold size to be included in total = 500 SNPs
    Mismatch-bunching Limit = 250 SNPs
    Minimum segment cM to be included in total = 7.0 cM

    Chr Start End cM SNPs
    2 2,211,750 12,118,203 25.8 1,541
    3 188,421,650 195,401,596 17.3 933
    5 91,139 4,694,160 13.9 807
    11 69,392,754 95,070,864 23.1 2,556
    Largest segment = 25.8 cM
    Total of segments > 7 cM = 80.1 cM
    4 matching segments
    Estimated number of generations to MRCA = 3.7
    •••••••••••••
    Thank you so much for your time, I appreciate any help that comes my way 🙂

    -kats
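As a rough illustration of how the thresholds in the comparison above interact, the sketch below applies a minimum cM and minimum SNP count to the four quoted segments and recomputes the summary lines. The thresholds and segment values are copied from the comment. The function name `summarize` and the filtering logic are assumptions for illustration and are not GEDmatch's actual code. The "Mismatch-bunching Limit" and the generations estimate are not modeled.

```python
# Thresholds as quoted in the comment (assumed to apply per segment).
MIN_SNPS = 500
MIN_CM = 7.0

# (chromosome, start, end, cM, SNPs) exactly as quoted in the comment.
segments = [
    (2, 2_211_750, 12_118_203, 25.8, 1541),
    (3, 188_421_650, 195_401_596, 17.3, 933),
    (5, 91_139, 4_694_160, 13.9, 807),
    (11, 69_392_754, 95_070_864, 23.1, 2556),
]


def summarize(segments, min_cm=MIN_CM, min_snps=MIN_SNPS):
    """Keep segments that clear BOTH hurdles, then total them."""
    kept = [s for s in segments if s[3] >= min_cm and s[4] >= min_snps]
    return {
        "count": len(kept),
        "total_cm": round(sum(s[3] for s in kept), 1),
        "largest_cm": max((s[3] for s in kept), default=0.0),
    }


summary = summarize(segments)
print(summary)
```

With the quoted thresholds all four segments clear both hurdles, and the total and largest segment agree with the 80.1 cM and 25.8 cM reported in the comparison. Raising `min_cm` to 15.0, for example, would drop the chromosome 5 segment from the total.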

  23. Hi Roberta,
    Thank you for this article…it has helped me immensely!
    As my parents are both deceased, can I use my sibling’s results for phasing to identify IBD vs IBC?
    I am trying to identify whether a match is IBD or IBC for a distant relative. I have the common surname and also common townland in Ireland.
    At the default threshold in GEDmatch, there is no match, but when I lower the SNPs to 400, we have the following match, from this person to both me and my sister…

    Chr Start Location End Location Centimorgans (cM) SNPs
    9 10,671,522 14,010,982 4.7 451

    This is exactly the same for both of us. Would I consider this IBD or IBC?
    Thank you for your time!
    Margie

    • Siblings can’t be used in this manner, because both you and your sibling inherited from your parents – and you could both have inherited the exact same segment from your parents that is reading as IBC.

      • Thank you!
        Because I can’t use my parents, it would be wise to just stick to the default numbers, correct?
        Would there be any other way of checking to see if a match like the above would be an IBD?
        Thanks again Roberta…..I am such a novice at this….trying to wrap my mind around it.

      • If you have known cousins who match on the same segment from one side or the other, then that’s a good indication that the segment is IBD.
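Checking whether a known cousin matches "on the same segment" comes down to comparing start and end positions on the same chromosome. The sketch below is a minimal illustration under stated assumptions: `overlap_bp` is a hypothetical helper, and the cousin's segment is invented purely for the example. Only the chromosome 9 segment comes from the comment above.

```python
def overlap_bp(seg_a, seg_b):
    """Return the number of base pairs two segments share.

    Each segment is a (chromosome, start, end) tuple. Segments on
    different chromosomes, or segments that do not overlap, return 0.
    """
    chr_a, start_a, end_a = seg_a
    chr_b, start_b, end_b = seg_b
    if chr_a != chr_b:
        return 0
    return max(0, min(end_a, end_b) - max(start_a, start_b))


# Segment quoted in the comment above (chromosome 9).
match_segment = (9, 10_671_522, 14_010_982)

# HYPOTHETICAL segment shared with a known cousin, invented for illustration.
known_cousin_segment = (9, 11_000_000, 15_000_000)

shared = overlap_bp(match_segment, known_cousin_segment)
print(f"Overlap: {shared:,} base pairs")
```

An overlap in positions is only the first check. The known cousin and the new match also need to match each other on that stretch, since two people can match you at the same location on opposite parental sides.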

  24. Pingback: X Matching and Mitochondrial DNA is Not the Same Thing | DNAeXplained – Genetic Genealogy

  25. This is a great article. I think there are a lot of people who write off anything under 7 or even 10 or higher cM, and some get quite shirty that you even contacted them. You do indicate that reasons for looking are very different. Many people are looking for cousins within their own community, and in a place like America they will have more than enough to keep them occupied. I have NZ and Australian family for the last 3 to 8 generations. I am highly unlikely to ever get an 8th generation American showing up at 10cM or more. We are relatively small population countries with very few years of testing availability, so VERY few identifiable 2nd-4th cousins are within the databases. My brick walls are around 1800-1860 in rural Scotland, England and Ireland. While we might never find the matching ancestor, finding a cluster of people with 4 to 8 shared segments of 3+ cM does give some clues as to what area of Ross-shire my Ross Great Grandfather actually DID come from. I have been doing a one-place genealogical study on Strathdon, Scotland, and so far almost every person links to every other one in multiple pathways – I EXPECT to see lots of small links when I do a GEDmatch comparison on someone who thinks they may come from there. I actually start worrying when I find nothing at all down to 3 cM, as it is possible that their tree is totally off.
    I just wish people would respect my purpose and not step in to cut off conversations on group forums with “Matches under 7 are false and useless” especially when the person shares an uncommon name AND a small geographical location. These forum experts may have just cemented a brick-wall in place for both of us.

  26. In the last 18 months, have the numbers or caveats changed in your table of values from the companies that you give at the beginning of the Thresholds section? Also, it seems the criteria may differ between what’s used to determine a match and what’s used to calculate the total shared cM number between two people.

  27. Chr Start Location End Location Centimorgans (cM) SNPs
    22 23,433,024 25,987,507 8.8 717
    What does this mean? Can someone please tell me? Thanks.

    • Chr Start Location End Location Centimorgans (cM) SNPs
      22 23,433,024 25,987,507 8.8 717
      Largest segment = 8.8 cM
      Total of segments > 7 cM = 8.8 cM
      1 matching segments
      Estimated number of generations to MRCA = 6.5

  28. I have documented ancestry to my 4G grandparents, as does another contact, to the same 4G grandparents. We do not appear as a match at any of the sites. Lowering the threshold at gedmatch to 1cM, we come up as a 1.15 cM match. What is happening here?
