Lots of people have struggled with exactly how to identify and work with autosomal DNA matches and how to create DNA match groups and triangulation groups, which aren’t at all the same thing. Add to that multiple testing vendors who provide you with different types of information in different formats, and it’s a challenge.
Now I have a confession to make. I’ve gotten very behind on keeping up with matches and such. Family Tree DNA recently made improvements to their matching algorithm which changed the matching amounts with several of my matches, so I’m going to “start over” with my matching spreadsheet and use the steps as an example for you of how you can do this. Yes, I will preserve what info I have previously collected, of course, but if I’m adding something from previous information, I’ll tell you.
Goal: I want to see how much I can figure out from what I have available to me at the three vendors.
There has been a lot of discussion recently about the lack of response when people attempt to contact their matches, so let’s see what we can do with just DNA.
I don’t know how many steps this series will be. We’ll see. I’m trying to do this in manageable “bites.” And yes, there will be some homework, but don’t think of it that way. Think of it as panning for gold – your ancestors!!!
Before we go on, let’s talk about who these techniques in today’s article work best for:
- People with one or both parents
- People with known cousins
Adoptees or people with no known cousins can still learn about sorting and matching, but will not be able to assign genealogical sides to matches without working with their matches to discover their shared ancestor.
Adoptees should be utilizing a different set of techniques taught by www.dnaadoption.com.
People who are not adoptees but who have no known cousins who have tested will, hopefully, be able to identify ancestral groups based on the genealogy of the other people in match groups. Perhaps they will discover new cousins.
So, stay with me and just skip steps that you don’t think apply to you AFTER reading them.
What We’re Doing in this Article
In this article we’re going to do the following:
- Combine your match results and your parent(s)’ match results into a single spreadsheet
- Do some preparatory maintenance
- Sort the spreadsheet so you can see common matches
- Identify matches to a maternal or paternal side of your family
- Further identify parental “sides” based on known cousin matches
Matches and Stats
So let’s start with some basic information.
At Family Tree DNA, I have 1470 matches and my mother has 803.
Why does my mother have only about half of the number of matches that I do? Three of her 4 grandparents came from the old country. All of the databases are highly skewed towards “New World” testers. My father’s line is very colonial and has been in the US, having children, lots of children, for hundreds of years now.
My father died in 1963, so clearly I don’t have his DNA in any database, except the Y by virtue of other Estes males and at GedMatch by virtue of a phased parent kit. We’ll work with GedMatch in a future article.
I provided instructions for how to download your chromosome browser matches at Family Tree DNA here. If you haven’t done that, do it now for the following people:
- Both of your parents
- One of your parents if you don’t have both
- If you don’t have both parents, download the files for FULL siblings only
If you haven’t already done so, save the files as Excel files and not CSV files, as the CSV format does not support some of the coloration and other functions we’ll be doing. (File, save as, Excel Workbook)
Why Full Siblings?
If you have the DNA results for both parents, you don’t need your sibling data, and it will just add unnecessary bulk to your file. However, if you don’t have either parent or only one parent, your full siblings’ information will be helpful.
You receive 50% of each parent’s DNA. Your siblings do too, but not the exact same 50% (unless you are identical twins). Therefore, the matches your full siblings receive, especially in the absence of one or both parents, are as relevant to your genealogy as your own matches. By including your full siblings’ matches, you can recover some of the matches your parents would have had if you had their DNA results.
Selecting Files and Colors
You are going to be combining spreadsheet files for you, your parents and your full siblings if you don’t have both parents.
The file you want to combine is the file that shows your chromosome matches to other participants. When you download results from Family Tree DNA, there are two files titled:
- Family Finder Matches
- Chromosome Browser Results
The chromosome browser results file is the one you want and includes the following information.
Select the Chromosome Browser file to work with that holds your results and save it with a title something like “DNA Master Spreadsheet.” That’s the file you’ll be adding to for the duration…meaning forever.
Before proceeding, I want you to think for a minute about coloration. You’re going to color different family members’ results different colors so you can recognize them at a glance and so that sorting and discerning matches is easier.
In my case, I left my rows as white. I colored my Mom’s file pink and while I don’t have a father at Family Tree DNA, he would be colored blue if I did. This makes it easy for me to see who is who and it’s intuitive for me.
If I was utilizing full siblings, I would likely color them in some way that makes sense but is easily distinguishable from the parents. Maybe sisters would be shades of pink and brothers would be shades of blue. Whatever you select, make sure it makes sense to you.
Next, you’re going to create the master spreadsheet, and you WILL write down the legend. Now you may think you’ll remember, but one time I copied additional matches into my spreadsheet and I inverted Mom’s and my colors, pink and white, and it was never right again. That’s actually part of why I’m “starting over.”
Creating a Master Spreadsheet
Open your spreadsheet (now titled DNA Master Spreadsheet) and color the relevant rows in your color, unless your row color is white, then do nothing.
Open your parent’s spreadsheet(s) and color their rows appropriately.
Here’s an example of Mom’s.
You are now going to copy and paste the entire set of information from your mother’s spreadsheet into your spreadsheet to make one combined spreadsheet. Do NOT do this until AFTER the rows are colored.
If you have both parents, repeat this same process for your father’s results after they are colored.
If you have both parents, you don’t need your siblings’ files because your siblings only inherited part of your parents’ DNA, and you already have both parents.
If you don’t have BOTH parents, then you’ll add your FULL siblings. Half siblings will be used later for another step, but NOT here because you can’t differentiate easily between what part of their DNA is from your common parent (especially if you share the “missing” parent) and perhaps from their other parent’s side.
If you are utilizing full siblings, then copy their information into the master spreadsheet as well – but not until AFTER it’s colored.
On another spreadsheet tab titled “Legend”, I recorded the following information:
Do not neglect this step or you will one day be very sorry! Voice of experience here.
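If you prefer scripting to Excel, the "color first, then combine" step can be sketched in Python. This is only an illustration under stated assumptions: the column names, the two tiny made-up files, and the "Owner" and "Color" columns are hypothetical stand-ins, not the vendor's actual file layout. Recording the owner in a column as well as by color means an accidental color swap can always be undone.

```python
import csv
import io

# Hypothetical legend: whose rows are whose, and the color used for them.
LEGEND = {"Me": "white", "Mom": "pink"}

# Assumed column names for the downloaded chromosome browser file.
HEADER = "Match Name,Chromosome,Start Location,End Location,Centimorgans,Matching SNPs\n"

# Made-up stand-ins for two downloaded files.
MY_FILE = HEADER + "Alfred,1,1000,9000,12.5,2500\n"
MOM_FILE = HEADER + "Alfred,1,1000,9500,13.1,2600\nBetty,2,500,4000,6.2,900\n"


def load_rows(csv_text, owner):
    """Read one export and tag every row with its owner and legend color."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        row["Owner"] = owner
        row["Color"] = LEGEND[owner]
        rows.append(row)
    return rows


# Tag first, then combine, mirroring "color the rows BEFORE you paste".
master = load_rows(MY_FILE, "Me") + load_rows(MOM_FILE, "Mom")

for row in master:
    print(row["Owner"], row["Color"], row["Match Name"], row["Centimorgans"])
```

With real data you would read each saved file from disk instead of the inline strings used here.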
A Bit of Housekeeping
Because my descendants (children, grandchildren) only received their DNA from me (and their father), I removed their results from this spreadsheet. Their DNA is not helpful for identifying MY ancestors. I also removed the segments where mother and I match each other because they are irrelevant. It won’t hurt anything if you skip this step. It just reduces the size of your spreadsheet a bit.
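For anyone scripting the housekeeping, here is a minimal sketch of the same cleanup. The names, the "Owner" column, and the sample rows are all made up for illustration; the only idea taken from the text is that rows matching a descendant, or matching another person already in the spreadsheet, get removed.

```python
# Made-up rows; "Owner" is whose file the row came from (an assumed extra column).
rows = [
    {"Owner": "Me", "Match Name": "Mom"},
    {"Owner": "Me", "Match Name": "My Child"},
    {"Owner": "Me", "Match Name": "Alfred"},
    {"Owner": "Mom", "Match Name": "Me"},
    {"Owner": "Mom", "Match Name": "My Child"},
    {"Owner": "Mom", "Match Name": "Alfred"},
]


def housekeeping(rows, family_in_sheet, descendants):
    """Drop rows that match a descendant or another person already in the sheet."""
    drop = set(family_in_sheet) | set(descendants)
    return [r for r in rows if r["Match Name"] not in drop]


cleaned = housekeeping(rows, family_in_sheet={"Me", "Mom"}, descendants={"My Child"})
print(len(rows), "rows before,", len(cleaned), "rows after")
```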
A Parentally Phased Spreadsheet
You have just created a parentally phased spreadsheet.
Isn’t this exciting?
Now, how does this work?
If you are not familiar with the terms, identical by descent (IBD), identical by population (IBP) and identical by chance (IBC), or need a refresher, this would be a good time to read the “Identical By…” article.
Time to Make A Decision – To Delete or Not
We’re going to be using the terms centiMorgans abbreviated cM and SNPs. If you’re not familiar with these terms, or would like to review information about using small segments, it would be a good time to read the concepts article about CentiMorgans and SNPs.
Some people remove segments from their spreadsheet below a specific cM size.
I don’t, but my goals may be different than yours. I want to know every single thing possible. I also participate in the research aspect of genetic genealogy, so if I delete segments of any size, I’m deleting information that may be useful in one way or another, so I don’t delete.
You may not be interested in research, so let me share with you some rules of thumb.
I did a small study on parentally phased matches. You can read about the results in “The Threshold Study” section at the end of this article.
Suffice it to say that when I studied four families of three generations each of non-endogamous families, there seemed to be a cutoff at about 3cM/500 SNPs where segments below that level did not reliably phase for three generations in the same family, and segments above that tended to phase. By phasing, I mean the segment was passed from a grandparent, to a parent, to a grandchild intact. If you need a refresher about parental phasing, you can read about that here.
On the chart below, from that article, green means the segment phased in all upstream generations and red means that it did not. The black bar is about where the “reliable phasing line” occurred.
In one case, in a fifth study, below, I had four generations to work with, and the same threshold seemed to work. 2, 3 and 4 match means that’s how many generations were upstream. If the segment didn’t match on any upstream individual, it’s counted as a nonmatch.
What is the take home message here? If segments don’t even phase reliably within families, they aren’t going to be reliable elsewhere either.
So, unless you’re interested in research, like I am, then you could safely delete any segment below 3cM.
Other genetic genealogists who have been working with triangulated segments a long time use 5cM as a cutoff in non-endogamous populations. I wouldn’t delete segments larger than 5cM, but some do. Look at it this way, larger segments put the relationship closer in time. Smaller segments are further away. If you’re an adoptee and you really only care, for now, about close relationships, then fine, delete as much as you want. But if you’re looking for colonial American ancestors, you might want to consider keeping those smaller segments, at least the ones over 3cM at 500 SNPs, which is the lowest number of SNPs reported by Family Tree DNA.
If you are going to delete, now is the time. Simply sort your spreadsheet by cM size and delete all the rows you don’t want.
Be SURE you know how to sort the entire spreadsheet and not just one column, because if you sort just one column, the rest of the data stays in place which means the rows are all messed up – as in forever. (Highlight only the column header and sort. Do not highlight the entire column.)
I’ll close my eyes while you delete!
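If you do decide to delete, the cutoff can also be applied in a script, which sidesteps the "sorted only one column" disaster because each row travels as a single unit. The sketch below assumes a "Centimorgans" column and uses made-up rows; the 3 cM default is simply the rule of thumb discussed above, and you can pass 5 instead.

```python
# Made-up segments; only the column name "Centimorgans" is assumed.
rows = [
    {"Match Name": "Alfred", "Centimorgans": "23.73"},
    {"Match Name": "Alfred", "Centimorgans": "2.9"},
    {"Match Name": "Betty", "Centimorgans": "3.0"},
    {"Match Name": "Betty", "Centimorgans": "1.2"},
]


def keep_segments(rows, min_cm=3.0):
    """Return only the rows whose segment size is at least min_cm."""
    return [r for r in rows if float(r["Centimorgans"]) >= min_cm]


kept = keep_segments(rows)            # rule-of-thumb 3 cM cutoff
kept_strict = keep_segments(rows, 5)  # the stricter 5 cM cutoff some people use

print(len(kept), "rows kept at 3 cM;", len(kept_strict), "rows kept at 5 cM")
```

Because the function returns a new list, the original rows stay untouched until you are sure about the cutoff.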
Different Kinds of Matches Mean Different Things
You will see different types of matches as you work through your spreadsheet. Don’t do anything to your spreadsheet yet – read this next section first.
Matches if You Have Only One Parent
- Matches to you only and not your parent – this means they match to your other parent or are IBC.
- Matches to your parent only and not to you – this probably means you didn’t receive that DNA from your parent (or it’s IBC) but this match is still genealogically very valuable to you.
- Matches to both you and your parent – this is a phased match meaning you received the matching DNA from that parent because the person matches both you and your parent ON THE SAME SEGMENT. Why is “on the same segment” capitalized? Because you can match the same person on different segments through different parents. Yea, I know, cruel joke!
- Matches both you and your parent, but not on any common segments – this means your match is either to the other parent, IBC or we’re dealing with an anomaly. In some cases, a single matching segment has become split into two due to a read error.
Matches if You Have Both Parents
- Matches to one or both of your parents – You received the matching segment of DNA from the parent whom the other person matches as well. If you are from a highly endogamous population, expect that several of your matches will match you and BOTH parents, potentially on the same segment. That means your parents shared a common ancestor at some point in time.
- Matches to your parent(s) and not to you – this means that you did not inherit that DNA from your parents. These are still very valid genealogically relevant matches for you because they match your parents.
- Matches to only you and neither of your parents – this means the match is either IBC or you have barely missed the matching threshold due to an anomaly. I would label these as suspicious (IBC?) until I could look at them individually and they would be the last matches I worked with.
Sibling Matches with One Parent
If you have full siblings and one parent, you can have the following matches:
- Your matches match you, at least one sibling and one parent on the same segment. This means that the match is from that parent’s side of your tree, at least on that segment.
- Your matches match you, at least one sibling and does not match your one parent. This means that the match is from the missing parent’s side of the tree or you and your sibling are identically IBC.
Sibling Matches, No Parent
- Your matches match you and at least one sibling on the same segment. This means that you inherited this DNA from a common parent or the segment is identically IBC.
- Your matches match you and none of your siblings. If you have only one full sibling, this might happen about 25% of the time, but the more siblings you have, the lower the possibility that a match won’t match any of your siblings. This could indicate an IBC segment. If you know who your match is, for example, a first cousin on your father’s side, and they match you and your sibling(s), that segment of DNA is very likely from your father’s side.
Let’s Start Matching
You are going to sort (not filter) your spreadsheet by column four separate times, in the following column order:
- End Location
- Start Location
- Chromosome
- Match Name
What this gives you is a spreadsheet sorted by match, but within match the spreadsheet is sorted by chromosome, start and end position, in that order.
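Why does sorting several times in that order work? Each later sort keeps the earlier order among rows that tie, so the last column you sort by becomes the main grouping. The sketch below shows the idea in Python, whose built-in sort is stable in the same way. The four keys (End Location, Start Location, Chromosome, Match Name) are an assumption inferred from the resulting order described above, and the rows are made up.

```python
def chrom_key(value):
    """Order chromosomes numerically, placing X after 22."""
    return 23 if str(value).upper() == "X" else int(value)


rows = [
    {"Match Name": "Betty", "Chromosome": "2", "Start Location": "500", "End Location": "4000"},
    {"Match Name": "Alfred", "Chromosome": "X", "Start Location": "100", "End Location": "900"},
    {"Match Name": "Alfred", "Chromosome": "10", "Start Location": "700", "End Location": "1500"},
    {"Match Name": "Alfred", "Chromosome": "2", "Start Location": "700", "End Location": "2500"},
    {"Match Name": "Alfred", "Chromosome": "2", "Start Location": "700", "End Location": "1500"},
]
original = list(rows)

# Four separate passes, least important column first. Whole rows move together.
rows.sort(key=lambda r: int(r["End Location"]))
rows.sort(key=lambda r: int(r["Start Location"]))
rows.sort(key=lambda r: chrom_key(r["Chromosome"]))
rows.sort(key=lambda r: r["Match Name"])

# The same order in a single pass, using one combined key.
one_pass = sorted(
    original,
    key=lambda r: (
        r["Match Name"],
        chrom_key(r["Chromosome"]),
        int(r["Start Location"]),
        int(r["End Location"]),
    ),
)

for r in rows:
    print(r["Match Name"], r["Chromosome"], r["Start Location"], r["End Location"])
```

Note that the chromosome is compared as a number, so chromosome 10 lands after chromosome 2 rather than before it.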
Here are my first two matches. You can see that they are in chromosome order, smallest to largest, for each matching individual.
Since there are no pink interspersed rows, neither of these two people match my mother, so they are either from my father’s side or are IBC. To have an IBC match of 23.73 cM would be highly unusual. I have seen non-parentally phased segments as high as 8cM which indicates an IBC match, but that’s unusual and I’ve only seen it once.
Add four columns to the right of the Matching SNPs column, labeled:
- MRCA (Most Recent Common Ancestor)
Some people retain a lot more information in the spreadsheet, such as e-mail address and a communications history other than in comments. I don’t, but you may want to.
Now you’re ready for the fun stuff!!!
You’re going to work your way through the entire spreadsheet (after you’ve sorted as per the instructions above) and you’re going to identify the “side” that your matches fall on, as best you can.
Do NOT, and I really mean do NOT assume. So if you see a surname you just KNOW matches one side of your family, do NOT assign it a side unless:
- you know that person and how they match
- they match your parent or close relative
When I did this step, I had 10 sure foolers that would have been WRONG if I had made that assumption. Don’t fall into that trap.
Let me give you two quick examples.
One of my mother’s surname lines is Lore which is spelled a variety of ways, including Lohr. There was a Lohr male, but he did not match my mother, so he is clearly not from her side.
There is another individual with the surname Dotson, which is one of my father’s lines, but she matches both me and my mother.
No assuming allowed! Thank goodness for tools.
A Phased Parent Match
Here’s what a phased parent match looks like. You can see that Alfred matches me and Mom both on at least some of the same segments. This firmly puts this individual on “Mom’s side.” In the column labeled “Side,” type Mom.
Let’s take a minute and look at this match, row by row.
The rows where Alfred matches my mother but not me are shown in yellow in the chromosome column. This means that either I didn’t inherit those segments, or they were IBC matches.
The rows colored green are the segments where Alfred matches both mother and me. That’s a respectable size segment, so very unlikely to be IBC and probably inherited from a common ancestor.
The rows colored red are where Alfred matches me, but not mother, meaning these segments are NOT parentally phased. If you look at the segment size, all of these with one exception are below 3cM, so would have been deleted if you are deleting small segments.
There is also a possibility that Alfred matches me and not Mother on some segments because he could ALSO match me on my father’s side. In my case, it’s very unlikely because my parents have very different geographic ancestry, but it’s not entirely impossible and we always need to keep that possibility in mind.
So, while I’m labeling this person, Alfred, as a match on Mom’s side, each segment always needs to be evaluated on its own merit when you’re actually evaluating the strength of matches. We’ll cover that in a later article. For today, we’re just assigning “sides” based on parental and identified relative matches.
In case you’re wondering, I selected the colors for these segment matches utilizing stop light colors. Green is go, a good match, red means stop, no phased match and yellow is “OK,” not green and not an alert. Both yellow and green are genealogically relevant to you. Red is not, at least not relative to this parent.
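The stop light logic can be written down as a small rule. In this sketch, "matching on the same segment" is approximated as two rows for the same match that overlap on the same chromosome; that overlap test, the column names, and the sample rows are assumptions made for illustration only.

```python
def overlaps(a, b):
    """True if two segments are on the same chromosome and share any positions."""
    return (
        a["Chromosome"] == b["Chromosome"]
        and a["Start Location"] <= b["End Location"]
        and b["Start Location"] <= a["End Location"]
    )


def stoplight(row, rows):
    """Green: both of us match here. Yellow: only Mom matches. Red: only I match."""
    others = [
        o for o in rows
        if o["Match Name"] == row["Match Name"] and o["Owner"] != row["Owner"]
    ]
    if any(overlaps(row, o) for o in others):
        return "green"
    return "yellow" if row["Owner"] == "Mom" else "red"


# Made-up rows for one match.
rows = [
    {"Owner": "Me", "Match Name": "Alfred", "Chromosome": 1, "Start Location": 1000, "End Location": 9000},
    {"Owner": "Mom", "Match Name": "Alfred", "Chromosome": 1, "Start Location": 1000, "End Location": 9500},
    {"Owner": "Mom", "Match Name": "Alfred", "Chromosome": 5, "Start Location": 200, "End Location": 800},
    {"Owner": "Me", "Match Name": "Alfred", "Chromosome": 7, "Start Location": 100, "End Location": 300},
]

colors = [stoplight(r, rows) for r in rows]
print(colors)
```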
If a person doesn’t match BOTH you and your parent, do NOT label the side at all.
In other words, just because that person doesn’t match you and your Mom doesn’t mean they are from your Dad’s side. Yes, I know this is counter intuitive, but they could also be IBC (identical by chance) and someplace between 10 and 20% of your matches will indeed be IBC. So we are ONLY assigning sides when we are positive.
If you have full siblings in the spreadsheet as well (because you have only one or no parents), you will have additional colored rows. If your sibling matches you, your mother and Alfred, for example, just type Mom for your sibling’s “side” as well if they fall into this grouping.
I don’t have a full sibling, but here’s an example of what a match between Alfred, me, mother and my full sibling would look like.
If a match matches you and one of your parents, but not on any overlapping segments, I put a Mom? in the “side” column to indicate that the person does match both me and Mother, but the match needs additional inspection. This happens very rarely, but I do see it occasionally, example below.
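Put together, the labeling rules for one tested parent come down to three outcomes: "Mom" when the match shares an overlapping segment with both of us, "Mom?" when the match appears for both of us but never on overlapping segments, and blank otherwise. Here is a hedged sketch of that decision. The overlap test, column names, and sample people (other than Alfred) are assumptions for illustration.

```python
def overlaps(a, b):
    """True if two segments are on the same chromosome and share any positions."""
    return (
        a["Chromosome"] == b["Chromosome"]
        and a["Start Location"] <= b["End Location"]
        and b["Start Location"] <= a["End Location"]
    )


def side_for_match(match_name, rows):
    """Return 'Mom', 'Mom?' or '' (blank) for one match. Never guesses 'Dad'."""
    mine = [r for r in rows if r["Match Name"] == match_name and r["Owner"] == "Me"]
    moms = [r for r in rows if r["Match Name"] == match_name and r["Owner"] == "Mom"]
    if not mine or not moms:
        return ""  # does not match BOTH of us, so no side is assigned
    if any(overlaps(a, b) for a in mine for b in moms):
        return "Mom"
    return "Mom?"  # matches both of us, but needs a closer look


def seg(owner, name, chrom, start, end):
    """Build one made-up spreadsheet row."""
    return {"Owner": owner, "Match Name": name, "Chromosome": chrom,
            "Start Location": start, "End Location": end}


rows = [
    seg("Me", "Alfred", 1, 1000, 9000), seg("Mom", "Alfred", 1, 1000, 9500),
    seg("Me", "Carl", 3, 100, 500), seg("Mom", "Carl", 8, 100, 500),
    seg("Me", "Dora", 4, 100, 500),
]

sides = {name: side_for_match(name, rows) for name in ("Alfred", "Carl", "Dora")}
print(sides)
```

The blank result for Dora is deliberate: a match who is absent from Mom's rows could be from the other parent or could be IBC, so nothing is assumed.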
How Well Does This Work?
Using this technique, I was able to label a total of 7139 spreadsheet rows as Mom’s side. Remember, you’re labeling BOTH your Mom’s and your common matches (the pink and white, above,) so you can’t just sort the “side” column for “Mom” and count to see how many of your rows you labeled.
Some people only label their (white) rows with the “Mom” label. It does make sorting easier, but I label both Mom’s and mine because I want to easily see on Mom’s grouping which ones also match me. Therefore, I label both Mom’s and my rows “Mom” when we share a common match.
Filtering vs Sorting
Sorting columns sorts the column from either highest to lowest or lowest to highest and shows you all of the data in all of your rows. Filtering allows you to view just selected data, not displaying the rest. Filters can be layered so that you can filter one column, then filter another column for a smaller subset.
To find just my rows that were labeled Mom, I filtered the “side” column by the cell value of Mom – which shows me all the rows with the value of Mom in the “side” column – and just those rows. There are both pink and white rows showing.
To utilize filtering, when you only want to see a specific subset of data, click on “filter” under “Sort and Filter.”
Now we’re going to add a second filter by clicking on the down arrows by the column header we wish to filter.
I filtered the name column for Roberta Estes, which shows you only the rows with “Mom” that also have Roberta Jean Estes in the Name column. This then gives you the total number of rows that have BOTH Mom in the “side” column and Roberta Estes in the “name” column. (Hint, when using filters, don’t forget to clear the filter after you complete your function. Otherwise, you’re only working with the filtered set of data and you may think you’re working with the entire spreadsheet.)
That total, visible at the very bottom of the page after filtering, is 3532.
So, 3532 of my 16,861 rows of matches are phased to my mother’s side, or 21%. That means the balance are either from my father’s side or IBC. Given that my mother has only about half as many matches as I do, 21% isn’t bad.
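The layered filter and the percentage are easy to reproduce outside Excel. This sketch filters a few made-up rows first by "Side" and then by "Name", then checks the arithmetic using the two row counts quoted above; the sample rows are invented and only the counts come from the article.

```python
# Made-up rows: "Name" is whose kit the row belongs to, "Side" is the label typed in.
rows = [
    {"Name": "Roberta Estes", "Side": "Mom"},
    {"Name": "Mom", "Side": "Mom"},
    {"Name": "Roberta Estes", "Side": ""},
    {"Name": "Roberta Estes", "Side": "Mom"},
]

# First filter: only rows labeled Mom (both pink and white rows).
mom_side = [r for r in rows if r["Side"] == "Mom"]

# Second filter, layered on the first: only my own rows.
mine_on_mom_side = [r for r in mom_side if r["Name"] == "Roberta Estes"]
print(len(mom_side), "rows labeled Mom,", len(mine_on_mom_side), "of them mine")

# The percentage, using the row counts from the article.
phased_rows = 3532
my_total_rows = 16861
percent = round(100 * phased_rows / my_total_rows)
print(f"{percent}% of my rows are phased to Mom's side")
```

Unlike a spreadsheet filter, nothing needs clearing afterwards, because the original list of rows is never changed.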
Next we are going to work with our known cousins whom we match in the spreadsheet. This works whether you have parents and/or siblings in your spreadsheet or not.
A Phased Cousin Match
Even if you don’t have parents to match, you’ll hopefully have matches to known cousins, aunts, uncles, etc. This is why we encourage genetic genealogists to test everyone they can find who will test. (The exception is that if your aunt tests, you don’t need her children to test – but you do need her siblings.)
This is exciting, because based on where your relative falls in your tree, you can assign them to the proper side of your family.
In this case, while my father is not available for testing, I know this individual and we are second cousins, so there is no question which “side” this match is from, especially since they don’t also match my mother. If I have full siblings, they probably match AP as well and you would see their colored rows interspersed in this match too.
Go back through your spreadsheet and assign positively identified cousins and family members from your non-phased parent’s side. In this case, people who I know positively are related to my father I’ll label Dad, because this person matches me on my father’s side of the tree.
Finish your entire combined master spreadsheet in this manner.
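Labeling from known relatives can be sketched as a simple lookup. The dictionary of positively identified relatives below is hypothetical (only the initials AP appear above), and the sketch fills in a side only where the "Side" cell is still blank, so earlier labels are never overwritten.

```python
# Hypothetical list of positively identified relatives and the side they belong to.
KNOWN_SIDES = {"AP": "Dad"}

# Made-up rows with a "Side" column that is blank until labeled.
rows = [
    {"Name": "Me", "Match Name": "AP", "Side": ""},
    {"Name": "Me", "Match Name": "Alfred", "Side": "Mom"},
    {"Name": "Me", "Match Name": "Stranger", "Side": ""},
]


def label_known_relatives(rows, known_sides):
    """Fill blank 'Side' cells for matches whose relationship is positively known."""
    for row in rows:
        side = known_sides.get(row["Match Name"])
        if side and not row["Side"]:
            row["Side"] = side
    return rows


label_known_relatives(rows, KNOWN_SIDES)
print([(r["Match Name"], r["Side"]) for r in rows])
```

A match who is not in the list stays blank, in keeping with the "no assuming" rule.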
I was able to add 501 rows to my spreadsheet positively identified as my father’s side utilizing this methodology. This gives me a total of about 3% of my total spreadsheet rows. Not nearly as high as my mother’s side, but we’re no place near finished.
You might wonder how many people I had to work with on my father’s side. I had a total of 30 positively identified individuals. The closest to me was a 1st cousin once removed, and several that were quite distant. I have sponsored tests for about half of these individuals. The rest, I got lucky. I didn’t know most of them before I took up the hobby of genealogy. Several, I met through DNA testing.
With my mother and known cousins, I was able to identify about 25% of my matches to one side or the other, even without my father’s DNA. That’s pretty remarkable, especially given that my mother has so many fewer DNA matches than me.
Here’s a summary of what we’ve accomplished.
- Created a spreadsheet with all of your chromosome matches, with your rows colored white.
- If you have a parent, add their chromosome matches to the same spreadsheet, after coloring all of their rows appropriately. I suggest pink for Mom and blue for Dad.
- If you don’t have both parents, but do have full siblings, add their chromosome matches into the spreadsheet, after coloring their rows with a specific color.
- Delete small segments if you wish.
- Sort your spreadsheet into match order.
- Review all of your matches and label the matches that match you and either parent with the appropriate side.
- Review matches with known family members and assign the appropriate “parental side” to that cousin match.
In the next article, we’ll create match groups and figure out who is related to whom.
I am an English woman who has an 82% autosomal match with an American woman who has a single English ancestor dating back to 1860 who has the same name as my great-great-grandfather’s wife, who was born in around 1840. FamilyTreeDNA gives this woman as being a 2nd to 4th cousin. Can I take it for granted that this is our common ancestor?
No, you can’t take it for granted. Genealogically you would need to establish a paper trail and genetically you would need to establish triangulation. It’s a good avenue to follow, and quite promising, but you’re not there yet.
I am being pedantic, but I think “doing” should be “going”.
Time to Make A Decision – To Delete or Not
We’re doing to be using the terms centiMorgans abbreviated cM and SNPs.
Fixed. Thank you.
Roberta, just as a comparison, I have 2203 matches at FTDNA, and no AJ, which usually raises the number of matches at FTDNA.
As an aside, at Gedmatch, all of my reported matches are over 10.2 cMs.
This is a great opportunity to ask a question that always bothers me: If you have both parents tested, why are *you* important at all to the conversation? Yes, there might be a few segments that just miss the cutoff at the individual parent level, and combine in you to create a single segment which is large enough to be detected. But, generally speaking, non-trivial segments are already “visible” in the parents’ tests.
Please help me understand!
You could work on both parents individually. Some people want to know what they inherited, what parts they carry.
Ah, OK… So there’s nothing magical here that will help you find additional matches, right? It’s just that I can know that this particular segment in common with a match I received via my mother (or father). Thanks.
There is nothing that will find you additional matches. Your matches are what they are until more people test who match you. This is about how to utilize your matches.
Wouldn’t testing both parents and yourself assure that you are indeed the child of both parents, or not? Some adoptees never knew they were not the bio child.
I wondered about how useful my kit would be as both my parents have tested. I’ve been surprised how useful seeing what I got, and what I didn’t, has been in trying to phase my parents with my grandparents. It certainly helps with clarifying matching groups and in mapping, especially when I can identify a crossover.
While testing yourself won’t give you additional matches if both parents have undergone testing, there are ways *you* can help to sort things out:
(1) When making comparisons of a parent with a match on GEDmatch, if your phased GEDmatch kit also matches, it is helpful for indicating that matching segment is probably real and unlikely to be IBC (for long segments, this isn’t an issue, but for those, for example, 7-12 cM matches, it can be helpful).
(2) For matches on 23andMe and FTDNA who don’t/won’t upload to GEDmatch (to allow me to compare with other relatives who have tested elsewhere), having yourself plus a parent on those sites can help you sort out whether the match is on that parent’s paternal or maternal side – but only if you have done some chromosome mapping. For example, if I know that I have inherited a particular segment from my father’s mother (because both of us have a matching segment with his maternal cousin) and someone on my father’s match list at FTDNA or 23andMe is matching him, but not me, over this same segment, then this match must be on his paternal side (but I don’t rely on this for shortish segments).
I’ve already flunked out of your class. I’m an only child and both of my parents are untested and dead.
If you have any known upstream relatives who have tested, you haven’t flunked out. You can still designate parental sides based on where you match the matches of your parents’ various relatives.
I have found a 1C1xR. My grandmother is his great grandmother and this helps me in a sort of quasi phasing.
Do your parents have living siblings?
I’m in the same boat with you, only child, no test for my deceased parent and no living siblings for either of my parents. I tested at AncestryDNA and uploaded my results to FTDNA and GEDmatch, but being a total DNA newbie have no idea what it all means.
If you have any known cousins who could test or have tested, that will be helpful. In the meantime, I would suggest contacting any people in circles at Ancestry to see if they have uploaded to either Family Tree DNA or GedMatch, or both. That will help you because you know a common ancestor that you share.
Wonderful post Roberta. I just have one question…. I have my father and myself tested. My father is 25% Italian and 75% AJ. My mother was fully AJ. Is this futile or perhaps I can just raise the bar for shading green at 20 cM?
No, it’s not a lost cause but I would NEVER raise the bar that far for one contiguous segment.
Roberta, Thanks for this series. I’ve been doing something similar, but like your approach and help with column headings.
When you download the Chromosome Browser data and find blank spots in the Match Name column, do you discard them or try to figure out who they are. Most of the values are very low, so will probably get deleted, if I set 3 cM as the lower limit. I just thought I’d ask for others, who might have the same situation, with higher values. My first thought is ‘they don’t care, why should I?’ Then again, they could be the key to my ancestry!
I leave them in for now. I figure they don’t hurt anything. You may see who they cluster with and know the answer without even contacting them.
Alas, both my parents are deceased along with my hubby’s parents. I have tested, my oldest son tested, one of my brothers tested and my hubby tested. Hubby, I think, has a link to your mom. His kit on GEDMatch is T058365
I have tried to figure out this Chromosome Browser to do my brother & I but still have things to figure out. Very slow learner on DNA.
I have a combined spreadsheet for Mom, Dad and me. It’s color coded for several things but I have found that I cannot be trusted with color alone. In order to prevent accidental changes anything that is color coded now has to also be somehow identified in a column, even if it’s just with an X.
I keep three individual spreadsheets for each of us. One is just for match names with emails that is appended with new matches. One is a master of downloaded chromosome browser data that is appended but never manipulated. It’s there if there is ever a question of a name change or somehow data is lost. And one is called Five-Up in which I pull out any matching segments over 5 cM from the combined spreadsheet. This one is the most useful for me and the one I annotate.
I leave the names in the first column for just that reason:)
Roberta, You forgot one step in your summary …. Create a Legend to your Color Coding!
I’ve been doing something similar but just with the matches I know for a particular line that I’m working on. It’s nice to see someone else’s process.
Question. If I’m working on a theory with my Dad’s side (who died before testing was available) but I have Mom tested and two sibs and myself tested at FTDNA. Can I utilize the “Not in Common With” feature on the match with Mom on each of our tests (mine and sibs) to make the spreadsheet more manageable from the start? (Filter out the matches shared with Mom on each of our tests before entering the spreadsheet phase so in theory I just have either Dad’s side or IBC matches.)
The legend instructions are in the article. They are crucial. Doing ICW and not ICW is the long way around the block when you can sort for common genetic matching. I do not recommend deleting anything (except small segments if you’re going to do that) because as you obtain new matches, you’ll be adding them to this spreadsheet and if you have removed someone, that’s a lost opportunity.
Realized why I didn’t go the route of my thought/question before. That way only downloads the “matches” and not the chromosome browser matching info, which is what we want. I’m going to attribute that thought to too much garden weeding and not enough genealogy these last few weeks.
Weeding will do that to you:)
Another timely discussion, Roberta, prompting a question about a “gap” in one of my triangulated matches. This particular triangulation group is currently at 7 kits, but one of them shows an anomaly: there is a small segment of perhaps 3.5 cM where none of the other kits match. There are two matches immediately ahead of the gap, one match immediately beyond it, and 4 kits which match on both sides … with “dead space” in between. Have you ever encountered this?
Not exactly like that. Have you checked to be sure you’re not dealing with an area not tested by the various companies?
I didn’t even know there are untested areas! The area in question is roughly 172.7-174.1 Mbp on #5.
The areas that are untested are greyed out on the chromosome browser. They are referred to as SNP deserts.
I forgot to add that 3 of the 7 kits in the triangulation group match my own kit across the entire area which appears “dead” for one particular kit. So, it’s a tested area.
I have no parents living, but a sister and several first and second cousins who have tested. (1) Should I delete my matches to my sister and her matches to me just as you deleted direct matches to your mother? (2) To phase our combined results to mother or father’s side, can I use my maternal/paternal cousins now, or save them for later?
I have three atDNA tests at FTDNA: me, a full brother, and our biological mother.
How do I assign a paternal or maternal side to 4 different matches who show a match on X-DNA (and other chromosomes) to me and my brother, but our mother doesn’t show as any match to them at all?
You don’t. Those segments are likely IBC. You can’t assign those segments.
Roberta, I don’t have my mother’s DNA, only my father’s. I have a full sibling but she was done at Ancestry only this month, so I can’t import yet to FtDNA. I do have my mother’s full sibling, a sister, in FtDNA before the chip updates. Can I add her to my “master” sheet and do the same phasing of her to me? The matches that are my aunt’s but not mine are still my mom’s side; the ones that match me but not my aunt or my father could be IBC, or they could be a match to my mom where her sister, my aunt, did not get those segments?
Yes, absolutely, you can, understanding that you will miss some phased matches because your Mom and your aunt did inherit some different DNA from their parents.
Roberta, Within each match do you actually code each segment with the “stop light” colors (green, yellow, red) on your spreadsheet or was that just an illustration in your instructions?
No, I don’t. I do more of that when I’m doing the analysis. I was showing that as an example so people understand that different segments can come from different ancestors and match different parents.
Hi Roberta. I very much enjoyed the article. I am lucky enough to have tested my father and I have recently tested my wife, son, and grandson also. For homework and practice, I decided to phase my son’s kit. I found something curious that I am not sure I understand.
I know that my son got 100% of his DNA from my wife and me. One of my father’s matches match him on chr 4 at 24,450,790 – 34,709,899 / 12.06 cM’s. My son also matches this person on the same segment – 24,450,790 – 34,148,653 / 11.78 cM’s. The problem is, I do not match this same person on that or any other segment. I do not match that person on any segment or chromosome. We do not match at all.
Maybe you can discuss how this can happen in your next article?
By the way, my wife was born in South America and shares no matches with me or my father.
I do see that from time to time. On small segments, those are likely IBC. On larger segments, it’s very rare, and if you look on the chromosome browser, you’ll see a “blank” and sometimes a no-read location in the middle that divides a segment in such a way that it’s not showing as a match. There are only two options: you either received the DNA from one parent or the other, or you are IBC. The fact that neither parent matches the other person strongly suggests IBC.
Thanks for that excellent post, Roberta. I’m curious about something I notice in my spreadsheet, and yours. Sometimes my mother’s match lengths are slightly shorter than my match lengths with the same person. In an actual example, my mother’s match with the person is 17.66 cM and I match the same person on the same chromosome for 17.85 cM. The shared segment has the same exact ending point, but slightly different starting points. And 4195 SNPs for the match with me vs 4095 for the match with my mother.
Is this something that is normal, and should I ignore these small discrepancies? I.e., is it within some sort of margin of error for genetic genealogy? It is counterintuitive to me that in this situation my match length could be longer than my mother’s match length with the same person, since the match was passed down from her to me.
I’m guessing it means that in my DNA a portion of that match length is IBC and there is no way to distinguish how much.
It’s either IBC or fuzzy start or end points. Don’t worry about it.
Thank you eag0808 for this question and thank you Roberta for your answer. I am working my way through these comments for just that issue! With many of my ancestors in the U.S. for generations, I have lots of matches (and only my mother’s DNA to phase).
Thank you so much for your blog and instructions. It is so kind of you to share your knowledge so freely with all of us. I wouldn’t have any idea what to do with my results if it weren’t for you and others who can explain this in a way I can (sometimes) understand! I have a question on your stoplight example. I see that you have the segment on chromosome 10 colored green, but how come the segment on chromosome 9 isn’t colored green? I must be missing something. I have over 15,000 rows between both my parents and myself (after deleting segments under 3 cM), and I don’t want to color code them incorrectly!
Good eye Shelly. And the answer is…because I missed it. You were paying better attention than I was.
Thanks so much for this series. I really needed a step-by-step approach. My question is the same as Tom Morrow’s (above), but I didn’t see an answer. Both of my parents are deceased and untested, but my brother and my son have tested. 1)Should I delete the lines that show the matches between my brother and myself? 2)I have deleted my son’s results from mine, but should I delete the lines where he matches my brother?
Yes, your son’s matches are irrelevant to YOUR genealogy lines. You can also delete your matches to your brother. Those are irrelevant too.
Both my parents died before they could be tested, but I have tested a maternal uncle and a paternal aunt. Would phasing work in the same way using their results?
Yes, the technique is the same, but you won’t be able to phase all of your maternal matches – but many.
And, I have a 1C, 1x removed (his mother and I were 1st cousins) and this helps me with a quasi phasing.
Yes, indeed, it does. You’ve got it.
Hello Roberta! Thank you for posting this wonderful lesson! I now have my summer genealogy activity in process! And this lesson has given me that nudge to finally learn EXCEL. And, I want to thank you for taking time between this lesson (step 1) and the next lesson (step 2). I’ve already spent hours and am not done with Step 1! Mostly because I’m a new Excel user. 🙂 But the process of assigning sides is time consuming and I appreciate that you didn’t throw all the steps at your followers at once – I may have hit overload!
I love your article and I’m working through the process. I have many matches (on both sides) that I don’t seem to have gotten any of the DNA from either of my parents. Some as much as 18.68 cM. If I tested my brothers would it then be possible to connect to these people?
18 cM is a huge segment not to match either of your parents. Are you matching to both parents’ actual kits, not reconstructed kits?
I am working on this spreadsheet for me, my mother and my two full brothers. If there is a match that is exactly the same for me and one brother or all three siblings and not my mother at all, can I assume we all received that DNA from my (deceased) dad? Is there a cM distance below which the DNA is likely IBC or IBP (in my case, colonial N.E.)?
Either it’s from your Dad or you’re all IBC in the same way, which is certainly possible. In part, it depends on the size of the segment. I disregard segments below 3cM. Between 3cM and about 10cM, it’s increasingly likely that they are IBD the larger they are. Above 10cM it’s almost certain to be IBD, but there are no guarantees in this, so I would simply either label them with a “Dad?” or not at all. About 20% of your matches will be IBC, even over the vendor threshold, so keep that in mind. That’s why I don’t label mine until I KNOW FOR SURE!
My spreadsheet is so big I’m only working on segments greater than 5 cM for right now and yes, I have many entries with “Dad?”
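The rule of thumb in the reply above (disregard below 3 cM, treat 3 to about 10 cM with increasing confidence, treat anything larger as almost certainly IBD, and use “Dad?” rather than a firm label) might be captured in a small helper like the one below. The function names are made up for this sketch, the cutoffs are the approximate ones quoted in the reply, and using a “Mom” label when the tested mother shares the segment is an assumption carried over from the side-assignment step.

```python
def ibd_confidence(cm):
    """Rough bucket for a segment size in cM, following the quoted rule of thumb."""
    if cm < 3:
        return "disregard"
    if cm <= 10:
        return "possible IBD"  # more likely IBD the larger the segment
    return "likely IBD"  # still no guarantee

def tentative_side(cm, mom_matches_segment):
    """Tentative Side value when only the mother has tested (an assumption of this sketch)."""
    if cm < 3:
        return ""  # small segments are left unlabeled
    return "Mom" if mom_matches_segment else "Dad?"

for size in (2.5, 8.76, 12.01):
    print(size, ibd_confidence(size), tentative_side(size, mom_matches_segment=False))
```

The question mark in “Dad?” is deliberate: a segment shared by siblings but not by Mom may come from Dad, or all of the siblings may be IBC in the same way.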
Am working on this. I added my Mom’s, my brother’s, and my colorized matches and just finished deleting matches below 3cMs … except … I decided to retain those below 3cMs who had 2000 or more shared SNPs.
Oddly, when I did this, those remaining matches who were a match below 3cMs with 2000 SNPs or more ALL matched on Chromosome 6. This amounted to 209 rows of matches. Some of these 209 rows contained people who appeared more than once.
Still, isn’t that weird? Why would this happen?
We’re going to talk about that on the next article.
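For anyone who wants to repeat the experiment described in this thread, keeping segments of 3 cM or more plus any smaller segment with at least 2000 SNPs and then seeing which chromosomes the small survivors fall on, here is a minimal sketch. The rows are made-up placeholders, not real match data, and the thresholds are simply the ones the commenter chose.

```python
from collections import Counter

# Made-up rows for illustration only: (match name, chromosome, cM, SNPs).
SEGMENTS = [
    ("Match A", "6", 1.8, 2400),
    ("Match B", "6", 2.1, 2050),
    ("Match C", "11", 12.01, 4000),
    ("Match D", "10", 2.4, 700),
]

def keep_segment(cm, snps, min_cm=3.0, min_snps=2000):
    """Keep a segment if it is long enough OR dense enough in SNPs."""
    return cm >= min_cm or snps >= min_snps

kept = [seg for seg in SEGMENTS if keep_segment(seg[2], seg[3])]

# Tally only the small (under 3 cM) survivors by chromosome.
small_by_chromosome = Counter(seg[1] for seg in kept if seg[2] < 3.0)
print(small_by_chromosome)
```

A tally like this only shows where the retained small segments pile up; it does not explain why.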
Another question: What does it mean when I have an atDNA match who was
(1) on my FTDNA FF match list,
(2) on my brother’s FTDNA FF match list,
(3) NOT on our Mom’s FTDNA FF match list,
(4) matches my brother and I at 12.01 cMs and 4000 SNPs on same segment/location on Chromosome 11, AND
(5) matches me (only) on Chromosome X?
How is this possible???? Isn’t DNA on Chromosome X passed down only by the mother?
This person matches my brother and I as follows:
Chrom.  Start      End        cMs    SNPs  Who
10      30622903   33570056   3.46   700   My brother
11      45033120   67624124   12.01  4000  Me
11      45033120   67624124   12.01  4000  My brother
X       98260963   102642207  4.51   525   Me
You get an X from both your mother and father but your brother only gets an X from your mother because he gets the Y from your father.
Both of my parents are deceased but I have DNA for my brother. I use Genome Mate Pro and use 6 different colors to color code matching DNA segments. Two colors when only P or M are known. I use 4 other colors for each of my grandparents’ lines. This provides a great visual aid on the paternal and maternal chromosome bars and in the tables and on the segment map.
I am a Powell via my father Thomas Powell. I am only able to go back to my Powell gg-gf — Thomas Jefferson Powell (1861-1899) and his wife (my gg-gm) Sarah Virginia Alford (1863-1931) . We have traced the Alford side, and know the descendants of Thomas and Sarah, but we have not been able to determine who my gg-gf TJ Powell’s parents and their ancestors were, or who his siblings and their descendants were.
That’s because all birth, marriage/divorce, death, and census records for Pike County, MS where he grew up were destroyed in a county courthouse fire.
As you know, my gg-gf TJ Powell’s siblings would be my 2nd great uncles and aunts, and their direct descendants would be related to me as follows:
– Their Children = my 1st cousins 3x removed (These would likely be deceased by now.)
– Their Grandchildren = my 2nd cousins 2x removed (Some of these would likely still be living.)
– Their Great-Grandchildren = my 3rd cousins 1x removed (Most of these are likely living.)
– Their Great-Great-Grandchildren = my 4th cousins (Most of these are likely living.)
Once I remove my mother’s definite matches, what should the cM cut off be in order for me to have the best shot at retaining ONLY the possible descendants of my paternal gg-gf TJ Powell’s siblings — that is, my 2nd cousins 2x removed, my 3rd cousins 1x removed, and my 4th cousins — and ensure that everyone else is eliminated from the spreadsheet.
You really can’t do that. The ranges are just too great. Take a look at Blaine’s shared cM project. Here’s the download. http://thegeneticgenealogist.com/wp-content/uploads/2016/06/Shared-cM-Project-Version-2.pdf
As I have been going through my spreadsheet and assigning “Mom” to my matches with her, I realize that there are a number of occasions where Mom will match someone on one chromosome, but I match that person on 2 or more chromosomes. Here’s an example (I’ve changed my name to “Me” and our match name to “Anderson”):
Me Anderson 3 148425802 151244214 3.61 700
Me Anderson 12 83951822 87831110 3.06 500
Me Anderson 20 46674201 48352524 3.3 500
Me Anderson 20 58958065 62382907 8.76 1170
Mom Anderson 20 58958065 62382907 8.76 1170
So, in this case, I see the segment on Chromosome 20 that I definitely inherited from Mom BUT what is that other segment on chromosome 20? And on Chromosomes 3 and 12? Are these from my Father (as I don’t have his autosomal and he has passed)? Or are they the result of “convergence”?
How do I label these types of occurrences on my spreadsheet?
I leave mine either unlabeled or mark them “Not Mom.”
How do we represent the rows in our spreadsheet where we directly match our parent but that segment doesn’t match anyone else? Below is my example. I am Deborah and my mom is Iantha. How should this be colored? As a match with Mom?
Deborah Thurman Parks Iantha Lou Thurman 1 72017 247093448 267.21 59444
Deborah Thurman Parks Iantha Lou Thurman 2 8674 242697433 253.06 57885
Deborah Thurman Parks Iantha Lou Thurman 3 36495 199322659 219.1 47331
Deborah Thurman Parks Iantha Lou Thurman 4 61566 191152644 206.75 40558
Deborah Thurman Parks Iantha Lou Thurman 5 91139 180625733 199.6 42186
Deborah Thurman Parks Iantha Lou Thurman 6 148878 170761395 189.14 48521
Deborah Thurman Parks Iantha Lou Thurman 7 140018 158812247 180.79 38277
Deborah Thurman Parks Iantha Lou Thurman 8 154984 146264218 161.76 37109
Deborah Thurman Parks Iantha Lou Thurman 9 36587 140186312 160.36 32896
Deborah Thurman Parks Iantha Lou Thurman 10 88087 135327873 176.25 39139
Deborah Thurman Parks Iantha Lou Thurman 11 188510 134439273 155.78 36818
Deborah Thurman Parks Iantha Lou Thurman 12 61880 132287718 167.39 35585
Deborah Thurman Parks Iantha Lou Thurman 13 17956717 114121631 126.48 27967
Deborah Thurman Parks Iantha Lou Thurman 14 18325726 106358708 111.66 23382
Deborah Thurman Parks Iantha Lou Thurman 15 18331687 32194759 20.96 3038
Deborah Thurman Parks Iantha Lou Thurman 15 32922299 100278685 96.05 18496
Deborah Thurman Parks Iantha Lou Thurman 16 28165 88690776 131.9 22794
Deborah Thurman Parks Iantha Lou Thurman 17 8547 78639702 124.33 20344
Deborah Thurman Parks Iantha Lou Thurman 18 3034 76116152 119.39 21734
Deborah Thurman Parks Iantha Lou Thurman 19 211912 63788972 99.07 15094
Deborah Thurman Parks Iantha Lou Thurman 20 11244 62382907 104.2 18396
Deborah Thurman Parks Iantha Lou Thurman 21 9849404 46897738 58.99 10222
Deborah Thurman Parks Iantha Lou Thurman 22 15492342 45772802 53.03 8902
Deborah Thurman Parks Iantha Lou Thurman X 1370495 154570039 195.93 18092
I actually remove those from my spreadsheet because I know I match my mother on the full length of every chromosome.
I have no close living relatives (that I know of) except for mother’s brother. Would testing him help me with this type of sorting?
Yes. Very much.
As I go through my matches with my mother (I do not have my father’s autosomal DNA) I’m finding that I have a LOT of Identical by Chance with my mom on chromosome X. Or at least, what is happening is that I’ll have a match with mom and then with that particular person, I’ll have a match with them on X – but Mom won’t. I just thought I’d pass this by you as I’m seeing it a lot.
You carry an X from your dad too.
Am so appreciative of your concepts article. Really helpful. I have a question about deleting from my master spreadsheet. Is the fact that both my sister and I match 2 different females on a tiny chromosome segment (1 cM) with proportionately large SNPs (2500 and 2700) of any significance, or is this just another instance of IBC? (Our brother didn’t match on this segment.)
Also, very timely Excel article. Often thought you could say it better than the references I’ve been using—and it’s true. Thanks!
That’s a huge number of SNPs for a 1cM block. I would not ignore it. I would just leave it there and marvel at it from time to time:) One day there may be something very interesting.
Remember, Ancestry strips out part of the DNA in their Timber processing.
Roberta, I didn’t have time this summer to start this project, but started today. Then I started wondering which of this is necessary since FTDNA has instituted the Phased Family Finder Matches. Functions on the website have changed too. Where do we begin now?
Before you decide what to do, I’d suggest reading the concepts articles, including Step 1 and Step 2 of Managing Your Autosomal DNA, before you decide exactly where and how to start.
Like the previous reader, I’m a bit late coming to the party, so I’ve got some catching up to do! But, I’ve run into an issue and thought that I would share in hopes that others might avoid some confusion down the road. The issue has to do with having multiple matches which have the same MATCHNAME string in the Chromosome Browser Results file. If not corrected, these matches will end up looking like a single match and it’s easiest to correct them before doing any sorting on the file. I’ve seen several scenarios causing this situation.
One scenario results in a blank MATCHNAME (this was mentioned earlier by reader Phyllis Morefield). This appears to happen when a match only has one of the three name fields (first, middle or last) filled in. You can check your Family Finder Matches file by using a filter to find all the entries with a blank “First Name”, then check to see if that match is in the CB Results file using the Full Name. If not, this would be one of the blank MATCHNAME cases. Then repeat the process looking for those with a blank “Last Name.” Sometimes a single-name-MATCHNAME actually makes it into the CB Results file, but all of my blank MATCHNAMES were set up this way.
However, the majority of the duplicates that I’ve found really are cases where the names are the same. You can find these by using the Family Finder Matches file. Sort on the “Full Name” column, then use “Conditional Formatting” under the “Home” tab. By selecting Conditional Formatting / Highlight Cells Rules / Duplicate Values… and then OK on the popup window to accept the default colors, any name that is duplicated will be highlighted in red. You can then use some identifying info (e.g. Jr/Sr, email address, match date) to make the MATCHNAMEs unique in the CB Results file.
I did find some where it appears that the same person (name & email) has two kits that have exactly the same shared segments (ignoring sub 3 cM segments). I deleted one of those from my CB Results file.
Finally, thanks Roberta for another great article. I always learn something from your blog articles!
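The duplicate-name check described in the comment above can also be done outside Excel. The sketch below counts how often each full name appears and reports any that occur more than once, including blanks. The list of names is invented for illustration, and reading them from the actual Family Finder Matches file is left out.

```python
from collections import Counter

# Hypothetical "Full Name" values from a match list; an empty string
# stands for a match whose name came through blank.
full_names = ["John Smith", "Mary Jones", "John Smith", "", "Ann Lee", ""]

def duplicated_names(names):
    """Return the names (blank included) that occur more than once, sorted."""
    counts = Counter(name.strip() for name in names)
    return sorted(name for name, n in counts.items() if n > 1)

dupes = duplicated_names(full_names)
print(dupes)
```

Anything the function reports would still need a distinguishing detail such as Jr/Sr, an email address, or a match date added by hand, as the commenter describes.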
Finally got around to “filtering” my spreadsheet and this is a great place to find any colorizing mistakes. Every step of the way, I’m learning more by actually “using” the data. AND I’m learning that I make mistakes – lots of them! 🙂
Ok. I’ve made a stab at a start. Mom and Dad are 3C1R through one line. Mom’s 2 lines aren’t related. There’s a bit of overlap between dad’s 2 lines, but it’s not recent.
I pulled lists of matches ICW one of Mom’s paternal first cousins, and a maternal 1C1R. Same for daddy. So I have 4 sets of ICW matches for each of my 4 lines. Am I on the right track?
Yes, you sure are. If you’re at FTDNA, link those people to your tree and the Family Matching will then assign other people based on those matches to different sides of your tree.
This article is over a year old and has been revived. I have a difficult time triangulating at Gedmatch and am beginning to wonder if it is because of trees that are incorrect because of NPEs.
Some time ago I had about 10 people with tree matches of the same ancestral couple and shared cMs but could not triangulate them on the ancestral couple. Then one of the participants contacted me to say she has found out her father, the match in question, is a NPE.
I wonder if there are many more NPEs than we realize which makes triangulating more problematic.
That’s why we have to focus on those who do triangulate. We don’t know why the ones who don’t, don’t.
BTW, where’s part 2?
Alrighty, I’m learning this technique for the 3rd time, lol. I’ll get started, and then life happens. I’m trying to keep it all straight. So when I’m assigning sides, and I have a match that matches me and mom (I don’t have my father’s) on the same chromosome but not the same location, I mark that as ‘not mom’? Ex.
Me Match 11 15760452 19074354 5.41
Mom Match 11 16195220 19074354 4.86
Mom Match 11 65075841 67135592 2.87
Or would I mark it ‘not mom’ only in the instance where I would have a match on a chromosome, and my mom isn’t on that chromosome at all? I was reading all the comments and have gotten really confused. Thanks!
If that person doesn’t match you and your mom on the same location on the same chromosome, then it’s “not mom.” Every segment has its own history.
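Checking “same location on the same chromosome” comes down to asking whether two segments overlap. Here is a minimal sketch using the three rows from the question above. Treating any overlap at all as “same location” is a simplifying assumption of this sketch; how much overlap is enough is still a judgment call, and start and end points can be fuzzy.

```python
def overlap(a, b):
    """Return the shared (start, end) range of two segments, or None.

    Each segment is a tuple of (chromosome, start, end).
    """
    if a[0] != b[0]:
        return None  # different chromosomes never overlap
    start = max(a[1], b[1])
    end = min(a[2], b[2])
    return (start, end) if start < end else None

# Rows from the example in the question above.
me = ("11", 15760452, 19074354)
mom_segments = [("11", 16195220, 19074354), ("11", 65075841, 67135592)]

shared = [overlap(me, segment) for segment in mom_segments]
# Simplifying assumption: any overlap counts as the same location.
label = "Mom" if any(shared) else "Not Mom"
print(shared, label)
```

With these rows, Mom’s first segment overlaps mine and shares its end point, while her second segment on chromosome 11 does not overlap at all, so each row needs to be checked separately.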
Also, please forgive me if I’m overlooking the obvious, but I have my full sibling’s results mixed in with mine and mom’s. What do I assign if the match is between mom and sibling but doesn’t include me? Thanks again!
I would think that would be very confusing in one spreadsheet. What is your goal with that? I have two spreadsheets. One is me and my mother because that’s how MY DNA works. The other is a combination of close relatives to see if I can figure out which line up the tree matches are on.
I thought it was helpful to do that if you had a missing parent (in our case, my father). My goal is to hopefully see which matches are from Mom’s side and which are potentially from Dad’s side by including my brother’s matches as well. Am I doing this wrong?
No, there is no “wrong” in this case. If that’s your goal, it makes sense. I would probably have used two spreadsheets just to keep myself from getting confused, but if this is easier for you, then by all means. Work with the largest matches first. Those will be the easiest to track and also to know are not IBC.
I’m not sure if it’s actually easier, lol. When you say you would have probably used two spreadsheets, how would you have done it if your goal was the same as mine? Just checking in case how you’re thinking is actually easier. I need all the help I can get.
One for each child/parent connection so I don’t get confused with who is matching to whom in the common spreadsheet. This may not be confusing to you at all, so you may be fine. Different people find different techniques easier to use.
And to make things even more interesting, I manage my father’s brother’s kit and was planning to add him to the spreadsheet to help show the matches that would possibly point to my dad’s side. Kind of like having him stand in for Dad. Do you think that would work or be too much?
Ugh. I’m stuck (& feel really dumb). I clicked Ctrl+A to select all on my Dad’s spreadsheet but when I go to my spreadsheet to paste it makes me choose A1 (or R1C1 – whatever that means) and wipes out all of MY info. How do you paste from your parents’ spreadsheets into yours without it pasting over? (I chose Excel Macro-Enabled Workbook file extension if that’s relevant.)
Nevermind. After 20 tries it finally pasted.
Wouldn’t it save a lot of time adding the “Side” column in to the parent’s spreadsheet and filling in “Mom” or “Dad” before pasting into the Master?
What I did was change the column header to “Dad,” the bkgd color to blue, used Ctrl + Space to highlight the column, and Ctrl + D to duplicate “Dad” down the entire column. Then I went back and changed “Dad” to “Side” and the bkgd color back to “no fill.” 😀
NOW I’ll paste Mom and Dad’s spreadsheets into mine!
Hope this helps save someone else 100,000+ entries. 😉
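The shortcut suggested in the last few comments, filling in the “Side” column on each parent’s rows before pasting them into the master, might look like this outside Excel. The column names and sample rows are assumptions for illustration only, and this follows the commenter’s suggestion rather than a step confirmed in the article.

```python
def tag_side(rows, side):
    """Return copies of the rows with a Side column filled in."""
    return [dict(row, Side=side) for row in rows]

# Hypothetical rows; the column names are assumptions for this sketch.
my_rows = [{"Name": "Me", "Match Name": "Anderson", "Chromosome": "20", "cM": "8.76"}]
mom_rows = [{"Name": "Mom", "Match Name": "Anderson", "Chromosome": "20", "cM": "8.76"}]
dad_rows = [{"Name": "Dad", "Match Name": "Example Match", "Chromosome": "11", "cM": "12.01"}]

# My own rows start with a blank Side, to be filled in during phasing.
master = tag_side(my_rows, "") + tag_side(mom_rows, "Mom") + tag_side(dad_rows, "Dad")
for row in master:
    print(row["Name"], row["Match Name"], row["Side"])
```

Because `tag_side` builds new dictionaries, the parents’ original rows are left as they were downloaded.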