DNAeXplained – Genetic Genealogy

The Leeds Method

This is the first in a series of two articles. This article explains the Leeds Method and how I created a Leeds Spreadsheet in preparation for utilizing the results in DNAPainter. I stumbled around a bit, but I think I’ve found a nice happy medium and you can benefit from my false starts by not having to stumble around in the dark yourself. Of course, I’m telling you about the pitfalls I discovered.

The second article details the methodology I utilized to paint these matches, because they aren’t quite the same as “normal” matching segments with identified ancestors.

Welcome to the Leeds Method

Dana Leeds developed a novel way to utilize a spreadsheet for grouping your matches from second through fourth cousins and to assign them to “grandparent” quadrants with no additional or previous information. That’s right, this method generates groupings that can be considered good hints without any other information at all.

Needless to say, this is great for adoptees and those searching for a parent.

It’s quite interesting for genetic genealogists as well. One of the best aspects is that it’s very easy to do and very visual. Translation: no math. No subtraction.

Caveat – it’s also not completely accurate 100% of the time, especially when you are dealing with more distant matches, intermarriage and/or endogamy. But there are ways to work around these issues, so read on!

You can click to enlarge any image.

I’ll be referring to this graphic throughout this article. It shows the first several people on my Ancestry match list, beginning with second cousins, using pseudonyms. I chose to use Ancestry initially because they don’t provide chromosome browsers or triangulation tools, so we need as much help there as we can get.

I’ve shown the surnames of my 4 grandparents in the header columns with an assigned color, plus a “Weird group” (grey) that doesn’t seem to map to any of the 4. People in that group are much more distant in my match list, so they aren’t shown here.

I list the known “Most Common Recent Ancestor,” when identified, along with the color code so that I can easily see who’s who.

All those blanks in the MCRA column – those are mostly people without trees. Just think how useful this would be if everyone who could provide a tree did!

What Does the Leeds Method Tell You?

The Leeds Method divides your matches into four colored quadrants representing each grandparent unless your genealogical lines are heavily intermarried. If you have lots of people who fall into both of two (or more) colors, that probably indicates intermarriage or a heavily endogamous population.

In order to create this chart, you work with your closest matches that are 2nd cousins or more distant, but no more distant than 4th cousins. For endogamous people, by the time you’re working in 4th cousins, you’ll have too much overlap, meaning people who fall into multiple columns, so you’ll want to work with primarily 2nd and 3rd cousins. The good news is that endogamous people tend to have lots of matches, so you should still have plenty to work with!
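If you keep your chart as data rather than only as colored cells, the overlap described above is easy to measure. The following is a minimal sketch in Python, not part of Dana’s method itself. The function name `overlap_fraction` and the sample names and colors are made up for illustration, and the sketch deliberately does not suggest a cutoff for how much overlap counts as "too much."

```python
from typing import Dict, Set


def overlap_fraction(colors: Dict[str, Set[str]]) -> float:
    """Return the share of grouped matches that sit in more than one color column.

    colors maps each match name to the set of color columns it was marked in.
    Matches with no color at all are ignored.
    """
    grouped = [cols for cols in colors.values() if cols]
    if not grouped:
        return 0.0
    multi = sum(1 for cols in grouped if len(cols) > 1)
    return multi / len(grouped)


# Hypothetical chart: one of three grouped matches falls into two columns.
example_chart = {
    "match_a": {"yellow"},
    "match_b": {"yellow", "blue"},
    "match_c": {"blue"},
    "match_d": set(),  # not grouped yet, so not counted
}
example_overlap = overlap_fraction(example_chart)
```

A rising fraction as you add more distant matches is one hint that you have gone deep enough for your own family.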

Instructions

In this article, I’m using Dana’s method, with a few modifications.

By way of a very, very brief summary:

To understand exactly what I’m doing, read Dana’s articles, then continue with this article.

DNA Color Clustering: The Leeds Method for Easily Visualizing Matches  
DNA Color Clustering: Identifying “In Common” Surnames 
DNA Color Clustering: Does it Work with 4th Cousins? By the way, yes it does, most of the time.
DNA Color Clustering: Dealing with 3 Types of Overlap

Why Use “The Leeds Method”?

In my case, I wanted to experiment. I wanted to see if this method works reliably and what could be done with the information if you already know a significant amount about your genealogy. And if you don’t.

The Leeds Method is a wonderful way to group people into 4 “grandparent” groups in order to search for in-common surnames. I love being able to perform this proof of concept “blind,” then knowing my genealogy and family connections well enough to be able to ascertain whether it did or didn’t work accurately.

If you can associate a match with a single grandparent, that really means you’ve pushed that match back to the great-grandparent couple.

That’s a lot of information without any genealogical knowledge in advance.

How Low Can You Go?

I have more than 1000 fourth cousins at Ancestry. This makes the task of performing the Leeds Method manually burdensome at that level. It means I would have had to type all 1000+ fourth cousins into a spreadsheet. I’m patient, but not that patient, at least not without a lot of return for the investment. I have to ask myself, exactly what would I DO with that information once they were grouped?

Would 4th cousin groupings provide me with additional information that second and third cousin groupings wouldn’t? I don’t think so, but you can be the judge.

After experimenting, I’d recommend creating a spreadsheet listing all of your 2nd and 3rd cousins, along with about 300 or so of your closest 4th cousin matches. Said another way, my results started getting somewhat unpredictable at about 40-45 cMs, although that might not hold true for others. (No, you can’t tell the longest matching segment length at Ancestry, but I could occasionally verify at the other vendors, especially when people from Ancestry have transferred.)

Therefore, I only proceeded through third cousins and about 300 of the Ancestry top 4th cousin matches.
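For anyone who has their match list in a file rather than on screen, that selection rule can be sketched in a few lines of Python. This is only an illustration under assumptions: the `Match` record, the category labels "2nd", "3rd" and "4th", and the function name `select_candidates` are hypothetical, and no vendor exports data in exactly this shape.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Match:
    name: str
    category: str    # hypothetical labels: "2nd", "3rd" or "4th" cousin
    total_cm: float  # total shared cM as reported by the vendor


def select_candidates(matches: List[Match], max_fourth: int = 300) -> List[Match]:
    """Keep every 2nd and 3rd cousin plus the closest max_fourth 4th cousins."""
    close = [m for m in matches if m.category in ("2nd", "3rd")]
    fourth = sorted(
        (m for m in matches if m.category == "4th"),
        key=lambda m: m.total_cm,
        reverse=True,
    )
    # Return closest matches first, the same order as the vendor's match list.
    return sorted(close + fourth[:max_fourth], key=lambda m: m.total_cm, reverse=True)


# Made-up data, with the 4th cousin limit lowered to 1 to keep the example small.
sample = [
    Match("a", "4th", 45.0),
    Match("b", "2nd", 250.0),
    Match("c", "4th", 60.0),
    Match("d", "3rd", 110.0),
]
chosen = [m.name for m in select_candidates(sample, max_fourth=1)]
```

The default of 300 simply mirrors the number I settled on; adjust it to wherever your own results start to become unpredictable.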

I didn’t just utilize this methodology with Ancestry, but with Family Tree DNA, MyHeritage and 23andMe as well. I didn’t use GedMatch because those matches would probably have tested at one of the primary 4 vendors and I really didn’t want to deal with duplicate kits any more than I already had to. Furthermore, GedMatch is undergoing a transition to their Genesis platform and matching within the Genesis framework has yet to be perfected for kits other than those from these vendors.

Let’s talk about working with matches from each vendor.

Ancestry

At Ancestry, make a list of all of your second and third cousin matches, plus as many 4th cousins as you want to work with.

To begin viewing your common matches, select your first second cousin on the list and click on the green View Match. (Note that I am using my own second kit at Ancestry, RobertaV2Estes, not a cousin’s kit in these examples. The methodology is the same, so don’t fret about that.)

Then, click on Shared Matches.

Referring to your spreadsheet, assign a color to this match group and color the spreadsheet squares for this match group. Looking at my spreadsheet, my first group would be the yellow Estes group, so I color the squares for each person that I match in common with this particular cousin. On my spreadsheet, those cousins have all been assigned pseudonyms, of course.
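The coloring step can also be written down as a short routine. What follows is a sketch in Python of the basic procedure as I read it: take the closest match that has no color yet, start a new column for them, and color everyone on their shared match list in that column. The function name `leeds_columns`, the column numbers standing in for colors, and the sample names are all assumptions for illustration. No vendor offers shared match lists in this form, so you would have to record them yourself.

```python
from typing import Dict, List, Set


def leeds_columns(matches: List[str], shared: Dict[str, Set[str]]) -> Dict[str, Set[int]]:
    """Assign column numbers (one per color) to matches.

    matches: names in closest-first order, as on the spreadsheet.
    shared:  for each match, the names on their shared match list.
    """
    columns: Dict[str, Set[int]] = {name: set() for name in matches}
    next_column = 0
    for name in matches:
        if columns[name]:
            continue  # already colored by a closer match
        # The first uncolored match starts a new column.
        columns[name].add(next_column)
        for other in shared.get(name, set()):
            if other in columns:  # ignore people who are not on the spreadsheet
                columns[other].add(next_column)
        next_column += 1
    return columns


# Made-up example: cousin_b ends up in two columns, which is the kind of overlap
# that can point to intermarriage or endogamy.
order = ["cousin_a", "cousin_b", "cousin_c", "cousin_d"]
shared_lists = {
    "cousin_a": {"cousin_b"},
    "cousin_c": {"cousin_b", "cousin_d"},
}
result = leeds_columns(order, shared_lists)
```

Each match keeps a set of columns rather than a single one so that people who land in more than one group stay visible instead of being overwritten.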

Your shared match list will be listed in highest match order which should be approximately the same order they are listed on your spreadsheet. I use two monitors so I can display the spreadsheet on one and the Ancestry match list on the other.

Lon is shared in common with the gold person I’m comparing against (Roberta V2 Estes), and me, so his box would be colored gold on the spreadsheet. Lon’s pseudonym is Sneezy and the person beneath him on this list, not shown, would be Ariel.

Ancestry only shows in-common matches to the 4th cousin level, so you really couldn’t reach deeper if you wanted. Furthermore, I can’t see any advantage to working beyond the 4th cousin level, maximum. Your best matches are going to be the largest ones that reveal the most information and have the most matches, therefore allowing you to group the most people by color.

Unfortunately, Ancestry provides the total cMs and the number of segments, but not the largest matching segment.

One benefit of this methodology is that it’s fairly easy to group those pesky private matches like the last one on the master spreadsheet, Cersei, shown in red. You’ll at least know which grandparent group they match. Based on your identified ancestors of matches in the color group, you may be able to tell much more about that private match.

For example, one of my private matches is a match to someone who I share great-great-grandparents with AND they also match with two people further on up that tree on the maternal side of that couple, shown above, in red. I may never know which ancestor I share with that private match specifically, but I have a pretty darned good idea now in spite of that ugly little lock. The more identified matches, the better and more accurate this technique.

Is the Leeds Method foolproof? No.

Is this a great tool? Yes, absolutely.

Family Tree DNA

Thankfully, Family Tree DNA provides more information about my matches than Ancestry, including segment information combined with a chromosome browser and Family Matching. I often refer to Family Matching as parental bucketing, shown on your match list with the maternal and paternal tabs, because Family Tree DNA separates your matches into parental “sides” based on common segments with others on your maternal and paternal branches of your tree when you link your matches’ results.

At Family Tree DNA, sign on and then click on Matches under Family Finder.

When viewing your matches, you’ll see blue or red people icons on any matches that are assigned to your maternal side, your paternal side, or both (purple) on your match list. If you click on the tabs at the top, you’ll see JUST the maternal, paternal or both lists.

This combination of tools allows you to confirm (and often triangulate) the match for several people. If those matches are bucketed, meaning assigned to the same parental side, and they match on the same segment, they are triangulated for all intents and purposes if the segment is above 20 cM. All of the matches I worked with for the Leeds Method were well above 20 cM, so you don’t really need to worry about false or identical by chance matches at that level.

Family Tree DNA matches are initially displayed by the total number of “Shared cM.” Click on “Longest Block” to sort in that manner. I considered people with a longest block of 30 cM and above as equivalent to the Ancestry 3rd cousin category. Some of the matching became inconsistent below that threshold.

List all of your second and third cousins on the spreadsheet, along with however many 4th cousins you want to work with.

Then, select your closest second cousin by checking the box to the left of that individual, then click on “In Common With” above the display. This shows you your matches in common with this person.

On the resulting common match list, sort your matches in Longest block order, then mark the matches on your spreadsheet in the correct colored columns.

With each vendor, you may need to make new columns until you can work with enough matches to figure out which column is which color – then you can transfer them over. If you’re lucky enough to already know the family association of your closest cousins, then you already know which colored column they belong to.

All of my matches that fell into the Leeds groups were previously bucketed to maternal or paternal, so consistency between the two confirms both methodologies. Between 20 and 28 cM, three of my bucketed matches at Family Tree DNA fell into another group using the Leeds method, which is why I drew the line at 30cM.
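That cross-check between Leeds groups and parental buckets is simple to automate if both are recorded somewhere. Here is a minimal sketch in Python, assuming you already know which parent each grandparent group belongs to. The function name `bucket_disagreements`, the group labels G1 through G4, and the bucket values "paternal", "maternal" and "both" are placeholders of my own, not anything Family Tree DNA exports.

```python
from typing import Dict, List


def bucket_disagreements(
    leeds_group: Dict[str, str],
    bucket: Dict[str, str],
    side_of_group: Dict[str, str],
) -> List[str]:
    """List matches whose Leeds group sits on the opposite side from their bucket.

    Matches that are unbucketed, or bucketed to both sides, are skipped.
    """
    flagged = []
    for name, group in leeds_group.items():
        expected = side_of_group[group]
        actual = bucket.get(name)
        if actual is not None and actual != "both" and actual != expected:
            flagged.append(name)
    return sorted(flagged)


# Made-up data: m2 is in a maternal Leeds group but bucketed paternally.
sides = {"G1": "paternal", "G2": "paternal", "G3": "maternal", "G4": "maternal"}
groups = {"m1": "G1", "m2": "G3", "m3": "G2"}
buckets = {"m1": "paternal", "m2": "paternal", "m3": "both"}
conflicts = bucket_disagreements(groups, buckets, sides)
```

Looking at the longest block of each flagged match is one way to decide where to draw your own cutoff line.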

For genealogists who already know a lot about their tree, this methodology in essence divides the maternal and paternal buckets in half. FTDNA already assigns matches maternally or paternally with Family Matching if you have any information about how your matches fit into your tree and can link any matching testers to either side of your tree at the 3rd cousin level or closer.

If you don’t know anything about your heritage, or don’t have any way to link to other family members who have tested, you’ll start from scratch with the Leeds Method. If you can link family members, Family Tree DNA already does half of the heavy lifting for you which allows you to confirm the Leeds methodology.

MyHeritage

At MyHeritage, sign in, click on DNA and sort by “largest segment,” shown at right, above. I didn’t utilize matches below 40 cM due to consistency issues. I wonder if imputation affects smaller matches more than larger matches.

You’ll see your closest matches at the top of the page. Scroll down and make a list on your spreadsheet of your second and third cousins. Return to your closest DNA match that is a second cousin and click on the purple “Review DNA Match” which will display your closest in-common matches with that person, but not necessarily in segment size order.

Scroll down to view the various matches and record on the spreadsheet in their proper column by coloring that space.

The great aspect of MyHeritage is that triangulation is built in, and you can easily see which matches triangulate, providing another layer of confirmation, assuming you know the relationship of at least some of your matches.

The message for me personally at MyHeritage is that I need to ask known cousins who are matches elsewhere to upload to MyHeritage because I can use those as a measuring stick to group matches, given that I know the cousin’s genealogy hands-down.

The great thing about MyHeritage is that they are focused on Europe, and I’m seeing European matches that aren’t anyplace else.

23andMe

At 23andMe, sign in and click on DNA Relatives under the Ancestry tab.

You’ll see your list of DNA matches. Record 2nd and third cousins on your spreadsheet, as before.

To see who you share in common with a match, click on the person’s name and color your matches on the spreadsheet in the proper column.

Unfortunately, the Leeds Method simply didn’t work well for me with my 23andMe data, or at least the results are highly suspect and I have no way of confirming accuracy.

Most of my matches fell into the Estes category, with the Boltons overlapping almost entirely, and none in the Lore or Ferverda columns. There is one small group that I can’t identify. Without trees or surnames, genealogically, my hands are pretty much tied. I can’t really explain why this worked so poorly at 23andMe. Your experience may be different.

The lack of trees is a significant detriment at 23andMe because other than a very few matches whose genealogy I know, there’s no way to correlate or confirm accuracy. My cousins who tested at 23andMe years ago and whose tests I paid for lost interest and never signed in to re-authorize matching. Many of those tests are on the missing Ferverda side, but their usefulness is now forever lost to me.

23andMe frustrates me terribly. Their lack of commitment to and investment in the genealogical community makes working with their results much more difficult than it needs to be. I’ve pretty much given up on using 23andMe for anything except adoption searches for very close matches as a last resort, and ethnicity.

The good news is that with so many people testing elsewhere, there’s a lot of good data just waiting!

What are the Benefits?

The perception of “benefit” is probably directly connected to your goal for DNA testing and genetic genealogy.

Unfortunately, Ancestry doesn’t provide segment information, so you can’t chromosome paint from Ancestry directly, BUT, you can upload to Family Tree DNA, MyHeritage or GedMatch and paint Ancestry matches from there. At GedMatch, their kit numbers begin with A.

What Did I Do Differently than Dana?

Instead of adding a 5th column with the first person (Sam) who was not grouped into the first 4 groups, I looked for the closest matches that I shared with Sam who were indeed in the first 4 color groups. I added Sam to that existing color group along with my shared matches with Sam that weren’t already grouped into that color so long as it was relatively consistent. If it looked too messy, meaning I found people in multiple match groups, I left it blank or set that match aside. This didn’t happen until I was working at the 4th cousin level or between 30 and 40 cM, depending on the vendor.
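My modification can be sketched as a small rule in Python. This is an approximation under a stated assumption: I judged "relatively consistent" by eye, while the sketch uses the stricter test that every already-colored shared match must agree on a single color. The function name `place_ungrouped` and the sample names other than Sam are made up.

```python
from typing import Dict, Optional, Set


def place_ungrouped(
    name: str,
    shared: Dict[str, Set[str]],
    color_of: Dict[str, str],
) -> Optional[str]:
    """Try to add an ungrouped match to an existing color group.

    Returns the chosen color, or None when the match should be set aside.
    color_of is updated in place only when a single color is found.
    """
    neighbours = shared.get(name, set())
    seen = {color_of[n] for n in neighbours if n in color_of}
    if len(seen) != 1:
        return None  # no grouped shared matches, or too messy: leave blank
    color = next(iter(seen))
    color_of[name] = color
    # Shared matches that had no color yet join the same group.
    for n in neighbours:
        color_of.setdefault(n, color)
    return color


# Made-up example: Sam's grouped shared matches are all yellow, so Sam and the
# previously ungrouped "z" both become yellow.
colors = {"x": "yellow", "y": "yellow"}
sam_color = place_ungrouped("Sam", {"Sam": {"x", "y", "z"}}, colors)

# A match whose shared matches span two colors is set aside instead.
mixed = {"x": "yellow", "q": "green"}
pat_color = place_ungrouped("Pat", {"Pat": {"x", "q"}}, mixed)
```

Returning None rather than forcing a choice keeps the messy cases visible so they can be revisited later.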

Please note that just because you find people that you match in common with someone does NOT MEAN that you all share a common ancestor, or the same ancestor. It’s a hint, a tip to be followed.

There were a couple of groups that I couldn’t cluster with other groups, and one match that clustered in three of the four grandparent groups. I set that one aside as an outlier. I will attempt to contact them. They don’t have a tree.

I grouped every person through third cousin matches. I started out manually adding the 4th cousins for each match, but soon gave up on that due to the sheer magnitude. I did group my closest 4th cousins, continuing until the groupings began to be inaccurate or messy, meaning people matching in multiple groups. Second and third cousin matching was very consistent.

Tips

What Did I Learn?

Almost all of my (endogamous by definition) Acadian matches are more distant, which means the segments are smaller. I expected to find more in the painted group, because I have SO MANY Acadian matches, but given that my closest Acadian ancestor was my great-great-grandfather, those segments are now small enough that those matches don’t appear in the candidate group of matches for the Leeds Method. My Acadian heritage occurs in my green Lore line, and there are surprisingly few matches in that grouping large or strong enough to show up in my clustered matches. In part, that’s probably because my other set of great-great-grandparents in that line arrived in 1852 from Germany and there are very few people in the US descended from them.

I found 4th cousin matches I would have otherwise never noticed because they don’t have a tree attached. At Ancestry, I only pay attention to closer matches, Shared Ancestor Hints and people with trees. We have so many matches today that I tend to ignore the rest.

Based on the person’s surname and the color group into which they fall, it’s often possible to assign them to a probable ancestral group based on the most distant ancestors of the people they match within the color group. In some cases, the surname is another piece of evidence and may provide a Y DNA lead.

For example, one of my matches has the user name XXXFervida. They do match in the Ferverda grandparent group, and Fervida is how one specific line of the family spelled the surname. Of course, I could have determined that without grouping, but you can never presume a specific connection based solely on surname, especially with a more common name. For all I know, Fervida could be a married name.

By far the majority of my matches don’t have trees or have very small trees. That “no-tree” percentage is steadily increasing at Ancestry, probably due to their advertising push for ethnicity testing. At Family Tree DNA where trees are infinitely more useful, the percentage of people WITH trees is actually rising. By and large, Family Tree DNA users tend to be the more serious genealogists.

MyHeritage launched their product more recently with DNA plus trees from the beginning, although many of the new transfers don’t have trees or have private trees. Their customers seem to be genealogically savvy and many live in Europe where MyHeritage DNA testing is focused.

23andMe is unquestionably the least useful for the Leeds Method because of their lack of support for trees, among other issues, but you may still find some gems there.

Keeping Current

Now that I’ve invested in all of this work, how will I keep the spreadsheet current, or will I at all?

At Ancestry, I plan to periodically map all of my SAH (Shared Ancestor Hints) green leaf matches as well as all new second and third cousin matches, trees or not.

In essence, for those with DNA matches and trees with a common ancestor, Ancestry already provides Circles, so they are doing the grouping for those people. Where this falls short, of course, is matches without trees and without a common identified ancestor.

For Ancestry matches, I would be better served, I think, to utilize Ancestry matches at GedMatch instead of at Ancestry, because GedMatch provides segment information which means the matches can be confirmed and triangulated, and can be painted.

For matches outside of Ancestry, in particular at Family Tree DNA and MyHeritage, I will keep the spreadsheet current at least until I manage to paint my entire set of chromosomes. That will probably be a very long time!

I may not bother with 23andMe directly, given that I have almost no ability to confirm accuracy. I will utilize 23andMe matches at GedMatch. People who transfer to GedMatch tend to be interested in genealogy.

What Else Can I Do?

At Ancestry, I can use Blaine’s new “DNA Match Labeling” tool that facilitates adding 8 colored tags to sort matches at Ancestry. Think of it as organizing your closet of matches. I could tag each of these matches to their grandparent side which would make them easy to quickly identify by this “Leeds Tag.”

My Goals

I have two primary goals:

I want to map my DNA segments to specific ancestors. I am already doing this using Family Tree DNA and MyHeritage where common ancestors are indicated in trees and by surnames. I can map these additional Leeds leads (pardon the pun) to grandparents utilizing this methodology.

To the extent I can identify paternal and maternal matches at 23andMe, I can do the same thing. I don’t have either parent’s DNA there, and few known relatives, so separating matches into maternal and paternal is more difficult. It’s not impossible but it means I can associate fewer matches with “sides” of my genealogy.

For associating segments with specific ancestors and painting my chromosomes, DNAPainter is my favorite tool.

In my next article, we’ll see how to use our Leeds Method results successfully with DNAPainter and how to interpret the results.

______________________________________________________________

Disclosure

I receive a small contribution when you click on some of the links to vendors in my articles. This does NOT increase the price you pay but helps me to keep the lights on and this informational blog free for everyone. Please click on the links in the articles or to the vendors below if you are purchasing products or DNA testing.

Thank you so much.

DNA Purchases and Free Transfers

Genealogy Services

Genealogy Research