AutoKinship at GEDmatch by Genetic Affairs

Genetic Affairs has created a new version of AutoKinship at GEDmatch. The new AutoKinship report adds new features, allows for more kits to be included in the analysis, and integrates multiple reports together:

  • AutoCluster – the autoclusters we all know and love
  • AutoSegment – clusters based on segments
  • AutoTree – reconstructed tree based on GEDCOM files of you and your matches, even if you don’t have a tree
  • AutoKinship – the original AutoKinship report provided genetic trees. The new AutoKinship report includes AutoTree, combines both, and adds features called AutoKinship Tree. (Trust me on this one – you’ll see in a minute!)
  • Matches
    • Common Ancestors with your ancestors
    • Common Ancestors between matches, even if they don’t match your tree
    • Common Locations

Maybe the best news is that some reports provide automatic triangulation because, at GEDmatch, it’s possible to not only see how you match multiple people, but also if those people match each other on that same segment. Of course, triangulation requires three-way matching in addition to the identification of common ancestors which is part of what AutoKinship provides, in multiple ways.

Let’s step through the included reports and features one at a time, using my clusters as an example.

Order Your Report

As a Tier 1 GEDmatch customer, sign in, select AutoKinship and order your report.

Note that there are now two clustering settings: the standard setting and one that produces denser clusters. The denser setting is the default for AutoKinship, since it has been shown to produce better AutoKinship results.

You can also select the number of kits to consider. Since this tool is free with a GEDmatch Tier 1 subscription, you can start small and rerun if you wish, as often as you wish.

Currently, a maximum of 500 matches can be included, but that will be increased to 1000 in the future. Your top 500 matches that fall within the specified cM matching parameters will be included.

I’m leaving this at the maximum 400 cM threshold, so every match below that is included. I generally leave this default threshold because otherwise my closest matches will be in a huge number of clusters which may cause processing issues.
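For readers who like to see the selection logic spelled out, here's a minimal Python sketch of how I understand it: keep the matches inside the cM window, then take the largest ones up to the kit limit. This is not Genetic Affairs' code. The function name, the tuple layout, the sample kits, and the choice to treat the thresholds as inclusive are all my own assumptions.

```python
def select_matches(matches, max_cm=400.0, min_cm=0.0, limit=500):
    """Keep matches inside the cM window, then return the `limit` largest.

    `matches` is a list of (name, total_cm) tuples. Treating both
    thresholds as inclusive is an assumption made for this sketch.
    """
    in_range = [m for m in matches if min_cm <= m[1] <= max_cm]
    in_range.sort(key=lambda m: m[1], reverse=True)
    return in_range[:limit]


# Hypothetical sample data: kit A is a close relative above the 400 cM cap.
example = [("A", 1800.0), ("B", 354.3), ("C", 62.1), ("D", 25.0)]
print(select_matches(example, limit=2))  # [('B', 354.3), ('C', 62.1)]
```

Raising `max_cm` pulls the close relative back in, which is the special use case covered near the end of this article.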

For a special use case where you will want to increase the cM threshold, see the Special Use Cases section near the end of this article.

You can select a low number of matches, like 25 or 50, which is particularly useful if you want to examine the closest matches of a kit without a tree.

Keep in mind that there is currently a maximum processing time of 10 minutes allowed per report. This means that if you have large clusters, which are the last ones processed, you may not have AutoKinship results for those clusters.

This also means that if you select a high cM threshold and include all 500 allowable matches, you will receive the report but the AutoKinship results may not be complete.

When finished, your report will be delivered to you as a download link with an attached zipped file which you will need to save someplace where you can find it.

Unzip

If you’re a PC user, you’ll need to unzip or extract the files before you can use the files. You’ll see the zipper on the file.

If you don’t extract the contents, you can click on the file to open it, which displays a list of the files. It looks like the files are extracted, but they aren’t.

You can see that the file is still zipped.

You can click on the html file which will display the AutoCluster correctly too, but when you click on any other link within that file, you’ll receive this error message if the file is still zipped.

If this happens to you, it means the file is still zipped. Close the files you have open, right click on the yellow zipped file folder and “extract all.”

Then click on the HTML link again and everything should work.
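If you'd rather script the extraction than right click, Python's standard `zipfile` module can do the same thing. This is only a sketch. The function name and the file and folder names in the usage comment are placeholders I made up, not names that Genetic Affairs uses.

```python
import zipfile
from pathlib import Path


def extract_report(zip_path, dest_dir):
    """Extract every file in the report archive and list the HTML pages found."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)  # unpack everything, not just the main page
    return sorted(p.name for p in dest.rglob("*.html"))


# Usage (hypothetical paths):
# pages = extract_report("Downloads/autokinship_report.zip", "Documents/autokinship")
```

Once everything is extracted into one folder, the links inside the main HTML page can find their companion files.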

Ok, on to the fun part – the tools.

Tools

I’ve written about most of these tools individually before, except for the new combinations of course. I’ve put all of the Genetic Affairs Tools, Instructions and Resources in one article that you can find here.

I recommend that you take a look to be sure you’re using each tool to its greatest advantage.

AutoCluster

Click on the html file and watch your AutoCluster fly into place. I always, always love this part.

The first thing I noticed about my AutoCluster at GEDmatch is that it’s HUGE! I have a total of 144 clusters and that’s just amazing!

Information about the cluster file, including the number of matches, maximum and minimum cM used for the report, and minimum cluster size appears beneath your cluster chart.

22 people met the criteria but didn’t have other matches that did, so they are listed for my review, but not included in the cluster chart.

At first glance, the clusters look small, but don’t despair, they really aren’t.

My clusters only look small because the tool was VERY successful, and I have many matches in my clusters. The chart has to be scaled to be able to display on a computer monitor.

New Layout

Genetic Affairs has introduced a new layout for the various included tools.

Each section opens to provide a brief description of the tool and what is occurring. This new tool includes four previous tools plus a new one, AutoKinship Tree, as follows:

AutoCluster

AutoCluster first organizes your DNA matches into shared match clusters that likely represent branches of your family. Everyone in a cluster will likely be on the same ancestral line, although the MRCA between any of the matches and between you and any match may vary. The generational level of the clusters may vary as well. One may be your paternal grandmother’s branch, another may be your paternal grandfather’s father’s branch.
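To make the idea of shared match clusters concrete, here's a toy Python sketch that groups matches who are linked to each other through shared matches. It uses simple connected components, which is an assumption on my part and much cruder than whatever Genetic Affairs actually runs. The pairings are invented for illustration.

```python
from collections import defaultdict


def cluster_shared_matches(shared_pairs):
    """Group matches into clusters of people linked by shared-match pairs."""
    graph = defaultdict(set)
    for a, b in shared_pairs:
        graph[a].add(b)
        graph[b].add(a)

    seen, clusters = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:  # depth-first walk over everyone reachable from `start`
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            group.add(node)
            stack.extend(graph[node] - seen)
        clusters.append(sorted(group))
    return clusters


# Hypothetical shared-match pairs among my matches.
pairs = [("Bill", "David"), ("David", "Robin"), ("Karen", "Lee")]
print(cluster_shared_matches(pairs))
# [['Bill', 'David', 'Robin'], ['Karen', 'Lee']]
```

A match with no shared matches never appears in any pair, so it lands in no cluster at all, which mirrors the people who meet the criteria but are left off the cluster chart.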

AutoSegment

AutoSegment organizes your matches based on triangulating segments. AutoSegment employs the positional information of segments (chromosome and start and stop position) to identify overlapping segments in order to link DNA matches. In addition, triangulated data is used to corroborate these links. Using the user-defined minimum overlap of a DNA segment, we perform a clustering of overlapping DNA segments to identify segment clusters. The overlap is calculated in centimorgans using human genetic recombination maps. Another aspect of overlapping segments is the fact that some regions of our genome seem to have more matches as compared to other regions. These so-called pile-up areas can influence the clustering. The removal of known pile-up regions based on the paper of Li et al. (2014) is optional and is not performed for this analysis. However, a pileup report is provided that allows you to examine your genome for pileup regions.
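The overlap test at the heart of that description can be sketched in a few lines of Python. Two caveats: this sketch measures overlap in base pairs rather than centimorgans, because converting to cM needs a recombination map that I'm not reproducing here, and an overlap by itself is not triangulation until the two matches are also shown to match each other on that segment. The segment positions are invented.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    match: str       # name of the person who shares this segment with the tester
    chromosome: int
    start: int       # start position in base pairs
    stop: int        # stop position in base pairs


def overlap_length(a, b):
    """Return the shared length of two segments, or 0 if they don't overlap."""
    if a.chromosome != b.chromosome:
        return 0
    return max(0, min(a.stop, b.stop) - max(a.start, b.start))


def overlapping_pairs(segments, min_overlap):
    """List pairs of different matches whose segments overlap by at least `min_overlap`."""
    pairs = []
    for i, a in enumerate(segments):
        for b in segments[i + 1:]:
            if a.match != b.match and overlap_length(a, b) >= min_overlap:
                pairs.append((a.match, b.match))
    return pairs


# Hypothetical segments shared with the tester.
segments = [
    Segment("Bill", 3, 10_000_000, 40_000_000),
    Segment("David", 3, 12_000_000, 41_000_000),
    Segment("Robin", 7, 5_000_000, 20_000_000),
]
print(overlapping_pairs(segments, min_overlap=5_000_000))  # [('Bill', 'David')]
```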

AutoTree

By comparing the tree of the tested person and the trees from the members of a certain cluster, we can identify ancestors that are common amongst those trees. First, we collect the surnames that are present in the trees and create a network using the similarity between surnames. Next, we perform a clustering on this network to identify clusters of similar surnames. A similar clustering is performed based on a network using the first names of members of each surname cluster. Our last clustering uses the birth and death years of members of a cluster to find similar persons. As a consequence, initially large clusters (based on the surnames) are divided up into smaller clusters using the first name and birth/death year clustering.
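The first step, grouping similar surnames, might look something like this Python sketch. It uses the standard library's `difflib.SequenceMatcher` as the similarity measure and a simple greedy grouping, both of which are my assumptions. The description above doesn't say which similarity measure or clustering method is used, and the first name and birth/death year passes are left out entirely.

```python
from difflib import SequenceMatcher


def surname_similarity(a, b):
    """Similarity between two surnames, from 0.0 (different) to 1.0 (identical)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def group_surnames(surnames, threshold=0.8):
    """Greedily place each surname in the first group holding a similar name."""
    groups = []
    for name in surnames:
        for group in groups:
            if any(surname_similarity(name, member) >= threshold for member in group):
                group.append(name)
                break
        else:  # no existing group was close enough, so start a new one
            groups.append([name])
    return groups


# Spelling variants are invented for illustration.
print(group_surnames(["Lentz", "Lenz", "Dodson", "Dotson", "Moselman"]))
```

The threshold of 0.8 is arbitrary. Lower it and more spelling variants merge, but so do unrelated surnames that happen to look alike.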

AutoKinship

AutoKinship automatically predicts family trees based on the amount of DNA your DNA matches share with you and each other. Note that AutoKinship does not require any known genealogical trees from your DNA matches. Instead, AutoKinship looks at the predicted relationships between your DNA matches, and calculates many different paths you could all be related to each other. The probabilities used by this AutoKinship analysis are based on simulated data for GEDmatch matches and are kindly provided by Brit Nicholson (methodology described here). Based on the shared cM data between shared matches, we create different trees based on the putative relationships. We then use the probabilities to test every scenario which are then ranked.

AutoKinship Tree

Predicted trees from the AutoTree analysis are based on genealogical trees shared by the DNA matches and, if available, shared by the tested person. The relationships between DNA matches based on their common ancestors, as provided by AutoTree, are used to perform an AutoKinship analysis and are overlaid on the predicted AutoKinship tree.

AutoKinship Tree is New

AutoKinship Tree is the new feature that combines the features of both AutoTree and AutoKinship. You receive:

  • Common ancestors between you and your matches
  • Trees of people who don’t share your common ancestors but share ancestors with each other
  • Combined with relationship predictions and
  • A segment analysis

Of course, the relative success of the tree tools depends upon how many people have uploaded GEDCOM files.

Big hint, if you haven’t uploaded your family tree, do so now. If you are an adoptee or searching for a parent and don’t know who your ancestors are, AutoKinship Tree does its best without your tree information, and you will still benefit from the trees of others combined with predicted relationships based on DNA.

It’s easier to show you than to tell you, so let’s step through my results one section at a time.

I’m going to be using cluster 5 which has 32 members and cluster 136 which has 8 members. Ironically, cluster 136 is a much more useful cluster, with 8 good matches, than cluster 5 which includes 32 people.

Results of the AutoKinship Analyses

As you scroll down your results, you’ll see a grid beneath the Explanation area.

It’s easy to see which cluster received results for each tool. My cluster 5 has results in each category, along with surnames. (Notice that you can search for surnames which displays only the clusters that contain that surname.)

I can click on each icon to see what’s there waiting for me.

Additionally, you can click at the top on the blue middle “here” for an overview of all common ancestors. Who can resist that, right?

Click on the ancestor’s name or the tree link to view more information.

You can also view common locations by clicking on the blue “here” at far right. A location, all by itself, is a HUGE hint.

Clicking on the tree link shows you the tree of the tester with ancestors at that location. I had several others from North Carolina, generally, and other locations specifically. Let’s take a look at a few examples.

Common Ancestor Clusters

Click on the first blue link to view all common ancestors.

Common Ancestor Clusters summarize all of the clusters by ancestor. In other words, if any of your matches have ancestors in common in their tree, they are listed here.

These clusters include NOT just the people who share ancestors in a tree with you, but also people who share known ancestors with each other BUT NOT YOU. That may be incredibly important when you are trying to identify your ancestors – as in brick walls. Your ancestors may be their ancestors too, or your common segments might lead to your common ancestors if you complete their tree.

There are other important hints too.

In my case, above, Jacob Lentz is my known ancestor.

However, Sarah Barron is not my ancestor, nor is John Vincent Dodson. They are the descendants of my Dodson ancestor though. I recognized that surname and those people. In other instances, recognizing a common geography may be your clue for figuring out how you connect.

In the cluster column at left, you can see the cluster number in which these people are found.

Common Locations Table

Clicking on the second link provides a Common Location Table.

Some locations are general, like a state, and others are town, county or even village names. Whatever people have included in their GEDCOM files that can be connected.

Looking at this first entry, I recognize some of the ancestral surnames of Karen’s ancestors. The fact that we are found in the same cluster and share DNA indicates a common ancestor someplace.

Check for this same person in additional locations, then look at their tree.

Ok, back to the AutoKinship Analysis Table and Cluster 136.

Cluster 136

I’m going to use Cluster 136 as an example because this cluster has generated great reports using all of the tools, indicated by the icon under each column heading. Some clusters won’t have enough information for everything so the tools generate as much as possible.

Scrolling down to Cluster 136 in the AutoCluster Information report, just beneath the list of clusters, I can see my 8 matches in that cluster.

Of course, I can click on the links for specific information, or contact them via email. At the end of this article in the “Tell Me Everything” section, I’ll provide a way to retrieve as much information as possible about any one match. For now, let’s move to the AutoTree.

Cluster 136 AutoTree

Clicking on the icon under AutoTree shows me how two of the matches in this cluster are related to each other and myself.

Note that the centimorgan badges listed refer to the number of cM that I share with each of these people, not how much they share with each other.

Click on any of the people to see additional information.

When I click on J Lentz m F Moselman, a popup box shows me how this couple is related to me and my matches.

Of course, you can also view the Y DNA or mitochondrial DNA haplogroups if the testers have provided that information when they set up their GEDmatch profile information.

Just click on the little icons.

If the testers have not provided that information, you can always check at FamilyTreeDNA or 23andMe, if they have tested at either of those vendors, to view their haplogroup information.

Today, GEDmatch kit numbers are assigned randomly, but in the early days, before Genesis, the leading letter of A meant AncestryDNA, F or T for FamilyTreeDNA, M for 23andMe and H for MyHeritage. If the kit number is something else, perform a one-to-one or a one-to-many report which will display the source of their DNA file.
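The legacy prefix rule boils down to a small lookup table. Here's a minimal Python sketch. The function name and the sample kit numbers are made up, and because today's kit numbers are assigned randomly, the result is only a hint for older kits, never proof of where someone tested.

```python
# Leading letters used for kits uploaded in the early, pre-Genesis days.
LEGACY_KIT_PREFIXES = {
    "A": "AncestryDNA",
    "F": "FamilyTreeDNA",
    "T": "FamilyTreeDNA",
    "M": "23andMe",
    "H": "MyHeritage",
}


def guess_legacy_vendor(kit_number):
    """Return the vendor implied by an old-style kit prefix, or None if unknown."""
    if not kit_number:
        return None
    return LEGACY_KIT_PREFIXES.get(kit_number[0].upper())


# Hypothetical kit numbers.
print(guess_legacy_vendor("A123456"))   # AncestryDNA
print(guess_legacy_vendor("ZZ98765"))   # None, so run a one-to-one or one-to-many report
```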

The small number, 136 in this case, beside the cM number indicates the cluster or clusters that these people are members of. Some people are members of multiple clusters.

Let’s see what’s next.

Cluster 136 Common Ancestors

Clicking on the Ancestors icon provides a report that shows all of the Ancestor Clusters in cluster 136.

The difference between this ancestor chart and the larger chart is that this only shows ancestors for cluster 136, while the larger chart shows ancestors for the entire AutoCluster report.

Cluster 136 Locations

All of the locations shown are included in trees of people who cluster together in cluster 136. Of course, this does NOT mean that these locations are all relevant to cluster 136. However, finding my own tree listed might provide an important clue.

Using the location tool, I discover 5 separate location clusters. This location cluster includes me with each tester’s ancestors who are found in Montgomery County, Ohio.

The difference between this chart for cluster 136 only and the larger location chart is that every location in this chart is relevant for people who all cluster together meaning we all share some ancestral line.

Viewing the trees of other people in the cluster may suggest ancestors or locations that are essential for breaking down brick walls.

Cluster 136 AutoKinship

Clicking on the anchor in the AutoKinship column provides a genetically reconstructed tree based on how closely each of the people match me, and each other. Clearly, in order to be able to provide this prediction, information about how your matches also match each other, or don’t, is required.

Again, the cM amount shown is the cM match with me, not with each other. However, if you click on a match, a popup will be shown that shows the shared cM between that person and the other matches as well as the relationship prediction between them in this tree.

So, Bill matches David with a total of 354.3 cM and they are positioned as first cousins once removed in this tree. The probability of the match being a 1C1R (first cousin once removed) is 64.9%, meaning of course that other relationships are possible.

Note that Bill and David ALSO share a segment with me in autosegment cluster 185, on chromosome 3.

It’s important to note that while 136 is the autocluster number, meaning that colored block on the report, WITHIN clusters, autosegment clusters are formed and numbered. 

Each autosegment cluster receives its own number and the numbers are for the entire report. You will have more autosegment clusters than autoclusters, because at least some of the colorful autoclusters will contain more than one segment cluster.

Remember, autoclusters are those colorful boxes of matches that fly into place. Autosegment clusters are the matching triangulated clusters on chromosomes and they are represented by the blue bars, shown below.

AutoCluster 136 contains 5 different autosegment clusters, but Bill is only included in one of those autosegment clusters.

You’ll notice that there are some people, like Robin at the bottom, who do match some other people in the cluster, but either not enough people, or not enough overlapping DNA to be included as an autocluster member.

The small colored chromosomes with numbers, boxed in red, indicate the chromosome on which this person matches me.

If you click on that chromosome icon, you’ll see a popup detailing everyone who matches me on that segment.

Note that in some cases a member of a segment cluster, like Robin, did not make it in the AutoCluster cluster. You can spot these occurrences by scrolling down and looking at the cluster column which will then be empty for that particular match.

Reconstructed AutoKinship Trees in Most Likely Order

Scrolling down the page, next we see that we have multiple possible trees to view. We are shown the most likely tree first.

Tree likelihood is constructed based on the combined probability of my matching cM to an individual plus their likely relationship to each other based on the amount of DNA they share with each other as well.

In my case, all of the first 8 trees are equally likely to be accurate, based on autosomal genetic relationships only. The ninth tree is only very slightly less likely to be accurate.
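One way to picture the ranking is to multiply the probabilities of every pairwise relationship a candidate tree implies and then sort the trees by that product. The Python sketch below does exactly that. Treating the pairs as independent and multiplying is my simplifying assumption, not a description of Genetic Affairs' actual calculation, and apart from the 64.9% figure for Bill and David as 1C1R, every number here is invented.

```python
from math import prod


def rank_trees(candidate_trees):
    """Score each tree by the product of its pairwise probabilities, best first."""
    scored = [
        (name, prod(pair_probs.values()))
        for name, pair_probs in candidate_trees.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Each tree maps a pair of people to the probability of the relationship
# that tree assigns them. Only the 0.649 value comes from my report.
trees = {
    "tree_1": {("Bill", "David"): 0.649, ("Bill", "Robin"): 0.40},
    "tree_2": {("Bill", "David"): 0.200, ("Bill", "Robin"): 0.40},
}
for name, score in rank_trees(trees):
    print(name, round(score, 4))
```

Two trees that assign equally probable relationships to every pair come out with identical scores, which is how several trees can tie for most likely.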

The X chromosome is not utilized separately in this analysis, nor are Y or mitochondrial DNA haplogroups if provided.

DNA Relationship Matrix

Continuing to scroll down, we next see the DNA matrix that shows relationships for cluster 5 in a grid format. Click on “Download Relationship Matrix” to view in a spreadsheet.

Keep scrolling for the next view which is the Individual Segment Cluster Information.

Individual Segment Cluster Information

Remember that we are still focused on only one cluster – in this case, cluster 136. Each cluster contains people who all match at least some subset of other people in the cluster. Some people will match each other and the tested person on the same chromosome segment, and some won’t. What we generally see within clusters are “subclusters” of people who match each other on different chromosomes and segments. Also, some matches from cluster 136 might match other people but those matches might not be a member of cluster 136.

In autocluster 136, I have 14 DNA segments that converge into 5 segment clusters with my matches. Here’s segment cluster 185 that consists of two people in addition to me. Note that for individuals to be included in these segment clusters at GEDmatch, they must triangulate with people in the same segment cluster.

From left to right, we see the following information:

  • AutoCluster number 136, shown below

  • Segment cluster 185. This is a segment cluster within autocluster 136.

  • Segment cluster 185 occurs on chromosome 3, between the designated start and stop locations.
  • The segment representation shows the overlapping portions of the two matches, to me. You can easily see that they overlap almost exactly with each other as well.
  • The SNP count is shown, followed by the name and cM count.

Cluster 136 AutoKinship Tree

The AutoKinship Tree column is different from the AutoKinship column in one fundamental way. The new AutoKinship Tree feature combines the genealogical AutoTree and the genetic AutoKinship output together in one report.

You can see that the “prior” genealogical tree information that one of my matches also descends from Jacob Lentz (and wife, if you click further) has now been included. The matches without trees have been reconstructed around the known genealogy based on how they match me and each other.

I was already aware of how I’m related to Bill, David, *C and *R, but I don’t know how I am related to these other people. Based on their kit identifier, I can go to the vendor where they tested and utilize tools there, and I can check to see if they have uploaded their DNA files elsewhere to discover additional records information or critical matches. Now at least I know where in the tree to search.

Cluster 136 AutoSegment

Clicking on AutoSegment provides you with segment information. Each cluster is painted on your chromosomes.

By hovering over the darkly colored segments, which are segment clusters, you can view who you match, although to view multiple matches, continue scrolling.

In the next section, you’ll see the two segment clusters contained wholly within cluster 136.

Following that is the same information for segment clusters partially linked to cluster 136, but not contained wholly within 136.

Bonus – Tell Me Everything – Individual Match Clusters

We’ve focused specifically on the AutoKinship tools, but if you’re interested in “everything” about one specific match, you can approach things from that perspective too. I often look at a cluster, then focus on individuals, beginning with those I can identify which focuses my search.

If you click on any person in your match list, you’ll receive a report focusing on that person in your autocluster.

Let’s use cousin Bill as an example. I know how he’s related to me.

You can choose to display your chosen cluster by:

  • Cluster
  • Number of shared matches
  • Shared cM with the tester
  • Name

I would suggest experimenting with all of the options to see which one displays information that is most useful to the question you’re trying to answer.

Beneath the cluster for Bill, you’ll see the relevant information about the cluster itself. Bill has cluster matches on two different chromosomes.

The AutoCluster Cluster member Information report shows you how much DNA each cluster member shares with the tested person, which is me, and with each other cluster member. It’s easy to see at a glance who Bill is most closely related to by the number of cMs shared.

Only one of Bill’s chromosomes, #3, is included in clusters, but this tells me immediately that this/these segments on chromosome 3 triangulate between me, Bill, and at least one other person.

Segments shown in orange (chromosome 22) match me, but are not included in a cluster.

Special Use Cases – Unknown People

For adoptees and people trying to figure out how they are related to closer relatives, especially those without a tree, this new combined AutoKinship tool is wonderful.

400 cM is the upper default limit when running the report, meaning that close family members will not be included because they would be included in many clusters. However, you can make a different selection. If you’re trying to determine how several closely related people intersect, select a high threshold to include everyone.

Select a lower number of matches, like 25 or 50.

In this example, “no limit” was selected as the upper total match threshold, along with the 25 closest matches.

AutoKinship then constructs a genetic tree and tells you which trees are possible and most likely. If some people do have trees, that common ancestor information would be included as well.

Note that when matches occur over the 400 cM threshold, there will be too many common chromosome matches so the chromosome numbers are omitted. Just check the other reports.

This tool would have helped a great deal with a recent close match who didn’t know how they are related to my family.

You can see this methodology in action and judge its accuracy by reconstructing your own family, assuming some of your known family members have uploaded to GEDmatch. Try it out.

It’s a Lot!

I know there’s a lot here to absorb, but take your time and refer back to this article as needed.

This flexible new tool combines DNA matching, genealogy trees, genetic trees, locations, autoclusters, a chromosome browser, and triangulation. It took me a few passes and working with different clusters to understand and absorb the information that is being provided.

For people who don’t know who their parents or close relatives are, these tools are amazing. Not only can they determine who they are related to, and who is related to each other, but with the use of trees, they can view common ancestors which provides possible ancestors for them too.

For people painting their triangulated segments at DNAPainter, AutoKinship provides triangulation groups that can be automatically painted using the Cluster Auto Painter, here, plus helps to identify that common ancestor. You can read more about DNAPainter, here.

For people seeking to break down brick walls, AutoKinship Tree provides assistance by providing tree matching between your matches for common ancestors NOT IN YOUR TREE, but that ARE in theirs. Your brick walls are clearly not (yet) identified in your tree, although that’s our fervent hope, right?

Even if your matches’ trees don’t go far enough back, as a genealogist, you can extend those trees further to hopefully reveal a previously unknown common ancestor.

The Best Things You Can Do

Aside from DNA testing, the three best things you can do to help yourself, and your clusters are:

  • Upload your GEDCOM file, complete with locations, so you have readily available trees. Ask your matches to do so as well. Trees help you and others too.
  • Encourage people you match at Ancestry, which provides no chromosome segment information or chromosome browser, to upload a copy of their DNA files and tree.
  • Test your family members and cousins, and encourage them to upload their DNA and their trees. Offer to assist them. You can find step-by-step download/upload instructions here.

Have fun!


Genetic Affairs: AutoPedigree Combines AutoTree with WATO to Identify Your Potential Tree Locations

July 2020 Update: Please note that Ancestry issued a cease-and-desist order against Genetic Affairs, and this tool no longer works at Ancestry. The great news is that it still works at the other vendors, and you can ask Ancestry matches to transfer, which is free.

If you’re an adoptee or searching for an unknown parent or ancestor, AutoPedigree is just what you’ve been waiting for.

By now, we’re all familiar with Genetic Affairs, which launched in 2018 with its signature autocluster tool. AutoCluster groups your matches into clusters based on which of your matches also match each other, in addition to you.

A year later, in December 2019, Genetic Affairs introduced AutoTree, automated tree reconstruction based on your matches’ trees at Ancestry and Family Finder at Family Tree DNA, even if you don’t have a tree.

Now, Genetic Affairs has introduced AutoPedigree, a combination of the AutoTree reconstruction technology combined with WATO, What Are the Odds, as seen here at DNAPainter. WATO is a statistical probability technique developed by the DNAGeek that allows users to review possible positions in a tree for where they best fit.

Here’s the progressive functionality of how the three Genetic Affairs tools, combined, function:

  • AutoCluster groups people based on if they match you and each other
  • AutoTree finds common ancestors for trees from each cluster
  • Next, AutoTree finds the trees of all matches combined, including from trees of your DNA matches not in clusters
  • AutoPedigree checks to see if a common ancestor tree meets the minimum requirement which is (at least) 3 matches of greater than or equal to 30-40 cM. If yes, an AutoPedigree with hypotheses is created based on the common ancestor of the matching people.
  • Combined AutoPedigrees then reviews all AutoTrees and AutoPedigrees that have common ancestors and combines them into larger trees.
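The minimum requirement in the AutoPedigree step is easy to express as a check. In this Python sketch the function name is mine, and since the requirement is stated as a 30-40 cM range, I've made the cutoff a parameter and defaulted it to 30, which is an assumption. The cM values are invented.

```python
def meets_autopedigree_minimum(match_cms, min_cm=30.0, min_matches=3):
    """Return True if at least `min_matches` matches share `min_cm` or more."""
    qualifying = sum(1 for cm in match_cms if cm >= min_cm)
    return qualifying >= min_matches


# Hypothetical cM totals for matches who descend from one common ancestor.
print(meets_autopedigree_minimum([354.3, 62.1, 41.0, 12.5]))  # True, three qualify
print(meets_autopedigree_minimum([354.3, 62.1, 25.0]))        # False, only two qualify
```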

Let’s look at examples, beginning with DNAPainter who first implemented a form of WATO.

DNA Painter

Let’s say you’re trying to figure out how you’re related to a group of people who descend from a specific ancestral couple. This is particularly useful for someone seeking unknown parents or other unknown relationships.

DNA tools are always from the perspective of the tester, the person whose kit is being utilized.

At DNAPainter, you manually create the pedigree chart beginning with a common couple and creating branches to all of their descendants that you match.

This example at DNAPainter shows the matches with their cM amounts in yellow boxes.

The tester doesn’t know where they fit in this pedigree chart, so they add other known lines and create hypothesis placeholder possibilities in light blue.

In other words, if you’re searching for your mother and you were born in 1970, you know that your mother was likely born between 1925 (if she was 45 when she gave birth to you) and 1955 (if she was 15 when she gave birth to you.) Therefore, in the family you create, you’d search for parents who could have given birth to children during those years and create hypothetical children in those tree locations.
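The arithmetic behind that window is simple enough to put in a tiny Python helper. The function name is mine, and the 15 and 45 age limits are just the ones used in the example above, so adjust them to suit your own situation.

```python
def parent_birth_window(child_birth_year, youngest_age=15, oldest_age=45):
    """Return (earliest, latest) plausible birth years for a parent."""
    earliest = child_birth_year - oldest_age   # parent was `oldest_age` at the birth
    latest = child_birth_year - youngest_age   # parent was `youngest_age` at the birth
    return earliest, latest


print(parent_birth_window(1970))  # (1925, 1955)
```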

The WATO tool then utilizes the combination of expected cMs at that position to create scores for each hypothesis position based on how closely or distantly you match other members of that extended family.

The Shared cM Project, created and recently updated by Blaine Bettinger, is used as the foundation for the expected centimorgan (cM) ranges of each relationship. DNAPainter has automated the possible relationships for any given matching cM amount, here.

In the graphic above, you can see that the best hypothesis is #2 with a score of 1, followed by #4 and #5 with scores of 3 each. Hypothesis 1 has a score of 63.8979 and hypothesis 3 has a score of 383.

You’ll need to scroll to the bottom to determine which of the various hypotheses are more likely.

Autopedigree DNAPainter calculated probability

Using DNAPainter’s WATO implementation requires you to create the pedigree tree to test the hypothesis. The benefit of this is that you can construct the actual pedigree as known based on genealogical research. The downside, of course, is that you have to research each line down to the current generation to be able to create the pedigree accurately, and that’s a long and sometimes difficult manual process.

Genetic Affairs and WATO

Genetic Affairs takes a different approach to WATO. Genetic Affairs removes the need for hand entry by scanning your matches at Ancestry and Family Tree DNA, automatically creating pedigrees based on your matches’ trees. In addition, Genetic Affairs automatically creates multiple hypotheses. You may need to utilize both approaches, meaning Genetic Affairs and DNAPainter, depending on who has tested, tree completeness at the vendors, and other factors.

The great news is that you can import the Genetic Affairs reconstructed trees into DNAPainter’s WATO tool instead of creating the pedigrees from scratch. Of course, Genetic Affairs can only use the trees someone has entered. You, on the other hand, can create a more complete tree at DNAPainter.

Combining the two tools leverages the unique and best features of both.

Genetic Affairs AutoPedigree Options

Recently, Genetic Affairs released AutoPedigree, their new tool that utilizes the reconstructed AutoTrees+WATO to place the tester in the most likely region or locations in the reconstructed tree.

Let’s take a look at an example. I’m using my own kit to see what kind of results and hypotheses exist for where I fit in the tree reconstructed from my matches and their trees.

If you actually do have a tree, the AutoTree portion will simply be counted as an equal tree to everyone else’s trees, but AutoPedigree will ignore your tree, creating hypotheses as if it doesn’t exist. That’s great for adoptees who may have hypothetical trees in progress, because that tree is disregarded.

First, sign on to your account at Genetic Affairs and select the AutoPedigree option for either Ancestry or Family Tree DNA, which reconstructs trees and generates hypotheses automatically. For AutoPedigree construction, you cannot combine the results from Ancestry and FamilyTreeDNA like you can when reconstructing trees alone. You’ll need to do an AutoPedigree run for each vendor. The good news is that while Ancestry has more testers and matches, FamilyTreeDNA has many testers stretching back 20 years or so who passed away before testing became available at Ancestry. Often, their testers reach back a generation or two further. You can easily transfer Ancestry (and other) results to Family Tree DNA for free to obtain more matches – step-by-step instructions here.

At Genetic Affairs, you should also consider including half-relations, especially if you are dealing with an unknown parent situation. Selecting half-relationships generates very large trees, so you might want to do the first run without, then a second run with half relationships selected.

AutoPedigree options

Results

I ran the program and opened the resulting email with the zip file. Saving that file automatically unzips for me, displaying the following 5 files and folders.

Autopedigree cluster

Clicking on the AutoCluster HTML link reveals the now-familiar clusters, shown below.

Autopedigree clusters

I have a total of 26 clusters, only partially shown above. My first peach cluster and my 9th blue cluster are huge.

Autopedigree 26 clusters

That’s great news because it means that I have a lot to work with.

autopedigree folder

Next, you’ll want to click to open your AutoPedigree folder.

For each cluster, you’ll have a corresponding AutoPedigree file if an AutoPedigree can be generated from the trees of the people in that cluster.

My first cluster is simply too large to show successfully in blog format, so I’m selecting a smaller cluster, #21, shown below with the red arrow, with only 6 members. Why so small, you ask? In part, because I want to illustrate the fact that you really don’t need a lot of matches for the AutoPedigree tool to be useful.

Autopedigree multiple clusters

Note also that this entire group of clusters (blue through brown) has members in more than one cluster, indicated by the grey cells that mean someone is a member of at least 2 clusters. That tells me that I need to include the information from those clusters too in my analysis. Fortunately, Genetic Affairs realizes that and provides a combined AutoPedigree tool for that as well, which we will cover later in the article. Just note for now that the blue through brown clusters seem to be related to cluster 21.
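To make the idea of clusters tied together by shared members concrete, here is a small sketch that lists which pairs of clusters have at least one member in common. It follows the description above (someone belonging to 2 or more clusters). The data structure, function name, and match names are assumptions for illustration and do not reflect the format of the Genetic Affairs output files.

```python
from itertools import combinations


def linked_clusters(membership):
    """Return sorted pairs of cluster numbers that share at least one member.

    `membership` maps a match name to the set of cluster numbers that
    match belongs to.
    """
    links = set()
    for clusters in membership.values():
        # A match in two or more clusters links every pair of those clusters.
        for a, b in combinations(sorted(clusters), 2):
            links.add((a, b))
    return sorted(links)


# Hypothetical membership data.
membership = {
    "match A": {21, 22},
    "match B": {21},
    "match C": {20, 21},
}
print(linked_clusters(membership))  # [(20, 21), (21, 22)]
```

Clusters that show up in the same pair are worth reviewing together.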

Let’s look at cluster 21.

autopedigree cluster 21

In the AutoPedigree folder, you’ll see cluster files when there are trees available to create pedigrees for individual clusters. If you’re lucky, you’ll find 2 files for some clusters.

autopedigree ancestors

At the top of each cluster AutoPedigree file, Genetic Affairs shows you the home couple of the descendant group shown in the matches and their corresponding trees.

Autopedigree WATO chart

Image 1

I don’t expect you to be able to read everything in the above pedigree chart, just note the matches and arrows.

You can see three of my cousins who match, labeled with “Ancestry.” You also see branches that generate a viable hypothesis. When generating AutoPedigrees, Genetic Affairs truncates any branches that cannot result in a viable hypothesis for placing the tester on the tree, so you may not see all matches.

Autopedigree hyp 1

Image 2

On the top branch, you’ll see hyp-1-child1 which is the first hypothesis, with the first child. Their child is hyp-2-child2, and their child is hyp-3-child3. The tester (me, in this case) cannot be the persons shown with red flags, called badges, based on how I match other people and other tree information such as birth and death dates.

Think of a stoplight: red means no, green marks your best bets, and the rest are yellow, meaning maybe. AutoPedigree makes no decisions. It only shows you options and calculates mathematically how probable each location is to be correct.

Remember, these “children,” such as hypothesis 1-child 1, may or may not have actually existed. These relationships are hypothetical, showing you where the tester could appear on the tree IF these people existed.

We know that I don’t fit on the branch above hypothesis 1, because I only match the descendant of Adam Lentz at 44.2 cM which is statistically too low for me to also inhabit that branch.

I’ve included half relationships, so we see hyp-7-child1-half too, which is a half-sibling.

Hypotheses 1, 2, and 7 all have red badges, meaning not possible, so they have a score of 0. Hypotheses 3 and 8 are possible, each with a ranking of 16.

autopedigree my location

Image 3

Looking now at the next segment of the tree, you see that based on how I match my Deatsman and Hartman cousins, I can potentially fit in any portion of the tree with green badges (in the red boxes) or yellow badges.

You can also see where I actually fit in the tree. HOWEVER, that placement is from AutoTree, the tree reconstruction portion, based on the fact that I have a tree (or someone has a tree with me in it). My own tree is ignored in the AutoPedigree hypothesis generation portion.

Had my first cousins once removed through my grandfather John Ferverda’s brother, Roscoe, tested AND HAD A TREE, there would have been no question where I fit based on how I match them.

autopedigree cousins

As it turns out, they did test but provided no tree, meaning that Genetic Affairs had no tree to work with.

Remember that I mentioned that my first cluster was huge. Many more matches mean that Genetic Affairs has more to work with. From that cluster, here’s an example of a hypothesis being accurate.

autopedigree correct

Image 4

You can see the hypothetical line beneath my own line, with hypotheses 104, 105, 106, 107, and 108. The AutoTree portion of my tree is shown above, with my father and grandparents and my name in the green block. The AutoPedigree portion ignores my own tree, therefore generating the hypothesis that’s where I could fit with a rank of 2. And yes, that’s exactly where I fit in the tree.

In this case, there were some hypotheses ranked at 1, but they were incorrect, so be sure to evaluate all good (green) options, then yellow, in that order.

Genetic Affairs cannot work with 23andMe results for AutoPedigree because 23andMe doesn’t provide or support trees on their site. AutoClusters are integrated at MyHeritage, but not the AutoTree or AutoPedigree functions, and they cannot be run separately.

That leaves Family Tree DNA and Ancestry.

Combined AutoPedigree

After evaluating each of the AutoPedigrees generated for the individual clusters, click on the various combined cluster AutoPedigrees.

autopedigree combined

You can see that for cluster 1, I have 7 separate AutoPedigrees based on common ancestors that were different. I have 3 AutoPedigrees also for cluster 9, and 2 AutoPedigrees for 15, 21, and 24.

I have no AutoPedigrees for clusters 2, 3, 5, 6, 7, 8, 14, 17, 18, and 22.

Moving to the combined clusters, the numbers of which are NOT correlated to the clusters themselves, Genetic Affairs has searched trees and combined ancestors in various clusters together when common ancestors were found.

Autopedigree multiple clusters

Remember that I asked you to note that the above blue through brown clusters seem to have commonality between the clusters based on grey cell matches who are found in multiple groups? In fact, these people do share common ancestors, with a large combined AutoPedigree being generated from those multiple clusters.

I know you can’t read the tree in the image that follows. I’m only including it so you’ll see the scale of that portion of my tree that can be reconstructed from my matches with hypotheses of where I fit.

autopedigree huge

Image 5

These larger combined pedigrees are very useful to tie the clusters together and understand how you match numerous people who descend from the same larger ancestral group, further back in time.

Integration with DNAPainter

autopedigree wato file

Each AutoPedigree file and combined cluster AutoPedigree file in the AutoPedigree folder is provided in WATO format, allowing you to import them into DNAPainter’s WATO tool.

autopedigree dnapainter import

You can manually flesh out the trees based on actual genealogy in WATO at DNAPainter, and manually add matches from GEDmatch, 23andMe, or MyHeritage, or matches from vendors where your matches’ trees may not exist but you know how your match connects to you.

Your AutoTree Ancestors

But wait, there’s more.

autopedigree ancestors folder

If you click on the Ancestors folder, you’ll see 5 options for tree generations 3-7.

autopedigree ancestor generations

My three-generation auto-generated reconstructed tree looks like this:

autopedigree my tree

Selecting the 5th generation level displays Jacob Lentz and Frederica Ruhle, the couple shown in the AutoCluster 21 and AutoPedigree examples earlier. The color-coding indicates the source of the ancestors in that position.

Autopedigree expanded tree

You will also note that Genetic Affairs indicates how many matches I have that share this common ancestor along with which clusters to view for matches relevant to specific ancestors. How cool is this?!!

Remember that you can also import the genetic match information for each AutoTree cluster found at Family Tree DNA into DNAPainter to paint those matches on your chromosomes using DNAPainter’s Cluster Auto Painter.

If you run AutoCluster for matches at 23andMe, MyHeritage, or FamilyTreeDNA, all vendors who provide segment information, you can also import that cluster segment information into DNAPainter for chromosome painting.

However, from that list of vendors, you can only generate AutoTrees and AutoPedigrees at Family Tree DNA. Given this, it’s in your best interest for your matches to test at or upload their DNA (plus tree) to Family Tree DNA, which both supports trees AND provides segment information, and where you can run AutoTree and AutoPedigree.

Have you painted your clusters or generated AutoTrees? If you’re an adoptee or looking for an unknown parent or grandparent, the new AutoPedigree function is exactly what you need.

Documentation

Genetic Affairs provides complete instructions for AutoPedigree in this newsletter, along with a user manual here, and the Facebook Genetic Affairs User Group can be found here.

I wrote the introductory article, AutoClustering by Genetic Affairs, here, and Genetic Affairs Reconstructs Trees from Genetic Clusters – Even Without Your Tree or Common Ancestors, here. You can read about DNAPainter, here.

Transfer your DNA file, for free, from Ancestry to Family Tree DNA or MyHeritage, by following the easy instructions, here.

Have fun! Your ancestors are waiting.

_____________________________________________________________

Disclosure

I receive a small contribution when you click on some of the links to vendors in my articles. This does NOT increase the price you pay but helps me to keep the lights on and this informational blog free for everyone. Please click on the links in the articles or to the vendors below if you are purchasing products or DNA testing.

Thank you so much.

DNA Purchases and Free Transfers

Genealogy Products and Services

Genealogy Research

 

Are You DNA Testing the Right People?

We often want to purchase DNA kits for relatives, especially during the holidays when there are so many sales. (There are links for free shipping on tests in addition to sale prices at the end of this article. If you already know who to test, pop on down to the Sales section, now.)

Everyone is on a budget, so who should we test to obtain results that are relevant to our genealogy?

We tell people to test as many family members as possible – but what does that really mean?

Testing everyone may not be financially viable, nor necessary for genealogy, so let’s take a look at how to decide where to spend YOUR testing dollars to derive the most benefit.

It’s All Relative😊

When your ancestors had children, those children inherited different pieces of your ancestors’ DNA.

Therefore, it’s in your best interest to test all of the direct descendants generationally closest to the ancestor that you can find.

It’s especially useful to test descendants of your own close ancestors – great-great-grandparents or closer – where there is a significant possibility that you will match your cousins.

All second cousins match, and roughly 90% (or more) of third cousins match.

Percent of cousins match.png

This nifty chart compiled by ISOGG shows the probability statistics produced by the major testing companies regarding cousin matching relationships.

My policy is to test 4th cousins or closer. The more, the merrier.

Identifying Cousins

  • First cousins share grandparents.
  • Second cousins share great-grandparents.
  • Third cousins share great-great-grandparents.
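The pattern in that list extends to any cousin degree: each additional degree adds one "great-" to the shared ancestors. A minimal sketch, with a made-up function name:

```python
def shared_ancestors(cousin_degree):
    """Name the ancestors shared by full cousins of the given degree."""
    if cousin_degree < 1:
        raise ValueError("cousin_degree must be 1 or greater")
    # First cousins share grandparents; each further degree adds one "great-".
    return "great-" * (cousin_degree - 1) + "grandparents"


for degree in (1, 2, 3, 4):
    print(degree, shared_ancestors(degree))
```

By the same pattern, fourth cousins share great-great-great-grandparents.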

The easiest way for me to see who these cousins might be is to open my genealogy software on my computer, select my great-great-grandparent, and click on descendants. Pretty much all software has a similar function.

The resulting list shows all of the descendants of that ancestor that I’ve entered in my software. Most genealogists already have or could construct this information with relative ease. These are the cousins you need to be talking to anyway, because they will have photos and stories that you don’t. If you don’t know them, there’s never been a better time to reach out and introduce yourself.

Who to test descendants software

People You Already Know

Sometimes it’s easier to start with the family you already know and may see from time to time. Those are the people who will likely be the most beneficial to your genealogy.

Who to test 1C.png

Checking my tree at FamilyTreeDNA, Hiram Ferverda and Evaline Miller are my great-grandparents. All of their children are deceased, but I have a relationship with the children born to their son, Roscoe. Both Cheryl and her brother carry parts of Hiram and Eva’s DNA that their son John Ferverda (my grandfather) didn’t inherit, and therefore that I can’t carry.

Therefore, it’s in my best interest to gift both my cousin Cheryl and her brother with DNA kits. As it turns out, I already have, and my common matches with both Cheryl and her brother are invaluable because I know that people who match me plus either one of them descend from the Ferverda or Miller lines. This relationship and linking them on my tree, shown above, allows Family Tree DNA to perform phased Family Matching, which is their form of triangulation.

It’s important to test both siblings, because some people will match me plus one but not the other sibling.

Who’s Relevant?

Trying to convey the concept of who to test and not to test, and why, is sometimes confusing.

Many family members may want to test, but you may only be willing to pay for those tests that can help your own genealogy. We need to know who can best benefit our genealogy in order to make informed decisions.

Let’s look at example scenarios – two focused on grandparents and two on parents.

In our example family, a now-deceased grandmother and grandfather have 3 children and multiple grandchildren. Let’s look at when we test which people, and why.

Example 1: Grandparents – 2 children deceased, 1 living

In our first example, Jane and Barbara, my mother, are deceased, but their sibling Harold is living. Jane has a living daughter and my mother had 3 children, 2 of whom are living. Who should we test to discover the most about my maternal grandparents?

Please note that before making this type of a decision, it’s important to state the goal, because the answer will be different depending on your goal at hand. If I wanted to learn about my father’s family, for example, instead of my maternal grandparents, this would be an entirely different question, answer, and tree.

Descendant test

The people who are “married in” but irrelevant to the analysis are greyed out. In this case, all of the spouses of Jane, Barbara and Harold are irrelevant to the grandmother and grandfather shown. We are not seeking information about those spouses or their families.

The people I’ve designated with the red stars should be tested. This is the “oldest” generation available. Harold can be tested, so his son, my first cousin, does not need to test because the only part of the grandparents’ DNA that Harold’s son can inherit is a portion of what his father, Harold, carries and gave to him.

Unfortunately, Jane is deceased but her daughter, Liz, is available to test, so Liz’s son does not need to.

I need to test, as does my living brother and the children of my deceased brother in order to recover as much as possible of my mother’s DNA. They will all carry pieces of her DNA that I don’t.

The children of anyone who has a red star do NOT need to test for our stated genealogical purpose because they only carry a portion of their parent’s DNA, and that parent is already testing.

Those children may want to test for their own genealogy given that they also have a parent who is not relevant to the grandfather and grandmother shown. In my case, I’m perfectly happy to facilitate those tests, but not willing to pay for the children’s tests if the relevant parent is living. I’m only willing to pay for tests that are relevant to my genealogical goals – in this case, my grandparents’ heritage.

In this scenario, I’m providing 5 tests.

Of course, you may have other family factors in play that influence your decision about how many tests to purchase for whom. Family dynamics might include things like hurt feelings and living people who are unwilling or unable to test. I’ve been known to purchase kits for non-biologically related family members so that people could learn how DNA works.

Example 2: Grandparents – 2 children living, one deceased

For our second example, let’s change this scenario slightly.

Descendant test 2

From the perspective of only my grandparents’ genealogy, if my mother is alive, there’s no reason to test her children.

Barbara and Harold can test. Since Jane is deceased, and she had only one child, Liz is the closest generationally and can test to represent Jane’s line. Liz’s son does not need to test since his mother, the closest relative generationally to the grandparents is available to test.

In this scenario, I’m providing 3 tests.
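The selection rule used in Examples 1 and 2 can be sketched as a walk down the family tree: test a living descendant and stop there, or, if that person is deceased, move on to their children. The class, function name, and placeholder labels for unnamed relatives are assumptions for illustration. The grandparent couple is treated as a single deceased node, and the deceased brother is given one child so the total matches the 5 tests counted in Example 1.

```python
from dataclasses import dataclass, field


@dataclass
class Person:
    name: str
    living: bool
    children: list = field(default_factory=list)


def who_to_test(person):
    """Return the living descendants generationally closest to the ancestor."""
    if person.living:
        # Test this person; their children carry only part of what they carry.
        return [person.name]
    selected = []
    for child in person.children:
        # Deceased, so fall through to the next generation on this line.
        selected.extend(who_to_test(child))
    return selected


# Example 1: Jane and Barbara are deceased, Harold is living.
grandparents = Person("Grandparents", False, [
    Person("Jane", False, [Person("Liz", True, [Person("Liz's son", True)])]),
    Person("Barbara", False, [
        Person("Me", True),
        Person("Living brother", True),
        Person("Deceased brother", False, [Person("Brother's child", True)]),
    ]),
    Person("Harold", True, [Person("Harold's son", True)]),
])
print(who_to_test(grandparents))
# ['Liz', 'Me', 'Living brother', "Brother's child", 'Harold']
```

Marking Barbara as living instead reproduces Example 2, where only Liz, Barbara, and Harold are selected.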

Example 3: My Immediate Family – both parents living

In this third example, I’m looking from strictly MY perspective viewing my maternal grandparents (as shown above) AND my immediate family meaning the genealogical lines of both of my parents. In other words, I’ve combined two goals. This makes sense, especially if I’m going to be seeing a group of people at a family gathering. We can have a swab party!

Descendants - parents alive

In the situation where my parents are both living, I’m going to test them in addition to Harold and Liz.

I’m testing myself because I want to work using my own DNA, but that’s not really necessary. My parents will both have twice as many matches to other people as I do – because I only inherited half of each parent’s DNA.

In this scenario, I’m providing 5 tests.

Example 4: My Immediate Family – one parent living, one deceased

Descendants - father deceased

In our last example, my mother is living but my father is deceased. In addition to Harold and Liz, who reflect the DNA of my maternal grandparents, I will test myself, my mother, my living brother, and my deceased brother’s child.

Because my father is deceased, testing as many of my father’s descendants as possible, in addition to myself, is the only way for me to obtain some portion of his DNA. My siblings will have pieces of my parent’s DNA that I don’t.

I’m not showing my father’s tree in this view, but looking at his tree and who is available to test to provide information about his side of the family would be the next logical step. He may have siblings and cousins that are every bit as valuable as the people on my mother’s side.

Applying this methodology to your own family, who is available to test?

Multiple Databases

Now that you know WHO to test, the next step is to make sure your close family members test at each of the major providers where your DNA is as well.

I test everyone at Family Tree DNA because I have been testing family members there for 19 years and many of the original testers are deceased now. The only way new people can compare to those people is to be in the FamilyTreeDNA database.

Then, with permission of course, I transfer all kits, for free, to MyHeritage. Matching is free, but if you don’t have a subscription, there’s an unlock fee of $29 to access advanced tools. I have a full subscription, so all tools are entirely free for the kits I transfer and manage in my account.

Transferring to Family Tree DNA and matching there is free too. There’s an unlock fee of $19 for advanced tools, but that’s a good deal because it’s substantially less than a new test.

Neither 23andMe nor Ancestry accepts transfers, so you have to test at each of those companies.

The great news is that both Ancestry and 23andMe tests can be transferred to MyHeritage and FamilyTreeDNA.

Before purchasing tests, check first by asking your relatives or testing there yourself to be sure they aren’t already in those databases. If they took a “spit in a vial” test, they are either at 23andMe or Ancestry. If they took a swab test, it’s MyHeritage or FamilyTreeDNA.

I wrote about creating a testing and transfer strategy in the article, DNA Testing and Transfers – What’s Your Strategy? That article includes a handy dandy chart about who accepts which versions of whose files.

Sales

Of course, everything is on sale since it’s the holidays.

Who are you planning to test?

______________________________________________________________

Fun DNA Stuff

  • Celebrate DNA – customized DNA themed t-shirts, bags and other items

Jane Dodson (c1760-1830/1840), Pioneer Wife on 5 Frontiers, 52 Ancestors #142

Jane Dodson was the wife of Lazarus Dodson who was born in about 1760 and probably died in either McMinn County or Claiborne County, Tennessee in about 1826. However, were it not for the 1861 death record of Lazarus and Jane’s son, Lazarus Dodson (Jr.), we would never have known Jane’s name.

Lazarus Jr. died in Pulaski County, Kentucky on October 5, 1861, just before fighting began there in the Civil War. Fortunately, for us, he has a death record and that record tells us that he was born in 1795 and that the names of his parents were Lazarus Dodson and Jane.

dodson-lazarus-1861-death

dodson-lazarus-1861-death-2

This is the only extant record of Lazarus’s mother’s name. Granted, there is no surname, but I’m just grateful for the tidbit we do have. How I do wish though that someone had thought to record her maiden name, because it’s unlikely at this point that we will ever know.

Getting to Know Jane Through Lazarus

What do we know about Jane? Most of what we know about Jane’s life is through Lazarus’s records – not an uncommon circumstance for a frontier wife.

The first positive ID of Lazarus Dodson Sr., Jane’s husband, was when he was recorded as having camped at the headwaters of Richland Creek (in present day Grainger County, TN) in the winter of 1781/1782. Lazarus would have been approximately 22 years of age at this time, or possibly slightly older.

From the book Tennessee Land Entries, John Armstrong’s Office:

Page 105, grant 1262 – Dec. 4, 1783 – James Lea enters 317 acres on the North side of the Holston below the mouth of Richland Ck at a “certain place where Francis Maberry, Major John Reid, and Lazarus Dodson camped with the Indians at they was going down to the Nation last winter and opposite the camp on the other side of the river, border, begins at upper end of the bottom and runs down, warrant issued June 7, 1784, grant to Isaac Taylor.

The “Nation” referred to is the Cherokee Nation.

It has long been suspected that the Dodson and Lea families were intermarried or somehow interrelated, and it’s certainly possible that Lazarus’s wife, Jane, was a Lea. I almost hate to mention that possibility, because I don’t want to start any unsubstantiated rumors.

On the other hand, if an unattached Jane Lea were to be documented, of the right age, in the right place, she would have to be considered as a candidate. Keep in mind that we don’t know who Lazarus’s mother was either, so these families could have been intermarried before Lazarus came onto the scene.

It’s also possible that the only connection between the two families was that they were neighbors for more than a decade on the rough shores of Country Line Creek in Caswell County, North Carolina, before moving to the untamed waters of the Holston River in what would become eastern Tennessee. Country Line Creek was described by the 1860 census taker, almost a hundred years after Raleigh and Lazarus lived there, as the roughest area in Caswell County. The area called Leasburg, in fact, was designated as the first county seat in Caswell County in 1777, although it was a few miles distant from Country Line Creek.

The James Lea (1706-1792) family lived on Country Line Creek in Caswell County, NC, as did Raleigh Dodson, Lazarus’s father. This James Lea, according to his will, did not have a son James, nor a daughter, Jane – so it wasn’t his son who patented the land at the mouth of Richland Creek.

Due to the land entries, we know that both Lazarus and members of the Lea family were present in what would become Hawkins County at least by 1783, and probably earlier.

We don’t know exactly when Lazarus arrived in what was then Sullivan County, NC, but we do know that in 1777, men named Lazarus and Rolly Dodson are recorded as having given oaths of allegiance in Pittsylvania County, Virginia, bordering Caswell County, NC, an area where they were known to have lived, based on multiple records including their Revolutionary War service records. It’s unclear whether this pair is our Raleigh and Lazarus, but the fact that those two names appeared together is highly suggestive that they might be. However, they were not the only Raleigh and Lazarus males in the Dodson family or in this region.

If indeed this is our Lazarus, he was likely of age at that time, so he could have been born before 1760. This suggests that Lazarus was likely married not long after 1777.

Therefore, it’s likely that Raleigh, along with Lazarus, moved from the Halifax/Pittsylvania Virginia border with Caswell County, North Carolina to what was then Sullivan County, North Carolina sometime after July 1778 when Raleigh sold his land and before May of 1779 when Raleigh’s first tract was granted in what would become Hawkins County, Tennessee.

We know that Lazarus was clearly there by the winter of 1781/1782 and probably by spring of 1779 when his father first appears in the written records.

Sometime in the fall or winter of 1778, Raleigh and Lazarus, and Jane if she were married to Lazarus, would have navigated the old wagon roads from Caswell County to near Rogersville, Tennessee. Was Jane frightened, or excited? Was she pregnant? Did she have any idea what to expect? Was this, perchance, her honeymoon? If so, she probably didn’t care where she went, so long as it was with Lazarus. I remember those days of lovestruck early marriage. The words “to the moon and back” are in love songs for a reason!

The earliest record where we find Raleigh Dodson in what would become Hawkins County, TN is in a land warrant dated October 24, 1779 which is a tract for Rowley Dotson for 150 acres joining another tract “where said Dotson lives,” that warrant being issued on May 21, 1779.

By 1780, the Revolutionary War had come to eastern North Carolina.

In October, 1780, the forces under Col. Arthur Campbell gathered at Dodson’s Ford before going downriver to the attack on the Overhill Cherokee towns of Chota, Talequah, Tallassee, and others.

Jane and Lazarus lived at Dodson Ford, and this would probably have been quite frightening for Jane. Could she see the soldiers from her cabin? Did she hear the talk about the expedition? Did Lazarus go along? Colonel Arthur Campbell brought 200 additional men to the Battle of King’s Mountain, also fought in October of 1780. Was Lazarus among those men too? Unfortunately, there is no definitive roster for the Battle of King’s Mountain, only information gathered from here and there.

We know that both Lazarus and his father, Raleigh, served during the Revolutionary War, being discharged in August of 1783 in what was then western North Carolina. Both of their service records provide that information. We don’t know how long they served, but most men served in local militia units routinely.

We also know that in the winter of 1781/1782, Lazarus Dodson was camped on the Holston at the mouth of Richland Creek with Major John Reid “with the Indians,” before they “went down to the Nation,” meaning the Cherokee Nation. Major Reid’s militia unit was formed in 1778 and early 1779 at Long Island on Holston. The phrase, “with the Indians” is baffling, especially given that the militiamen destroyed the Indian towns.

One way or another, Jane was probably alone much of the time between when they settled on the Holston in late 1778 or early 1779 and August of 1783. Those days, waiting for word about Lazarus, were probably very long days, weeks and months, although during this timeframe, men often returned home between engagements if they could.

We don’t know if Jane was Lazarus’s first wife, or not – or whether he married her in Pittsylvania or Halifax County, Virginia, Caswell County, North Carolina or on the frontier in what would become Tennessee. Pittsylvania, Halifax and Caswell Counties bordered each other on the Virginia/North Carolina line, and the Dodson family was active in all three counties.

We do know unquestionably that Jane was the mother of Lazarus Dodson Jr. born in 1795, so she was assuredly married to Lazarus Sr. by that time.

In 1794, Raleigh Dodson, Jane’s father-in-law, died and in 1797, Lazarus moved within Hawkins County from near Dodson Ford on the Holston River to the White Horn Fork of Bent Creek near Bull’s Gap.

The 1800 census is missing, as is 1810, but we know that by 1800 Lazarus and Jane had moved once again and were living near the Cumberland Gap, on Gap Creek, in Claiborne County. In 1802 Lazarus is recorded in the court notes of Claiborne County as a juror, which would indicate that he owned land there by then, a requirement to be on a jury.

Lazarus, and therefore most likely Jane as well, was a member of Gap Creek Baptist Church in Claiborne Co., which was located on Lazarus’ land. Lazarus is referenced in the minutes on Saturday, June 5th, 1805. Another church, Big Springs, in the same association, had asked for Gap Creek’s help with determining what to do about “a breach of fellowship with James Kenney and it given into the hands of members from other churches, to wit Absolom Hurst, Lazarus Dodson and Matthew Sims and they report on Sunday morning a matter too hard for them to define on for they had pulled every end of the string and it led them into the mire and so leave us just where they found us.”

I’m sure whatever that breach was, it was the talk of Gap Creek Baptist Church.

The only Lazrus Dotson or similar name in the 1820 census is found in Williamson County, Tennessee and is age 26-44, born 1776-1794, so too young to be our Lazarus who was born about 1760.

However, 1819 is when Lazarus Dodson sells his land on Gap Creek in Claiborne County, Tennessee and reportedly goes to Jackson County, Alabama for some time. So the 1820 census may simply have missed him. It’s also possible that Lazarus and Jane were living on Indian land in what is now Jackson County.

Or perhaps Lazarus and Jane were in transit. Lazarus’s nephew, William, son of Lazarus’s brother, Toliver, also known as Oliver, was living in Jackson County by early 1819 and lived there until his death in 1872. In fact, there is a now-extinct town, Dodsonville, named after William.

Two of Lazarus Sr.’s sons apparently went with him to Jackson County: Lazarus Jr. and Oliver (not to be confused with Lazarus’s brother Oliver), born in 1794. Lazarus Jr.’s son and Oliver’s son both claim to have been born in Alabama, Oliver’s son in 1819 and Lazarus Jr.’s son about 1821. If Lazarus Sr. was living in Alabama during this time, then so was Jane. It must have pained Jane to leave some of her children behind in Tennessee. No matter how old your children are, they are still your children.

Jane would have been close to 60, and she would have been packing up her household, for at least the third time, if not the fourth time, and moving across the country in a wagon. The distance from Claiborne County to Jackson County, Alabama was approximately 200 miles, which, at the rate of about 10 miles per day in a wagon would have taken about 3 weeks. I wonder if Jane got to vote in the decision to move to Jackson County. I’m guessing not.

Trying to wrap our hands around when Jane was born is made somewhat easier by the fact that she was recorded in the 1830 McMinn County, Tennessee census. Yes, I said Tennessee. Yes, she moved back. With or without Lazarus? We don’t know.

jane-1830-census

In the 1830 census, Jane Dodson is living alone and is recorded as being age 60-70, elderly by the standards of 1830 when the average life expectancy was a mere 37 years. This would put Jane’s birth year between 1760 and 1770. Therefore, Jane was likely married between 1778 and 1790. Those dates bracket the other information we have perfectly, but they don’t offer us any help in determining whether Jane was married to Lazarus before moving to the frontier, or after. Jane is not shown in the 1840 census, so either she has died or she is living with a family member where she cannot be identified.
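For anyone who wants to repeat this kind of census arithmetic for their own ancestors, here is a minimal sketch in Python. The function name birth_year_range is my own invention, and the marriage ages of 18 to 20 are purely an assumption that happens to reproduce the 1778 to 1790 bracket above, not something any record tells us.

```python
def birth_year_range(census_year, age_low, age_high):
    """Return (earliest, latest) birth year implied by a census age bracket."""
    # The oldest possible age gives the earliest birth year, and vice versa.
    return census_year - age_high, census_year - age_low


# Jane in the 1830 census, recorded as age 60-70
jane_earliest, jane_latest = birth_year_range(1830, 60, 70)
print(jane_earliest, jane_latest)  # 1760 1770

# ASSUMPTION (not from any record): marriage at roughly age 18 to 20
print(jane_earliest + 18, jane_latest + 20)  # 1778 1790
```

The same function works for any bracketed census entry, such as the 1820 Williamson County man recorded as age 26-44.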

How Many Moves?

We know that Jane wasn’t born in eastern Tennessee in 1760 or 1770, because very few white families lived there then. Well, of course, this is assuming that Jane was not Native. I’m not entirely sure that’s a valid assumption, but without her mitochondrial DNA, we’ll never know for sure. Without any evidence, or even oral history for that matter, we’ll assume that Jane is not Native, although the fly in that ointment could be the record showing Lazarus camping “with the Indians.” Certainly not direct evidence about Jane, but enough to make you pause a bit and wonder, especially in a time and place when Indians were considered the enemy.

One way or another, perhaps as a teenager or maybe as a bride, Jane probably moved from the relative security of the Piedmont area to the volatile frontier with Indians and soldiers coming and going for at least half a decade.

The soldiers destroyed the Cherokee villages in 1780 and early 1781, so the war on the frontier was far from over. The Revolutionary War was still being fought in many locations – and if Jane was married to Lazarus then, she spent that time in a cabin on the frontier along the Holston River, below, in what is today Hawkins County, Tennessee. Her cabin joined the land of her father-in-law, Raleigh, but he was gone fighting in the War too. Perhaps Jane spent a lot of time with her mother-in-law, Elizabeth, and her sister-in-law, Nelly Dodson Saunders whose husband John was serving as well. In fact, I’d wager that every able-bodied man was serving, so the women of Dodson Creek on the Holston River had better be able to defend themselves.

jane-near-dodson-ford

This photo was taken very near where Dodson Ford crossed the river, also the location where the Great Warrior Path and Trading Path had crossed for generations.

Lazarus served in the Revolutionary War and was discharged in 1783. That would mean that Jane likely waited at home, hoping that he would not be killed and leave her with some number of small children. At that time, women were either pregnant or nursing, so Jane could have been pregnant while he was at war.

We know that after Lazarus was discharged, he patented land in the western Tennessee counties, but it appears that Lazarus lived on Dodson and Honeycutt Creeks adjacent to his father, Raleigh, during this time. That does not mean Lazarus and Jane didn’t perhaps move from one place to another, just not a great distance.

jane-dodson-creek

Dodson Creek, above, is beautiful, as is Honeycutt Creek, below. Jane and Lazarus lived between the two.

jane-honeycutt-creek

This old tree stands at the mouth of Honeycutt Creek and the Holston River.

jane-tree-at-honeycutt

Did Jane stand beneath this tree when it was small and watch for Lazarus to return?

In 1793 or 1794, Jane’s father-in-law, Raleigh, died and the family would have mourned his passing. Jane may have been pregnant at that time with either Oliver or Lazarus Jr. I’m quite surprised that there is no Raleigh among her children, although it’s certainly possible that an earlier Raleigh may have been born and died.

There is a hint that Lazarus may have moved to Greene County, TN and was living there in 1794, or at least a stud racehorse that he co-owned with his brother-in-law, James Menasco, was being advertised “at stud” in Greene County. I can just see Jane rolling her eyes over this great adventure.

Sadly, Lazarus’s sister, Peggy Dodson Menasco, died between 1794 and 1795 when James Menasco sold his land and moved to Augusta, Georgia. Jane would have stood in the cemetery a second time in just a few months as they buried her sister-in-law. I do wonder who raised Peggy’s two children. Was it Jane who comforted them at the funeral?

Oliver was born to Jane in 1794 and Lazarus in 1795.

In 1797, we know that Lazarus sold his land on Dodson Creek and moved to the White Horn Fork of Bent Creek, ten miles or so south, then in Hawkins County, but now in Hamblen County.

White Horn Fork of Bent Creek begins someplace near Summitt Hill Road, runs south, and then intersects with Bent Creek in Bull’s Gap. However, White Horn runs through an area called White Horn, following 66 the entire way, for about 5 miles, from the top of the map below to Bull’s Gap, at the bottom.

jane-white-horn-map

You can see on the satellite map of the region below that this is rough country.

jane-white-horn-satellite

This view of White Horn Creek, below, is from White Horn Road.

jane-white-horn-from-road

White Horn from a side road, below. The creek wasn’t large, but the water would have been very fresh. Water from the source of a stream was always coveted for its cleanliness.

jane-white-horn-side-road

A few years later, by about 1800, Lazarus and family had moved to Claiborne County, where they settled just beneath the Cumberland Gap on Gap Creek, shown below on Lazarus’s land where it crosses Tipprell Road today.

jane-gap-creek

Lazarus bought land early and by 1810 had patented additional land on Gap Creek.

jane-tipprell-road

Lazarus and Jane were likely living on or near this land the entire time they lived in Claiborne County, based on deed and church records. The Gap Creek Baptist Church, which stood on their land, still exists today. Jane very probably attended this church, but of course it would have looked very different then, if it was even the same building at all. It would have been a log structure at that time, as would their home.

gap-creek-church-cropped

In 1819, Lazarus sold out, again, and headed for Alabama. In Alabama, Jane and Lazarus would have settled in the part of Jackson County ceded by the Cherokee earlier that year, so perhaps someplace on what is now Alabama 79, then the main road from Tennessee into Alabama. It probably looked much the same then as it does today. Hilly and treed – for miles and miles and miles. I can’t help but feel for the displaced Cherokee. I wonder if Jane did as well.

jane-jackson-co.

The historic town of Dodsonville once existed in Jackson County, just beneath Scottsboro.

jane-dodsonville

Lazarus’s brother Oliver’s son, William, lived in Jackson County from 1819 until his death in 1872. He is buried in the Dodson Cemetery near Lim Rock, not far from historic Dodsonville, named for him. Dodsonville is probably under dammed Guntersville Lake, today.

By this time, I just feel weary for Jane. I’m sure she longed for a cabin where she could put down roots and didn’t have to sell out and pack up every few years to start over again with few belongings in an unfamiliar place with unknown dangers and strangers she didn’t know. I wonder if Lazarus was the kind of man that was always starry-eyed and enamored with the next great opportunity. Was life just one great adventure after another to him?

We know that in 1826, Lazarus Jr. (we believe) repurchased his father’s land back in Claiborne County, and that Lazarus Sr.’s land transactions, apparently having to do with his estate, were being handled in McMinn County. There is no will or probate for Lazarus Sr. in either Claiborne County or McMinn County, and the Jackson County records were burned in the Civil War.

Giving Lazarus Sr. the benefit of the doubt here, we’ll presume that Lazarus Sr. moved from Alabama directly back to McMinn County and did not first return to Claiborne and then move to McMinn. One way or another, they, or at least Jane, came back to Tennessee as did her sons Lazarus Jr. and Oliver.

Sometime between 1827 and 1830, Jane’s daughter-in-law, Elizabeth Campbell Dodson, Lazarus Jr.’s wife, died. If Jane had not already returned to Tennessee, she may have returned in the wagon with Lazarus Jr. to help with his four children born between 1820 and 1827. However, by 1830, those children were living with their Campbell grandparents, who would raise them to adulthood, in Claiborne County. Perhaps the Campbell grandparents raised the children instead of Jane because they owned a farm, there were two of them, and they were younger than Jane by at least a decade, if not more. Jane, alone, would have had to handle 4 young children. Besides that, Jane’s other son, David, had recently died too, leaving his widow needing help with her children as well. Jane would have been approaching 70 by this time.

Lazarus Jr. returned to Claiborne County and is found in the records beginning in 1826 when he repurchased his father’s land. This is presuming that the land repurchase was by Lazarus Jr. and not Lazarus Sr. Lazarus Jr. remained in Claiborne County where he is found in the court notes from 1827 through about 1833 when he is recorded as being absent and owing taxes.

We know that in 1830 Jane lived someplace near Englewood in McMinn County. Liberty Hill Road runs between Englewood and Cochran Cemetery Road, so this view would have been familiar to Jane, then, too.

jane-liberty-hill-road

So Jane got to pack up for at least a 5th time and move back to Tennessee, and that’s if we know about all the moves, which is certainly not likely.

If Jane married Lazarus in 1778 or 1779, before they left Virginia, that means she got to make major moves at least 5 times between about 1780 and 1825, or roughly every 9 years. And those moves would have been while pregnant, nursing babies, with toddlers, and whatever other challenge or inconvenience you can think of.

In 1825 or so, Jane would have been 60-65 years old. The last thing most people want to do at that age is bounce around in a wagon with no shocks on rough rutty roads crossing mountains – relocating “one last time.”

jane-cumberland-gap

Cumberland Gap, from the summit, overlooking Claiborne County.

Perhaps Lazarus died mysteriously after suggesting “just one more move.”

Jane’s Children

We know beyond a doubt that Lazarus Jr., born in 1795, was Jane’s son, and we can presume that any children born after Lazarus were Jane’s as well since she was still living in 1830.

This 1826 McMinn County deed comes as close as we’re going to get to identifying Jane’s children.

Abner Lea and Others Obligation to William Dodson: State of Tennessee McMinn County. Know all men by these presents that the Abner Lea and Oliver Dodson and Eligha (sic) Dodson and William Dodson and Jessee Dodson and Lazrus Dodson and held and firmly bound in the penal sum of two thousand dollars which payment will and freely to be maid now(?) and each of us do bind our selves our heirs executor and administrators to the abounded signed sealed and delivered this day and date above written. This is our obligation is as such that has the above abound to appoint Abner Lea and Oliver Dodson to be the gardeans [guardians] of the estate of Lazarous Dodson dc’d also we authorize the said Abner Lea and Oliver Dodson to make to William Dodson a deed of Conveyeance to the part of land granted to the said William Dodson North East Quarter of Section 11 Township 5 Range first east of the meridian. Also that we confirm the sale made on the 13 day of May 1826 we also agree to give unto the heirs of David Dodson a certain piece or parcel of land designated to David Dodson by Lazarus Dodson de’d be it further understood that this is to be there part and all that they are entitled to by us, where unto we have set our hand and quill this 11 day of September 1826. Abner Lea Oliver Dodson Eligha Dodson Lazarous Dodson Jesse Dodson

Witnesses: Landford and Rhodes, William Dodson

Therefore, based on the above deed, and the information for each of the individuals below, I believe that Lazarus had 7 children that lived to adulthood, and therefore, Jane probably did as well. We know for sure that the youngest three are Jane’s children.

  • Jesse
  • Elijah
  • Mary
  • Oliver
  • Lazarus
  • David
  • William

Jesse Dodson was born in 1781 or earlier, as he was of age in March 1802 when he served as a juror in Claiborne Co., TN at the March term and also the June term when he was designated as “Little Jesse Dodson.” Junior or “little” in this context meant younger, not necessarily “son of Jesse.” This designation was no doubt for the purpose of distinguishing him from Rev. Jesse Dodson, a much older man who was also a resident of Claiborne County at this time. Jesse, the son of Rev. Jesse Dodson, was born in 1791, thus being too young to serve as a juror in 1802.

Prior to this, Jesse Dodson Jr. was “assessed for 1 white poll” and was included “among those living within the Indian Boundary for the year of 1797 which the county court of Grainger released the sheriff from the collection of taxes.”

Apparently these people, it had been determined, were living beyond the treaty line on Indian land and were not within the jurisdiction of Grainger Co. This part of Grainger became Claiborne in 1801 and included the area beneath Cumberland Gap that Lazarus eventually owned and was living on by 1800.

Jesse Dodson and Mary Stubblefield Dodson joined the Big Spring Baptist church “by experience” in March 1802. They received letters of dismissal from the church in Nov. 1805, but Jesse returned his letter in May 1806. Apparently in early 1807 Jesse got into a dispute with the church over a theological question which continued on through Sept 1807 when the question was dismissed. In Aug 1808, Jesse was “excluded” from the church for “withholding from the Church.” He is not again found in the records of Claiborne Co.

On June 20, 1811, one Jesse Dodson was licensed to trade with Indian tribes in Madison Co., Alabama, which borders Jackson County. Descendants of this man reportedly carry the oral tradition that he was an Indian trader. Jesse was said to be the oldest son of a large family of boys. Once when the Indian trader returned from one trip and was preparing to leave on another, the father implored his older son to take along his younger brother. The trader refused, saying the boy was so inexperienced that he would be killed by Indians. The father was adamant and insisted, so the trader relented and took the boy along. The brother was killed by Indians before Jesse’s eyes. From then on there were hard feelings between the Indian trader and his father.

This is a tradition which may have grown with the telling over the generations, but there could be some grains of truth in the tale. The land that became Jackson Co., Alabama was originally part of the Mississippi Territory and was occupied by the Cherokee until they gave it up by treaty on Feb. 27, 1819. It is certainly possible that Jesse Dodson, Indian Trader of the Mississippi territory, was a son of Lazarus Dodson, Sr.

A Jesse Dodson was on the 1830 census of Jackson Co., AL though the family statistics are puzzling. The household consisted of 2 males 5-10, 1 male 10-15, 1 male 20-30, 1 female under 5, 1 female 10-15, 1 female 30-40 and 1 female 50-60. This would not be Jesse Dodson the Indian Trader unless he were away from home on the date of the census enumeration or unless the census taker made an error in recording the statistics. We have no record of the children of this Jesse Dodson.

Elijah Dodson, based on the 1826 deed, was also a son of Lazarus Dodson Sr., although there were multiple Elijah Dodsons. Elijah appears to be connected in the records of Claiborne with Martin Dodson and Jehu Dodson, who are not mentioned in the 1826 deed. Elijah was born in 1790 in Hawkins County according to information in the Oregon Donation land claims. He died in Yamhill Co., Oregon in 1859. His first wife was Mary, surname unknown, whom he married March 12, 1807 in “Clayborn Co, Tn.” His second wife was Elizabeth, surname unknown, who died in the autumn of 1854. They were married in September of 1848 in Polk Co., Oregon.

In the June 1805 term of court, Claiborne Co., TN, Elijah along with Jehu was appointed as a road hand to work on a road of which Martin Dodson was overseer. It was a segment of the Kentucky road from the top of Wallen’s ridge to Blair’s creek. In August 1814 Elijah proved a wolf scalp he had killed in 1814 and at the August term 1815 he served as a juror. There are no records of Elijah in Claiborne beyond this date.

It is possible that Elijah eventually went to Henry Co., Ohio and Clay Co., Missouri before moving to Oregon where he made a claim to land in Yamhill Co. on which he lived from Feb 1848 until his death. It is believed that two of his sons were with him in Oregon. The record stated that his first wife left 6 children.

Mary Dodson

Abner Lea is certainly an interested party in the 1826 deed from the heirs of Lazarus Dodson. Abner is reported (although unverified) to have been married to a Mary Dodson on November 15, 1796 in Orange County, NC. The list of Lazarus’s heirs, which apparently includes Abner Lea, strongly suggests that Mary, Abner’s wife, was the daughter of Lazarus Sr. Abner’s birth date is reported to be about 1770 in Caswell County, NC, so too young to be a brother-in-law to Lazarus Sr. and about the right age to have married his daughter.

In 1810, Lazarus purchased land from Abner Lea in Claiborne County. If this is the Abner Lea born in 1770, he was about 40 in 1810. Abner Lea’s brother was James Lea, born in 1767, and in the winter of 1781/1782, Lazarus Dodson was encamped on the land patented by one James Lea in 1783 at the mouth of Richland Creek where it intersected with the Holston River, in what is now Grainger County. A James Lea family is also found on Country Creek in Caswell County, near where Raleigh and Lazarus Dodson lived before moving to the Holston River in 1778/1779.

Nothing is known about descendants of this couple.

Oliver Dodson was born August 31, 1794 in Hawkins Co., TN and died December 8, 1875 in McMinn Co., TN. He married Elizabeth, surname unknown who was born March 16, 1795 in Virginia and died Aug 7, 1883 in McMinn Co., TN. Both are buried in the Mt. Cumberland Cemetery, McMinn County.

jane-oliver-dodson

The first records of Oliver in Claiborne County are found in the court minutes in August 1815 when he proved he had killed a wolf and collected the bounty for the wolf scalp.

On January 16, 1820, Oliver was relieved as road overseer of the Kentucky Road from where Powell’s Valley Road intersects the same at Wallen’s field to the state line at Cumberland Gap. At the August term 1820 he exhibited the scalp of a wolf he had killed in Claiborne in 1819. In June, 1824 he sued William Hogan for a debt and was awarded damages and costs.

Sometime before or after these events, Oliver spent some time in Jackson Co., Alabama, where one of his sons, Marcellus M. Dodson, claimed to have been born in 1819. By 1830, Oliver was settled in McMinn Co., TN, where he lived the remainder of his life.

A chancery suit filed in McMinn in 1893 involving the estate of Oliver Dodson gives us a list of his children and some of his grandchildren. The suit, chancery case #1282 Lazarus Dodson (his son) vs Mary Jane Reynolds, stated that all were nonresidents of McMinn County except for Lazarus, who files for himself and as administrator of Oliver Dodson, and Mary Jane Reynolds. Some grandchildren lived in Knox Co., TN and the others lived in California, Texas, Missouri, Oregon, Montana, Georgia and other states.

David Dodson, based on the 1826 deed, is also a son of Lazarus Dodson, Sr. David is not in the records of Claiborne County except for the one time when he witnessed the deed to William Hogan from Lazarus Dotson and Abner Lea in May 1819.

If it is the same David Dodson who later appeared in McMinn Co., TN, then he was probably born between 1790 and 1800. David Dodson (Dotson) died in McMinn County before the 1826 deed. David’s widow was Fanny Dotson born 1790-1800 according to the 1830 census of McMinn Co. with a household consisting of herself, 1 male 5-10, 1 male 10-15, 1 female under 5, 2 females 5-10. She is living beside Jane Dodson, the widow of Lazarus Sr. and also beside William Dodson.

The land referenced in the 1826 deed is roughly the Cochran Cemetery area, shown below, near Englewood in McMinn Co.

David Dodson who died on August 15, 1826 is reported to be buried in this Cemetery, although he is not listed on FindAGrave, so his grave is apparently unmarked. It appears that David and Lazarus may have died in very close proximity to each other relative to their death dates. Poor Jane apparently lost a husband and a son within a very short time. This makes me wonder if there was an illness that took them both.

cochran cemetery

William Dotson was living next door to Jane Dodson in 1830. His household consisted of 1 male under 5, 1 male 20-30, (so born 1800-1810) 1 female under 5, 1 female 5-10 and 1 female 20-30. He was the administrator of the estate of David Dotson and seems a little old to be a son of David and Fanny, so could conceivably be a brother instead.

In 1826 in McMinn County, we find the land in Section 11, Township 5, Range first east of the meridian being conveyed to William by “guardians of the estate of Lazarus Dodson, deceased.”

jane-mcminn-1836

1836 McMinn County district map – The Rogers Connection – Myth or Fact by Sharon R. McCormack

If William is Jane’s son, and he was born about 1800, then she would have been about 30-40 at that time, and based on the birth years of her other children, closer to 40.

A William L. Dotson was appointed one of the arbitrators between the administrators of the estates of Thomas and William Burch, decd, in June of 1834. Thomas Burch died circa 1830 and had been the administrator of the estate of his father, William Burch, who died about 1828. One of the daughters of William Burch was Mrs. Aaron Davis, apparently, a former neighbor of Lazarus Dodson in Claiborne Co. Mentioned in Thomas Burch’s estate is a note against the estate of William Burch, decd and an unidentified piece of land in Claiborne Co. Aaron Davis was a member of Gap Creek Church of Claiborne Co. TN in 1818.

There were several William Dodsons in McMinn Co and it is not entirely possible to separate them without further records, but one of them was the son of Lazarus Sr.  William L. Dodson, believed to be the son of Lazarus, was born December 11, 1804 and died August 29, 1873. I sure would like to know what the L. stood for. Lazarus, or perhaps his mother’s maiden name?  William L. is buried in the Cochran Cemetery in McMinn County, along with Lazarus’s son David. It’s likely that Jane, Lazarus Sr.’s widow, is buried in the Cochran Cemetery as well, given that she was living adjacent to David and William in 1830, and William owned the land on which the cemetery stood.

It’s possible that Lazarus Sr. is buried in the Cochran Cemetery too, although based on the land purchase back in Claiborne County in 1826, it’s also possible that he is buried in Claiborne County or even back in Jackson County, Alabama. It has never been entirely clear whether the Lazarus that repurchased that Claiborne County land was Sr. or Jr. In any event, Claiborne County is where Lazarus Sr.’s marker rests today, set by descendants in 2011 in the Cottrell Cemetery on the land Lazarus once owned.

laz dodson marker

Unfortunately, Lazarus’s death date of 1826 was inscribed incorrectly as 1816, but by the time we saw the stone for the first time, it had already been set and it was too late to change the engraving.

Jane’s Other Children

If the children listed above are all Lazarus and Jane’s children, there were other children who were born and did not survive, given that children were typically born every 18 months to 2 years. The (approximate) birth dates of the children we can identify:

  • Jesse – 1781
  • Elijah – 1790
  • Mary – 1790+, so say 1792
  • Oliver – 1794
  • Lazarus – 1795
  • David – 1790-1800, so call it 1797
  • William – 1800-1810, so call it 1804 based on the cemetery record

This means there were children born in the following approximate years, in the following locations, that did not survive:

  • 1783 – probably on Dodson Creek
  • 1785 – probably on Dodson Creek
  • 1787 – probably on Dodson Creek
  • 1789 – probably on Dodson Creek
  • 1799 – probably on White Horn Branch
  • 1801 – in Claiborne County
  • 1803 – in Claiborne County
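The gap list above can be reproduced with a few lines of Python. This is only a sketch of the reasoning, assuming a flat 2-year spacing between births (the upper end of the "18 months to 2 years" rule of thumb) and using the approximate birth years from the earlier list. The function name missing_birth_years is hypothetical.

```python
def missing_birth_years(known_years, spacing=2):
    """List years where a birth would be expected between known births."""
    known = sorted(known_years)
    gaps = []
    for earlier, later in zip(known, known[1:]):
        # Step forward from each known birth until we reach the next one.
        year = earlier + spacing
        while year < later:
            gaps.append(year)
            year += spacing
    return gaps


# Approximate birth years: Jesse, Elijah, Mary, Oliver, Lazarus, David, William
known = [1781, 1790, 1792, 1794, 1795, 1797, 1804]
print(missing_birth_years(known))
# [1783, 1785, 1787, 1789, 1799, 1801, 1803]
```

Changing the spacing to something other than 2 years shifts the estimates, which is a good reminder that these are placeholders for children we can't document, not dates to enter into a tree.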

If Jane was 60-70 in 1830, she would have had to be closer to 70, or born about 1760 to be having children by 1781, so she would have been about 40 in 1800. It’s likely that she did not have any children after William born in 1804.

Of course, we don’t know when or where those children died, or were buried. It could have been where they were born or anyplace between there and McMinn County. One son could have been killed by Indians. If that is true, Jane must have been heartsick and I’d wager there were some rather unpleasant words between Jane and Lazarus, if indeed he encouraged Jesse to take the son who was killed along on the trading expedition.

All we know for sure is that no additional children were mentioned in the 1826 deed and unlike son David, they did not leave heirs. Given that Lazarus apparently did not have a will, or if he did, it has never been found, all of his living children or deceased children with heirs would have been mentioned in the deed.

If Jesse is Jane’s son and first child, that puts her marriage year at about 1780, so she either was married in North Carolina (or bordering Virginia) and her honeymoon was spent in a wagon bouncing its way to the new frontier, or she arrived to homestead on the Holston River with her parents, whoever they were, and soon thereafter married the handsome frontiersman, Lazarus Dodson. There were probably not many spousal candidates to choose from on the Holston River, so they were both probably very pleased to marry and begin their family.

Jane’s Death and Burial

Jane died sometime after 1830 and before 1840, based on the census. In 1830 she was living beside son David Dodson’s widow and William Dodson. Later deeds show that the land owned by William Dodson conveyed in the 1826 deed includes the Cochran Cemetery near present-day Englewood.

jane-cochran-cemetery-map

We know that William Dodson is buried there and David Dodson is reported to be buried there as well, along with several other Dodsons listed on FindAGrave. Jane seems to be surrounded by her descendants.

jane-cochran-internments-2

William L. Dodson, buried in the Cochran Cemetery, is shown on FindAGrave to be the son of Elisha Dodson and Mary Matlock. Elisha is shown to be the son of the Reverend Jesse Dodson, who was the preacher at Big Springs in Claiborne County. I don’t know if this is accurate, nor do I know what documentation was utilized for this information.

Unfortunately, the Reverend Jesse Dodson and Lazarus Dodson Sr. were both functioning in Claiborne County at the same time in the early 1800s. I do find it odd that Jesse’s son, Elisha, who died in Polk County in 1864, would have a son, William L., living beside Jane and David Dodson’s widow in McMinn County. It’s entirely possible that Elijah and Elisha, very similar names, have been confused and intermixed.

jane-cochran-aerial

The Cochran Cemetery, where Jane is probably buried, is shown above and below.

jane-cochran-from-road

County Road 479 is Cochran Cemetery Road.

jane-cochran-cemetery-road

The terrain is hilly but not mountainous and these rolling hills are what Jane saw in her last few years, living in McMinn County.

jane-cochran-distance

Mitochondrial DNA

If Mary Dodson who married Abner Lea is indeed the daughter of Jane Dodson, and if there are descendants who descend through all females to the current generation, we could test that descendant to obtain the mitochondrial DNA of Jane.

Mothers give their mitochondrial DNA to both genders of children, but only females pass it on. In order to find Jane’s mitochondrial DNA we’d need to find a descendant through her one female child, Mary – assuming that indeed Mary is Jane’s daughter.

Jane has been theorized to be a Honeycutt, given that Lazarus lived on Honeycutt Creek and had some interest in land conveyed in 1810, a Lea based on continued interaction with that family, and a Native woman since Lazarus was encamped with the Native people in 1781/1782. That may not be terribly likely since the Cherokee towns were destroyed, but then again, love has never been hindered terribly by warfare – and being married to a white man might have been as safe as a Native woman could be at that time.

Finding the haplogroup of Jane’s mitochondrial DNA would at least put the Native possibility, as small as it is, to sleep one way or the other, forever. Native American haplogroups are distinct from European, African or Asian haplogroups.

If you descend from Jane Dodson through daughter Mary through all females to the current generation, which can be male, please let me know. I have a DNA testing scholarship for you.

Autosomal DNA – The Dog’s Leg 

Can autosomal DNA help?

Well, theoretically, yes. However, in actuality, for me, today, the answer is “not exactly” or at least not in the way I intended.

I need to warn you, before we start, that this section is the proverbial dog’s leg – meaning we start in one place, and through a series of twists and turns, wind up someplace entirely different.  I debated removing this section – but I decided to leave it because of the educational value and discussion.  “The Dog’s Leg” would actually be an apt description of my entire 37+ years doing genealogy.

So, if you’re up for a bit of an adventure on twisty roads, let’s go!!!

jane-dodson-chart

The first problem we encounter is that Jane is several generations back in the tree, even to the most closely related descendants that have DNA tested at Family Tree DNA where we have chromosome data to work with.

Son Lazarus Jr. carried half of Jane’s DNA, and with each generation, roughly half of Jane’s DNA from the previous generation was lost. Today, descendants would carry anyplace from 3.12% to less than 1% of her DNA, so the chances of carrying the same segment that matches other descendants is progressively smaller in each generation.
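The halving described above is easy to put into numbers. Here is a minimal sketch that computes the average share of Jane’s DNA expected in a descendant, assuming an exact 50% pass-down in every generation. Real inheritance varies around that average, and the function name is mine, purely for illustration.

```python
def expected_share(generations_below_jane: int) -> float:
    """Average percent of Jane's DNA expected in a descendant.

    Assumes an exact halving in each generation, which is only the
    average; actual amounts vary from person to person.
    """
    if generations_below_jane < 1:
        raise ValueError("a descendant is at least one generation below Jane")
    return 100.0 / (2 ** generations_below_jane)


# Son Lazarus Jr. is 1 generation below Jane, his children are 2, and so on.
for generations in range(1, 8):
    share = expected_share(generations)
    print(f"{generations} generation(s) below Jane: about {share:.2f}%")
```

Five generations below Jane works out to the 3.12% mentioned above, and by seven generations the average drops below 1%.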

Furthermore, today, we have no way to tell which DNA that the descendants might carry is Jane’s DNA, even if it can be attributed to Lazarus and Jane and no common ancestor downstream. In other words, Jane’s DNA and Lazarus’s DNA combined in their children and to sort it back into Jane’s and Lazarus’s individually, we have to have the DNA of Lazarus’s ancestral Dodson line and Jane’s ancestral line to be able to sort their DNA into his and her buckets. Today, we have some people from Lazarus’s line, but obviously none from Jane’s, since we don’t know the identity of her parents or siblings.

To know whose DNA is whose, we’d need matching DNA from Lazarus Sr.’s siblings’ descendants, for example. That, we may be able to obtain. However, we don’t have that information about Jane.

For the record, the person labeled “Tester,” below, in red has not yet tested. That person descends from Lazarus Dodson Jr. through a second wife, while the green testers descend through a different wife. If the red tester were to test and match any of the green testers, we would know that their common DNA is that of Lazarus Jr. (and not his wife), assuming no other common ancestral lines.

jane-dodson-chart-2

While this would help us identify Dodson DNA in Lazarus Jr.’s generation, which means that DNA came from Lazarus Sr. and Jane as a couple, it doesn’t help us identify Jane’s DNA.

What Can We Tell About Jane?

So, what might we be able to tell about Jane?

I have access to the DNA results for Buster and Charlene (above) at Family Tree DNA, in addition to my own DNA results, of course.

I checked my own results for any Honeycutt, using the match search filter. There were two, and both also shared other surnames that I share. No particular common ancestral line or location was evident.

I also attempted to search for the surname Lea, but unfortunately, one cannot request only a particular match string, so the matches included any first or surname that included “lea.” Even more difficult, the matching Ancestral Surnames column often didn’t extend to the “L” names, so I can’t tell whether the matching surname is Lea or something else that includes “lea.”

That’s disappointing.

Next, let’s try Dodson.

You can see an example of the Ancestral Surnames below and only 4 rows maximum are displayed, even when expanded. The first three matches didn’t make it to the D surnames. I’m hoping this problem, which is relatively new, will be fixed soon.

jane-ancestral-surnames

I have 21 matches for Dodson, with 15 having trees. Let’s see if any of these people share my Dodson line.

Match # Common Ancestors
1 George Dodson and Margaret Dagord, Raleigh Dodson’s parents
3 Greenham Dodson and Eleanor Hightower (brother to George Dodson who married Margaret Dagord), also a Campbell line
4 George Dodson and Margaret Dagord, also a Crumley line
5 No common ancestor shown, but have Dodson in their ancestor surname list (5 matches)
6 Not far enough back to connect (5 matches)
7 Greenham Dodson and Eleanor Hightower

Some of my Dodson matches list Dodson in their Ancestral Surnames, but I don’t find an ancestor with the Dodson surname in their actual tree.

Of the people who do have Dodson ancestors in their trees, I find 4 where I can identify the common ancestor, and all 4 are some number of generations before Lazarus Sr. or even his father, Raleigh. In one case, there is also another identifiable ancestor with a different surname (Crumley) and in another line, a common surname (Campbell) but no common ancestor.  However, I’m brick walled on Campbell and the Campbell line did marry into the Dodson line in Lazarus Jr’s generation.

These Dodson matches are exciting, and here’s my dream list of what I’d like to do next:

  • What I’d really like to be able to do is to select all 21 of my matches and create a grid or matrix that shows me the people who match in common with me and any of them. Those would obviously be people who do NOT carry the Dodson surname, because people who do carry the ancestral (or current) Dodson surname are already listed in the 21.
  • Then, I’d like to see a matrix that shows me which of all these people match me and each other on common segments – and without having to push people through to the chromosome browser 5 at a time.
  • I’d like to be able to sort through all of the ICW matches (both Ancestral Surnames and direct ancestors in trees) to see if they have Honeycutt or Lea, or any other common surnames with each other. Because if the common surname isn’t Dodson, then perhaps it is Jane’s surname and finding a common surname among the matches might help me narrow that search or at least give me hints.
  • I’d like to be able to see who in my match list matches me on any particular given segment. In other words, let’s say that I match three individuals on a specific chromosome segment. I’d like to be able to search through my matches online for that information.
  • I’d like to be able to sort through my Dodson matches list by specific ancestor in their tree, like Lazarus Dodson. Today, I have to search each account’s tree individually, which isn’t bad if there are a few. However, with a common surname, there can be many pages of matches.

In the following example, I match 3 other Dodson descendants on a large segment of chromosome 5. This match is not trivial, as it’s 32 to 39 cM in length and approximately 7500 to 9000 SNPs.  These are very solid matches.

jane-chromosome-browser

  • The green person (JP) is stuck in Georgia in 1818 with a female Dodson birth, so the common ancestor is unknown.
  • The yellow person (CA) descends from George Dodson and Margaret Dagord, Raleigh’s parents, through another child.
  • The pink person (JP) has no tree but shows Dodson, Smoot and Durham in Virginia which tells me these are the early generations of the Dodson line. Thomas Dodson’s wife’s birth name was Durham and they were parents of both George and Greenham Dodson.  Smoot comes through the Durham line.

These individuals match me on the following segment of chromosome 5.

jane-segment-matches

Lazarus and Jane are 6 generations upstream from me, so George Dodson is 8 and Thomas Dodson is 9. That’s pretty amazing that this relatively large segment of DNA appears to have potentially been passed through the Dodson line for this many generations.  Note the word potentially.  We’re going to work on that word.

Regardless of how early or how many generations back, these matches are clearly relevant AND have been parentally phased to my father’s side, both by virtue of the Phased Family Matching (maternal and paternal buckets) at Family Tree DNA and by virtue of the fact that they don’t match my mother.

The next question is whether or not these people match each other, so to answer that question, I need to move to the matrix tool.

jane-matrix

Utilizing the matrix, we discover that they DO match each other. What we don’t know is whether they match each other on that particular segment of chromosome 5, but given the size of the segment involved, and that they do match each other, the chances are very good that they do match on the same segment.

Of course, since the yellow match is unquestionably my line of Dodson DNA and because my common ancestor with this person is upstream of both Lazarus and Raleigh, then this matching DNA segment on chromosome 5 cannot be Jane’s DNA.

Therefore, I’d really like to know who else I match on this specific segment, particularly on my father’s side, so that I can see if there are any additional proven Dodson lineage matches on this segment.  This would allow me to properly assign the people who match me on my father’s side on this segment as being “Dodson line,” even if I can’t tell for sure who the common ancestor is.

That function, of course, doesn’t exist via searching at Family Tree DNA today, but what I can do is to check my Master DNA Spreadsheet that I’ve downloaded to see who else matches me on that segment.  If you would like to know how to download and manage your spreadsheet, see the Concepts Series of articles.

My Master DNA Spreadsheet shows 23 additional matches on this segment on my father’s side, 8 cM or larger. Two of them stand out: one at 32.96 cM indicates a common Durham lineage, and another at 33.75 cM indicates a Dodson lineage.  Therefore, this segment can reasonably confidently be assigned to the Dodson side of the tree, and probably to the Durham line – an unanticipated bonus if it holds.
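For readers who keep their spreadsheet in a form a script can read, the lookup I just described might be sketched like this. Everything here is an assumption for illustration: the field names, the placeholder segment positions, and the sample rows are made up and are not my actual data or any vendor’s file format.

```python
from dataclasses import dataclass


@dataclass
class SegmentMatch:
    name: str
    chromosome: int
    start: int      # segment start position (placeholder values below)
    end: int        # segment end position
    cm: float       # segment size in centimorgans
    side: str       # "paternal", "maternal" or "unknown"


def overlapping_matches(rows, chromosome, start, end, min_cm=8.0, side="paternal"):
    """Return matches on the given side whose segment overlaps start..end."""
    return [
        row for row in rows
        if row.chromosome == chromosome
        and row.side == side
        and row.cm >= min_cm
        and row.start < end      # segment begins before the target ends
        and row.end > start      # and ends after the target begins
    ]


# Hypothetical rows, for illustration only.
rows = [
    SegmentMatch("Match A", 5, 100_000_000, 139_000_000, 33.0, "paternal"),
    SegmentMatch("Match B", 5, 105_000_000, 138_000_000, 21.5, "paternal"),
    SegmentMatch("Match C", 5, 101_000_000, 109_000_000, 7.2, "paternal"),
    SegmentMatch("Match D", 5, 110_000_000, 130_000_000, 15.0, "maternal"),
    SegmentMatch("Match E", 7, 100_000_000, 139_000_000, 30.0, "paternal"),
    SegmentMatch("Match F", 5, 10_000_000, 20_000_000, 12.0, "paternal"),
]

hits = overlapping_matches(rows, chromosome=5, start=100_000_000, end=140_000_000)
for hit in hits:
    print(hit.name, hit.cm)
```

Keep in mind that a list like this only shows who overlaps with me on that segment. It does not show whether those people also match each other there, which still has to be checked separately.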

jane-dodson-pedigree

I would need additional evidence before positively assigning this segment to the Durham line, given the distance back in time.  I would need to be sure my Durham match doesn’t have a hidden Dodson match someplace, and that their tree is fairly complete.

While this little exercise helps me to identify Dodson DNA and possibly Durham DNA, it hasn’t done anything to help me identify Jane’s DNA.

Of course, if I had matches to people with Honeycutt or Lea DNA, then that might be another matter and we would have a hypothesis to prove or disprove. The same would be true if I could search for common surnames, other than Dodson, among my matches’ trees and Ancestral Surnames.

I’m going to try one more cousin, Buster, who is generationally closer than I am to see if he matches a Honeycutt at Family Tree DNA, by any chance. Nope, no Honeycutt.

I also checked at Ancestry, just to see if I match anyone there who also descends from Lazarus Sr., and I do not. I do, however, match 2 people through Lazarus’s father Raleigh, 15 people through Raleigh’s parents, George Dodson and Margaret Dagord and 14 people through Raleigh’s grandfather, Thomas Dodson.

If I match this many, it sure makes me wonder how many from this line have tested and that I don’t match. Of course, at Ancestry, they have no chromosome browser or matrix types of tools (without building your own pseudo-matrix using the Shared Matches feature), so there is no way to discern if your matches also match each other and there is no way to know if they match you and/or each other on the same segments.

The Ancestor Library – My DNA Daydream

I dream of the day when we will be able to recreate the DNA profiles of our ancestors and store them in an “Ancestor Library.” That way, when I identify the DNA on chromosome 5, for example, to be that of George Dodson and Margaret Dagord, I can assign it to that couple in the “ancestor library.” Then, if this segment on chromosome 5 is either partially or wholly Durham, I can move it up one generation and then to the Durham ancestral line in the library.

Let me explain what this “Ancestor Library” will do for us.

Let’s say we know that a piece of DNA on chromosome 1 that was inherited from Lazarus and Jane is not Dodson DNA, and let’s say we have ideal circumstances.  We know this DNA came from Lazarus and Jane because this large common matching segment is found in three descendants through three different children. We already know what the Dodson progenitor DNA in this location looks like, because it’s proven and already in the library, and our Lazarus/Jane DNA on chromosome 1 doesn’t match the Dodson DNA in the Ancestor Library. Therefore, by process of logical deduction, we know that this segment on chromosome 1 has to be Jane’s DNA. Finally, we have an identifiable piece of Jane.

Now, let’s say we can submit this sequence of Jane’s DNA into the “Ancestor Library” to see which “ancestors” in the library match that sequence of DNA.

There could be several of course who descend from the same ancestral couple.

We obtain our “Ancestor Library” match list of potential ancestors that could be ours based on Jane’s DNA segment, and we see that indeed, there is a Honeycutt line and our DNA matches that line. Depending on how many other ancestral lines also match, the segment size, etc., this would be sufficient to send me off scurrying to research Honeycutt, even if the results don’t “prove” beyond a shadow of a doubt who Jane’s parents were.  Ancestor Library matches most assuredly would give us more to work with on that magical day, sometime in the future, than we have to work with today. In fact, the Ancestor Library would actively break down brick walls.

Ok, I’ve returned from my daydream now…but I do wonder how many years it will be until that DNA future with the “Ancestor Library” comes to pass and we’ll be able to fill in the blanks in our family tree utilizing DNA to direct our records research, at least in some cases.

The Rest of the Story – My Secret

Ok, I’ll let you in on my secret. Truth is that I’ve been working on the Ancestor Library proof of concept for over 2 years now.  In November 2016, I gave a presentation at the Family Tree DNA Conference titled “Crumley Y DNA to Autosomal Case Study – Kicking It Up a Notch” about reconstructing James Crumley from 50 of his descendants.  Just to give you an idea, this is a partial reconstruction utilizing Kitty Cooper’s tools, not quite as she intended.

james-crumley-reconstruct

Just to let you know, ancestor reconstruction can be done. It may be a daydream today in the scope that I’m dreaming, but one day, it will happen. Jane’s ancestry may someday be within reach once we develop the ability to functionally “subtract out” Lazarus’s DNA from Jane’s descendants.

In Summary

I wish we had some small snippet of Jane’s voice, or even Jane’s identifiable DNA, but we don’t. All we can do is to surmise from what we do know.

We know that Jane moved from place to place, and apparently a non-trivial number of times.

Jane’s life can be divided into frontiers.

  • Birth to 1778 – 1780 – Virginia or North Carolina, probably
  • 1780 – 1797 – Holston River between Honeycutt and Dodson Creeks, present day Hawkins County, Tennessee
  • 1797 – 1800 – White Horn Fork, near Bull’s Gap, then Hawkins County, Tennessee, today, probably Hamblen County
  • 1800 – 1819 – Gap Creek beneath the Cumberland Gap, Claiborne County, Tennessee spanning the old Indian boundary line
  • 1819 – before 1830 – Jackson County, Alabama when the Cherokee ceded their land
  • 1830 – 1840/death – McMinn County, Tennessee

The longest time Jane spent in one place was about 19 years in Claiborne County where Lazarus was a member of the Gap Creek Baptist Church by 1805.  Jane was very likely a member there too, as it would be extremely unusual for a woman not to attend the same church where her husband was a member of some status.

It’s actually rather amazing that we were able to track Jane and family at all, considering the number of places they lived and given the distances that they moved. While we do hold onto them by the tiniest threads – surely we must know how many of the threads of the fabric of Jane’s life are now irrecoverably lost – like pieces of a quilt, frayed with wear and gone.

Jane had at least three children that lived, and probably a 4th since Oliver was born the year before Lazarus. She may have had 7 living children if all of Lazarus’s children were hers too – meaning she was Lazarus’s only wife. We have nothing to indicate that either Lazarus or Jane were married more than once, except for how common death was on the frontier. If all of Lazarus’s children were also Jane’s, then Jane likely had as many children that died as lived, presuming she was married for her entire child-bearing life. Losing every other child is a nightmare thought for a mother, especially today – but it was more or less expected before the days of modern medicine. Let that soak in for a minute.

One of Jane’s children may have been killed by Indians. If this is true, then that episode may have affected Jane’s relationship with her husband and potentially her son Jesse, too. Unfortunately, records during this time are scant and many are missing entirely. We will probably never know if Jesse, the Indian trader, was Jane’s son.

I hope that some day, in some way, we’ll be able to unravel the mystery of Jane’s surname. In order for that to happen, new records will either need to appear, perhaps in the form of a nice juicy chancery suit, or a family Bible needs to be found, or DNA technology needs to improve combined with some serendipity and really good luck.

In the meantime, I’ll remember Jane as the weary and infinitely patient frontier wife, repeatedly packing up and moving from one frontier to the next, for roughly 45 years, whether she really wanted to or not.

I will think of her gently caring for her grandchildren after Elizabeth Campbell Dodson died, perhaps wiping their tears as their mother was buried in a grave lost to time, not long after Jane lost her own husband, Lazarus and son David. 1826 and 1827 were grief-filled years for Jane, with one loss after another.  She buried far too many close family members.

I will think of Jane living in McMinn County in her final years, between her son David’s widow, Fanny, and their children, and son William’s family. Between those two families, Jane had 7 grandchildren living within earshot: 3 toddlers, 3 between 5 and 10 and one boy about 11 or 12. He was probably a big help to Jane and Fanny both.

I hope Jane’s golden years were punctuated by the ring of grandchildren’s voices and laughter as she gathered them around her chair in front of the fireplace on crisp winter evenings, or on the shady porch on hot summer days.  She would have regaled them with stories “from a time far away and long ago” about her journeys in wagons, across rivers before bridges and through wars into uncharted territory, where Indians and soldiers both camped in their yard at Dodson’s Ford more than 50 years earlier. I can hear her now, can’t you? “Why, they were right outside, chile.” Their eyes must have been as big as saucers. Grandma Dodson’s life was amazing!

I hope Jane’s death, when it came, was swift and kind. Ironically, she outlived her adventure-loving husband by at least 4 years and maybe more than 14. And I will always wonder if Lazarus died after suggesting to Jane that they move one more time!

Jane can never regret not having taken that leap of faith, not having followed the elusive dream, be it hers or his, or both, because it seems that they always went…well, maybe except for that one last time.

I surely hope Jane is resting in peace, because while her life is infinitely interesting to us today, with her progressive migrations to “the next” frontier, it appears that rest is probably not something Jane got much of during her lifetime.

______________________________________________________________

Disclosure

I receive a small contribution when you click on some of the links to vendors in my articles. This does NOT increase the price you pay but helps me to keep the lights on and this informational blog free for everyone. Please click on the links in the articles or to the vendors below if you are purchasing products or DNA testing.

Thank you so much.

Concepts – How Your Autosomal DNA Identifies Your Ancestors

Welcome to the concepts articles. This series presents the concepts of genetic genealogy, not the details.  I have written a lot of detailed articles, and I’ve linked to them for those of you who want more.  My suggestion would be to read this article once, entirely, all the way through to understand the concepts with continuity of thought, then go back and reread and click through to other articles if you are interested.

All of autosomal genetic genealogy is based on these concepts of inheritance and matching, so if you don’t understand these, you won’t understand your matches, how they work, why, or how to interpret what they do or don’t tell you.

The Question 

Someone sent me this question about autosomal DNA matching.

“I do not quite understand how the profiles can be identified to an ancestor since that person is not among us to provide DNA material for “testing” and “comparison.””

That’s a really good question, so let’s take a shot at answering this question conceptually.

Do you have a cat or dog?

Chica Pixie Quilt

I bet I could tell if I could see your clothes, your house, your car or your quilt. Why or how?  Because pets shed, and try as you might, it’s almost impossible to get rid of the evidence.  I went to the dentist once and he looked at my sweatshirt and said, “German Shepherd?” I laughed.

When your ancestor had children, he or she shed their DNA, half of it, and it’s still being passed down to their descendants today, at least for the next several generations. Let’s look, conceptually, at how and why this works.

In the following diagram, on the left you can see the generations and the relationships of the people both to the ancestor and to each other.

Our ancestor, John Doe, married a wife, J, and had 2 children. Gender of the children, in this example, does not matter.

Everyone receives one strand of DNA from their mother and one from their father. If you’re interested in more detail about how this works, click here.

In our example below, I’ve divided this portion of John’s DNA into 10 buckets. Think of each of these buckets as having maybe 100 units of John’s DNA.  You can think of pebbles in the bucket if you’d like.  Our DNA is passed, often, in buckets where the group of pebbles sticks together, at least for a while.  Since this is conceptual, our buckets are being passed intact from generation to generation.

John’s mother’s strand of DNA has her buckets labeled MATERNALAB and I’ve colored them pink to make them easy to identify. John’s father’s strand of DNA has his buckets labeled FATHERSIDE and is blue.  Important note – buckets don’t come color coded pink or blue in nature – you have no idea which side your DNA comes from.  Yes, I know, that’s a cruel joke of Nature.

John married J, call her Jean. Jean also has 2 strands of DNA, one from her mother and one from her father, but in order to simplify things, rather than have two colors for the wives, I’d rather you think of this generationally, so the wives in each generation only have one color. That way you can see the wives’ DNA mixing with the husbands’ by just looking at the colors. Jean’s color is lavender.

DNA “Shedding” to Descendants

So, now let’s look at how John “sheds” his DNA to his two children and their descendants – and why that matters to us several generations later.

Concept ancestor inheritance

Please note that you can click on any of the graphics to make them larger.

In the examples above, the DNA that is descended in each generational line from John is bolded within the colored square. I also intentionally put it at the beginning and ends of the segments for each child so it’s easy to see.

In the first generation, John’s children each receive one strand of DNA from their mother, J, and one from John. John’s DNA that his children receive is mixed between John’s father’s DNA and John’s mother’s DNA – roughly 50-50 – but not exactly.

At every position, or bucket, during recombination, John’s child will receive either the value in John’s Mom’s bucket or the value at that location in John’s Dad’s bucket.  In other words, the two strands of John’s parents’ DNA, in John, combine to make one strand to give to one of John’s children.  Each time this happens, for each child conceived, the recombination happens differently.

Concept Ancestor inheritance John

In this case, John’s children will receive either the M or the F in bucket one.  In buckets 2 and 3, the values are the same.  This happens in DNA.  The child’s bucket 4 will receive either an E or H.  Bucket 5 an R or E.  Bucket 6 an N or R.  And so forth.  This is how recombination works, and it’s called “random recombination” meaning that we have not been able to discern why or how the values for each location are chosen.
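If it helps to see the bucket-by-bucket choice in action, here is a small conceptual sketch. It treats every bucket as an independent pick between John’s mother’s strand and John’s father’s strand, which is a simplification I’m making for illustration, not a model of real biology. The function name and the seed value are my own choices.

```python
import random

MATERNAL = "MATERNALAB"   # John's mother's strand (pink)
PATERNAL = "FATHERSIDE"   # John's father's strand (blue)


def recombine(maternal: str, paternal: str, rng: random.Random) -> str:
    """Build one strand for a child by picking each bucket from either parent strand."""
    if len(maternal) != len(paternal):
        raise ValueError("both strands need the same number of buckets")
    return "".join(rng.choice((m, p)) for m, p in zip(maternal, paternal))


# A seeded generator keeps the example repeatable; each call yields a new mix,
# just as each child conceived receives a different recombination.
rng = random.Random(2017)
for child in (1, 2):
    print(f"Child {child}: {recombine(MATERNAL, PATERNAL, rng)}")
```

Whatever the picks, buckets 2 and 3 always come out as A and T, because both of John’s parents carry the same values there.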

Is recombination really random, like a coin flip?  No, it’s not.  How do we know?  Because clumps of neighboring DNA often stick together, in buckets – in fact we call them “sticky segments.”  Groups of buckets stick together too, sometimes for many generations.  So it’s not entirely random, but we don’t know why.

What we do know for absolutely positively sure is that every person gets exactly half of each parent’s DNA on chromosomes 1-22.  We are not talking about the X chromosome (meaning chromosome 23) or mitochondrial DNA or Y DNA.  Different topics entirely relative to inheritance.

You can see which buckets received which of John’s parents’ DNA based on the pink and blue color coding and the letters in the buckets.  Jean’s contribution to Child 1 and Child 2 would be mixed between her parents’ DNA too.

Concept Ancestor inheritance child

In the first generation, Child 1 received 6 pink buckets (segments) from John’s mother and 4 blue buckets from John’s father – MATHERSLAB.  Child 2 received 6 blue buckets from John’s father and 4 pink buckets from John’s mother – FATHERALAB.  On the average, each child received half of their grandparents’ DNA, but in reality, neither child received exactly half.

Note that Child 1 and 2 did not necessarily receive the SAME buckets, or segments, from John’s parents, although Child 1 and 2 did receive some buckets with the same letters in them – ATHERLAB.
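You can check the bucket counts above for yourself with a few lines. This sketch rebuilds both children’s strands from a list of picks and then compares them position by position. The pick strings are my reading of the colors described above; for buckets 2 and 3 the letters are identical on both sides, so the pick there only affects the color count, not the letter.

```python
MATERNAL = "MATERNALAB"   # John's mother's strand (pink)
PATERNAL = "FATHERSIDE"   # John's father's strand (blue)


def build_strand(maternal: str, paternal: str, picks: str) -> str:
    """picks holds one letter per bucket: 'M' takes the pink bucket, 'F' the blue one."""
    if not (len(maternal) == len(paternal) == len(picks)):
        raise ValueError("strands and picks must all have the same length")
    return "".join(m if pick == "M" else p
                   for m, p, pick in zip(maternal, paternal, picks))


def shared_letters(a: str, b: str) -> str:
    """Letters that two strands have in common at the same bucket position."""
    return "".join(x for x, y in zip(a, b) if x == y)


picks_child_1 = "MMMFFFFMMM"   # 6 pink buckets, 4 blue
picks_child_2 = "FFFFFFMMMM"   # 6 blue buckets, 4 pink

child_1 = build_strand(MATERNAL, PATERNAL, picks_child_1)
child_2 = build_strand(MATERNAL, PATERNAL, picks_child_2)

print(child_1, picks_child_1.count("M"), "pink,", picks_child_1.count("F"), "blue")
print(child_2, picks_child_2.count("M"), "pink,", picks_child_2.count("F"), "blue")
print("Same letter in the same bucket:", shared_letters(child_1, child_2))
```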

If you’re thinking, “lies, damned lies and statistics” right about now, and chuckling, or maybe crying, join the club!

Looking at the next generation, John’s Child 1 married K and John’s Child 2 married O.

Child 1

Let’s follow John’s pink and blue DNA in Child 1’s descendants.  Child 1 marries K and had one child.

Concept Ancestor inheritance grandchild child 1 c

John’s grandchild by Child 1 has one strand of DNA from Child 1’s spouse K and one strand from Child 1 which reads MATJJJJLAB. You can see this above: one of the grandchild’s strands is entirely K’s, and the other, contributed by Child 1, is a mixture of John’s DNA and his wife J’s DNA.  In this case, for these buckets, only John’s mother’s pink DNA is being passed on.  John’s father’s buckets 4-7 were “washed out” in this generation and the grandchild received grandmother J’s DNA instead.

Concept Ancestor inheritance gen 4 c

In the next generation, John’s grandchild (generation 3) married P and had the great-grandchild, generation 4. Generation 4 of course carries a strand from wife P, but the Doe strand now carries less of John’s original DNA – just MA and LAB at the beginning and end of the grouping.

Concept Ancestor inheritance gen 5 c

In the next generation, 5, the great-great-grandchild, you can see that now John Doe’s inherited DNA is reduced to only the AB at the right end.

Concept Ancestor inheritance gen 6

In the next generation, 6, the great-great-great-grandchild carries only the A, and in the final generation, below, the great-great-great-great-grandchild, none of John Doe’s DNA is carried by that descendant in those particular buckets.

Concept Ancestor inheritance gen 7 c1

Can there be exceptions? Yes.  Buckets are sometimes split and the X chromosome functions differently in male and female inheritance.  But this example is conceptual, remember.

You always receive exactly half of each parent’s DNA, but after that, how much you receive of an ancestor’s DNA isn’t 50% in each generation. You saw that in our examples where both Child 1 and Child 2 inherited a little more or a little less than 50% of each of John’s parents’ DNA.

Sometimes groups of DNA buckets are passed together and sometimes, the entire bucket or group of buckets is replaced by DNA from “the next generation.”

To summarize for Child 1, from John Doe to generation 7, each generation inherited the following buckets from John, with the final generation, 7, having none of John’s DNA at all – at least not in these buckets.

concept child 1
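To put numbers on that summary, here is a short sketch that counts John’s buckets in each generation of Child 1’s line, as described above, and sets them beside what an exact halving would predict. The dictionary is my transcription of the letters named in the text, and the “expected” column is only the long-run average, not a rule.

```python
# John's buckets still carried in Child 1's line, by generation
# (generation 2 is Child 1, generation 7 the great-great-great-great-grandchild).
johns_buckets_child_1_line = {
    2: "MATHERSLAB",
    3: "MATLAB",     # from MATJJJJLAB, with J's four buckets left out
    4: "MALAB",
    5: "AB",
    6: "A",
    7: "",
}

TOTAL_BUCKETS = 10

for generation, buckets in johns_buckets_child_1_line.items():
    actual = len(buckets)
    # Average expectation: all 10 in generation 2, then half as many each generation.
    expected = TOTAL_BUCKETS / (2 ** (generation - 2))
    print(f"Generation {generation}: {actual} bucket(s) from John, "
          f"average expectation {expected:g}")
```

The actual counts drift above and below the halving line, which is exactly the point: the 50% figure is an average, not a guarantee.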

Now, let’s see how the DNA of Child 2 stacks up.

Child 2

You can follow the same sequence with Child 2. In the first generation, Child 2 has one strand of John’s DNA and one of their mother’s, J.

Child 2 marries O, Olive, and their child has one strand from O, and one from Child 2.

Concept Ancestor inheritance gen 3 c 2

Child 2’s contributed strand is comprised of DNA from John Doe and mother J.  You can see that the grandchild has FA and ALAB from John, but the rest is from mother J.

Concept Ancestor inheritance gen 4 c 2

The grandchild (above) married Q, and their child, generation 4, inherited most of John’s DNA but did drop the A.

Concept Ancestor inheritance gen 5 c 2

Sometimes the DNA between generations is passed on without recombining or dividing.  That’s what happened in generation 5, above, and 6 below, with John’s DNA.

Concept Ancestor inheritance gen 6 c 2

Generations 5 (great-great-grandchild) and 6 (great-great-great-grandchild) both receive John’s F and AB, above.

Concept Ancestor inheritance gen 7 c 2

However, in the 7th generation, the great-great-great-great-grandchild only inherits John’s bucket with B.  The F and A were both lost in this generation.

concept child 2

This summary of the inheritance of John’s DNA in Child 2’s descendants shows that in the 7th generation, that individual carries only one of John’s DNA buckets, the rest having been replaced by the DNA of other ancestors during the inheritance recombination process in each generation.

Half the Equation

The question of how we can identify the profile of a person long dead is not answered by this inheritance diagram, at least not directly – because we don’t KNOW how much of John’s DNA we inherited, or which parts.  In fact, that’s what we’re trying to figure out – but first, we had to understand how we inherited DNA from John (or not).

Matching with known family members is what actually identifies John’s DNA and tells us which parts of our DNA, if any, come from John.

Generational Matching

Let’s say I’m in the first cousin generation and I’m comparing my autosomal DNA against my first cousin from this line.  First cousins share common grandparents.

Assuming that they are genetically my first cousin (meaning no adoptions or misattributed parentage), they are close enough that we can both be expected to carry some of our common ancestor’s DNA. I wrote an in-depth article about first cousin matching here, but for our purposes, we know genetically that first cousins are going to match each other virtually 100% of the time.

Here’s a nice table from the Family Tree DNA Learning Center that tells us what to expect in terms of matching at different relationship levels.

concept generational match

The reason our autosomal DNA matches with our reasonably close relatives is because we share a common ancestor and have inherited at least a bucket, if not more than one bucket, of the same DNA from that ancestor.

That’s the ONLY WAY our DNA could match at the bucket level, given what we know about inheritance. The only way to get our DNA is through our parents who got their DNA through their parents and ancestors.  Now, could we share more than one common ancestral line?  Yes – but that’s beyond conceptual, for now.  And yes, there is identical by chance (IBC), which in general doesn’t apply to close relatives or to larger buckets. If you want to read more about this complex subject, which is far beyond conceptual, click here.

Now, let’s see how we identify our ancestor’s DNA!

Concept ancestor matching

Let’s look at people of the same generation of descendants and see how they match each other.  In other words, now we’re going to read left to right across rows, to compare the descendants of child 1 and 2.  Previously, we were reading up and down columns where we tracked how DNA was inherited.

Bolded letters in buckets indicate buckets inherited from John, just like before, but buckets with black borders indicate buckets shared with a cousin from John’s other child.  In other words, a black border means the DNA of those two people match at that location.  Let’s look at the grandchildren of John compared to each other.  John’s grandchildren are first cousins to each other.

Concept ancestor matching 1c

Our first cousins match on 4 different buckets of John’s DNA: A, L, A and B.  In this case, you can see that both individuals inherited some DNA from John that they don’t share with each other, such as their first letters, M for Child 1 and F for Child 2.  Because they inherited different pieces from John, pieces he in turn inherited from different ancestors, the letters in their individual buckets are different, so the first cousins don’t match each other on that particular bucket.

Yes, the first cousins also match on wife J’s DNA, but we’re just talking about John’s DNA here.  Now, let’s look at the next generation.

Concept ancestor matching 2c

Our second cousins, above, match on four buckets of John’s DNA.  Yes, the A bucket was inherited from John’s Mom in one case, and John’s Dad in the other case, but because the letter in the bucket is the same, when matching, we can’t tell them apart.  We only “know” which side they came from, in this case, because I told you and colored the buckets pink and blue to illustrate inheritance.  All the actual software matching comparison has to go by is the letter in the bucket.  Software doesn’t have the luxury of “knowing” because in nature there is no pink and blue color coding.
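To make the "letter in the bucket" idea concrete, here is a minimal sketch of a position-by-position comparison. The strands below are illustrative only, loosely modeled on the first cousin comparison (first buckets M and F, then A, L, A and B), and the function name `shared_buckets` is made up for this example. It is not how any vendor's matching software actually works.

```python
def shared_buckets(strand_a, strand_b):
    """Return (position, letter) pairs where both strands hold the same letter."""
    return [
        (position, letter_a)
        for position, (letter_a, letter_b) in enumerate(zip(strand_a, strand_b))
        if letter_a == letter_b
    ]


# Illustrative strands only: the first buckets differ (M versus F)
# and the remaining four buckets hold the same letters.
cousin_from_child_1 = ["M", "A", "L", "A", "B"]
cousin_from_child_2 = ["F", "A", "L", "A", "B"]

matches = shared_buckets(cousin_from_child_1, cousin_from_child_2)
print(matches)
```

Notice that the comparison only sees letters and positions. Nothing in the data says whether an A came from John’s mother or John’s father, which is exactly why a match by itself can’t tell the two apart.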

concept ancestor matching 3c

Our third cousins, above, match, but share only A and B, half as much of John’s DNA as the second cousins shared with each other.

Concept ancestor matching 4c

Our 4th cousins, above, are lucky and do match, although they share only one bucket, A, of John’s DNA, which happens to have come from John’s mother.

Concept ancestor matching 5c

By the time you get down to the 5th cousins, meaning the 7th generation, the cousins’ luck has run out, because these two 5th cousins don’t match on any of John’s DNA.

Most 5th cousins don’t match and few 6th cousins match, at least not at the default thresholds used by the testing companies – but some do.  Remember, we’re dealing with matching predictions based on averages, and actual individual DNA inheritance varies quite a bit.  Lies, damned lies and statistics again!

You can adjust your own thresholds at GedMatch, in essence making the buckets smaller, so increasing the odds that the contents of the buckets will match each other, but also increasing the chances that the matches will be by chance.  Again, beyond conceptual.
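As a rough sketch of what adjusting a threshold does, the example below filters a made-up list of shared segments by a minimum size in cM. The names, sizes and threshold values are all hypothetical, chosen only for illustration, and are not the defaults of GedMatch or any testing company.

```python
def matches_at_threshold(segments, min_cm):
    """Keep only the shared segments that are at least min_cm centimorgans long."""
    return [segment for segment in segments if segment["cm"] >= min_cm]


# Hypothetical shared segments; the names and sizes are invented.
shared_segments = [
    {"match": "Cousin A", "cm": 24.1},
    {"match": "Cousin B", "cm": 9.3},
    {"match": "Cousin C", "cm": 5.2},
    {"match": "Cousin D", "cm": 3.4},
]

higher_threshold = matches_at_threshold(shared_segments, min_cm=7.0)
lower_threshold = matches_at_threshold(shared_segments, min_cm=3.0)
print(len(higher_threshold), len(lower_threshold))
```

Lowering the threshold lets more matches through, but the code has no way of knowing which of the extra, smaller matches are real and which are by chance. That judgment still falls to you.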

concept buckets inherited

While this is how matching worked for these comparisons of descendants, it will work differently for every pair of people who are compared against each other, because they will have, or not have, inherited different (or the same) buckets of DNA from their common ancestor.  That’s a long way of saying, “your mileage will vary.”  These are concepts and guidelines, not gospel.

Now, let’s put these guidelines to work.

Matching People at Testing Companies

Ok, so now let’s say that I match Sarah Doe. I don’t know Sarah, but we are predicted to be in the 2nd or 3rd cousin range, based on the amount of our DNA that we share.

As we know, based on our inheritance example, amounts of shared DNA can vary, but we may well be able to discern a common ancestor by looking at our pedigree charts.

Sure enough, given her surname as a hint, we determined that John Doe is our common ancestor.

That’s great evidence that this DNA was passed from John to both of us, but to prove it takes a third person matching us on the same segment, also with proven descent from John Doe. Why?  Because Sarah and I might also have a second common genealogical line, maybe even one we don’t know about, that isn’t on our pedigree chart. And yes, that happens far more than you’d think. To prove that the DNA Sarah Doe and I share is actually from John Doe or his wife, we need a third confirmed pedigree and DNA match on that same bucket.

A Circle is Not a Bucket

If you just said to yourself, “but Ancestry doesn’t show me buckets,” you’re right – and a Circle is not a bucket.  A Circle means you match someone’s DNA and have a common tree ancestor.  It doesn’t mean that you or any Circle members match each other on the same buckets. A bucket, or segment information, tells you if you match on common buckets, which buckets, and exactly where.  You could match all those people in a Circle on different buckets, from completely different ancestors, and there is no way to know without bucket information.  If you want to read more about the effects of lack of tools at Ancestry, click here and here.

Proof

Matching multiple people on the same buckets who descend from the same ancestor through different children is proof – and it’s the only proof except for very close relatives, like siblings, grandparents, first cousins, etc.  Circles are hints, good hints, but far, far from proof.  For buckets, you’ll need to transfer your Ancestry results to Family Tree DNA or to GedMatch, or preferably, both.

I’m most comfortable when a triangulation group, meaning a minimum of three individuals who match on the same buckets and share an ancestor, includes descendants of at least two different children of John.  In other words, the first common ancestor of the matches is John and his wife, not their children.

Cross generational matches 2

The reason I like the different children aspect is because it removes the possibility that people are really matching on the downstream wives’ DNA, and not John’s.  In other words, if you have two people who match on the same buckets, A and B above, who both descend from John’s Child 1 who married K, they also will share K’s DNA in addition to John’s.  So their match to each other on a given bucket might be through K’s side and not through John’s line at all.

Let’s say A and B have a match to unknown person D who is adopted and doesn’t know their pedigree chart.  We can’t make the presumption that D’s match to A and B is through John Doe and Jean, because it might be through K.

However, a match on the same buckets to a third person, C, who descends through John’s other child, Child 2, assuming that Child 2 did not also marry into K’s (or any other common) line, assures that the shared DNA of A and B (and C) in that bucket is through John or his wife – and therefore D’s match to A, B and C on that bucket is also through the same common ancestor.
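The reasoning about A, B, C and D can be sketched as a simple check. This is only an illustration under stated assumptions: everyone in the group is assumed to already match everyone else on the same bucket, the `Match` class and `supports_common_ancestor` function are hypothetical names invented here, and the rule encoded is the one described above (at least three people with proven descent, through at least two different children).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Match:
    name: str
    # Which child of John this person descends from; None if unknown (e.g. adopted).
    child_line: Optional[str]


def supports_common_ancestor(group):
    """True if the group has 3+ people with known descent via 2+ different children."""
    known = [match for match in group if match.child_line is not None]
    lines = {match.child_line for match in known}
    return len(known) >= 3 and len(lines) >= 2


a = Match("A", "Child 1")
b = Match("B", "Child 1")
c = Match("C", "Child 2")
d = Match("D", None)  # adopted, pedigree unknown

print(supports_common_ancestor([a, b, d]))     # A and B alone could be matching through K
print(supports_common_ancestor([a, b, c, d]))  # C, from Child 2, rules out K's line
```

The sketch leaves out the caveat noted above: it can’t detect whether Child 2 also married into K’s line or another common line. That still has to be checked in the pedigree charts.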

If you want to read more about triangulation, click here.

In Summary

The beauty of autosomal DNA is that we carry some readily measurable portion of each of our ancestors, at least the ones in the past several generations, in us. The way we identify that DNA and assign it to that ancestor is through matching to other people on the same segments (buckets) that also descend from the same ancestor or ancestral line, preferably through different children.  In many cases, after time, you’ll have a lot more than 3 people descended from that ancestral line matching on that same bucket.  Your triangulation group will grow to many – all connected by the umbilical lifethread of your common ancestors’ DNA.

As you can see, the concepts, taken one step at a time are pretty simple, but the layers of things that you need to think about can get complex quickly.

I’ll tell you though, this is the most interesting puzzle you’ll ever work on!  It’s just that there’s no picture on the box lid.  Instead, it’s an incredible real-life journey to the frontiers inside of you to discover your ancestors and their history :)  Your ancestors are waiting for you, although my ancestors have a perverse sense of humor and we play hide and seek from time to time!

______________________________________________________________

Disclosure

I receive a small contribution when you click on some of the links to vendors in my articles. This does NOT increase the price you pay but helps me to keep the lights on and this informational blog free for everyone. Please click on the links in the articles or to the vendors below if you are purchasing products or DNA testing.

Thank you so much.


The Best and Worst of 2015 – Genetic Genealogy Year in Review

2015 Best and Worst

For the past three years I’ve written a year-in-review article. You can see just how much the landscape has changed in the 2012, 2013 and 2014 versions.

This year, I’ve added a few specific “award” categories for people or firms that I feel need to be specially recognized as outstanding in one direction or the other.

In past years, some news items, announcements and innovations turned out to be very important, like the Genographic Project and GedMatch, and others, well, not so much. Who among us has tested their full genome today, for example, or even their exome?  And what would you do with that information if you did?

And then there are the deaths, like the Sorenson database and Ancestry’s own Y and mitochondrial data base. I still shudder to think how much we’ve lost at the corporate hands of Ancestry.

In past years, there have often been big new announcements facilitated by new technology. In many ways, the big fish have been caught in a technology sense.  Those big fish are autosomal DNA and the Big Y types of tests.  Both of these have created an avalanche of data and we, personally and as a community, are still trying to sort through what all of this means genealogically and how to best utilize the information.  Now we need tools.

This is probably illustrated most aptly by the expansion of the Y tree.

The SNP Tsunami Growing Pains Continue

2015 snp tsunami

Going from 800+ SNPs in 2012 to more than 35,000 SNPs today has introduced its own set of problems. First, there are multiple trees in existence, completely or partially maintained by different organizations for different purposes.  Needless to say, these trees are not in sync with each other.  The criteria for adding a SNP to the tree are decided by the owner or steward of that tree, and there is no agreement as to the definition of a valid SNP or how many instances of that SNP need to be in existence to be added to the tree.

This angst has been taking place for the most part outside of the public view, but it exists just the same.

For example, 23andMe still uses the old haplogroup names like R1b which have not been used in years elsewhere. Family Tree DNA is catching up with updating their tree, working with haplogroup administrators to be sure only high quality, proven SNPs are added to branches.  ISOGG maintains another tree (one branch shown above) that’s publicly available, utilizing volunteers per haplogroup and sometimes per subgroup.  Other individuals and organizations maintain other trees, or branches of trees, some very accurate and some adding a new “branch” with as little as one result.

The good news is that this will shake itself out. Personally, I’m voting for the more conservative approach for public reference trees to avoid “pollution” and a lot of shifting and changing downstream when it’s discovered that the single instance of a SNP is either invalid or in a different branch location.  However, you have to start with an experimental or speculative tree before you can prove that a SNP is where it belongs or needs to be moved, so each of the trees has its own purpose.

The full trees I utilize are the Family Tree DNA tree, available for customers, the ISOGG tree and Ray Banks’ tree which includes locations where the SNPs are found when the geographic location is localized. Within haplogroup projects, I tend to use a speculative tree assembled by the administrators, if one is available.  The haplogroup admins generally know more about their haplogroup or branch than anyone else.

The bad news is that this situation hasn’t shaken itself out yet, and due to the magnitude of the elephant at hand, I don’t think it will anytime soon. As this shuffling and shaking occurs, we learn more about where the SNPs are found today in the world, where they aren’t found, which SNPs are “family” or “clan” SNPs and the timeframes in which they were born.

In other words, this is a learning process for all involved – albeit a slow and frustrating one. However, we are making progress and the tree becomes more robust and accurate every year.

We may be having growing pains, but growing pains aren’t necessarily a bad thing and are necessary for growth.

Thank you to the hundreds of volunteers who work on these trees, and in particular, to Alice Fairhurst who has spearheaded the ISOGG tree for the past nine years. Alice retired from that volunteer position this year and is shown below after receiving two much-deserved awards for her service at the Family Tree DNA Conference in November.

2015 ftdna fairhurst 2

Best Innovative Use of Integrated Data

2015 smile

Dr. Maurice Gleeson receives an award this year for the best genealogical use of integrated types of data. He has utilized just about every tool he can find to wring as much information as possible out of Y DNA results.  Not only that, but he has taken great pains to share that information with us in presentations in the US and overseas, and by creating a video, noted in the article below.  Thanks so much Maurice.

Making Sense of Y Data

Estes pedigree

The advent of massive amounts of Y DNA data has been both wonderful and perplexing. We as genetic genealogists want to know as much about our family as possible, including what the combination of STR and SNP markers means to us.  In other words, we don’t want two separate “test results” but a genealogical marriage of the two.

I took a look at this from the perspective of the Estes DNA project. Of course, everyone else will view those results through the lens of their own surname or haplogroup project.

Estes Big Y DNA Results
http://dna-explained.com/2015/03/26/estes-big-y-dna-results/

At the Family Tree DNA Conference in November, James Irvine and Maurice Gleeson both presented sessions on utilizing a combination of STR and SNP data and various tools in analyzing their individual projects.

Maurice’s presentation was titled “Combining SNPs, STRs and Genealogy to build a Surname Origins Tree.”
http://www.slideshare.net/FamilyTreeDNA/building-a-mutation-history-tree

Maurice created a wonderful video that includes a lot of information about working with Y DNA results. I would consider this one of the very best Y DNA presentations I’ve ever seen, and thanks to Maurice, it’s available as a video here:
https://www.youtube.com/watch?v=rvyHY4R6DwE&feature=youtu.be

You can view more of Maurice’s work at:
http://gleesondna.blogspot.com/2015/08/genetic-distance-genetic-families.html

James Irvine’s presentation was titled “Surname Projects – Some Fresh Ideas.”
http://www.slideshare.net/FamilyTreeDNA/y-dna-surname-projects-some-fresh-ideas

Another excellent presentation discussing Y DNA results was “YDNA maps Scandinavian Family Trees from Medieval Times and the Viking Age” by Peter Sjolund.
http://www.slideshare.net/FamilyTreeDNA/ydna-maps-scandinavian-family-trees-from-medieval-times-and-the-viking-age

Peter’s session at the genealogy conference in Sweden this year was packed. This photo, compliments of Katherine Borges, shows the room and the level of interest in Y-DNA and the messages it holds for genetic genealogists.

sweden 2015

This type of work is the wave of the future, although hopefully it won’t be so manually intensive. However, the process of discovery is by definition laborious.  From this early work will one day emerge reproducible methodologies, the fruits of which we will all enjoy.

Haplogroup Definitions and Discoveries Continue

A4 mutations

Often, haplogroup work flies under the radar today and gets dwarfed by some of the larger citizen science projects, but this work is fundamentally important. In 2015, we made discoveries about haplogroups A4 and C, for example.

Haplogroup A4 Unpeeled – European, Jewish, Asian and Native American
http://dna-explained.com/2015/03/05/haplogroup-a4-unpeeled-european-jewish-asian-and-native-american/

New Haplogroup C Native American Subgroups
http://dna-explained.com/2015/03/11/new-haplogroup-c-native-american-subgroups/

Native American Haplogroup C Update – Progress
http://dna-explained.com/2015/08/25/native-american-haplogroup-c-update-progress/

These aren’t the only discoveries, by any stretch of the imagination. For example, Mike Wadna, administrator for the Haplogroup R1b Project reports that there are now over 1500 SNPs on the R1b tree at Family Tree DNA – which is just about twice as many as were known in total for the entire Y tree in 2012 before the Genographic project was introduced.

The new Y DNA SNP Packs being introduced by Family Tree DNA which test more than 100 SNPs for about $100 will go a very long way in helping participants obtain haplogroup assignments further down the tree without doing the significantly more expensive Big Y test. For example, the R1b-DF49XM222 SNP Pack tests 157 SNPs for $109.  Of course, if you want to discover your own private line of SNPs, you’ll have to take the Big Y.  SNP Packs can only test what is already known and the Big Y is a test of discovery.

Best Blog

2015 smile

Jim Bartlett, hands down, receives this award for his new and wonderful blog, Segmentology.

Making Sense of Autosomal DNA

segmentology

Our autosomal DNA results provide us with matches at each of the vendors and at GedMatch, but what do we DO with all those matches and how do we utilize the genetic match information? How do we translate those matches into ancestral information?  And once we’ve assigned a common ancestor to a match with an individual, how does that match affect other matches on that same segment?

2015 has been the year of sorting through the pieces and defining terms like IBS (identical by state, which covers both identical by population and identical by chance) and IBD (identical by descent). There has been a lot written this year.

Jim Bartlett, a long-time autosomal researcher has introduced his new blog, Segmentology, to discuss his journey through mapping ancestors to his DNA segments. To the best of my knowledge, Jim has mapped more of his chromosomes than any other researcher, more than 80% to specific ancestors – and all of us can leverage Jim’s lessons learned.

Segmentology.org by Jim Bartlett
http://dna-explained.com/2015/05/12/segmentology-org-by-jim-bartlett/

When you visit Jim’s site, please take a look at all of his articles. He and I and others may differ slightly in the details of our approach, but the basics are the same and his examples are wonderful.

Autosomal DNA Testing – What Now?
http://dna-explained.com/2015/08/07/autosomal-dna-testing-101-what-now/

Autosomal DNA Testing 101 – Tips and Tricks for Contact Success
http://dna-explained.com/2015/08/11/autosomal-dna-testing-101-tips-and-tricks-for-contact-success/

How Phasing Works and Determining IBS vs IBD Matches
http://dna-explained.com/2015/01/02/how-phasing-works-and-determining-ibd-versus-ibs-matches/

Just One Cousin
http://dna-explained.com/2015/01/11/just-one-cousin/

Demystifying Autosomal DNA Matching
http://dna-explained.com/2015/01/17/demystifying-autosomal-dna-matching/

A Study Using Small Segment Matching
http://dna-explained.com/2015/01/21/a-study-utilizing-small-segment-matching/

Finally, A How-To Class for Working with Autosomal Results
http://dna-explained.com/2015/02/10/finally-a-how-to-class-for-working-with-autosomal-dna-results/

Parent-Child Non-Matching Autosomal DNA Segments
http://dna-explained.com/2015/05/14/parent-child-non-matching-autosomal-dna-segments/

A Match List Does Not an Ancestor Make
http://dna-explained.com/2015/05/19/a-match-list-does-not-an-ancestor-make/

4 Generation Inheritance Study
http://dna-explained.com/2015/08/23/4-generation-inheritance-study/

Phasing Yourself
http://dna-explained.com/2015/08/27/phasing-yourself/

Autosomal DNA Matching Confidence Spectrum
http://dna-explained.com/2015/09/25/autosomal-dna-matching-confidence-spectrum/

Earlier in the year, there was a lot of discussion and dissension about the definition of and use of small segments. I utilize them, carefully, generally in conjunction with larger segments.  Others don’t.  Here’s my advice.  Don’t get yourself hung up on this.  You probably won’t need or use small segments until you get done with the larger segments, meaning low-hanging fruit, or unless you are doing a very specific research project.  By the time you get to that point, you’ll understand this topic and you’ll realize that the various researchers agree about far more than they disagree, and you can make your own decision based on your individual circumstances. If you’re entirely endogamous, small segments may just make you crazy.  However, if you’re chasing a colonial American ancestor, then you may need those small segments to identify or confirm that ancestor.

It is unfortunate, however, that all of the relevant articles are not represented in the ISOGG wiki, allowing people to fully educate themselves. Hopefully this can be updated shortly with the additional articles, listed above and from Jim Bartlett’s blog, published during this past year.

Recreating the Dead

James Crumley overlapping segments

James and Catherine Crumley segments above, compliments of Kitty Cooper’s tools

As we learn more about how to use autosomal DNA, we have begun to reconstruct our ancestors from the DNA of their descendants. Not as in cloning, but as in attributing DNA found in multiple descendants that originates from a common ancestor, or ancestral couple.  The first foray into this arena was GedMatch with their Lazarus tool.

Lazarus – Putting Humpty Dumpty Back Together Again
http://dna-explained.com/2015/01/14/lazarus-putting-humpty-dumpty-back-together-again/

I have taken a bit of a different proof approach wherein I recreated an ancestor, James Crumley, born in 1712 from the matching DNA of roughly 30 of his descendants.
http://www.slideshare.net/FamilyTreeDNA/roberta-estes-crumley-y-dna

I did the same thing, on an experimental smaller scale about a year ago with my ancestor, Henry Bolton.
http://dna-explained.com/2014/11/10/henry-bolton-c1759-1846-kidnapped-revolutionary-war-veteran-52-ancestors-45/

This is the way of the future in genetic genealogy, and I’ll be writing more about the Crumley project and the reconstruction of James Crumley in 2016.

Lump Of Coal Award(s)

2015 frown

This category is a “special category” that is exactly what you think it is. Yep, this is the award no one wants.  We have a tie for the Lump of Coal Award this year between Ancestry and 23andMe.

Ancestry Becomes the J.R. Ewing of the Genealogy World

2015 Larry Hagman

Attribution : © Glenn Francis, http://www.PacificProDigital.com

Some of you may remember J.R. Ewing on the television show called Dallas that ran from 1978 through 1991. J.R. Ewing, a greedy and unethical oil tycoon was one of the main characters.  The series was utterly mesmerizing, and literally everyone tuned in.  We all, and I mean universally, hated J.R. Ewing for what he unfeelingly and selfishly did to his family and others.  Finally, in a cliffhanger end of the season episode, someone shot J.R. Ewing.  OMG!!!  We didn’t know who.  We didn’t know if J.R. lived or died.  Speculation was rampant.  “Who shot JR?” was the theme on t-shirts everyplace that summer.  J.R. Ewing, over time, became the man all of America loved to hate.

Ancestry has become the J.R. Ewing of the genealogy world for the same reasons.

In essence, in the genetic genealogy world, Ancestry introduced a substandard DNA product, which remains substandard years later with no chromosome browser or comparison tools that we need….and they have the unmitigated audacity to try to convince us we really don’t need those tools anyway. Kind of like trying to convince someone with a car that they don’t need tires.

Worse, yet, they’ve introduced “better” tools (New Ancestor Discoveries), as in tools that were going to be better than a chromosome browser.  New Ancestor Discoveries “gives us” ancestors that aren’t ours. Sadly, there are many genealogists being led down the wrong path with no compass available.

Ancestry’s history of corporate stewardship is abysmal and continues with the obsolescence of various products and services including the Sorenson DNA database, their own Y and mtDNA database, MyFamily and most recently, Family Tree Maker. While the Family Tree Maker announcement has been met with great gnashing of teeth and angst among their customers, there are other software programs available.  Ancestry’s choice to obsolete the DNA databases is irreversible and a huge loss to the genetic genealogy community.  That information is lost forever and not available elsewhere – a priceless, irreplaceable international treasure intentionally trashed.

If Ancestry had not bought up nearly all of the competing resources, people would be cancelling their subscriptions in droves to use another company – any other company. But there really is no one else anymore.  Ancestry knows this, so they have become the J.R. Ewing of the genealogy world – uncaring about the effects of their decisions on their customers or the community as a whole.  It’s hard for me to believe they have knowingly created such wholesale animosity within their own customer base.  I think having a job as a customer service rep at Ancestry would be an extremely undesirable job right now.  Many customers are furious and Ancestry has managed to upset pretty much everyone one way or another in 2015.

AncestryDNA Has Now Thoroughly Lost Its Mind
https://digginupgraves.wordpress.com/2015/04/02/ancestrydna-has-now-thoroughly-lost-its-mind/

Kenny, Kenny, Kenny
https://digginupgraves.wordpress.com/2015/04/10/kenny-kenny-kenny/

Dear Kenny – Any Suggestions for our New Ancestor Discoveries?
https://digginupgraves.wordpress.com/2015/04/13/dear-kenny-any-suggestions-for-our-new-ancestor-discoveries/

RIP Sorenson – A Crushing Loss
http://dna-explained.com/2015/05/15/rip-sorenson-a-crushing-loss/

Of Babies and Bathwater
http://www.legalgenealogist.com/blog/2015/05/17/of-babies-and-bathwater/

Facts Matter
http://legalgenealogist.com/blog/2015/05/03/facts-matter/

Getting the Most Out of AncestryDNA
http://dna-explained.com/2015/02/02/getting-the-most-out-of-ancestrydna/

Ancestry Gave Me a New DNA Ancestor and It’s Wrong
http://dna-explained.com/2015/04/03/ancestry-gave-me-a-new-dna-ancestor-and-its-wrong/

Testing Ancestry’s Amazing New Ancestor DNA Claim
http://dna-explained.com/2015/04/07/testing-ancestrys-amazing-new-ancestor-dna-claim/

Dissecting AncestryDNA Circles and New Ancestors
http://dna-explained.com/2015/04/09/dissecting-ancestrydna-circles-and-new-ancestors/

Squaring the Circle
http://legalgenealogist.com/blog/2015/03/29/squaring-the-circle/

Still Waiting for the Holy Grail
http://legalgenealogist.com/blog/2015/04/05/still-waiting-for-the-holy-grail/

A Dozen Ancestors That Aren’t aka Bad NADs
http://dna-explained.com/2015/04/14/a-dozen-ancestors-that-arent-aka-bad-nads/

The Logic and Birth of a Bad NAD (New Ancestor Discovery)
http://dna-explained.com/2015/08/12/the-logic-and-birth-of-a-bad-nad-new-ancestor-discovery/

Circling the Shews
http://legalgenealogist.com/blog/2015/05/24/circling-the-shews/

Naughty Bad NADs Sneak Home Under Cover of Darkness
http://dna-explained.com/2015/08/24/naughty-bad-nads-sneak-home-under-cover-of-darkness/

Ancestry Shared Matches Combined with New Ancestor Discoveries
http://dna-explained.com/2015/08/28/ancestry-shared-matches-combined-with-new-ancestor-discoveries/

Ancestry Shakey Leaf Disappearing Matches: Now You See Them – Now You Don’t
http://dna-explained.com/2015/09/24/ancestry-shakey-leaf-disappearing-matches-now-you-see-them-now-you-dont/

Ancestry’s New Amount of Shared DNA – What Does It Really Mean?
http://dna-explained.com/2015/11/06/ancestrys-new-amount-of-shared-dna-what-does-it-really-mean/

The Winds of Change
http://legalgenealogist.com/blog/2015/11/08/the-winds-of-change/

Confusion – Family Tree Maker, Family Tree DNA and Ancestry.com
http://dna-explained.com/2015/12/13/confusion-family-tree-maker-family-tree-dna-and-ancestry-com/

DNA: good news, bad news
http://legalgenealogist.com/blog/2015/01/11/dna-good-news-bad-news/

Check out the Alternatives
http://legalgenealogist.com/blog/2015/12/09/check-out-the-alternatives/

GeneAwards 2015
http://www.tamurajones.net/GeneAwards2015.xhtml

23andMe Betrays Genealogists

2015 broken heart

In October, 23andMe announced that it had reached an agreement with the FDA about reporting some health information such as carrier status and traits to their clients. As a part of or perhaps as a result of that agreement, 23andMe is dramatically changing the user experience.

In some aspects, the process will be simplified for genealogists with a universal opt-in. However, other functions are being removed and the price has doubled.  New advertising says little or nothing about genealogy and is entirely medically focused.  That, combined with the move of the trees offsite to MyHeritage, seems to signal that 23andMe has lost any commitment they had to the genetic genealogy community, entirely abandoning the group that pulled their collective bacon out of the fire. This is greatly ironic in light of the fact that it was the genetic genealogy community, through their testing recommendations, that kept 23andMe in business for the two years from November of 2013 through October of 2015 when the FDA had the health portion of their testing shut down.  This is a mighty fine thank you.

As a result of the changes at 23andMe relative to genealogy, the genetic genealogy community has largely withdrawn their support and recommendations to test at 23andMe in favor of Ancestry and Family Tree DNA.

Kelly Wheaton, writing on the Facebook ISOGG group along with other places, has very succinctly summed up the situation:
https://www.facebook.com/groups/isogg/permalink/10153873250057922/

You can also view Kelly’s related posts from earlier in December and their comments at:
https://www.facebook.com/groups/isogg/permalink/10153830929022922/
and…
https://www.facebook.com/groups/isogg/permalink/10153828722587922/

My account at 23andMe has not yet been converted to the new format, so I cannot personally comment on the format changes yet, but I will write about the experience in 2016 after my account is converted.

Furthermore, I will also be writing a new autosomal vendor testing comparison article after their new platform is released.

I Hate 23andMe
https://digginupgraves.wordpress.com/2015/06/14/i-hate-23andme/

23andMe to Get Makeover After Agreement With FDA
http://dna-explained.com/2015/10/21/23andme-to-get-a-makeover-after-agreement-with-fda/

23andMe Metamorphosis
http://throughthetreesblog.tumblr.com/post/131724191762/the-23andme-metamorphosis

The Changes at 23andMe
http://legalgenealogist.com/blog/2015/10/25/the-changes-at-23andme/

The 23andMe Transition – The First Step
http://dna-explained.com/2015/11/05/the-23andme-transition-first-step-november-11th/

The Winds of Change
http://legalgenealogist.com/blog/2015/11/08/the-winds-of-change/

Why Autosomal Response Rate Really Does Matter
http://dna-explained.com/2015/02/24/why-autosomal-response-rate-really-does-matter/

Heads Up About the 23andMe Meltdown
http://dna-explained.com/2015/12/04/heads-up-about-the-23andme-meltdown/

Now…and not now
http://legalgenealogist.com/blog/2015/12/06/now-and-not-now/

Cone of Shame Award

Another award this year is the Cone of Shame award which is also awarded to both Ancestry and 23andMe for their methodology of obtaining “consent” to sell their customers’, meaning our, DNA and associated information.

Genetic Genealogy Data Gets Sold

Unfortunately, 2015 has been the year that the goals of both 23andMe and Ancestry have become clear in terms of our DNA data. While 23andMe has always been at least somewhat focused on health, Ancestry never was previously, but has now hired a health officer and teamed with Calico for medical genetics research.

Now, both Ancestry and 23andMe have made research arrangements, and both state in the release and privacy verbiage that all customers must electronically sign (or click through) when purchasing a DNA test that they can sell, at minimum, your anonymized DNA data without any further consent. There is no opt-out at that level.

They can also use our DNA and data internally, meaning that 23andMe’s dream of creating and patenting new drugs can come true based on your DNA that you submitted for genealogical purposes, even if they never sell it to anyone else.

In an interview in November, 23andMe CEO Anne Wojcicki said the following:

23andMe is now looking at expanding beyond the development of DNA testing and exploring the possibility of developing its own medications. In July, the company raised $79 million to partly fund that effort. Additionally, the funding will likely help the company continue with the development of its new therapeutics division. In March, 23andMe began to delve into the therapeutics market, to create a third pillar behind the company’s personal genetics tests and sales of genetic data to pharmaceutical companies.

Given that the future of genetic genealogy at these two companies seems to be tied to the sale of their customers’ genetic and other information, which, based on the above, is very clearly worth big bucks, I feel that the fact that these companies are selling and utilizing their customers’ information in this manner should be fully disclosed. Even more appropriate, the DNA information should not be sold or utilized for research without the kind of informed consent that would traditionally be used for research subjects.

Within the past few days, I wrote an article providing specifics and calling on both companies to do the following:

  1. At a minimum, create transparent, understandable verbiage that informs their customers, before the end of the purchase process, that their DNA will be sold or utilized for unspecified research with the intention of financial gain and that there is no opt-out. However, a preferred plan of action would be a combination of 2 and 3, below.
  2. Implement a plan where customer DNA can never be utilized for anything other than delivering the services that the consumers purchased unless a separate, fully informed consent authorization is signed for each research project, without coercion, meaning that the client does not have to sign the consent to obtain any of the DNA testing or services.
  3. Immediately stop utilizing the DNA information and results from customers who have already tested until they have signed an appropriate informed consent form for each research project in which their DNA or other information will be utilized.

And Now Ancestry Health
http://dna-explained.com/2015/06/06/and-now-ancestry-health/

Opting Out
http://legalgenealogist.com/blog/2015/07/26/opting-out/

Ancestry Terms of Use Updated
http://legalgenealogist.com/blog/2015/07/07/ancestry-terms-of-use-updated/

AncestryDNA Doings
http://legalgenealogist.com/blog/2015/07/05/ancestrydna-doings/

Heads Up About the 23andMe Meltdown
http://dna-explained.com/2015/12/04/heads-up-about-the-23andme-meltdown/

23andMe and Ancestry and Selling Your DNA Information
http://dna-explained.com/2015/12/30/23andme-ancestry-and-selling-your-dna-information/

Citizen Science Leadership Award

The Citizen Science Leadership Award this year goes to Blaine Bettinger for initiating the Shared cM Project, a crowdsourced project which benefits everyone.

Citizen Scientists Continue to Push the Edges of the Envelope with the Shared cM Project

Citizen scientists, in the words of Dr. Doron Behar, “are not amateurs.” In fact, citizen scientists have been contributing mightily and pushing the edge of the genetic genealogy frontier consistently now for 15 years.  This trend continues, with new discoveries and new ways of viewing and utilizing information we already have.

For example, Blaine Bettinger’s Shared cM Project was begun in March and continues today. This important project has provided real life information as to the real matching amounts and ranges between people of different relationships, such as first cousins, for example, as compared to theoretical match amounts.  This wonderful project produced results such as this:

2015 shared cM

I don’t think Blaine initially expected this project to continue, but it has and you can read about it, see the rest of the results, and contribute your own data here. Blaine has written several other articles on this topic as well, available at the same link.

Am I Weird or What?
http://dna-explained.com/2015/03/07/am-i-weird-or-what/

Jim Owston analyzed fourth cousins and other near distant relationships in his Owston one-name study:
https://owston.wordpress.com/2015/08/10/an-analysis-of-fourth-cousins-and-other-near-distant-relatives/

I provided distant cousin information in the Crumley surname study:
http://www.slideshare.net/FamilyTreeDNA/roberta-estes-crumley-y-dna

I hope more genetic genealogists will compile and contribute this type of real world data as we move forward. If you have compiled something like this, the Surname DNA Journal is peer reviewed and always looking for quality articles for publication.

Privacy, Law Enforcement and DNA

Unfortunately, in May, a situation in which Y DNA was utilized in a murder investigation was reported in a sensationalist “scare” type fashion. This provided cause, ammunition or an excuse for Ancestry to remove the Sorenson data base from public view.

I find this exceedingly, exceedingly unfortunate. Given Ancestry’s history with obsoleting older data bases instead of updating them, I’m suspecting this was an opportune moment for Ancestry to be able to withdraw this database, removing a support or upgrade problem from their plate and blame the problem on either law enforcement or the associated reporting.

I haven’t said much about this situation, in part because I’m not a lawyer and in part because the topic is so controversial and there is no possible benefit since the damage has already been done. Unfortunately, nothing anyone can say or has said will bring back the Sorenson (or Ancestry) data bases and arguments would be for naught.  We already beat this dead horse a year ago when Ancestry obsoleted their own data base.  On this topic, be sure to read Judy Russell’s articles and her sources as well for the “rest of the story.”

Privacy, the Police and DNA
http://legalgenealogist.com/blog/2015/02/08/privacy-the-police-and-dna/

Big Easy DNA Not So Easy
http://legalgenealogist.com/blog/2015/03/15/big-easy-dna-not-so-easy/

Of Babies and Bathwater
http://www.legalgenealogist.com/blog/2015/05/17/of-babies-and-bathwater/

Facts Matter
http://legalgenealogist.com/blog/2015/05/03/facts-matter/

Genetic genealogy standards from within the community were already in the works prior to the Idaho case, referenced above, and were subsequently published as guidelines.

Announcing Genetic Genealogy Standards
http://thegeneticgenealogist.com/2015/01/10/announcing-genetic-genealogy-standards/

The standards themselves:
http://www.thegeneticgenealogist.com/wp-content/uploads/2015/01/Genetic-Genealogy-Standards.pdf

Ancient DNA Results Continue to Amass

“Moorleiche3-Schloss-Gottorf” by Commander-pirx at de.wikipedia – Own work. Licensed under CC BY-SA 3.0 via Commons

Ancient DNA is difficult to recover and even more difficult to sequence, reassembling tiny little blocks of broken apart DNA into an ancient human genome.

However, each year we see a few more samples and we are beginning to repaint the picture of human population movement, which is different than we thought it would be.

One of the best summaries of the ancient ancestry field was Michael Hammer’s presentation at the Family Tree DNA Conference in November titled “R1B and the Peopling of Europe: an Ancient DNA Update.” His slides are available here:
http://www.slideshare.net/FamilyTreeDNA/r1b-and-the-people-of-europe-an-ancient-dna-update

One of the best ongoing sources for this information is Dienekes’ Anthropology Blog. He covered most of the new articles and there have been several.  That’s the good news and the bad news, all rolled into one. http://dienekes.blogspot.com/

I have covered several that were of particular interest to the evolution of Europeans and Native Americans.

Yamnaya, Light Skinned Brown Eyed….Ancestors?
http://dna-explained.com/2015/06/15/yamnaya-light-skinned-brown-eyed-ancestors/

Kennewick Man is Native American
http://dna-explained.com/2015/06/18/kennewick-man-is-native-american/

Botocudo – Ancient Remains from Brazil
http://dna-explained.com/2015/07/02/botocudo-ancient-remains-from-brazil/

Some Native Americans had Oceanic Ancestors
http://dna-explained.com/2015/07/22/some-native-americans-had-oceanic-ancestors/

Homo Naledi – A New Species Discovered
http://dna-explained.com/2015/09/11/homo-naledi-a-new-species-discovered/

Massive Pre-Contact Grave in California Yields Disappointing Results
http://dna-explained.com/2015/10/20/mass-pre-contact-native-grave-in-california-yields-disappointing-results/

I know of several projects involving ancient DNA that are in process now, so 2016 promises to be a wonderful ancient DNA year!

Education

Many, many new people discover genetic genealogy every day and education continues to be an ongoing and increasing need. It’s a wonderful sign that all major conferences now include genetic genealogy, many with a specific track.

The European conferences have done a great deal to bring genetic genealogy testing to Europeans. European testing benefits those of us whose ancestors were European before immigrating to North America.  This year, ISOGG volunteers staffed booths and gave presentations at genealogy conferences in Birmingham, England, Dublin, Ireland and in Nyköping, Sweden, shown below, photo compliments of Catherine Borges.

ISOGG volunteers

Several great new online educational opportunities arose this year, outside of conferences, for which I’m very grateful.

DNA Lectures YouTube Channel
http://dna-explained.com/2015/04/26/dna-lectures-youtube-channel/

Allen County Public Library Online Resources
http://dna-explained.com/2015/06/03/allen-county-public-library-online-resources/

DNA Data Organization Tools and Who’s on First
http://dna-explained.com/2015/09/08/dna-data-organization-tools-and-whos-on-first/

Genetic Genealogy Educational Resource List
http://dna-explained.com/2015/12/03/genetic-genealogy-educational-resource-list/

Genetic Genealogy Ireland Videos
https://www.youtube.com/channel/UCHnW2NAfPIA2KUipZ_PlUlw

DNA Lectures – Who Do You Think You Are
https://www.youtube.com/channel/UC7HQSiSkiy7ujlkgQER1FYw

Ongoing and Online Classes in how to utilize both Y and autosomal DNA
http://www.dnaadoption.com/index.php?page=online-classes

Education Award

Family Tree DNA receives the Education Award this year along with a huge vote of gratitude for their 11 years of genetic genealogy conferences. They are the only testing or genealogy company to hold a conference of this type and they do a fantastic job. Furthermore, they sponsor additional educational events by providing the “theater” for DNA presentations at international events such as the Who Do You Think You Are conference in England. Thank you Family Tree DNA.

Family Tree DNA Conference

ftdna 2015

The Family Tree DNA Conference, held in November, was a hit once again. I’m not a typical genealogy conference person.  My focus is on genetic genealogy, so I want to attend a conference where I can learn something new, something leading edge about the science of genetic genealogy – and that conference is definitely the Family Tree DNA conference.

Furthermore, Family Tree DNA offers tours of their lab on the Monday following the conference for attendees, and actively solicits input on their products and features from conference attendees and project administrators.

2015 FTDNA lab

Family Tree DNA 11th International Conference – The Best Yet
http://dna-explained.com/2015/11/18/2015-family-tree-dna-11th-international-conference-the-best-yet/

All of the conference presentations that were provided by the presenters have been made available by Family Tree DNA at:
http://www.slideshare.net/FamilyTreeDNA?utm_campaign=website&utm_source=sendgrid.com&utm_medium=email

2016 Genetic Genealogy Wish List

In 2014, I presented a wish list for 2015 and it didn’t do very well.  Will my 2015 list for 2016 fare any better?

  • Ancestry restores Sorenson and their own Y and mtDNA data bases in some format or contributes them to an independent organization like ISOGG.
  • Ancestry provides a chromosome browser.
  • Ancestry removes or revamps Timber in order to restore legitimate matches removed by the Timber algorithm.
  • Fully informed consent (per research project) implemented by 23andMe and Ancestry, and any other vendor who might aspire to sell consumer DNA or related information, without coercion, and not as a prerequisite for purchasing a DNA testing product. DNA and information will not be shared or utilized internally or externally without informed consent and current DNA information will cease being used in this fashion until informed consent is granted by customers who have already tested.
  • Improved ethnicity reporting at all vendors including ancient samples and additional reference samples for Native Americans.
  • Autosomal Triangulation tools at all vendors.
  • Big Y and STR integration and analysis enhancement at Family Tree DNA.
  • Ancestor Reconstruction
  • Mitochondrial and Y DNA search tools by ancestor and ancestral line at Family Tree DNA.
  • Improved tree at Family Tree DNA – along with new search capabilities.
  • 23andMe restores lost capabilities, drops price, makes changes and adds features previously submitted as suggestions by community ambassadors.
  • More tools (This is equivalent to “bring me some surprises” on my Santa list as a kid.)

My own goals haven’t changed much over the years. I still just want to be able to confirm my genealogy, to learn as much as I can about each ancestor, and to break down brick walls and fill in gaps.

I’m very hopeful each year as more tools and methodologies emerge.  More people test, each one providing a unique opportunity to match and to understand our past, individually and collectively.  Every year genetic genealogy gets better!  I can’t wait to see what 2016 has in store.

Here’s wishing you a very Happy and Ancestrally Prosperous New Year!

______________________________________________________________

A Study Utilizing Small Segment Matching

There has been quite a bit of discussion in the last several weeks, both pro and con, about how to use small matching DNA segments in genetic genealogy.  A couple of people are even of the opinion that small segments can’t be used at all, ever.  Others are less certain and many of us are working our way through various scenarios.  Evidence certainly exists that these segments can be utilized.

I’ve been writing foundation articles, in preparation for this article, for several weeks now.  Recently, I wrote about how phasing works and determining IBD versus IBS matches and included guidelines for telling the difference between the different kinds of matches.  If you haven’t read that article, it’s essential to understanding this article, so now would be a good time to read or review that article.

I followed that with a step by step article, Demystifying Autosomal DNA Matching, on how to do phasing and matching in combination with the guidelines about how to distinguish IBD (identical by descent) matches from IBS (identical by state) by chance matches and identical by population matches when evaluating your own matches.

Now that we understand IBS, IBD, Phasing and how matching actually works on a case by case basis, let’s look at applying those same matching and IBS vs IBD guidelines to small data segments as well.

A Little History

So that those of you who haven’t been following the discussion on various blogs and social media don’t feel like you’ve been dropped into the middle of a conversation with no context, let me catch you up.

On Thanksgiving Day, I published an article about identifying one of my ancestors, Sarah Hickerson, after many years of trying.

That article spurred debate, which is just fine when the debate is about the science, but it subsequently devolved into something less pleasant.  There are some individuals with very strong opinions that utilizing small segments of DNA data can “never be done.”

I do not agree with that position.  In fact, I strongly disagree and there are multiple cases with evidence to support small segments being both accurate and useful in specific types of genealogical situations.  We’ll take a look at several.

I do agree that looking at small segment data out of context is useless.  To the best of my knowledge, no genealogist begins with their smallest segments and tries to assemble them, working from the bottom up.  We all begin with the largest segments, because they are the most useful and the closest connections in our tree, and work our way down.  Generally, we only work with small segments when we have to – and there are times that’s all we have.  So we need to establish guidelines and ways to know if those small segments are reliable or not.  In other words, how can we draw conclusions and how much confidence can we put in those conclusions?

Ultimately, whether you choose to use or work with small segment data will be your own decision, based on your own circumstances.  I simply wanted to understand what is possible and what is reasonable, both for my own genealogy and for my readers.

In my projects, I haven’t been using small segment data out of context, or randomly. In other words, I don’t just pick any two small segment matches and infer or decide that they are valid matches. Fortunately, by utilizing the IBD vs IBS guidelines, we have tools to differentiate IBD (Identical by Descent) segments from IBS (Identical by State) by chance segments and from IBD/IBS by population segments, for matching segments both large and small.

Studying small segment data is the key to determining exactly how small segments can reasonably be utilized.  This topic probably isn’t black or white, but shades of gray – and assuming the position that something can’t be done simply assures that it won’t be.

I would strongly encourage those involved and interested in this type of research to retain those small segments, work with them and begin to look for patterns.  The only way we, as a community, are ever going to figure out how to work with small segments successfully and reliably is to, well, work with them.

Discussing the science and scenarios surrounding the usage of small data segments in various different situations is critical to seeing our way through the forest.  If the answers were cast in concrete about how to do this, we wouldn’t be working through this publicly today.

Negative personal comments and inferences have no place in the scientific community. They discourage others from participating, and serve to stifle research and cooperation, not encourage it. I hope that civil scientific discussions and comparisons involving small segment data can move forward, with decorum, because they are critically needed in order to enhance our understanding, under varying circumstances, of how to utilize small segment data. As Judy Russell said, disagreeing doesn’t have to be disagreeable.

Two bloggers, Blaine Bettinger and CeCe Moore, wrote articles following my Hickerson article. Blaine subsequently wrote a second article here. Felix Immanuel wrote articles here and here.

A few others have weighed in, in writing, as well although most commentary has been on Facebook.  Israel Pickholtz, a professional genealogist and genetic consultant, stated on his blog, All My Foreparents, the following:

It is my nature to distrust rules that put everything into a single category and that’s how I feel about small segments. Sometimes they are meaningful and useful, sometimes not.

When I reconstructed my father’s DNA using Lazarus (described last week in Genes From My Father), I happily accepted all small segments of whatever size because those small segments were in the DNA of at least one of his children and at least one of his brother/sister/first cousin. If I have a particular small segment, I must have received it from my parents. If my father’s brother (or sister) has it as well, then it is eminently clear to me that I got it from my father and that it came to him and his brother from my grandfather. And it is not reasonable to say that a sliver of that small segment might have come from my mother, because my father’s people share it.

After seeing Israel’s commentary about Lazarus, I reconstructed the genome of both Roscoe and John Ferverda, brothers, which includes both large and small segments. Working with the Ferverda DNA further, I wrote an article, Just One Cousin, about matching between two siblings and a first cousin, which includes lots of small data segments, some of which were proven to triangulate, meaning they are genuine, and some of which did not. There are lots more examples in the demystifying article, as well.

What Not To Do 

Before we begin, I want to make it very clear that I am not now advocating, and never have advocated, that people utilize small data segments out of context of larger matching segments and/or at least suspected matching genealogy. For example, I have never implied or even hinted that anyone should go to GedMatch, do a “one to many” compare at 1 cM and then contact people informing them that they are related. Anyone who has extrapolated what I’ve written to mean that either simply did not understand or intentionally misinterpreted the articles.

Sarah Hickerson Revisited

If I thought Sarah Hickerson caused me a lot of heartburn in the decades before I found her, little did I know how much heartburn that discovery would cause.

Let’s go back to the Sarah Hickerson article that started the uproar over whether small data segments are useful at all.

In that article, I found I was a member of a new Ancestry DNA Circle for Charles Hickerson and Mary Lytle, the parents of Sarah Hickerson.

Ancestry Hickerson match

Because there are no tools at Ancestry to prove DNA connections, I hurried over to Family Tree DNA looking for any matches to Hickersons for myself and for my Vannoy cousins who also (potentially) descended from this couple. Much to my delight, I found several matches to Hickersons. In fact, there were more than 20, for a total of 614 rows of spreadsheet matches when I included all of my Vannoy cousins who potentially descend from this couple and their Hickerson matches. There were 64 matching clusters of segments, both small and large. Some matches were as large as 20 cM with 6000 SNPs, and more than 20 were over 10 cM with from 1500 to 6000 SNPs. There were also hundreds of small segments that matched (and triangulated) as well.

Now that I’ve added in a few more Vannoy cousins that we’ve since recruited, the spreadsheet is up to 1093 rows and we have 52 Vannoy-Hickerson TRIANGULATED CLUSTERS utilizing only Family Tree DNA tools.

Triangulated DNA, found in 3 or more people at the same location who share a common ancestor, is proven to be from that ancestor (or ancestral couple). This is the commonly accepted gold standard of autosomal DNA triangulation within the industry.
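
As a rough illustration of the three-way check described above, here is a minimal Python sketch. The `Segment` layout, the function names and the sample positions are all hypothetical and are not taken from any vendor's tools. The sketch only tests whether three pairwise matches (A to B, A to C, and B to C) overlap at the same chromosome location; confirming the common ancestor still depends on the genealogy.

```python
from typing import NamedTuple, Optional


class Segment(NamedTuple):
    """One pairwise matching segment (hypothetical layout, for illustration only)."""
    chromosome: int
    start: int  # start position in base pairs
    end: int    # end position in base pairs


def overlap(a: Segment, b: Segment) -> Optional[Segment]:
    """Return the region shared by two segments, or None if they do not overlap."""
    if a.chromosome != b.chromosome:
        return None
    start, end = max(a.start, b.start), min(a.end, b.end)
    return Segment(a.chromosome, start, end) if start < end else None


def triangulated_region(ab: Segment, ac: Segment, bc: Segment) -> Optional[Segment]:
    """Region where A matches B, A matches C, and B matches C all at once."""
    first = overlap(ab, ac)
    return overlap(first, bc) if first is not None else None


# Made-up positions: three pairwise matches on the same chromosome.
a_b = Segment(13, 20_000_000, 31_000_000)
a_c = Segment(13, 24_000_000, 35_000_000)
b_c = Segment(13, 22_000_000, 30_000_000)
print(triangulated_region(a_b, a_c, b_c))
```

If any one of the three pairwise matches is missing or falls somewhere else, the function returns `None`, which corresponds to a group that does not triangulate at that location.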

Here’s just one example of a cluster of three people.  Charlene and Buster are known (proven, triangulated) cousins and Barbara is a descendant of Charles Hickerson and Mary Lytle.

example triang

What more could you want?

Yes, I called this a match.  As far as I’m concerned, it’s a confirmed ancestor.  How much more confirmed can you get?

Some clusters have as many as 25 confirmed triangulated members.

chr 13 group

Others took issue with this conclusion because it included small segment data. This seems like the perfect opportunity in which to take a look at how small segments do, or don’t, stand up to scrutiny. So, let’s do just that. I also did the same type of matching comparison in a situation with 2 siblings and a known cousin, here.

To Trash…or Not To Trash

Some genetic genealogists discard small segments entirely, generally under either 5 or 7cM, which I find unfortunate for several reasons.

  1. If a person doesn’t work with small segments, they really can’t comment on the lack of results, and they’ll never have a success because the small segments will have been discarded.
  2. If a person doesn’t work with small segments, they will never notice any trends or matches that may have implications for their ancestry.
  3. If a person doesn’t work with small segments, they can’t contribute to the body of evidence for how to reasonably utilize these segments.
  4. If a person doesn’t work with small segments, they may well be throwing the baby out with the bathwater, but they’ll never know.
  5. They encourage others to do the same.

The Sarah Hickerson article was not meant as a proof article for anything – it was meant to be an article encouraging people to utilize genetic genealogy for not only finding their ancestor and proving known connections, but breaking down brick walls.  It was pointing the way to how I found Sarah Hickerson.  It was one of my 52 Ancestors Series, documenting my ancestors, not one of the specifically educational articles.  This article is different.

If you are only interested in the low hanging fruit, meaning within the past 5 or 6 generations, and only proving your known pedigree, not finding new ancestors beyond that 5-6 generation level, then you can just stop reading now – and you can throw away your small segments.  But if you want more, then keep reading, because we as a community need to work with small segment data in order to establish guidelines that work relative to utilizing small segments and identifying the small segments that can be useful, versus the ones that aren’t.

I do not believe for one minute that small segments are universally useless.  As Israel said, if his family did not receive those segments from a common family member, then where did they all get those matching segments?

In fact, utilizing triangulated and proven DNA relationships within families is how adoptees piece together their family trees, piggybacking off of the work of people with known pedigrees that they match genetically.  My assumption had been that the adoptee community utilized only large DNA segments, because the larger the matching segments, generally the closer in time the genealogy match – and theoretically the easier to find.

However, I discovered that I was wrong, and the adoptee community does in fact utilize small segments as well.  Here’s one of the comments posted on my Chromosome Browser War blog article.

“Thanks for the well thought out article, Roberta, I have something to add from the folks at DNAadoption. Adoptees are not just interested in the large segments, the small segments also build the proof of the numerous lines involved. In addition, the accumulation of surnames from all the matches provides a way to evaluate new lines that join into the tree.”

Diane Harman-Hoog (on behalf of the 6 million adoptees in this country, many of whom are looking for information on medical records and family heritage).

Diane isn’t the only person who is working with small segment data.  Tim Janzen works with small segments, in particular on his Mennonite project, and discusses small segments on the ISOGG WIKI Phasing page.  Here is what Tim has to say:

“One advantage of Family Finder is that FF has a 1 cM threshold for matching segments. If a parent and a child both have a matching segment that is in the 2 to 5 cM range and if the number of matching SNPs is 500 or more then there is a reasonably high likelihood that the matching segment is IBD (identical by descent) and not IBS (identical by state).”
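
Tim's rule of thumb can be written down as a simple filter. This is only a sketch: the function name and its parameters are hypothetical, and the thresholds are the ones quoted above (2 to 5 cM, 500 or more SNPs, present in both parent and child). A `True` result means "reasonably high likelihood" in Tim's words, not proof.

```python
def likely_ibd_small_segment(cm: float, snps: int,
                             parent_matches: bool, child_matches: bool) -> bool:
    """Apply the quoted guideline to one small matching segment.

    cm             -- segment length in centimorgans
    snps           -- number of matching SNPs in the segment
    parent_matches -- True if the parent also matches on this segment
    child_matches  -- True if the child matches on this segment
    """
    in_small_range = 2.0 <= cm <= 5.0
    return parent_matches and child_matches and in_small_range and snps >= 500


# Made-up example: a 3.2 cM segment with 650 SNPs seen in both parent and child.
print(likely_ibd_small_segment(3.2, 650, parent_matches=True, child_matches=True))
```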

The same rules for utilizing larger segment data need to be applied to small segment data to begin with.

Are more guidelines needed for small segments?  I don’t know, but we’ll never know if we don’t work with many individual situations and find the common methods for success and identify any problematic areas.

Why Do Small Segments Matter?

In some cases, especially as we work beyond the 6 generation level, small segments may be all we have left of a specific ancestor.  If we don’t learn to recognize and utilize the small segments available to us, those ancestors, genetically speaking, will be lost to us forever.

As we move back in time, the DNA from more distant ancestors will be divided into smaller and smaller segments, so if we ever want the ability to identify and track those segments back in time to a specific ancestor, we have to learn how to utilize small segment data – and if we have deleted that data, then we can’t use it.

In my case, I have identified all of my 5th generation ancestors except one, and I have a strong lead on her.  In my 6th generation, however, I have lots of walls that need to be broken through – and DNA may be the only way I’ll ever do that.

Let’s take a look at what I can expect when trying to match people who also descend from an ancestor 5 generations back in time.  If they are my same generation, they would be my fourth cousins.

Based on the autosomal statistics chart at ISOGG, 4th cousins, on the average, would expect to share about 13.28 cM of DNA from their common ancestor.  This would not be over the match threshold at FTDNA of approximately 20 cM total, and if those segments were broken into three pieces, for example, that cousin would not show as a match at either FTDNA or 23andMe, based on the vendors’ respective thresholds.

% Shared DNA | Expected Shared cM | Relationship or Threshold
0.781% | 53.13 | Third cousins, common ancestor is 4 generations back in time
0.391% | 26.56 | Third cousins once removed
(threshold) | 20 | Family Tree DNA total cM threshold
0.195% | 13.28 | Fourth cousins, common ancestor is 5 generations back in time
(threshold) | 7 | 23andMe individual segment cM match threshold
0.0977% | 6.64 | Fourth cousins once removed
0.0488% | 3.32 | Fifth cousins, common ancestor is 6 generations back in time
0.0244% | 1.66 | Fifth cousins once removed
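For readers who like to see the arithmetic, here is a minimal Python sketch that reproduces the chart values. It rests on two assumptions that are not stated in the article: a genome total of 6800 cM (inferred from the chart, since 53.13 cM divided by 0.781% is about 6800), and the standard halving rule in which full nth cousins, r times removed, share on average 1/2^(2n+1+r) of their DNA. The function names are illustrative only.

```python
# Assumption: 6800 cM total, inferred from the chart (53.13 cM / 0.781%).
TOTAL_CM = 6800.0
# Thresholds as quoted in the text (approximate).
FTDNA_TOTAL_THRESHOLD_CM = 20.0
SEGMENT_THRESHOLD_23ANDME_CM = 7.0


def expected_shared_fraction(cousin_degree, times_removed=0):
    """Average fraction of DNA shared by full nth cousins, r times removed."""
    return 0.5 ** (2 * cousin_degree + 1 + times_removed)


def expected_shared_cm(cousin_degree, times_removed=0, total_cm=TOTAL_CM):
    """Average shared cM for the same relationship."""
    return total_cm * expected_shared_fraction(cousin_degree, times_removed)


# Print the rows of the chart and compare each to the 20 cM total threshold.
for degree, removed in [(3, 0), (3, 1), (4, 0), (4, 1), (5, 0), (5, 1)]:
    cm = expected_shared_cm(degree, removed)
    side = "over" if cm >= FTDNA_TOTAL_THRESHOLD_CM else "under"
    print(f"cousin degree {degree}, removed {removed}: {cm:.2f} cM ({side} 20 cM)")
```

These are averages only. Actual sharing between any two cousins varies widely around these values, which is exactly why some fourth cousins match over the threshold and others do not match at all.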

If you’re lucky, as I was with Hickerson, you’ll match at least some relative who carries that ancestral DNA line above the threshold, and then they’ll match other cousins above the threshold, and you can build a comparison network, linking people together, in that fashion.  And yes, you may well have to utilize GedMatch for people testing at various different vendors and for those smaller segment comparisons.

For clarification, I have never “called” a genealogy match without supporting large segment data.  At the vendors, you can’t even see matches if they don’t have larger segments – so there is no way to even know you would match below the threshold.

I do think that we may be able to make calls based on small segments, at least in some instances, in the future.  In fact, we have to figure out how to do this or we will rarely be able to move past the 5th or 6th generation utilizing genetics.

At the 5th generation, or third cousins, one expects to see approximately 26 cM of matching DNA, still over the threshold (if divided correctly), but from that point further back in time, the expected shared amount of DNA is under the current day threshold.  For those who wonder why the vendors state that autosomal matches are reliable to about the 5th or 6th generation, this is the answer.

I do not discount small segments without cause.  In other words, I don’t discount small segments unless there is a reason.  Unless they are positively IBS by chance, meaning false, and I can prove it, I don’t disregard them.  I do label them and make appropriate notes.  You can’t learn from what’s not there.

Let me give you an example.  I have one area of my spreadsheet where I have a whole lot of segments, large and small, labeled Acadian.  Why?  Because the Acadians are so intermarried that I can’t begin to sort out the actual ancestor that DNA came from, at least not yet…so today, I just label them “Acadian.”

This example row is from my master spreadsheet.  I have my Mom’s results in my spreadsheet, so I can see easily if someone matches me and Mom both. My rows are pink.  The match is on Mom’s side, which I’ve color coded purple.  I don’t know which ancestor is the most recent common ancestor, but based on the surnames involved, I know they are Acadian.  In some cases, on Acadian matches, I can tell the MRCA and if so, that field is completed as well.

Me Mom acadian

As a note of interest, I inherited my mother’s segment intact, so there was no 50% division in this generation.

I also have segments labeled Mennonite and Brethren.  Perhaps in the future I’ll sort through these matches and actually be able to assign DNA segments to specific ancestors.  Those segments aren’t useless, they just aren’t yet fully analyzed.  As more people test, hopefully, patterns will emerge in many of these DNA groupings, both small and large.

In fact, I talked about DNA patterns and endogamous populations in my recent article, Just One Cousin.

For me, today, some small segment matches appear to be central European matches.  I say “appear to be,” because they are not triangulated.  For me this is rather boring and nondescript – but if this were my African American client who is trying to figure out which line her European ancestry came from, this could be very important.  Maybe she can map these segments to at least a specific ancestral line, which she would find very exciting.

Learning to use small segments effectively has the potential to benefit the following groups of people:

  • People with colonial ancestry, because all that may be left today of colonial ancestors is small segments.
  • People looking to break down brick walls, not just confirm currently known ancestors.
  • People looking for minority ancestors more than 5 or 6 generations back in their trees.
  • Adoptees – although very clearly, they want to work with the largest matches first.
  • People working with ethnic identification of ancestors, because you will eventually be able to track ethnicity identifying segments back in time to the originating ancestor(s).

Conversely, people from highly endogamous groups may not be helped much, if at all, by small segments because they are so likely to be widely shared within that population as a group from a common ancestor much further back in time.  In fact, the definition of a “small segment” for people with fully endogamous families might be much larger than for someone with no known endogamy.

However, if we can identify segments to specific populations, that may help the future accuracy of ethnicity testing.

Let’s go back and take a look at the Hickerson data using the same format we have been using for the comparisons so far.

Small Segment Examples

These Hickerson/Vannoy examples do not utilize random small segment matches, but are utilizing the same matching rules used for larger matches in conjunction with known, triangulated cousin groups from a known ancestor.  Many cousins, including 2 brothers and their uncle all carry this same DNA.  Like in Israel’s case, where did they get that same DNA if not from a common ancestor?

In the following examples, I want to stress that all of the people involved DO HAVE LARGER SEGMENT MATCHES on other chromosomes, which is how we knew they matched in the first place, so we aren’t trying to prove they are a match.  We know they are.  Our goal is to determine if small segments are useful in the same situation, proving matches, as with larger segments.  In other words, do the rules hold true?  And how do we work with the data?  Could we utilize these small segment matches if we didn’t have larger matching segments, and if so, how reliable would they be?

There is a difference between a single match and a triangulated group:

  • Matches between two people are suggestive of a common ancestor but could be IBS by chance or population.
  • Multiple matches, such as with the 6 different Hickersons who descend from Charles Hickerson and Mary Lytle, both in the Ancestry DNA Circle and at Family Tree DNA, are extremely suggestive of a specific common ancestor.
  • Only triangulated groups are proof of a common ancestor, unless the people are closely related known relatives.

In our Hickerson/Vannoy study, all participants match at least to one other (but not to all other) group members at Family Tree DNA which means they match over the FTDNA threshold of approximately 20 cM total and at least one segment over 7.7cM and 500 SNPs or more.
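The threshold described above can be written down as a small check. This is only a sketch of the rule as it is worded in this article (about 20 cM total, plus at least one segment over 7.7 cM with 500 or more SNPs); the vendor's real matching algorithm is not public in this form, and the function name and the sample segments are hypothetical.

```python
def passes_vendor_style_threshold(segments, min_total_cm=20.0,
                                  min_longest_cm=7.7, min_snps=500):
    """segments: list of (length_in_cM, snp_count) tuples for one pair of people.

    Returns True when the pair would show as a match under the simplified
    rule quoted in the text.
    """
    total_cm = sum(cm for cm, _ in segments)
    has_anchor_segment = any(
        cm >= min_longest_cm and snps >= min_snps for cm, snps in segments
    )
    return total_cm >= min_total_cm and has_anchor_segment


# Hypothetical pairs, for illustration only.
one_large_plus_small = [(8.1, 900), (6.0, 700), (6.5, 650)]   # 20.6 cM total
only_small_pieces = [(4.0, 600)] * 5                          # 20.0 cM, no anchor
print(passes_vendor_style_threshold(one_large_plus_small))
print(passes_vendor_style_threshold(only_small_pieces))
```

The second pair illustrates the point made earlier: the same total amount of shared DNA, broken into small pieces, never shows up as a match at the vendor.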

In the example below, from the Hickerson article, the known Vannoy cousins are on the left side and the Hickerson matches to the Vannoy cousins are across the top.  We have several more now, but this gives you an idea of how the matching stacked up initially.  The two green individuals were proven descendants from Charles Hickerson and Mary Lytle.

vannoy hickerson higginson matrix

The goal here is to see how small data segments stack up in a situation where the relationship is distant.  Can small segments be utilized to prove triangulation?  This is slightly different than in the Just One Cousin article, where the relationship between the individuals was close and previously known.  We can contrast the results of that close relationship and small segments with this more distant connection and small segments.

Sarah Hickerson and Daniel Vannoy

The Vannoy project has a group of about a dozen cousins who descend from Elijah Vannoy who have worked together to discover the identity of Elijah’s parents.  Elijah’s father is one of 4 Vannoy men, all sons of the same man, found in Wilkes County, NC, in the late 1700s.  Elijah Vannoy is 5 generations upstream from me.

What kind of evidence do we have?  In the paper genealogy world, I have ruled out one candidate via a Bible record, and probably a second via census and tax records, but we have little information about the third and fourth candidates – in spite of thoroughly perusing all existent records.  So, if we’re ever going to solve the mystery, short of that much-wished-for Vannoy Bible showing up on eBay, it’s going to have to be via genetic genealogy.

In addition to the dozen or so Vannoy cousins who have DNA tested, we found 6 individuals who descend from Sarah Hickerson’s parents, Charles Hickerson and Mary Lytle who match various Vannoy cousins.  Additionally, those cousins match another 21 individuals who carry the Hickerson or derivative surnames, but since we have not proven their Hickerson lineage on paper, I have not utilized any of those additional matches in this analysis.  Of those 26 total matches, at Family Tree DNA, one Hickerson individual matches 3 Vannoy cousins, nine Hickerson descendants match 2 Vannoy cousins and sixteen Hickerson descendants match 1 Vannoy cousin.

Our group of Vannoy cousins matching to the 6 Charles Hickerson/Mary Lytle descendants contains over 60 different clusters of matching DNA data across the 22 chromosomes.  Those 6 individuals are included in 43 different triangulated groups, proving the entire triangulation group shares a common ancestor.  And that is BEFORE we add any GedMatch information.

If that sounds like a lot, it’s not.  Another recent article found 31 clusters among siblings and their first cousin, so 60 clusters among a dozen known Vannoy cousins and half a dozen potential Hickerson cousins isn’t unusual at all.

To be very clear, Sarah Hickerson and Daniel Vannoy were not “declared” to be the parents of Elijah Vannoy, born in 1784, based on small segment matches alone.  Larger segment matches were involved, which is how we saw the matches in the first place.  Furthermore, the matches triangulated.  However, small segments certainly are involved and are more prevalent, of course, than large segments.  Some cousins are only connected by small segments.  Are they valid, and how do we tell?  Sometimes it’s all we have.

Let me give you the classic example of when small segments are needed.

We have four people.  Person A and B are known Vannoy cousins and person C and D are potential Hickerson cousins.  Potential means, in this case, potential cousins to the Vannoys.  The Hickersons already know they both descend from Charles Hickerson and Mary Lytle.

  • Person A matches person C on chromosome 1 over the matching threshold.
  • Person B matches person D on chromosome 2 over the matching threshold.

Both Vannoy cousins match Hickerson cousins, but not the same cousin and not on the same segments at the vendor.  If these were same segment matches, there would be no question because they would be triangulated, but they aren’t.

So, what do we do?  We don’t have access to see if person C and D match each other, and even if we did, they don’t match on the same segments where they match persons A and B, because if they did we’d see them as a match too when we view A and B.

If person A and B don’t match each other at the vendor, we’re flat out of luck and have to move this entire operation to GedMatch, assuming all 4 people have or are willing to download their data.

a and b nomatch

If person A and B match each other at the vendor, we can see their small segment data as compared to each other and to persons C and D, respectively, which then gives us the ability to see if A matches C on the same small segment as B matches D.

a and b match

If we are lucky, they will all show a common match on a small segment – meaning that A will match B on a small segment of chromosome 3, for example, and A will match C on that same segment.  In a perfect world, B will also match D on that same segment, and you will have 4 way triangulation – but I’m happy with the required 3 way match to triangulate.
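The three-way requirement can be sketched in a few lines of Python. This is an illustration of the rule as described here, not any vendor's or GedMatch's implementation: three people triangulate when all three pairwise matches (A to B, A to C, B to C) exist on the same chromosome and share a common overlapping region. The class name, function names and positions are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class PairMatch:
    person_1: str
    person_2: str
    chromosome: int
    start: float  # segment start, in any consistent unit
    end: float    # segment end, same unit


def common_overlap(*matches) -> Optional[Tuple[float, float]]:
    """Region shared by every match, or None if there is none."""
    if len({m.chromosome for m in matches}) != 1:
        return None
    start = max(m.start for m in matches)
    end = min(m.end for m in matches)
    return (start, end) if start < end else None


def triangulated_region(m1, m2, m3) -> Optional[Tuple[float, float]]:
    """Overlap of three pairwise matches among exactly three people."""
    pairs = {frozenset((m.person_1, m.person_2)) for m in (m1, m2, m3)}
    people = set().union(*pairs)
    # Three distinct two-person pairs over three people means A-B, A-C and B-C.
    if len(pairs) != 3 or len(people) != 3 or any(len(p) != 2 for p in pairs):
        return None
    return common_overlap(m1, m2, m3)


# Hypothetical positions on chromosome 3, for illustration only.
ab = PairMatch("A", "B", 3, 10.0, 18.0)
ac = PairMatch("A", "C", 3, 12.0, 20.0)
bc = PairMatch("B", "C", 3, 11.0, 17.0)
print(triangulated_region(ab, ac, bc))
```

A four-way group such as A, B, C and D can be checked the same way by testing each combination of three people in turn.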

This is exactly what happened in the article, Be Still My H(e)art.  As you can see, three people match on chromosomes 1 and 8, below – two of whom are proven cousins and the third was the wife surname candidate line.

Younger Hart 1-8

The example I showed of chromosome 2 in the Hickerson article was one where all 5 individuals shown on the chromosome browser matched the Vannoy participant.  I thought it was a good visual example.  It was just one example of the 60+ clusters of cousin matches between the dozen Vannoy cousins and 6 Hickerson descendants.

This example was criticized by some because it was a small segment match.  I should probably have utilized chromosome 15 or searched for a better long segment example, but the point in my article was only to show how people that match stack up together on the chromosome browser – nothing more.  Here’s the entire chromosome, for clarity.

hickerson vannoy chr 2

Certainly, I don’t want to mislead anyone, including myself.  Furthermore, I dislike being publicly characterized as “wrong” and worse yet, labeled “irresponsible,” so I decided to delve into the depths of the data and work through several different examples to see if small segment data matching holds in various situations.  Let’s see what we found.

Chromosome 15

I selected chromosome 15 to work with because it is a region where a lot of Vannoy descendants match – and because it is a relatively large segment.  If the Hickersons do match the Vannoys, there’s a fairly good chance they might match on at least part of that segment.  In other words, it appears to be my best bet due to sheer size and the number of Elijah Vannoy’s descendants who carry this segment.  In addition to the 6 individuals above who matched on chromosome 15, here are an additional 4.  As you can see, chromosome 15 has a lot of potential.

Chrom 15 Vannoy

The spreadsheet below shows the sections of chromosome 15 where cousins match.  Green individuals in the Match column are descendants of Charles Hickerson and Mary Lytle, the parents of Sarah Hickerson.  The balance are Vannoys who match on chromosome 15.

chr 15 matches ftdna v4

As you can see, there are several segments that are quite large, shown in yellow, but there are also many that are under the threshold of 7cM, which are all segments that would be deleted if you are deleting small segments.  Please also note that if you were deleting small segments, all of the Hickerson matches would be gone from chromosome 15.

Those of you with an eagle eye will already notice that we have two separate segments that have triangulated between the Vannoy cousins and the Hickerson descendants, noted in the left column by yellow and beige.  So really, we could stop right here, because we’ve proven the relationship, but there’s a lot more to learn, so let’s go on.

You Can’t Use What You Can’t See

I need to point something out at this point that is extremely important.

The only reason we see any segment data below the match threshold is because once you match someone on a larger segment at Family Tree DNA, over the threshold, you also get to view the small segment data down to 1cM for your match with that person. 

What this means is that if one person or two people match a Hickerson descendant, for example, you will see the small segment data for their individual matches, but not for anyone who doesn’t match the participant over the matching threshold.

What that means in the spreadsheet above is that the only Hickerson that matches more than one Vannoy (on this segment) is Barbara – so we can see her segment data (down to 1cM) as compared to Polly and Buster, but not to anyone else.

If we could see the smaller segment data of the other participants as compared to the Hickerson participants, even though they don’t match on a larger segment over the matching threshold, there could potentially be a lot of small segment data that would match – and therefore triangulate on this segment.

This is the perfect example of why I’ve suggested to Family Tree DNA that, within projects or in individual situations, we be allowed to reduce the match threshold – especially when a specific family line match is suspected.

This is also one of the reasons why people turn to GedMatch, and we’ll do that as well.

What this means, relative to the spreadsheet is that it is, unfortunately, woefully incomplete – and it’s not apples to apples because in some cases we have data under the match threshold, and in some, we don’t.  So, matches DO count, but nonmatches where small segment data is not available do NOT count as a non-match, or as disproof.  It’s only negative proof IF you have the data AND it doesn’t match.
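The rule above is really a three-state rule, not a yes/no rule, and a short sketch makes that explicit. The names here are illustrative, and the sample data only restates what the text says about Barbara, Polly and Buster; it is not a transcription of the actual spreadsheet.

```python
from enum import Enum


class Evidence(Enum):
    MATCH = "match"                       # data available, segments match
    NON_MATCH = "non-match"               # data available, segments do not match
    NO_DATA = "small segment data not available"  # proves nothing either way


def classify(person_1, person_2, visible_comparisons):
    """visible_comparisons maps frozenset({a, b}) -> True/False, and contains
    only the pairs whose small segment data could actually be viewed."""
    key = frozenset((person_1, person_2))
    if key not in visible_comparisons:
        return Evidence.NO_DATA
    return Evidence.MATCH if visible_comparisons[key] else Evidence.NON_MATCH


# Barbara's small segments are visible only against Polly and Buster.
visible = {
    frozenset(("Barbara", "Polly")): True,
    frozenset(("Barbara", "Buster")): True,
}
print(classify("Barbara", "Polly", visible))
print(classify("Barbara", "Dean", visible))
```

Only a NON_MATCH result counts as negative evidence. A NO_DATA result simply means the comparison has not been seen yet.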

The Vannoys match and triangulate on many segments, so those are irrelevant to this discussion other than when they match to Hickerson DNA.  William (H) descends from two sons of Charles Hickerson and Mary Lytle.  Unfortunately, he only matches one Vannoy, so we can only see his small segments for that one Vannoy individual, William (V).  We don’t know what we are missing as compared to the rest of the Vannoy cousins.

To see William (H)’s and William (V)’s DNA as compared to the rest of the Vannoy cousins, we had to move to GedMatch.

Matching Options

Since we are working with segments that are proven to be Vannoy, and we are trying to prove/disprove if Daniel Vannoy and Sarah Hickerson are the parents of Elijah through multiple Hickerson matches, there are only a few matching options, which are:

  1. The Hickerson individuals will not triangulate with any of the Vannoy DNA, on chromosome 15 or on other chromosomes, meaning that Sarah Hickerson is probably not the mother of Elijah Vannoy, or the common ancestor is too far back in time to discern that match at vendor thresholds.
  2. The Hickerson individuals will not triangulate on this segment, but do triangulate on other segments, meaning that this segment came entirely from the Vannoy side of the family and not the Hickerson side of the family. Therefore, if chromosome 15 does not triangulate, we need to look at other chromosomes.
  3. The Hickerson individuals triangulate with the Vannoy individuals, confirming that Sarah Hickerson is the mother of Elijah Vannoy, or that there is a different common unknown ancestor someplace upstream of several Hickersons and Vannoys.

All of the Vannoy cousins descend from Elijah Vannoy and Lois McNiel, except one, William (V), who descends from the proven son of Sarah Hickerson and Daniel Vannoy, so he would be expected to match at least some Hickerson descendants.  The 6 Hickerson cousins descend from Charles Hickerson and Mary Lytle, Sarah’s parents.

hickerson vannoy pedigree

William (H), the Hickerson cousin who descends from David, brother to Sarah Hickerson, is descended through two of David Hickerson’s sons.

I decided to utilize the same segment “mapping comparison” technique with a spreadsheet that I utilized in the phasing article, because it’s easy to see and visualize.

I have created a matching spreadsheet and labeled the locations on the spreadsheet from 25-100 based on the beginning of the start location of the cluster of matches and the end location of the cluster.

Each individual being compared on the spreadsheet below has a column across the top.  On the chart below, all Hickerson individuals are to the right and are shown with their cells highlighted yellow in the top row.

Below, the entire colorized chart of chromosome 15 is shown, beginning with location 25 and ending with 100, in the left hand column, the area of the Vannoy overlap.  Remember, you can double click on the graphics to enlarge.  The columns in this spreadsheet are not fully expanded below, but they are in the individual examples.

entire chr 15 match ss v4

I am going to step through this spreadsheet, and point out several aspects.

First, I selected Buster, the individual in the group to begin the comparison, because he was one of the closest to the common ancestor, Elijah Vannoy, genealogically, at 4 generations.  So he is the person at Family Tree DNA that everyone is initially compared against.

Everyone who matches Buster has their matching segments shown in blue.  Buster is shown furthest left.

When participants match someone other than Buster, who they match on that segment is typed into their column.  You can tell who Buster matches because their columns are blue on matching locations.  Here’s an example.

Me Buster match

You can see that in my column, it’s blue on all segments which means I match Buster on this entire region.  In addition, there are names of Carl, Dean, William Gedmatch and Billie Gedmatch typed into the cell in the first row which means at that location, in addition to Buster, I also match Carl and Dean at Family Tree DNA and William (descended from the son of Daniel Vannoy and Sarah Hickerson) at Gedmatch and Billie (a Hickerson) at Gedmatch.  Their name is typed into my column, and mine into theirs.  Please note that I did not run everyone against everyone at GedMatch.  I only needed enough data to prove the point and running many comparisons is a long, arduous process even when GedMatch isn’t experiencing problems.

On cells that aren’t colorized blue, the person doesn’t match Buster, but may still match other Vannoy cousin segments.  For example, Dean, below, matches Buster on location 25-29, along with some other cousins.  However, he does not match Buster on location 30 where he instead matches Harold and Carl who also don’t match Buster at that location. Harold, Carl and Dean do, however, all descend from the same son of Elijah so they may well be sharing DNA from a Vannoy wife at this location, especially since no one who doesn’t share that specific wife’s line matches those three at this location.

Me Buster Dean match
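For anyone who would rather build this kind of comparison grid with a script than by hand, here is a minimal sketch. It assumes each match is recorded as (person, matched person, first location, last location) using the same 25 to 100 location labels as the spreadsheet; the function name is hypothetical, and the three sample rows only restate the Dean example above.

```python
from collections import defaultdict


def build_location_grid(matches, first=25, last=100):
    """Return {location: {person: set of people matched at that location}}.

    Each name is entered in both people's columns, mirroring the way names
    are typed into the spreadsheet.
    """
    grid = {loc: defaultdict(set) for loc in range(first, last + 1)}
    for person, other, start, end in matches:
        for loc in range(max(start, first), min(end, last) + 1):
            grid[loc][person].add(other)
            grid[loc][other].add(person)
    return grid


# Dean matches Buster on locations 25-29, and Harold and Carl at location 30.
sample = [
    ("Dean", "Buster", 25, 29),
    ("Dean", "Harold", 30, 30),
    ("Dean", "Carl", 30, 30),
]
grid = build_location_grid(sample)
print(sorted(grid[27]["Dean"]))
print(sorted(grid[30]["Dean"]))
```

A cell that would be colored blue in the spreadsheet corresponds to "Buster" appearing in that person's set at that location.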

Remember, we are not working with random small data segments, but with a proven matching segment to a common Vannoy ancestor, with a group of descendants from a possible/probable Hickerson ancestor that we are trying to prove/disprove.  In other words, you would expect either a lot of Hickerson matches on the same segments, if Hickerson is indeed a Vannoy ancestral family, or virtually none of them to match, if not.

The next thing I’d like to point out is that these are small segments of people who also have larger matching segments, many of whom do triangulate on larger segments on other chromosomes.  What we are trying to discern is whether small segment matches can be utilized by employing the same matching criteria as large segment matching.  In other words, is small segment data valid and useful if it meets the criteria for an IBD match?

For example, let’s look at Daniel.  Daniel’s segments on chromosome 15, were it not for the fact that he matches on larger segments on other chromosomes, would not be shown as matches, because they are not individually over the match threshold.

Look at Daniel’s column for Polly and Warren.

Daniel matches 2

The segments in red show a triangulated group where Daniel and Warren, or Daniel, Warren and Polly match.  The segments where all 3 match are triangulated.

This proves, unquestionably, that small segments DO match utilizing the normal prescribed IBD matching criteria.  This spreadsheet, just for chromosome 15, is full of these examples.

Is there any reason to think that these triangulated matches are not identical by descent?  If they are not IBD, how do all of these people match the same DNA? Chance alone?  How would that be possible?  Two people, yes, maybe, but 3 or more?  In some cases, 5 or 6 on the same segment?  That is simply not possible, or we have disproven the entire foundation that autosomal DNA matching is based upon.

The question will soon be asked if small segments that triangulate can be useful when there are no larger matching segments to put the match over the initial vendor threshold.

Triangulated Groups

As you can see, most of the people and segments on the spreadsheet, certainly the Elijah descendants, are heavily triangulated, meaning that three or more people match each other on the same locations.  Most of this matching is over the vendor threshold at Family Tree DNA.

You can see that Buster, Me, Dean, Carl and Harold all match each other on the same segments, on the left half of the spreadsheet where our names are in each other’s columns.

triangulated groups

Remember when I said that the spreadsheet was incomplete?  This is an example.  David and Warren don’t match each other at a high enough total of segments to get them over the matching threshold when compared to each other, so we can’t see their small segment data as compared to each other.  David matches Buster, but Warren doesn’t, so I can’t even see them both in relationship to a common match.  There are several people who fall into this category.

Let’s select one individual to use as an example.

I’ve chosen the Vannoy cousin, William(V), because his kit has been uploaded to Gedmatch, he has Vannoy matches and because William is proven to descend from Sarah Hickerson and Daniel Vannoy through their son Joel – so we expect some Hickerson DNA to match William(V).

If William (V) matches the Hickersons on the same DNA locations as he matches to Elijah’s descendants, then that proves that Elijah’s descendants’ DNA in that location is Hickerson DNA.

At GedMatch, I compared William(V) with me and then with Dean using a “one to one” comparison at a low threshold, simply because I wanted as much data as I could get.  Family Tree DNA allows for 1 cM and I did the same, allowing 100 SNPs at GedMatch.  Family Tree DNA’s lowest SNP threshold is 500.

In case you were wondering, even though I did lower the GedMatch threshold below the FTDNA minimum, there were 45 segments that were above 1cM and above 500 SNPs when matching me to William(V), which would have been above the lowest match threshold at FTDNA (assuming we were over the initial match threshold.)  In other words, had we not been below the original match threshold (20cM total, one segment over 7.7cM), these segments would have been included at FTDNA as small segments.  As you can see in the chart below, many triangulated.
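The filtering step described above is simple enough to sketch. This assumes segments exported from a one-to-one comparison are held as (cM, SNP count) pairs, and it treats the FTDNA small segment display minimums as 1 cM and 500 SNPs, as stated in the text. The function name and the sample values are hypothetical.

```python
def segments_visible_at_ftdna_minimums(segments, min_cm=1.0, min_snps=500):
    """Keep only segments that meet both the cM and the SNP minimum.

    segments: list of (length_in_cM, snp_count) tuples.
    """
    return [
        (cm, snps) for cm, snps in segments
        if cm >= min_cm and snps >= min_snps
    ]


# Hypothetical segments from a comparison run at 1 cM / 100 SNPs.
low_threshold_results = [(1.2, 120), (2.4, 510), (0.9, 800), (5.6, 1400)]
kept = segments_visible_at_ftdna_minimums(low_threshold_results)
print(len(kept), "of", len(low_threshold_results), "segments meet both minimums")
```

Running a comparison at the lower SNP setting and then filtering afterward keeps all of the data available while still showing which segments would have appeared at the vendor.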

I colorized the GedMatch matches, where there were no FTDNA matches, in dark red text.  This illustrates graphically just how much is missed when the small segments are ignored in cases with known or probable cousins.  In the green area, the entry that says “Me GedMatch” could not be colorized red (because you can’t colorize only part of the text of a cell) so I added the Gedmatch designation to differentiate between a match through FTDNA and one from GedMatch.  I did the same with all Gedmatch matches, whether colorized or not.

Let’s take a look and see how small segments from GedMatch affect our Hickerson matching.  Note that in the green area, William (V) matches William (H), the Hickerson descendant, and William (V) matches to me and Dean as well.  This triangulates William (V)’s Hickerson DNA and proves that Elijah’s descendants’ DNA includes proven Hickerson segments.

William (V) gedmatch matches v2

In this next example, I matched William (H), the Hickerson cousin (with no Vannoy heritage) against both Buster and me.

William (H) gedmatch me buster

Without Gedmatch data, only two segments of chromosome 15 are triangulated between Vannoy and Hickerson cousins, because we can’t see the small data segments of the rest of the cousins who don’t match over the threshold.

You can see here that nearly the entire chromosome is triangulated using small segments.  In the chart below, you can see both William(V) and William (H) as they match various Vannoy cousins.  Both triangulate with me.

William V and William H

I did the same thing with the Hickerson descendant, Billie, as compared to both me and Dean, with the same type of results.

The next question would be if chromosome 15 is a pileup area where I have a lot of IBS matches that are really population based matches.  It does not appear to be.  I have identified an area of my chromosomes that may be a pileup area, but chromosome 15 does not carry any of those characteristics.

So by utilizing the small segments at GedMatch for chromosome 15 that we can’t otherwise see, we can triangulate at least some of the Hickerson matches.  I can’t complete this chart, because several individuals have not uploaded to GedMatch.

Why would the Hickerson descendant match so many of the Vannoy segments on chromosome 15?  Because this is not a random sample.  This is a proven Vannoy segment and we are trying to see which parts of this segment are from a potential Hickerson mother or the Vannoy father.  If from the Hickerson mother, then this level of matching is not unexpected.  In fact, it would be expected.  Since we cheated and saw that chromosome 15 was already triangulated at Family Tree DNA, we already knew what to expect.

In the spreadsheet below, I’ve added the 2 GedMatch comparisons, William (V) to me and Dean, and William (H) to me and Buster.  You can see the segments that triangulate, on the left.  We could also build “triangulated groups,” like GedMatch does.  I started to do this, but then stopped because I realized most cells would be colored and you’d have a hard time seeing the individual triangulated segments.  I shifted to triangulating only the individuals who triangulate directly with the Hickerson descendant, William(H), shown in green.  GedMatch data is shown in red.

chr 15 with gedmatch

I would like to make three points.

1.  This still is not a complete spreadsheet where everyone is compared to everyone.  This was selectively compared for two known Hickerson cousins, William (V) who descends from both Vannoys and Hickersons and William (H) who descends only from Hickersons.

2. There are 25 individually triangulated segments to the Hickerson descendant on just this chromosome to the various Vannoy cousins.  That’s proof times 25 to just one Hickerson cousin.

3.  I would NEVER suggest that you select one set of small segments and base a decision on that alone.  This entire exercise has assembled cumulative evidence.  By the same token, if the rules for segment matching hold up under the worst circumstances, where we have an unknown but suspected relationship and the small segments appear to continue to follow the triangulation rules, they could be expected to remain true in much more favorable circumstances.

Might any of these people have random DNA matches that are truly IBS by chance on chromosome 15?  Of course, but the matching rules, just like for larger segments, eliminate them.  According to triangulation rules, if they are IBS by chance, they won’t triangulate.  If they do triangulate, that would confirm that they received the same DNA from a common ancestor.

If this is not true, and they did not receive their common DNA from a common ancestor, then it disproves the fundamental matching rule upon which all autosomal DNA genetic genealogy is based and we all need to throw in the towel and just go and do something else.

Is there some grey area someplace?  I would presume so,  but at this point, I don’t know how to discern or define it, if there is.  I’ve done three in-depth studies on three different families over the past 6 weeks or so, and I’ve yet to find an area (except for endogamous populations that have matches by population) where the guidelines are problematic.  Other researchers may certainly make different discoveries as they do the same kind of studies.  There is always more to be discovered, so we need to keep an open mind.

In this situation, it helps a lot that the Hickerson/Vannoy descendants match and triangulate on larger segments on other chromosomes.  This study was specifically to see if smaller segments would triangulate and obey the rules. We were fortunate to have such a large, apparently “sticky” segment of Vannoy DNA on chromosome 15 to work with.

Does small segment matching matter in most cases, especially when you have larger segments to utilize?  Probably not. Use the largest segments first.  But in some cases, like where you are trying to prove an ancestor who was born in the 1700s, you may desperately need that small segment data in order to triangulate between three people.

Why is this important – critically important?  Because if small segments obey all of the triangulation rules when larger segments are available to “prove” the match, then there is no reason that they couldn’t be utilized, using the same rules of IBD/IBS, when larger segments are not available.  We saw this in Just One Cousin as well.

However, in terms of proof of concept, I don’t know what better proof could possibly be offered, within the standard genetic genealogy proofs where IBD/IBS guidelines are utilized as described in the Phasing article.  Additional examples of small segment proof by triangulation are offered in Just One Cousin, Lazarus – Putting Humpty Dumpty Together Again, and in Demystifying Autosomal DNA Matching.

Raising Elijah Vannoy and Sarah Hickerson from the Dead

As I thought more about this situation, I realized that I was doing an awful lot of spreadsheet heavy lifting when a tool might already be available.  In fact, Israel’s mention of Lazarus made me wonder if there was a way to apply this tool to the situation at hand.

I decided to take a look at the Lazarus tool and here is what the intro said:

Generate ‘pseudo-DNA kits’ based on segments in common with your matches. These ‘pseudo-DNA kits’ can then be used as a surrogate for a common ancestor in other tests on this site. Segments are included for every combination where a match occurs between a kit in group1 and group2.

It’s obvious from further instructions that this is really meant for a parent or grandparent, but the technique should work just the same for more distant relatives.

I decided to try it first just with the descendants of Elijah Vannoy.  At first, I thought that recreated Elijah would include the following DNA:

  • DNA segments from Elijah Vannoy
  • DNA segments from Elijah Vannoy’s wife, Lois McNiel
  • DNA segments that match from Elijah’s descendants’ spouses’ lines when individuals come from the same descendant line. This means that if three people descend from Joel Vannoy and Phoebe Crumley, Elijah’s son and his wife, they would match on some DNA from Phoebe, and there would be no way to subtract Phoebe’s DNA.

After working with the Lazarus tool, I realized this is not the case because Lazarus is designed to utilize a group of direct descendants and then compare the DNA of that group to a second group of known relatives who are not descendants.

In other words, suppose you have a man’s grandson and that man’s brother.  The DNA shared by the brother and the grandson HAS to be DNA contributed to that grandson by his grandfather, inherited from their common ancestor, the great-grandfather.  So, in our situation above, Phoebe’s DNA is excluded.

The chart below shows the inheritance path for Lazarus matching.

Lazarus inheritance

Because Lazarus is comparing the DNA of Son Doe with Brother Doe – that eliminates any DNA from the brother’s wives, Sarah Spoon or Mary – because those lines are not shared between Brother Doe and Son Doe.  The only shared ancestors that can contribute DNA to both are Father Doe and Methusaleh Fisher.

The Lazarus instructions allow you to enter the direct descendants of the person/couple that you are reconstructing, then a second set of instructions asks for remaining relatives not directly descended, like siblings, parents, cousins, etc. In other words, those that should share DNA through the common ancestor of the person you are recreating.

To recreate Elijah, I entered all of the Vannoy cousins and then entered William (V) as a sibling since he is the proven son of Daniel Vannoy and Sarah Hickerson.

Here is what Lazarus produced.

lazarus elijah 1

For this run, Lazarus included segments of at least 4cM and 500 SNPs.

The first thing I thought was, “Holy Moly, what happened to chromosome 15?”  I went back and looked, and sure enough, while almost all of the Elijah descendants do match on chromosome 15, William (V), kit 156020, does not match above the Lazarus threshold I selected.  So chromosome 15 is not included.  Finding additional people who are known to be from this Vannoy line and adding them to the “nondescendant” group would probably result in a more complete Elijah.

lazarus elijah 2

Next, to recreate Sarah Hickerson, I added all of the Vannoy cousins plus William (V) as descendants of Sarah Hickerson and then I added just the one Hickerson descendant, William (H), as a sibling.  William (H)’s ancestor is proven to be the sibling of Sarah.

I didn’t know quite what to expect.

Clearly if the DNA from the Hickerson descendant didn’t match or triangulate with DNA from any of the Vannoy cousins at this higher level, then Sarah Hickerson wasn’t likely Elijah’s mother.  I wanted to see matching, but more, I wanted to see triangulation.

lazarus elijah 3

I was stunned.  Every kit except two had matches, some of significant size.

lazarus elijah 4

lazarus elijah 5 v2

Please note that locations on chromosomes 3, 4 and 13, above, are triangulated in addition to matching between two individuals, which constitutes proof of a common ancestor.  Please also note that if you were throwing away segments below 7cM, you would lose all of the triangulated matches and all but two matches altogether.

Clearly, comparing the Vannoy DNA with the Hickerson DNA produced a significant number of matches including three triangulated segments.

lazarus elijah 6

Where Are We?

I never have, and I never would recommend attempting to utilize random small match segments out of context.  By out of context, I mean simply looking at all of your 1cM segments and suggesting that they are all relevant to your genealogy.  Nope, never have.  Never would.

There is no question that many small segments are IBS by chance or identical by population.  Furthermore, working with small segments in endogamous populations may not be fruitful.

Those are the caveats.  Small segments in the right circumstances are useful.  And we’ve seen several examples of the right circumstances.

Over the past few weeks, we have identified guidelines and tools to work with small segments, and they are the same tools and guidelines we utilize to work with larger segments as well.  The difference is size.  When working with large segments, the fact that they are large serves as a filter for us and we don’t question their authenticity.  With all small segments, we must do the matching and analysis work to prove validity.  That work is probably not worthwhile if you have larger segments for the same group of people.

Working with the Vannoy data on chromosome 15 is not random, nor is the family from an endogamous population.  That segment was proven to be Vannoy prior to attempts to confirm or disprove the Hickerson connection.  And we’ve gone beyond just matching, we’ve proven the ancestral link by triangulation, including small segments.  We’ve now proven the Hickerson connection about 7 ways to Sunday.  Ok, maybe 7 is an exaggeration, but here is the evidence summed up for the Vannoy/Hickerson study from multiple vendors and tools:

  • Ancestry DNA Circle indicating that multiple Hickerson descendants match me and some that don’t match me, match each other. Not proof, but certainly suggestive of a common ancestor.
  • A total of 26 Hickerson or derivative family name matches to Vannoy cousins at Family Tree DNA. Not proof, but again, very suggestive.
  • 6 Charles Hickerson/Mary Lytle descendants match to Vannoy cousins at Family Tree DNA. Extremely suggestive, needs triangulation.
  • Triangulation of segments between Vannoy and Hickerson cousins at Family Tree DNA. Proof, but in this study we were only looking to determine whether small segment matches constituted proof.
  • Triangulation of multiple Hickerson/Vannoy cousins on chromosome 15 at GedMatch utilizing small segments and one to one matching. More proof.
  • Lazarus, at higher thresholds than the triangulation matching, when creating Sarah Hickerson, still matched 19 segments and triangulated three for a total of 73.2cM when comparing the Hickerson descendant against the Vannoy cousins. Further proof.

So, can small segment matching data be useful? Is there any reason NOT to accept this evidence as valid?

With proper usage, small segment data certainly looks to provide value by judiciously applying exactly the same rules that apply to all DNA matching.  The difference of course being that you don’t really have to think about utilizing those tools with large segment matches.  It’s pretty well a given that a 20cM match is valid, but you can never assume anything about those small segment matches without supporting evidence. So are larger segments easier to use?  Absolutely.

Does that automatically make small segments invalid?  Absolutely not.

In some cases, especially when attempting to break down brick walls more than 5 or 6 generations in the past, small segment data may be all we have available.  We must use it effectively.  How small is too small?  I don’t know.  It appears that size is really not a factor if you strictly adhere to the IBD/IBS guidelines, but at some point, I would think the segments would be so small that just about everyone would match everyone because we are all humans – so the ultimate identical by population scenario.

Segments that don’t match an individual and either or both parents, assuming you have both parents to test, can safely be disregarded unless they are large and then a look at the raw data is in order to see if there is a problem in that area.  These are IBS by chance.  IBS segments by chance also won’t triangulate further up the tree.  They can’t, because they don’t match your parents so they cannot come from an ancestor.  If they don’t come from an ancestor, they can’t possibly match two other people whose DNA comes from that ancestor on that segment.
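The parental phasing rule above can be sketched as a simple filter.  This is a simplified illustration under my own assumptions: real phasing compares segment positions, while this sketch only checks whether the same person matches a parent on the same chromosome.  The function name and all of the data are made up.

```python
def split_by_parental_phasing(child_matches, mother_matches, father_matches):
    """Split a child's matches into those also found in a parent (kept)
    and those found in neither parent (treated as IBS by chance).

    Each argument is a set of (match_name, chromosome) pairs.
    """
    parental = mother_matches | father_matches
    kept = child_matches & parental
    by_chance = child_matches - parental
    return kept, by_chance


# Made-up matches, for illustration only.
child = {("Cousin A", 15), ("Cousin B", 3), ("Stranger", 7)}
mother = {("Cousin A", 15), ("Cousin C", 1)}
father = {("Cousin B", 3)}

kept, by_chance = split_by_parental_phasing(child, mother, father)
print(sorted(kept))
print(sorted(by_chance))
```

The matches in `kept` are the ones worth mapping and trying to triangulate.  The ones in `by_chance` match neither parent, so under the rule they cannot have come from an ancestor.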

If both parents aren’t available, or your small segments do match with your parents, I would suggest that you retain your small segments and map them.

You can’t recognize patterns if the data isn’t present and you won’t be able to find that proverbial needle in the haystack that we are all looking for.

Based on what we’ve seen in multiple case studies, I would conclude that small segment data is certainly valid and can play a valid role in a situation where there is a known or suspected relationship.

I would agree that attempting to utilize small segment data outside the context of a larger data match is not optimal, at least not today, although I wish the vendors would provide a way for us to selectively lower our thresholds.  A larger segment match can point the way to smaller segment matches between multiple people that can be triangulated.  In some situations, like the person A, B, C, D Hickerson-Vannoy situation I described earlier in this article, I would like to be able to drop the match threshold to reveal the small segment data when other matches are suggestive of a family relationship.

In the Hickerson situation, having the ability to drop the matching thresholds would have been the key to positively confirming this relationship within the vendor’s database and not having to utilize third party tools like GedMatch – which require the cooperation of all parties involved to download their raw data files.  Not everyone transferred their data to Gedmatch in my Vannoy group, but enough did that we were able to do what we needed to do.  That isn’t always the case.  In fact, I have a nearly identical situation in another line but my two matches at Ancestry have declined to download their data to Gedmatch.

This is not the first time that small segment data has played a successful role in finding genealogy solutions, or confirming what we thought we knew – although in all cases to date, larger segments matched as well – and those larger segment matches were key and what pointed me to the potential match that ultimately involved the usage of the small segments for triangulation.

Using larger data segments as pointers probably won’t be the case forever, especially if we can gain confidence that we can reliably utilize small segments, at least in certain situations.  Specifically, a small segment match may be nothing, but a small segment triangulated match in the context of a genealogical situation seems to abide by all of the genetic genealogy DNA rules.

In fact, a situation just arose in the past couple weeks that does not include larger segments matching at a vendor.

Let’s close this article by discussing this recent scenario.

The Adoptee

An adoptee approached me with matching data from GedMatch which included matches to me, Dean, Carl and Harold on chromosome 15, on segments that overlap, as follows.

adoptee chr 15

On the spreadsheet above, sent to me by the adoptee, we can see some matches but not all matches. I ran the balance of these 4 people at GedMatch and below is the matching chart for the segment of chromosome 15 where the adoptee matches the 4 Vannoy cousins plus William(H), the Hickerson cousin.

               Me          Carl        Dean        Harold      Adoptee
Me             NA          FTDNA       FTDNA       GedMatch    GedMatch
Carl           FTDNA       NA          FTDNA       FTDNA       GedMatch
Dean           FTDNA       FTDNA       NA          FTDNA       GedMatch
Harold         GedMatch    FTDNA       FTDNA       NA          GedMatch
Adoptee        GedMatch    GedMatch    GedMatch    GedMatch    NA
William (H)    GedMatch    GedMatch    GedMatch    GedMatch    GedMatch

I decided to take the easy route and just utilize Lazarus again, so I added all of the known Vannoy and Hickerson cousins I utilized in earlier Lazarus calculations at Gedmatch as siblings to our adoptee.  This means that each kit will be compared to the adoptee’s DNA and matching segments will be reported.  At a threshold of 300 SNPs and 4cM, our adoptee matches at 140cM of common DNA between the various cousins.

adoptee vannoy match

Please note that in addition to matching several of the cousins, our adoptee also triangulates on chromosomes 1, 11, 15, 18, 19 and 21.  The triangulation on chromosome 21 is to two proven Hickerson descendants, so he matches on this line as well.

I reduced the threshold to 4cM and 200 SNPs to see what kind of difference that would make.

adoptee vannoy match low threshold

Our adoptee picked up another triangulation on chromosome 1 and added additional cousins in the chromosome 15 “sticky Vannoy” cluster and the chromosome 18 cluster.
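To show how much the chosen thresholds control what you see, here is a small sketch of threshold filtering.  The segment list is entirely made up and is not the adoptee’s data; the `filter_segments` helper is hypothetical as well.

```python
def filter_segments(segments, min_cm, min_snps):
    """Keep only segments that meet both the cM and the SNP threshold."""
    return [s for s in segments if s["cm"] >= min_cm and s["snps"] >= min_snps]


# Made-up segments, for illustration only.
segments = [
    {"chrom": 1, "cm": 4.2, "snps": 250},
    {"chrom": 15, "cm": 7.1, "snps": 900},
    {"chrom": 15, "cm": 4.8, "snps": 320},
    {"chrom": 18, "cm": 5.5, "snps": 210},
    {"chrom": 21, "cm": 3.1, "snps": 400},
]

at_300_snps = filter_segments(segments, min_cm=4.0, min_snps=300)
at_200_snps = filter_segments(segments, min_cm=4.0, min_snps=200)
at_7_cm = filter_segments(segments, min_cm=7.0, min_snps=200)

print(len(at_300_snps), len(at_200_snps), len(at_7_cm))
```

Lowering the SNP threshold lets more segments through, and raising the cM threshold to 7 removes most of them.  Which of the surviving segments are meaningful still has to be settled by triangulation, not by the threshold alone.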

Given what we just showed about chromosome 15, and the discussions about IBD and IBS guidelines and small matching segments, what conclusions would you draw and what would you do?

  1. Tell the adoptee this is invalid because there are no qualifying large match segments that match at the vendors.
  2. Tell the adoptee to throw all of those small segments away, or at least all of the ones below 7cM because they are only small matching segments and utilizing small matching segments is only a folly and the adoptee is only seeing what he wants to see – even though the Vannoy cousins with whom he triangulates are proven, triangulated cousins.
  3. Check to see if the adoptee also matches the other cousins involved, although he clearly already exceeds the triangulation criteria to declare a common ancestor of 3 proven cousins on a matching segment. This is actually what I did utilizing Lazarus and you just saw the outcome.

If this is a valid match, based on who he does and doesn’t match in terms of the rest of the family, you could very well narrow his line substantially – perhaps by utilizing the various Vannoy wives’ DNA, to an ancestral couple.  Given that our adoptee matches both the Vannoys and the Hickersons, I suspect he is somehow descended from Daniel Vannoy and Sarah Hickerson.

In Conclusion

What is the acceptable level to utilize small segments in a known or suspected match situation?

Rather than look for a magic threshold number, we are much better served to look at reliable methods to determine the difference between DNA passed from our ancestors to us, IBD, and matches by chance.  This helps us to establish the reliability of DNA segments in individual situations we are likely to encounter in our genealogy.  In other words, rather than throw the entire pile of wheat away because there is some percentage of chaff in the wheat, let’s figure out how to sort the wheat from the chaff.

Fortunately, both parental phasing and triangulation eliminate the identical by chance segments.

Clearly, the smaller the segments, even in a known match situation, the more likely they are identical by population, given that they triangulate.  In fact, this is exactly how the Neanderthal and Denisovan genomes have been reconstructed.

Furthermore, given that the Anzick DNA sample is over 12,000 years old, identical by population must be how Anzick is matching to contemporary humans, because at least some of these people do clearly share a common ancestor with Anzick at some point, long ago – more than 12,000 years ago.  In my case, at least some of the Anzick segments triangulate with my mother’s DNA, so they are not IBS by chance.  That only leaves identical by population or identical by descent, meaning within a genealogical timeframe, and we know that isn’t possible.

There are yet other situations where small segment matches are not IBS by chance nor identical by population.  For example, I have a very hard time believing that the adoptee situation is nothing but chance.  It’s not a folly.  It’s identical by descent as proven by triangulation with 10 different cousins – all on segments below the vendor matching thresholds.

In fact, it’s impossible to match the Vannoy cousins, who are already triangulated individually, by chance.  While the adoptee match is not over the vendor threshold, the segments are not terribly small and they do all triangulate with multiple individuals who also triangulate with larger segments, at the vendors and on different chromosomes.

This adoptee triangulated match, even without the Hickerson-Vannoy study, disproves the blanket statement that small segments below 5cM cannot be used for genealogy.  All of these segments are 7.1cM or below and most are below 5.

This small segment match between my mother and her first cousins also disproves that segments under 5cM can never be used for genealogy.

Two cousins combined

This small segment passed from my mother to me disproves that statement too – clearly matching with our cousin, Cheryl.  If I did not receive this from my mother, and she from her parent, then how do we match a common cousin???

me mother small seg

More small segment proof, below, between my mother and her second cousin when Lazarus was reconstructing my mother’s father.

2nd cousin lazarus match

And this Vannoy Hickerson 4 cousin triangulated segment also disproves the claim that segments of 5cM and below cannot be used for genealogy.

vannoy hickerson triang

Where did these small segments come from if not a common ancestor, either one or several generations ago?  If you look at the small segment I inherited from my mother and say, “well, of course that’s valid, you got it from your mother” then the same logic has to apply that she inherited it from her parent.  The same logic then applies that the same small segment, when shared by my mother’s cousin, also came from their common grandparents.  One cannot be true without the others being true.  It’s the same DNA. I got it from my mother.  And it’s only a 1.46cM segment, shown in the examples above.

Here are my observations and conclusions:

  • As proven with hundreds of examples in this and other articles cited, small segments can be and are inherited from our ancestors and can be utilized for genetic genealogy.
  • There is no line in the sand at 7cM or 5cM at which a segment is viable and useful at 5.1cM and not at 4.9cM.
  • All small segment matches need to be evaluated utilizing the guidelines for IBD versus IBS by chance versus identical by population set forth in the articles titled How Phasing Works and Determining IBD Versus IBS Matches and Demystifying Autosomal DNA Matching.
  • When given a choice, large segment matches are always easier to use because they are seldom IBS by chance and most often IBD.
  • Small segment matches are more likely to be IBS by chance than larger matches, which is why we need to judiciously apply the IBD/IBS Guidelines when attempting to utilize small segment matches.
  • All DNA matches, not just small segments, must be triangulated to prove a common ancestor, unless they are known close relatives, like siblings, first cousins, etc.
  • When working in genetic genealogy, always glean the information from larger matches and assemble that information.  However, when the time comes that you need those small segments because you are working 5, 6 or 7 generations back in time, remember that tools and guidelines exist to use small segments reliably.
  • Do not attempt to use small segments out of context.  This means that if you were to look only at your 1cM matches to unknown people, and you have the ability to triangulate against your parents, most would prove to be IBS by chance.  This is the basis of the argument for why some people delete their small segments.  However, by utilizing parental phasing, phasing against known family members (like uncles, aunts and first cousins) and triangulation, you can identify and salvage the useable small segments – and these segments may be the only remnants of your ancestors more than 5 or 6 generations back that you’ll ever have to work with.  You do not have to throw all of them away simply because some or many small segments, out of context, are IBS by chance.  It doesn’t hurt anything to leave them just sit in your spreadsheet untouched until the day that you need them.

Ultimately, the decision is yours whether you will use small segments or not – and either decision is fine.  However, don’t make the decision based on the belief that small segments under some magic number, like 5cM or 7cM are universally useless.  They aren’t.

Whether small segments are too much work and effort in your individual situation depends on your personal goals for genetic genealogy and on factors like whether or not you descend from an endogamous population.  People’s individual goals and circumstances vary widely.  Some people test at Ancestry and are happy with inferential matching circles and nothing more.  Some people want to wring every tidbit possible out of genealogy, genetic or otherwise.

I hope everyone will begin to look at how they can use small segment data reliably instead of simply discarding all the small segments on the premise that all small segment data is useless because some small segments are not useful.  All unstudied and discarded data is indeed useless, so discarding becomes a self-fulfilling prophecy.

But by far, the worst outcome of throwing perfectly good data away is that you’ll never know what genetic secrets it held for you about your ancestors.  Maybe the DNA of your own Sarah Hickerson is lurking there, just waiting for the right circumstances to be found.

______________________________________________________________

Disclosure

I receive a small contribution when you click on some of the links to vendors in my articles. This does NOT increase the price you pay but helps me to keep the lights on and this informational blog free for everyone. Please click on the links in the articles or to the vendors below if you are purchasing products or DNA testing.

Thank you so much.

DNA Purchases and Free Transfers

Genealogy Services

Genealogy Research

Lazarus – Putting Humpty Dumpty Back Together Again

Recently, GedMatch introduced a tool, Lazarus, to figuratively raise the dead by combining the DNA of descendants, siblings and other relatives of long-dead ancestors to recreate their genome.  Kind of like piecing Humpty Dumpty back together again.

Humpty Dumpty

Blaine Bettinger wrote about using Lazarus here and here where he recreated the genome of his grandmother.  I’d like to use Lazarus to see how it works with one pair of siblings and a first cousin.  Blaine was fortunate to have 4 siblings.  I have a much smaller group of people to work with, so let’s see what we can do and how successful we are, or aren’t.  But first, let’s talk about the basics and how we can reconstruct an ancestor.

The Basics

An individual has 6766.2 cM of DNA.  Both parents give half of their DNA to each child, but not exactly the same parental DNA is contributed to each child.  A random process selects which half of the parents’ DNA is given to each child.  Different children will have some of the same DNA from their parents, and some different DNA from each parent.

Obviously, the DNA contributed to each child from a parent is a combination of the DNA given to the parent by the grandparents.  Approximately half of the grandparent’s DNA is given to each child.  In many cases, the DNA contributed to the child from the grandparents is not actually divided evenly, and we receive all or nothing of individual segments, not half.  Half is an average that works pretty well most of the time.  It’s a statistic, and we all know about statistics…right???

Therefore, children carry 3383cM of each parent’s DNA.  Each sibling carries half of the same DNA from their parents.  From the ISOGG autosomal DNA statistics chart, each sibling actually carries 25% of exactly the same DNA from both parents, 50% where they inherited half of the same DNA from one parent and different DNA from the other parent, and 25% where the siblings don’t share any of the identical DNA from their parents. This averages 50%.
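The arithmetic behind these figures is short enough to check directly.  The sketch below uses the 6766.2 cM total and the 25% / 50% / 25% split quoted above; the variable names are my own.

```python
TOTAL_CM = 6766.2  # total cM for an individual, as quoted above

# Each parent contributes half of the child's DNA.
cm_per_parent = TOTAL_CM / 2

# Sibling sharing, weighted by how much of each region is actually shared:
# 25% fully identical, 50% half identical, 25% not identical at all.
sibling_share = 0.25 * 1.0 + 0.50 * 0.5 + 0.25 * 0.0

print(f"cM from each parent: {cm_per_parent:.1f}")
print(f"average sibling sharing: {sibling_share:.0%}")
```

The weighted average is where the familiar 50% for siblings comes from, even though no single region of the genome is shared at exactly that rate.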

This chart, also from ISOGG, sums up what percentage of the same DNA different relatives can expect to carry.

cousin percents

Recreating Ferverda Brothers

I have a situation where I have a person, Barbara, and two of her first cousins, Cheryl and Don, who are siblings.  This is the same family we discussed in the Just One Cousin article.

Miller Ferverda chart

In this case, Cheryl and Don share 50% of Roscoe’s DNA.

Barbara shares 12.5% of Hiram and Evaline’s DNA with Cheryl and 12.5% with Don, but not the same 12.5%.  Since siblings share 50% of their DNA, Barbara should share about 12.5% of Cheryl’s DNA and an additional 6.25% that Cheryl didn’t receive from Roscoe, but that Don did.

Translating that into cMs, Barbara should share about 850 cM with Cheryl and an additional 425 cM with Don, for an approximate total of 1275 cM.
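Here is that translation worked through in a few lines of Python, using the 6766.2 cM total from The Basics.  The exact products come out slightly under the rounded 850, 425 and 1275 figures I used, which is fine since these are only averages.

```python
TOTAL_CM = 6766.2  # total cM for an individual, from The Basics

first_cousin_share = 0.125    # expected sharing between Barbara and Cheryl
extra_from_sibling = 0.0625   # additional share expected to show up only via Don

cm_with_cheryl = TOTAL_CM * first_cousin_share
cm_extra_with_don = TOTAL_CM * extra_from_sibling
expected_total = cm_with_cheryl + cm_extra_with_don

print(f"with Cheryl:          {cm_with_cheryl:.1f} cM")
print(f"additional with Don:  {cm_extra_with_don:.1f} cM")
print(f"expected total:       {expected_total:.1f} cM")
```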

At http://www.gedmatch.com, I selected the Tier 1 (subscription or donation) option of Lazarus and was presented with this menu.

lazarus menu

My first attempt was to recreate Barbara’s father, John W. Ferverda.  I allowed 100 SNPs and 4cM because I was hoping to be able to accumulate more than the required 1500cM of matching DNA for the kit to be utilized as a “real kit,” available for one-to-many matching.

                    100SNP 4cM   200SNP 4cM   300SNP 4cM   400SNP 4cM   500SNP 4cM   600SNP 4cM   700SNP 4cM
John W. Ferverda    1330.7 cM    1370.2 cM    1360.0 cM    1353.5 cM    1338.7 cM    1336.2 cM    1322.9 cM

I then experimented with the various SNP levels, leaving the cM at 4.

The resulting number of cM of just over 1300, no matter how you slice and dice it, is very near the expected approximation of 1275.

Using the Lazarus tool, I created “John Ferverda” by listing Barbara as his descendant and both Cheryl and Don as cousins.

To create “Roscoe Ferverda,” I reversed the positions of the individuals, listing Don and Cheryl as descendants and Barbara as the cousin.

Lazarus options

These two created individuals, “John” and “Roscoe” should be exactly the same, and, thankfully, they were.

Both recreated “John” and “Roscoe” represent a common set of DNA from the parents of both of these men, Hiram Ferverda and Evaline Miller based on the matching DNA of their descendants, Barbara, Cheryl and Don.

The way Lazarus works is that all kits in Group 1, the descendants, are compared with Group 2, other relatives but not descendants.  The descendants will carry some of Roscoe’s DNA, but also the DNA of Roscoe’s wife, the mother of Don and Cheryl.  By comparing against known relatives but not direct descendants, Lazarus effectively narrows the DNA to that contributed only by the common ancestor of group 1 and group 2.  In this case, that common ancestor would be John and Roscoe’s parents, Hiram Ferverda and Evaline Miller.  By comparing the descendant and non-descendant-but-otherwise-related groups, you effectively subtract out the mother’s DNA from the descendants – in this case meaning the DNA of John Ferverda’s wife and Roscoe Ferverda’s wife.

In other words, the descendants, above, are NOT compared to each other, but instead, to each one of the not-descendant-but-otherwise-related group.
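The group comparison can be sketched as follows.  To be clear, this is my own simplified illustration of the idea and not GedMatch’s actual Lazarus algorithm, which is not published here and whose real output splits segments more finely than a plain merge.  The function names and the positions are hypothetical.

```python
from itertools import product


def merge_intervals(intervals):
    """Merge overlapping (chrom, start, end) intervals into one combined list."""
    merged = []
    for chrom, start, end in sorted(intervals):
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            prev = merged[-1]
            merged[-1] = (chrom, prev[1], max(prev[2], end))
        else:
            merged.append((chrom, start, end))
    return merged


def lazarus_sketch(group1, group2, pairwise_matches):
    """Collect segments for every group1 x group2 pair, then merge them.

    pairwise_matches maps a (kit_a, kit_b) tuple to a list of
    (chrom, start, end) matching segments.  Kits in group1 are never
    compared with each other, only with kits in group2.
    """
    collected = []
    for descendant, relative in product(group1, group2):
        collected.extend(pairwise_matches.get((descendant, relative), []))
    return merge_intervals(collected)


# Made-up positions, for illustration only.
matches = {
    ("Cheryl", "Barbara"): [(1, 100, 500), (2, 50, 90)],
    ("Don", "Barbara"): [(1, 400, 800)],
    ("Cheryl", "Don"): [(3, 10, 900)],  # descendant vs. descendant: ignored
}

recreated = lazarus_sketch(["Cheryl", "Don"], ["Barbara"], matches)
print(recreated)
```

Notice that the Cheryl to Don segment on chromosome 3 never makes it into the result.  That is the step that keeps their mother’s DNA out of the recreated kit.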

Unfortunately, none of the kits generated was over the 1500 cM threshold.  I remembered that there is also a second cousin, Rex, whose DNA we can add because he descends from the parents of Evaline Miller.

Adding Rex to the mix brought the resulting “Roscoe” kit to 1589.7 cM and the resulting “John” kit to 1555.7 cM, both now barely over the 1500 threshold – but over just the same and that’s all that matters.  Soon, we’ll be able to utilize both of these kits for direct matching as a “person” at GedMatch.  Now how cool is that???

You receive four pieces of output information when you create a Lazarus kit.

First, a comparison between the descendants (Group 1 above, Kit 2 below) and each of the cousins and related-but-not-descendants individuals (Group 2 above, Kit 1 below), by chromosome.

John W. Ferverda

Processed: 2015/01/09 17:32:41
Name: John W. Ferverda
SNP threshold = 100 cM
Threshold = 4.0 cM
Batch processing will be performed if resulting kit achieves required threshold of 1500 cM.

Contributions:

Kit 1    Kit 2      Chr    Start        End          cM
F9141    M133930    1      72017        5703284      14.8
F9141    M133930    1      17271101     18589169     4.1
F9141    M133930    1      32804999     65722466     37.8
F9141    M133930    1      242601404    247174776    8.5

Obviously, these are only snippets of the output for chromosome 1.  You receive a chart of this same information for all of the chromosomes of the people being compared.

Second, a chart that shows the resulting matching segments.

Resulting Segments:

Chr    Start        End          cM
1      742429       5694404      14.8
1      17285357     18588145     4.1
1      38226163     43823334     7.2
1      43975578     54990495     8.0
1      55040097     62847030     12.1
1      76341094     85237614     8.7
1      242606491    247179501    8.5

At the bottom of this second set of numbers is the all-important total cM.  This is the only place you will find this number.

Total cM: 1555.7
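If you want to check a total like this yourself, you can simply add up the cM column of the resulting segments.  The sketch below sums only the chromosome 1 segments shown in the snippet above, so it gives the chromosome 1 contribution, not the full 1555.7 cM.  The variable names are my own.

```python
# cM values of the chromosome 1 resulting segments shown above.
chr1_segment_cm = [14.8, 4.1, 7.2, 8.0, 12.1, 8.7, 8.5]

chr1_total = sum(chr1_segment_cm)

KIT_TOTAL_CM = 1555.7        # total reported at the bottom of the output
BATCH_THRESHOLD_CM = 1500.0  # required for the kit to be batch processed

print(f"chromosome 1 contributes {chr1_total:.1f} cM")
print("kit qualifies for batch processing:", KIT_TOTAL_CM >= BATCH_THRESHOLD_CM)
```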

Third, a list of the original kits that have match results between the two groups.

Original Kits match with result:

Kit        Chr    Start        End          cM
F9141      1      742429       5700507      14.8
F9141      1      10899689     12530765     4.5
F9141      1      35075204     65714854     35.3
F9141      1      76334120     85252045     8.7
F9141      1      242606379    247169190    8.5
M133930    1      742429       5705356      14.8
M133930    1      35075956     65714854     35.3
M133930    1      242606491    247165725    8.5
F50000     1      10899689     12530765     4.5
F153785    1      742584       5700507      14.8
F153785    1      76337055     85252045     8.7
F153785    1      242606379    247169190    8.5

And finally, a summary.

196074 single allele SNPs were derived for the resulting kit.
37068 bi-allelic SNPs were derived for the resulting kit.
233142 total SNPs were derived for the resulting kit.
Kit number of Result: LX056148
Kit Name: John Ferverda 8
Your Lazarus file has been generated.

Is this as good as the real McCoy, meaning swabbing John and Roscoe?  Of course not, but John and Roscoe aren’t available for swabbing.  In fact, John and Roscoe are both probably finding this pretty amusing from someplace on the other side, watching their children “recreate” them!

I can hear them now, shaking their heads, “Well I never….”

They should have known if they left Cheryl and me here, together, unsupervised that we would do something like this!!!

______________________________________________________________

Disclosure

I receive a small contribution when you click on some of the links to vendors in my articles. This does NOT increase the price you pay but helps me to keep the lights on and this informational blog free for everyone. Please click on the links in the articles or to the vendors below if you are purchasing products or DNA testing.

Thank you so much.


2014 Top Genetic Genealogy Happenings – A Baker’s Dozen +1

It’s that time again, to look over the year that has just passed and take stock of what has happened in the genetic genealogy world.  I wrote a review in both 2012 and 2013 as well.  Looking back, these momentous happenings seem quite “old hat” now.  For example, both www.GedMatch.com and www.DNAGedcom.com, once new, have become indispensable tools that we take for granted.  Please keep in mind that both of these tools (as well as others in the Tools section, below) depend on contributions, although GedMatch now has a tier 1 subscription offering for $10 per month as well.

So what was the big news in 2014?

Beyond the Tipping Point

Genetic genealogy has gone over the tipping point.  Genetic genealogy is now, unquestionably, mainstream and lots of people are taking part.  As best I can figure, the total is now approaching, or has surpassed, three million tests or test records, although certainly some of those are duplicates.

  • 500,000+ at 23andMe
  • 700,000+ at Ancestry
  • 700,000+ at Genographic

The organizations above represent “one-test” companies.  Family Tree DNA provides various kinds of genetic genealogy tests to the community and they have over 380,000 individuals with more than 700,000 test records.

In addition to the above mentioned mainstream firms, there are other companies that provide niche testing, often in addition to Family Tree DNA Y results.

In addition, there is what I would refer to as a secondary market for testing as well which certainly attracts people who are not necessarily genetic genealogists but who happen across their corporate information and decide the test looks interesting.  There is no way of knowing how many of those tests exist.

Additionally, there is still the Sorenson data base with Y and mtDNA tests which reportedly exceeded their 100,000 goal.

Spencer Wells spoke about the “viral spread threshold” in his talk in Houston at the International Genetic Genealogy Conference in October and termed 2013 the year of infection.  I would certainly agree.


Autosomal Now the New Normal

Another change in the landscape is that now, autosomal DNA has become the “normal” test.  The big attraction to autosomal testing is that anyone can play and you get lots of matches.  Earlier in the year, one of my cousins was very disappointed in her brother’s Y DNA test because he only had a few matches, and couldn’t understand why anyone would test the Y instead of autosomal where you get lots and lots of matches.  Of course, she didn’t understand the difference in the tests or the goals of the tests – but I think as more and more people enter the playground – percentagewise – fewer and fewer do understand the differences.

Case in point is that someone contacted me about DNA and genealogy.  I asked them which tests they had taken and where, and their answer was “the regular one.”  With a little more probing, I discovered that they took Ancestry’s autosomal test and had no clue that there were any other types of tests available, what those tests could tell them about their ancestors or genetic history, or that there were other vendors and pools to swim in as well.

A few years ago, we not only had to explain about DNA tests, but why the Y and mtDNA are important.  Today, we’ve come full circle in a sense – because now we don’t have to explain about DNA testing for genealogy in general but we still have to explain about those “unknown” tests, the Y and mtDNA.  One person recently asked me, “oh, are those new?”

Ancient DNA

This year has seen many ancient DNA specimens analyzed and sequenced at the full genomic level.

The year began with a paper titled, “When Populations Collide” which revealed that contemporary Europeans carry between 1-4% of Neanderthal DNA most often associated with hair and skin color, or keratin.  Africans, on the other hand, carry none or very little Neanderthal DNA.

http://dna-explained.com/2014/01/30/neanderthal-genome-further-defined-in-contemporary-eurasians/

A month later, a monumental paper was published that detailed the results of sequencing a 12,500-year-old Clovis child, subsequently named Anzick or referred to as the Anzick Clovis child, in Montana.  That child is closely related to Native American people of today.

http://dna-explained.com/2014/02/13/clovis-people-are-native-americans-and-from-asia-not-europe/

In June, another paper emerged where the authors had analyzed 8000-year-old bones from the Fertile Crescent that shed light on the Neolithic era before the expansion from the Fertile Crescent into Europe.  These would be the farmers that assimilated with or replaced the hunter-gatherers already living in Europe.

http://dna-explained.com/2014/06/09/dna-analysis-of-8000-year-old-bones-allows-peek-into-the-neolithic/

Svante Paabo is the scientist who first sequenced the Neanderthal genome.  Here is a great interview and speech.  This man is so interesting.  If you have not read his book, “Neanderthal Man, In Search of Lost Genomes,” I strongly recommend it.

http://dna-explained.com/2014/07/22/finding-your-inner-neanderthal-with-evolutionary-geneticist-svante-paabo/

In the fall, yet another paper was released that contained extremely interesting information about the peopling and migration of humans across Europe and Asia.  This was just before Michael Hammer’s presentation at the Family Tree DNA conference, so I covered the paper along with Michael’s information about European ancestral populations in one article.  The take away messages from this are two-fold.  First, there was a previously undefined “ghost population” called Ancient North Eurasian (ANE), found in the northern portion of Asia, that contributed to both Asian populations, including those that would become the Native Americans, and European populations as well.  Second, the people we thought were in Europe early may not have been, based on the ancient DNA remains we have to date.  Of course, that may change when more ancient DNA is fully sequenced, which seems to be happening at an ever-increasing rate.

http://dna-explained.com/2014/10/21/peopling-of-europe-2014-identifying-the-ghost-population/

[Image: Lazaridis tree]

Ancient DNA Available for Citizen Scientists

If I were to give a Citizen Scientist of the Year award, this year’s award would go unquestionably to Felix Chandrakumar for his work with the ancient genome files and making them accessible to the genetic genealogy world.  Felix obtained the full genome files from the scientists involved in full genome analysis of ancient remains, reduced the files to the SNPs utilized by the autosomal testing companies in the genetic genealogy community, and has made them available at GedMatch.

http://dna-explained.com/2014/09/22/utilizing-ancient-dna-at-gedmatch/

If this topic is of interest to you, I encourage you to visit his blog and read his many posts over the past several months.

https://plus.google.com/+FelixChandrakumar/posts

The availability of these ancient results set off a sea of comparisons.  Many people with Native heritage matched Anzick’s file at some level, and many who are heavily Native American, particularly from Central and South America where there is less admixture, match Anzick at what would statistically be considered within a genealogical timeframe.  Clearly, this isn’t possible, but it does speak to how endogamous populations affect DNA, even across thousands of years.

http://dna-explained.com/2014/09/23/analyzing-the-native-american-clovis-anzick-ancient-results/

Because Anzick is matching so heavily with the Mexican, Central and South American populations, it gives us the opportunity to extract mitochondrial DNA haplogroups from the matches that either are or may be Native, if they have not been recorded before.

http://dna-explained.com/2014/09/23/analyzing-the-native-american-clovis-anzick-ancient-results/

Needless to say, the matches of these ancient kits with contemporary people have left many people questioning how to interpret the results.  The answer is that we don’t really know yet, but there is a lot of study as well as speculation occurring.  In the citizen science community, this is how forward progress is made…eventually.

http://dna-explained.com/2014/09/25/ancient-dna-matches-what-do-they-mean/

http://dna-explained.com/2014/09/30/ancient-dna-matching-a-cautionary-tale/

More ancient DNA samples for comparison:

http://dna-explained.com/2014/10/04/more-ancient-dna-samples-for-comparison/

A Siberian sample that also matches the Malta Child whose remains were analyzed in late 2013.

http://dna-explained.com/2014/11/12/kostenki14-a-new-ancient-siberian-dna-sample/

Felix has prepared a list of kits that he has processed, along with their GedMatch numbers and other relevant information, like gender, haplogroup(s), age and location of sample.

http://www.y-str.org/p/ancient-dna.html

Furthermore, in a collaborative effort with Family Tree DNA, Felix formed an Ancient DNA project and uploaded the ancient autosomal files.  This is the first time that consumers can match with Ancient kits within the vendor’s data bases.

https://www.familytreedna.com/public/Ancient_DNA

Recently, GedMatch added a composite Archaic DNA Match comparison tool where your kit number is compared against all of the ancient DNA kits available.  The output is a heat map showing which samples you match most closely.

[Image: GedMatch ancient DNA heat map]

Indeed, it has been a banner year for ancient DNA and making additional discoveries about DNA and our ancestors.  Thank you Felix.

Haplogroup Definition

That SNP tsunami that we discussed last year…well, it made landfall this year and it has been storming all year long…in a good way.  At least, ultimately, it will be a good thing.  If you asked the haplogroup administrators today about that, they would probably be too tired to answer – as they’ve been quite overwhelmed with results.

The Big Y testing has been fantastically successful.  This is not from a Family Tree DNA perspective, but from a genetic genealogy perspective.  Branches have been added to and sawed off of the haplotree on a daily basis.  This forced the renaming of the haplogroups from the old traditional R1b1a2 to R-M269 in 2012.  While there was some whimpering then, it was nothing like the outright wailing that would be occurring now as haplogroup names reached 20 or so digits.

Alice Fairhurst discussed the SNP tsunami at the DNA Conference in Houston in October and I’m sure that the pace hasn’t slowed any between then and now.  According to Alice, in early 2014, there were 4115 individual SNPs on the ISOGG Tree, and as of the conference, there were 14,238 SNPs, with the 2014 addition total at that time standing at 10,213.  That is over 1000 per month or about 35 per day, every day.
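For anyone who wants to check those rates, here is a small sketch of the arithmetic in Python. The two dates are assumptions (January 1 for "early 2014" and October 11 for the conference), so treat the result as a ballpark. Note that subtracting the two tree totals gives 10,123, slightly different from the 10,213 addition total quoted above; the sketch uses the subtraction.

```python
from datetime import date

SNPS_EARLY_2014 = 4115       # ISOGG tree count quoted for early 2014
SNPS_AT_CONFERENCE = 14238   # ISOGG tree count quoted as of the conference

START = date(2014, 1, 1)     # assumed date for "early 2014"
END = date(2014, 10, 11)     # assumed conference date


def growth_rates(start_count, end_count, start, end):
    """Return (SNPs added, SNPs per day, SNPs per average month)."""
    added = end_count - start_count
    days = (end - start).days
    per_day = added / days
    per_month = per_day * 365 / 12  # average month length in a 365-day year
    return added, per_day, per_month


if __name__ == "__main__":
    added, per_day, per_month = growth_rates(
        SNPS_EARLY_2014, SNPS_AT_CONFERENCE, START, END
    )
    print(f"{added} SNPs added")
    print(f"about {per_day:.1f} per day, about {per_month:.0f} per month")
```

With these assumed dates, the figures come out consistent with "over 1000 per month or about 35 per day."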

Yes, indeed, that is the definition of a tsunami.  Every one of those additions requires one of a number of volunteers, generally haplogroup project administrators, to evaluate the various Big Y results, the SNPs and novel variants included, where they need to be inserted in the tree and if branches need to be rearranged.  In some cases, naming requests for previously unknown SNPs also need to be submitted.  This is all done behind the scenes and it’s not trivial.

The project I’m closest to is the R1b L-21 project because my Estes males fall into that group.  We’ve tested several, and I’ll be writing an article as soon as the final test is back.

The tree has grown unbelievably in this past year just within the L21 group.  This project includes over 700 individuals who have taken the Big Y test and shared their results which has defined about 440 branches of the L21 tree.  Currently there are almost 800 kits available if you count the ones on order and the 20 or so from another vendor.

Here is the L21 tree in January of 2014

[Image: L21 tree, January 2014]

Compare this with today’s tree, below.

[Image: L21 tree, December 2014]

Michael Walsh, Richard Stevens and David Stedman need to be commended for their incredible work in the R-L21 project.  Other administrators are doing equivalent work in other haplogroup projects as well.  A big thank you to everyone.  We’d be lost without you!

One of the results of this onslaught of information is that there have been fewer and fewer academic papers about haplogroups in the past few years.  In essence, by the time a paper can make it through the peer review cycle and into publication, the data in the paper is often already outdated relative to the Y chromosome.  Recently a new paper was released about haplogroup C3*.  While the data is quite valid, the authors didn’t utilize the new SNP naming nomenclature.  Before writing about the topic, I had to translate into SNPese.  Fortunately, C3* has been relatively stable.

http://dna-explained.com/2014/12/23/haplogroup-c3-previously-believed-east-asian-haplogroup-is-proven-native-american/

10th Annual International Conference on Genetic Genealogy

The Family Tree DNA International Conference on Genetic Genealogy for project administrators is always wonderful, but this year was special because it was the 10th annual.  And yes, it was my 10th year attending as well.  In all these years, I had never had a photo with both Max and Bennett.  Everyone is always so busy at the conferences.  Getting any 3 people, especially those two, in the same place at the same time takes something just short of a miracle.

[Image: Roberta, Max and Bennett]

Ten years ago, it was the first genetic genealogy conference ever held, and was the only place to obtain genetic genealogy education outside of the rootsweb genealogy DNA list, which is still in existence today.  Family Tree DNA always has a nice blend of sessions.  I always particularly appreciate the scientific sessions because those topics generally aren’t covered elsewhere.

http://dna-explained.com/2014/10/11/tenth-annual-family-tree-dna-conference-opening-reception/

http://dna-explained.com/2014/10/12/tenth-annual-family-tree-dna-conference-day-2/

http://dna-explained.com/2014/10/13/tenth-annual-family-tree-dna-conference-day-3/

http://dna-explained.com/2014/10/15/tenth-annual-family-tree-dna-conference-wrapup/

Jennifer Zinck wrote great recaps of each session and the ISOGG meeting.

http://www.ancestorcentral.com/decennial-conference-on-genetic-genealogy/

http://www.ancestorcentral.com/decennial-conference-on-genetic-genealogy-isogg-meeting/

http://www.ancestorcentral.com/decennial-conference-on-genetic-genealogy-sunday/

I thank Family Tree DNA for sponsoring all 10 conferences and continuing the tradition.  It’s really an amazing feat when you consider that 15 years ago, this industry didn’t exist at all and wouldn’t exist today if not for Max and Bennett.

Education

Two educational venues offered classes for genetic genealogists and have made their presentations available either for free or very reasonably.  One of the problems with genetic genealogy is that the field is so fast moving that last year’s session, unless it’s the very basics, is probably out of date today.  That’s the good news and the bad news.

http://dna-explained.com/2014/11/12/genetic-genealogy-ireland-2014-presentations 

http://dna-explained.com/2014/09/26/educational-videos-from-international-genetic-genealogy-conference-now-available/

In addition, three books have been released in 2014.

In January, Emily Aulicino released Genetic Genealogy, The Basics and Beyond.


In October, Richard Hill released “Guide to DNA Testing: How to Identify Ancestors, Confirm Relationships and Measure Ethnicity through DNA Testing.”


Most recently, David Dowell’s new book, NextGen Genealogy: The DNA Connection was released right after Thanksgiving.

 

Ancestor Reconstruction – Raising the Dead

This seems to be the year that genetic genealogists are beginning to reconstruct their ancestors (on paper, not in the flesh) based on the DNA that the ancestors passed on to various descendants.  Those segments are “gathered up” and reassembled in a virtual ancestor.

I utilized Kitty Cooper’s tool to do just that.

http://dna-explained.com/2014/10/03/ancestor-reconstruction/

[Image: Henry Bolton]

I know it doesn’t look like much yet but this is what I’ve been able to gather of Henry Bolton, my great-great-great-grandfather.

Kitty did it herself too.

http://blog.kittycooper.com/2014/08/mapping-an-ancestral-couple-a-backwards-use-of-my-segment-mapper/

http://blog.kittycooper.com/2014/09/segment-mapper-tool-improvements-another-wold-dna-map/

Ancestry.com wrote a paper about the fact that they have figured out how to do this as well in a research environment.

http://corporate.ancestry.com/press/press-releases/2014/12/ancestrydna-reconstructs-partial-genome-of-person-living-200-years-ago/

http://www.thegeneticgenealogist.com/2014/12/16/ancestrydna-recreates-portions-genome-david-speegle-two-wives/

GedMatch has created a tool called, appropriately, Lazarus that does the same thing, gathers up the DNA of your ancestor from their descendants and reassembles it into a DNA kit.
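To make the "gathering up" idea a little more concrete, here is a rough sketch in Python of one small piece of it: merging overlapping matching segments on a single chromosome into one set of covered regions. This is only an illustration of the general idea and is not a description of how GedMatch actually implements Lazarus. The function name and the sample positions are made up.

```python
def merge_segments(segments):
    """Merge overlapping (start, end) ranges on one chromosome.

    Segments found through different descendants often overlap; merging
    them gives the stretches of the chromosome covered at least once.
    """
    merged = []
    for start, end in sorted(segments):
        if merged and start <= merged[-1][1]:
            # Overlaps the previous region, so extend it if needed.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


if __name__ == "__main__":
    # Made-up positions, for illustration only.
    sample = [(100, 500), (400, 900), (1200, 1500), (1450, 1600)]
    print(merge_segments(sample))
```

A real reconstruction also has to decide which allele values to assign within each region, which this sketch does not attempt.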

Blaine Bettinger has been working with and writing about his experiences with Lazarus.

http://www.thegeneticgenealogist.com/2014/10/20/finally-gedmatch-announces-monetization-strategy-way-raise-dead/

http://www.thegeneticgenealogist.com/2014/12/09/recreating-grandmothers-genome-part-1/

http://www.thegeneticgenealogist.com/2014/12/14/recreating-grandmothers-genome-part-2/

Tools

Speaking of tools, we have some new tools that have been introduced this year as well.

Genome Mate is a desktop tool used to organize data collected by researching DNA comparisons and aids in identifying common ancestors.  I have not used this tool, but there are others who are quite satisfied.  It does require Microsoft Silverlight be installed on your desktop.

The Autosomal DNA Segment Analyzer is available through www.dnagedcom.com and is a tool that I have used and found very helpful.  It assists you by visually grouping your matches, by chromosome, and who you match in common with.

[Image: ADSA cluster]

Charting Companion from Progeny Software, another tool I use, allows you to colorize and print or create pdf files that include X chromosome groupings.  This greatly facilitates seeing how the X is passed through your ancestors to you and your parents.

[Image: X fan chart]

WikiTree is a free resource for genealogists to be able to sort through relationships involving pedigree charts.  In November, they announced Relationship Finder.

Probably the best example I can show of how WikiTree has utilized DNA is using the results of King Richard III.

[Image: WikiTree, Richard III]

By clicking on the DNA icon, you see the following:

[Image: WikiTree, Richard III (2)]

And then Richard’s Y, mitochondrial and X chromosome paths.

[Image: WikiTree, Richard III (3)]

Since Richard had no descendants, to see how descendants work, click on his mother, Cecily of York’s DNA descendants and you’re shown up to 10 generations.

[Image: WikiTree, Richard III (4)]

While this isn’t terribly useful for Cecily of York who lived and died in the 1400s, it would be incredibly useful for finding mitochondrial descendants of my ancestor born in 1802 in Virginia.  I’d love to prove she is the daughter of a specific set of parents by comparing her DNA with that of a proven daughter of those parents!  Maybe I’ll see if I can find her parents at WikiTree.

Kitty Cooper’s blog talks about additional tools.  I have used Kitty’s Chromosome mapping tools as discussed in ancestor reconstruction.

Felix Chandrakumar has created a number of fun tools as well.  Take a look.  I have not used most of these tools, but there are several I’ll be playing with shortly.

Exits and Entrances

With very little fanfare, deCODEme discontinued their consumer testing and reminded people to download their data before year end.

http://dna-explained.com/2014/09/30/decodeme-consumer-tests-discontinued/

I find this unfortunate because at one time, deCODEme seemed like a company full of promise for genetic genealogy.  They failed to take the rope and run.

On a sad note, Lucas Martin who founded DNA Tribes unexpectedly passed away in the fall.  DNA Tribes has been a long-time player in the ethnicity field of genetic genealogy.  I have often wondered if Lucas Martin was a pseudonym, as very little information about Lucas was available, even from Lucas himself.  Neither did I find an obituary.  Regardless, it’s sad to see someone with whom the community has worked for years pass away.  The website says that they expect to resume offering services in January 2015. I would be cautious about ordering until the structure of the new company is understood.

http://www.dnatribes.com/

In the last month, a new offering has become available that may be trying to piggyback on the name and feel of DNA Tribes, but I’m very hesitant to provide a link until it can be determined if this is legitimate or bogus.  If it’s legitimate, I’ll be writing about it in the future.

However, the big news exit was Ancestry’s exit from the Y and mtDNA testing arena.  We suspected this would happen when they stopped selling kits, but we NEVER expected that they would destroy the existing data bases, especially since they maintain the Sorenson data base as part of their agreement when they obtained the Sorenson data.

http://dna-explained.com/2014/10/02/ancestry-destroys-irreplaceable-dna-database/

The community is still hopeful that Ancestry may reverse that decision.

Ancestry – The Chromosome Browser War and DNA Circles

There has been an ongoing battle between Ancestry and the more seasoned or “hard-core” genetic genealogists for some time – actually for a long time.

The current and most long-standing issue is the lack of a chromosome browser, or any similar tools, that will allow genealogists to actually compare and confirm that their DNA match is genuine.  Ancestry maintains that we don’t need it, wouldn’t know how to use it, and that they have privacy concerns.

Other than their sessions and presentations, they had remained very quiet about this and not addressed it to the community as a whole, simply saying that they were building something better, a better mousetrap.

In the fall, Ancestry invited a small group of bloggers and educators to visit with them in an all-day meeting, which came to be called DNA Day.

http://dna-explained.com/2014/10/08/dna-day-with-ancestry/

In retrospect, I think that Ancestry perceived that they were going to have a huge public relations issue on their hands when they introduced their new feature called DNA Circles and in the process, people would lose approximately 80% of their current matches.  I think they were hopeful that if they could educate, or convince us, of the utility of their new phasing techniques and resulting DNA Circles feature that it would ease the pain of people’s loss in matches.

I am grateful that they reached out to the community.  Some very useful dialogue did occur between all participants.  However, to date, nothing more has happened nor have we received any additional updates after the release of Circles.

Time will tell.

http://dna-explained.com/2014/11/18/in-anticipation-of-ancestrys-better-mousetrap/

http://dna-explained.com/2014/11/19/ancestrys-better-mousetrap-dna-circles/

[Image: DNA Circles, 12-29-2014]

DNA Circles, while interesting and somewhat useful, is certainly NOT a replacement for a chromosome browser, nor is it a better mousetrap.

http://dna-explained.com/2014/11/30/chromosome-browser-war/

In fact, the first thing you have to do when you find a DNA Circle that you have not verified utilizing raw data and/or chromosome browser tools from either 23andMe, Family Tree DNA or Gedmatch, is to talk your matches into transferring their DNA to Family Tree DNA or download to Gedmatch, or both.

http://dna-explained.com/2014/11/27/sarah-hickerson-c1752-lost-ancestor-found-52-ancestors-48/

I might add that the great irony of finding the Hickerson DNA Circle that led me to confirm that ancestry utilizing both Family Tree DNA and GedMatch is that today, when I checked at Ancestry, the Hickerson DNA Circle is no longer listed.  So, I guess I’ve been somehow pruned from the circle.  I wonder if that is the same as being voted off of the island.  So, word to the wise…check your circles often…they change and not always in the upwards direction.

The Seamy Side – Lies, Snake Oil Salesmen and Bullies

Unfortunately a seamy side, an underbelly that’s rather ugly has developed in and around the genetic genealogy industry.  I guess this was to be expected with the rapid acceptance and increasing popularity of DNA testing, but it’s still very unfortunate.

Some of this I expected, but I didn’t expect it to be so…well…blatant.

I don’t watch late night TV, but I’m sure there are now DNA diets and DNA dating and just about anything else that could be sold with the allure of DNA attached to the title.

I googled to see if this was true, and it is, although I’m not about to click on any of those links.

[Image: Google search results for DNA dating]

[Image: Google search results for DNA diet]

Unfortunately, within the ever-growing genetic genealogy community a rather large rift has developed over the past couple of years.  Obviously everyone can’t get along, but this goes beyond that.  When someone disagrees, a group actively “stalks” the person, trying to cost them their employment, saying hate filled and untrue things and even going so far as to create a Facebook page titled “Against<personname>.”  That page has now been removed, but the fact that a group in the community found it acceptable to create something like that, and their friends joined, is remarkable, to say the least.  That was accompanied by death threats.

Bullying behavior like this does not make others feel particularly safe in expressing their opinions either and is not conducive to free and open discussion. As one of the law enforcement officers said, relative to the events, “This is not about genealogy.  I don’t know what it is about, yet, probably money, but it’s not about genealogy.”

Another phenomenon is that DNA is now a hot topic and is obviously “selling.”  Just this week, this report was published, and it is, as best we can tell, entirely untrue.

http://worldnewsdailyreport.com/usa-archaeologists-discover-remains-of-first-british-settlers-in-north-america/

There were several tip offs, like the city (Lanford) and county (Laurens County) not being in the state to which they are attributed (they’re in SC, not NC), and the name of the institution being incorrect (Johns Hopkins, not John Hopkins).  Additionally, if you google the name of the magazine, you’ll see that they specialize in tabloid “faux reporting.”  It also reads a lot like the genuine King Richard press release.

http://urbanlegends.about.com/od/Fake-News/tp/A-Guide-to-Fake-News-Websites.01.htm

Earlier this year, there was a bogus institutional site created as well.

On one of the DNA forums that I frequent, people often post links to articles they find that are relevant to DNA.  There was an interesting article, which has now been removed, correlating DNA results with latitude and altitude.  I thought to myself, I’ve never heard of that…how interesting.   Here’s part of what the article said:

Researchers at Aberdeen College’s Havering Centre for Genetic Research have discovered an important connection between our DNA and where our ancestors used to live.

Tiny sequence variations in the human genome sometimes called Single Nucleotide Polymorphisms (SNPs) occur with varying frequency in our DNA.  These have been studied for decades to understand the major migrations of large human populations.  Now Aberdeen College’s Dr. Miko Laerton and a team of scientists have developed pioneering research that shows that these differences in our DNA also reveal a detailed map of where our own ancestors lived going back thousands of years.

Dr. Laerton explains:  “Certain DNA sequence variations have always been important signposts in our understanding of human evolution because their ages can be estimated.  We’ve known for years that they occur most frequently in certain regions [of DNA], and that some alleles are more common to certain geographic or ethnic groups, but we have never fully understood the underlying reasons.  What our team found is that the variations in an individual’s DNA correlate with the latitudes and altitudes where their ancestors were living at the time that those genetic variations occurred.  We’re still working towards a complete understanding, but the knowledge that sequence variations are connected to latitude and altitude is a huge breakthrough by itself because those are enough to pinpoint where our ancestors lived at critical moments in history.”

The story goes on, but at the bottom, the traditional link to the publication journal is found.

The full study by Dr. Laerton and her team was published in the September issue of the Journal of Genetic Science.

I thought to myself, that’s odd, I’ve never heard of any of these people or this journal, and then I clicked to find this.

[Image: Aberdeen College bogus site]

About that time, Debbie Kennett, DNA watchdog of the UK, posted this:

April Fools Day appears to have arrived early! There is no such institution as Aberdeen College founded in 1394. The University of Aberdeen in Scotland was founded in 1495 and is divided into three colleges: http://www.abdn.ac.uk/about/colleges-schools-institutes/colleges-53.php

The picture on the masthead of the “Aberdeen College” website looks very much like a photo of Aberdeen University. This fake news item seems to be the only live page on the Aberdeen College website. If you click on any other links, including the link to the so-called “Journal of Genetic Science”, you get a message that the website is experienced “unusually high traffic”. There appears to be no such journal anyway.

We also realized that Dr. Laerton, reversed, is “not real.”

I still have no idea why someone would invest the time and effort into the fake website emulating the University of Aberdeen, but I’m absolutely positive that their motives were not beneficial to any of us.

What is the take-away of all of this?  Be aware, very aware, skeptical and vigilant.  Stick with the mainstream vendors unless you realize you’re experimenting.

King Richard

[Image: King Richard III]

The much anticipated and long-awaited DNA results on the remains of King Richard III became available with a very unexpected twist.  While the science team feels that they have positively identified the remains as those of Richard, the Y DNA of Richard and another group of men supposed to have been descended from a common ancestor with Richard carry DNA that does not match.

http://dna-explained.com/2014/12/09/henry-iii-king-of-england-fox-in-the-henhouse-52-ancestors-49/

http://dna-explained.com/2014/12/05/mitochondrial-dna-mutation-rates-and-common-ancestors/

Debbie Kennett wrote a great summary article.

http://cruwys.blogspot.com/2014/12/richard-iii-and-use-of-dna-as-evidence.html

More Alike than Different

One of the life lessons that genetic genealogy has held for me is that we are more closely related than we ever knew, to more people than we ever expected, and we are far more alike than different.  A paper recently published by 23andMe scientists documents that people’s ethnicity reflects the historic events that took place in the part of the country where their ancestors lived, such as slavery, the Trail of Tears and immigration from various worldwide locations.

23andMe European African map

From the 23andMe blog:

The study leverages samples of unprecedented size and precise estimates of ancestry to reveal the rate of ancestry mixing among American populations, and where it has occurred geographically:

  • All three groups – African Americans, European Americans and Latinos – have ancestry from Africa, Europe and the Americas.
  • Approximately 3.5 percent of European Americans have 1 percent or more African ancestry. Many of these European Americans who describe themselves as “white” may be unaware of their African ancestry since the African ancestor may be 5-10 generations in the past.
  • European Americans with African ancestry are found at much higher frequencies in southern states than in other parts of the US.

The ancestry proportions point to the different regional impacts of slavery, immigration, migration and colonization within the United States:

  • The highest levels of African ancestry among self-reported African Americans are found in southern states, especially South Carolina and Georgia.
  • One in every 20 African Americans carries Native American ancestry.
  • More than 14 percent of African Americans from Oklahoma carry at least 2 percent Native American ancestry, likely reflecting the Trail of Tears migration following the Indian Removal Act of 1830.
  • Among self-reported Latinos in the US, those from states in the southwest, especially from states bordering Mexico, have the highest levels of Native American ancestry.

http://news.sciencemag.org/biology/2014/12/genetic-study-reveals-surprising-ancestry-many-americans?utm_campaign=email-news-weekly&utm_source=eloqua

23andMe provides a very nice summary of the graphics in the article at this link:

http://blog.23andme.com/wp-content/uploads/2014/10/Bryc_ASHG2014_textboxes.pdf

The academic article can be found here:

http://www.cell.com/ajhg/home

2015

So what does 2015 hold? I don’t know, but I can’t wait to find out. Hopefully, it holds more ancestors, whether discovered through plain old paper research, cousin DNA testing or virtually raised from the dead!

What would my wish list look like?

  • More ancient genomes sequenced, including ones from North and South America.
  • Ancestor reconstruction on a large scale.
  • The haplotree becoming fleshed out and stable.
  • Big Y sequencing combined with STR panels for enhanced genealogical research.
  • Improved ethnicity reporting.
  • Mitochondrial DNA search by ancestor for descendants who have tested.
  • More tools, always more tools….
  • More time to use the tools!

Here’s wishing you an ancestor-filled 2015!

______________________________________________________________

Disclosure

I receive a small contribution when you click on some of the links to vendors in my articles. This does NOT increase the price you pay but helps me to keep the lights on and this informational blog free for everyone. Please click on the links in the articles or to the vendors below if you are purchasing products or DNA testing.

Thank you so much.

DNA Purchases and Free Transfers

Genealogy Services

Genealogy Research

 

Ancestor Reconstruction

No, this is not Jurassic Park and we’re not actually recreating or cloning our ancestors – just on paper.

Back in early 2012, I began to discuss the possibility of using chromosome mapping of descendants to virtually recreate ancestors.

In 2013, I wrote a white paper about how to do this, and circulated it among a group of scientists who I was hoping would take the ball and run, creating tools for genetic genealogists.  So far, that hasn’t happened, but what has happened is that I’ve adapted a tool created by Kitty Cooper for something entirely different than its original purpose to do a “proof of concept.”

Kitty Cooper created the Ancestor Chromosome Mapper to allow people to map the DNA contributed by different ancestors on their chromosomes.  It’s exciting to see your ancestors mapped out, in color, on your chromosomes.

I used Kitty’s tool to map the proven DNA of my ancestors, identified through autosomal matching and triangulation, onto my own chromosomes, creating the ancestor map below.  As you can see, there are still a lot of blank spaces.

Roberta's ancestor map2

After thinking about this a bit, I realized that I could do the same thing for my ancestors.

The chromosomes shown would be those of an individual ancestor, and the DNA mapped onto the chromosomes would be the segments that proven descendants inherited from that ancestor.  Eventually, with enough descendants, we could create a “virtual file” for that ancestor to represent them in autosomal matching.  So, one day, I might create, or find created by someone else, a DNA “recreated” file for Abraham Estes, born in 1647 in Nonington, Kent, or for Henry Bolton, born about 1760 in England, or any of my other ancestors – all from the DNA of their descendants.

I decided a while back to take this concept for a test spin.

I wanted to see a visual of Joseph Preston Bolton’s DNA on his chromosomes, and who carries it today.  I wrote about this in Joseph’s 52 Ancestors article.

Utilizing Kitty Cooper’s wonderful ancestor chromosome mapping tool, a little differently than she had in mind, I mapped Joseph’s DNA and the contributors are listed to the right of his chromosome.  You can build a virtual ancestor from their descendants based on common matching segments, so long as they don’t share other ancestral lines as well.  I have only utilized the proven, or triangulated DNA segments, where three people match on the same segment.

joseph bolton reconstructed
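For readers who like to see the logic spelled out, here is a minimal sketch of the triangulation rule I used, assuming each pairwise match has already been downloaded as a chromosome with start and end positions.  The `Segment` type, the function names and the positions are hypothetical, made up for illustration, and are not part of Kitty's tool or any vendor's software.

```python
from typing import NamedTuple, Optional


class Segment(NamedTuple):
    chromosome: str
    start: int  # start position in base pairs
    end: int    # end position in base pairs


def overlap(a: Segment, b: Segment) -> Optional[Segment]:
    """Return the region two segments have in common, or None if there is none."""
    if a.chromosome != b.chromosome:
        return None
    start, end = max(a.start, b.start), min(a.end, b.end)
    return Segment(a.chromosome, start, end) if start < end else None


def triangulate(ab: Segment, ac: Segment, bc: Segment) -> Optional[Segment]:
    """Region where all three pairwise matches (A-B, A-C and B-C) coincide."""
    first = overlap(ab, ac)
    return overlap(first, bc) if first else None


# Hypothetical positions, for illustration only
ab = Segment("7", 10_000_000, 25_000_000)
ac = Segment("7", 12_000_000, 30_000_000)
bc = Segment("7", 11_000_000, 22_000_000)

proven = triangulate(ab, ac, bc)
print(proven)
```

Only the region common to all three pairwise matches would be painted onto the ancestor's chromosome as a proven segment.  A real workflow would also apply a minimum segment size, which this sketch leaves out.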

We have a couple more DNA testers who descend from Joseph Bolton’s father, Henry Bolton, through children other than Joseph Preston Bolton.  Adding these segments to the chromosome chart generated for Joseph Preston Bolton, we see the confirmed Henry Bolton segments below.

henry bolton proven

On the chart above, I’ve only used proven segments.

On the next chart I have not been able to “prove” all of the segments through triangulation (3 people), but if all of the provisional segments are indeed Bolton segments, then Henry’s chromosome map would have a few more colored segments.  Clearly, we need a lot more people to test to create more color on Henry’s map, but still, it’s pretty amazing that we can recreate this much of Henry’s chromosome map from these few descendants.

henry bolton probably

There’s a lot of promise in this technique.  Henry Bolton was married twice.  By looking at the DNA the two groups of children, 21 in total, have in common, we know that their common DNA comes from Henry himself.  DNA that is shared only among descendants of his first wife, Catherine Chapman, and not among descendants of his second wife, Nancy Mann, or vice versa, would be attributed to the wife of that couple.  Since Henry was married twice, with enough testers, it would be possible to reconstruct, in part, at least some of the genome of both wives, in addition to Henry.
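The two-wives rule can be sketched in a few lines, assuming each triangulated segment has been tagged with the wife's line of every descendant who carries it.  The segment labels and the `attribute` function are hypothetical.  Keep in mind that a segment seen in only one group is a provisional call, since it could still turn up in the other group as more people test.

```python
from collections import defaultdict

# Hypothetical input: (segment label, wife's line of a descendant who carries it)
observations = [
    ("chr3:40-55", "Catherine Chapman"),
    ("chr3:40-55", "Nancy Mann"),
    ("chr9:10-18", "Nancy Mann"),
    ("chr9:10-18", "Nancy Mann"),
    ("chr12:5-9", "Catherine Chapman"),
    ("chr12:5-9", "Catherine Chapman"),
]


def attribute(observations):
    """Assign each segment to Henry or to one of his wives."""
    lines_by_segment = defaultdict(set)
    for segment, wife_line in observations:
        lines_by_segment[segment].add(wife_line)

    result = {}
    for segment, lines in lines_by_segment.items():
        if len(lines) > 1:
            # Found in both families, so the only shared source is Henry
            result[segment] = "Henry Bolton"
        else:
            # Found in one family only: provisionally that wife's DNA
            result[segment] = next(iter(lines))
    return result


for segment, ancestor in attribute(observations).items():
    print(segment, "->", ancestor)
```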

Now, think for a minute, a bit further out in time.

We don’t know who Nancy Mann’s parents are for sure, although we’ve done a lot of eliminating and we know, probably, who her father was, and likely who her grandfather and great-grandfather were….but certainty is not within grasp right now.

But, it will be in the future through ancestor reconstruction.

Let’s say that the descendants of John Mann, the immigrant, reconstruct his genome.  He had 4 known sons and they had several children, so that would be possible.  John, the immigrant, is believed to be Nancy’s great-grandfather through son John Jr.

Now, let’s say that some of those segments that we can attribute through Henry Bolton’s children, as described above, are attributable to Nancy Mann.  The X chromosome match above is positively Nancy’s DNA.  How do I know that?  Because it came through her son, Joseph Preston Bolton, and men don’t inherit an X chromosome from their father, only their mother.  So today, 3 descendants carry that segment of Nancy Mann’s X chromosome.
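The X chromosome rule is simple enough to check mechanically.  Here is a small sketch, assuming a line of descent is written as a string of sexes from the ancestor down to the tester; the function name and the notation are my own, for illustration.

```python
def x_can_pass(path: str) -> bool:
    """Can X DNA travel down this line of descent?

    path lists the sex of each person from the ancestor down to the
    descendant, for example "FM" for a mother and her son.  A father never
    gives his X to a son, so any two males in a row break the line.
    """
    if not path or set(path) - {"M", "F"}:
        raise ValueError("path must contain only 'M' and 'F'")
    return "MM" not in path


print(x_can_pass("FM"))   # Nancy Mann to her son Joseph Preston Bolton
print(x_can_pass("MMF"))  # John Mann to John Jr. to a daughter
```

The first call returns True and the second returns False, because the father-to-son step breaks the X line.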

Let’s say that one of Nancy Mann’s proven DNA segments (not the X, because John didn’t give his X to his son John) matches smack dab in the middle of one of the proven “John Mann” segments.  We’ve just proven that indeed, Nancy is related to John.
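Here is a rough sketch of that comparison, assuming reconstructed ancestors are stored as lists of (chromosome, start, end) segments.  The library contents, the positions and the function names are invented for illustration.  Overlapping positions are only the first filter; the two segments would still need to actually match each other in a matching tool.

```python
def overlaps(a, b):
    """True if two (chromosome, start, end) segments share any positions."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def candidate_ancestors(segment, library):
    """Names of reconstructed ancestors with a segment at the same location."""
    return sorted(
        name
        for name, segments in library.items()
        if any(overlaps(segment, s) for s in segments)
    )


# Hypothetical reconstructed segments, positions in base pairs
library = {
    "John Mann": [("5", 30_000_000, 60_000_000)],
    "Abraham Estes": [("5", 90_000_000, 110_000_000)],
}

nancy_segment = ("5", 40_000_000, 50_000_000)
print(candidate_ancestors(nancy_segment, library))
```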

Think about the power of this for adoptees, for those who don’t know who their parent or parents are for other reasons, and for those of us who have dead end brick walls who are wives with no surnames.  Who doesn’t have those?

We have the potential, within the foreseeable future, to create “ancestor libraries” that we can match to in order to identify our ancestors.  Once the ancestor is reconstructed, kind of like reconstituting something dehydrated with water, we’ll be able to utilize their autosomal DNA file to make very interesting discoveries about them and their lives.  For example, eye color – at GedMatch today there is an eye color predictor.  There are several ethnicity admixture tools.  Want to know if your ancestor was ethnically admixed?  Virtually recreate them and find out.

Once recreated, we will be able to discover hair color, skin color and all of the other traits and medical conditions that we can today discover through the trait testing at Family Tree DNA and the genetic predispositions that Promethease reveals.

Yes, there will be challenges, like who creates those libraries, moderates any disputes and where are they archived for comparison….but those are details that can be worked out.  Maybe that’s one of the new roles of project administrators or maybe we’ll have ancestor administrators.

Someday, it may be possible to construct an entire family tree from your DNA combined with proven genealogy trees – not by intensely laborious work like it’s done today, but with the click of a button.

And that someday is very likely within our lifetimes, and hopefully, shortly.  The technology and techniques are here to do it today.

I surely hope one of the vendors implements this functionality, and soon, because, like all genealogists, I have a list of genealogy mysteries that need to be solved!!!
