The Autosomal Me – The Holy Grail – Identifying Native Genealogy Lines


Sangreal – the Holy Grail.  We are finally here, Part 9 and the final article in our series.  The entire purpose of The Autosomal Me series has been to use our DNA and the clues it holds to identify minority admixture, in this case, Native American, and by identifying those Native segments, and building chromosomal clusters, to identify the family lines that contributed that Native admixture.  Articles 1-8 in the series set the stage, explained the process and walked us through the preparatory steps.  In this last article, we apply all of the ingredients, fasten the lid, shake and see what we come up with.  Let’s take a minute and look at the steps that got us to this point.

Part 1 was “The Autosomal Me – Unraveling Minority Admixture” and Part 2 was “The Autosomal Me – The Ancestors Speak.” Part 1 discussed the technique we are going to use to unravel minority ancestry, and why it works. Part 2 gave an example of the power of fragmented chromosomal mapping and the beauty of the results.

Part 3, “The Autosomal Me – Who Am I?,” reviewed using our pedigree charts to gauge expected results and how autosomal results are put into population buckets.

Part 4, “The Autosomal Me – Testing Company Results,” shows what to expect from all of the major testing companies, past and present, along with Dr. Doug McDonald’s analysis.

In Part 5, “The Autosomal Me – Rooting Around in the Weeds Using Third Party Tools,” we looked at 5 different third party tools and what they can tell us about our minority admixture that is not reported by the major testing companies because the segments are too small and fragmented.

In Part 6, “The Autosomal Me – DNA Analysis – Splitting Up,” we began the analysis of the data we’ve been gathering. We looked at how to determine which parent contributed the minority admixture on specific chromosomes.

Part 7, “The Autosomal Me – Start, Stop, Go – Identifying Native Chromosomal Segments” took a deeper dive and focused on the two chromosomes with proven Native heritage and began by comparing those chromosome segments using the 4 GedMatch admixture tools.

In Part 8, “The Autosomal Me – Extracting Data Segments and Clustering,” we extracted all of the Native and Blended Asian segments in all 22 chromosomes, but only used chromosomes 1 and 2 for illustration purposes. We then clustered the resulting data to look for trends, grouping clusters by either the Strong Native criteria or the Blended Asian criteria.

In this final segment, Part 9, we will be applying the chromosomal information we’ve gathered to our matches and determine which of our lines are the most likely to have Native Ancestry.  This, of course, has been the goal all along.  So, drum roll…..here we go.

In Part 8, we ended by entering the start and stop locations of both Strong Native and Blended Asian clusters into a table to facilitate easy data entry into the chromosome match spreadsheet downloaded from either 23andMe or Family Tree DNA. If you downloaded the spreadsheet previously and haven’t modified it, you might want to download it again. If you have modified it, download the matches that are new since your last download and add them to the master copy.

My goal is to determine which matches and clusters indicate Native ancestry, and how to correlate those matches to lineage.  In other words, which family lines in my family were Native or carry Native heritage someplace.

The good news is that my mother’s line has proven Native heritage, so we can use her line as proof of concept.  My father’s family has so many unidentified wives, marginalized families and family secrets that the Native line could be almost any of them, or all of them!  Let’s see how that tree shakes out.

Finding Matches

So let’s look at a quick example of how this would work. Let’s say I have a match, John, on chromosome 4 in an area where my mother has no Native admixture, but I do, and John does not match my mother there. Since John does not match my mother, the match came from my father, and if we can identify other people who also match both John and me in that same region on that chromosome, they too have Native ancestry. Let’s say that we all also share a common ancestor. It stands to reason at that point that the common ancestor between us indicates the Native line, because we all match on the Native segment and have the same ancestor. Obviously, this would help immensely in identifying Native families and at least giving pointers in which direction to look. This is a “best case” example. Some situations, especially where both parents contribute Native heritage to the same chromosome, won’t be this straightforward.
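The reasoning in this example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the positions, every name other than John, and the ancestor labels are invented, and `overlaps` and `paternal_candidates` are hypothetical helper names, not part of any testing company's tools.

```python
from collections import Counter

def overlaps(start_a, end_a, start_b, end_b):
    """True when two chromosome ranges share at least one position."""
    return start_a <= end_b and start_b <= end_a

def paternal_candidates(my_matches, mother_names, region):
    """People who match me inside `region` but do not match Mother there."""
    chrom, start, end = region
    return [
        m for m in my_matches
        if m["chrom"] == chrom
        and overlaps(m["start"], m["end"], start, end)
        and m["name"] not in mother_names
    ]

# Hypothetical region of chromosome 4 where I show Native admixture and Mother shows none.
native_region = (4, 60_000_000, 66_000_000)
my_matches = [
    {"name": "John", "chrom": 4, "start": 58_000_000, "end": 64_000_000,
     "ancestors": {"Ancestor A", "Ancestor B"}},
    {"name": "Match 2", "chrom": 4, "start": 61_000_000, "end": 70_000_000,
     "ancestors": {"Ancestor A"}},
    {"name": "Match 3", "chrom": 4, "start": 10_000_000, "end": 15_000_000,
     "ancestors": {"Ancestor C"}},
    {"name": "Match 4", "chrom": 4, "start": 59_000_000, "end": 65_000_000,
     "ancestors": {"Ancestor D"}},
]
mother_names = {"Match 4"}  # invented: people who also match Mother in this region

candidates = paternal_candidates(my_matches, mother_names, native_region)

# An ancestor named by every candidate is the best pointer to the Native line.
ancestor_counts = Counter(a for m in candidates for a in m["ancestors"])
shared = [a for a, n in ancestor_counts.items() if n == len(candidates)]
print([m["name"] for m in candidates], shared)
```

A shared ancestor found this way is a lead for research, not proof, for the same reasons the best-case caveat gives.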

Based on our findings, the maximum range and the minimum (least common denominator or “In Common”) range are as follows for the strongest Native segments on chromosomes 1 and 2.

                 Chromosome 1                 Chromosome 2
Largest Range    162,500,000 – 180,000,000    79,000,000 – 105,000,000
Smallest Range   165,658,091 – 171,000,000    90,000,000 – 103,145,425
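As a rough sketch of how these two ranges might be used, the snippet below sorts a match segment into "in common", "largest only", or "outside". The range values are copied from the table above. The function name `classify` and the three test segments are my own illustrative assumptions.

```python
# Ranges copied from the table above: chromosome -> (start, end) in base pairs.
LARGEST = {1: (162_500_000, 180_000_000), 2: (79_000_000, 105_000_000)}
SMALLEST = {1: (165_658_091, 171_000_000), 2: (90_000_000, 103_145_425)}

def overlaps(start_a, end_a, start_b, end_b):
    """True when two ranges share at least one position."""
    return start_a <= end_b and start_b <= end_a

def classify(chrom, start, end):
    """Place a match segment relative to the Native ranges for its chromosome."""
    if chrom not in LARGEST:
        return "outside"
    if overlaps(start, end, *SMALLEST[chrom]):
        return "in common"
    if overlaps(start, end, *LARGEST[chrom]):
        return "largest only"
    return "outside"

# Hypothetical match segments, for illustration only.
print(classify(1, 166_000_000, 172_000_000))
print(classify(2, 80_000_000, 85_000_000))
print(classify(2, 10_000_000, 20_000_000))
```

Segments that touch the "in common" range are the strongest candidates. Those that only touch the largest range deserve a second look in the admixture tools.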

At GedMatch

At GedMatch, I used a comparison tool to see who matched me on chromosome 1. Only 2 people outside of immediate family members matched, both from Family Tree DNA. Both matched me on the critical Native segments between about 165 and 180 Mb. I was excited. I went to Family Tree DNA and checked to see if these two people also matched my mother, which would confirm the Native connection, but neither did, indicating of course that these two people matched me on my father’s side. That didn’t help identify any common Native heritage with my mother on chromosome 1, but it did eliminate them as possibilities, which is valuable information as well.

DNAGedcom

I used a new tool, DNAGedcom, courtesy of Rob Warthen, who has created a website, DNA Tools, at www.dnagedcom.com. This wonderful tool allows you to download all of your autosomal matches at Family Tree DNA and 23andMe along with their chromosomal segment matches. Since my mother’s DNA has only been tested at Family Tree DNA, I’m limiting the download to those results for now, because what I need is to find the people who match both her and me on the critical segments of chromosome 1 or 2.

Working with the Download Spreadsheet

It was disappointing to discover that my mother and I had no common matches that fell into this range on chromosome 1, but chromosome 2 was another matter.  Please note that I have redacted match surnames for privacy.

step 9 table 1

The spreadsheet above shows the comparison of my matches (pink) and Mother’s (white). The Native segment of chromosome 2 where I match Mother is shaded mustard. I shaded the chromosome segments that fell into the “common match” range in green. Of those matches, there is only one person who matches both Mother and me, Emma. The next step, of course, is to contact Emma and see if we can discover our common ancestor, because whoever it is, that is the Native line. As you might imagine, I am chomping at the bit.
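For anyone who would rather not eyeball the spreadsheet, here is a minimal sketch of the same comparison using Python's standard `csv` module. The column names, the rows, and all positions are assumptions made up for illustration, so check them against your own download. Only the chromosome 2 "in common" range comes from the earlier table.

```python
import csv
import io

# Invented rows laid out like a segment download; column names are assumed.
ME_CSV = """MATCHNAME,CHROMOSOME,START LOCATION,END LOCATION
Emma,2,91000000,99000000
Match B,2,95000000,110000000
Match C,1,5000000,9000000
"""
MOTHER_CSV = """MATCHNAME,CHROMOSOME,START LOCATION,END LOCATION
Emma,2,91500000,98000000
Match D,2,92000000,100000000
"""

IN_COMMON_CHR2 = (90_000_000, 103_145_425)  # smallest range for chromosome 2

def names_in_range(csv_text, chrom, start, end):
    """Names of matches with a segment on `chrom` that overlaps start..end."""
    names = set()
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Compare as text so a non-numeric chromosome label such as X is skipped safely.
        if row["CHROMOSOME"].strip() != str(chrom):
            continue
        seg_start = int(row["START LOCATION"])
        seg_end = int(row["END LOCATION"])
        if seg_start <= end and start <= seg_end:
            names.add(row["MATCHNAME"])
    return names

mine = names_in_range(ME_CSV, 2, *IN_COMMON_CHR2)
mothers = names_in_range(MOTHER_CSV, 2, *IN_COMMON_CHR2)
common = sorted(mine & mothers)  # people who match both of us in the range
print(common)
```

With real files you would read from `open(path, newline="")` instead of the inline strings.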

There are no segments of chromosome 2 that are unquestionably isolated to my father’s line.

Kicking it up a Notch

Are you wondering about now how something that started out looking so simple got so complex?  Well, I am too, you’re not alone.  But we’ve come this far, so let’s go that final leg in this journey.  My mom always used to say there was no point in doing something at all if you weren’t going to do it right.  Sigh….OK Mom.

The easiest way to facilitate a chromosome by chromosome comparison with all of your matches and your Strong Native and Blended Asian segments is to enter all of these segment groups into the match spreadsheet.  If you’re groaning and your eyes glaze over right after you do one big ole eye roll, I understand.

But let’s take a look at how this helps us.

On the excerpt from my spreadsheet below, for a segment of chromosome 5, I have labeled the people and how they match to me. The ones labeled “Mom” in the last column are labeled that way because these people match both Mom and me. The ones labeled “Dad” are labeled that way because I know that person is related on my father’s side.

Using the information from the tables created in Part 8, I entered the beginning and end of all matching segment clusters into my spreadsheet. You can see these entries on lines 7, 8, 22, 23 and 24. You then proceed to colorize your matches based on the entry for either Mom or Dad – in other words the blue row or the purple row, line 7, 22 or 24. In this example, actually, line 5, Rex, based on the coloration, should have been half blue and half purple, but we’ll discuss his case in a minute.

You can then sort either by match name or by chromosome to view the data both ways. Let’s look at an example of how this works.

Legend:

  • White Rows:  Mother’s matches.  When Mother and I both match an individual, you’ll see the same matches for me in pink.  This double match indicates that the match is to Mother’s side and not Father’s side.
  • Pink Rows:  My matches.
  • Purple “Mom” labels in last column:  The individual matches both me and Mom.  This is a genetic match.
  • Teal “Dad” labels in last column: Genealogically proven to be from my father’s side.  This is a genealogical, not a genetic label, since I don’t have Dad’s DNA and can only infer these genetically when they don’t also match Mother.
  • Dark Pink Rows labeled “Me Amerind Only” are Strong Native or Blended Asian segments from the Chromosome Table that I have entered.  My segments must come from one of my parents, so I’ve colored them either purple, if the match is someone who matches both Mother and me, or teal, if they don’t match both Mom and me, so by inference they come from my father’s line.
  • Dark Purple Rows labeled “Mom Amerind Only” are Mom’s segments from the Chromosome Table.
  • Dark Teal Rows labeled “Dad Amerind Only” are inferred segments belonging to my father based on the fact that Mother and I don’t share them.
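The coloring rule in the legend can be written as a small function. This is a sketch of my reading of the legend, not a finished tool: `row_color` is a hypothetical name, and the cluster and match positions are invented for illustration.

```python
def overlaps(seg_a, seg_b):
    """Each segment is a (start, end) tuple on the same chromosome."""
    return seg_a[0] <= seg_b[1] and seg_b[0] <= seg_a[1]

def row_color(segment, my_native_clusters, also_matches_mother):
    """Color for one of my match rows, following the legend above."""
    if not any(overlaps(segment, c) for c in my_native_clusters):
        return "pink"    # ordinary match, no Native or Asian cluster involved
    if also_matches_mother:
        return "purple"  # genetic match to Mom's side
    return "teal"        # inferred to Dad's side; an inference, not proof

# Invented cluster and match rows: (name, segment, also matches Mother?)
my_clusters = [(52_000_000, 53_000_000)]
rows = [
    ("Rex", (50_000_000, 56_000_000), True),
    ("Match Y", (52_500_000, 60_000_000), False),
    ("Match Z", (10_000_000, 12_000_000), False),
]
colors = {name: row_color(seg, my_clusters, mom) for name, seg, mom in rows}
print(colors)
```

The teal result is only as good as the inference behind it, which is why the purple and teal labels carry different weight.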

Inferred Relationships

This is a good place to talk for just a minute about inferred relationships in this context.  Inference gets somewhat tenuous or weak.  The inferred matches on my father’s side began with the Native segments in the admix tools.  Some inferences are very strong, where Mother has no Native at all in that region.  For example, Mom has European and I have Native American.  No question, this had to come from my father.  But other cases are much less straightforward.

In many cases, categorization may be the issue.  Mom has West Asian for example and I have Siberian or Beringian.  Is this a categorization issue or is this a real genetic difference, meaning that my Siberian/Beringian is actually Native and came from my father’s side?

Other cases of confusion arise from segment misreads, etc.  I’ve actually intentionally included a situation like this below, so we can discuss it.  Like all things, some amount of common sense has to enter the picture, and known relationships will also weigh heavily in the equation.  How known family members match on other chromosome segments is important too.  Do you see a pattern or is this match a one-time occurrence?  Patterns are important.

Keep in mind that these entries only reflect STRONG Asian or Native signals, not all signals. So even if Mother doesn’t have a strong signal, it doesn’t mean that she doesn’t have ANY signal in that region. In some cases, start and stop segments for Mom and Dad overlapped due to very long segments on some matches. In this case, we have to rely on the fact that we do have Mother’s actual DNA and assume that if a person isn’t also a match to Mother, what we are seeing is actually Dad’s lines, although this may not always be true. Why? Because we are dealing with segments below the matching threshold limit at both Family Tree DNA and 23andMe, and both of my parents carry Native heritage. We may also have crossed a transitional boundary where the DNA that is being matched switches from Mom’s side to Dad’s side.

Ugh, you say, now that’s getting messy.  Yes, it is, and it has complicated this process immensely.

The Nitty-Gritty Data Itself

step 9 table 2

Taking a look at this portion of chromosome 5, we have lots going on in this cluster.  Most segments will just be boring pink and white (meaning no Native), but this segment is very busy.  Mom and I match on a small segment from 52,000,000 to 53,000,000.  Indeed, this is a very short segment when compared to the entire chromosome, but it is strongly Native.  We both also match Rex, our known cousin.  I’ve noted him with yellow in the table. Please note that Mom’s white matches are never shaded.  I am focused on determining where my own segments originate, so coloring Mother’s too was only confusing.  Yes, I did try it.

You can see that Mother actually shares all or any part of her segment with only me and Rex.  This simplifies matters, actually.  However, also note that I carry a larger segment in this region than does Mother, so either we have a categorization issue, a misread, or my father also contributed.  So, a conundrum.  This very probably implies that my father also carried Native DNA in this region.

Let’s see what Rex’s DNA looks like on this same segment of chromosome 5, from 52 to 53 Mb, using Eurogenes. In the graph below, my chromosome is the top bar, Rex’s the middle and the bottom bar shows common DNA with the black nonmatching. Yellow is Native American, red is South Asian, putty is Siberian, lime green is Mediterranean, teal is North Europe, orange is Caucasus.

Step 9 item 3

This same comparison is shown to Mother’s DNA (top row) below.

step 9 item 4

It’s interesting that while Mother doesn’t have a lot of yellow (Native), she does have it throughout the same segment where Rex’s occurs, from about 52 through 53.5.

Does this actually point to a Native ancestor in the common line between Rex, Mom and me, which is the Swiss/German Johann Michael Miller line which does include an unidentified wife stateside, or does this simply indicate a common ancient population long ago in Asia? It’s hard to say and is deserving of more research. I feel that it is most likely Native because of the actual yellow, Native segment. If this were an Asian/European artifact, it would be much less likely to carry the actual yellow segment.

Is Rex also genealogically related to my father?  As I’ve worked through this process with all of my chromosomes and matches, I’ve really come to question if one of my father’s dead ends is also an ancestral line of my mother’s.

The key to making sense of these results is clusters.

Clusters vs Singleton Outliers

The work we’ve already done, especially in Part 8, clusters the actual DNA matching segments. We’ve now entered that information into the spreadsheet and colored the segments of those who match. What’s next?

The key is to look for people with clusters.  Many matches will have one segment, of say, 10 that match, colored.  Unless this is part of a large chromosome cluster, it’s probably simply an outlier.  Part of a large chromosome cluster would be like the large Strong Native segments on chromosome 1 or 2, for example.  How do we tell if this is a valid match or just an outlier?

Sort the spreadsheet by match name.  Take a look at all of the segments.

The example we’ll use is that of my cousin, Rex. If you recall, he matches both me and Mother, is a known first cousin twice removed to me (genetically equal to a second cousin), and is descended from the Miller line.

In this example, I also colored Mother’s segments because I wanted to see which segments that I did not receive from her were also Native. You can see that there are many segments where we all match and several of those are Native.  These also match to other Miller descendants as well, so are strongly indicative of a Native connection someplace in our common line.

If we were only to see one Native segment, we would simply disregard this as an outlier situation.  But that’s not the case.  We see a cluster of matches on various segments, we match other cousins from the same line on these segments, and reverting back to the original comparison admixture tools verifies these matches are Native for Rex, Mom and me.
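The cluster versus outlier check described above could be roughed out as follows. The rows are invented, and the cutoff of two Native segments (`min_native`) is my own assumption for the sketch. A lone segment that sits inside a large chromosome cluster, like those on chromosome 1 or 2, would still deserve a manual look.

```python
from collections import defaultdict

def native_segment_counts(rows):
    """Map each match name to [native_segments, total_segments]."""
    counts = defaultdict(lambda: [0, 0])
    for name, is_native in rows:
        counts[name][1] += 1
        if is_native:
            counts[name][0] += 1
    return dict(counts)

def classify_matches(rows, min_native=2):
    """Label each match by how many of its segments are colored Native."""
    result = {}
    for name, (native, _total) in sorted(native_segment_counts(rows).items()):
        if native == 0:
            result[name] = "no Native segments"
        elif native < min_native:
            result[name] = "probable outlier"
        else:
            result[name] = "cluster"
    return result

# Invented rows: (match name, does this segment fall on a Native cluster?)
rows = [
    ("Rex", True), ("Rex", True), ("Rex", False), ("Rex", True),
    ("Match X", True), ("Match X", False), ("Match X", False),
    ("Match W", False),
]
labels = classify_matches(rows)
print(labels)
```

This is the same thing as sorting the spreadsheet by match name and counting colored rows, just done in bulk.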

step 9 item 5

Hmmmm…..what is Dad’s blue segment color doing in there? Remember I said that we are only dealing with strong match segments? Well, Mom didn’t have a strong segment at that location and so we inferred that Dad did. But we know positively that this match does come from Mother’s side. I also mentioned that I’ve come to wonder if my Mom and Dad share a common line. It’s the Miller line that’s in question. One of Johann Michael Miller’s children, Lodowick, moved from Pennsylvania to Augusta County, Virginia in the 1700s and his line became Appalachian, winding up in many of the same counties as my father’s family. I’m going to treat this as simply an anomaly for now, but it actually could be, in this case, a small indication that these lines might be related. It also might be a weak “Mom” match, or irrelevant. I see other “double entries” like this in other Miller cousins as well.

What is the pink row on chromosome 12?  When I grouped the Strong Native and Asian Clusters, sometimes I had a strong grouping, and Mom had some.  The way I determined Dad’s inferred share was to subtract what Mom had in those segments from mine.  In a few cases, Mom didn’t have enough segments to be considered a cluster but she had enough to prevent Dad from being considered a cluster either, so those are simply pink, me with no segment coloring for Mom or Dad.

Let’s say I carry Strong Native/Mixed Asian at the following 8 locations:

10, 12, 14, 16, 18, 20, 22, 24

This meets the criteria for 8 of 15 ethno-geographic locations (in the admix tools) within a 2.5 cM distance of each other, so this cluster would be included in the Mixed Asian for me.  It could also be a Strong Native cluster if it was found in 3 of 4 individual tools.  Regardless of how, it has been included.

Let’s now say that Mom carries Native/Mixed Asian at 10, 12 and 14, but not elsewhere in this cluster.

Mom’s 3 does not qualify her for the 8/15 and it only leaves Dad with 5 inferred segments, which disqualifies him too.  So in this case, my cluster would be listed, but not attributable directly to either parent.

What this really says is that both of my parents carry some Native/Blended Asian on this segment and we have to use other tools to extrapolate anything further.  The logic steps are the same as for Dad’s blue segment.  We’re going to treat that as an outlier.  If I really need to know, I can go back to the actual admixture tools and see whether Mom or Dad really match me strongly on which segments and how we compare to Rex as well.  In this case, it’s obvious that this is a match to my Mother’s side, so I’m leaving well enough alone.
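The subtraction in this example is easy to check in code. The sketch below uses the eight locations and Mom's three from the text, with the 8 of 15 threshold. The variable names are mine.

```python
THRESHOLD = 8  # 8 of 15 ethno-geographic locations, per the criteria above

me = {10, 12, 14, 16, 18, 20, 22, 24}
mom = {10, 12, 14}
dad_inferred = me - mom  # what is left after subtracting Mom's share

def qualifies(locations):
    """True when a set of locations is large enough to count as a cluster."""
    return len(locations) >= THRESHOLD

for who, locs in (("Me", me), ("Mom", mom), ("Dad (inferred)", dad_inferred)):
    print(who, len(locs), qualifies(locs))
```

Only my own count reaches the threshold, which is why the cluster is listed for me but left uncolored for either parent.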

Let’s see what the matches reveal.

Matches

Referring back to the Nitty Gritty Data spreadsheet, Mom’s match to Phyllis on row 15 confirms an Acadian line.  This is the known line of Mother’s Native ancestry.  This makes sense and they match on Native segments on several other chromosomes as well.  In fact, many of my and Mother’s matches have Acadian ancestry.

My match to row 19, Joy, is a known cousin on my father’s side with common Campbell ancestry. This line is short, however, because our common ancestor, believed to be Charles Campbell, died before 1825 in Hawkins County, TN. He was probably born before 1750, given that his sons were born about 1770 and 1772. Joy and I descend from those 2 sons. Charles’ wife and parents are unknown.

My match to row 20, inferred through my father’s side, is to a Sizemore, a line with genetically proven Native ancestry.  Of course, this needs more research, but it may be a large hint.  I also match with several other people who carry Sizemore ancestors.  This line appears to have originated near the NC/VA border.

I wanted to mention rows 4 and 17. Using our rules for the spreadsheet, if I match someone and they don’t also match Mother on this segment, I have inferred them to be through my father. These are two instances where this is probably incorrect. I do match these people through Mother, but Mother didn’t carry a strong signal on this segment, so it automatically became inferred to Dad. Remember, I’m only recording the Strong Native or the Blended Asian segments, not all segments. However, I left the inferred teal so that you can see what kinds of judgment calls you’ll have to make. This also illustrates that while Mom’s genetic matches are solid, Dad’s inferred matches are less so and sometimes require interpretation. The proper thing to do in this instance would be to refer back to the original admixture tools themselves for clarification.

Let’s see what that shows.

step 9 item 6

Using HarappaWorld, the most pronounced segment is at about 52. Teal is American. You can see that Mother has only a very small trace between 53 and 54, almost negligible. Mother’s admixture at location 52 is two segments of purple, brown and cinnamon which translate to Southwest Asian (lt purple), Mediterranean (dk purple), Caucasian (brown) and Baloch (cinnamon), from Pakistan.

Checking Dodecad shows pretty much the same thing, except Mother’s background there is South Asian, which could be the same thing as Caucasus and Pakistan, just different categorizations.

In this case, it looks like the admixture is not a categorization issue, but likely did come from my father.  Each segment will really be a case by case call, with only the strongest segments across all tools being the most reliable.

It’s times like this that we have to remember that we have two halves of each chromosome and they carry vastly different information from each of our parents.  Determining which is which is not always easy.  If in doubt, disregard that segment.

Raw Numbers

So, what, really did I figure out after all of this?

First, let’s look at some numbers.

I was working with a total of 292 people who had at least one chromosomal segment that matched me with a Strong Native or Blended Asian segment. Of those, 59 also matched Mom’s DNA. Of those 59, 18 had segments that matched only Mom. This means that some of them had segments that also matched my father. Keep in mind, again, that we are only using “strong matches,” which involves inferring Dad’s segments, and that referring back to the original tools can always clarify the situation. There seem to be some specific areas that are hotspots for Native ancestry where it appears that both of my parents passed Native ancestry to me.

Many of my and my mother’s 59 matches have Acadian ancestry which is not surprising as the Acadians intermarried heavily with the Native population as well as within their own ethnic group.

Several also have Miller ancestry. My Miller ancestor is Johann Michael Miller (1692-1771), who immigrated in the colonial period and settled on the Pennsylvania frontier. The wife of his son, Philip Jacob Miller (1726-1799), was a woman named Magdalena whose last name has been rumored for years to be Rochette, but no trace of a Rochette family has ever been found in the county where they lived, region or Brethren church history…and it’s not for lack of looking. Several matches point to Native ancestry in this line. This also raises the question of whether this is really Native or whether it is really the Asian heritage of the German people. Further analysis, referring back to the admixture tools, suggests that this is actually Native. It’s also interesting that absolutely none of Mother’s other German or Dutch lines show this type of ancestry.

There is no suggestion of Native ancestry in any of her other lines.  Mother’s results are relatively clean.  Dad’s are anything but.

Dad’s Messy Matches

My father’s side of the family, however, is another story.

I have 233 matches that don’t also match my mother.  There can be some technical issues related to no-calls and such, but by and large, those would not represent many.  So we need to accept that most of my matches are from my Father’s side originating in colonial America.  This line is much “messier” than my mother’s, genealogically speaking.

Of those 233 matches, only 25 can be definitely assigned to my father.  By definitely assigned, I mean the people are my cousins or there is an absolutely solid genealogical match, not a distant match.  Why am I not counting distant matches in this total?  We all know by virtue of the AncestryDNA saga that just because we match family lines and DNA does NOT mean that the DNA match is the genealogical line we think it is.  If you would like to read all about this, please refer to the details in CeCe Moore’s blog where she discussed this phenomenon.  The relevant discussion begins just after the third photo in this article where she shows that 3 of 10 matches at Ancestry where they “identify” the common DNA ancestor are incorrect.  Of course, they never SAY that the common ancestor is the DNA match, but it’s surely inferred by the DNA match and the “leaf” connecting these 2 people to a common ancestor.  It’s only evident to someone who has tested at least one parent and is savvy enough to realize that the individual whose ancestor on Mom’s side that they have highlighted, isn’t a match to Mom too.  Oops.  Mega-oops!!!

However, because we are dealing with inferences on Dad’s side in our project, we’re treading on some of the same ground. The risk is compounded because we are dealing with only “strong clustered” segments, not all Native or Asian segments, and because it appears that my parents both have Native ancestry. To make matters worse, they may both have Algonquian, Iroquoian or both.

I have also discovered during this process that several of my matches are actually related to both of my parents.  I told you this got complex.

Of the people who don’t match Mother, 32 of them have chromosomal matches only to my father, so those would be considered reliable matches, as would the closest ones of the 25 that can be identified genealogically as matching Dad.  Many of these 25 are cousins I specifically asked to test, and those people’s results have been indispensable in this process.

In fact, it’s through my close circle of cousins that we have been able to eliminate several lines as having Native ancestry, because it doesn’t show as strong and they don’t have it either.

Many of these lines group together when looking at a specific chromosome.  There is line after line and cousin after cousin with highlighted data.

Dad’s Native Ancestors

So what has this told me?  This information strongly suggests that the following lines on my father’s side carry Native heritage.  Note the word “carry.”  All we can say at this point is that it’s in the soup – and we can utilize current matches at our testing company and at GedMatch, genealogy research and future matches to further narrow the branches of the tree.  Many of these families are intermarried and I have tried to group them by marriage group.  Obviously, eventually, their descendants all intermarried because they are all my ancestors on my father’s side.  But multiple matches to other people who carry the Native markers but aren’t related to my other lines are what define these as lines carrying Native heritage someplace.

  • Campbell – Hawkins County, Tn around 1800, missing wife and parents, married into the Dodson family
  • Dodson – Hawkins County, Tn, Virginia – written record of Lazarus Dodson camping with the Cherokee – missing wife, married into the Campbell and Estes family
  • Claxton/Clarkson – Russell Co., Va, Claiborne and Hancock Co., Tn – In NC associated with the known Native Hatcher family.  Possibly a son-in-law.  Missing family entirely.
  • Cook – Russell Co., Va. – daughter married Claxton/Clarkson – missing wives
  • Harrold, Harrell, Herrell – Hancock Co., Tn., Wilkes Co., NC – missing wives
  • McDowell – Hancock Co. Tn, Wilkes Co., NC, Augusta Co., Va – married into the Harrell family, missing wife
  • McNeil, McNiel – Wilkes Co., NC – missing wives, married into the Vannoy family
  • Vannoy – Wilkes County – some wives unaccounted for pre-1800
  • Crumley – Greene County, Tn., Lee Co., Va. – oral history of Native wife, married into the Vannoy family
  • Brown – Greene County, Tn, Montgomery Co., Va – married into the Crumley family, missing wives

While this looks like a long list, the list of families that don’t have any Native ancestry represented is much longer and effectively serves to eliminate all of those lines.  While I don’t have “THE” answer, I certainly know where to focus my research.  Maybe there isn’t the one answer.  Maybe there are multiple answers, in multiple lines.

The Take Away

Is this complex?  Yes!  Is it a lot of work?  You bet it is!  Is everything cast in concrete?  Never!  You can see that by the differences we’ve found in data interpretation, not to mention issues like no-calls (areas that for some reason in the test don’t read) and cross overs where your inheritance switches from your mom’s side to your dad’s side.  Is there any other way to do this?  No, not if your minority admixture is down in that weedy area around 1%.

Is it worth it? You’ll have to decide. I guess it depends on how desperately you want to know.

Part of the reason this is difficult is because we are missing tools in critical locations.  It’s an intensively laborious manual process.  In essence, using various tools, one has to figure out the locations of the Native and Asian chromosome segments and then use that information to infer Native matches by a double match (genetic match at DNA company plus match with Strong Native/Blended Asian segment) with the right parent.  It becomes even more complex if neither parent is available for testing, but it is doable although I would think the reliability could drop dramatically.

Tidbits and Trivia

I’ve picked up a number of little interesting tidbits during this process.  These may or may not be helpful to you.  Just kind of file them away until needed:)

  • Matches at testing companies come and go….and sometimes just go.  At Family Tree DNA, I have some matches that must be trembling on the threshold that come and go periodically.  Now you see them, now you don’t.  I lost matches moving from the Affy chip to the Illumina chip and lost additional matches between Build 36 and 37.  Some reappeared, some haven’t.
  • The start and stop boundaries changed for some matches between build 36 and build 37.  I did not go back and readjust, as most of these, in the larger scheme of things, were minor.  Just understand that you are looking for  patterns here that indicate Native heritage, not exact measurements.  This process is a tool, and unfortunately, not a magic wand:)
  • The centromere locations change between builds.  If you have matches near or crossing the middle of the chromosome, called the centromere, there may be breaks in that region.  I enter the centromere start and stop locations in my spreadsheet so that if I notice something odd going on in that region, the centromere addresses are right there to alert me that I’m dealing with that “odd” region.  You can find the centromere addresses in the FAQ at Family Tree DNA for their current build.
  • At 23andMe, when you reach the magic 1000 matches threshold, you start losing matches and the matching criteria is elevated so that you can stay under 1000 matches.  For people with colonial American or Jewish heritage, in other words those with high numbers of matches, this is a problem.
  • Watch for matches that are related to both sides of your family.  If your family lived in colonial America, you’re going to have a lot of matches and many are probably related to each other in ways you aren’t aware of.
  • If your parents are related to each other, this process might simply be too complex and intertwined to provide enough granular data to be useful.
  • Endogamous groups are impossible to sort through as to where, meaning which ancestor, the DNA came from.  This is because the original group founders’ DNA is just getting passed around and around, with little or no new DNA being introduced.  The effect of this on downstream generations relative to genetic genealogy is that matches appear to be more closely related than they are because of the amount of matching DNA they carry.  For my Brethren and my Acadian groups of people, I just list them by the group name, since, as the saying goes, “if you’re related to one Acadian, you’re related to all Acadians.”
  • If you’re going to follow this procedure, save one spreadsheet copy with the Strong Native only and then a second one with both the Strong Native and Blended Asian.  I’m undecided truthfully whether the Mixed Asian adds enough resolution for the extra work it generates.
  • When in question, refer back to the original tools.  The answer will always be found there.
  • Unfortunately, tools change.  You may want to take screen shots.  During this process, FTDNA went from build 36 to 37, match thresholds changed, 23andMe introduced a new user interface (which I find much less intuitive) and GedMatch has made significant changes.  The net-net of this is when you decide to undertake this project, commit to it and do it, start to finish.  Doing this little by little makes you vulnerable to changes that may make your data incompatible midstream – and you may not even realize it.
  • This entire process is intensively manual.  My spreadsheet is over 5500 rows long.  I won’t be doing it again…although I will update my spreadsheet with new matches from time to time.  The hard work is already done.
  • This same technique applies to any minority ancestry, not just Native, although that’s what I’ve been hunting for and one of the most common inquiries I receive.
  • I am hopeful that in the not too distant future many of these steps and processes will be automated by the group of bright developers that contribute to GedMatch or via other tools like DNAGedcom. HINT – HINT!!!
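
The centromere tip in the list above lends itself to a simple spreadsheet-style check.  Here is a minimal sketch in Python, assuming each match is a row with a chromosome number and start and stop positions.  The centromere coordinates shown are made-up placeholders, not real addresses, so substitute the values your testing company publishes for the build you are using.  The function name and row layout are my own illustration, not part of any tool mentioned here.

```python
# Placeholder centromere boundaries keyed by chromosome number.
# These numbers are NOT real coordinates; replace them with the
# addresses from your testing company's FAQ for the current build.
CENTROMERES = {
    1: (100, 200),
    2: (300, 400),
}


def crosses_centromere(chromosome, start, stop, centromeres=CENTROMERES):
    """Return True if the segment touches or spans the centromere region."""
    if chromosome not in centromeres:
        return False
    cen_start, cen_stop = centromeres[chromosome]
    # Two ranges overlap when each one starts before the other ends.
    return start <= cen_stop and stop >= cen_start


# Hypothetical match rows: (match name, chromosome, start, stop)
rows = [
    ("Match A", 1, 10, 90),
    ("Match B", 1, 150, 250),
    ("Match C", 2, 50, 500),
]

# Names of matches whose segment falls in the "odd" centromere region.
flagged = [name for name, chrom, start, stop in rows
           if crosses_centromere(chrom, start, stop)]
print(flagged)
```

Anything that lands in the flagged list is simply a reminder to look twice at that row, not a verdict that the match is wrong.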

I would like to follow this same process to identify the source of my African heritage, but I’m thinking I’ll wait for the tools to become automated.  The great irony is that it’s very likely in the same lines as my Native ancestors.

If You Want to Test

What does it take to do this for yourself using the tools we have today, as discussed?

If your parents are living, the best gift you can give yourself is to test them, now, while you still can.  My mother has been gone for several years, but her DNA archived at Family Tree DNA was still viable.  This is not always the case.  I was fortunate.  Her DNA is one of the best gifts she gave me.  Not just by inheritance, but by having hers tested.  I thank her every single day, for both!  I could not have written this article without her DNA results.  The gift that keeps on giving.

If you don’t have a parent to test, you can test several other family members who will provide some information, but clearly won’t carry the same amounts of common DNA with you as your parents.  These would include your aunts and uncles, your parents’ siblings and what I’ve referred to as your close cousin circle.  Attempt to test at least someone from each line.  Yes, it gets expensive, but as one of my cousins said as she took her third or fourth DNA test, “It’s only money.  This is about family.”

You can also test your own siblings to obtain more information that you can use to match up to your family lines.  Remember, you only receive half of each parent’s DNA, and your siblings will receive some DNA from your parents that you didn’t.

I don’t have any other siblings to test, but I have tested cousins from several lines which have proven invaluable when trying to discern the sources of certain segments. For example, one of these Native segments fell on a common segment with my cousin Joy.  Therefore, I know it’s from the Campbell line, and because I have the Campbell paternal Y-DNA which is European, I know immediately the Native admixture would have had to be from a wife.

Much of this puzzle is deductive, but we now have the tools, albeit manual, to do this type of work that was previously impossible.  I am somewhat disappointed that I can’t pinpoint the exact family lines, yet, but hopefully as more people test and more matches provide genealogical information, this will improve.

If you want to play in this arena, you need to test at either Family Tree DNA, 23andMe, or both.  Right now, the most cost effective way to achieve this is to purchase a $99 kit from 23andMe, test there, then download your results from 23andMe and upload them to Family Tree DNA for $99.  That way, you are fishing in both pools.  Be aware that fewer than half of the people who test at either company upload their results to GedMatch, so your primary match locations are with the testing companies.  GedMatch is auxiliary, but critical for this analysis.  And the newest tool, DNAGedcom, is a Godsend.

Also note that transferring your results to Family Tree DNA is NOT the same thing as actually testing there.  Why does this matter?  Family Tree DNA, the premier genetic genealogy testing company, offering the most variety and “deepest” commercial tests, archives your DNA for 25 years when you test with them.  If you only transfer results, they don’t have your DNA to archive, so no future products can be ordered.  All I can say is thank Heavens Mom’s DNA was there.

Ancestry.com doesn’t provide any tools such as the chromosome browser or even the basic information of matching segments.  All you get is a little leaf that says you’re related, but the questions of which segment or how are not answerable today at Ancestry and, as CeCe’s experience proved, it’s unreliable.  It’s possible that you share the same surnames and ancestor, but your genetic connection is not through that family line.  Without tools, there is no way to tell.  Ancestry released raw data files a few weeks ago and very recently, GedMatch has implemented the ability to upload them so that Ancestry participants can now utilize the additional tools at GedMatch.

Although this has been an extraordinarily long and detailed process, I can’t tell you how happy I am to have developed this new technique to add to my toolbox.  My Native and African ancestors have been most elusive.  There are no records, they didn’t write and probably didn’t even speak English, certainly not initially.  The only clues to their existence, prior to DNA, were scant references and family lore.  The only prayer of actually identifying them is through these small segments of our DNA – yep – down in the weeds.  Are there false starts perhaps, and challenges and maybe a few snakes down there?  Yes, for sure, but so is the DNA of your ancestors.

Happy gardening and rooting around in the weeds.  Just think of it as searching for the very best buried treasure!  It’s down there, just waiting to be found.  Keep digging!

I hope you’ve enjoyed this series and that it leads you to your own personal genealogical treasure trove!

treasure chest

The Gift of a Davenport

I work with adoptees a lot.  They often order Personalized DNA Reports with the hope of finding some hint of their family.  Women have a distinct disadvantage – they have no Y chromosome.  About 30% of the time by looking at the Y chromosome, I can figure out the most likely genetic surname for men – and sometimes there is absolutely no question.  But women aren’t so lucky.

When adoptees order these reports, I suggest, strongly, that they also have the Family Finder test at Family Tree DNA performed.  This gives me two tools to work with, and they can be used together.

Recently, I completed a report for Caroline.  Here’s the sum knowledge of what she knew about her biological family.  She was born in Flagstaff, Arizona to a mother who was a college student.  That’s it.  Let’s just say there was a lot of opportunity for DNA to help Caroline.  Caroline said to me, “I don’t know the names of any of my blood relatives.”  Well Caroline, we’re about to fix that!!!

And indeed, she does now, through the magic of DNA and a little sleuthing.  Caroline, it turns out, is one of the lucky ones – she had a good match and that match has led us to well, a Davenport…and more.

                      Davenport

No, not this kind of Davenport – well – maybe not – but the Davenport family.  Maybe it’s the same Davenport family, because although the word davenport is generic like “Kleenex” today, it all started with the Davenport family, a Massachusetts furniture manufacturer, the A. H. Davenport Company.  Hmmm….I wonder.

Using Family Finder, Caroline had a solid second cousin match.  She contacted this person, we’ll call him Mr. Midkiff, who provided some initial information, but the 4 surnames Mr. Midkiff listed as Ancestral Surnames proved to be much more useful than the information provided to Caroline.

Often, it’s a good idea to list as many surnames as you possibly can, but in this case, Mr. Midkiff only listed 4 plus his own, for a total of 5 to work with, so I’m betting here that they are Mr. Midkiff’s closest surnames, meaning the grandparents generation plus one great-grandparent surname.

With that, I used the handy-dandy genetic relationship chart to show Caroline how this works.  One of the reasons I love this chart is because it’s all related to “self,” so you don’t have to try to figure out where and how you fit into the chart.

adopted cheat chart

If Mr. Midkiff is her second cousin, and she is “self” then we can see that self and the second cousin connect via great-grandparents. Mr. Midkiff’s great-grandparents would have the following surnames, plus three additional.

  • Midkiff
  • Davenport
  • Jennings
  • Potter
  • Veach
  • 3 additional unknown

These are the surnames of Mr. Midkiff’s ancestors and it’s all we have to work with since we don’t know the surnames of Caroline’s ancestors.

Using the chart and retrofitting surnames, we know that of Mr. Midkiff’s 5 surnames, 2 or 3 come from his mother’s side and 2 or 3 from his father’s side.  We know genetically that Caroline is related closely to at least one of those 5 lines, and possibly to more than one, meaning 2 or 3, depending on how closely she and Mr. Midkiff are actually related.

Next, we need to figure out which of those 5 surnames Caroline is related to.

Caroline only had one close match, but she had 960 total matches.  In order to be able to sort through those matches, I entered the 5 surnames listed by Mr. Midkiff as Caroline’s surnames.  This allowed me to then search for these ancestral surnames and to see them bolded in Caroline’s match list.

davenport 1

Because of different surname spellings, instead of simply relying on the search, I went through page by page and looked at each bolded surname.  I discovered that this was a very good move, because the Davenport family was spelled any number of ways, like Diefenback, Dieffenback, etc.  The Ancestral Surname search does not pick up alternate spellings, but the bolded surnames in the lists sometimes do.
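
For anyone who keeps their match list in a file, the spelling problem described above can be handled with a small lookup table.  This is only a rough sketch: the variant table and the match list are made up for illustration (using spellings mentioned in this article), and the function names are my own, not anything Family Finder provides.

```python
# Hypothetical variant table; a real one would be built up by hand
# while paging through the match list and noting alternate spellings.
VARIANTS = {
    "davenport": {"davenport", "diefenback", "dieffenback"},
    "veach": {"veach", "vaux"},
}


def canonical_surname(name, variants=VARIANTS):
    """Map a spelling to its canonical surname, or None if unrecognized."""
    cleaned = name.strip().lower()
    for canonical, spellings in variants.items():
        if cleaned in spellings:
            return canonical
    return None


def tally_matches(match_surnames, variants=VARIANTS):
    """Count matches per canonical surname, counting each person once per surname."""
    counts = {}
    for surnames in match_surnames.values():
        found = {canonical_surname(s, variants) for s in surnames}
        found.discard(None)  # ignore surnames we are not tracking
        for canonical in found:
            counts[canonical] = counts.get(canonical, 0) + 1
    return counts


# Made-up match list for illustration only.
matches = {
    "Match 1": ["Dieffenback", "Smith"],
    "Match 2": ["Davenport", "Diefenback"],
    "Match 3": ["Vaux"],
}
print(tally_matches(matches))
```

Because a person can list more than one of the target surnames, the per-surname counts can add up to more than the number of people, which is worth remembering when reading any tally like this.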

A total of 13 people matched one or more of these surnames.

Her matches sort out like this:

  • Midkiff – 1
  • Jennings – 5
  • Davenport – 3
  • Potter – 4
  • Veach – 1 (spelled Vaux)

I grouped people into categories by their surnames and then began using the Chromosome Browser to compare people to Caroline.

Normally, I could compare all 13 people in 3 comparisons (the browser allows 5 selections per comparison), download them, and then use a spreadsheet to sort by chromosome matches, but the downloads have been experiencing technical difficulties recently, so instead, I simply compared randomly and then by surname group.
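
When the downloads are working, the batching and sorting I just described might look something like the sketch below.  The match names and segment rows are placeholders, and the row layout (name, chromosome, start, stop) is an assumption about how you might arrange a downloaded file, not the actual download format.

```python
def batches(items, size=5):
    """Split a list into consecutive groups of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


# 13 placeholder names standing in for the surname matches.
people = [f"Match {n}" for n in range(1, 14)]
groups = batches(people)
print(len(groups))               # number of browser comparisons needed
print([len(g) for g in groups])  # how many people go in each comparison

# Hypothetical downloaded rows: (match name, chromosome, start, stop)
segments = [
    ("Match 2", 12, 500, 900),
    ("Match 1", 3, 100, 400),
    ("Match 3", 12, 450, 950),
]

# Sort by chromosome, then by start position, so that segments
# falling in the same region end up next to each other.
by_position = sorted(segments, key=lambda row: (row[1], row[2]))
for row in by_position:
    print(row)
```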

One of the great options in the Chromosome Browser is the option for “common surnames” which then displayed all 13 of her common surname matches and no non-matches.  So I, thankfully, did not have to sort through 960 people to find the 13 she matches for comparison.

Below, with the chromosome browser set to 1cM, you can see her matches to the Davenport group, plus a Fry who lists Potter as her ancestral surname but also matches the Davenport group.

davenport 2

What we are looking for here are people who match Caroline on the exact same chromosome segments and match each other as well.  This allows us to identify that segment with that surname.  In this case, chromosome 12 fits that bill exactly.
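
The test described above, same segment plus matching each other, can be sketched in a few lines.  This is an illustration under assumptions, not how the Chromosome Browser works internally: the names, positions and the list of who matches whom are all invented, and you would have to confirm from the matches themselves (or a third party tool) which pairs actually match each other.

```python
from itertools import combinations


def overlap(seg_a, seg_b):
    """Return the shared (start, stop) of two segments, or None if they don't overlap."""
    start = max(seg_a[0], seg_b[0])
    stop = min(seg_a[1], seg_b[1])
    return (start, stop) if start < stop else None


def triangulated_pairs(segments, chromosome, known_pairs):
    """Find pairs of matches whose segments with the tester overlap on
    `chromosome` and who are also recorded as matching each other."""
    on_chrom = [row for row in segments if row[1] == chromosome]
    result = []
    for row_a, row_b in combinations(on_chrom, 2):
        if row_a[0] == row_b[0]:
            continue  # same person, two segments
        shared = overlap(row_a[2:], row_b[2:])
        if shared and frozenset((row_a[0], row_b[0])) in known_pairs:
            result.append((row_a[0], row_b[0], shared))
    return result


# Invented rows: (match name, chromosome, start, stop) versus the tester.
segments = [
    ("Davenport A", 12, 100, 300),
    ("Davenport B", 12, 150, 350),
    ("Fry", 12, 120, 280),
    ("Other", 12, 400, 500),
    ("Davenport A", 5, 10, 20),
]

# Invented record of which matches are known to match each other.
known_pairs = {
    frozenset(("Davenport A", "Davenport B")),
    frozenset(("Davenport A", "Fry")),
}

for pair in triangulated_pairs(segments, 12, known_pairs):
    print(pair)
```

Note that the pair "Davenport B" and "Fry" overlap on the same stretch but are left out here because nothing records them as matching each other.  An overlap with the tester alone is not enough, since the two people could match on opposite sides of the tester's family.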

Davenport 2 ch 12

So Caroline, welcome to the Davenport family!!!

However, since Ms. Fry does not list Davenport, but does list Potter, let’s take a look at that Potter group.

davenport 3

Now, this gets very interesting, because look at that same segment of Chromosome 12 – in addition to  the Davenport folks, it also matches a Pinson who lists both Jennings and Potter in their list of ancestral surnames.  So the Davenport DNA is also Potter DNA.  Welcome to the Potter family Caroline!

Davenport 3 ch 12

So, let’s take a look at the Jennings folks.

davenport 4

Again, let’s look at Chromosome 12 and indeed, 4 of the 5 people who carry the Jennings surname also match Caroline on that same segment of Chromosome 12.

Davenport ch 12

What does this tell us?  Well, it tells us that this chromosome is inherited from the same ancestor.  What I can’t tell Caroline is which ancestor.  What we can say is that all three of these surnames, and all of these individuals share that ancestor and the chromosome is inherited through the Jennings, Davenport and Potter families in a particular family line – in Caroline’s family line and also in Mr. Midkiff’s.  Now it will be up to genealogy, and contacting these matches and asking for their Davenport/Potter/Jennings ancestry, to disclose just how these people’s ancestors are related.

Oh yes, and before I forget, welcome to the Jennings family Caroline!

So, here’s what I’m guessing.  Caroline has in essence no matches to Midkiff (other than the initial match to Mr. Midkiff) or Veach.  However, both Caroline and Mr. Midkiff have several matches, including the same segment of chromosome 12, to Jennings, Davenport and Potter.  I’m guessing that this is Mr. Midkiff’s mother’s side of the family and that if Caroline were to contact all of these people, she would, by process of elimination, discover commonalities in their pedigree charts and genealogy.  Then, by working forwards from what she finds, she can, again, by process of elimination, hopefully, find a line of the family that went to Arizona and candidates for one of her parents.

Maybe one of you holds the answer to Caroline’s quandary.  Does anyone know of a family with some history in Texas and in Arizona that carries the surnames Jennings, Davenport and Potter and perhaps married into the Veach or Midkiff family?  If so, you can perhaps put some color into Caroline’s mysterious Davenport family.  Contact Caroline directly at cbfernandez@gmail.com.  She would love to hear from you.

Davenport 5

Caveat:  Please note that this level of autosomal research is not normally included in a Personalized DNA report which focuses on either the Y-line or the Mitochondrial DNA lines.  Some research is included and was included for Caroline, identifying the Davenport common line.  The balance of this research was performed for the blog posting, with Caroline’s permission of course.  This type of autosomal research is available through www.dnaexplain.com at an hourly rate.  Everyone’s situation is unique and varies, and it is impossible to create a standard report product for autosomal situations.  Generally, a good approach is to start with a Y-line or mitochondrial DNA report and move forward from there.  You can see what it did for Caroline!

Still Part Redman Deep Inside

Do you have a persistent story of Native American heritage in your family?

Standing Bear, Ponca, 1877

Mark Green’s wife did.  Her ancestor Nancy Pittman’s mother was supposed to be a Cherokee Indian.  If your family was from the south, chances are you have some similar story.

Mark tracked her story both through DNA and the Cherokee records.  Her DNA showed 1% Native ancestry, but the records pertaining to the Guion-Miller Roll provided additional information.  It’s most interesting, because although the paperwork having to do with her 1907 application is ambiguous, with the application subsequently denied, the DNA, some 100 years and a few generations later, isn’t.

Here’s Mark’s article about the family story, his research and what he found.  Sometimes a little footwork goes a long way – and there are lots of records available having to do with the Cherokee and 5 Civilized Tribes who were removed to Oklahoma.

http://southerngreens.blogspot.com/2013/04/im-still-part-redman-deep-inside.html

Digging Up Dad, Exhumation and Forensic Testing Alternatives

Dad in suit

I didn’t do it.  I really didn’t.  Ok, I wanted to, but I didn’t.

Yes, I seriously considered exhuming my father.  Ok, now that you’ve stopped gasping, let me tell you about the story, and what I did instead, and how successful it was, and wasn’t.

My father, William Sterling Estes, died in a car accident in 1963.  That means he’s been dead now for 50 years, half a century.  Depending on the source, he had between 2 and several children.  His obituary names me as his daughter, then inadvertently mixes up the name of my mother, his ex-wife, with that of his sister.  So my mother is listed as my father’s sister in his obit and his sister isn’t listed at all.  Neither is his other daughter, my half-sister.  For any of you who follow my family story, you already know it’s bizarre, so this unfortunate error should come as no surprise and would only provide Jeff Foxworthy with fodder for his “you might be….if” series.

But, as you’ll see, that obituary is part of the problem and so is the fact that he has been dead 50 years now.  That’s 50 years for his DNA to degrade.

My father was, well, ahem, somewhat of a playboy.  I keep finding children, and rumors of children, scattered about as I keep researching.  I keep waiting for a solid half-sibling match to some poor unsuspecting person on one of these autosomal tests too.  It hasn’t happened yet, but I’m just sure that one day it will.

And I haven’t published my blog article on Ilo yet, but suffice it to say that if you know of an Ilo (or maybe Flo?) who had a male child about 1920 in or near Battle Creek, Michigan and was briefly “married” to William Sterling Estes who was serving at Camp Custer at the time….I need to talk to you.

Now you’d think with all of these alleged children, there would be a male child to test, but the only male child I knew of back when DNA testing began was the male child of Ilo who I have never been able to identify, let alone locate.  I hadn’t found my “brother” Dave yet at that time, but as it turned out, Dave’s DNA did not match the Estes line anyway, so that would have been a red herring.

My Estes line out of Claiborne County Tennessee, for all of the males in earlier generations, dwindled to only a few, then to none in my generation.  The best I could do was a descendant of a male 3 or 4 generations upstream in my tree, and where there are paternity questions in more recent generations, a descendant from up the tree isn’t helpful, or wasn’t before autosomal testing.

Ah yes, that paternity question.  You see, it wasn’t definite.  A descendant tested the Y chromosome, and he was off just enough markers to be considered a problematic match.  But, it was enough to introduce doubt.  And doubt is a horrible nag for a companion – especially for the family genealogist who has spent the past three and a half decades working on this “doubtful” family.  In other words, OMG!!!  This was the genealogical equivalent of a panic attack.  And what could I do?  There was no one else to test.

On the chart below, the green line is the Estes ancestral line, as we know it today, proven by both genetics and genealogy.  The purple is the anonymous participant that tested and had the questionable match to the green ancestral Estes line.  The yellow group was then “suspect” because of the questionable match.  When I found David, supposedly my father’s son, and he tested, matching neither the purple participant nor the Estes ancestral line, it nearly put me over the edge.  My cousin, Buster, agreed to test, which confirmed the ancestral Estes line back to Lazarus, which left the yellow still in the questionable realm.  There were no living males to test in the yellow line.

Digging up dad 1

So, I considered exhuming Dad.  That possible paternity issue had shaken me, pretty much to the bone, and I desperately wanted to know.  Was I barking up the wrong tree?  Was my Dad not my Dad, but David’s Dad?  David and I clearly were not genetic half-siblings, suggested at that time by CODIS testing, but proven eventually by 23andMe testing.  Was my Dad not the child of his father, William George?  Was his father maybe not the child of his father, Lazarus?  Why did my grandfather not look like the other Estes men?  We knew that John R. Estes matched the ancestral Estes line, but we had no one else to test below John R. on the tree.

Below, my great-great-grandfather, John Y. Estes, at left, my great-grandfather, Lazarus, center and my grandfather, William George, at right.

Digging up dad 2

Why did my son look so much like my father?  Was I just seeing things that weren’t there?  Below, my father as a teen in his military uniform and my son about the same age.

digging up dad 3

Without a male to test the Estes Y-line DNA, how would I ever know?

One day, a package arrived in the mail.  My step-mother had died some years ago, and her daughter had found a group of letters in her mother’s belongings that she felt I should have.  Among those letters were letters from my grandfather to my father.

Letters?  Envelopes?  Stamps?  Saliva?  DNA?  JACKPOT!!!  WOOHOOOO!!!!!!!

At the time my grandfather mailed those letters to my father, in the 1960s, my grandfather was living alone, so he should have licked the envelope and the stamp himself.

I called Bennett Greenspan at Family Tree DNA.  He referred me to a private lab that “does things like this,” called Trace Genetics.  Before you start googling, the company was subsequently sold and has now been defunct for years.  However, at that time they were doing custom processing of private forensic samples.

Yes, anything like that is considered forensic.  Anything you have to extract DNA from before you can have it processed in a regular lab is forensic work.

So, I got an estimate, took out a loan, and told them to go ahead.  You think I’m kidding, but I’m not.  The cost was in the $2000 range FOR EACH ATTEMPT.  So, we tried the envelope first.  No DNA.  Then we tried the stamp.  We got DNA, but it was female, so we knew it was contaminant DNA.  Think of how many people handle an envelope in the processing and delivery of mail, not to mention all the people who had handled it since.  Then we tried a second envelope.  No dice.

I was beyond frustrated and so were the two wonderfully patient scientists I was working with at Trace Genetics.  We all desperately wanted DNA.  In all fairness, they told me very clearly up front that there was a less than 50% chance of obtaining  ANY DNA, let alone usable DNA, let alone Y-line DNA.  Yes, the odds were very much stacked against me, and I knew it.

Y-line DNA is the least obtainable.  Most forensic work is done using mitochondrial DNA.  That’s because in each cell there is a total of 1 Y chromosome and there are thousands of mitochondria.  So the chances of recovering mitochondrial DNA are much greater than the chances of recovering Y chromosome DNA.

Still, I had to try.  If you’re thinking of the word obsessed, I certainly wouldn’t argue with you.

Then I remembered, I had my father’s VFW hat.  I had it stored away in an old train case with other memorabilia from my childhood.  That was the one and only thing of my father’s I ever had – that hat.  I still remember him wearing it and I remember going to the VFW hall with him.  They had a slot machine and sometimes he used to let me pull the arm on the machine.  That was great fun.

I asked my friendly scientist at Trace Genetics what to do with the hat.  He suggested that I look for hairs in the interior of the hat, under the hatband, and then he told me how to remove the hair, using sterile gloves, without touching it myself.  I did so, put the hair in a Kleenex, put the Kleenex in an envelope and overnighted it to Trace Genetics.  This hair had the all-important follicle attached, the part of the hair needed to provide nuclear DNA such as the Y chromosome.

I was positive, just positive, that this time was the jackpot.  But it wasn’t, and neither was the next hair.

Are you adding up the numbers in your mind?  Well, I assure you, I was adding them up.  And it wasn’t the money that bothered me, but the lack of results.  I was devastated.

Dad tombstone

So, I considered exhumation.  I looked into it, and I discovered a couple of things that were very important and were likely show-stoppers.

  1. In order to exhume someone, you have to petition the court and give a reason.  Then, you have to obtain the written, notarized, permission from every single descendant.  Yes, I said EVERY SINGLE DESCENDANT.  If even one disagrees, or refuses, it’s done, a deal-killer, dead.
  2. The cost of said exhumation is about $20,000 including all expenses, like attorney fees, backhoe, medical examiner, etc.

Choke, sputter, cough….clutching chest….

I happened to know someone who actually did exhume their ancestor, not for DNA testing, but because the cemetery was going to wind up at the bottom of a lake.  And yes, the entire process did cost in the neighborhood of 20K, a price-tag they did not anticipate in advance nor expect.

I had my doubts that any court would approve an exhumation for obtaining DNA for genealogy, but they might approve it to move the grave to Tennessee where my father’s family was buried.  Dad was (and is) buried alone in Indiana.

But to move him, the cost of the exhumation would increase substantially.  Moving a body, which is considered medical waste, is not inexpensive.  By way of comparison, to bring my sister home from Arizona to Michigan for burial was in the neighborhood of 10K.  And that would have been in addition to the 20K for exhumation.

For a minute, I thought about my brother, Dave, the long haul truck driver and I wondered if he had any room in that truck between pallets of yogurt.  But I got a grip on myself before asking him. I had visions of Dave putting Dad back in the sleeper cab…but I digress.

Ok, now we were talking the price of a car or a small house…a vacation home maybe or a trip around the world.  And it wasn’t 2K at a time, but an all or nothing proposition.

Not only did I not have the 20K or 30K, I couldn’t justify borrowing it, so I decided to let sleeping Dads lie, so to speak.

I also decided that really, while I desperately did want to know about the paternity issue and its resolution, that I’m an Estes no matter what.  It’s my maiden name, it’s my name now that I’m married (I married a Kvochick, need I say more) and it will be my name on my tombstone.  So, I’m an Estes no matter whether I descended from them genetically or not.

I intentionally have not addressed any moral or ethical issues about exhumation.  Some feel the dead should be left alone, undisturbed.  However, there is precedent… the Catholic church regularly exhumes their saints to see if the body is well preserved.  I didn’t know what to think, truthfully, along those lines, and before I could have and would have actually made that decision, I would have had to think long and hard about it.  Would I have been there for the exhumation?  Could I have stayed away?  Would I have wanted to see my father like that?  All questions I would have had to answer, but did not have to, because the other issues precluded exhumation.

The first issue I would have encountered was who, exactly, were his descendants, and how, exactly, legally, was that determined?  I mean, does the court go by the obituary?  If so, my mother was his sister.  But I had a real half-sister.  Was she included?  No place did it say that she was his descendant.  He didn’t have a will.  And what about the children we knew about but couldn’t find?  Would that preclude the exhumation?  Or should we just stay quiet about them?  No, too many ethical issues and thorny problems, and that is BEFORE you get to the money issue.

I’m glad I didn’t slog through that mess, because before long, autosomal testing came about – not CODIS testing – which was inconclusive at best – but wide spectrum testing using hundreds of thousands of DNA positions, today’s 23andMe and Family Tree DNA’s Family Finder tests.

I have several Estes cousins who aren’t direct male lines but who are fairly close genetically and I’m not related to any of them through any other genealogical lines.  If I matched them, it would be proof positive that I indeed was a blood descendant of the Estes line.  I wasn’t happy testing just one or two, so I tested 5 or 6 of my cousins from different children of my great and great-great-grandfather – and yes, I did indeed match all of them.

What a relief!  I didn’t have to dig up Dad or spend the equivalent of a couple years of college education.

But for those who are indeed as desperate as I was, let me tell you the following.

  1. There are very few labs that will do this kind of processing.  It is very unpopular as you basically have to shut the entire lab, sanitize it, and run no other tests until you are done.  You can see a forensic lab clean room in Ripan Malhi’s lab at the University of Illinois.
  2. Best case, with a relatively recent sample, meaning one from someone who died recently, you have about a 50% chance of useable DNA retrieval.  That’s BEST CASE.
  3. Skin is good.  The best source is the contents of an electric razor.  Do NOT touch them.  Put the entire razor with contents into a plastic bag and DO NOT seal it.  Keep it in a temperature stable environment.  No attic or basement.  Sometimes hairbrushes have skin flakes in with the hair.
  4. Hearing aids are good.  Again, do not touch, etc.  Blood is good.  Spit is good.  A Kleenex is wonderful, providing you are sure it is their Kleenex.  If your mother was like my mother, check her bathrobe pockets.
  5. Older things like hair, sweat, envelopes etc. are not so good.  The older the sample, the less likely you’ll be able to retrieve DNA.  It degrades with time and these aren’t particularly good to begin with.
  6. Digging up a grave without doing all of the paperwork is illegal, and the legalities vary by locality – so consult an attorney and get the check book ready.  I just thought I should mention that little illegal detail, just in case.  I know genealogists are innovative and sometimes desperate people.

Having said all of that, don’t go throwing anything away.  There is new technology on the horizon that will only need one cell of DNA – so I’m told.  Seeing how far we’ve come in the past decade, I don’t doubt that someday this will be true, and someday may be closer than you think.  And no, I do not know how far away that horizon is.

So, store your DNA item safely.  Label it.  Do not seal it in plastic.  Do not store it in the attic (heat) or basement (cold, humidity) but someplace fairly temperature regulated.

One time when working with an archaeological specimen, we were told to freeze the sample.  Well, we did, in a plastic cool-whip container with water.  However, the electricity went out while the person whose freezer the specimen was stored in was out of town.  Their friend went to their house and did them the very big favor of disposing of everything in the fridge and freezer before they came home.   Needless to say, we were just sick.  So, don’t freeze it either.  Besides that, freezing in a frost-free refrigerator (that by definition defrosts itself regularly) is not the same as freezing a specimen in a laboratory temperature controlled stable environment.

So, what’s the upshot of this?

  • Forensic genetics is expensive
  • Exhumations are extremely expensive and fraught with all kinds of legal and technical landmines
  • There are very few labs, if any, that will process private forensic samples
  • When DNA is retrieved from a forensic specimen, it may be a contaminant, not the DNA of the person you think it belongs to
  • When DNA is retrieved from a forensic specimen, you still have to pay for the DNA testing, in addition – and it may not work
  • When DNA is retrieved from a forensic specimen, if it does amplify, it will most likely be mitochondrial DNA
  • Using today’s combined genetic genealogy tests, there is almost always a way around the lack of a particular DNA donor, making exhumation and/or forensic testing unnecessary

And if you’re considering grabbing a shovel, an urge which I well understand, I’ll leave you with the advice of an ethicist that Family Tree DNA invited to speak at their annual conference a few years ago, “Don’t do anything in the dark of night that you wouldn’t do in the middle of the day.”  Put another way, don’t do anything you wouldn’t be comfortable seeing in the headlines, because if you get caught, that’s where you’ll be:)

But then again, those headlines would certainly be something interesting for future generations of genealogists to dig up about you!

The Autosomal Me – Extracting Data Segments and Clustering

This is Part 8 of a multi-part series, “The Autosomal Me.”

Part 1 was “The Autosomal Me – Unraveling Minority Admixture” and Part 2 was “The Autosomal Me – The Ancestors Speak.”  Part 1 discussed the technique we are going to use to unravel minority ancestry, and why it works.  Part two gave an example of the power of fragmented chromosomal mapping and the beauty of the results.

Part 3, “The Autosomal Me – Who Am I?,” reviewed using our pedigree charts to gauge expected results and how autosomal results are put into population buckets.  Part 4, “The Autosomal Me – Testing Company Results,” shows what to expect from all of the major testing companies, past and present, along with Dr. Doug McDonald’s analysis.  In Part 5, “The Autosomal Me – Rooting Around in the Weeds Using Third Party Tools,” we looked at 5 different third party tools and what they can tell us about our minority admixture that is not reported by the major testing companies because the segments are too small and fragmented.

In Part 6, “The Autosomal Me – DNA Analysis – Splitting Up” we began the analysis part of the data we’ve been gathering.   We looked at how to determine whether minority admixture on specific chromosomes came from which parent.

Part 7, “The Autosomal Me – Start, Stop, Go – Identifying Native Chromosomal Segments”, took a deeper dive and focused on the two chromosomes with proven Native heritage and began by comparing those chromosome segments using the 4 GedMatch admixture tools.

In this segment, Part 8, we’ll be extracting all of the Native and Blended Asian segments on all 22 chromosomes, but I’ll only be using chromosomes 1 and 2 for illustration purposes.  We will then be clustering the resulting data to look for trends.  If you’re following along and using this methodology, you’ll be extracting the Native segment start and stop locations from all 22 chromosomes.

I apologize in advance for the length of this article, but there was just no good place to break it into pieces.

So, let’s get started.  As a reminder, we are using the admixture tools at www.gedmatch.com.

I experimented with several types of extractions to see which ones best reflected the results found by both 23andMe and Dr. McDonald and confirmed by the start and stop segments in the highly Native segments of chromosomes 1 and 2 in Part 7 of this series.  We verified that all 4 tools accurately reflected and corroborated the segments listed as Native, so now we’re going to apply that same methodology to the rest of our chromosomal data.

Initially, I tried to use the information from chromosomes 1 and 2 to extract the Native segments using only the “best” tool, but when I looked at all 4 tools, I quickly realized that there was no single “best” choice.  A couple of crucial points came to light.

  • Some of the geographic colors are almost impossible to tell apart.
  • None of the tools are universally best.
  • When looking at all 4 tools, generally a “best 3 out of 4” approach allowed for one of the tools to be wrong, to perhaps reference a slightly different database that called the segment differently or for the colors to be indistinguishable.  In other words, if at least three called a segment Native, it’s Native and conversely, if fewer than 3 call it Native, in this comparison, it’s not.
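The “best 3 out of 4” rule can be written down as a simple vote.  This is only a minimal sketch: the function name, the dictionary layout and the example calls are made up for illustration, and in practice each True/False value comes from reading the chromosome painting by eye.

```python
from typing import Dict

def is_native(calls: Dict[str, bool], min_votes: int = 3) -> bool:
    """Return True when at least `min_votes` tools call the segment Native."""
    return sum(calls.values()) >= min_votes

# Hypothetical calls for one segment: True means that tool painted it Native.
example_segment = {
    "MDLP": True,
    "Eurogenes": True,
    "Dodecad": False,
    "HarappaWorld": True,
}

print(is_native(example_segment))  # three of the four tools agree
```

The threshold is a parameter so that a stricter 4 of 4 rule can be tried without changing anything else.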

Unfortunately, this created an awful lot of work.  This is probably the best example of where automation tools could and would make a huge difference in this process.

I did two separate extracts.  The first one is what I refer to as the “Strong Native” extract and the second is the “Blended Asian.”  In part, I did these separately as a check and balance to be sure that my first extraction was accurate.

In the first extract, I selected only one category, the one best fitted to “Native American” for each tool.  I used the following categories for each admixture tool:

  • MDLP – Amerind
  • Eurogenes – North Amerindian
  • Dodecad – NE Asian
  • HarappaWorld – American

I completed this process for every chromosome, but I’m only showing the first two chromosomes in this article.

By way of example, using the first tool, MDLP, North Amerind looks black, but is actually very dark grey.  It is, fortunately, distinctive.

On the chromosome painting below, my results for the first part of chromosome 1 are shown in the first band, and mother’s for the same segment are shown as the second band.  The bottom band represents common segments and the black is non-matching segments, meaning those I obtained from my father.  Sometimes this third band can help you determine what you are really seeing in terms of colors and blending, but it’s not always useful.  In this case, trying to spot a small amount of dark gray against black is almost impossible, so not terribly helpful.  But if you were looking for red, that would be another story.  As you move through this process, remember, it’s not exact and utilizing best 3 of 4 will help you recover from any major errors.

You can see that my grey segments show up from about 12-13 and then again at about 14.5.  Sometimes it’s difficult to know how to count something.  For example, my Native at 14.5 – it’s actually more like 14.25-14.5, but I chose not to divide further than half mb segments.  As long as you are consistent in whatever methodology you select, it will work out.
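One way to stay consistent is to pick a rounding rule and apply it to every segment.  The sketch below widens each segment outward to the nearest half mb boundary.  That particular rule and the function name are assumptions chosen for illustration, not necessarily the rule used to build the tables in this article.

```python
import math
from typing import Tuple

def snap_to_half_mb(start_mb: float, stop_mb: float) -> Tuple[float, float]:
    """Widen a segment outward to the nearest 0.5 mb boundaries."""
    snapped_start = math.floor(start_mb * 2) / 2
    snapped_stop = math.ceil(stop_mb * 2) / 2
    return (snapped_start, snapped_stop)

print(snap_to_half_mb(14.25, 14.5))
```

Any other rule, such as rounding to the nearest boundary, works just as well as long as it is applied to every chromosome and every tool.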

step 8 - 1

Please note that when reading these charts, the small hash mark is the indicator for the measure.  In other words, the small hash mark above 10M means that is the 10M location.  It’s obvious here, but on some charts, the hash mark and the location legend look to be 1-off.  Again, as long as you’re consistent, it really doesn’t matter.

Mother’s Native segments are more pronounced and obvious.  They range from about 8-14.  Using the actual tools, you would record this and then continue scrolling to the right until you reach the end of the chromosome.  On chromosomes 1 and 2, I found the strong Native segments for the four admixture tools, as shown below.

The boxed numbers show the areas that were found “in common” between 23andMe, Dr. McDonald and the admixture tools, as determined in Part 7 of this series.  Highlighted segments show segments where at least 3 of 4 admixture tools reported Native heritage.  As you can see, there were clearly additional Native segments not reported by 23andMe and Dr. McDonald.

Strong Native Chromosomal Detail Table

step 8 - 2

step 8 - 3

Because we have both my and mother’s results, we can infer my father’s contribution.  Clearly, some of his will wind up being some amount of “noise” and some IBS segments, but not all, by any means, and this is the only way to get a “read” on Dad.  This is one form of phasing data.  Phasing refers to various methodologies of figuring out which DNA comes from what source, meaning which parental line.
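The inference described above amounts to subtracting mother’s segments from mine and treating whatever is left over as coming from Dad.  Here is a minimal sketch of that subtraction, with segments given as start and stop locations in mb.  The function name is hypothetical, and the approach is a simplification: a segment that mother and I share could in principle have come from my father as well, and this sketch cannot see that.

```python
from typing import List, Tuple

Segment = Tuple[float, float]  # (start_mb, stop_mb)

def subtract(mine: List[Segment], mothers: List[Segment]) -> List[Segment]:
    """Return the parts of `mine` that no segment in `mothers` covers."""
    result = []
    for start, stop in sorted(mine):
        cursor = start  # left edge of the part not yet explained by mother
        for m_start, m_stop in sorted(mothers):
            if m_stop <= cursor or m_start >= stop:
                continue  # this segment of mother's does not touch what remains
            if m_start > cursor:
                result.append((cursor, m_start))
            cursor = max(cursor, m_stop)
        if cursor < stop:
            result.append((cursor, stop))
    return result

# Approximate reads from the MDLP example: mine at 12-13 and 14-14.5, mother's at 8-14.
print(subtract([(12.0, 13.0), (14.0, 14.5)], [(8.0, 14.0)]))
```

In this example only the piece from 14 to 14.5 falls outside mother’s range, so that is the piece that would be attributed to my father.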

While the strongest Native segments are the ones individually most likely to indicate Native American ancestry, that really isn’t the whole story.  I discovered that many of these Native segments are actually embedded in other segments that are indicative of Native heritage too.  In other words, it’s not a line in the sand, yes or no, but more of a sliding scale.

On the chromosome painting below, this one using Eurogenes, with my results shown above and mother’s below, you can see two excellent examples.  Regions relevant to Native ancestry include:

  • Red – South Asian
  • Brown – Southwest Asian
  • Yellow – North Amerindian and Arctic
  • Putty – Siberian
  • Emerald – East Asian

You can see that while mine is almost universally yellow, or Native, with a little Siberian (putty) mixed in for good measure between 169-170, a hint of East Asian (emerald) plus a little South Asian (red), mother’s isn’t.  In fact, hers is a mixture of Native American and South Asian (red), with more red than yellow, Siberian (putty) and a large segment of East Asian (emerald green).

step 8 - 4A

While her yellow Native segments alone would be staggered across this entire segment in 7 different pieces, when taken together as a whole, the “blended Asian” segment reaches entirely across the screen with the exception of 1 mb between 161.5-162.5, roughly.
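Taking the categories “together as a whole” is the same as merging every segment that touches or overlaps its neighbor, whatever its color.  The sketch below shows the idea.  The start and stop numbers are made up, chosen only so that the result shows a single 1 mb gap like the one described above; they are not the values from mother’s chart.

```python
from typing import List, Tuple

Segment = Tuple[float, float]  # (start_mb, stop_mb)

def merge(segments: List[Segment]) -> List[Segment]:
    """Merge segments that touch or overlap into continuous blended segments."""
    merged: List[Segment] = []
    for start, stop in sorted(segments):
        if merged and start <= merged[-1][1]:
            # Touches or overlaps the previous segment, so extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], stop))
        else:
            merged.append((start, stop))
    return merged

# Illustrative segments by color, all on the same stretch of one chromosome.
by_color = {
    "yellow (North Amerindian)": [(158.0, 159.0), (163.0, 164.0)],
    "red (South Asian)": [(159.0, 161.5), (164.0, 168.0)],
    "emerald (East Asian)": [(162.5, 163.0)],
    "putty (Siberian)": [(168.0, 173.0)],
}
all_segments = [seg for segs in by_color.values() for seg in segs]
print(merge(all_segments))
```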

The following Blended Asian Chromosomal Detail Table shows all of the blended Asian segments using all four of the admixture tools for chromosomes 1 and 2.

It’s clear that these regions are not solely “Native American” but reach back in time genetically into Asia, particularly Northeast Asia.

Again, the boxed numbers show the “in common” segments between all tools and the yellow highlighted segments are common between at least three of the four admixture tools.

Please note that there were some issues distinguishing colors, as follows:

  • For the MDLP comparison, Mesoamerican and Paleo Siberian are both putty colored and indistinguishable on the chart.  Also, the apple green for Arctic Amerind is very similar to the Austronesian.
  • When using Dodecad, Southeast Asian (light green) and South Asian (apple green) are nearly impossible to distinguish from each other on the graphs.
  • When using HarappaWorld, the apple green for Siberian was very similar to the light forest green for Papua New Guinea and was very difficult to distinguish.  The South Asian putty appears often with the other Native markers, and I considered including this group, but it too was difficult to distinguish from other regions so in the end, I opted not to include this category.
  • If you are colorblind – get help as this is impossible otherwise.

Blended Asian Chromosomal Detail Table

On the blended Asian Chromosome Detail Table, I added yellow highlighting where the same segments show in other Asian geographies that showed in the Strong Native table.  In each column, the Strong Native category is the last one at the bottom of the list.

The blue highlighting shows other common segments found that were not included in the Strong Native segments.  For a Strong Native yellow segment to be highlighted, it had to be present in 3 of 4 tools, or 75%.  In the Blended Asian group, there are a total of 15 categories between the 4 admixture tools, so for a segment to be shaded blue, it must be found in at least 8 of the categories, so just over half.  There are many segments that are found in several categories across the tools.  For example, segment 192-193 on chromosome 1 is found five times.  This isn’t to say you should discount this segment, only that it isn’t one of the strongest, most universal.  Surprisingly, there really weren’t too many that were close to the cutoff.  Several, but not a majority, were in the 4 or 5 range, only one was at 7.
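The two highlighting rules can be stated as a small function.  This is a sketch only: the function name and argument names are hypothetical, and the counts still have to be tallied by hand from the tables.

```python
STRONG_NATIVE_TOOLS = 4      # one Strong Native category per admixture tool
BLENDED_CATEGORIES = 15      # total categories across the 4 tools

def highlight(strong_native_count: int, blended_count: int) -> str:
    """Return the highlight color for a segment under the rules above."""
    if strong_native_count >= 3:   # 3 of 4 tools, or 75%
        return "yellow"
    if blended_count >= 8:         # 8 of 15 categories, just over half
        return "blue"
    return "none"

print(3 / STRONG_NATIVE_TOOLS)            # 0.75
print(round(8 / BLENDED_CATEGORIES, 3))   # 0.533

# Segment 192-193 on chromosome 1 is found five times, so it stays unshaded.
print(highlight(strong_native_count=0, blended_count=5))
```

The strong_native_count of 0 for segment 192-193 is an assumption for the example; the text only gives its total of five.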

step 8 - 4

 step 8 - 5

step 8 - 6

 step 8 - 7

step 8 - 8

step 8 - 9

 step 8 - 10

 Step 8 - 11

step 8 - 12

Clustering

The third step in data extraction is to look at all of the data together.  In this step, we are removing the geographic boundaries of Siberian, N. Amerindian, etc. and combining all of our data.  I have only combined the data within columns, not between columns, so we can get a feel for which tool or tools performed best or maybe not so well.  Each chromosome in each column has its data ordered numerically, and yes, this is a manual cut and paste process.  Sorry.  I warned you, this is a very manually intensive process.

After I put each column in numerical order, I arranged them so that the numbers were approximately in a line, or a row, with each other.  For example, in the first group below, you can clearly see that the first cluster of results is found using all 4 tools.  When looked at individually, only the blue results were noted as common (at least 8 of 15 for blue), but when viewed as a cluster, you can see between the tools that the cluster itself runs from about 7.5, with a small break from 8-9, and then to about 14.5.  As you would expect, the beginning and end points of the cluster trail off and are not uniform between tools, but the main part of the cluster is found in all the tools.  This introduces the question of how to measure a cluster.  In this case, there is a clean break using all tools between 8 and 9, but that is only 1 mb, rather difficult to measure accurately.  You could record this as two distinct clusters, but since it’s very closely adjacent to the rest of the cluster, I’m inclined to include this as one large cluster and use the starting and ending segments for the cluster as a whole.  In other words, the cluster runs from 7.5 through 14.5.  The alternate, or more conservative, methodology would be to use the “in common” numbers, but in this case, that would be only 10-11.5 and I think you would miss a great deal of useful data.  So, for clusters, I’m recording the full extent of the cluster.  In some cases, you may need to exercise a judgment call.
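The judgment call about the 1 mb break can be expressed as a gap tolerance when grouping segments into clusters.  The sketch below is one possible way to do it.  The per-tool segments are illustrative numbers picked to resemble the first cluster described above, not a copy of the table, and the `max_gap` parameter is an assumption introduced for the example.

```python
from typing import List, Tuple

Segment = Tuple[float, float]  # (start_mb, stop_mb)

def cluster(segments: List[Segment], max_gap: float = 1.0) -> List[Segment]:
    """Group segments into clusters, bridging breaks of up to `max_gap` mb."""
    clusters: List[Segment] = []
    for start, stop in sorted(segments):
        if clusters and start - clusters[-1][1] <= max_gap:
            clusters[-1] = (clusters[-1][0], max(clusters[-1][1], stop))
        else:
            clusters.append((start, stop))
    return clusters

# Illustrative segments from the 4 tools, pooled into one list.
pooled = [
    (7.5, 8.0), (9.0, 14.0),     # MDLP
    (9.0, 14.5),                 # Eurogenes
    (10.0, 11.5),                # Dodecad
    (7.5, 8.0), (9.5, 13.0),     # HarappaWorld
]

print(cluster(pooled, max_gap=1.0))  # one large cluster
print(cluster(pooled, max_gap=0.0))  # the break from 8-9 splits it in two
```

With a tolerance of 1 mb the cluster is recorded as 7.5 through 14.5; with no tolerance it is recorded as two clusters, 7.5-8 and 9-14.5.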

Let’s look at the second group of numbers, beginning with 18.5 in HarappaWorld.  This grouping runs through about 28.  Eurogenes found some blended Asian between 27-28.5 as well in two of the geographies, but overall, of the 15 categories, we don’t see much.  This could be a result of a number of things.  I could have had problems with the colors, or there may be only a very small amount that is categorized as something else by the other tools.  I would not consider this a cluster, and using our best 3 of 4 methodology eliminates this cluster from consideration.  This also holds true for 43-43.5.

However, the next cluster, from 55.5 to 58 is found in the Strong Native comparison, indicated by the yellow highlighting and is found using all 4 tools.  This is definitely a cluster.

step 8 - 13

step 8 - 14

step 8 - 15

step 8 - 16

step 8 - 17

step 8 - 18

step 8 - 19

Step 8 - 20

step 8 - 21

step 8 - 22

step 8 - 23

step 8 - 24

I’ve synthesized the cluster information into a list.  From the clusters above, I’ve created a list that I will be using in the next segment for data input into my spreadsheet of matches.  The blended segments below that include Strong Native segments are shown with yellow.

step 8 - 25

Using the GedMatch admixture applications, we’ve isolated the strongest Native and the Blended Asian segments and clusters in preparation for identifying specific Native family lines within our group of matches.

This process shows that, for the most part, the Strong Native segments picked up the strongest signals, about half of the segments that will be useful in determining Native admixture, although it does miss some.

When we use the clustering technique to view our results across all the admixture tools, we see a somewhat different picture emerge, adding several Blended Asian clusters.

In Part 9 of this series, we will use the highlighted Strong Native segments and the Blended Asian clusters, both of which suggest Native chromosomal “hotspots” to begin our comparison to our genetic matches for genealogical relevance.  In other words, using this information, we will determine which genealogical lines carry Native ancestry.

Part 9 may be somewhat delayed.  The good news is that Family Tree DNA is finishing work on their Build 36 to Build 37 conversion.  The bad news is that it fell right in the middle of writing this series.  When they finish Build 37, I’ll finish Part 9 of this series.  In the meantime, you can be extracting your minority segments using the tools and techniques that we have covered in Parts 1-8.

Ancestry Needs Another Push – Chromosome Browser

ancestry push

It seems that the genetic genealogy community is constantly doing battle with Ancestry in regards to Ancestry’s mediocre and at times, outright faulty autosomal DNA product, AncestryDNA.  AncestryDNA, similar to Family Finder at Family Tree DNA and the 23andMe test, matches you against others who have taken the test for “relatedness” across all of your ancestral lines.  I wrote a primer about autosomal testing in an earlier article, another comparing the various company offerings and a third comparing the actual results.

We were excited this week that Ancestry has finally lived up to their promise to provide our raw data files for download, albeit many months later.  However, they have apparently decided NOT to provide a chromosome browser.  According to genetic genealogists who spoke this week at RootsTech with Kenny Freestone, Ancestry’s product development manager, their logic is that their primary focus is to keep things simple for the newer users.  Just so you know, if you’re an Ancestry user, not only have they just called you “stupid” but they also insinuated that you are unable to learn and to be anything other than stupid.  Are you insulted?  I surely am.

Ok, let’s forget, for the moment, about the fact that Ancestry just insulted us and let’s look at why having a chromosome browser is important.

This is very simple.

Just because you have a paper genealogy match with someone, especially a distant DNA match, does NOT mean that is how you’re related to them. 

Ancestry does a good job of linking up people who match by connecting people in their trees.  But that doesn’t mean that connection is how they are genetically related.  Plus, we all know about the, ahem, “quality” of Ancestry trees.

ancestry push 1

Here’s an example.  This is a match to someone through my ancestor, James Claxton and his wife Sarah Cook.  However, what if I’m also related to this person through the Estes family too?  Or an unknown line?  Just because the paper connection is to James Claxton doesn’t mean the genetic connection is to him as well.  This person has over 11,000 people in his tree.  If we are from the same geography, it’s likely that we match on multiple lines.  What if we match on paper on two or three lines?  How do we know how we are genetically related – through which line or lines?

At Ancestry, you don’t – you can’t – because they want to “keep things simple.”   Let me translate – they would rather leave you with a vague “feel good” notion about who you are related to, even if it’s not true, than give you the tools to discover the truth.

We need a chromosome browser to let us see how and if the DNA we share with these people is really from the Clarkson/Claxton family or the Cook family, or if maybe it’s from another line that isn’t shown on the pedigree chart being displayed by Ancestry.

Let’s move to Family Tree DNA to see what a chromosome browser does for you.  At Family Tree DNA, three of my Vannoy cousins have tested.  By using the chromosome browser to look at their DNA compared to mine, we can identify some segments as “Vannoy” segments – meaning they unquestionably come from that line.  We do that by using triangulation. It’s easy.  Using 3 or more relatives from a particular line, if three or more match on a particular segment, you know that segment is from that family line.
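Triangulation can be sketched as finding the stretch of a chromosome where all of the cousins’ matching segments overlap.  The positions below are invented for illustration, and the function name is hypothetical.  A careful triangulation would also confirm that the cousins match each other on the same stretch, which this sketch does not check.

```python
from typing import List, Optional, Tuple

Segment = Tuple[int, int]  # (start, stop) positions on one chromosome

def triangulated_region(shared: List[Segment]) -> Optional[Segment]:
    """Return the region where every cousin's match with me overlaps, if any.

    `shared` holds one segment per cousin; at least three are required.
    """
    if len(shared) < 3:
        return None
    start = max(s for s, _ in shared)
    stop = min(e for _, e in shared)
    return (start, stop) if start < stop else None

# Made-up matching segments between me and three cousins on one chromosome.
cousin_matches = [
    (30_000_000, 52_000_000),
    (34_000_000, 60_000_000),
    (28_000_000, 49_000_000),
]
print(triangulated_region(cousin_matches))
```

A chromosome browser download is what supplies the start and stop positions; without one there is nothing to feed into a comparison like this.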

ancestry push 2

I’ve selected three cousins to compare to my results, above, and their results will be displayed using these colors.  Below, you can see that on chromosome 15, all 4 of us match on a significant sized matching segment.  That means that this segment is definitely “Vannoy.”  How does this benefit us?

ancestry push 3

Well, it benefits us in two ways.  Let’s say an adoptee, or someone who has hit a brick wall also matches us on this segment.  It tells us that they are also “Vannoy” or perhaps ancestors of Vannoys.  Ancestors of Vannoys?

ancestry push 4

Yes, Vannoy is of course made up of their ancestral names and lineages too, so in time, let’s say that a Hickerson matches this segment too.  Then we’ll know that this segment comes from Daniel Vannoy’s wife, Sarah Hickerson’s line.  Do you have any wives’ surnames in your lines that need to be identified?  This is one way to do it, but you can’t without a chromosome browser.  And you could be the one who is brickwalled with the answer just waiting…..if there was a chromosome browser.  Do you see why this is so important, especially given the number of people who have tested at Ancestry?

Pretty simple stuff, right?  Well, Ancestry doesn’t think so.  They think you’re not capable of understanding this.  Funny, both Family Tree DNA and 23andMe provide this capability and people use it and depend upon it daily.  If you don’t want to use it, you certainly don’t have to, but to deprive all of us of an absolutely critical component of genetic genealogy is unconscionable. It’s simply not acceptable.

What can we do about this?  CeCe Moore, Tim Janzen and Dave Dowell were at RootsTech this week where they spoke with Kenny Freestone, among others.  He says he does personally read the information submitted through the “Feedback” button.  That is apparently how Ancestry gauges what needs to be done and prioritizes items.  Of course, if most of their novice clients don’t know what they are missing, they won’t be able to ask for what they don’t know about.  They are living under the illusion that they ARE genetically connected to everyone whose tree shows, and through the common paper line, and that’s it.  They don’t know that Ancestry is intentionally leaving them in their “feel good” cocoon and intentionally withholding “the rest of the story” and with it, their ability to discover even more.

But we know better and we were all “new users” at one time.  Use the feedback button.

ancestry push 5

It’s at the top right of your DNA pages at Ancestry.  Send Kenny the message…..”Kenny, we need a chromosome browser.”

Pssst….pass it on.  Everyone needs to provide this feedback.  This is how we got the raw data released and it’s the only way we’ll ever convince Ancestry to implement a chromosome browser.  Facebook this posting, Tweet it, post it on groups and forums.  Get the word out.  Send Feedback!!!

ancestry push

Judy Russell, the Legal Genealogist, blogged about this today as well.

Downloading Ancestry’s Autosomal DNA Raw Data File

Well, the big day has finally arrived.  Ancestry has at last allowed us to download our raw data files.  To download yours, sign on to your Ancestry account and fly over the DNA tab.  You’ll see the selection, “Your DNA Home Page.”  Click on that.

ancestry download

Then click on “Manage Test Settings” to the right of the orange “View Results” box.  You’ll see the following screen.

ancestry download 1

Click on “Get Started” in the right hand box under “Download your raw DNA data.”  You will then be prompted to enter your password to receive an e-mail to allow the download.

ancestry download 2

The e-mail will arrive, and you will need to click the link in the e-mail, shown below, to activate the download.

ancestry download 3

Clicking on the e-mail link “Confirm Data Download” takes you to the next step on Ancestry’s website, below.

ancestry download 4

Clicking on the green “Download DNA Raw Data” link shows the following:

ancestry download 5

Shortly, your browser will do whatever it does to ask you if you want to save or display the file.

ancestry download 6

I use Internet Explorer and download files are automatically saved in the “download” folder.  I renamed it and moved it to someplace where I can find it, hopefully.  The good news is that if I “lose” it on my computer, it’s easy to repeat this process.

Now, what can you do with this file today?  Not a lot.  You can compare raw data segments with others who might download their files too, but life will be a lot easier when tools like GedMatch can accept these files and do something with them.  There were also rumors last fall that Family Tree DNA would support uploads as well when Ancestry released these files, the same as they do with 23andMe raw data files.  Let’s hope so.

However, today will be the first day these organizations see the raw data too, so expect a bit of lag time before anyone can process or incorporate this information.  Of course, it goes without saying that we have to address issues pertaining to file layout and compatibility.
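For readers who want to peek inside the file in the meantime, here is a minimal sketch that counts how many locations the file reports per chromosome.  The file layout is an assumption on my part, not something confirmed in this article: comment lines beginning with “#”, then a tab separated header of rsid, chromosome, position, allele1 and allele2.  Check your own file before relying on it.  The sample rows are made up.

```python
import csv
import io
from collections import Counter
from typing import Iterable

def snps_per_chromosome(lines: Iterable[str]) -> Counter:
    """Count data rows per chromosome, skipping '#' comment lines.

    Assumes a tab separated header row that includes a 'chromosome' column.
    """
    data_lines = (ln for ln in lines if ln.strip() and not ln.startswith("#"))
    reader = csv.DictReader(data_lines, delimiter="\t")
    return Counter(row["chromosome"] for row in reader)

# Made-up sample in the assumed layout, standing in for a real download.
sample = io.StringIO(
    "# illustrative sample, not real data\n"
    "rsid\tchromosome\tposition\tallele1\tallele2\n"
    "rs0000001\t1\t1000\tA\tG\n"
    "rs0000002\t1\t2000\tC\tC\n"
    "rs0000003\t2\t1500\tT\tT\n"
)
counts = snps_per_chromosome(sample)
print(counts)
```

To run it against a real download, pass an open file handle in place of `sample`.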

I’m hopeful that since Ancestry has the raw data files for everyone who has tested there, that they will do what the other two major players have done and create a chromosome browser where you can see who matches you on which segments and download that comparative information as well.  It’s not just the raw data we need, it’s the integrated tools to use it.  Hopefully we’re at the crawl before you walk stage and we’ll be walking soon!

The Autosomal Me – Start, Stop, Go – Identifying Native Chromosome Segments

This is Part 7 of a multi-part series.

Part 1 was “The Autosomal Me – Unraveling Minority Admixture” and Part 2 was “The Autosomal Me – The Ancestors Speak.”  Part 1 discussed the technique we are going to use to unravel minority ancestry, and why it works.  Part two gave an example of the power of fragmented chromosomal mapping and the beauty of the results.

Part 3, “The Autosomal Me – Who Am I?,” reviewed using our pedigree charts to gauge expected results and how autosomal results are put into population buckets.  Part 4, “The Autosomal Me – Testing Company Results,” shows what to expect from all of the major testing companies, past and present, along with Dr. Doug McDonald’s analysis.  In Part 5, “The Autosomal Me – Rooting Around in the Weeds Using Third Party Tools,” we looked at 5 different third party tools and what they can tell us about our minority admixture that is not reported by the major testing companies because the segments are too small and fragmented.

In Part 6, “The Autosomal Me – DNA Analysis – Splitting Up” we began the analysis part of the data we’ve been gathering.   We looked at how to determine whether minority admixture on specific chromosomes came from which parent.

Part 7, “The Autosomal Me – Start, Stop, Go – Identifying Native Chromosomal Segments,” takes a deeper dive and, focusing on the two chromosomes with proven Native heritage, begins by comparing those chromosome segments using the 4 GedMatch admixture tools.  In addition, we’ll be extracting Native segment chromosomal start and stop addresses that we’ll be using in a future segment.

Using Doug McDonald’s tool and the 23andMe results, we can begin with the following two Native segments, one each on chromosome 1 and 2.  These will be our reference points, because according to both sources, these are the largest and most pronounced Native segments, the strongest indicators, so they will be our best yardsticks.

            Chromosome 1                  Chromosome 2
23andMe     165,658,091 to 175,711,116    86,316,174 to 103,145,426
McDonald    165,000,000 to 180,000,000    90,000,000 to 105,000,000
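Since the two sources report slightly different boundaries, it can help to work out the range on each chromosome that both of them agree on.  The sketch below uses the start and stop addresses given above; the function name and the dictionary layout are just for illustration.

```python
from typing import Dict, Iterable, Optional, Tuple

Segment = Tuple[int, int]  # (start, stop) address on the chromosome

def agreed_range(ranges: Iterable[Segment]) -> Optional[Segment]:
    """Return the range covered by every source, or None if there is none."""
    ranges = list(ranges)
    start = max(s for s, _ in ranges)
    stop = min(e for _, e in ranges)
    return (start, stop) if start < stop else None

reference: Dict[str, Dict[str, Segment]] = {
    "chromosome 1": {
        "23andMe": (165_658_091, 175_711_116),
        "McDonald": (165_000_000, 180_000_000),
    },
    "chromosome 2": {
        "23andMe": (86_316_174, 103_145_426),
        "McDonald": (90_000_000, 105_000_000),
    },
}

for chromosome, sources in reference.items():
    print(chromosome, agreed_range(sources.values()))
```

On chromosome 1 the 23andMe segment sits entirely inside the McDonald segment, while on chromosome 2 the two overlap from 90,000,000 to 103,145,426.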

On all of these admixture graphs, my results are shown first, then mother’s, then the comparison between the two where the colored regions show common ancestry and the black shows nonmatching segments – in other words those contributed by my father.

Please note that Native contribution in this analysis is being evaluated by a combination of geographies.  In some cases, one individual will show as “Native” meaning in the case of MDLP “North Amerindian” and the parent (or child) will show as something similar, like “Arctic,” “South American” or “MesoAmerican.”  In order to normalize this, I have combined all of the geographies that are Native indicators.

MDLP

On the MDLP graph below, the legend indicates that these 4 regions are relevant to Native ancestry.

  • Army green – Mesoamerican
  • Lime Green – Arctic
  • Emerald – South American Indian
  • Grey – North Amerindian
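Combining the geographies can be thought of as relabeling each of these four MDLP categories with one common tag before comparing parent and child.  A minimal sketch follows; the tag “Native” and the function name are choices made for this example.

```python
# The four MDLP categories listed above, lowercased for matching.
MDLP_NATIVE_INDICATORS = {
    "mesoamerican",
    "arctic",
    "south american indian",
    "north amerindian",
}

def normalize(category: str) -> str:
    """Map any Native indicator category to 'Native'; leave others unchanged."""
    if category.strip().casefold() in MDLP_NATIVE_INDICATORS:
        return "Native"
    return category

# Mine and mother's labels for the same stretch may differ but normalize alike.
print(normalize("North Amerindian"), normalize("MesoAmerican"))
```

After normalizing, a stretch that reads North Amerindian in one person and Mesoamerican in the other is treated as the same Native indicator.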

Chromosome 1 – Native Segment

On the graph below, you can see that mother has more grey than I do from about 162-165, but then I have some grey that she does not at about 170.

step 7

A detailed analysis of the segment of chromosome 1 between 158-173 shows the following admixture:

On my results, the putty green, MesoAmerican, is scattered between about 158 and 173, in three segments.  The putty green in my mother’s results runs from 159-160.5 and then 167-170.5.  Therefore, my father, by inference, has a segment from about 162-165 and from about 170.5 to 173.

My teal, North Siberian, ranges from 162-163 and from 168-171.  My mother carries no teal in these segments, so this is inferred to be contributed from my father.

My dark grey, North Amerind, ranges from 162-165.5 and then from 168-169.5.  My mother’s range is from 161-165.5.  Therefore my grey segment at 168-169.5 is either recognized as MesoAmerican or Arctic Amerind in my mother.

Chromosome 2 – Native Segment

step 7 - 1

Chromosome 2 is quite interesting.  You can see that on my chromosome, the North Siberian begins at about 80.  Mom has none at that location.  My North Amerind begins at about 95 and extends to 105, where Mom’s begins in the same location but then transitions to a large segment of MesoAmerican which I do not carry.  I do have MesoAmerican, but mine begins about where hers ends and extends to about 105.  Mom’s North Amerind ends about 101, while mine continues to about 105.  She looks to have trace amounts beginning about 105 and extending through 115.

Eurogenes

The next graph shows the same chromosomes using Eurogenes.  Regions relevant to Native ancestry include:

  • Red – South Asian
  • Brown – Southwest Asian
  • Yellow – North Amerindian and Arctic
  • Putty – Siberian
  • Emerald – East Asian

Chromosome 1 – Native Segment

step 7 - 2

The difference between my chromosome 1 and my mother’s in this region is quite pronounced.  My mother’s is drenched in beautiful red South Asian, while I have absolutely none.  Some of the area where I have North Amerindian shows as South Asian on hers, but in other areas, there is no correlation.  It is expected of course, that there are areas where she has some ancestry and I have none, due to the fact that I only inherit half of her DNA, but she has a significant segment of East Asian between 163 and 164, and I look to have received only a very small portion.  The same is true of her Siberian segments at 163-164, but then I have Siberian that she does not at 169-170 and she has some that I don’t at 160-161.5.  Some of this difference can likely be explained, especially between the yellow North Amerindian and the red South Asian by slight differences in the DNA read and how it is categorized, but in other cases, the difference is real.  Looking at mother’s red segments from about 166.5 to about 168 and then looking at my corresponding region, you can see that I have nothing that hints at Native.  In that region, I clearly inherited from my father as well as my mother’s North European.

Chromosome 2 – Native Segment

step 7 - 3

As different as our chromosomes 1 were, one wouldn’t expect chromosome 2 to be so similar.  In the graph, I included my large South Asian segment surrounding 80, where Mom has a trace, although that is beyond the area indicated as Native by 23andMe and Doug McDonald.  In the range of interest, beginning at about 80, we find nothing until about 94 where mother and I both have North Amerindian segments that stretch through about 105.  Mom’s goes slightly further than mine, to about 105.5.  It’s interesting to note that in part of this region, on either side of 101, her Siberian and my North Amerindian are the same shape at the same location, so obviously the same DNA is being read and categorized as two different regions, probably due to my father’s admixture.

Dodecad

On the Dodecad graph of the Native segment, you can see the Native colors are in shades of green.

  • Putty – West Asian
  • Yellow-green – South Asian
  • Emerald – Northeast Asian
  • Light Green – Southeast Asian

To use Dodecad in a manner equivalent to the rest of the tools, it looks like Northeast Asian is the closest we would get to Native American, since that is where Native Americans lived just prior to crossing Beringia, so the greens should probably be evaluated as a group.  As can be seen on chromosome 1, they do clump together.  Even though West Asian is also found with this group, it seems to be outside the range, so I am not including it in the evaluation.
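Evaluating the greens as a group just means adding the three components together for each stretch of the chromosome and leaving West Asian out.  Here is a rough sketch in Python.  The percentages and the 1 Mb windows are invented for illustration, the function name is an assumption of this example, and the component names are simply the Dodecad labels listed above.

```python
# Components treated as Native indicators; West Asian is deliberately left out.
NATIVE_GROUP = ("South Asian", "Northeast Asian", "Southeast Asian")


def grouped_percent(window):
    """Sum the grouped components for one window; components not present count as 0."""
    return sum(window.get(name, 0.0) for name in NATIVE_GROUP)


# Hypothetical percentages keyed by window start (in Mb): not real results.
windows = {
    162: {"Northeast Asian": 2.0, "Southeast Asian": 1.0, "West Asian": 4.0},
    163: {"South Asian": 3.0, "Northeast Asian": 1.5},
    164: {"West Asian": 2.5},
}

grouped = {start: grouped_percent(w) for start, w in windows.items()}
for start, pct in grouped.items():
    print(f"{start} Mb: {pct:.1f}% grouped")
```

A window that only shows West Asian comes out as zero, which matches the decision not to include it in the evaluation.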

Chromosome 1 – Native Segment

You can see another example here of one segment being called South Asian in Mom’s and Northeast Asian in mine at about 170mb.

step 7-4

The Native, or in this case, Northeast Asian/Southeast Asian begins at about 162.5 where Mom’s and mine are very similar.  However, we diverge at about 164.5 where Mom begins with large segments of South Asian.  I have a little bit, but not much.  Beginning about 168, I have a large Northeast Asian segment, but she shows with South Asian there, although the segments are not exact.
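Spotting places where the same stretch is given different labels in two results, like the South Asian versus Northeast Asian call described above, amounts to checking for overlapping segments whose categories differ.  A small sketch in Python follows.  The segment boundaries are rough, hypothetical stand-ins rather than measured values, and the function name is made up for this example.

```python
def relabeled_overlaps(mine, mothers):
    """Return (start, stop, my_label, mothers_label) where segments overlap but labels differ."""
    found = []
    for m_start, m_stop, m_label in mine:
        for o_start, o_stop, o_label in mothers:
            # The overlap runs from the later start to the earlier stop.
            start, stop = max(m_start, o_start), min(m_stop, o_stop)
            if start < stop and m_label != o_label:
                found.append((start, stop, m_label, o_label))
    return found


# Hypothetical segments in Mb: (start, stop, category).
my_segments = [(162.5, 164.5, "Northeast Asian"), (168.0, 171.0, "Northeast Asian")]
mothers_segments = [(162.5, 164.5, "Northeast Asian"), (168.5, 170.5, "South Asian")]

overlaps = relabeled_overlaps(my_segments, mothers_segments)
for start, stop, mine_label, moms_label in overlaps:
    print(f"{start}-{stop} Mb: {mine_label} in mine, {moms_label} in mother's")
```

Segments that overlap with the same label are skipped, so only the candidate categorization differences are listed.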

Chromosome 2 – Native Segment

step 7 - 5

Chromosome 2 is quite simple using Dodecad.  Only two of the three groups appear.  Southeast Asian is absent, and South Asian is present only in trace amounts except for one small area between 79.5 and 80 on my chromosome.  As expected, Northeast Asian is more prominent.  Mother has a few areas that I don’t, which is to be expected.

HarrappaWorld

Last, we have HarrappaWorld.  American and Beringian are the Native American categories here.  Regions relevant to Native American heritage would be:

  • Teal – American
  • Periwinkle – Beringian
  • Lime Green – Siberia
  • Emerald – Northeast Asia

Chromosome 1 – Native Segment

You can see both Beringian and American embedded again at about location 169.  In mine, this entire block reads as American.

step 7 - 6

There is one large chunk of Northeast Asian showing for both results, but part of that region of my chromosome, between 163-164 shows as American instead of Northeast Asian.  The Beringian is scattered through the American, which I would expect.  The American runs either strongly or weakly through this entire segment from 163 to 175 in mine or to 179 in mother’s.  Surprisingly there is no Siberian at all.  I would have expected to see Siberian before Northeast Asian.

Chromosome 2 – Native Segment

step 7 - 7

Where on chromosome 1, we saw no Siberian, on chromosome 2, we find Siberian instead of Northeast Asian.  I have no Beringian, but mother has 4 segments.  Three of her 4 segments are embedded with American segments.  Two may simply be categorized differently in my results, but two, I did not inherit.

Analysis Discussion

What have we learned?

When we are dealing with small amounts of minority admixture, the testing companies may or may not pick them up directly.  Of course, part of this has to do with their thresholds for what is “real” and reportable, and what isn’t.  Aside from that, lack of identification of minority admixture probably has to do with which segments were inherited and their size, whether they have been isolated and identified as Native by population geneticists, and the robustness of the database sources the data is being compared against.

We can also see how difficult it is to sort through threshold matches, meaning what is Native, Asian, central Asian, etc.  Many of these differences are probably not actually differences between groups, but similarities with slight categorization differences.  Of course, it’s those differences we seek in order to identify our ancestral heritage.  Combining similar geographies may help reveal relationships masked by reporting and categorization differences.

Given that multiple sources have indicated Native ancestry, and on the same two chromosomes, I have no doubt that it exists.  Had any doubt remained, the exercises creating the MDLP Chromosome Map Table and reviewing the segments on chromosome 1 between 160 and 180mb would have removed any residual concerns.

The following table shows the results for the Native segments of chromosomes 1 and 2 beginning with the 23andMe and McDonald results, and adding the start and stop segments from each of the 4 admixture tools we used.

Resource         Chromosome 1                  Chromosome 2
23andMe          165,658,091 to 175,711,116    86,316,174 to 103,145,426
McDonald         165,000,000 to 180,000,000    90,000,000 to 105,000,000
MDLP             162,000,000 to 173,000,000    80,000,000 to 105,000,000
Eurogenes        162,500,000 to 171,500,000    79,000,000 to 105,000,000
Dodecad?         162,500,000 to 171,000,000    79,500,000 to 105,000,000
Harrappaworld    163,000,000 to 180,000,000    79,000,000 to 104,000,000
In Common        165,658,091 to 171,000,000    90,000,000 to 103,145,426

Although the start and end (or stop) segments vary a bit, all resources above confirm that the region on chromosome 1 between 165,658,091 and 171,000,000 is Native and on chromosome 2, between 90,000,000 and 103,145,426.  Those are the areas “in common” between all resources, which is shown in the last table entry.
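The “in common” entry is simply the overlap of all six reported ranges: the latest start and the earliest stop.  As a minimal sketch in Python, using the start and stop values from the table above (the function name `in_common` is an assumption, purely for illustration):

```python
# Start and stop positions reported by each resource, copied from the table above.
REPORTED = {
    "23andMe":       {1: (165_658_091, 175_711_116), 2: (86_316_174, 103_145_426)},
    "McDonald":      {1: (165_000_000, 180_000_000), 2: (90_000_000, 105_000_000)},
    "MDLP":          {1: (162_000_000, 173_000_000), 2: (80_000_000, 105_000_000)},
    "Eurogenes":     {1: (162_500_000, 171_500_000), 2: (79_000_000, 105_000_000)},
    "Dodecad":       {1: (162_500_000, 171_000_000), 2: (79_500_000, 105_000_000)},
    "Harrappaworld": {1: (163_000_000, 180_000_000), 2: (79_000_000, 104_000_000)},
}


def in_common(ranges):
    """Return the (start, stop) shared by every range, or None if they do not all overlap."""
    ranges = list(ranges)
    start = max(r[0] for r in ranges)  # latest start among the resources
    stop = min(r[1] for r in ranges)   # earliest stop among the resources
    return (start, stop) if start <= stop else None


common = {chrom: in_common(res[chrom] for res in REPORTED.values()) for chrom in (1, 2)}
for chrom, (start, stop) in common.items():
    print(f"Chromosome {chrom}: {start:,} to {stop:,}")
```

On chromosome 1 the start is set by 23andMe and the stop by Dodecad.  On chromosome 2 the start is set by McDonald and the stop by 23andMe.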

The concept of “in common” is important, because while any one resource may report something differently, or not at all, when all or most of the resources report something the same way, it is less likely to be a fluke or reporting issue, and is much more likely to be real.  We’ll be using this methodology throughout the rest of the articles in “The Autosomal Me” series.

In the next segment, Part 8, we’ll be extracting the actual start and stop addresses of the Native only segments, referred to as the “Strong Native” method, and the combined Native indicator segments, referred to as the “Blended Asian” method and looking at how we can use those results.

The Autosomal Me – DNA Analysis – Splitting Up

This is Part 6 of a multi-part series.

Part 1 was “The Autosomal Me – Unraveling Minority Admixture” and Part 2 was “The Autosomal Me – The Ancestors Speak.”  Part 1 discussed the technique we are going to use to unravel minority ancestry, and why it works.  Part two gave an example of the power of fragmented chromosomal mapping and the beauty of the results.

Part 3, “The Autosomal Me – Who Am I?,” reviewed using our pedigree charts to gauge expected results and how autosomal results are put into population buckets.  Part 4, “The Autosomal Me – Testing Company Results,” shows what to expect from all of the major testing companies, past and present, along with Dr. Doug McDonald’s analysis.  In Part 5, “The Autosomal Me – Rooting Around in the Weeds Using Third Party Tools,” we looked at 5 different third party tools and what they can tell us about our minority admixture that is not reported by the major testing companies because the segments are too small and fragmented.

In this segment, Part 6, “DNA Analysis – Splitting Up” we’re going to focus on specific aspects of those tools and begin our analysis of our minority ancestry.

Analysis.  Sounds like I’m climbing on the shrink’s couch.  But I’m not, I’m saving all my dollars for DNA kits!  Besides, I don’t want to stop!  This analysis, we’ll do by putting several pieces of data together and sorting the wheat from the chaff.  And yes, we’ll be splitting up…well…splitting our DNA up into pieces contributed by our father and mother.

Let’s start with looking at the DNA segments that mother and I share that are Native.

According to Doug McDonald, we have significant Native matches on chromosomes 1 and 2, and third party tools confirm that finding.  Unfortunately, the only company where Mom’s DNA resides is Family Tree DNA whose test did not reveal the Native ancestry.  23andMe did confirm Native segments in my DNA in those locations.

I’ve used several third party tools at GedMatch to see where Mom and I both have Native heritage, where she has it and I don’t, and equally as important, where I have it and she doesn’t.  Why is that so important?  Simple, it means my father had Native heritage too, and tells me on which chromosomes his Native DNA is located.  This could, when matching people in the future on particular segments, help to isolate who our common Native ancestor was, or at least which line.  That is the ultimate goal we are working towards with this entire process.

In this case, to identify my father’s Native lines, if Mom and I both have Native markers at a particular chromosome location, or neither of us does, the values are irrelevant, because any Native lineage there could have come from mother.  I did notice in a few cases that I had more than mother, and of course, in that situation, it means that my father contributed some too, or my mother had a misread in that region, or a categorization issue exists.  For that reason, I am looking for patterns, not single instances.  We’ll discuss using patterns in a future segment.

Using the MDLP chromosome mapping tool, as MDLP appears to be the most comprehensive, I created a spreadsheet using my results as a base.  I then added mother’s values in the spaces where I had no values, and then I highlighted my results in the locations where mother had no value.  The essence of this is that the red, bold, underscore values mean Mom had a Native result here, but I didn’t receive it.  A yellow highlighted cell means I got the entire amount from my father, because my mother has no percentage showing.  In other cases, of course, it’s possible that both mother and father contributed Native ancestry on some adjacent chromosome segments.  The MDLP mapping tool with my additions is shown below for chromosomes one through eight.  Chromosomes 9-22 are similar, but the chart is too big to display as a whole.  This provides an example of how to do this analysis with your own results.

MDLP Chromosome Map Table
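For anyone who would rather script this comparison than color cells by hand, here is a minimal sketch in Python of the logic described above.  The percentages, the category keys, and the function name `classify` are invented for illustration.  They are not values from the table, and nothing here is part of the MDLP tool itself.

```python
def classify(mine, mothers):
    """Classify one chromosome/category cell from the child's and mother's percentages."""
    if mothers > 0 and mine == 0:
        return "mother only, not inherited"       # red, bold, underscore in the spreadsheet
    if mine > 0 and mothers == 0:
        return "from father"                      # yellow highlight in the spreadsheet
    if mine > mothers > 0:
        return "father may have contributed too"  # or a misread / categorization issue
    return "uninformative"                        # neither has it, or it can all be mother's


# Hypothetical (chromosome, category) -> (my %, mother's %) values, for illustration only.
cells = {
    (1, "North-Amerind"): (1.2, 2.0),
    (2, "North-Amerind"): (0.0, 1.5),
    (3, "East-Siberian"): (0.8, 0.0),
    (4, "Arctic-Amerind"): (1.9, 0.6),
    (5, "Mesoamerican"): (0.0, 0.0),
}

results = {key: classify(mine, mothers) for key, (mine, mothers) in cells.items()}
for (chrom, category), label in results.items():
    print(f"chr {chrom:>2}  {category:<15} {label}")
```

Because a single cell can be a misread, the labels are best treated as leads to be confirmed by patterns across neighboring segments, not as conclusions.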

The results were very interesting.

My two primary regions, North-East-Europe and Atlantic-Mediterranean-Neolithic, were represented on every chromosome for both my mother and myself.  No surprises there.  The other regions would be considered minority admixture.

In 2 categories, North-European-Mesolithic and East Siberian, only my father contributed genetic material on some chromosomes and there were no chromosomes where my mother alone contributed.

In 1 category, Melanesia, only my mother contributed genetic material on some chromosomes and there were no chromosomes where my father alone contributed.

In all other categories, both parents contributed on some chromosomes where the other didn’t.  This is important, because it will allow me to associate a match with a particular segment of a chromosome on a particular parent’s side with Native ancestry.

In the minority categories for Native American, Mesoamerican, Arctic-Amerind, South America Amerind and North Amerind, grouped together, both parents contributed on some chromosomes where the other didn’t, and in two categories, on 3 chromosomes, I carry more than my mother, indicating an additional contribution from my father.

This is a repeated occurrence, with Native ancestry for my parents and me combined showing on a total of 42 chromosome locations across 4 geographic/ethnic categories, and in at least three cases, both parents contributed.

In the African categories, South African, Sub-Saharan and Pygmy, I had contributions from both parents on a combined total of 18 chromosome segments.  The African admixture, in total, was less than the Native, and they are assuredly below 5% combined.  If they were present at higher levels, I wouldn’t need to go through these genetic gyrations to prove or disprove the heritage and which parent contributed, because it would be evident in the testing results of all companies.

In our next segment, Part 7, we will be further scrutinizing Chromosomes 1 and 2 for additional information about Native heritage and assigning specific Native segments that I carry on various chromosomes to either my mother’s or father’s lineage.

Personal Genetics – Coming out of the Closet – Ostriches, Eagles and Fear

Ostrich

While most of the people subscribing to this blog are here because of genetic genealogy, genetic genealogy is only one piece of the picture of the future of personal genetics.  Ironically, it’s genetic genealogy that gave low cost genetics a push into the mainstream, some 7 or 8 years before 23andMe, the first personal health genetics company, launched in 2006.

This week, the magazine IEEE Spectrum, of all places, has an absolutely wonderful article, The Gene Machine and Me, about the future of personal genetics.  Many of these types of articles are sensationalized and full of what I call “fear-mongering,” but this one is not only excellently written, it’s accurate and interesting – a triple hitter home run as far as I’m concerned.

I’d like to talk for just a minute or two about the high points in this article, about this emerging technology, what it means to us and about fear.  I’ll be sharing my personal journey down this path.

For those who would like to know how next-generation technology works – by the way – that’s the chip technology employed by Family Tree DNA for the Family Finder product, 23andMe for all of their testing and the National Geographic Geno 2.0 project – this article has a very educational description that is understandable by regular air-breathing humans.  The next-next generation sequencing, discussed here and offered shortly by Ion Torrent, will certainly revolutionize personal genetics much as the Illumina genotyping chip technology has today.

The benefit of full genomic and exome sequencing, the new technology on the horizon for consumers, is in the information it will tell us about ourselves.  And I’m not referring to genealogy here, although that assuredly will be a big beneficiary of this new world of personal genetics.  For genealogists, there is mention of soon-to-be capabilities of sequencing from one single molecule of DNA.  For those of us with hair brushes and toothbrushes that we’ve been jealously guarding for years now, waiting for the technology to improve to the point where we can obtain the DNA of our dearly departed loved ones, this is going to be our ticket.  As excited as I am about that, that’s not the potential I’m talking about.  I’m talking about information about our own bodies and the potential future foretold in those genes.  Notice the word potential.

The information in our genes is seldom a death sentence.  In rare cases, it is, such as Huntington’s Disease.  If this disease runs in your family, you already know it and testing should be done in conjunction with genetic and/or medical counseling.  For these people, DNA testing will either confirm that they carry that gene, or relieve their mind that they do not.

For the vast majority of us, the information held in our genes is much less dire.  In fact, it’s a good news message, as it will provide us ample warning, an opportunity, to do something differently with our lives to prevent what might otherwise occur.  So it’s not a death sentence, more of a life sentence.  For me, it was an epiphany.  Yes, I took positive action and made dramatic life changes as a result of my DNA test results.  In essence, this is my “coming out” story.

I was one of the first people to order the new 23andMe test when it was first offered, mostly for the genealogy aspect, but as you know, it includes health traits and information.  When I received the results of that test a few years ago, in black and white, where I could not possibly ignore them, the reports indicated that I was at elevated risk for certain conditions.  Those conditions were certainly beginning to manifest themselves in my life.  I was on medication for two of them.  My weight, at the time, was certainly a contributing factor to the development of those conditions.  My sister had died near the age I am now as a result of those conditions.  She looked like me, was built like me, was heavy like me, and very probably carried those exact same genetic risk factors.  Our grandfather died of the same thing about the same age.  Our father had it too, but he died in a car accident – caused by a coronary episode, at age 61.  Seeing this, in black and white, and knowing my family history, I decided to do something to prevent that future, or at least to delay or mitigate it as much as possible.

I lost over 100 pounds and yes, for almost 5 years now, I’ve maintained that weight loss, well except for a pesky 10 or 15 pounds that I fight with regularly.  But still, the 100 pound loss is far more important than the 10-15 pounds I battle with.  I am off of all medication related to those and related conditions.  I’ve changed what and how I eat, and a benefit I really didn’t anticipate is how much better I feel.  You have no idea how much I hate these old pictures of me when I was heavy.  This was taken at National Geographic Headquarters in Washington DC, in 2005, at our DNA Conference.

Me Nat Geo 2005

This next photo is me at one of our Lost Colony archaeology digs about five years later wearing one of my favorite t-shirts that says “Well Behaved Women Seldom Make History.”  All of the genealogists should be laughing about now.  No one wants well-behaved women because you can’t find them in the records.  If my clothes look a bit large, that’s because they are, but that t-shirt was too small before the weight loss.  I could never have done the physical work on those digs, or survived the heat, before losing the weight and going from a size 22 to a 12 in the photo below.  These kinds of activities were all unforeseen benefits of the weight loss.  My sister’s untimely death was not wasted on me, but served as a warning bell, well, more of an unrelenting siren actually, when I saw those DNA results.

I also took my 23andMe results, at least some of them, the ones related to the conditions I was dealing with, to my physician.  I really had to think long and hard about this.  So now, let’s talk about the fear part of the equation.

Fear of genetic results falls pretty much into two categories.  We’ll call these the Ostriches and the Eagles.

My brother was an Ostrich.  Yep, he was, head right in the sand.  He had cancer, his wife had cancer, twice, his daughter, in her 30s, had cancer, yet their decision when offered free DNA testing was to decline – because they didn’t want to know.  Fear of the information itself, fear of knowing, perhaps spurred because of a sense of fate – nothing we can do about it so why know about it.  He also refused to discuss it, so I really can’t tell you why, and he died, of cancer, last year, so that opportunity is past.  Personally, I think knowing about a genetic proclivity would equate to more vigilant monitoring.  And knowing the proclivity didn’t exist would set one’s mind at ease.  I would think you would be a winner either way, but my thinking and his were obviously quite different.

The other group are the Eagles.  They are vigilant and acutely aware of the fact that health based discrimination does exist.  It has been worse in the past than it is now.  This is the reason I had to think long and hard about taking any of my results to my physician.  Once in your medical record, it’s permanent.

Today, GINA, the Genetic Information Nondiscrimination Act, goes a long way toward protecting people, especially in an employment situation, but it does not cover everything.

Anyone who has ever tried to obtain health care insurance individually or through a small business knows all too painfully about pre-existing condition exclusions.  Well, the good news is that ObamaCare, love it or hate it, levels that playing field for the “rest of us,” those who either were denied or had to make life and employment decisions based on whether or not they had insurance coverage through a group where discrimination related to pre-existing conditions didn’t exist.

The other good news is that you don’t have to take any of your DNA test results to your doctor.  It’s entirely up to you.  You can test anonymously, using an alias, if you’re really paranoid.  Your results through personal genetic testing are yours and for no one else to see unless you disclose them.

Lastly, let’s talk realistically about the types of insurance that still discriminate – which would be life insurance, extended care insurance, etc.  They are in the business of odds-making.  They are betting you will live and you are betting you will die sooner rather than later.  As you age, the odds shift, because let’s face it, eventually, you will die – and they will have to pay out.  Now the only way they can make money is if you pay more premiums during your life than they have to pay out in the end, or they make the premiums so expensive you stop paying, letting the policy lapse, before you die, so they never have to pay.  Of course, if they think the odds are stacked too far in your favor, they simply won’t insure you.  So, if you or your family members have Huntington’s Disease, you’re not likely to be able to get life insurance outside of a group policy, with or without a genetic test.  In fact, there is a questionnaire about your family history when you apply for individual life insurance.

I bought individual life insurance about 10 years ago.  They sent a nurse to the house to draw my blood.  They wanted chain of custody, to know the blood sample was mine, which is not the case with personal DNA testing.  I had to provide ID.  If the insurance company wanted to run a DNA test, prohibitively expensive then, but not in the next few years, they certainly could do so.  Let’s just say it plain and simple – everyone has pre-existing genetic proclivities to something – no one is immune.  These results are not generally black or white either, but expressed as a range.  For example, 4.2 European women out of 100 will develop Restless Leg Syndrome in their lifetime.  My risk is 5.2, so slightly elevated above the average.  I’m only “above average” in 5 areas, and below average in most.  And the insurance companies are still going to be in the odds-making business – they can’t deny everyone or they won’t have any business – and they will use this new tool as soon as it becomes economically viable.  There is no escaping it.

So yes, the Eagles are right to watch vigilantly – but for now – how much you share and with whom is entirely in your control, so you don’t need to be an Ostrich either.  There is a great deal of good that can come from personal genetic testing, in addition to genetic genealogy.  Knowledge is power.

So now, if you haven’t already, read this great article, The Gene Machine and Me!!!