John Y. Estes (1818-1895), Civil War Soldier, Walked to Texas, Twice, 52 Ancestors #64

John Y Estes

My research into John Y. Estes, whose photo we believe is shown above, started out years ago with a question.  That one is probably answered now, but every time we think we answer one question about him, another dozen take its place.

Let’s start from the beginning.  When I first saw John’s name, I immediately noticed the Y.  Two things occurred to me…first, that’s someone’s last name and second, that shouldn’t be too difficult to find.  Y is not like S, which would include something like Smith and takes up 10% of the alphabet.  Famous last words, or first thoughts, because assuredly, that second thought was NOT true.

Now don’t laugh, but one time I was at one of those fortune telling places.  The fortune teller asked me if I had any more questions.  I said yes, and asked her about John Y. Estes’s middle name.  She said something like Yarborough or maybe Yancy.  She wasn’t right about anything else either.

Nope, never let it be said that genealogists are a desperate group!

John Y. Estes was born on December 29, 1818, in Halifax County, Virginia to John R. Estes and his wife, Nancy Ann Moore.  Hmmmm, that middle initial R. might be someone’s last name….never mind….

We know that John R. Estes and his wife, Nancy Ann Moore, along with five if not six children, made the long wagon journey from Halifax County, Virginia to Claiborne County, TN sometime between 1818 and 1826, when John R. Estes had a land survey in Claiborne County.  The 1820 census doesn’t exist for Claiborne County and John appears to be gone from Halifax by then, so we’re out of luck knowing where John R. was in 1820.

In the 1830 census, John R. Estes was living in Claiborne County in the vicinity of Estes Holler, shown below.

Estes Holler 2

How do I know that?  Because these families have all become very familiar to me over my 30+ years of research.  John is living beside William Cunningham, who in 1871 signed as a character witness for John R. Estes.  And six houses away we find John Campbell, the grandfather of Ruthy Dodson, who likely raised her after her mother, Elizabeth Campbell, died.  Rutha Dodson was the future wife of John Y. Estes.  And next door to John Campbell lived Mercurious Cook, whose son’s widow John R. Estes would marry in another 40 years – but that is a story for a different day.

In the early 1830s, John R. Estes took his family to live in Grainger County for a short time.  Nancy Ann Moore’s two uncles, Rice and Mackness Moore, lived there, Rice being a Methodist minister.  John R. Estes’s daughter, Lucy, married in Grainger County in 1833.  By 1835, John was back in Claiborne County when Temperance married Adam Clouse, so they didn’t stay long in Grainger County.

For the most part, John Y. Estes grew up in or near Estes Holler, shown below as seen from the cemetery.  The Estes family living there is, of course, why it’s called Estes Holler today.

estes holler 5

By 1840, John Y. was probably courting the lovely Ruthy Dodson, likely at her grandfather’s house.  John Campbell had died in 1838, but his widow Jenny Dobkins Campbell didn’t die until between 1850 and 1860, so she would have still been living on the old home place, on Little Sycamore Road, below, when young John Y. Estes came to call.

Campbell house

We don’t find John R. Estes in the 1840 census, but by 1841, John R. Estes had to be living someplace in the vicinity because both his sons Jechonias and John Y. Estes married local gals.

On January 3, 1841, John Y. Estes married Ruthy Dodson, less than a week after his 22nd birthday.

John Y Estes Rutha Dodson marriage

Ruthy Dodson’s mother, Elizabeth Campbell, died before her own father, John Campbell, who died in 1838.  After John’s death, a guardian was appointed for Elizabeth’s children to act on behalf of their financial interests in his estate.

In the 1830 census, the John Campbell household has small children, so it’s very likely that the grandparents, John and Jenny Dobkins Campbell were raising Elizabeth Campbell’s children she had with her husband, Lazarus Dodson.

On September 5th, 1842, John Y. Estes signed a receipt for receiving part of Ruthy’s inheritance.  This seems to have been paid yearly, at least until the children reached the age of majority.

“John Y. Estes rect. dated 5th Sept. 1842, $54.35. Ditto rents for the year 1841, $1.50. Ditto order for what ballence may be in my hands as guardean, amt. $56.61.”

By 1850, we find John Y. Estes living in Estes Holler along with the rest of the Estes clan.  John is listed as a laborer, age 30, Ruthy as age 25 and Lazarus as age 2.

Given that John and Ruthy were married in 1841 and their oldest child in 1850 is only 2, this suggests that John and Ruthy had already buried several children.  If they had one child per year and the child died at or shortly after birth, they could have buried as many as six children in this time.  The Upper Estes cemetery, as well as the Venable Cemetery at the end of the road have many, many unmarked graves.  The Upper Estes Cemetery was within view of the John Y. Estes home place.

Upper Estes Cemetery

Furthermore, we know that John Y. Estes was living on this land, even though we find very few records of John Y. Estes in official county documents.

This land was originally granted to William Devenport and would eventually, in part, become the property of Rutha Estes, John Y.’s wife – but that wouldn’t happen for another 30 years.

William Devenport, April 17, 1850 – James McNeil trustee to William S. McVey, Districts 6 and 8, 475 acres, Buzzard’s Rock Knob – corner of grant to James M. Patterson, from Devenport’s spring, grant to Drewry Gibson, 50 acres #14072, line of Drewry Gibson, crossing Gibson’s branch, S with John Dobkins grant owned at present by Leander and Greenberry Cloud near N.S. McNeil’s line crossing Gibson’s branch on top of Middle Ridge, Planks fence of old Wier place, John Mason’s corner and line, Cunningham’s line, Devenport-Lanham’s corner, Weatherman’s spring, middle ridge – all of above contained in grant 16628 from the St. of Tennessee to William Devenport.

Second tract – 130 acres of land on the S. Side of Wallen’ ridge, corner of D. Gibson’s 50 acres tract #14072, Houston’s line, NW of Devenport’s line, Harkins corner, large rock on top of knob called Buzzard’s Rock, Harkins corner, Abel Lanham’s corner, Henderson’s line, 100 acre tract of WH Jennings, Bise’s corner, top of Wallen Ridge at Bise’s stake corner of Hardy tract, Henderson’s corner, the above contained in grant 27438 St. of Tn. to Devenport.

Also a 25 acre tract known as the Weatherman place.

1851 – William Devenport tax sale to William McVey – bid July 7, 1851 at courthouse, land in the 8th district, but due to a change in the lines now in the 6th district living near the lines of the 6th and 8th, sold for the taxes of 1845 and 1846, $16.77, 200 acres.

Tract 1 – S side Wallen Ridge near Little Sycamore adjacent lands of William Houston, Mordica Cunningham on the South, Samuel Harkins on the North, on NE Cunningham, William Houston’s, the land commonly known as the Weatherman place where William Devenport and John Estes now live.  Census records show that it is John Y. Estes, not John R. Estes, who lives beside William Devenport.

So, in 1851, William Devenport is losing his land and apparently, neither he nor John Estes can do anything about it.  John is not bidding on the land.  William S. McVey purchased this land and in 1852, William McVey also purchased a very large tract of land granted to William Estes, John’s brother, which John Y. Estes witnessed.

By 1876, this same land is being conveyed by Henry Sharp to W.H. Cunningham.  How do we know this is the same land where John Y. Estes lived?  Metes and bounds are included, it states that it was William Devenport’s and it says that is where David A. King lived when he died.  The Reverend David A. King, a Methodist minister, fought for the Union in the Civil War, died in 1873 and is buried in the Upper Estes Cemetery.  His daughter, Elizabeth, married the son of John Y. Estes, George Buchanan Estes, in 1878.  I wonder if the old Reverend rolled over in his grave to have his daughter marry the son of a Confederate.  Yes, the secret is out, John Y. Estes was a Confederate.

David King

1876, Mar 30 – Henry Sharp of Campbell Co., TN to W.H. Cunningham of Claiborne for $400, 2 tracts of land in Claiborne on the waters of Little Sycamore Creek on the South side of Wallen’s Ridge adj the land of William Houston, decd and constitute the farm on which David A. King lived at the time of his death, one part is an entry made by William Devenport and bounded as follows: Beginning at a hickory stump on a red bank in Houston’s line thence north 9 deg west with Hentins? Line 94 poles to the Buzzard Rock on the top of Wallen’s Ridge thence with the top of Wallen’s ridge 240 poles to a chestnut oak and when redused to a strait line is south 60 deg west 234 poles then south 75? Deg east on Houston’s line 34 poles to a stake in the other line of Houston’s then with the same north 70 deg east 93.75 poles to a double chestnut and gum on a spur at Houston’s corner thence with lines of Houston’s land south 390 deg east 43 poles to a maple at the branch then east 62 poles to a hickory stump then with lines of Houston’s land south 30 east 43 poles to a maple at a branch then east 62 poles to a hickory stump then north 62 poles to a large white oak corner then east 9 poles to the beginning containing 90 acres more or less.

This land would eventually be owned by Rutha Estes, the wife of John Y. Estes.

The second parcel bounded by…Houston’s line, Devenport’s grant line, 25 acres.  Witness JW Bois, WW Greer.

This was a very, very indirect “round the mountain” way to track John Y. Estes, but it worked.  However, we’re getting ahead of ourselves, so let’s go back before the Civil War.

On March 8, 1856, the court records show that John Y. Estes had an account in the estate of Thomas Baker – in other words, he owed Thomas money.

In the 1860 census, John and Rutha have four more children, although with a gap of 4 years between Lazarus and Elizabeth, it looks like they lost at least one more child.

John Y Estes 1860

Interestingly, John Y. Estes is a shoemaker.  John is shown as owning no land, but he does have a personal estate of $173, which isn’t exactly trivial.

I think in 1860 that John Y. Estes is not living in Estes Holler.  He is living beside carpenters, stage drivers, a wagon maker, a wagoner and a carriage maker who was quite wealthy.  That sounds suspiciously like he was living in town which would have been Tazewell.

The Civil War

Shortly after 1860, life would change dramatically for the Estes family.  Tensions were escalating towards the Civil War, and in late 1860 and 1861 they erupted when initially 7, then 11 states seceded from the Union, forming the Confederacy.  Tennessee did secede, but not initially.  Claiborne County was badly torn between the North and South, the blue and grey – and families were torn apart as different brothers and sons joined opposite sides.  Loyalties were divided and family members fought against one another.

In 1862, at the height of the Civil War, Confederate troops occupied Tazewell as part of the greater struggle for the strategic Cumberland Gap. When the Confederates evacuated the town in November of that year, a fire followed, destroying much of Tazewell.  In essence, anyone who could leave, did, because Tazewell was a target of continuous raids for food and supplies.

We know by 1870, positively, from the census, that the John Y. Estes family is back in Estes Holler.  We also know from family stories about the Civil War that they spent the majority of the War in Estes Holler.

But what we didn’t know was something far, far more important.

Aunt Margaret told me that while the war was over, it was really never resolved in Claiborne County.  The Crazy Aunts used to tell stories of the men in Claiborne County wearing their Civil War uniforms once again, on Memorial Day, and heading for town to “refight” the war, as long as there were any veterans left to do so.  I suspect that most of the fighting was verbal and in the form of relived memories, but assuredly, not all, especially if the region’s notorious moonshine was involved….and you know it was!

The aunts, Margaret and Minnie, lived in Estes Holler as children, and while I knew none of my direct Estes ancestors had served in the Civil War, obviously some people from that area had.  Just a couple years ago, I decided to search www.fold3.com for Estes men in Claiborne County, TN to see if any of them had fought in the Civil War.  Was I ever in for the surprise of my life.

My great-great-grandfather, John Y. Estes served in the Civil War – but for which side?

John Y Estes reference slip

Look what that says.  Confederate.

John’s service records are confusing, to say the least.  There are documents in his file from both sides, it seems.  How can that be?  Let’s start with the basics.

The Civil War began in earnest in April 1861 when Confederate forces bombarded the Union-controlled Fort Sumter, SC in Charleston Harbor.

Many people who lived in Claiborne County fought for the North and joined the Union troops, but not all.  The Civil War was a source of dissension within and between families in Claiborne County.  Few people there held slaves, so slavery was not a driving force.  By searching for his unit, I confirmed that John Y. Estes had joined the Confederate Army, but I was stunned.  All of my other family members in my various lines fought for the Union – including the families from that area.

The history of Carter’s Tennessee Cavalry Regiment F, formed in Claiborne County, shows that it was formed on August 10, 1862 by Captain R. Frank Fulkerson, who lived near John Y. Estes in the 1860 census.  There is no existing muster roll, although I recreated one as best I could from the various men’s service records in his unit.  Reading John’s record, along with the other men’s records in his unit and regimental and other histories, is also how I reconstructed where that unit was, when, and what they were doing.

We don’t know when John enlisted, although it was likely when the unit was formed, nor do we know if he ever applied for a pension.  John would have been 44 years old in 1862, so no spring chicken.  His daughter, Nancy Jane, had been born in November of 1861.  He had a wife and 6 children at home ranging in age from Lazarus born in 1848, so 13, to newborn.  His wife probably wanted to kill him for enlisting and save the Union Forces the trouble.
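For anyone who wants to check the age arithmetic, here is a minimal Python sketch using the birth date given earlier (December 29, 1818) and the regiment’s formation date of August 10, 1862.  The helper name `age_on` is my own, purely for illustration.  By this reckoning John was 43 on the day the unit formed and turned 44 that December.

```python
from datetime import date

def age_on(birth: date, when: date) -> int:
    """Completed years of age on a given date."""
    years = when.year - birth.year
    # Subtract one if the birthday has not yet come around that year.
    if (when.month, when.day) < (birth.month, birth.day):
        years -= 1
    return years

john_born = date(1818, 12, 29)     # birth date from the article
unit_formed = date(1862, 8, 10)    # regiment formation date from the article

print(age_on(john_born, unit_formed))          # age when the unit formed
print(age_on(john_born, date(1862, 12, 29)))   # age on his birthday that year
```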

What we do know is that on March 20, 1865, in Louisville, KY, John Y. Estes signed the following allegiance document.  I later discovered that he had been captured and this was one way men obtained their freedom. This document tells us that he had dark skin, dark hair and dark eyes and was 5 feet 7 inches tall, just slightly taller than me. Information I didn’t have before.  If you look closely at John’s picture at the beginning of this article, he may have been mixed-race.

John Y. Estes allegiance

And look, we also have his signature.

So, how did John Y. Estes get to Louisville, KY in 1865 from Claiborne County?  To answer that question, I tracked the activities of his unit.  That was much easier said than done.

Here’s what we know about the activities of Carter’s Tennessee Cavalry Regiment.

Prior to the organization of the regiment, the battalion had been operating in the neighborhood of Cumberland Gap and Big Creek Gaps, at present day LaFollette, TN, about 33 miles distant from each other, along the line of the railroad.

When the regiment was organized it was assigned to Brigadier General John Pegram’s Cavalry Brigade in Lieutenant General E. Kirby Smith’s Department. This brigade was composed of Howard’s Alabama Regiment, 2nd (Ashby’s), 4th (Starnes’), I. E. Carter’s Tennessee Cavalry Regiment, and Marshall’s Battery.

Prior to the Battle of Murfreesboro, on December 29, 1862, Carter’s Regiment joined Brigadier General Joseph Wheeler’s Brigade, and participated in his raid around the Federal Army from Jefferson Springs to LaVergne, to Nolensville, to Murfreesboro, TN. The unit was engaged on December 31 along the Murfreesboro Pike.

Following this battle, the regiment returned to Pegram’s Brigade, in the Department of East Tennessee under Brigadier General D. S. Donelson.

With Pegram’s Brigade, the regiment took part in operations in Lincoln, Boyle and Garrard Counties of Kentucky, and was engaged March 30, 1863 at the junction of the Stanford and Crab Orchard Roads where it was under the command of Colonel Scott, of the 1st Louisiana Regiment. General Pegram’s comment on this operation is interesting: “For Colonel Scott’s operations, I refer you to the accompanying report. Touching this curious document I have only to say that I cannot but admire the ingenuity with which Colonel Scott has attempted to account for disobedience of orders and dilatoriness of action which it is my sincere belief lost us the fight.” Colonel Carter reported five officers and 32 men as casualties in this operation.

It was not a good day to be a Confederate soldier.  John saw his comrades die. It probably wasn’t the first time, and it certainly wouldn’t be the last.

On April 25, 1863, Colonel J. I. Morrison was reported in command of the brigade, now listed as composed of 1st Georgia, 1st and 2nd Tennessee Regiments, 12th and 16th Cavalry Battalions, and Huwald’s Battery. The brigade was at Albany, Kentucky on May 1; at Travisville, Fentress County, Tennessee on May 2.

On July 23, the Chief of Staff, at Knoxville, ordered Colonel Scott, then commanding the brigade, to send 300 horses of 1st (Carter’s) Regiment to Loudon, Tennessee.

On July 31, Pegram’s Brigade, consisting of 1st and 6th Georgia Regiments, 7th North Carolina Battalion, 1st Tennessee Regiment, Rucker’s Legion, and Huwald’s Battery was reported at Ebenezer.

From December of 1862 to August of 1863, John Y. Estes’s unit covered over 1000 miles and marched from East Tennessee, near the Cumberland Gap to central Tennessee to Kentucky, back to central Tennessee and then back to the Cumberland Gap.

John Y Estes civil war map

On August 15, Carter’s Regiment was reported as operating near Clinton and participated in the fighting around Cumberland Gap.  This fighting occurred on the land previously owned by John Y. Estes’s wife’s father, Lazarus Dodson.  The photo below is on Tipprell Road, on Lazarus’s land, looking North towards Cumberland Gap.

dodson land tipprell road

This is where the Revolutionary War marker for Lazarus Dodson’s father, also named Lazarus Dodson, stands today, in the Cottrell Cemetery, below, now on land owned by Lincoln Memorial University.  This photo was taken standing in the cemetery, looking North towards the mountains and Cumberland Gap.

Cottrell cem looking north

This map shows the LMU complex, the location of the cemetery with the upper red arrow and the location of the Dodson homestead with the lower arrow.  You can see the now abandoned road that used to connect the homestead with the cemetery.

Dodson homestead Cottrell Cem

The map below shows the larger area.  It’s probably a mile between the Dodson homestead and the LMU campus across the back way and maybe two and a half miles to Cumberland Gap, up Tipprell Road from the Dodson home.

Cumberland Gap Dodson homestead

This Civil War map shows where the troops camped, at Camp Cottrell, at Butcher Springs.  Lazarus Dodson had sold this land in 1861 to David Cottrell whose residence is marked on the map.  That was the old Lazarus Dodson homestead.  The main road, now called Tipprell Road, was called Gap Creek Road at the time.  It connects the valley, passes Butcher Springs and continues up to Cumberland Gap along the creek and now the railroad as well.  The road heading to the right above the Cottrell homestead used to go up to the cemetery, but is no longer a road today.

camp cottrell civil war map

This photo shows that area today.  It’s flat, so perfect for camping.  Butcher Springs is to the right in this photo, below, just out of sight.

DSCF9016

This is me standing in the Cottrell Cemetery.

Me in Cottrell Cemetery

Butcher Springs would be behind me in the valley to the right.  On the Civil War map, Patterson’s Smith shop would be the cluster of buildings where you can see the church, to the left in the picture, in the distance, across the road.

Cumberland Gap was captured by the Federal troops on September 9, 1863, but the Confederate regiment had escaped up the valley before the surrender, and on September 11, Colonel Carter was reported in command of the brigade near Lee Courthouse.  Lee Courthouse is present day Jonesville, VA, about 35 miles from Cumberland Gap.  I’ve added Estes Holler here for context.

John Y Estes Cumberland Gap Lee Courthouse

On September 18, Carter’s Regiment was driven from the ford above Kingsport, TN after a severe fight.  This fight was only 7 days later and Kingsport was another 45 miles distant over rough, mountainous terrain.

John Y Estes Jonesville Kingsport

Somewhere about this time, the regiment was assigned to Brigadier General John S. Williams’ Cavalry Brigade, composed of the 16th Georgia Battalion, 4th Kentucky Regiment, 10th Kentucky Battalion, May’s Kentucky Regiment, 1st Tennessee and 64th Virginia Regiments, which on October 31, 1863 was reported at Saltville, Virginia, 60 miles northeast of Kingsport, TN.

The unit received orders to proceed to Dalton, GA, but despite these orders, Carter’s Regiment was reported near Rogersville on November 1, in Williams’ Brigade, with Colonel H. L. Gutner commanding.

Rogersville was back, through Kingsport, about 90 miles “down the valley,” so to speak.

John Y Estes Rogersville Saltville

In the meantime, Captain Van Dyke’s Company “C” had returned from Mississippi, and on November 24, 1863 was at Charleston, Tennessee with Colonel John C. Carter’s 38th Tennessee Infantry Regiment. Charleston was 145 miles from Rogersville.

John Y Estes Rogersville Charleston

Colonel Carter highly commended Captain Van Dyke and his 44 men for the part they played in helping his forces to evacuate Charleston without being captured.  On April 16, 1864, the regiment was transferred to Vaughn’s Brigade, of Brigadier General J. C. Vaughn’s Division, and reported 248 men present. It remained in this brigade until the end of the war.

By May of 1864, the majority of the fighting had shifted to Virginia.  Between mid-April and May, John Y. Estes’s unit traveled almost 400 miles, from Charleston, TN to the Lynchburg, VA region.

John Y Estes Charleton Lynchburg
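To get a feel for how much ground this was, the leg distances quoted above for September 1863 through May 1864 can simply be added up.  This is only a rough sketch: the mileages are the round figures from this article (with “almost 400” treated as 400), not surveyed routes, and it assumes the unit actually traveled each leg in this order.

```python
# Leg distances in miles as quoted in the article (round figures, not measured routes).
legs = [
    ("Cumberland Gap", "Lee Courthouse (Jonesville, VA)", 35),
    ("Lee Courthouse", "Kingsport, TN", 45),
    ("Kingsport", "Saltville, VA", 60),
    ("Saltville", "Rogersville, TN", 90),
    ("Rogersville", "Charleston, TN", 145),
    ("Charleston", "Lynchburg, VA region", 400),  # "almost 400" in the text
]

running_total = 0
for start, end, miles in legs:
    running_total += miles
    print(f"{start} -> {end}: {miles} miles (running total {running_total})")

total_miles = sum(miles for _, _, miles in legs)
print(f"Approximate total: {total_miles} miles")
```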

The Civil War was becoming a series of constant battles, referred to as the Campaign in the Valley of Virginia, which lasted from May to July of 1864, as shown on this map by Hal Jespersen.

Shenandoah Valley Campaign 1864

As part of Vaughn’s Brigade, the regiment moved into Virginia in early 1864, fought at the Battle of Piedmont, New Hope Church, and in the subsequent campaign in the Valley of Virginia under General Early.

Germanna Ford

This drawing from Harper’s Weekly shows the troops crossing at Germanna Ford during the Battle of New Hope Church, also called the Mine Run Campaign.

Mine-Run

This drawing shows the “Army of the Potomac at Mine-Run, General Warren’s Troops attacking.”

Battle of Piedmont

This is the location, today, of the Battle of Piedmont.  This battlefield looked very different when John Y. Estes stood here on June 5th, 1864.  There were men, horses and blood all over this battlefield.  After severe fighting, the Confederates lost, badly.

It was at this point, nearing the end of this chapter of the war, that John Y. Estes entered the hospital on June 12th.  But, that doesn’t mean he was done…the worst, perhaps, was yet to follow.  What happened next?  There has to be more.

Hmmm, let’s check the 1890 Civil War veterans census.  Nope, nothing there.

Well, let’s look under Eastice.  His folder says that name was used as well.

John Y Estes private

Well, Glory Be, look what we’ve found.  His index packet, indeed, under Eastice.

John Y Estes absent

This regimental return of October 1864 says that he was an absent enlisted man accounted for, “Without Cane Valley of Va. Aug. 28.”  That’s odd phrasing.  Does it mean “without leave?”  But it says he is accounted for?

John Y Estes deserter

Uh-oh, this doesn’t look good.  Now he’s on the list of deserters as of March 18, 1865.  It says he was released north of the Ohio River.  That goes along with the “Oath of Allegiance” document that he signed on March the 20th.

John Y Estes POW

Wikipedia says that during the Civil War, prisoners of war were often released upon taking an “oath of allegiance.”  General Sherman was known to ship people to Louisville and those who signed were freed, north of the Ohio, and those who didn’t remained in prison.

This documents John Y’s oath of allegiance, and the faint writing says that his name also appears as John Y. Estus.  How many ways can you spell Estes?  I checked and there are no additional records under Estus – at least none that are indexed yet.

John Y Estes transfer

This document says that he was a Prisoner of War, but this kind of Prisoner of War was a Rebel Deserter.  He was apparently “caught” on March 6th, 1865, sent to Chattanooga, then to Louisville apparently in late March where he was taken across the Ohio River.  I’m thinking John Y. considered this a very bad month.

John Y Estes desertion info

This page gives us a little more info.  Apparently he deserted at Staunton, Va. on June 30 of 1864, just days after his hospitalization and release.  Where was he between June 30, 1864 and March 6 of 1865?  And where was he captured?  The first document says that in October of 1864 he was accounted for, which I would interpret to mean that they knew where he was and that, whatever the situation, it was OK.  Nothing confusing about these records….

John Y Estes medical

Well, here is at least part of the answer.  On June the 12th, 1864 he was hospitalized and had a partial anchylosis of his knee.  On June the 19th he was sent to a convalescent camp.  The 30th of the same month, he was reported as having deserted at Staunton.
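Laying the record dates end to end makes the hole in the story easier to see.  The short Python sketch below uses only dates taken from John’s service file as described in this article, and the event labels are my own shorthand.  It counts the days between each consecutive pair of records, including the roughly eight months between the reported desertion and the capture.

```python
from datetime import date

# Dates as given in John's service records (from the article).
events = [
    ("admitted to hospital", date(1864, 6, 12)),
    ("sent to convalescent camp", date(1864, 6, 19)),
    ("reported deserted at Staunton", date(1864, 6, 30)),
    ("captured", date(1865, 3, 6)),
    ("signed oath at Louisville", date(1865, 3, 20)),
]

def gaps(evts):
    """Days elapsed between each consecutive pair of events."""
    return [(a[0], b[0], (b[1] - a[1]).days) for a, b in zip(evts, evts[1:])]

for start, end, days in gaps(events):
    print(f"{start} -> {end}: {days} days")
```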

What they don’t say here is that Staunton was devastated by the Union in June of 1864 – everything was burned, including shops, factories and mills, and miles of railroad tracks were destroyed.  If that is where he was convalescing, it’s no wonder he deserted, or simply left.

He was accounted for in October, but sometime between then and March 1865, he apparently deserted for real, or he already had in October.  I wonder if he simply went home, or attempted to go home.  Where was he when he was caught, or deserted?  If you are a Confederate deserter, and the Union forces “catch” you, do they still hold you prisoner?  Maybe the Confederates only thought he deserted and he was in fact captured?  But the Union paperwork indicates he was listed as a Rebel deserter.  So many questions.

Ankylosis or anchylosis is a stiffness of a joint due to abnormal adhesion and rigidity of the bones of the joint, which may be the result of injury or disease, sometimes resulting from malnutrition. The rigidity may be complete or partial and may be due to inflammation of the tendons or muscular structures outside the joint or of the tissues of the joint itself.  Sometimes the bones fuse together.  This disease is considered a severe functional limitation.

So here is what we know about John Y. Estes and the Civil War.  He probably joined when the regiment was formed on August 10, 1862, although he may have been participating in the unofficial unit since 1861.  The Fulkersons in Tazewell, his near neighbors, were instrumental in raising Confederate volunteers in Claiborne County.  John Y. Estes fought and served until he was either injured or a previous condition became so serious in 1864 that he could not function, although he participated in some of the worst fighting and most brutal battles of the war.  John is reported to have been admitted to the hospital in Charlottesville, VA on June the 12th, transferred to a convalescent camp on June 19th, and deserted at Staunton, Va. on June the 30th.  In October 1864, records say he was accounted for, but absent.  By March 6th of 1865, he was in prison, captured as a deserter, transferred to Chattanooga, signed the allegiance oath and by the end of March, had been taken to Louisville before being deposited on the north side of the Ohio River, having agreed to stay there for the duration of the war.

He didn’t have long to wait.  General Lee surrendered at the Battle of Appomattox Court House on April 9th, 1865.  But then John probably had to walk home on that injured leg.

That leg apparently didn’t slow him down much.  John Y. Estes eventually walked to Texas, not once, but twice, according to the family, which means he walked back to Tennessee once too.  The family said one leg was shorter than the other and he walked with a cane or walking stick.  It’s about 950 miles from Estes Holler in Claiborne County, Tennessee to Montague County, Texas.  I surely want to know why he walked back from Texas to Tennessee.  After making the initial journey, on foot, taking months, what could be that important in Tennessee?  Was he hoping to convince his wife to relocate with him?  Even then, land and other legal transactions could be handled from afar, so it must have been an intensely personal reason.  Maybe he only decided to return to Texas, forever, after he had returned to Tennessee.
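Just to put numbers on that, here is a back-of-the-envelope Python sketch.  The 950 miles and the three one-way trips (out, back, and out again) come from the family story above.  The daily walking paces are purely my assumptions for illustration, since no record tells us how fast a man with a bad knee and a cane actually traveled.

```python
import math

MILES_ONE_WAY = 950   # Estes Holler to Montague County, per the article
TRIPS = 3             # to Texas, back to Tennessee, to Texas again

def days_on_foot(miles: float, miles_per_day: float) -> int:
    """Whole days needed to cover a distance at a steady daily pace."""
    return math.ceil(miles / miles_per_day)

total_miles = MILES_ONE_WAY * TRIPS
print(f"Total walked: {total_miles} miles")

# Hypothetical paces in miles per day; assumptions, not from any record.
for pace in (10, 15, 20):
    days = days_on_foot(MILES_ONE_WAY, pace)
    print(f"At {pace} miles a day, one trip takes about {days} days")
```

Even at the faster assumed paces, each one-way trip works out to well over a month on the road, which fits the family’s memory of a journey taking months.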

I have to wonder how John’s Civil War allegiance and subsequent desertion, if that is actually what it was, affected John himself and the way that the people in Claiborne County viewed him.  He went back home and lived for several years.  His neighbor in Estes Holler, David King, fought for the North.  So did his sisters’ husbands and children.  I’m betting holidays were tough and there was no small talk at the table.  Maybe there were no family gatherings because of these polarized allegiances.  They would have been extremely awkward and difficult.  Maybe John was quietly ostracized.  Maybe that’s part of why he eventually left for Texas.

On October 5, 1865, just six months after being released on the north side of the Ohio River, John Y. Estes did a very unusual thing.  He deeded his property, mostly kitchen items and livestock, to his son Lazarus who was about 17 years old and lived in the family home.

Transcribed from book Y, pages 286 and 287, Claiborne County, Tennessee, by Roberta Estes.

Deed of Gift From John Eastis to Lazarus Eastis :

State of Tennessee, Claiborne County. Personally appeared before me J. I. Hollingsworth, clerk of the county court of the said county, J. R. Eastis and Sallie Bartlett, with whom I am personally aquainted, and after being duly sworn depose and say that they heard John Y. Eastis acknowledge the written deed of conveyance, for the purpose therein contained upon the day it being dated. Given under my hand at office in Taswell this 9th day of October, 1865. J. I. Hollingsworth, clerk. Know all men by these presents that I, John Eastis of the County of Claiborne, State of Tennessee in consideration of the natural love and affection which I feel for, my son, Lazarus and also for divers good cause and consideration, I the said John Eastis, hereunto moving, have given, granted and confirmed by these presents, do give, grant and confirm unto said Lazarus Eastis all and singularly, the six head of sheep, one horse, fourteen head of hogs, one cow and calf, two yearlings, the crop of corn that is on hand, and all the fodder, and all the household and kitchen furniture, to have and to hold and enjoy the same to the only proper use, benefit and behoof of the said Lazarus Eastis, his heirs and assigns, forever and I the said John Eastis for myself and my heirs, executors, and administrators all and singular the said goods unto the said Lazarus Eastis, his heirs and assigns, against myself and against all and every person, or persons, whatever shall and will warrant and forever defend by these presents in witness whereof, I have hereunto set my hand and seal this 5th day of October 1865.  John Y. Eastis.

ATTEST: John R. Eastis, Sallie Bartlett. I certify this deed of gift was filed in my office, October 9, 1865 at 12:00 and registered the 10th day of the same month. E. Goin, register for Claiborne County. [ stamped on page 58 ].

John R. Estes, the father of John Y. Estes, would have been close to 80 years old at that time.

Is this somehow in conjunction with or a result of the Civil War?  Did it take him that long to find his way back to Claiborne County?  Was he angry with his wife?  Lazarus was only a teenager and didn’t live in his own home, and wouldn’t for another 18 months.

The verbiage in this transaction, “hereunto moving” does not mean that John was literally moving, but refers to what motivated him or moved him to make this transaction.  So, in this context, love and affection for his son “moved” John to convey this property.  Of course, this begs the question, “what about your wife?”  Rutha would be the person to use all of that kitchen gear to prepare meals for the entire family.  What about Rutha?

In the 1870 census, John is shown with his wife and family, with another baby, Rutha, named after his wife, born in 1867. John and his wife, Ruthy Dodson, would have one more child, John Ragan (or Reagan or Regan) Estes, born in March of 1871.

We know that in 1879, John Y. Estes was in Claiborne County, but whether he was “back” from Texas or whether he had not yet left, we don’t know.  On June 20, 1879, John Y. Estes signs an agreement granting James Bolton and William Parks permission to make a road across his land in order to enable Bolton and Parks to have access to their own land that they had just purchased from Lazarus Estes, John Y’s son.  This is the last document that John Y. signs in Tennessee.  And actually, it’s the only deed, ever.

Deed records show no evidence of John Y. Estes ever owning land or a conveyance to or from John Y. Estes.  My suspicion is that John was buying this land “on time” and when he failed to pay, the transaction was simply null and void and the deed never filed.  It’s still odd that he would sign to grant access on land he did not officially own.  This is very likely the same land that Rutha would eventually own in her own name.  Sometimes truth is stranger than fiction.

We know that by June of 1880 when the census was taken, John Y. Estes is living in Texas and his wife, Rutha, is shown in Claiborne County as divorced, although no divorce papers have been found.  Maybe divorce was less formal then.  Given the distance involved, about 900 miles, and given that John could probably not walk more than 8 or 10 miles a day, the walk to Texas likely took someplace between 95 and 120 days, or 3 to 4 months, if he walked consistently every day and didn’t hitch rides.  So John likely left Claiborne County not long after the signing of the 1879 deed.  In fact, that might have been the last bit of business he took care of before departing.
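The walking estimate is simple division, and a rough sketch shows how sensitive it is to the assumptions.  The 900-mile distance and the pace of 8 to 10 miles a day come from the text above; the function name, the 30-day month and the 950-mile variant are illustrative assumptions only.  Plain division of 900 miles gives roughly 90 to 113 days, and a somewhat longer route pushes the range toward the 95 to 120 days quoted, so the "3 to 4 months" conclusion holds either way.

```python
def walking_days(distance_miles: float, miles_per_day: float) -> float:
    """Days needed to cover a distance at a steady daily pace, with no rest days."""
    if miles_per_day <= 0:
        raise ValueError("miles_per_day must be positive")
    return distance_miles / miles_per_day


DAYS_PER_MONTH = 30  # rough conversion, assumed for illustration

# 900 miles is the figure from the text; 950 is a hypothetical longer route.
for distance in (900, 950):
    fastest = walking_days(distance, 10)  # 10 miles a day
    slowest = walking_days(distance, 8)   # 8 miles a day
    print(
        f"{distance} miles: {fastest:.0f} to {slowest:.0f} days, "
        f"about {fastest / DAYS_PER_MONTH:.1f} to "
        f"{slowest / DAYS_PER_MONTH:.1f} months"
    )
```

Any rest days, bad weather or detours would only lengthen these figures.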

The family in Texas tells the story that John Y. was wounded in the leg as a young man, although they don’t say how, and that one leg was shorter than the other.  He walked with a stick.  It causes me to wonder if the injury was truly when he was a child or if it was a result of his time in the Civil War, or maybe some of each.  It’s a wonder they would have accepted him as a soldier if he was disabled and his military battle history certainly doesn’t suggest a disability.  Maybe they were desperate or maybe the old injury got much worse during his military service – or maybe the injury occurred during one of the Civil War battles.  John was hospitalized and I find it difficult to believe he would have been hospitalized for an old injury.

During John’s absence, Claiborne County was not immune to the effects of the war.  In fact, they were right in the middle of the war, time and time again, and without a man in the household, Rutha and the family were even more vulnerable.

During the Civil War, soldiers from both sides came through Estes Holler and took everything they could find: food, animals, anything of value. They didn’t hurt anyone that we know about, but the people hid as best they could. Adults and children both were frightened, as renegade troops were very dangerous.  Elizabeth Estes, born in 1851, was the second oldest (living) child of John Y. Estes and Rutha Dodson.  After the soldiers took all the family had, the 4 smaller children were hungry and crying. The baby had no milk. Elizabeth was angry, not only at what they had done, but the way they had been humiliated. She was a strong and determined young woman, about age 14 or 15, and she knew the soldiers were camping up on the hillside. She snuck into the camp of the soldiers that night, past the sentries, and stole their milk cow back. She took the cow’s bell off and the cow just followed her home. I don’t know if it’s true or not, but another story adds that she went back the second night and took their one horse back too. That one horse was all the family had to plow and earn their living.

Today, not one family member knew that John Y. Estes had served in the Civil War, not even the Crazy Aunts.  Given the way his service ended, it’s probably not something he talked about.  He would have been considered a traitor by both sides.  He didn’t claim his service on the 1890 veterans census either.  It seems a shame to have served for most of the war, in many battles, and survived, only to have had something go wrong in the end that seems to be medically related.  The term “deserter” is so harsh, and while I’m sure it technically applies, I have to wonder at the circumstances.  During the Revolutionary War, men “deserted” regularly to go home and tend the fields for a bit, showing back up a month or two later.  No one seemed to think much of it then.  That’s very likely what happened to John when he supposedly deserted in June of 1864, right after his injury.  He probably just left and went home.

I’m sure there is more to this story, much more, and we’ll never know those missing pieces.  And it’s a chapter, a very important chapter in the life of John Y. Estes and who he was.  It’s very ironic that none of his descendants alive today knew about his Civil War service.

The Walk to Texas

Initially, I had no idea John Y. Estes ever left Claiborne County.

When I first visited Claiborne County, I did what all genealogists do – I went to the library.  I had called the library and the librarians seemed friendly enough, and they told me they had these wonderful things called “vertical files.”  I didn’t know what that was, so the nice lady sighed and said, “family files.”  Now, that I understood.

The first day I arrived in town, I went straight to the library.  I looked through the books and the family histories that had been contributed.  Most of those were for the “upstanding families” whose members had been judges and public officials.  That would not be my family.  In fact, there was very little for my family.  I was sorely disappointed.  Those promising vertical files either held little or there were none for my surnames.

I had packed up and was leaving, walking past the shelves that held so much disappointment, when one of the files literally fell off the shelf and about three feet onto the floor.  I was no place close to it, so it was prepared to fall with no help from a human, but the librarians looked up at me, and then down at the file on the floor, with great disdain and disgust.  They, obviously, felt I was careless and had knocked the file onto the floor.

I had no problem picking the file up, but I wished they hadn’t been so put out with me.  The file hit sideways and all of the papers fanned across the floor.  Most of them weren’t stapled together, so I was trying to make sure that I put them back in the file in the order that they had come out, without mixing things up.  I have no idea the surname on the file.  I had already checked all of mine.  But as I was gathering those papers back into the file, a familiar name crossed my vision, Vannoy, then another, and then Estes.  I stopped and actually looked at the papers in the file.

I was holding a story about John Y. Estes, written by a Vannoy who had moved to Texas.  I put my bag and purse down, and sat down – on the floor – in the aisle way – oblivious to the librarians and their stares, now glares.  I read all three pages of the story and sat in stunned disbelief.  This had to be the wrong man. It was in the wrong family file.  Otherwise, someone would have told me….wouldn’t they?

My family didn’t go to Texas.  Did they?

This story says John Y. Estes walked to Texas, not once, but twice.  This man injured his leg somehow as a child and walked with a limp, one leg being shorter than the other. He walked with a cane or a stick, and still, he walked to Texas, twice, and back to Tennessee once.  This man had tenacity.  Of course, when I was reading this, I didn’t realize he had also fought through the Civil War with this lifelong challenge. I wouldn’t know that piece of the puzzle for another 30 years. I hesitate to call it a disability, because John Y. apparently didn’t treat it as such.  In fact, it just might have saved his life in the Civil War.

Fannie Ann Estes, John’s grand-daughter, said that John Y. brought a skin cancer medicine from Tennessee and sold it in Texas.  He traveled throughout north Texas selling his remedy and established a relationship with William Boren, a merchant who sold goods on both sides of the Red River throughout the Red River Valley.  This was also the location where the Chisholm Trail crossed from Texas into Oklahoma, so comparatively speaking, it received a lot of traffic.

So John Y. Estes was either a snake-oil salesman or a genius on top of being a shoemaker, according to the census, a Civil War veteran and a former Prisoner of War.  This man was certainly full of surprises.  What a great plot for a book!

His grandchildren said that they remember him, as an old man, being short and fat.  Hardly a fitting legacy.  Thankfully, one person remembered more and wrote it down.

To the onlooker, it appears that John Y. Estes basically left his family in Claiborne County, TN and absconded to Texas.  But looking at what happened next, his children apparently did not hold a grudge against him for leaving their mother….in fact, John Y. Estes seemed to be leading the way more than abandoning the family.

It’s clear from Rutha’s 1880 census designation as divorced that she viewed the relationship as over.  She never intended to leave Claiborne County, nor did she.  But that didn’t stop her relatives from going to Texas – and they all settled together, including her husband.  Many are buried in the same cemetery.

William Campbell, Ruthy’s uncle, and his family were in Texas by 1870. Barney J. Jennings married Emily Estes, daughter of Jechonias Estes, and they went to Montague Co., TX, as well.

Many of John Y’s children, in fact all of them except Lazarus, eventually moved to Texas, including brave Elizabeth who married William George Vannoy.  She left with George Buchanan Estes and Elizabeth King in 1893, in a wagon train.

Children

The following children were born to John Y. and Ruthy Dodson Estes:

  • Lazarus Estes, born in May 1848 in Claiborne Co., died in July of 1918 in Claiborne Co., married Elizabeth Ann Vannoy.  Both buried in the Pleasant View Cemetery.
  • Elizabeth Ann Estes, born July 11, 1851 in Claiborne Co., died July 7, 1946 at Nocona, Montague Co., Texas.  On September 11, 1870, she married William George Vannoy, brother to Lazarus’s wife and son of Joel Vannoy and Phebe Crumley.  They settled in Belcherville, TX in 1893 and her husband was buried in the Boren Cemetery in Nocona on Sept. 12, 1895, only seven days before her father died and was buried in the same cemetery.  I wonder what killed both men.  This must have been a devastating week for Elizabeth.  She spent most of her life in Texas as a widow – more than 50 years.

Elizabeth Estes Vannoy

Elizabeth Estes Vannoy’s 95th birthday. She liked to sit on an old seat out under a tree.  Elizabeth is buried in the Nocona Cemetery, not with her husband.

Elizabeth Estes Vannoy stone

  • Margaret Melvina Estes, born July 19, 1854 in Claiborne Co., died April 7, 1888 in Claiborne Co., buried in Pleasant View Cemetery.  Never married and no children.
  • George Buchanan Estes, born December 17, 1855 in Claiborne Co., died July 1, 1948 at Nocona, Texas, buried at Temple, Cotton Co., Oklahoma. In 1878 he married Elizabeth King, daughter of David King, in Claiborne Co. She died in 1920 and is buried at Temple, Oklahoma.

George Buchanan Estes and Wanda Hibdon

George Buchanan Estes and granddaughter Wanda Hibdon Russell in 1945.

  • Martha Geneva J. Estes, born October 6, 1859 in Claiborne Co., died April 9, 1888, buried in Cook Cemetery on Estes Road. She married Thomas Daniel Ausban in Claiborne Co. April 17, 1884.  It’s not believed that she had any surviving children.
  • Nancy J. Estes, born November 1861 in Claiborne Co., died at Terral, Jefferson County, Oklahoma in 1951, married a Montgomery.  Buried in the Terral cemetery.  No children.

Nancy Jane Estes Montgomery

  • Rutha Estes, born January 7, 1868 in Claiborne Co., died at Terral, Jefferson Co., Oklahoma in 1957.  She married Thomas Vannoy in 1902 in Claiborne County, or at least she took the license to marry him.  They may have never actually married, as she never used the Vannoy surname, nor is she ever found living with him.  She married William H. Sweatman after 1920 in Texas or Oklahoma and is buried in the Terral Cemetery.  No children.

Ruthie Estes Sweatman

  • John Reagan Estes, born March 25, 1871 in Claiborne Co., died July 8, 1960 in Jefferson Co., Oklahoma. On April 10, 1891 he married Docia Neil Johnson, daughter of William Johnson and Jinsey Nervesta King, in Claiborne Co. She was born November 7, 1872 in Claiborne Co. and died August 30, 1957 in Jefferson Co.  John and Docia are both buried at Terral, Oklahoma.

The Texas family provides this information about John Regan Estes.

John Regan Estes grew to manhood in Claiborne Co. Tennessee, he received his schooling on the old split log seats and was taught to the “tune of a hickory stick”. On April 9, 1891 he married Docia Neil Johnson in Tazewell, with Rev. Bill Cook, the old family preacher, reading the vows. John and Docia were wed on horseback. A daughter, Fannie Ann, was born to them on May 4, 1892 at Tazewell.

In 1893, John Regan Estes had the ambition to go west. On the first day of November 1893, he stepped off the train at Belcherville, Texas. He was accompanied by his brother, George Buchanan Estes and family, Clabe Bartlett, and Lewis Taylor Nunn. He worked on the Silverstein ranch until January 1894.

He saved his money and sent it back to Docia and on February 9, 1894, Docia and Fannie, aged 20 months, arrived at the train station in Belcherville. At this time, they went to Oscar, Indian Territory. He located on a farm in the Oscar area and lived there until moving to the Fleetwood community in 1901. John’s farm was located on the Red River across from Red River Crossing where the Chisholm Trail crossed into Oklahoma. He had a shop near his barn and shod horses, sharpened plows, and did other metal work for the community.

Cousin Gib’s grandmother, granddaughter of John Y. Estes through John Reagan Estes, told of life in Texas when they first arrived:

Fannie wrote about the Estes family living conditions at the time that Lula was born. She said that they lived in an old log house at the end of Ketchum Bluff.  This is the area where the road going south from Oscar, Oklahoma makes a turn along a high rock formation and goes to where, at a later time, a toll bridge was built going into Texas.

Ketchum Bluff map

Courtesy Butch Bridges

Note that the old trestle of the toll bridge can still be seen on the shore of Ketchum Bluff in the aerial photo, below, about one fourth of the way from the right hand side, directly across from the sand bar.  The bend in the river at the turn is in the lower left hand corner of the photo.  The bluff, of course, lies along the river.

Ketchum Bluff aerial

Courtesy Butch Bridges

Lula was born January 29, 1899 and Fannie said that it was extremely cold and they had snow on the ground for about six weeks. The sun would come out about noon each day for a little while and then it would cloud up again and snow all night. She said that their father would cut wood all day and carry it into the house. He did not have any gloves and his hands would crack open and bleed and hurt so bad that at night he would sit by the fire and cry from the pain.

In 1901, John got the farm a little farther west of here, just east of Fleetwood, and that is where Lula grew up.

The Estes family had moved to Indian Territory in 1894 and Oklahoma did not become a state until 1907. During this time it was pretty much every man for himself and gunfights were common. John Reagan worked as a farmer, blacksmith, farrier and lawman. The family remembers him wearing a gun.

Once, a man named Joe Barnes sent word to John that he was coming to kill him. John only had a black powder shotgun and he told Barnes to stop and to not come any closer. Barnes kept coming and John blew him full of birdshot. John had a bullet hole in his stomach and would tell the grandchildren that he had two navels.

John Reagan Estes circa 1905

John Reagan Estes about 1905.

John Reagan Estes family 1905

John Reagan Estes and family in 1905.

John Reagan Estes

John Reagan Estes in 1943.

Uncle George said that John Reagan Estes came to visit in the 1940s in Claiborne County, Tennessee and that he was extremely tall and had very long eyebrows.

John Reagan Estes stone

The Texas family members tell another secret too, that John Y. Estes had another family in Texas, but a search of marriage records produced nothing.  However, when I visited, I realized that the location where John lived was on the Choctaw land.  Perhaps he did have a second family without benefit of a legal marriage.  Laws and customs on Indian lands on the Texas/Oklahoma border were quite different than back in “civilized, orderly” Tennessee.  Furthermore, Indian tribes were considered sovereign Nations.  We will probably never know the details unless another family member steps forward.

John Y. Estes died on September 19, 1895 and is buried in the Boren cemetery, northeast of Ringgold, Texas.

Old Time Texas

In 2005, I visited my cousin, Gib, in Texas.  Gib had come back to Claiborne County, TN the year before and had visited Estes Holler.  Now, I was visiting Texas to retrace the steps of my great-grandfather, John Y. Estes.

Gib gave me a great piece of advice before I set out on my great adventure to Texas.

We went to see the movie “Open Range” starring Kevin Costner and Robert Duvall. The setting for the movie is 1882 and they are “free grazing” a herd of cattle on the open range as they are moving toward market. They pass through a little town, cross a river, and are tending their herd.

John Y. Estes was in Montague County Texas in 1880. The Chisholm Cattle trail came right through the little town of Red River Station which was two miles south of the Red River. From the information that I have, the movie town was exactly like what Red River Station was like in 1882. I really got intrigued with the movie by imagining John Y. being in a place just like that. This was where he would have been at that time because Nocona and Belcherville were not founded until 1887 when the MKT railroad came through going from east to west. Ringgold was not founded until 1892 when the Rock Island railroad was built going south to north and crossed the MKT at the site of Ringgold.

Of course no good western movie would be worth the price of admission without a good gun battle. They had one and people were killed. The next thing that grabbed me was the burial scene. They dug graves out on top of a hill and hauled the wooden caskets out in a wagon. This setting was just like what I found at Boren cemetery.

Another thing that caught my attention was the heavy rain storm that they experienced at the little town. Red River Station was pretty much wiped out by a Tornado in the late 1880’s and all the business moved to Belcherville and Nocona.

Anyway, go see the movie and imagine John Y. being one of the residents of the little town and then visualize all of our relatives crossing the Red River on horseback as they did in the movie. The River depth shown is also accurate of Red River. Later, John Reagan Estes owned the land on the Oklahoma side and the Campbells and Vannoys owned ranches on the Texas side.

Go see where John Y. lived in 1882, let your imagination run wild and enjoy it.

I agree 100% with Gib’s recommendation.

The Chisholm Trail

The Chisholm Trail cut through the Estes land.

Chisholm Trail

Not far from Ryan is one of the cuts in a creek bank  worn by the pounding of thousands of hoofs when the Chisholm Trail was noted for its cattle drives from Texas to Wichita, Kansas.

This map shows Ryan and Terral, OK, and the ghost location of Fleetwood.  All that is left today is a store full of bullet holes and a cemetery.

Fleetwood OK

According to Gib, that cut is still visible on the Estes property. Although highway U.S. 81 mostly follows the route of the old Chisholm Trail, at times engineers had to diverge from the trail itself in the interest of safety, mileage and economy. The original route crosses a cow lot owned by a man who probably knows more about that trail than anyone in this area. ( Note: the worn cattle trail rut up the hill was just west of the Estes cow lot. ) The location is about three miles east of Fleetwood.

The Chisholm Trail crossed the Red River at Red River Station.  On the Oklahoma side, or Indian Territory at that time, this was at Fleetwood and a marker has been placed today.  On the map below, you can see the balloon of the marker at Fleetwood and below the Red River, Red River Station Road.

Red River Station

Turning on the satellite image, here’s that part of the Red River near Station Road where the cattle would have crossed into Oklahoma.  Apparently, this is the area where the Estes land was located.  I thought sure I’d still be able to see the Chisholm Trail today, but I can’t.

Red River Chisolm Crossing

There was a large dugout in the side of the hill where the Estes family lived while their house was being built.

dugout house

You really have to want to visit the Boren Cemetery.  It’s nearly impossible to find, to begin with, and after you locate it, getting to it through 3 or 4 farm gates is another problem entirely.  And then there’s the issue of wild hogs – and they are not friendly.  In fact, they’re pretty testy – and they aren’t looking to you to feed them, but are looking at you as food.  I fully understand why people here carry guns – plural.

The Boren Cemetery

Boren cemetery crop

The Boren cemetery isn’t far from the Chisholm Trail and not far from where the Estes land was located.  On the map below, you can see the cemetery, marked by the red balloon, and you can also see the Red River Station Road to the right and Fleetwood on the Oklahoma side of the border.

Boren Cem near Red River Station

The Boren Cemetery is located in rolling Texas hill country – and sometimes those rolls are a bit steep.

Gib says to me, “It’s over there somewhere.”

Boren cemetery approach

Ok, Texas is a mighty big place and I don’t SEE anything that looks like a cemetery.

Gib had obtained directions and he and his wife had come out once already and scouted the area.  His wife opted not to come a second time.  That should have been a clue.

Gib had called the local farmer, so he had the lock combinations to the several gates we encountered.

Eventually, we entered a field and started driving across the field, then up the hill, then Gib’s 4 wheel drive vehicle bottomed out.  We were on foot from here on.

Gib forgot to mention the snakes to me.  Those would be rattlesnakes.  Now, I have snake-boots at home, but those boots at home weren’t helping me one bit here.  I was not to be deterred.  Gib was wearing cowboy boots and walked in front of me.

We found the path that led up to the cemetery.

We had to crawl under the barbed wire fence, or climb over it – because there was no gate.  By now, I could feel the rivulets of sweat running down my back.  Gib, the consummate Texas cowboy, was entirely unfazed.  They make ‘em tough down there – I’m telling ya!

Boren cemetery cactus

And if the barbed wire doesn’t get you, the cactus will.  Yes, that’s a bone.  I don’t know is the answer to your next question.  Just don’t ask.

Boren cemetery stones

It’s kind of rough country here, with the stones scattered in no order, graves dug where there were no rocks to interfere with the shovels.  At home on the Indiana farm where I grew up, we would have called this scrub, scratch or hard-scrabble.  Here, it is normal.  But that’s why they need a lot of it to make a living.

This stone in front is the marker for John Y. Estes.  It’s beside a Campbell and Vannoy marker, in fact, John’s son-in-law who was buried just a week before John was.  Did John stand at his son-in-law’s grave just a week before he would be buried beside him?  John’s marker is actually very unique, as gravestones go – and the only one here like it.  In fact, it’s the only one I’ve ever seen like it.

John’s stone was cast in concrete and then the information was drawn in the wet concrete with some kind of object – freestyle.  This tickled Gib a great deal because he had spent many years of his life working in the concrete business – so this somehow seemed fitting.

Tracking John Y. Without GPS

So now we’ve followed John Y. Estes across half of the United States.  While his son Lazarus likely never ranged further than Knoxville, John Y. Estes was not only very widely traveled, but the biggest part was on foot – at least the Tennessee to Texas to Tennessee to Texas part – and probably much of the Civil War part too.

Let’s look at where John Y. Estes was and when.  I can’t keep track.

| Location | Date |
| --- | --- |
| Halifax Co., VA | 1818 – birth location |
| Claiborne Co., TN | 1820s, 1840-1870s |
| Grainger Co., TN | 1830s |
| Tazewell, Claiborne Co., TN | 1860 |
| Claiborne County, TN | Aug. 10, 1862 – Confederate Unit Formed |
| Murfreesboro, TN | Dec. 29, 1862 – Civil War battle |
| Murfreesboro Pike, TN | Dec 31, 1862 – Civil War battle |
| Stanford and Crab Orchard Road, KY | March 30, 1863 – Civil War battle |
| Albany, KY | May 1, 1863 – Civil War battle |
| Travisville, Fentress Co., TN | May 2, 1863 – Civil War battle |
| Ebenezer, TN | July 31, 1863 – Civil War activity |
| Clinton, TN | August 15, 1863 – Civil War activity |
| Cumberland Gap, TN | August 15, 1863 – Sept. 1863 – Civil War activity |
| Lee County, VA Courthouse | Sept. 18, 1863 – the North took the Gap – Civil War battle |
| Kingsport, TN | Sept. 18, 1863 – Civil War battle |
| Saltville, VA | Oct. 31, 1863 – Civil War battle |
| Rogersville, TN | Nov. 1, 1863 – Civil War battle |
| Charleston, TN | Nov. 24, 1863 – Civil War battle |
| Battle of New Hope Church, Orange Co., VA | Nov 27 – Dec. 2, 1863 |
| Valley of Virginia Campaigns, Shenandoah Valley, VA | May-July, 1864 |
| Battle of Piedmont, Augusta Co., VA | June 5, 1864 |
| Charlottesville, VA | June 12, 1864 – hospital |
| Staunton, VA | June 30, 1864 – deserted |
| Chattanooga, TN | March 6, 1865 – POW |
| Louisville, KY | March 20, 1865 – POW signed oath of allegiance – released north of the Ohio |
| Claiborne Co., TN | 1865-1879 |
| Nocona, TX | 1880-1895 |

I would have loved to sit for a day and talk to this man.  What stories he had to tell.

The John Y. Part of Me

I have to tell you, this man had chutzpah.  He was tenacious.  He walked to Texas, twice, using a cane or stick to walk, more than 900 miles each way, when he was 61 years of age.  And it didn’t kill him.  I can’t even begin to imagine this trip, once, let alone once there, walking back to Tennessee and then back to Texas, again.  In essence, just one of those trips took 3-4 months.  Three of them probably took more than a year of his life.

The concept of that just baffles me. What could be that alluring about Texas?  And why go back to Tennessee once you had arrived in Texas?

But then again, I’m not so terribly different in some ways.  And sometimes things I do baffle others.

In the 1980s, I decided to retrace the Trail of Tears, in honor of my Native American ancestors and in protest of the atrocities that befell them.  I walked part of the trail, but that’s a lot easier said than done for various reasons – not the least of which is that the trail isn’t (or wasn’t then) marked and segments are lost or missing in many places.  In the 1980s and 1990s, I completed the segment through Tennessee and Kentucky, into Illinois.  In 2005, I completed the section between southern Illinois and Tahlequah, Oklahoma, the home of the western Cherokee nation today, where the Cherokee settled. Altogether, this trek took me over 20 years because I had to make it in segments.  In 2005, I picked up where I had left off in Illinois and within a couple days, found myself at the location where the Native people crossed the Mississippi.

Trail of Tears State Park

I walked part of that as well, on both sides of the river, but given that I was traveling alone, I had to walk back to my car and then drive to the next segment to walk.  Take my word for it, the state of Missouri goes on forever!

Trail of Tears Crossing

I was a lot younger then than John Y. was when he walked to Texas, and he walked the entire distance, not just a few miles or a day here and there.

One of the most unforgettable stops on that journey was the Trail of Tears State Park in Missouri, just across the border from Illinois where the Cherokee spent a horrific winter, starving and freezing to death, and waiting for the ice to melt so they could cross the Mississippi.  It took eleven weeks to cover 60 miles and the Native people suffered terribly, horrifically – the local people refusing to help them with food.  Within days, there was no wildlife left to hunt.

Trail of Tears at Mississippi

This is on the Missouri side of the River, looking across the river at the land where more than 15,000 Native people camped, and waited, with no food and only light blankets in one of the worst winters recorded.  Weakened from starvation, people froze to death nightly.  The dead couldn’t even be buried, their bodies left in the snow.  There were no reports of cannibalism, but that level of desperation would not have surprised me.

The Trail of Tears as a whole, but this segment in particular, was an unfathomable act of inhumane genocide – torture, hour by hour, day by day, as you watched those you love starve and freeze, as you were doing so yourself.  You can feel their aching spirits as you stand on the land, even yet today.  Some were so devastated that they never spoke again in their lifetimes.  Their torture and grief is unfathomable and the depth of that black hole remains both tangible and palpable today.  There simply are no words.

My final destination in 2005, 125 years after John Y. Estes walked to Texas?  Texas.  Why?  To find John Y. Estes’s grave.  I never, at that time, realized the parallels.  But then, I didn’t really know the rest of the story.  Today, I find the parallels mind-boggling.

What of John Y. Estes do I have in me?  Do I carry his tenacity?  My mother would assuredly have voted in the affirmative, and she would not have meant that as a compliment!  I, on the other hand, am quite proud of that trait.

Sometimes it’s difficult to answer these kinds of questions – meaning how much of one particular ancestor’s DNA you carry.  One reason is that generational DNA is often measured in couples.  By this, I mean that if I compare myself to another individual who descends from John Y. Estes, like cousin Buster for example, the DNA that Buster and I share will not be just the DNA of John Y., but also the DNA of John Y’s wife, Rutha Dodson.

The only way to avoid this “spousal contamination,” and I mean that only in the nicest of ways, is by comparing the DNA of descendants of John Y. to someone who only descends from the Estes side, not the Dodson side.  What this really means is that the comparison has to be against someone who descended from John R. Estes, the father of John Y. Estes (or another Estes whose ancestor is upstream of John Y. Estes and who doesn’t share other family lines.)  Unfortunately, this means that it pushes the relationship back another generation, which means that less DNA will be shared between the cousins.

The cousins I have to work with are as follows, at least at Family Tree DNA.

Estes descent chart

In order for the closest descendants of John Y. Estes to be compared to a descendant of John R. Estes, I utilized the chromosome browser at Family Tree DNA.  Garmon is descended from John R. Estes, so carries none of Rutha’s DNA.  Therefore, any DNA that John Y’s descendants share with Garmon had to come from the Estes side of the house.

The chromosome browser graphic below shows Garmon’s chromosomes, with the DNA matching each of the following individuals shown in these colors:

  • Me – Orange
  • Iona – Blue
  • David – Green
  • Buster – Magenta

On chromosome 1, Buster and Iona match Garmon, but I don’t and neither does David.  This is clearly John Y. Estes’s DNA, but I don’t carry it.

On chromosome 7 there is a small segment shared by everyone except David.

On chromosome 10, there is another small segment shared by me, David and Garmon.

Part of chromosome 13 is shared by Garmon, Iona and David.

To me, the most interesting part of this equation is that chromosome 19 holds a fairly large segment shared by everyone except Buster.

Garmon chromosome

So, let’s answer the question of how much of John Y’s DNA I carry.  I downloaded the segment chart that accompanies the chromosome browser and used that information to triangulate my matches – meaning that I noted when I matched two other cousins.  Not all matches are triangulated, proving a common Estes ancestor, but some are.  I then checked those cousins’ accounts to be sure they did, indeed, match each other on those segments – which is the criterion for triangulation.
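
For readers who like to see the logic spelled out, here is a minimal sketch in Python of that triangulation check.  It is only an illustration, not the vendor's method: the `Segment` record, the function names and the positions in the example are all hypothetical, and a real analysis would also apply cM and SNP thresholds to the overlapping region.

```python
from typing import NamedTuple, Optional


class Segment(NamedTuple):
    """One matching segment between two testers (hypothetical record layout)."""
    person_a: str
    person_b: str
    chromosome: int
    start: int  # start position in base pairs
    end: int    # end position in base pairs


def overlap(s1: Segment, s2: Segment) -> Optional[tuple]:
    """Return (chromosome, start, end) of the region both segments cover, or None."""
    if s1.chromosome != s2.chromosome:
        return None
    start, end = max(s1.start, s2.start), min(s1.end, s2.end)
    return (s1.chromosome, start, end) if start < end else None


def triangulates(me_x: Segment, me_y: Segment, x_y: Segment) -> bool:
    """True when I match X, I match Y, and X matches Y over a common region."""
    shared = overlap(me_x, me_y)
    if shared is None:
        return False
    chrom, start, end = shared
    # The third side of the triangle must cover part of that same region.
    return x_y.chromosome == chrom and max(start, x_y.start) < min(end, x_y.end)


# Made-up positions, purely for illustration.
me_x = Segment("Me", "Cousin X", 19, 10_000_000, 30_000_000)
me_y = Segment("Me", "Cousin Y", 19, 15_000_000, 28_000_000)
x_y = Segment("Cousin X", "Cousin Y", 19, 12_000_000, 27_000_000)
print(triangulates(me_x, me_y, x_y))
```

The point of the third argument is the step described above: matching two cousins on the same stretch is not enough until those two cousins are confirmed to match each other there as well.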

This chart shows all of my matches to Garmon, which, barring a second line or matches by chance, would all be John Y.’s DNA.

Garmon Roberta DNA matches

As we know, the only way to actually prove that these segments descend from John Y. is through triangulation, but how can I triangulate more DNA to John Y. Estes?

The answer is the Lazarus tool at GedMatch, a tool built to reassemble or recreate our ancestors from their descendants – to reassemble their scattered DNA.

First, Lazarus allows you to enter up to 10 direct descendants and up to 100 “other relatives,” which means brothers, cousins and descendants of those people, but not anyone who also descends from John Y. Estes’s wife, Rutha Dodson.  If he had two wives and you were comparing children from both spouses against each other, then the criteria would be a bit different.

In other words, we’re only utilizing direct Estes line descendants, upstream of John Y. Estes.

I selected 4cM and 300 SNPs as my match criteria.

I have a total of 7 descendants and 4 other relatives, not all of whom have tested at Family Tree DNA.

I was pleased to note after running Lazarus at GedMatch that we had a total of 513.9 cM of John Y. Estes’s DNA reconstructed through his descendants and his other relatives.  In essence, that’s approximately 7.6% of John’s DNA that we’ve recovered.  Not bad for someone who was born 197 years ago.

The Lazarus tool matched my DNA with other Estes relatives, but NOT descendants of John Y. Estes.  I inherited the following segments directly from John Y. Estes.  Several of these segments were triangulated with 2 or more relatives.

John Y. Estes reconstruct DNA matches

Of these, only two, on chromosomes 9 and 19, are partial matches to the original list from Family Tree DNA. While, at first glance this looks unusual, it isn’t.  Both of the matches at Family Tree DNA over the threshold selected at GedMatch are included.  The lower segment matches were not “seen” at Gedmatch.  This is one reason why I utilize both tools when possible.  GedMatch allows you to utilize people’s results who tested at a different company, and Family Tree DNA allows you to easily pick up those common small segments.

If all of these segments are from John (and not from a secondary unknown shared line or identical by chance,) then I carry 156.6 cM of John Y. Estes’ DNA that I can map.  Given that John is my great-great-grandfather, I would be expected to carry about 6.25% of his DNA.  Of that amount, I’ve been able to tentatively identify about 2.3%, so if the right people were to test, I should be able to identify about another 3.95%.  So, in rough numbers, I’ve identified around one third of the DNA that I inherited from John Y. Estes utilizing 7 descendants and 4 other relatives.
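
The arithmetic behind those percentages can be sketched in a few lines of Python.  The one number not stated in this article is the total against which the centimorgans are measured; I am assuming roughly 6800 cM here, which is what the ISOGG autosomal statistics imply (53.13 cM corresponds to 0.781%).  Treat that total, and the variable names, as assumptions rather than anything GedMatch reports.

```python
# Assumption: total autosomal cM used as the denominator (implied by ISOGG figures).
TOTAL_CM = 6800.0


def percent_of_genome(cm: float, total_cm: float = TOTAL_CM) -> float:
    """Convert a centimorgan total into a percentage of the whole genome."""
    return 100.0 * cm / total_cm


reconstructed_pct = percent_of_genome(513.9)  # John Y.'s DNA rebuilt by Lazarus
mapped_pct = percent_of_genome(156.6)         # segments of his that I can map
expected_pct = 100.0 / 2 ** 4                 # great-great-grandfather: 4 generations back
remaining_pct = expected_pct - mapped_pct     # still unidentified
share_identified = mapped_pct / expected_pct  # portion of my inheritance identified

print(f"Reconstructed: {reconstructed_pct:.1f}% of John Y.'s DNA")
print(f"Mapped in me:  {mapped_pct:.1f}% of an expected {expected_pct:.2f}%")
print(f"Remaining:     {remaining_pct:.2f}%")
print(f"Identified:    {share_identified:.0%} of what I inherited from him")
```

Under that assumed total, the results land on about 7.6% reconstructed, about 2.3% mapped, just under 4% remaining, and a bit over one third identified, in line with the rough numbers above.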

So, now if I could just figure out which one of these genes is the “walk to Texas” and wanderlust gene, we’d be all set.  If I received that from any ancestor, it’s very likely to be from John Y. Estes, the only man I’ve ever known who walked to Texas, even once.

Red river aerial

Aerial view of the Red River, Texas on the right, Oklahoma on the left.

Acknowledgements:  A special thank you to cousin Gib, who supplied most of the Texas information and a lot of camaraderie over the years.

A Study Utilizing Small Segment Matching

There has been quite a bit of discussion in the last several weeks, both pro and con, about how to use small matching DNA segments in genetic genealogy.  A couple of people are even of the opinion that small segments can’t be used at all, ever.  Others are less certain and many of us are working our way through various scenarios.  Evidence certainly exists that these segments can be utilized.

I’ve been writing foundation articles, in preparation for this article, for several weeks now.  Recently, I wrote about how phasing works and determining IBD versus IBS matches and included guidelines for telling the difference between the different kinds of matches.  If you haven’t read that article, it’s essential to understanding this article, so now would be a good time to read or review that article.

I followed that with a step by step article, Demystifying Autosomal DNA Matching, on how to do phasing and matching in combination with the guidelines about how to determine IBD (identical by descent) versus IBS (identical by state) matches, whether by chance or by population, when evaluating your own matches.

Now that we understand IBS, IBD, Phasing and how matching actually works on a case by case basis, let’s look at applying those same matching and IBS vs IBD guidelines to small data segments as well.

A Little History

So that those of you who haven’t been following the discussion on various blogs and social media don’t feel like you’ve been dropped into the middle of a conversation with no context, let me catch you up.

On Thanksgiving Day, I published an article about identifying one of my ancestors, after many years of trying, Sarah Hickerson.

That article spurred debate, which is just fine when the debate is about the science, but it subsequently devolved into something less pleasant.  There are some individuals with very strong opinions that utilizing small segments of DNA data can “never be done.”

I do not agree with that position.  In fact, I strongly disagree and there are multiple cases with evidence to support small segments being both accurate and useful in specific types of genealogical situations.  We’ll take a look at several.

I do agree that looking at small segment data out of context is useless.  To the best of my knowledge, no genealogist begins with their smallest segments and tries to assemble them, working from the bottom up.  We all begin with the largest segments, because they are the most useful and the closest connections in our tree, and work our way down.  Generally, we only work with small segments when we have to – and there are times that’s all we have.  So we need to establish guidelines and ways to know if those small segments are reliable or not.  In other words, how can we draw conclusions and how much confidence can we put in those conclusions?

Ultimately, whether you choose to use or work with small segment data will be your own decision, based on your own circumstances.  I simply wanted to understand what is possible and what is reasonable, both for my own genealogy and for my readers.

In my projects, I haven’t been using small segment data out of context, or randomly.  In other words, I don’t just pick any two small segment matches and infer or decide that they are valid matches.  Fortunately, by utilizing the IBD vs IBS guidelines, we have tools to differentiate IBD (Identical by Descent) segments from IBS (Identical by State) by chance segments and IBD/IBS by population for matching segments, both large and small.

Studying small segment data is the key to determining exactly how small segments can reasonably be utilized.  This topic probably isn’t black or white, but shades of gray – and assuming the position that something can’t be done simply assures that it won’t be.

I would strongly encourage those involved and interested in this type of research to retain those small segments, work with them and begin to look for patterns.  The only way we, as a community, are ever going to figure out how to work with small segments successfully and reliably is to, well, work with them.

Discussing the science and scenarios surrounding the usage of small data segments in various different situations is critical to seeing our way through the forest.  If the answers were cast in concrete about how to do this, we wouldn’t be working through this publicly today.

Negative personal comments and inferences have no place in the scientific community.  It discourages others from participating, and serves to stifle research and cooperation, not encourage it.  I hope that civil scientific discussions and comparisons involving small segment data can move forward, with decorum, because they are critically needed in order to enhance our understanding, under varying circumstances, of how to utilize small segment data.  As Judy Russell said, disagreeing doesn’t have to be disagreeable.

Two bloggers, Blaine Bettinger and CeCe Moore wrote articles following my Hickerson article.  Blaine subsequently wrote a second article here.  Felix Immanuel wrote articles here and here.

A few others have weighed in, in writing, as well although most commentary has been on Facebook.  Israel Pickholtz, a professional genealogist and genetic consultant, stated on his blog, All My Foreparents, the following:

It is my nature to distrust rules that put everything into a single category and that’s how I feel about small segments. Sometimes they are meaningful and useful, sometimes not.

When I reconstructed my father’s DNA using Lazerus (described last week in Genes From My Father), I happily accepted all small segments of whatever size because those small segments were in the DNA of at least one of his children and at least one of his brother/sister/first cousin. If I have a particular small segment, I must have received it from my parents. If my father’s brother (or sister) has it as well, then it is eminently clear to me that I got it from my father and that it came to him and his brother from my grandfather. And it is not reasonable to say that a sliver of that small segment might have come from my mother, because my father’s people share it.

After seeing Israel’s commentary about Lazarus, I reconstructed the genome of both Roscoe and John Ferverda, brothers, which includes both large and small segments.  Working with the Ferverda DNA further, I wrote an article, Just One Cousin, about matching between two siblings and a first cousin, which includes lots of small data segments, some of which were proven to triangulate, meaning they are genuine, and some which did not.  There are lots more examples in the demystifying article, as well.

What Not To Do 

Before we begin, I want to make it very clear that I am not now advocating, and never have advocated, that people utilize small data segments out of context of larger matching segments and/or at least suspected matching genealogy.  For example, I have never implied or even hinted that anyone should go to GedMatch, do a “one to many” compare at 1 cM and then contact people informing them that they are related.  Anyone who has extrapolated what I’ve written to mean that either simply did not understand or intentionally misinterpreted the articles.

Sarah Hickerson Revisited

If I thought Sarah Hickerson caused me a lot of heartburn in the decades before I found her, little did I know how much heartburn that discovery would cause.

Let’s go back to the Sarah Hickerson article that started the uproar over whether small data segments are useful at all.

In that article, I found I was a member of a new Ancestry DNA Circle for Charles Hickerson and Mary Lytle, the parents of Sarah Hickerson.

Ancestry Hickerson match

Because there are no tools at Ancestry to prove DNA connections, I hurried over to Family Tree DNA looking for any matches to Hickersons for myself and for my Vannoy cousins who also (potentially) descended from this couple.  Much to my delight, I found several matches to Hickersons, in fact, more than 20 – a total of 614 rows of spreadsheet matches when I included all of my Vannoy cousins who potentially descend from this couple to their Hickerson matches.  There were 64 matching clusters of segments, both small and large.  Some matches were as large as 20cM with 6000 SNPs and more than 20 were over 10cM with from 1500 to 6000 SNPs.  There were also hundreds of small segments that matched (and triangulated) as well.

By the time I added in a few more Vannoy cousins that we’ve since recruited, the spreadsheet is now up to 1093 rows and we have 52 Vannoy-Hickerson TRIANGULATED CLUSTERS utilizing only Family Tree DNA tools.

Triangulated DNA, found in 3 or more people at the same location who share a common ancestor, is proven to be from that ancestor (or ancestral couple.)  This is the commonly accepted gold standard of autosomal DNA triangulation within the industry.

Here’s just one example of a cluster of three people.  Charlene and Buster are known (proven, triangulated) cousins and Barbara is a descendant of Charles Hickerson and Mary Lytle.

example triang

What more could you want?

Yes, I called this a match.  As far as I’m concerned, it’s a confirmed ancestor.  How much more confirmed can you get?

Some clusters have as many as 25 confirmed triangulated members.

chr 13 group

Others took issue with this conclusion because it included small segment data.  This seems like the perfect opportunity in which to take a look at how small segments do, or don’t stand up to scrutiny.  So, let’s do just that.  I also did the same type of matching comparison in a situation with 2 siblings and a known cousin, here.

To Trash…or Not To Trash

Some genetic genealogists discard small segments entirely, generally under either 5 or 7cM, which I find unfortunate for several reasons.

  1. If a person doesn’t work with small segments, they really can’t comment on the lack of results, and they’ll never have a success because the small segments will have been discarded.
  2. If a person doesn’t work with small segments, they will never notice any trends or matches that may have implications for their ancestry.
  3. If a person doesn’t work with small segments, they can’t contribute to the body of evidence for how to reasonably utilize these segments.
  4. If a person doesn’t work with small segments, they may well be throwing the baby out with the bathwater, but they’ll never know.
  5. They encourage others to do the same.

The Sarah Hickerson article was not meant as a proof article for anything – it was meant to be an article encouraging people to utilize genetic genealogy for not only finding their ancestor and proving known connections, but breaking down brick walls.  It was pointing the way to how I found Sarah Hickerson.  It was one of my 52 Ancestors Series, documenting my ancestors, not one of the specifically educational articles.  This article is different.

If you are only interested in the low hanging fruit, meaning within the past 5 or 6 generations, and only proving your known pedigree, not finding new ancestors beyond that 5-6 generation level, then you can just stop reading now – and you can throw away your small segments.  But if you want more, then keep reading, because we as a community need to work with small segment data in order to establish guidelines that work relative to utilizing small segments and identifying the small segments that can be useful, versus the ones that aren’t.

I do not believe for one minute that small segments are universally useless.  As Israel said, if his family did not receive those segments from a common family member, then where did they all get those matching segments?

In fact, utilizing triangulated and proven DNA relationships within families is how adoptees piece together their family trees, piggybacking off of the work of people with known pedigrees that they match genetically.  My assumption had been that the adoptee community utilized only large DNA segments, because the larger the matching segments, generally the closer in time the genealogy match – and theoretically the easier to find.

However, I discovered that I was wrong, and the adoptee community does in fact utilize small segments as well.  Here’s one of the comments posted on my Chromosome Browser War blog article.

“Thanks for the well thought out article, Roberta, I have something to add from the folks at DNAadoption. Adoptees are not just interested in the large segments, the small segments also build the proof of the numerous lines involved. In addition, the accumulation of surnames from all the matches provides a way to evaluate new lines that join into the tree.”

Diane Harman-Hoog (on behalf of the 6 million adoptees in this country, many of who are looking for information on medical records and family heritage).

Diane isn’t the only person who is working with small segment data.  Tim Janzen works with small segments, in particular on his Mennonite project, and discusses small segments on the ISOGG WIKI Phasing page.  Here is what Tim has to say:

“One advantage of Family Finder is that FF has a 1 cM threshold for matching segments. If a parent and a child both have a matching segment that is in the 2 to 5 cM range and if the number of matching SNPs is 500 or more then there is a reasonably high likelihood that the matching segment is IBD (identical by descent) and not IBS (identical by state).”

The same rules for utilizing larger segment data need to be applied to small segment data to begin with.

Are more guidelines needed for small segments?  I don’t know, but we’ll never know if we don’t work with many individual situations and find the common methods for success and identify any problematic areas.

Why Do Small Segments Matter?

In some cases, especially as we work beyond the 6 generation level, small segments may be all we have left of a specific ancestor.  If we don’t learn to recognize and utilize the small segments available to us, those ancestors, genetically speaking, will be lost to us forever.

As we move back in time, the DNA from more distant ancestors will be divided into smaller and smaller segments, so if we ever want the ability to identify and track those segments back in time to a specific ancestor, we have to learn how to utilize small segment data – and if we have deleted that data, then we can’t use it.

In my case, I have identified all of my 5th generation ancestors except one, and I have a strong lead on her.  In my 6th generation, however, I have lots of walls that need to be broken through – and DNA may be the only way I’ll ever do that.

Let’s take a look at what I can expect when trying to match people who also descend from an ancestor 5 generations back in time.  If they are my same generation, they would be my fourth cousins.

Based on the autosomal statistics chart at ISOGG, 4th cousins, on the average, would expect to share about 13.28 cM of DNA from their common ancestor.  This would not be over the match threshold at FTDNA of approximately 20 cM total, and if those segments were broken into three pieces, for example, that cousin would not show as a match at either FTDNA or 23andMe, based on the vendors’ respective thresholds.

% Shared DNA    Expected Shared cM    Relationship
0.781%          53.13                 Third cousins, common ancestor is 4 generations back in time
0.391%          26.56                 Third cousins once removed
–               20                    Family Tree DNA total cM threshold
0.195%          13.28                 Fourth cousins, common ancestor is 5 generations back in time
–               7                     23andMe individual segment cM match threshold
0.0977%         6.64                  Fourth cousins once removed
0.0488%         3.32                  Fifth cousins, common ancestor is 6 generations back in time
0.0244%         1.66                  Fifth cousins once removed
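
The averages in this table follow a simple halving pattern, which the short Python sketch below reproduces.  The 6800 cM total is my assumption, inferred from the table itself (53.13 cM at 0.781%), and the function name is just for illustration.  These are expected averages only; real cousins can share considerably more or less, or nothing at all.

```python
# Assumption: total autosomal cM implied by the table (53.13 cM is 0.781%).
TOTAL_CM = 6800.0


def shared_fraction(cousin_degree: int, removed: int = 0) -> float:
    """Average fraction of DNA shared by full nth cousins, `removed` times removed.

    Full cousins share an ancestral couple (hence the factor of 2), and each
    generation on either line of descent halves what is passed down.
    """
    return 2 * 0.5 ** (2 * (cousin_degree + 1) + removed)


for degree, removed in [(3, 0), (3, 1), (4, 0), (4, 1), (5, 0), (5, 1)]:
    fraction = shared_fraction(degree, removed)
    label = f"{degree}C" + (f"{removed}R" if removed else "")
    print(f"{label:5s} {fraction * 100:.4f}%  {fraction * TOTAL_CM:6.2f} cM")
```

Each additional generation on one side halves the expected amount, which is why the numbers drop below the vendor thresholds so quickly.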

If you’re lucky, as I was with Hickerson, you’ll match at least some relative who carries that ancestral DNA line above the threshold, and then they’ll match other cousins above the threshold, and you can build a comparison network, linking people together, in that fashion.  And yes you may well have to utilize GedMatch for people testing at various different vendors and for those smaller segment comparisons.

For clarification, I have never “called” a genealogy match without supporting large segment data.  At the vendors, you can’t even see matches if they don’t have larger segments – so there is no way to even know you would match below the threshold.

I do think that we may be able to make calls based on small segments, at least in some instances, in the future.  In fact, we have to figure out how to do this or we will rarely be able to move past the 5th or 6th generation utilizing genetics.

For third cousins once removed, one expects to see approximately 26 cM of matching DNA, still over the threshold (if divided correctly), but from that point further back in time, the expected shared amount of DNA is under the current day threshold.  For those who wonder why the vendors state that autosomal matches are reliable to about the 5th or 6th generation, this is the answer.

I do not discount small segments without cause.  In other words, I don’t discount small segments unless there is a reason.  Unless they are positively IBS by chance, meaning false, and I can prove it, I don’t disregard them.  I do label them and make appropriate notes.  You can’t learn from what’s not there.

Let me give you an example.  I have one area of my spreadsheet where I have a whole lot of segments, large and small, labeled Acadian.  Why?  Because the Acadians are so intermarried that I can’t begin to sort out the actual ancestor that DNA came from, at least not yet…so today, I just label them “Acadian.”

This example row is from my master spreadsheet.  I have my Mom’s results in my spreadsheet, so I can see easily if someone matches me and Mom both. My rows are pink.  The match is on Mom’s side, which I’ve color coded purple.  I don’t know which ancestor is the most recent common ancestor, but based on the surnames involved, I know they are Acadian.  In some cases, on Acadian matches, I can tell the MRCA and if so, that field is completed as well.

Me Mom acadian

As a note of interest, I inherited my mother’s segment intact, so there was no 50% division in this generation.

I also have segments labeled Mennonite and Brethren.  Perhaps in the future I’ll sort through these matches and actually be able to assign DNA segments to specific ancestors.  Those segments aren’t useless, they just aren’t yet fully analyzed.  As more people test, hopefully, patterns will emerge in many of these DNA groupings, both small and large.

In fact, I talked about DNA patterns and endogamous populations in my recent article, Just One Cousin.

For me, today, some small segment matches appear to be central European matches.  I say “appear to be,” because they are not triangulated.  For me this is rather boring and nondescript – but if this were my African American client who is trying to figure out which line her European ancestry came from, this could be very important.  Maybe she can map these segments to at least a specific ancestral line, which she would find very exciting.

Learning to use small segments effectively has the potential to benefit the following groups of people:

  • People with colonial ancestry, because all that may be left today of colonial ancestors is small segments.
  • People looking to break down brick walls, not just confirm currently known ancestors.
  • People looking for minority ancestors more than 5 or 6 generations back in their trees.
  • Adoptees – although very clearly, they want to work with the largest matches first.
  • People working with ethnic identification of ancestors, because you will eventually be able to track ethnicity identifying segments back in time to the originating ancestor(s).

Conversely, people from highly endogamous groups may not be helped much, if at all, by small segments because they are so likely to be widely shared within that population as a group from a common ancestor much further back in time.  In fact, the definition of a “small segment” for people with fully endogamous families might be much larger than for someone with no known endogamy.

However, if we can identify segments to specific populations, that may help the future accuracy of ethnicity testing.

Let’s go back and take a look at the Hickerson data using the same format we have been using for the comparisons so far.

Small Segment Examples

These Hickerson/Vannoy examples do not utilize random small segment matches, but are utilizing the same matching rules used for larger matches in conjunction with known, triangulated cousin groups from a known ancestor.  Many cousins, including 2 brothers and their uncle all carry this same DNA.  Like in Israel’s case, where did they get that same DNA if not from a common ancestor?

In the following examples, I want to stress that all of the people involved DO HAVE LARGER SEGMENT MATCHES on other chromosomes, which is how we knew they matched in the first place, so we aren’t trying to prove they are a match.  We know they are.  Our goal is to determine if small segments are useful in the same situation, proving matches, as with larger segments.  In other words, do the rules hold true?  And how do we work with the data?  Could we utilize these small segment matches if we didn’t have larger matching segments, and if so, how reliable would they be?

There is a difference between a single match and a triangulated group:

  • Matches between two people are suggestive of a common ancestor but could be IBS by chance or population.
  • Multiple matches, such as with the 6 different Hickersons who descend from Charles Hickerson and Mary Lytle, both in the Ancestry DNA Circle and at Family Tree DNA, are extremely suggestive of a specific common ancestor.
  • Only triangulated groups are proof of a common ancestor, unless the people are closely related known relatives.

In our Hickerson/Vannoy study, all participants match at least to one other (but not to all other) group members at Family Tree DNA which means they match over the FTDNA threshold of approximately 20 cM total and at least one segment over 7.7cM and 500 SNPs or more.
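
To make that threshold concrete, here is a rough Python sketch of the rule as I have described it: about 20 cM in total, plus at least one segment of 7.7 cM or more with 500 or more SNPs.  This is my own simplification, not Family Tree DNA's actual code; the function name is hypothetical, the SNP counts in the example are invented, and I am assuming every listed segment counts toward the total.

```python
def passes_ftdna_threshold(segments, min_total_cm=20.0,
                           min_longest_cm=7.7, min_snps=500):
    """Check one pair of testers against the approximate match threshold.

    `segments` is a list of (cM, SNP count) tuples for that pair.
    """
    total_cm = sum(cm for cm, _ in segments)
    # At least one segment must be long enough and dense enough in SNPs.
    has_anchor = any(cm >= min_longest_cm and snps >= min_snps
                     for cm, snps in segments)
    return total_cm >= min_total_cm and has_anchor


# Illustrative cases with made-up SNP counts.
print(passes_ftdna_threshold([(26.56, 5000)]))      # one long segment
print(passes_ftdna_threshold([(13.28, 3000)]))      # long enough, but total too low
print(passes_ftdna_threshold([(6.64, 1500)] * 4))   # enough total, no anchor segment
```

The third case is the one that matters for this discussion: the same amount of shared DNA, broken into smaller pieces, never shows up as a match at the vendor at all.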

In the example below, from the Hickerson article, the known Vannoy cousins are on the left side and the Hickerson matches to the Vannoy cousins are across the top.  We have several more now, but this gives you an idea of how the matching stacked up initially.  The two green individuals were proven descendants from Charles Hickerson and Mary Lytle.

vannoy hickerson higginson matrix

The goal here is to see how small data segments stack up in a situation where the relationship is distant.  Can small segments be utilized to prove triangulation?  This is slightly different than in the Just One Cousin article, where the relationship between the individuals was close and previously known.  We can contrast the results of that close relationship and small segments with this more distant connection and small segments.

Sarah Hickerson and Daniel Vannoy

The Vannoy project has a group of about a dozen cousins who descend from Elijah Vannoy who have worked together to discover the identity of Elijah’s parents.  Elijah’s father is one of 4 Vannoy men, all sons of the same man, found in Wilkes County, NC. in the late 1700s.  Elijah Vannoy is 5 generations upstream from me.

What kind of evidence do we have?  In the paper genealogy world, I have ruled out one candidate via a Bible record, and probably a second via census and tax records, but we have little information about the third and fourth candidates – in spite of thoroughly perusing all existent records.  So, if we’re ever going to solve the mystery, short of that much-wished-for Vannoy Bible showing up on eBay, it’s going to have to be via genetic genealogy.

In addition to the dozen or so Vannoy cousins who have DNA tested, we found 6 individuals who descend from Sarah Hickerson’s parents, Charles Hickerson and Mary Lytle who match various Vannoy cousins.  Additionally, those cousins match another 21 individuals who carry the Hickerson or derivative surnames, but since we have not proven their Hickerson lineage on paper, I have not utilized any of those additional matches in this analysis.  Of those 26 total matches, at Family Tree DNA, one Hickerson individual matches 3 Vannoy cousins, nine Hickerson descendants match 2 Vannoy cousins and sixteen Hickerson descendants match 1 Vannoy cousin.

Our group of Vannoy cousins matching to the 6 Charles Hickerson/Mary Lytle descendants contains over 60 different clusters of matching DNA data across the 22 chromosomes.  Those 6 individuals are included in 43 different triangulated groups, proving the entire triangulation group shares a common ancestor.  And that is BEFORE we add any GedMatch information.

If that sounds like a lot, it’s not.  Another recent article found 31 clusters among siblings and their first cousin, so 60 clusters among a dozen known Vannoy cousins and half a dozen potential Hickerson cousins isn’t unusual at all.

To be very clear, Sarah Hickerson and Daniel Vannoy were not “declared” to be the parents of Elijah Vannoy, born in 1784, based on small segment matches alone.  Larger segment matches were involved, which is how we saw the matches in the first place.  Furthermore, the matches triangulated.  However, small segments certainly are involved and are more prevalent, of course, than large segments.  Some cousins are only connected by small segments.  Are they valid, and how do we tell?  Sometimes it’s all we have.

Let me give you the classic example of when small segments are needed.

We have four people.  Person A and B are known Vannoy cousins and person C and D are potential Hickerson cousins.  Potential means, in this case, potential cousins to the Vannoys.  The Hickersons already know they both descend from Charles Hickerson and Mary Lytle.

  • Person A matches person C on chromosome 1 over the matching threshold.
  • Person B matches person D on chromosome 2 over the matching threshold.

Both Vannoy cousins match Hickerson cousins, but not the same cousin and not on the same segments at the vendor.  If these were same segment matches, there would be no question because they would be triangulated, but they aren’t.

So, what do we do?  We don’t have access to see if person C and D match each other, and even if we did, they don’t match on the same segments where they match persons A and B, because if they did we’d see them as a match too when we view A and B.

If person A and B don’t match each other at the vendor, we’re flat out of luck and have to move this entire operation to GedMatch, assuming all 4 people have or are willing to download their data.

a and b nomatch

If person A and B match each other at the vendor, we can see their small segment data as compared to each other and to persons C and D, respectively, which then gives us the ability to see if A matches C on the same small segment as B matches D.

a and b match

If we are lucky, they will all show a common match on a small segment – meaning that A will match B on a small segment of chromosome 3, for example, and A will match C on that same segment.  In a perfect world, B will also match D on that same segment, and you will have 4 way triangulation – but I’m happy with the required 3 way match to triangulate.

This is exactly what happened in the article, Be Still My H(e)art.  As you can see, three people match on chromosomes 1 and 8, below – two of whom are proven cousins and the third was the wife surname candidate line.

Younger Hart 1-8

The example I showed of chromosome 2 in the Hickerson article was where all participants of the 5 individuals shown on the chromosome browser were matching to the Vannoy participant.  I thought it was a good visual example.  It was just one example of the 60+ clusters of cousin matches between the dozen Vannoy cousins and 6 Hickerson descendants.

This example was criticized by some because it was a small segment match.  I should probably have utilized chromosome 15 or searched for a better long segment example, but the point in my article was only to show how people that match stack up together on the chromosome browser – nothing more.   Here’s the entire chromosome, for clarity.

hickerson vannoy chr 2

Certainly, I don’t want to mislead anyone, including myself.  Furthermore, I dislike being publicly characterized as “wrong” and worse yet, labeled “irresponsible,” so I decided to delve into the depths of the data and work through several different examples to see if small segment data matching holds in various situations.  Let’s see what we found.

Chromosome 15

I selected chromosome 15 to work with because it is a region where a lot of Vannoy descendants match – and because it is a relatively large segment.  If the Hickersons do match the Vannoys, there’s a fairly good chance they might match on at least part of that segment.  In other words, it appears to be my best bet due to sheer size and the number of Elijah Vannoy’s descendants who carry this segment.  In addition to the 6 individuals above who matched on chromosome 15, here are an additional 4.  As you can see, chromosome 15 has a lot of potential.

Chrom 15 Vannoy

The spreadsheet below shows the sections of chromosome 15 where cousins match.  Green individuals in the Match column are descendants of Charles Hickerson and Mary Lytle, the parents of Sarah Hickerson.  The balance are Vannoys who match on chromosome 15.

chr 15 matches ftdna v4

As you can see, there are several segments that are quite large, shown in yellow, but there are also many that are under the threshold of 7cM, which are all  segments that would be deleted if you are deleting small segments.  Please also note that if you were deleting small segments, all of the Hickerson matches would be gone from chromosome 15.

Those of you with an eagle eye will already notice that we have two separate segments that have triangulated between the Vannoy cousins and the Hickerson descendants, noted in the left column by yellow and beige.  So really, we could stop right here, because we’ve proven the relationship, but there’s a lot more to learn, so let’s go on.

You Can’t Use What You Can’t See

I need to point something out at this point that is extremely important.

The only reason we see any segment data below the match threshold is because once you match someone on a larger segment at Family Tree DNA, over the threshold, you also get to view the small segment data down to 1cM for your match with that person. 

What this means is that if one person or two people match a Hickerson descendant, for example, you will see the small segment data for their individual matches, but not for anyone that doesn’t match the participant over the matching threshold.

What that means in the spreadsheet above, is that the only Hickerson that matches more than one Vannoy (on this segment) is Barbara – so we can see her segment data (down to 1cM ) as compared to Polly and Buster, but not to anyone else.

If we could see the smaller segment data of the other participants as compared to the Hickerson participants, even though they don’t match on a larger segment over the matching threshold, there could potentially be a lot of small segment data that would match – and therefore triangulate on this segment.

This is the perfect example of why I’ve suggested to Family Tree DNA that within projects or in individual situations, we be allowed to reduce the match threshold – especially when a specific family line match is suspected.

This is also one of the reasons why people turn to GedMatch, and we’ll do that as well.

What this means, relative to the spreadsheet is that it is, unfortunately, woefully incomplete – and it’s not apples to apples because in some cases we have data under the match threshold, and in some, we don’t.  So, matches DO count, but nonmatches where small segment data is not available do NOT count as a non-match, or as disproof.  It’s only negative proof IF you have the data AND it doesn’t match.
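The distinction between "doesn’t match" and "can’t see the data" can be sketched as a three-state value.  This is only an illustration of the reasoning above; the names `Evidence` and `classify` are made up for this example and do not come from any vendor tool.

```python
from enum import Enum
from typing import Optional


class Evidence(Enum):
    MATCH = "match"          # data is visible and the segments match
    NON_MATCH = "non-match"  # data is visible and the segments do not match
    UNKNOWN = "no data"      # small segment data is not visible at all


def classify(segments_match: Optional[bool]) -> Evidence:
    """Classify one comparison.

    Pass None when the pair is under the vendor match threshold, so
    their small segment data cannot be seen (hypothetical convention).
    """
    if segments_match is None:
        return Evidence.UNKNOWN
    return Evidence.MATCH if segments_match else Evidence.NON_MATCH
```

Only `Evidence.NON_MATCH` would count as negative evidence; `Evidence.UNKNOWN` says nothing either way.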

The Vannoys match and triangulate on many segments, so those are irrelevant to this discussion other than when they match to Hickerson DNA.  William (H), descends from two sons of Charles Hickerson and Mary Lytle.  Unfortunately, he only matches one Vannoy, so we can only see his small segments for that one Vannoy individual, William (V).  We don’t know what we are missing as compared to the rest of the Vannoy cousins.

To see William (H)’s and William (V)’s DNA as compared to the rest of the Vannoy cousins, we had to move to GedMatch.

Matching Options

Since we are working with segments that are proven to be Vannoy, and we are trying to prove/disprove if Daniel Vannoy and Sarah Hickerson are the parents of Elijah through multiple Hickerson matches, there are only a few matching options, which are:

  1. The Hickerson individuals will not triangulate with any of the Vannoy DNA, on chromosome 15 or on other chromosomes, meaning that Sarah Hickerson is probably not the mother of Elijah Vannoy, or the common ancestor is too far back in time to discern that match at vendor thresholds.
  2. The Hickerson individuals will not triangulate on this segment, but do triangulate on other segments, meaning that this segment came entirely from the Vannoy side of the family and not the Hickerson side of the family. Therefore, if chromosome 15 does not triangulate, we need to look at other chromosomes.
  3. The Hickerson individuals triangulate with the Vannoy individuals, confirming that Sarah Hickerson is the mother of Elijah Vannoy, or that there is a different common unknown ancestor someplace upstream of several Hickersons and Vannoys.

All of the Vannoy cousins descend from Elijah Vannoy and Lois McNiel, except one, William (V), who descends from the proven son of Sarah Hickerson and Daniel Vannoy, so he would be expected to match at least some Hickerson descendants.  The 6 Hickerson cousins descend from Charles Hickerson and Mary Lytle, Sarah’s parents.

hickerson vannoy pedigree

William (H), the Hickerson cousin who descends from David, brother to Sarah Hickerson, is descended through two of David Hickerson’s sons.

I decided to utilize the same segment “mapping comparison” technique with a spreadsheet that I utilized in the phasing article, because it’s easy to see and visualize.

I have created a matching spreadsheet and labeled the locations on the spreadsheet from 25-100 based on the beginning of the start location of the cluster of matches and the end location of the cluster.

Each individual being compared on the spreadsheet below has a column across the top.  On the chart below, all Hickerson individuals are to the right and are shown with their cells highlighted yellow in the top row.

Below, the entire colorized chart of chromosome 15 is shown, beginning with location 25 and ending with 100, in the left hand column, the area of the Vannoy overlap.  Remember, you can double click on the graphics to enlarge.  The columns in this spreadsheet are not fully expanded below, but they are in the individual examples.

entire chr 15 match ss v4

I am going to step through this spreadsheet, and point out several aspects.

First, I selected Buster as the individual to begin the comparison, because he was one of the closest to the common ancestor, Elijah Vannoy, genealogically, at 4 generations.  So he is the person at Family Tree DNA that everyone is initially compared against.

Everyone who matches Buster has their matching segments shown in blue.  Buster is shown furthest left.

When participants match someone other than Buster, who they match on that segment is typed into their column.  You can tell who Buster matches because their columns are blue on matching locations.  Here’s an example.

Me Buster match

You can see that in my column, it’s blue on all segments which means I match Buster on this entire region.  In addition, there are names of Carl, Dean, William Gedmatch and Billie Gedmatch typed into the cell in the first row which means at that location, in addition to Buster, I also match Carl and Dean at Family Tree DNA and William (descended from the son of Daniel Vannoy and Sarah Hickerson) at Gedmatch and Billie (a Hickerson) at Gedmatch.  Their name is typed into my column, and mine into theirs.  Please note that I did not run everyone against everyone at GedMatch.  I only needed enough data to prove the point and running many comparisons is a long, arduous process even when GedMatch isn’t experiencing problems.

On cells that aren’t colorized blue, the person doesn’t match Buster, but may still match other Vannoy cousin segments.  For example, Dean, below, matches Buster on location 25-29, along with some other cousins.  However, he does not match Buster on location 30 where he instead matches Harold and Carl who also don’t match Buster at that location. Harold, Carl and Dean do, however, all descend from the same son of Elijah so they may well be sharing DNA from a Vannoy wife at this location, especially since no one who doesn’t share that specific wife’s line matches those three at this location.

Me Buster Dean match

Remember, we are not working with random small data segments, but with a proven matching segment to a common Vannoy ancestor, with a group of descendants from a possible/probable Hickerson ancestor that we are trying to prove/disprove.  In other words, you would expect either a lot of Hickerson matches on the same segments, if Hickerson is indeed a Vannoy ancestral family, or virtually none of them to match, if not.

The next thing I’d like to point out is that these are small segments of people who also have larger matching segments, many of whom do triangulate on larger segments on other chromosomes.  What we are trying to discern is whether small segment matches can be utilized by employing the same matching criteria as large segment matching.  In other words, is small segment data valid and useful if it meets the criteria for an IBD match?

For example, let’s look at Daniel.  Daniel’s segments on chromosome 15, were it not for the fact that he matches on larger segments on other chromosomes, would not be shown as matches, because they are not individually over the match threshold.

Look at Daniel’s column for Polly and Warren.

Daniel matches 2

The segments in red show a triangulated group where Daniel and Warren, or Daniel, Warren and Polly match.  The segments where all 3 match are triangulated.

This proves, unquestionably, that small segments DO match utilizing the normal prescribed IBD matching criteria.  This spreadsheet, just for chromosome 15, is full of these examples.

Is there any reason to think that these triangulated matches are not identical by descent?  If they are not IBD, how do all of these people match the same DNA? Chance alone?  How would that be possible?  Two people, yes, maybe, but 3 or more?  In some cases, 5 or 6 on the same segment?  That is simply not possible, or we have disproven the entire foundation that autosomal DNA matching is based upon.

The question will soon be asked if small segments that triangulate can be useful when there are no larger matching segments to put the match over the initial vendor threshold.

Triangulated Groups

As you can see, most of the people and segments on the spreadsheet, certainly the Elijah descendants, are heavily triangulated, meaning that three or more people match each other on the same locations.  Most of this matching is over the vendor threshold at Family Tree DNA.
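For those who think in code, here is a minimal sketch of that definition: three people triangulate when all three pairwise matches share a common stretch of the same chromosome.  The `Segment` type, the function names and the positions are hypothetical and only for illustration; this is not how any vendor or GedMatch computes it.

```python
from typing import NamedTuple, Optional


class Segment(NamedTuple):
    chromosome: int
    start: int  # start position of the matching segment
    end: int    # end position of the matching segment


def overlap(a: Segment, b: Segment) -> Optional[Segment]:
    """Return the region two segments have in common, or None."""
    if a.chromosome != b.chromosome:
        return None
    start, end = max(a.start, b.start), min(a.end, b.end)
    return Segment(a.chromosome, start, end) if start < end else None


def triangulate(ab: Segment, ac: Segment, bc: Segment) -> Optional[Segment]:
    """Region shared by the A-B, A-C and B-C matches, or None."""
    first = overlap(ab, ac)
    return overlap(first, bc) if first is not None else None


# Made-up positions for three pairwise matches on chromosome 15
a_b = Segment(15, 25_000_000, 40_000_000)
a_c = Segment(15, 30_000_000, 45_000_000)
b_c = Segment(15, 28_000_000, 38_000_000)
shared = triangulate(a_b, a_c, b_c)  # region common to all three pairs
```

If any one of the three pairwise matches is missing or falls elsewhere, `triangulate` returns `None`, which is the case of two people matching a third on the same location without matching each other.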

You can see that Buster, Me, Dean, Carl and Harold all match each other on the same segments, on the left half of the spreadsheet where our names are in each other’s columns.

triangulated groups

Remember when I said that the spreadsheet was incomplete?  This is an example.  David and Warren don’t match each other at a high enough total of segments to get them over the matching threshold when compared to each other, so we can’t see their small segment data as compared to each other.  David matches Buster, but Warren doesn’t, so I can’t even see them both in relationship to a common match.  There are several people who fall into this category.

Let’s select one individual to use as an example.

I’ve chosen the Vannoy cousin, William(V), because his kit has been uploaded to Gedmatch, he has Vannoy matches and because William is proven to descend from Sarah Hickerson and Daniel Vannoy through their son Joel – so we expect some Hickerson DNA to match William(V).

If William (V) matches the Hickersons on the same DNA locations as he matches to Elijah’s descendants, then that proves that Elijah’s descendants’ DNA in that location is Hickerson DNA.

At GedMatch, I compared William(V) with me and then with Dean using a “one to one” comparison at a low threshold, simply because I wanted as much data as I could get.  Family Tree DNA allows for 1 cM and I did the same, allowing 100 SNPs at GedMatch.  Family Tree DNA’s lowest SNP threshold is 500.

In case you were wondering, even though I did lower the GedMatch threshold below the FTDNA minimum, there were 45 segments that were above 1cM and above 500 SNPs when matching me to William(V), which would have been above the lowest match threshold at FTDNA (assuming we were over the initial match threshold.)  In other words, had we not been below the original match threshold (20cM total, one segment over 7.7cM), these segments would have been included at FTDNA as small segments.  As you can see in the chart below, many triangulated.
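The two-step filter described above can be sketched as follows, using the numbers quoted in this article (20cM total with one segment over 7.7cM to be a match at all, then 1cM and 500 SNPs for the small segments).  How the vendor actually adds up the total and treats the exact boundary values is an assumption here, and the function names are hypothetical.

```python
from typing import List, NamedTuple


class SharedSegment(NamedTuple):
    centimorgans: float
    snps: int


# Values quoted in the article; boundary handling (>=) is an assumption
INITIAL_TOTAL_CM = 20.0
INITIAL_LONGEST_CM = 7.7
SMALL_MIN_CM = 1.0
SMALL_MIN_SNPS = 500


def passes_initial_threshold(segments: List[SharedSegment]) -> bool:
    """True if the pair would be shown as a match at all."""
    total = sum(s.centimorgans for s in segments)
    longest = max((s.centimorgans for s in segments), default=0.0)
    return total >= INITIAL_TOTAL_CM and longest >= INITIAL_LONGEST_CM


def visible_small_segments(segments: List[SharedSegment]) -> List[SharedSegment]:
    """Segments you could view; nothing if the pair is not a match."""
    if not passes_initial_threshold(segments):
        return []
    return [s for s in segments
            if s.centimorgans >= SMALL_MIN_CM and s.snps >= SMALL_MIN_SNPS]


# Made-up data: one pair over the threshold, one pair under it
over = [SharedSegment(8.0, 1200), SharedSegment(6.5, 900),
        SharedSegment(6.0, 800), SharedSegment(1.2, 520),
        SharedSegment(0.8, 600), SharedSegment(1.5, 300)]
under = [SharedSegment(6.0, 900), SharedSegment(5.0, 800),
         SharedSegment(4.0, 700)]
```

The `under` pair shares three perfectly respectable segments, yet none of them are visible because the pair never clears the initial threshold.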

I colorized the GedMatch matches, where there were no FTDNA matches, in dark red text.  This illustrates graphically just how much is missed when the small segments are ignored in cases with known or probable cousins.  In the green area, the entry that says “Me GedMatch” could not be colorized red (because you can’t colorize only part of the text of a cell) so I added the Gedmatch designation to differentiate between a match through FTDNA and one from GedMatch.  I did the same with all Gedmatch matches, whether colorized or not.

Let’s take a look and see how small segments from GedMatch affect our Hickerson matching.  Note that in the green area, William (V) matches William (H), the Hickerson descendant, and William (V) matches to me and Dean as well.  This triangulates William (V)’s Hickerson DNA and proves that Elijah’s descendants’ DNA includes proven Hickerson segments.

William (V) gedmatch matches v2

In this next example, I matched William (H), the Hickerson cousin (with no Vannoy heritage) against both Buster and me.

William (H) gedmatch me buster

Without Gedmatch data, only two segments of chromosome 15 are triangulated between Vannoy and Hickerson cousins, because we can’t see the small data segments of the rest of the cousins who don’t match over the threshold.

You can see here that nearly the entire chromosome is triangulated using small segments.  In the chart below, you can see both William(V) and William (H) as they match various Vannoy cousins.  Both triangulate with me.

William V and William H

I did the same thing with the Hickerson descendant, Billie, as compared to both me and Dean, with the same type of results.

The next question would be if chromosome 15 is a pileup area where I have a lot of IBS matches that are really population based matches.  It does not appear to be.  I have identified an area of my chromosomes that may be a pileup area, but chromosome 15 does not carry any of those characteristics.

So by utilizing the small segments at GedMatch for chromosome 15 that we can’t otherwise see, we can triangulate at least some of the Hickerson matches.  I can’t complete this chart, because several individuals have not uploaded to GedMatch.

Why would the Hickerson descendant match so many of the Vannoy segments on chromosome 15?  Because this is not a random sample.  This is a proven Vannoy segment and we are trying to see which parts of this segment are from a potential Hickerson mother or the Vannoy father.  If from the Hickerson mother, then this level of matching is not unexpected.  In fact, it would be expected.  Since we cheated and saw that chromosome 15 was already triangulated at Family Tree DNA, we already knew what to expect.

In the spreadsheet below, I’ve added the 2 GedMatch comparisons, William (V) to me and Dean, and William (H) to me and Buster.  You can see the segments that triangulate, on the left.  We could also build “triangulated groups,” like GedMatch does.  I started to do this, but then stopped because I realized most cells would be colored and you’d have a hard time seeing the individual triangulated segments.  I shifted to triangulating only the individuals who triangulate directly with the Hickerson descendant, William(H), shown in green.  GedMatch data is shown in red.

chr 15 with gedmatch

I would like to make three points.

1.  This still is not a complete spreadsheet where everyone is compared to everyone.  This was selectively compared for two known Hickerson cousins, William (V) who descends from both Vannoys and Hickersons and William (H) who descends only from Hickersons.

2. There are 25 individually triangulated segments to the Hickerson descendant on just this chromosome to the various Vannoy cousins.  That’s proof times 25 to just one Hickerson cousin.

3.  I would NEVER suggest that you select one set of small segments and base a decision on that alone.  This entire exercise has assembled cumulative evidence.  By the same token, if the rules for segment matching hold up under the worst circumstances, where we have an unknown but suspected relationship and the small segments appear to continue to follow the triangulation rules, they could be expected to remain true in much more favorable circumstances.

Might any of these people have random DNA matches that are truly IBS by chance on chromosome 15?  Of course, but the matching rules, just like for larger segments, eliminate them.  According to triangulation rules, if they are IBS by chance, they won’t triangulate.  If they do triangulate, that would confirm that they received the same DNA from a common ancestor.

If this is not true, and they did not receive their common DNA from a common ancestor, then it disproves the fundamental matching rule upon which all autosomal DNA genetic genealogy is based and we all need to throw in the towel and just go and do something else.

Is there some grey area someplace?  I would presume so,  but at this point, I don’t know how to discern or define it, if there is.  I’ve done three in-depth studies on three different families over the past 6 weeks or so, and I’ve yet to find an area (except for endogamous populations that have matches by population) where the guidelines are problematic.  Other researchers may certainly make different discoveries as they do the same kind of studies.  There is always more to be discovered, so we need to keep an open mind.

In this situation, it helps a lot that the Hickerson/Vannoy descendants match and triangulate on larger segments on other chromosomes.  This study was specifically to see if smaller segments would triangulate and obey the rules. We were fortunate to have such a large, apparently “sticky” segment of Vannoy DNA on chromosome 15 to work with.

Does small segment matching matter in most cases, especially when you have larger segments to utilize?  Probably not. Use the largest segments first.  But in some cases, like where you are trying to prove an ancestor who was born in the 1700s, you may desperately need that small segment data in order to triangulate between three people.

Why is this important – critically important?  Because if small segments obey all of the triangulation rules when larger segments are available to “prove” the match, then there is no reason that they couldn’t be utilized, using the same rules of IBD/IBS, when larger segments are not available.  We saw this in Just One Cousin as well.

However, in terms of proof of concept, I don’t know what better proof could possibly be offered, within the standard genetic genealogy proofs where IBD/IBS guidelines are utilized as described in the Phasing article.  Additional examples of small segment proof by triangulation are offered in Just One Cousin, Lazarus – Putting Humpty Dumpty Together Again, and in Demystifying Autosomal DNA Matching.

Raising Elijah Vannoy and Sarah Hickerson from the Dead

As I thought more about this situation, I realized that I was doing an awful lot of spreadsheet heavy lifting when a tool might already be available.  In fact, Israel’s mention of Lazarus made me wonder if there was a way to apply this tool to the situation at hand.

I decided to take a look at the Lazarus tool and here is what the intro said:

Generate ‘pseudo-DNA kits’ based on segments in common with your matches. These ‘pseudo-DNA kits’ can then be used as a surrogate for a common ancestor in other tests on this site. Segments are included for every combination where a match occurs between a kit in group1 and group2.

It’s obvious from further instructions that this is really meant for a parent or grandparent, but the technique should work just the same for more distant relatives.

I decided to try it first just with the descendants of Elijah Vannoy.  At first, I thought that recreated Elijah would include the following DNA:

  • DNA segments from Elijah Vannoy
  • DNA segments from Elijah Vannoy’s wife, Lois McNiel
  • DNA segments that match from Elijah’s descendants’ spouses’ lines when individuals come from the same descendant line. This means that if three people descend from Joel Vannoy and Phoebe Crumley, Elijah’s son and his wife, that they would match on some DNA from Phoebe, and that there was no way to subtract Phoebe’s DNA.

After working with the Lazarus tool, I realized this is not the case because Lazarus is designed to utilize a group of direct descendants and then compare the DNA of that group to a second group of known relatives, but not descendants.

In other words, if you have a grandson of a man, and his brother, the DNA shared by the brother and the grandson HAS to be the DNA contributed to that grandson by his grandfather, from their common ancestor, the great grandfather.  So, in our situation above, Phoebe’s DNA is excluded.

The chart below shows the inheritance path for Lazarus matching.

Lazarus inheritance

Because Lazarus is comparing the DNA of Son Doe with Brother Doe – that eliminates any DNA from the brother’s wives, Sarah Spoon or Mary – because those lines are not shared between Brother Doe and Son Doe.  The only shared ancestors that can contribute DNA to both are Father Doe and Methusaleh Fisher.

The Lazarus instructions allow you to enter the direct descendants of the person/couple that you are reconstructing, then a second set of instructions asks for remaining relatives not directly descended, like siblings, parents, cousins, etc. In other words, those that should share DNA through the common ancestor of the person you are recreating.

To recreate Elijah, I entered all of the Vannoy cousins and then entered William (V) as a sibling since he is the proven son of Daniel Vannoy and Sarah Hickerson.

Here is what Lazarus produced.

lazarus elijah 1

Lazarus includes segments of 4cM and 500 SNPs.

The first thing I thought was, “Holy Moly, what happened to chromosome 15?”  I went back and looked, and sure enough, while almost all of the Elijah descendants do match on chromosome 15, William (V), kit 156020, does not match above the Lazarus threshold I selected.  So chromosome 15 is not included.  Finding additional people who are known to be from this Vannoy line and adding them to the “nondescendant” group would probably result in a more complete Elijah.
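To see why chromosome 15 dropped out, it may help to sketch the rule from the Lazarus intro: a segment is kept only where a kit in group 1 (descendants) matches a kit in group 2 (non-descendant relatives) above the thresholds.  This is a rough sketch of that rule as described, not GedMatch’s actual code; the kit names, the positions and the merging of overlapping segments are all assumptions for illustration.

```python
from itertools import product
from typing import Dict, FrozenSet, List, Tuple

# One pairwise match: (chromosome, start, end, cM, SNPs)
Match = Tuple[int, int, int, float, int]


def pseudo_kit(group1: List[str], group2: List[str],
               matches: Dict[FrozenSet[str], List[Match]],
               min_cm: float = 4.0, min_snps: int = 500
               ) -> List[Tuple[int, int, int]]:
    """Collect group1-vs-group2 segments over the thresholds and merge overlaps."""
    kept = []
    for kit1, kit2 in product(group1, group2):
        for chrom, start, end, cm, snps in matches.get(frozenset((kit1, kit2)), []):
            if cm >= min_cm and snps >= min_snps:
                kept.append((chrom, start, end))
    kept.sort()
    merged: List[Tuple[int, int, int]] = []
    for chrom, start, end in kept:
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            # Overlaps the previous kept segment on the same chromosome
            merged[-1] = (chrom, merged[-1][1], max(merged[-1][2], end))
        else:
            merged.append((chrom, start, end))
    return merged


# Made-up kits and positions
matches = {
    frozenset(("desc1", "sib")): [(3, 100, 200, 5.0, 700), (4, 50, 80, 2.0, 900)],
    frozenset(("desc2", "sib")): [(3, 150, 260, 4.5, 650)],
    # A large match between two descendants is never consulted
    frozenset(("desc1", "desc2")): [(15, 10, 500, 30.0, 5000)],
}
result = pseudo_kit(["desc1", "desc2"], ["sib"], matches)
```

In this made-up data, the large chromosome 15 match is between two descendants only, so it never enters the pseudo-kit, and the chromosome 4 match is dropped for being under 4cM.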

lazarus elijah 2

Next, to recreate Sarah Hickerson, I added all of the Vannoy cousins plus William (V) as descendants of Sarah Hickerson and then I added just the one Hickerson descendant, William, as a sibling.  William’s ancestor is proven to be the sibling of Sarah.

I didn’t know quite what to expect.

Clearly if the DNA from the Hickerson descendant didn’t match or triangulate with DNA from any of the Vannoy cousins at this higher level, then Sarah Hickerson wasn’t likely Elijah’s mother.  I wanted to see matching, but more, I wanted to see triangulation.

lazarus elijah 3

I was stunned.  Every kit except two had matches, some of significant size.

lazarus elijah 4

lazarus elijah 5 v2

Please note that locations on chromosomes 3, 4 and 13, above, are triangulated in addition to matching between two individuals, which constitutes proof of a common ancestor.  Please also note that if you were throwing away segments below 7cM, you would lose all of the triangulated matches and all but two matches altogether.

Clearly, comparing the Vannoy DNA with the Hickerson DNA produced a significant number of matches including three triangulated segments.

lazarus elijah 6

Where Are We?

I never have, and I never would recommend attempting to utilize random small match segments out of context.  By out of context, I mean simply looking at all of your 1cM segments and suggesting that they are all relevant to your genealogy.  Nope, never have.  Never would.

There is no question that many small segments are IBS by chance or identical by population.  Furthermore, working with small segments in endogamous populations may not be fruitful.

Those are the caveats.  Small segments in the right circumstances are useful.  And we’ve seen several examples of the right circumstances.

Over the past few weeks, we have identified guidelines and tools to work with small segments, and they are the same tools and guidelines we utilize to work with larger segments as well.  The difference is size.  When working with large segments, the fact that they are large serves as a filter for us and we don’t question their authenticity.  With all small segments, we must do the matching and analysis work to prove validity.  Probably not worthwhile if you have larger segments for the same group of people.

Working with the Vannoy data on chromosome 15 is not random, nor is the family from an endogamous population.  That segment was proven to be Vannoy prior to attempts to confirm or disprove the Hickerson connection.  And we’ve gone beyond just matching, we’ve proven the ancestral link by triangulation, including small segments.  We’ve now proven the Hickerson connection about 7 ways to Sunday.  Ok, maybe 7 is an exaggeration, but here is the evidence summed up for the Vannoy/Hickerson study from multiple vendors and tools:

  • Ancestry DNA Circle indicating that multiple Hickerson descendants match me and some that don’t match me, match each other. Not proof, but certainly suggestive of a common ancestor.
  • A total of 26 Hickerson or derivative family name matches to Vannoy cousins at Family Tree DNA. Not proof, but again, very suggestive.
  • 6 Charles Hickerson/Mary Lytle descendants match to Vannoy cousins at Family Tree DNA. Extremely suggestive, needs triangulation.
  • Triangulation of segments between Vannoy and Hickerson cousins at Family Tree DNA. Proof, but in this study we were only looking to determine whether small segment matches constituted proof.
  • Triangulation of multiple Hickerson/Vannoy cousins on chromosome 15 at GedMatch utilizing small segments and one to one matching. More proof.
  • Lazarus, at higher thresholds than the triangulation matching, when creating Sarah Hickerson, still matched 19 segments and triangulated three for a total of 73.2cM when comparing the Hickerson descendant against the Vannoy cousins. Further proof.

So, can small segment matching data be useful? Is there any reason NOT to accept this evidence as valid?

With proper usage, small segment data certainly looks to provide value by judiciously applying exactly the same rules that apply to all DNA matching.  The difference of course being that you don’t really have to think about utilizing those tools with large segment matches.  It’s pretty well a given that a 20cM match is valid, but you can never assume anything about those small segment matches without supporting evidence. So are larger segments easier to use?  Absolutely.

Does that automatically make small segments invalid?  Absolutely not.

In some cases, especially when attempting to break down brick walls more than 5 or 6 generations in the past, small segment data may be all we have available.  We must use it effectively.  How small is too small?  I don’t know.  It appears that size is really not a factor if you strictly adhere to the IBD/IBS guidelines, but at some point, I would think the segments would be so small that just about everyone would match everyone because we are all humans – so the ultimate identical by population scenario.

Segments that don’t match an individual and either or both parents, assuming you have both parents to test, can safely be disregarded unless they are large and then a look at the raw data is in order to see if there is a problem in that area.  These are IBS by chance.  IBS segments by chance also won’t triangulate further up the tree.  They can’t, because they don’t match your parents so they cannot come from an ancestor.  If they don’t come from an ancestor, they can’t possibly match two other people whose DNA comes from that ancestor on that segment.
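As a rough sketch of that rule of thumb, assuming both parents have tested: the article does not put a number on "large," so the `large_cm` default of 7.0 below is purely a placeholder, and the function name is hypothetical.

```python
def review_segment(matches_father: bool, matches_mother: bool,
                   length_cm: float, large_cm: float = 7.0) -> str:
    """Decide what to do with a segment you share with a match.

    matches_father / matches_mother: whether that same match also
    matches the parent on this segment. Assumes both parents tested.
    large_cm is a placeholder value, not a figure from the article.
    """
    if matches_father or matches_mother:
        return "keep"            # came through a parent, so map it
    if length_cm >= large_cm:
        return "check raw data"  # too big to dismiss without a look
    return "disregard"           # IBS by chance
```

For example, `review_segment(False, False, 2.3)` returns `"disregard"`, while the same segment matching either parent returns `"keep"`.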

If both parents aren’t available, or your small segments do match with your parents, I would suggest that you retain your small segments and map them.

You can’t recognize patterns if the data isn’t present and you won’t be able to find that proverbial needle in the haystack that we are all looking for.

Based on what we’ve seen in multiple case studies, I would conclude that small segment data is certainly valid and can play a valid role in a situation where there is a known or suspected relationship.

I would agree that attempting to utilize small segment data outside the context of a larger data match is not optimal, at least not today, although I wish the vendors would provide a way for us to selectively lower our thresholds.  A larger segment match can point the way to smaller segment matches between multiple people that can be triangulated.  In some situations, like the person A, B, C, D Hickerson-Vannoy situation I described earlier in this article, I would like to be able to drop the match threshold to reveal the small segment data when other matches are suggestive of a family relationship.

In the Hickerson situation, having the ability to drop the matching thresholds would have been the key to positively confirming this relationship within the vendor’s database and not having to utilize third party tools like GedMatch – which require the cooperation of all parties involved to download their raw data files.  Not everyone transferred their data to GedMatch in my Vannoy group, but enough did that we were able to do what we needed to do.  That isn’t always the case.  In fact, I have a nearly identical situation in another line but my two matches at Ancestry have declined to download their data to GedMatch.

This is not the first time that small segment data has played a successful role in finding genealogy solutions, or confirming what we thought we knew – although in all cases to date, larger segments matched as well – and those larger segment matches were key and what pointed me to the potential match that ultimately involved the usage of the small segments for triangulation.

Using larger data segments as pointers probably won’t be the case forever, especially if we can gain confidence that we can reliably utilize small segments, at least in certain situations.  Specifically, a small segment match may be nothing, but a small segment triangulated match in the context of a genealogical situation seems to abide by all of the genetic genealogy DNA rules.

In fact, a situation just arose in the past couple weeks that does not include larger segments matching at a vendor.

Let’s close this article by discussing this recent scenario.

The Adoptee

An adoptee approached me with matching data from GedMatch which included matches to me, Dean, Carl and Harold on chromosome 15, on segments that overlap, as follows.

adoptee chr 15

On the spreadsheet above, sent to me by the adoptee, we can see some matches but not all matches. I ran the balance of these 4 people at GedMatch and below is the matching chart for the segment of chromosome 15 where the adoptee matches the 4 Vannoy cousins plus William(H), the Hickerson cousin.

              Me         Carl       Dean       Harold     Adoptee
Me            NA         FTDNA      FTDNA      GedMatch   GedMatch
Carl          FTDNA      NA         FTDNA      FTDNA      GedMatch
Dean          FTDNA      FTDNA      NA         FTDNA      GedMatch
Harold        GedMatch   FTDNA      FTDNA      NA         GedMatch
Adoptee       GedMatch   GedMatch   GedMatch   GedMatch   NA
William (H)   GedMatch   GedMatch   GedMatch   GedMatch   GedMatch

I decided to take the easy route and just utilize Lazarus again, so I added all of the known Vannoy and Hickerson cousins I utilized in earlier Lazarus calculations at GedMatch as siblings to our adoptee.  This means that each kit will be compared to the adoptee’s DNA and matching segments will be reported.  At a threshold of 300 SNPs and 4cM, our adoptee matches at 140cM of common DNA between the various cousins.

adoptee vannoy match

Please note that in addition to matching several of the cousins, our adoptee also triangulates on chromosomes 1, 11, 15, 18, 19 and 21.  The triangulation on chromosome 21 is to two proven Hickerson descendants, so he matches on this line as well.

I reduced the threshold to 4cM and 200 SNPs to see what kind of difference that would make.

adoptee vannoy match low threshold

Our adoptee picked up another triangulation on chromosome 1 and added additional cousins in the chromosome 15 “sticky Vannoy” cluster and the chromosome 18 cluster.

Given what we just showed about chromosome 15, and the discussions about IBD and IBS guidelines and small matching segments, what conclusions would you draw and what would you do?

  1. Tell the adoptee this is invalid because there are no qualifying large match segments that match at the vendors.
  2. Tell the adoptee to throw all of those small segments away, or at least all of the ones below 7cM because they are only small matching segments and utilizing small matching segments is only a folly and the adoptee is only seeing what he wants to see – even though the Vannoy cousins with whom he triangulates are proven, triangulated cousins.
  3. Check to see if the adoptee also matches the other cousins involved, although he clearly already exceeds the triangulation criteria to declare a common ancestor of 3 proven cousins on a matching segment. This is actually what I did utilizing Lazarus and you just saw the outcome.

If this is a valid match, based on who he does and doesn’t match in terms of the rest of the family, you could very well narrow his line substantially – perhaps by utilizing the various Vannoy wives’ DNA, to an ancestral couple.  Given that our adoptee matches both the Vannoys and the Hickersons, I suspect he is somehow descended from Daniel Vannoy and Sarah Hickerson.

In Conclusion

What is the acceptable level to utilize small segments in a known or suspected match situation?

Rather than look for a magic threshold number, we are much better served to look at reliable methods to determine the difference between DNA passed from our ancestors to us, IBD, and matches by chance.  This helps us to establish the reliability of DNA segments in individual situations we are likely to encounter in our genealogy.  In other words, rather than throw the entire pile of wheat away because there is some percentage of chaff in the wheat, let’s figure out how to sort the wheat from the chaff.

Fortunately, both parental phasing and triangulation eliminate the identical by chance segments.

Clearly, the smaller the segments, even in a known match situation, the more likely they are identical by population, given that they triangulate.  In fact, this is exactly how the Neanderthal and Denisovan genomes have been reconstructed.

Furthermore, given that the Anzick DNA sample is over 12,000 years old, Identical by population must be how Anzick is matching to contemporary humans, because at least some of these people do clearly share a common ancestor with Anzick at some point, long ago – more than 12,000 years ago.  In my case, at least some of the Anzick segments triangulate with my mother’s DNA, so they are not IBS by chance.  That only leaves identical by population or identical by descent, meaning within a genealogical timeframe, and we know that isn’t possible.

There are yet other situations where small segment matches are not IBS by chance nor identical by population.  For example, I have a very hard time believing that the adoptee situation is nothing but chance.  It’s not a folly.  It’s identical by descent as proven by triangulation with 10 different cousins – all on segments below the vendor matching thresholds.

In fact, it’s impossible to match the Vannoy cousins, who are already triangulated individually, by chance.  While the adoptee match is not over the vendor threshold, the segments are not terribly small and they do all triangulate with multiple individuals who also triangulate with larger segments, at the vendors and on different chromosomes.

This adoptee triangulated match, even without the Hickerson-Vannoy study, disproves the blanket statement that small segments below 5cM cannot be used for genealogy.  All of these segments are 7.1cM or below and most are below 5.

This small segment match between my mother and her first cousins also disproves that segments under 5cM can never be used for genealogy.

Two cousins combined

This small segment passed from my mother to me disproves that statement too – clearly matching with our cousin, Cheryl.  If I did not receive this from my mother, and she from her parent, then how do we match a common cousin???

me mother small seg

More small segment proof, below, between my mother and her second cousin when Lazarus was reconstructing my mother’s father.

2nd cousin lazarus match

And this Vannoy Hickerson 4 cousin triangulated segment also disproves that 5cM and below cannot be used for genealogy.

vannoy hickerson triang

Where did these small segments come from if not a common ancestor, either one or several generations ago?  If you look at the small segment I inherited from my mother and say, “well, of course that’s valid, you got it from your mother” then the same logic has to apply that she inherited it from her parent.  The same logic then applies that the same small segment, when shared by my mother’s cousin, also came from their common grandparents.  One cannot be true without the others being true.  It’s the same DNA. I got it from my mother.  And it’s only a 1.46cM segment, shown in the examples above.

Here are my observations and conclusions:

  • As proven with hundreds of examples in this and other articles cited, small segments can be and are inherited from our ancestors and can be utilized for genetic genealogy.
  • There is no line in the sand at 7cM or 5cM at which a segment is viable and useful at 5.1cM and not at 4.9cM.
  • All small segment matches need to be evaluated utilizing the guidelines for IBD versus IBS by chance versus identical by population set forth in the articles titled How Phasing Works and Determining IBD Versus IBS Matches and Demystifying Autosomal DNA Matching.
  • When given a choice, large segment matches are always easier to use because they are seldom IBS by chance and most often IBD.
  • Small segment matches are more likely to be IBS by chance than larger matches, which is why we need to judiciously apply the IBD/IBS Guidelines when attempting to utilize small segment matches.
  • All DNA matches, not just small segments, must be triangulated to prove a common ancestor, unless they are known close relatives, like siblings, first cousins, etc.
  • When working in genetic genealogy, always glean the information from larger matches and assemble that information.  However, when the time comes that you need those small segments because you are working 5, 6 or 7 generations back in time, remember that tools and guidelines exist to use small segments reliably.
  • Do not attempt to use small segments out of context.  This means that if you were to look only at your 1cM matches to unknown people, and you have the ability to triangulate against your parents, most would prove to be IBS by chance.  This is the basis of the argument for why some people delete their small segments.  However, by utilizing parental phasing, phasing against known family members (like uncles, aunts and first cousins) and triangulation, you can identify and salvage the useable small segments – and these segments may be the only remnants of your ancestors more than 5 or 6 generations back that you’ll ever have to work with.  You do not have to throw all of them away simply because some or many small segments, out of context, are IBS by chance.  It doesn’t hurt anything to leave them just sit in your spreadsheet untouched until the day that you need them.

Ultimately, the decision is yours whether you will use small segments or not – and either decision is fine.  However, don’t make the decision based on the belief that small segments under some magic number, like 5cM or 7cM are universally useless.  They aren’t.

Whether small segments are too much work and effort in your individual situation depends on your personal goals for genetic genealogy and on factors like whether or not you descend from an endogamous population.  People’s individual goals and circumstances vary widely.  Some people test at Ancestry and are happy with inferential matching circles and nothing more.  Some people want to wring every tidbit possible out of genealogy, genetic or otherwise.

I hope everyone will begin to look at how they can use small segment data reliably instead of simply discarding all the small segments on the premise that all small segment data is useless because some small segments are not useful.  All unstudied and discarded data is indeed useless, so discarding becomes a self-fulfilling prophecy.

But by far, the worst outcome of throwing perfectly good data away is that you’ll never know what genetic secrets it held for you about your ancestors.  Maybe the DNA of your own Sarah Hickerson is lurking there, just waiting for the right circumstances to be found.

Demystifying Autosomal DNA Matching

dna word cluster4

What, exactly, is an autosomal DNA match?

Answer:  It’s Relative

I’m sorry, I just had to say that.

But truthfully, it is.

I know this sounds like a very basic question, and it is, but the answer sometimes isn’t as straightforward as we would like for it to be.

Plus, there are differences in quality of matches and types of matches.  If you want to sigh right about now, it’s OK.

We’ve talked a lot about matching in various recent articles.  I have several people who follow this blog religiously, and who would rather read this than, say, do dishes (who wouldn’t).  One of our regulars recently asked me the question, “what, exactly, is a match and how do I tell?”

Darned good question and I wish someone had explained this to me so I wouldn’t have had to figure it out.

In the computer industry, where I spent many years, we have what we call flow charts or Warnier diagrams, which in essence are logic paths that lead to specific results or outcomes depending on the answers at different junctions.

flow chart

I had a really hard time deciding whether to use the beer decision-making flow chart or the procrastinator flow chart, but the procrastinator flow chart was just one big endless loop, so I decided on the beer.

What I’m going to do is to step you through the logic path of finding and evaluating a match, determining whether it’s valid, identical by descent or chance, when possible, and how to work with your matches and what they mean.

Let me also say that while I use and prefer Family Tree DNA, these matching techniques are universal and apply to results from 23andMe as well, but not for Ancestry who gives you no browser or tools to compare your DNA to anyone else.  So, you can’t compare your results at Ancestry.

Comparing DNA results is the lynchpin of genetic genealogy.  You’re dead in the water without it.  If you have tested at Ancestry, you can always transfer your results to Family Tree DNA, where you do have tools, and to GedMatch as well.  You’re always better, in terms of genealogy, to fish in as many ponds as possible.

Before we talk about how to work with matches, for those who need to figure out how to find matches at Family Tree DNA and 23andMe, I wrote about that in the Chromosome Browser War article.  This article focuses on working with matching DNA after you have found that you are a match to someone – and what those matches might mean.

Matching Thresholds

All autosomal DNA vendors have matching thresholds.  People who meet or exceed those thresholds will be shown on your match list.  People who do not meet the initial threshold will not be considered as a match to you, and therefore will not be on your match list.

Currently, at Family Tree DNA, the threshold to be shown as a match is about 20cM of total matching DNA and a single segment of about 7.7cM with 500 SNPs or over. The word “about” is in there because there is some fuzziness in the rules based on certain situations.

After you meet those criteria and you are shown as a match to an individual, when you download your matching data, your matches to them on each chromosome will be shown down to the 1cM and 500 SNP level.
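
To make the two-part rule concrete, here is a minimal sketch in Python. It assumes the approximate numbers quoted above (remember the word “about”), and the function name, the constants and the two sample people are invented for illustration.  This is not the vendor’s actual algorithm.

```python
# Approximate Family Tree DNA thresholds as described in the text.
# The text says "about", so treat these as rough values, not exact vendor rules.
MIN_TOTAL_CM = 20.0
MIN_LONGEST_CM = 7.7
MIN_LONGEST_SNPS = 500


def is_listed_match(segments):
    """segments: list of (cM, SNP count) tuples shared with one person."""
    total_cm = sum(cm for cm, _ in segments)
    has_qualifying_segment = any(
        cm >= MIN_LONGEST_CM and snps >= MIN_LONGEST_SNPS for cm, snps in segments
    )
    return total_cm >= MIN_TOTAL_CM and has_qualifying_segment


# Hypothetical people, invented for illustration only.
cousin = [(9.2, 2100), (6.0, 1500), (5.5, 1200)]      # one segment over 7.7 cM
small_only = [(6.9, 1800), (6.5, 1600), (6.8, 1700)]  # over 20 cM total, no segment over 7.7 cM

print(is_listed_match(cousin))      # True
print(is_listed_match(small_only))  # False
```

The second person shares more than 20cM in total but never appears on the match list, because no single segment reaches the first-segment threshold.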

At 23andMe, the threshold is 7cMs/700 SNPs for the first segment.  However, 23andMe has an upper limit of people who can match you at about 1000 matches.  This can be increased by the number of people you are communicating or sharing with.  However, your smallest matches will be dropped from your list when you hit your threshold.  This means that it’s very likely that at least some of your matches are not showing if you have in excess of 1000 matches total.  This means that your personal effective cM/SNP match threshold at 23andMe may be much higher.
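
Here is a rough Python sketch of how a cap on the match list raises your personal threshold.  It assumes matches are ranked by the size of their largest segment and that exactly 1000 are kept; the text only says the smallest matches are dropped at about 1000, so both of those details, along with the function name and the sample data, are assumptions for illustration.

```python
MATCH_CAP = 1000          # approximate cap described in the text
BASE_THRESHOLD_CM = 7.0   # first-segment threshold described in the text


def effective_threshold(largest_segment_cms, cap=MATCH_CAP):
    """Return the smallest largest-segment size that survives the cap.

    Assumes matches are ranked by largest segment, which is a simplification.
    """
    qualifying = sorted(
        (cm for cm in largest_segment_cms if cm >= BASE_THRESHOLD_CM), reverse=True
    )
    kept = qualifying[:cap]
    return kept[-1] if kept else None


# Hypothetical tester with 1500 people over the base threshold.
sizes = [7.0 + i / 100 for i in range(1500)]
print(effective_threshold(sizes))  # 12.0
```

In this invented example, 500 people who meet the 7cM rule never show up, and the tester’s effective threshold is 12cM instead of 7cM.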

Step 1 – Downloading Your Matching Segments

For this comparison, I’m starting with two fresh files from Family Tree DNA, one file of my own matches and one of my mother’s matches.  My mother died before autosomal DNA testing was available, so her results are only at Family Tree DNA (and now downloaded to GedMatch,) because her DNA was archived there.  Thank you Family Tree DNA, 100,000 times thank you!!!

At Family Tree DNA, the option to download all matches with segment information is on the chromosome browser tab, at the top, at the right, shown below.

ftdna download button

If you have your parents’ DNA available to test and it hasn’t been tested, order a kit for them today.  If either or both parents have been tested, download their results into the same spreadsheet with yours and color code them in a way you will understand.

In my case, I only have my mother’s results, and I color coded my matches pink, because I’m the daughter.  However, if I had both parents, I might have color coded Mother pink and Dad blue.

Whatever color coding you do, it’s forever in your master spreadsheet, so make a note of what it is.  In my case, it’s part of the match column header.  Why is it in my column header?  Because I screwed up once and reversed them in a download.

Step 2 – Preparing and Sorting Your Spreadsheet

In my master DNA spreadsheet, I have the following columns:

dna master header

The green cell matches are matches to me from 23andMe.  My cousin, Cheryl also tested at 23andMe before autosomal testing was offered at Family Tree DNA.

The Source column, in my spreadsheet, means any source other than FTDNA.  The Ignore column is an extraneous number generated at one time by downloads.  I could delete that column now.

The “Side” column is which side the match is from, Mom or Dad.  Mom’s I can identify easily, because I have her DNA to compare to.  I don’t identify a match as Dad’s without having identified an ancestral line, because I don’t have his DNA to compare to.

And no, you can’t just assume that if it doesn’t match Mom, it’s an automatic match to Dad because you may have some IBS, identical by chance, matches.

The Common Ancestors/Comments column is just that.  I include things like when I e-mailed someone, if the match is triangulated and if so, with whom, etc.

In my master spreadsheet, the first “name” column (of who tested) is deleted, but I’ve left it in the working spreadsheet (below) with my mother for illustration purposes.  That way, neither of us has to remember who is pink!

Step 3 – Reviewing IBD and IBS Guidelines

If you need a refresher on phasing, IBD (identical by descent), or IBS (which can mean either identical by chance or identical by population), this would be a good time to read or reread the article titled How Phasing Works and Determining IBD Versus IBS Matches.

Let’s briefly review the IBD vs IBS guidelines, because we’ll be applying them in this article.

Identical by Chance – Can be determined if an individual you match does not match to one of your parents, if parents are available.  If parents are not available for matching, IBS by chance segments won’t triangulate with other known genealogical matches on a common segment.

Identical by Descent – Can be suggested if a common ancestor (or ancestral line) can be determined between any two people who are not known relatives. If the two people are known close relatives, and their DNA matches, identical by descent is proven.  IBD can be proven with previously unknown family or genealogical matches when any three people descending from that same ancestor or ancestral line all match each other on the same segment of DNA.  Three way matching is called triangulation.

Identical by Population – Can be determined when multiple people triangulate with you on a specific segment of DNA, but the triangulated groups are from proven different lineages and are not otherwise related.  This is generally found in smaller segments from similar regions of the world.  Identical by population is identical by descent, but the ancestors are so far back in time that they cannot be determined and may contribute the same DNA to multiple lineages.  This is particularly evident in Jewish genealogy and other endogamous groups.

Step 4 – Determining Parental Side and IBS by Chance

The first thing to do, if you have either or both parents, is to determine whether your matches phase to your parents or are IBS by chance.

In this context, phasing means determining whether a particular match is to your father’s side of the family or to your mother’s side of the family.

Remember, at every address in your DNA, you will have two valid matches to different lines, one from your mother and one from your father.  The address on your DNA consists of the chromosome number which equates to the street name, and then the start and end locations, which consists of a range of addresses on that street.  Think of it as the length of your property on the street.

First, let’s look at my situation with only my mother’s DNA for comparison.

It’s easy to tell one of three things.

  1. Do mother and I both match the person? If so, that means that DNA match is from mother’s side of the family. Mark it as such. They are green, below.
  2. If the individual does not match me and mother, both, and only matches me, then the match is either on my father’s side or it’s IBS by chance. Those matches are blue below. Because I don’t have my father’s DNA, I can’t tell any more at this step.
  3. Notice the matches that are Mom’s but not to me. That means that I did not receive that DNA from Mom, or I received a small part, but it’s not over the lowest matching threshold at Family Tree DNA of 1cM and 500 SNPs.
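
The three outcomes above boil down to comparing two match lists.  Here is a minimal Python sketch that works at the level of names only; in the spreadsheet the same comparison is made segment by segment.  The function name and “Person X” and “Person Y” are hypothetical.

```python
def sort_matches(my_matches, mothers_matches):
    """Split match names into the three groups described in the list above."""
    mine, moms = set(my_matches), set(mothers_matches)
    return {
        "mother's side": sorted(mine & moms),
        "father's side or IBS by chance": sorted(mine - moms),
        "mother only (not inherited by me)": sorted(moms - mine),
    }


# Person X and Person Y are invented for illustration.
my_matches = ["Cheryl", "Don", "Person X"]
mothers_matches = ["Cheryl", "Don", "Person Y"]

result = sort_matches(my_matches, mothers_matches)
for group, names in result.items():
    print(group, names)
```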

match mom

In this next scenario, you can see that mother and I both match the same individual, but not on all segments.  I selected this particular match between me, my mother and Alfred because it has some “problems” to work through.

match mom2

The segments shown in green above are segments that Mom carries that I don’t.  This means that I didn’t receive them from mother.  This also means they could be legitimate matches to Alfred, or they could be IBS by chance.  I can’t tell anything more about them at this point, so I’ve just noted what they are.  I usually mark these as “mother only” in my master spreadsheet.

match mom3

The first of the two green rows above shows a match but it’s a little unusual.  My segment is larger than my mother’s.  This means that one of five things has happened.

  1. Part of this segment is a valid match.  At the end, where we don’t match, the match extends IBS by chance a bit, in my case, when matching Alfred. The valid match portion would end where my mother’s segment ends, at 16,100,293.
  2. There is a read error in one of the files.
  3. The boundary locations are fuzzy, meaning vendor calculations like ‘healing’ for no calls, etc.
  4. I also match to my father’s line.
  5. Recombination has occurred, especially possible in an endogamous population, reconnecting identical by population segments between me and Alfred at the end of the segment where I don’t match my mother’s segment, so from 16,100,293 to 16,250,884.

Given that this is a small segment, the most likely scenario would be the first, that this is partly valid and partly IBS by chance.  I just make the note by that row.

The second green segment above isn’t an exact match, but if my segment “fits within” the boundaries of my mother’s segments, then we know I inherited the entire segment from her.  Once again, my boundaries are off a bit from hers, but this time it’s the beginning.  The same criteria apply as in 1-5, above.
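
The boundary comparison can be written down as a small check.  This is only a sketch in Python, assuming each segment is a (start, end) pair on the same chromosome and that both segments match the same person.  The two end locations come from the example above; the start location of 14,000,000 is invented because it isn’t given in the text.

```python
def compare_to_parent(child, parent):
    """Compare my segment to my parent's segment against the same match."""
    c_start, c_end = child
    p_start, p_end = parent
    if c_end < p_start or c_start > p_end:
        return "no overlap"
    if (c_start, c_end) == (p_start, p_end):
        return "exact"
    if c_start >= p_start and c_end <= p_end:
        return "fits within parent"
    return "extends beyond parent"


# Start location is hypothetical; end locations are from the Alfred example.
mine = (14_000_000, 16_250_884)
moms = (14_000_000, 16_100_293)

print(compare_to_parent(mine, moms))  # extends beyond parent
print(compare_to_parent(moms, mine))  # fits within parent
```

A result of “extends beyond parent” doesn’t say which of the five explanations applies.  It only flags the row as one that needs a note.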

match mom4

The green segments above are where I match Alfred, but my mother does not.  This means that these segments are either IBS by chance or that they will match my father.  I don’t know which, so I simply label them.  Given that they are all small segments, they are likely IBS by chance, but we don’t know that.  If we had my father’s DNA, we would be able to phase against him, too, but we don’t.

Now, if I were to leave this discussion here, you might have the impression that all small segment matches have problems, but they don’t.  In fact, here’s a much more normal “real life” situation where mother and I are both matching to our cousin, Cheryl, Mom’s first cousin.  These matches include both large and small segments.  Let’s take a look and see what we can tell about our matches.

match mom complete

Roberta and Barbara have a total of 83 DNA matches to Cheryl.

Some matches will be where Barbara matches Cheryl and Roberta doesn’t.  That’s normal, Barbara is Roberta’s mother and Roberta only inherits half of Barbara’s DNA.  These rows where only Barbara, the mother, matches Cheryl are not colorized in the Start, End, cM and SNP columns, so they show as white.

Some matches will be exact matches.  That too is normal.  In some cases, Barbara passes all of a particular segment of DNA to Roberta.  These matches are colored purple.

Some of these matches are partial matches where Roberta inherited part of the segment of DNA from Barbara.  These are colored green. There are two additional columns at right where the percentage of DNA that Roberta inherited from Barbara on these segments is calculated, both for cM and SNPs.

Some of the matches are where Roberta matches Cheryl and Barbara doesn’t.  Cheryl is not known to be related to Roberta on her father’s side, so assuming that statement is correct, these matches would be IBS, identical by state, meaning identical by chance, and can be disregarded as legitimate matches.  These are colored rust.  Note that most of these are small segments, but one segment is 8.8cM and 2197 SNPs.  In this case, if this segment becomes important for any reason, I would be inclined to look at the raw data file of Barbara to see if there were no calls or a problem with reads in this region that would prevent an otherwise legitimate match.

Let’s look at how these matches stack up.

Category                       Number   Percent (rounded)   Comment
Exact Matches                  26       31                  100% of the DNA
Barbara Only                   20       24                  0% of the DNA
Partial Matches                29       35                  11-98% of the actual DNA matches
Roberta Only (IBS by chance)   7        8                   Not a valid match

I think it’s interesting to note that while, on the average, 50% of the DNA of any segment is passed to the child, in actuality, in this example of partial inheritance, meaning the green rows, inheritance was never actually 50%.  In fact, the SNP and cM percentages inherited for the same segment varied, and the actual amounts ranged from 11-98% of the DNA of the parent being inherited by the child.  The average of these events was 54.57143 (cM) and 54.21429 (SNPs) however.

On top of that, in 13 (26 rows) instances, Roberta inherited all of Barbara’s DNA in that sequence, and in 20 cases, Roberta inherited none of Barbara’s DNA in that sequence.

This illustrates that while the average of something may be 50%, none of the actual individual values may be 50% and the values themselves may include the entire range of possibilities.  In this case, 11-98% were the actual percentage ranges for partial matches.
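
For anyone who wants to check the arithmetic, here is a short Python sketch.  The counts come from the table above and the total of 83 comes from the text.  The individual partial percentages are invented (only the 11 and 98 endpoints are from the article), simply to show an average near 50 where no single value is 50.

```python
import statistics

# Counts from the table above.
counts = {
    "Exact Matches": 26,
    "Barbara Only": 20,
    "Partial Matches": 29,
    "Roberta Only (IBS by chance)": 7,
}

# The text gives 83 total matches. The four counts listed add up to 82,
# but the rounded percentages in the table agree with dividing by 83.
TOTAL_ROWS = 83
percents = {label: round(100 * n / TOTAL_ROWS) for label, n in counts.items()}
print(percents)

# Hypothetical partial-inheritance percentages (only 11 and 98 are from the text).
partial_percents = [11, 98, 40, 67]
average = statistics.mean(partial_percents)
print(average, 50 in partial_percents)  # 54 False
```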

Matching Both Parents

I don’t have my father’s DNA, but I’m creating this next example as if I did.

match both parents

Matches to mother are marked in green.

I have two matches where I match my father, so we can attribute those to his side, which I’ve done and marked in orange.

The third group of matches to me, at the bottom, to Julio, Anna, Cindy and George don’t match either parent, so they must be IBS by chance.

I label IBS by chance segments, but I don’t delete them because if I download again, I’ll have to go through this same analysis process if I don’t leave them in my spreadsheet.
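
With both parents tested, the labeling rule for each matching segment is simple enough to sketch in a few lines of Python.  The function name and labels are mine, and “Person A” and “Person B” are hypothetical stand-ins for the matches on each parent’s side.  Only Julio, Anna, Cindy and George come from the example above.

```python
def assign_side(matches_mother, matches_father):
    """Label one of my matching segments, given whether each parent also matches."""
    if matches_mother and matches_father:
        return "both sides"
    if matches_mother:
        return "mother's side"
    if matches_father:
        return "father's side"
    return "IBS by chance"


# (name, matches mother, matches father); Person A and Person B are hypothetical.
rows = [
    ("Person A", True, False),
    ("Person B", False, True),
    ("Julio", False, False),
    ("Anna", False, False),
    ("Cindy", False, False),
    ("George", False, False),
]

labels = {name: assign_side(mom, dad) for name, mom, dad in rows}
for name, label in labels.items():
    print(name, label)
```

Note that the last label is only safe when both parents have tested.  With one parent missing, a segment that doesn’t match the tested parent may still belong to the other side.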

Step 5 – How Much of the DNA is a Match?

One person asked, “exactly how do I tell how much DNA is matching, especially between three people.”  That’s a very valid question, especially since triangulation requires matching of three people, on the same segment, proven to a common ancestral line.

Let’s look at the match of both me and my mother to Don, Cheryl and Robin.

match mom part

In this example, we know that Don, Cheryl and Robin all match me on my mother’s side, because they all three match me and my mother, both on the same segment.

How do we determine that we match on the same segment?

I have sorted this spreadsheet by end location, then by start location, then by chromosome number, one sort after the other, so that the finished spreadsheet is in chromosome order, then start location, then end location.
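
If you keep your matches in something other than a spreadsheet, the same ordering can be produced two ways.  This Python sketch uses made-up rows of (name, chromosome, start, end) and shows that three successive sorts, least important column first, give the same result as one sort on all three columns.  That works because Python’s sort is stable, meaning ties keep their earlier order.

```python
# Hypothetical rows: (name, chromosome, start, end)
rows = [
    ("C", 2, 1_000_000, 4_000_000),
    ("A", 1, 3_000_000, 6_000_000),
    ("B", 1, 3_000_000, 5_000_000),
    ("D", 1, 1_000_000, 9_000_000),
]

# One pass: chromosome, then start, then end.
one_pass = sorted(rows, key=lambda r: (r[1], r[2], r[3]))

# Three passes, least important column first, as in the spreadsheet.
three_pass = sorted(rows, key=lambda r: r[3])        # end location
three_pass = sorted(three_pass, key=lambda r: r[2])  # start location
three_pass = sorted(three_pass, key=lambda r: r[1])  # chromosome

print([r[0] for r in one_pass])  # ['D', 'B', 'A', 'C']
print(one_pass == three_pass)    # True
```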

We can see that both mother and I match Cheryl partially on this segment of chromosome 1, but not exactly.  The start location is slightly different, but the end location matches exactly.

The area where we all three match, meaning me, Mom and Cheryl, begins at 176,231,846 and ends at the common endpoint of 178,453,336.

On the chart below, you can see that mother and I also both match Don, Cheryl’s brother, on part of this same segment, but not all of the same segment.

match mom part2

The common matching area between me, Mom and Don begins at 176,231,846 and ends at 178,453,336.

Next, let’s look at the third person, Robin.

Mom and I both match Robin on part of this same overlapping segment as well.  Note that my segment extends beyond Mom’s, but that does not invalidate the portion that does match between Robin, Mom and me.

match mom part3

Our common match area begins at the same location, but ends at 178,453,336, the same location as the common end area with Don and Cheryl.
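
Finding the common area is just a matter of taking the latest start and the earliest end.  Here is a minimal Python sketch, assuming every segment is a (start, end) pair on the same chromosome.  The common start and end locations are from the Cheryl example above; my mother’s start location of 175,000,000 is invented, since the text only says it is slightly different.

```python
def common_segment(segments):
    """Return the (start, end) shared by every segment, or None if there is none."""
    start = max(s for s, _ in segments)
    end = min(e for _, e in segments)
    return (start, end) if start < end else None


me_vs_cheryl = (176_231_846, 178_453_336)
mom_vs_cheryl = (175_000_000, 178_453_336)  # start location is hypothetical

print(common_segment([me_vs_cheryl, mom_vs_cheryl]))
# (176231846, 178453336)
```

The same function works for any number of people, so Don and Robin can be added to the list in the same way.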

Step 6 – What Do Matches Mean? IBD vs IBS in Action

So, let’s look at various types of matches and what they tell us.

match mom example

Looking at our matching situation above, let’s apply the various IBD/IBS rules and guidelines and see what we have.

1. Are these matches identical by chance?  No.  How do we know?

a. Because they all match both me and a parent.

2. Are these matches identical by descent? Yes. How do we know?

a. Because we all match each other on this segment, and we know the common ancestor of Cheryl, Don, Barbara and me is Hiram Ferverda and Evaline Miller.  We know that Robin descends from the same ancestral Miller line.

3. Are these matches identical by population?  We don’t know, but there is no reason at this point to think so. Why?

a. Because looking at my master spreadsheet, I see no evidence that these segments are also assigned to other lineages. These individuals are also triangulated on a large number of other, much larger, segments as well.

4. Are these matches triangulated, meaning they are proven to a common ancestor? Yes. How do we know?

a. Documented genealogy of Hiram Ferverda and Evaline Miller. Don, Barbara, Cheryl and I have been known family since birth.
b. Documented genealogy of Robin to the same ancestral family, even though Robin was previously unknown before DNA matching.
c. Even without the documented genealogy, Robin matches a set of two triangulation groups of people documented to the same ancestral line, which means she has to descend from that same line as well.

In our case, clearly these individuals share a common ancestor and a common ancestral line.  Even though these are small segments on chromosome 1, there are much larger matching segments on other chromosomes, and the same rules still apply.  The difference might be at some point smaller segments are more likely to be identical by population than larger segments.  Larger segments, when available, are always safer to use to draw conclusions.  Larger groups of matching individuals with known common genealogy on the same segments are also the safest way to draw conclusions.

Step 7 – Matching With No Parents

Sometimes you’re just not that lucky.  Let’s say both of your parents have passed and you have no DNA from them.

That immediately eliminates phasing and the identical by chance test by comparing to your parents, so you’ll have to work with your matches, including your identical by chance segments.

A second way to “phase” part of your DNA to a side of your family is by matching with known cousins or any known family member.

In the situation above, matching to Cheryl, Don and Robin, let’s remove my mother and see what we have.

match no mom

In this case, I still match to both of my first cousins, once removed, Cheryl and Don.  Given that Cheryl and Don are both known cousins, since forever, I don’t feel the need for triangulation proof in this case – although the three of us are triangulated to our common ancestor.  In other words, the fact that my mother does match them at the expected 1st cousin level is proof enough in and of itself if we only had one cousin to test.  We know our common ancestor is Cheryl and Don’s grandparents, who are my great-grandparents, Hiram Ferverda and Evaline Miller.

When I looked at Robin’s pedigree chart and saw that Robin descended from Philip Jacob Miller and wife Magdalena, I knew that this segment was a Miller side match, not a Ferverda match.

Therefore, matching with someone whose genealogy goes beyond the common ancestor of Cheryl, Don and me proves this line through 4 more generations.  In other words, this DNA segment came through the following direct line to reach Me, Mother, Cheryl and Don.

  • Philip Jacob Miller and Magdalena
  • Daniel Miller
  • David Miller
  • John David Miller
  • Evaline Louise Miller who married Hiram Ferverda

Clearly, we know from the earlier chart that my mother carried this DNA too, but even if we didn’t know that, she obviously had to have carried this segment or I would not carry it today.

So, even though in this example, our parents aren’t directly available for IBS testing and elimination, we can determine that anyone who matches both me and Cheryl or me and Don will have also matched mother on that segment, so we have, in essence, phased those people by triangulation, not by direct parental matching.

Step 8 – Triangulation Groups

What else does this match group tell us?

It tells us that anyone else who matches me and any one of our triangulation group on that segment also descends from the Miller descendant clan, one way or another.

Why do they have to match me AND one of the triangulation group members on that segment?  Because I have two sides to my DNA, my Mom’s side and my Dad’s side.  Matching me plus another person from the triangulation group proves which side the match is on – Mom’s or Dad’s.

We were able to phase to eliminate any identical by chance segments (or people) on Mom’s side, so we know matches to both of us are valid.

On Dad’s side, there are some IBS by chance people (or segments) thrown in for good measure because I don’t have my Dad’s DNA to eliminate them out of the starting gate.  Those IBS segments will have to be removed in time by not triangulating with proven triangulated groups they should triangulate with, if they were valid matches.

When you map matches on your chromosome spreadsheet, this is what you’re doing.  Over time, you will be able to tell when you receive a new match by who they match and where they fall on your spreadsheet which ancestral line they descend from.
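The “match me AND a group member on that segment” test can be sketched in a few lines of Python.  This is only an illustration of the logic, not how GedMatch or any vendor implements it.  The `Segment` type, the function names and the sample positions are all hypothetical.

```python
from typing import NamedTuple, Optional


class Segment(NamedTuple):
    """A matching segment between two kits (positions in base pairs)."""
    chrom: int
    start: int
    end: int


def overlap(a: Segment, b: Segment) -> Optional[Segment]:
    """Return the region shared by two segments, or None if they don't overlap."""
    if a.chrom != b.chrom:
        return None
    start, end = max(a.start, b.start), min(a.end, b.end)
    return Segment(a.chrom, start, end) if start < end else None


def triangulated_region(me_vs_new: Segment,
                        me_vs_member: Segment,
                        member_vs_new: Segment) -> Optional[Segment]:
    """All three pairs must match on a common stretch for the trio to triangulate."""
    shared = overlap(me_vs_new, me_vs_member)
    if shared is None:
        return None
    return overlap(shared, member_vs_new)


# Hypothetical example: a new match compared with me and with one group member.
region = triangulated_region(
    Segment(1, 10_000_000, 25_000_000),   # me vs. new match
    Segment(1, 12_000_000, 30_000_000),   # me vs. known group member
    Segment(1, 11_000_000, 24_000_000),   # group member vs. new match
)
print(region)
```

If any one of the three comparisons is missing or falls on a different stretch, the function returns `None`, which is the situation where a match to me alone can’t be assigned to Mom’s or Dad’s side.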

GedMatch also includes a triangulation utility.  It’s a great tool, because it produces trios of people for your top 400 matches.  The results are two kits that triangulate to the third person whose kit number you are matching against.

The output, below, shows you the chromosome number followed by the two kit numbers (obscured) that triangulate at this location, and then the start and end location followed by the matching cMs.  The result is triangulation groups that “slide to the right.”

gedmatch triang group3

In the example above, all of the triangulation matches to me above the red arrow include either Mother, my Ferverda cousins or the Miller group that we discussed in the Just One Cousin article.  In other words they are all related via a common ancestor.

You can tell a great deal about triangulation groups by who is, and isn’t in them using deductive reasoning.  And once you’ve figured out the key to the group, you have the key to the entire group.

In this case, Mom is a member of the first triangulation group, so I know this group is from her side and not Dad’s side.  Both Ferverda cousins are there, so I know it’s Mom’s Dad’s side of the family.  The Miller cousins are there, so I know it’s the Miller side of Mom’s Dad’s side of the family.

Please also note that while this entire group triangulates within itself, the group manages to slide right, and the first triangulated group of 3 in the list may not overlap the DNA of the last triangulated group of 3.  In fact, because you can see the start and end points, you can tell that these two triangulated groups don’t overlap.  The multiple triangulation groups all do match some portion of the group above and below them (in this case), and as a composite group, they slide to the right. Because each group overlaps with the group above and below it, they all connect together in a genetic chain.  Because the entire group is triangulated together, in multiple ways, we know that it is one entire group.

This allows me to map that entire segment on my Mom’s side of my DNA, from 10,369,154 to 41,685,667, to this group because it is contiguously connected to me, triangulated and unbroken.  The most distant ancestor listed will vary based upon the known genealogy of the three people being triangulated.  For example, part of this segment may come from Philip Jacob Miller himself, the line’s founder, but another part could come from his son’s wife, who is also my ancestor.  Therefore, the various pieces of this group segment may eventually be attributed to different ancestors from this particular line based upon the oldest common ancestor of the three people who have triangulated.
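The “slide to the right” chaining can be pictured as merging overlapping intervals.  Here is a minimal sketch, assuming each triangulated trio is reduced to a (start, end) pair on the same chromosome.  Only the outer endpoints, 10,369,154 and 41,685,667, come from my results; the inner boundaries and the `merge_chain` helper are invented for illustration.

```python
def merge_chain(segments):
    """Merge (start, end) segments that overlap into contiguous regions."""
    merged = []
    for start, end in sorted(segments):
        if merged and start <= merged[-1][1]:
            # Overlaps the previous region, so extend that region to the right.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# Only the outer endpoints come from the article; the inner boundaries are invented.
trios = [
    (10_369_154, 20_000_000),
    (18_000_000, 31_000_000),
    (29_500_000, 41_685_667),
]
first, last = trios[0], trios[-1]
print(first[1] < last[0])   # the first and last trios do not overlap each other
print(merge_chain(trios))   # yet the chain forms one unbroken region
```

A gap anywhere in the chain would leave two separate regions in the result, and the second region could no longer be assigned to the group on the strength of the first.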

In our example above, the second group starts where the red arrow is pointing.  I have absolutely no idea which ancestor this second group comes from – except – I know it does not come from my mother’s side because her kit number isn’t there.

Neither are any of my direct line Estes or Vannoy relatives, so it’s probably not through that line either.  My Bolton cousins are also missing, so we’ve probably eliminated several possible lines, 3 of 4 great grandparents, based on who is NOT in the match group.  See the value of testing both close and distant cousins?  In this case, the family members not only have to test, they also have to upload their results to GedMatch.
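This “who is NOT in the group” reasoning is really just set arithmetic.  A rough sketch follows, using made-up kit labels; the family line names follow this article, and `split_lines_by_presence` is a hypothetical helper.  Keep in mind that absence only makes a line less likely, since a cousin may simply not have inherited that particular segment.

```python
def split_lines_by_presence(group_kits, testers_by_line):
    """Sort family lines into those represented in the group and those absent."""
    present, absent = [], []
    for line, kits in testers_by_line.items():
        if not kits:
            continue  # nobody from this line has tested, so absence tells us nothing
        (present if kits & group_kits else absent).append(line)
    return sorted(present), sorted(absent)


# Kit labels are placeholders; the line names follow the article.
testers_by_line = {
    "Mother": {"kit_mother"},
    "Estes": {"kit_estes_1", "kit_estes_2"},
    "Vannoy": {"kit_vannoy_1"},
    "Bolton": {"kit_bolton_1", "kit_bolton_2"},
}
second_group = {"kit_unknown_a", "kit_unknown_b"}
present, absent = split_lines_by_presence(second_group, testers_by_line)
print(present, absent)
```

The more known cousins from each line who have tested and uploaded, the more a line’s absence from a group actually means.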

Conversely, we could quickly identify at least a base group by the presence in the triangulation groups of at least one of my known cousins or people with whom I’ve identified my common ancestor.  Two from the same line would be even better!!!

Endogamy

The last thing I want to show you is an example of what an endogamous group looks like when triangulated.

gedmatch endogamy

This segment of chromosome 9 is an Acadian matching group to my Mom – and the list doesn’t stop here – this is just the size of the screen shot.  These matches continue for pages.

How do I know this group is Acadian?  In part, because this group also triangulates with my known Lore cousin who also descends from the same Acadian ancestor, Antoine Lore, son of Honore Lore and Marie Lafaille.  Additionally, I’ve worked with some of these people and we have confirmed Honore Lore and Marie Lafaille as our common ancestor as well.  In other cases, we’ve confirmed upstream ancestors.

Unfortunately, the Acadians are so intermarried that it’s very difficult to sort through the most distant genetic ancestor because there tend to be multiple most distant ancestors in everyone’s trees.  There is a saying that if you’re related to one Acadian, you’re related to all Acadians and it’s the truth.  Just ask my cousin Paul who I’m related to 137 different ways.

Matches to endogamous groups tend to have very, very long lists of matches, even triangulated, which means proven, matches.

Oh, and by the way, just for the record, this lengthy group includes some of my proven Acadian matches that were trimmed, meaning removed, from my match list when Ancestry did their big purge due to their new and improved phasing.  So if there was ever any doubt that we did in fact lose at least some valid matches, the proof lies right here, in the triangulation of those exact same people at GedMatch.

Summary

I hope this step by step article has helped take the Greek, or maybe the geek, out of matching.  Once you think of it in a step by step logical basis, it makes a lot of sense and allows you to reasonably judge the quality of your matches.

The rule of thumb has been that larger matches tend to be “legitimate” and smaller matches are often discarded en masse because they might be problematic.  However, we’ve seen situations where some larger matches may not be legitimate and some smaller matches clearly are.  In essence, the 50% average seldom applies exactly and rules of thumb don’t apply in individual situations either.  Your situation is unique with every match and now you have tools and guidelines to help you through the matching maze.

And hey, since we made it to the end, I think we should celebrate with that beer!!!

beer

Lazarus – Putting Humpty Dumpty Back Together Again

Recently, GedMatch introduced a tool, Lazarus, to figuratively raise the dead by combining the DNA of descendants, siblings and other relatives of long-dead ancestors to recreate their genome.  Kind of like piecing Humpty Dumpty back together again.

Humpty Dumpty

Blaine Bettinger wrote about using Lazarus here and here where he recreated the genome of his grandmother.  I’d like to use Lazarus to see how it works with one pair of siblings and a first cousin.  Blaine was fortunate to have 4 siblings.  I have a much smaller group of people to work with, so let’s see what we can do and how successful we are, or aren’t.  But first, let’s talk about the basics and how we can reconstruct an ancestor.

The Basics

An individual has 6766.2 cM of DNA.  Both parents give half of their DNA to each child, but not exactly the same parental DNA is contributed to each child.  A random process selects which half of the parents’ DNA is given to each child.  Different children will have some of the same DNA from their parents, and some different DNA from each parent.

Obviously, the DNA contributed to each child from a parent is a combination of the DNA given to the parent by the grandparents.  Approximately half of the grandparent’s DNA is given to each child.  In many cases, the DNA contributed to the child from the grandparents is not actually divided evenly, and we receive all or nothing of individual segments, not half.  Half is an average that works pretty well most of the time.  It’s a statistic, and we all know about statistics…right???

Therefore, children carry 3383cM of each parent’s DNA.  Each sibling carries half of the same DNA from their parents.  From the ISOGG autosomal DNA statistics chart, each sibling actually carries 25% of exactly the same DNA from both parents, 50% where they inherited half of the same DNA from one parent and different DNA from the other parent, and 25% where the siblings don’t share any of the identical DNA from their parents. This averages 50%.
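For anyone who likes to see the arithmetic, here is a small sketch of how the 25% / 50% / 25% breakdown averages out to 50%.  The figures come from the paragraph above; the variable names are just mine.

```python
TOTAL_CM = 6766.2  # total cM per individual, as stated above

per_parent_cm = TOTAL_CM / 2

# (share of the genome, fraction identical in that portion)
sibling_states = [
    (0.25, 1.0),  # same DNA from both parents
    (0.50, 0.5),  # same DNA from one parent only
    (0.25, 0.0),  # no identical DNA from either parent
]
average_shared = sum(weight * fraction for weight, fraction in sibling_states)

print(round(per_parent_cm, 1))
print(average_shared)
```

The first line printed is the roughly 3383 cM from each parent, and the second is the 0.5 sibling average.  Remember that this is an average; any actual pair of siblings will land somewhere near it, not exactly on it.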

This chart, also from ISOGG, sums up what percentage of the same DNA different relatives can expect to carry.

cousin percents

Recreating Ferverda Brothers

I have a situation where I have a person, Barbara, and two of her first cousins, Cheryl and Don, who are siblings.  This is the same family we discussed in the Just One Cousin article.

Miller Ferverda chart

In this case, Cheryl and Don share 50% of Roscoe’s DNA.

Barbara shares 12.5% of Hiram and Evaline’s DNA with Cheryl and 12.5% with Don, but not the same 12.5%.  Since siblings share 50% of their DNA, Barbara should share about 12.5% of Cheryl’s DNA and an additional 6.25% that Cheryl didn’t receive from Roscoe, but that Don did.

Translating that into cMs, Barbara should share about 850 cM with Cheryl and an additional 425 cM with Don, for an approximate total of 1275 cM.
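Here is the same translation as a quick calculation, using the 6766.2 cM total from The Basics.  The variable names are mine, and the percentages are expected averages, not guarantees.

```python
TOTAL_CM = 6766.2  # total cM per individual

with_cheryl = 0.125 * TOTAL_CM      # first cousins share about 12.5%
extra_via_don = 0.0625 * TOTAL_CM   # the additional 6.25% that only Don received
expected_total = with_cheryl + extra_via_don

print(round(with_cheryl), round(extra_via_don), round(expected_total))
```

The unrounded figures come out to about 846, 423 and 1269 cM.  Rounding the two pieces to 850 and 425 before adding them is what gives the approximate total of 1275.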

At http://www.gedmatch.com, I selected the Tier 1 (subscription or donation) option of Lazarus and was presented with this menu.

lazarus menu

My first attempt was to recreate Barbara’s father, John W. Ferverda.  I allowed 100 SNPs and 4cM because I was hoping to be able to accumulate more than the required 1500cM of matching DNA for the kit to be utilized as a “real kit,” available for one-to-many matching.

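| | 100 SNP, 4 cM | 200 SNP, 4 cM | 300 SNP, 4 cM | 400 SNP, 4 cM | 500 SNP, 4 cM | 600 SNP, 4 cM | 700 SNP, 4 cM |
| --- | --- | --- | --- | --- | --- | --- | --- |
| John W. Ferverda | 1330.7 cM | 1370.2 cM | 1360.0 cM | 1353.5 cM | 1338.7 cM | 1336.2 cM | 1322.9 cM |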

I then experimented with the various SNP levels, leaving the cM at 4.

The resulting number of cM of just over 1300, no matter how you slice and dice it, is very near the expected approximation of 1275.

Using the Lazarus tool, I created “John Ferverda” by listing Barbara as his descendant and both Cheryl and Don as cousins.

To create “Roscoe Ferverda,” I reversed the positions of the individuals, listing Don and Cheryl as descendants and Barbara as the cousin.

Lazarus options

These two created individuals, “John” and “Roscoe” should be exactly the same, and, thankfully, they were.

Both recreated “John” and “Roscoe” represent a common set of DNA from the parents of both of these men, Hiram Ferverda and Evaline Miller based on the matching DNA of their descendants, Barbara, Cheryl and Don.

The way Lazarus works is that all kits in Group 1, the descendants, are compared with Group 2, other relatives but not descendants.  The descendants will carry some of Roscoe’s DNA, but also the DNA of Roscoe’s wife, the mother of Don and Cheryl.  By comparing against known relatives but not direct descendants, Lazarus effectively narrows the DNA to that contributed only by the common ancestor of group 1 and group 2.  In this case, that common ancestor would be John and Roscoe’s parents, Hiram Ferverda and Evaline Miller.  By comparing the descendant and non-descendant-but-otherwise-related groups, you effectively subtract out the mother’s DNA from the descendants – in this case meaning the DNA of John Ferverda’s wife and Roscoe Ferverda’s wife.

In other words, the descendants, above, are NOT compared to each other, but instead, to each one of the not-descendant-but-otherwise-related group.
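To make the pairing concrete, here is a minimal sketch of which kits get compared.  It models only the Group 1 versus Group 2 pairing described above, not how GedMatch actually derives the SNPs for the resulting kit, and `lazarus_comparisons` is a name I made up.

```python
from itertools import product


def lazarus_comparisons(descendants, relatives):
    """Pair each Group 1 kit with each Group 2 kit; never pair within a group."""
    return [frozenset(pair) for pair in product(descendants, relatives)]


john = lazarus_comparisons(["Barbara"], ["Cheryl", "Don"])
roscoe = lazarus_comparisons(["Cheryl", "Don"], ["Barbara"])

print(set(john) == set(roscoe))                 # same comparisons either way
print(frozenset({"Cheryl", "Don"}) in roscoe)   # siblings are not compared to each other
```

With only these three people, swapping the groups produces the same two comparisons, Barbara to Cheryl and Barbara to Don, which is why “John” and “Roscoe” should come out the same.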

Unfortunately, none of the kits generated was over the 1500 cM threshold.  I remembered that there is also a second cousin, Rex, whose DNA we can add because he descends from the parents of Evaline Miller.

Adding Rex to the mix brought the resulting “Roscoe” kit to 1589.7 cM and the resulting “John” kit to 1555.7 cM, both now barely over the 1500 threshold – but over just the same and that’s all that matters.  Soon, we’ll be able to utilize both of these kits for direct matching as a “person” at GedMatch.  Now how cool is that???

You receive four pieces of output information when you create a Lazarus kit.

First, a comparison between the descendants (Group 1 above, Kit 2 below) and each of the cousins and related-but-not-descendants individuals (Group 2 above, Kit 1 below), by chromosome.

John W. Ferverda

Processed: 2015/01/09 17:32:41
Name: John W. Ferverda
SNP threshold = 100 cM
Threshold = 4.0 cM
Batch processing will be performed if resulting kit achieves required threshold of 1500 cM.

Contributions:

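| Kit 1 | Kit 2 | Chr | Start | End | cM |
| --- | --- | --- | --- | --- | --- |
| F9141 | M133930 | 1 | 72017 | 5703284 | 14.8 |
| F9141 | M133930 | 1 | 17271101 | 18589169 | 4.1 |
| F9141 | M133930 | 1 | 32804999 | 65722466 | 37.8 |
| F9141 | M133930 | 1 | 242601404 | 247174776 | 8.5 |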

Obviously, these are only snippets of the output for chromosome 1.  You receive a chart of this same information for all of the chromosomes of the people being compared.

Second, a chart that shows the resulting matching segments.

Resulting Segments:

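| Chr | Start | End | cM |
| --- | --- | --- | --- |
| 1 | 742429 | 5694404 | 14.8 |
| 1 | 17285357 | 18588145 | 4.1 |
| 1 | 38226163 | 43823334 | 7.2 |
| 1 | 43975578 | 54990495 | 8.0 |
| 1 | 55040097 | 62847030 | 12.1 |
| 1 | 76341094 | 85237614 | 8.7 |
| 1 | 242606491 | 247179501 | 8.5 |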

At the bottom of this second set of numbers is the all-important total cM.  This is the only place you will find this number.

Total cM: 1555.7
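As a sanity check, the chromosome 1 rows shown in the Resulting Segments chart can be added up and the reported total compared against the 1500 cM requirement.  This is just arithmetic on the numbers above; the constant and variable names are my own.

```python
LAZARUS_THRESHOLD_CM = 1500.0  # required total for batch processing, per the output above

# cM column of the chromosome 1 rows shown in "Resulting Segments"
chr1_cm = [14.8, 4.1, 7.2, 8.0, 12.1, 8.7, 8.5]
chr1_total = round(sum(chr1_cm), 1)

reported_total = 1555.7  # the "Total cM" line for this kit

print(chr1_total)
print(reported_total >= LAZARUS_THRESHOLD_CM)
```

The chromosome 1 rows shown account for 63.4 cM, a small slice of the 1555.7 cM total, with the remainder coming from the other chromosomes.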

Third, a list of the original kits that have match results between the two groups.

Original Kits match with result:

| Kit | Chr | Start | End | cM |
| --- | --- | --- | --- | --- |
| F9141 | 1 | 742429 | 5700507 | 14.8 |
| F9141 | 1 | 10899689 | 12530765 | 4.5 |
| F9141 | 1 | 35075204 | 65714854 | 35.3 |
| F9141 | 1 | 76334120 | 85252045 | 8.7 |
| F9141 | 1 | 242606379 | 247169190 | 8.5 |
| M133930 | 1 | 742429 | 5705356 | 14.8 |
| M133930 | 1 | 35075956 | 65714854 | 35.3 |
| M133930 | 1 | 242606491 | 247165725 | 8.5 |
| F50000 | 1 | 10899689 | 12530765 | 4.5 |
| F153785 | 1 | 742584 | 5700507 | 14.8 |
| F153785 | 1 | 76337055 | 85252045 | 8.7 |
| F153785 | 1 | 242606379 | 247169190 | 8.5 |

And finally, a summary.

196074 single allele SNPs were derived for the resulting kit.
37068 bi-allelic SNPs were derived for the resulting kit.
233142 total SNPs were derived for the resulting kit.
Kit number of Result: LX056148
Kit Name: John Ferverda 8
Your Lazarus file has been generated.

Is this as good as the real McCoy, meaning swabbing John and Roscoe?  Of course not, but John and Roscoe aren’t available for swabbing.  In fact, John and Roscoe are both probably finding this pretty amusing from someplace on the other side, watching their children “recreate” them!

I can hear them now, shaking their heads, “Well I never….”

They should have known if they left Cheryl and me here, together, unsupervised that we would do something like this!!!

2014 Top Genetic Genealogy Happenings – A Baker’s Dozen +1

It’s that time again, to look over the year that has just passed and take stock of what has happened in the genetic genealogy world.  I wrote a review in both 2012 and 2013 as well.  Looking back, these momentous happenings seem quite “old hat” now.  For example, both www.GedMatch.com and www.DNAGedcom.com, once new, have become indispensable tools that we take for granted.  Please keep in mind that both of these tools (as well as others in the Tools section, below) depend on contributions, although GedMatch now has a tier 1 subscription offering for $10 per month as well.

So what was the big news in 2014?

Beyond the Tipping Point

Genetic genealogy has gone over the tipping point.  Genetic genealogy is now, unquestionably, mainstream and lots of people are taking part.  From the best I can figure, the total is now approaching or has surpassed three million tests or test records, although certainly some of those are duplicates.

  • 500,000+ at 23andMe
  • 700,000+ at Ancestry
  • 700,000+ at Genographic

The organizations above represent “one-test” companies.  Family Tree DNA provides various kinds of genetic genealogy tests to the community and they have over 380,000 individuals with more than 700,000 test records.

In addition to the above mentioned mainstream firms, there are other companies that provide niche testing, often in addition to Family Tree DNA Y results.

In addition, there is what I would refer to as a secondary market for testing as well which certainly attracts people who are not necessarily genetic genealogists but who happen across their corporate information and decide the test looks interesting.  There is no way of knowing how many of those tests exist.

Additionally, there is still the Sorenson data base with Y and mtDNA tests which reportedly exceeded their 100,000 goal.

Spencer Wells spoke about the “viral spread threshold” in his talk in Houston at the International Genetic Genealogy Conference in October and termed 2013 the year of infection.  I would certainly agree.

spencer near term

Autosomal Now the New Normal

Another change in the landscape is that now, autosomal DNA has become the “normal” test.  The big attraction to autosomal testing is that anyone can play and you get lots of matches.  Earlier in the year, one of my cousins was very disappointed in her brother’s Y DNA test because he only had a few matches, and couldn’t understand why anyone would test the Y instead of autosomal where you get lots and lots of matches.  Of course, she didn’t understand the difference in the tests or the goals of the tests – but I think as more and more people enter the playground – percentagewise – fewer and fewer do understand the differences.

Case in point is that someone contacted me about DNA and genealogy.  I asked them which tests they had taken and where and their answer was “the regular one.”  With a little more probing, I discovered that they took Ancestry’s autosomal test and had no clue there were any other types of tests available, what those tests could tell them about their ancestors or genetic history or that there were other vendors and pools to swim in as well.

A few years ago, we not only had to explain about DNA tests, but why the Y and mtDNA are important.  Today, we’ve come full circle in a sense – because now we don’t have to explain about DNA testing for genealogy in general but we still have to explain about those “unknown” tests, the Y and mtDNA.  One person recently asked me, “oh, are those new?”

Ancient DNA

This year has seen many ancient DNA specimens analyzed and sequenced at the full genomic level.

The year began with a paper titled, “When Populations Collide” which revealed that contemporary Europeans carry between 1% and 4% Neanderthal DNA, most often associated with hair and skin color, or keratin.  Africans, on the other hand, carry none or very little Neanderthal DNA.

http://dna-explained.com/2014/01/30/neanderthal-genome-further-defined-in-contemporary-eurasians/

A month later, a monumental paper was published that detailed the results of sequencing a 12,500 year old Clovis child from Montana, subsequently named Anzick or referred to as the Anzick Clovis child.  That child is closely related to Native American people of today.

http://dna-explained.com/2014/02/13/clovis-people-are-native-americans-and-from-asia-not-europe/

In June, another paper emerged where the authors had analyzed 8000 year old bones from the Fertile Crescent that shed light on the Neolithic era before the expansion from the Fertile Crescent into Europe.  These would be the farmers that assimilated with or replaced the hunter-gatherers already living in Europe.

http://dna-explained.com/2014/06/09/dna-analysis-of-8000-year-old-bones-allows-peek-into-the-neolithic/

Svante Paabo is the scientist who first sequenced the Neanderthal genome.  Here is a great interview and speech.  This man is so interesting.  If you have not read his book, “Neanderthal Man, In Search of Lost Genomes,” I strongly recommend it.

neanderthal man

http://dna-explained.com/2014/07/22/finding-your-inner-neanderthal-with-evolutionary-geneticist-svante-paabo/

In the fall, yet another paper was released that contained extremely interesting information about the peopling and migration of humans across Europe and Asia.  This was just before Michael Hammer’s presentation at the Family Tree DNA conference, so I covered the paper along with Michael’s information about European ancestral populations in one article.  The take away messages from this are two-fold.  First, there was a previously undefined “ghost population” called Ancient North Eurasian (ANE) that is found in the northern portion of Asia and that contributed to both Asian populations, including those that would become the Native Americans, and European populations as well.  Second, the people we thought were in Europe early may not have been, based on the ancient DNA remains we have to date.  Of course, that may change when more ancient DNA is fully sequenced, which seems to be happening at an ever-increasing rate.

http://dna-explained.com/2014/10/21/peopling-of-europe-2014-identifying-the-ghost-population/

Lazaridis tree

Ancient DNA Available for Citizen Scientists

If I were to give a Citizen Scientist of the Year award, this year’s award would go unquestionably to Felix Chandrakumar for his work with the ancient genome files and making them accessible to the genetic genealogy world.  Felix obtained the full genome files from the scientists involved in full genome analysis of ancient remains, reduced the files to the SNPs utilized by the autosomal testing companies in the genetic genealogy community, and has made them available at GedMatch.

http://dna-explained.com/2014/09/22/utilizing-ancient-dna-at-gedmatch/

If this topic is of interest to you, I encourage you to visit his blog and read his many posts over the past several months.

https://plus.google.com/+FelixChandrakumar/posts

The availability of these ancient results set off a sea of comparisons.  Many people with Native heritage matched Anzick’s file at some level, and many who are heavily Native American, particularly from Central and South America where there is less admixture, match Anzick at what would statistically be considered within a genealogical timeframe.  Clearly, this isn’t possible, but it does speak to how endogamous populations affect DNA, even across thousands of years.

http://dna-explained.com/2014/09/23/analyzing-the-native-american-clovis-anzick-ancient-results/

Because Anzick is matching so heavily with the Mexican, Central and South American populations, it gives us the opportunity to extract mitochondrial DNA haplogroups from the matches that either are or may be Native, if they have not been recorded before.

http://dna-explained.com/2014/09/23/analyzing-the-native-american-clovis-anzick-ancient-results/

Needless to say, the matches of these ancient kits with contemporary people have left many people questioning how to interpret the results.  The answer is that we don’t really know yet, but there is a lot of study as well as speculation occurring.  In the citizen science community, this is how forward progress is made…eventually.

http://dna-explained.com/2014/09/25/ancient-dna-matches-what-do-they-mean/

http://dna-explained.com/2014/09/30/ancient-dna-matching-a-cautionary-tale/

More ancient DNA samples for comparison:

http://dna-explained.com/2014/10/04/more-ancient-dna-samples-for-comparison/

A Siberian sample that also matches the Malta Child whose remains were analyzed in late 2013.

http://dna-explained.com/2014/11/12/kostenki14-a-new-ancient-siberian-dna-sample/

Felix has prepared a list of kits that he has processed, along with their GedMatch numbers and other relevant information, like gender, haplogroup(s), age and location of sample.

http://www.y-str.org/p/ancient-dna.html

Furthermore, in a collaborative effort with Family Tree DNA, Felix formed an Ancient DNA project and uploaded the ancient autosomal files.  This is the first time that consumers can match with Ancient kits within the vendor’s data bases.

https://www.familytreedna.com/public/Ancient_DNA

Recently, GedMatch added a composite Archaic DNA Match comparison tool where your kit number is compared against all of the ancient DNA kits available.  The output is a heat map showing which samples you match most closely.

gedmatch ancient heat map

Indeed, it has been a banner year for ancient DNA and making additional discoveries about DNA and our ancestors.  Thank you Felix.

Haplogroup Definition

That SNP tsunami that we discussed last year…well, it made landfall this year and it has been storming all year long…in a good way.  At least, ultimately, it will be a good thing.  If you asked the haplogroup administrators today about that, they would probably be too tired to answer – as they’ve been quite overwhelmed with results.

The Big Y testing has been fantastically successful.  This is not from a Family Tree DNA perspective, but from a genetic genealogy perspective.  Branches have been added to and sawed off of the haplotree on a daily basis.  This forced the renaming of the haplogroups from the old traditional R1b1a2 to R-M269 in 2012.  While there was some whimpering then, it would be nothing like the outright wailing that would be occurring now as haplogroup names reached 20 or so digits.

Alice Fairhurst discussed the SNP tsunami at the DNA Conference in Houston in October and I’m sure that the pace hasn’t slowed any between then and now.  According to Alice, in early 2014, there were 4115 individual SNPs on the ISOGG Tree, and as of the conference, there were 14,238 SNPs, with the 2014 addition total at that time standing at 10,213.  That is over 1000 per month or about 35 per day, every day.

Yes, indeed, that is the definition of a tsunami.  Every one of those additions requires one of a number of volunteers, generally haplogroup project administrators, to evaluate the various Big Y results, the SNPs and novel variants included, where they need to be inserted in the tree and if branches need to be rearranged.  In some cases, naming requests for previously unknown SNPs also need to be submitted.  This is all done behind the scenes and it’s not trivial.

The project I’m closest to is the R1b L21 project because my Estes males fall into that group.  We’ve tested several, and I’ll be writing an article as soon as the final test is back.

The tree has grown unbelievably in this past year just within the L21 group.  This project includes over 700 individuals who have taken the Big Y test and shared their results which has defined about 440 branches of the L21 tree.  Currently there are almost 800 kits available if you count the ones on order and the 20 or so from another vendor.

Here is the L21 tree in January of 2014

L21 Jan 2014 crop

Compare this with today’s tree, below.

L21 dec 2014

Michael Walsh, Richard Stevens and David Stedman need to be commended for their incredible work in the R-L21 project.  Other administrators are doing equivalent work in other haplogroup projects as well.  A big thank you to everyone.  We’d be lost without you!

One of the results of this onslaught of information is that there have been fewer and fewer academic papers about haplogroups in the past few years.  In essence, by the time a paper can make it through the peer review cycle and into publication, the data in the paper is often already outdated relative to the Y chromosome.  Recently a new paper was released about haplogroup C3*.  While the data is quite valid, the authors didn’t utilize the new SNP naming nomenclature.  Before writing about the topic, I had to translate into SNPese.  Fortunately, C3* has been relatively stable.

http://dna-explained.com/2014/12/23/haplogroup-c3-previously-believed-east-asian-haplogroup-is-proven-native-american/

10th Annual International Conference on Genetic Genealogy

The Family Tree DNA International Conference on Genetic Genealogy for project administrators is always wonderful, but this year was special because it was the 10th annual.  And yes, it was my 10th year attending as well.  In all these years, I had never had a photo with both Max and Bennett.  Everyone is always so busy at the conferences.  Getting any 3 people, especially those two, in the same place at the same time takes something just short of a miracle.

roberta, max and bennett

Ten years ago, it was the first genetic genealogy conference ever held, and was the only place to obtain genetic genealogy education outside of the rootsweb genealogy DNA list, which is still in existence today.  Family Tree DNA always has a nice blend of sessions.  I always particularly appreciate the scientific sessions because those topics generally aren’t covered elsewhere.

http://dna-explained.com/2014/10/11/tenth-annual-family-tree-dna-conference-opening-reception/

http://dna-explained.com/2014/10/12/tenth-annual-family-tree-dna-conference-day-2/

http://dna-explained.com/2014/10/13/tenth-annual-family-tree-dna-conference-day-3/

http://dna-explained.com/2014/10/15/tenth-annual-family-tree-dna-conference-wrapup/

Jennifer Zinck wrote great recaps of each session and the ISOGG meeting.

http://www.ancestorcentral.com/decennial-conference-on-genetic-genealogy/

http://www.ancestorcentral.com/decennial-conference-on-genetic-genealogy-isogg-meeting/

http://www.ancestorcentral.com/decennial-conference-on-genetic-genealogy-sunday/

I thank Family Tree DNA for sponsoring all 10 conferences and continuing the tradition.  It’s really an amazing feat when you consider that 15 years ago, this industry didn’t exist at all and wouldn’t exist today if not for Max and Bennett.

Education

Two educational venues offered classes for genetic genealogists and have made their presentations available either for free or very reasonably.  One of the problems with genetic genealogy is that the field is so fast moving that last year’s session, unless it’s the very basics, is probably out of date today.  That’s the good news and the bad news.

http://dna-explained.com/2014/11/12/genetic-genealogy-ireland-2014-presentations 

http://dna-explained.com/2014/09/26/educational-videos-from-international-genetic-genealogy-conference-now-available/

In addition, three books have been released in 2014.

emily book

In January, Emily Aulicino released Genetic Genealogy, The Basics and Beyond.

richard hill book

In October, Richard Hill released “Guide to DNA Testing: How to Identify Ancestors, Confirm Relationships and Measure Ethnicity through DNA Testing.”

david dowell book

Most recently, David Dowell’s new book, NextGen Genealogy: The DNA Connection was released right after Thanksgiving.

 

Ancestor Reconstruction – Raising the Dead

This seems to be the year that genetic genealogists are beginning to reconstruct their ancestors (on paper, not in the flesh) based on the DNA that the ancestors passed on to various descendants.  Those segments are “gathered up” and reassembled in a virtual ancestor.

I utilized Kitty Cooper’s tool to do just that.

http://dna-explained.com/2014/10/03/ancestor-reconstruction/

henry bolton probably

I know it doesn’t look like much yet, but this is what I’ve been able to gather of Henry Bolton, my great-great-great-grandfather.

Kitty did it herself too.

http://blog.kittycooper.com/2014/08/mapping-an-ancestral-couple-a-backwards-use-of-my-segment-mapper/

http://blog.kittycooper.com/2014/09/segment-mapper-tool-improvements-another-wold-dna-map/

Ancestry.com wrote a paper about the fact that they have figured out how to do this as well in a research environment.

http://corporate.ancestry.com/press/press-releases/2014/12/ancestrydna-reconstructs-partial-genome-of-person-living-200-years-ago/

http://www.thegeneticgenealogist.com/2014/12/16/ancestrydna-recreates-portions-genome-david-speegle-two-wives/

GedMatch has created a tool called, appropriately, Lazarus that does the same thing, gathers up the DNA of your ancestor from their descendants and reassembles it into a DNA kit.

Blaine Bettinger has been working with and writing about his experiences with Lazarus.

http://www.thegeneticgenealogist.com/2014/10/20/finally-gedmatch-announces-monetization-strategy-way-raise-dead/

http://www.thegeneticgenealogist.com/2014/12/09/recreating-grandmothers-genome-part-1/

http://www.thegeneticgenealogist.com/2014/12/14/recreating-grandmothers-genome-part-2/

Tools

Speaking of tools, we have some new tools that have been introduced this year as well.

Genome Mate is a desktop tool that organizes data collected by researching DNA comparisons and aids in identifying common ancestors.  I have not used this tool, but there are others who are quite satisfied.  It does require Microsoft Silverlight be installed on your desktop.

The Autosomal DNA Segment Analyzer is available through www.dnagedcom.com and is a tool that I have used and found very helpful.  It assists you by visually grouping your matches, by chromosome, and who you match in common with.

adsa cluster 1

Charting Companion from Progeny Software, another tool I use, allows you to colorize and print or create pdf files that include X chromosome groupings.  This greatly facilitates seeing how the X is passed through your ancestors to you and your parents.

x fan

WikiTree is a free resource for genealogists to be able to sort through relationships involving pedigree charts.  In November, they announced Relationship Finder.

Probably the best example I can show of how WikiTree has utilized DNA is using the results of King Richard III.

wiki richard

By clicking on the DNA icon, you see the following:

wiki richard 2

And then Richard’s Y, mitochondrial and X chromosome paths.

wiki richard 3

Since Richard had no descendants, to see how descendants work, click on his mother, Cecily of York’s DNA descendants and you’re shown up to 10 generations.

wiki richard 4

While this isn’t terribly useful for Cecily of York who lived and died in the 1400s, it would be incredibly useful for finding mitochondrial descendants of my ancestor born in 1802 in Virginia.  I’d love to prove she is the daughter of a specific set of parents by comparing her DNA with that of a proven daughter of those parents!  Maybe I’ll see if I can find her parents at WikiTree.

Kitty Cooper’s blog talks about additional tools.  I have used Kitty’s Chromosome mapping tools as discussed in ancestor reconstruction.

Felix Chandrakumar has created a number of fun tools as well.  Take a look.  I have not used most of these tools, but there are several I’ll be playing with shortly.

Exits and Entrances

With very little fanfare, deCODEme discontinued their consumer testing and reminded people to download their data before year end.

http://dna-explained.com/2014/09/30/decodeme-consumer-tests-discontinued/

I find this unfortunate because at one time, deCODEme seemed like a company full of promise for genetic genealogy.  They failed to take the rope and run.

On a sad note, Lucas Martin who founded DNA Tribes unexpectedly passed away in the fall.  DNA Tribes has been a long-time player in the ethnicity field of genetic genealogy.  I have often wondered if Lucas Martin was a pseudonym, as very little information about Lucas was available, even from Lucas himself.  Neither did I find an obituary.  Regardless, it’s sad to see someone with whom the community has worked for years pass away.  The website says that they expect to resume offering services in January 2015. I would be cautious about ordering until the structure of the new company is understood.

http://www.dnatribes.com/

In the last month, a new offering has become available that may be trying to piggyback on the name and feel of DNA Tribes, but I’m very hesitant to provide a link until it can be determined if this is legitimate or bogus.  If it’s legitimate, I’ll be writing about it in the future.

However, the big news exit was Ancestry’s exit from the Y and mtDNA testing arena.  We suspected this would happen when they stopped selling kits, but we NEVER expected that they would destroy the existing data bases, especially since they maintain the Sorenson data base as part of their agreement when they obtained the Sorenson data.

http://dna-explained.com/2014/10/02/ancestry-destroys-irreplaceable-dna-database/

The community is still hopeful that Ancestry may reverse that decision.

Ancestry – The Chromosome Browser War and DNA Circles

There has been an ongoing battle between Ancestry and the more seasoned or “hard-core” genetic genealogists for some time – actually for a long time.

The current and most long-standing issue is the lack of a chromosome browser, or any similar tools, that will allow genealogists to actually compare and confirm that their DNA match is genuine.  Ancestry maintains that we don’t need it, wouldn’t know how to use it, and that they have privacy concerns.

Other than their sessions and presentations, they had remained very quiet about this and not addressed it to the community as a whole, simply saying that they were building something better, a better mousetrap.

In the fall, Ancestry invited a small group of bloggers and educators to visit with them in an all-day meeting, which came to be called DNA Day.

http://dna-explained.com/2014/10/08/dna-day-with-ancestry/

In retrospect, I think that Ancestry perceived that they were going to have a huge public relations issue on their hands when they introduced their new feature called DNA Circles and in the process, people would lose approximately 80% of their current matches.  I think they were hopeful that if they could educate, or convince us, of the utility of their new phasing techniques and resulting DNA Circles feature that it would ease the pain of people’s loss in matches.

I am grateful that they reached out to the community.  Some very useful dialogue did occur between all participants.  However, to date, nothing more has happened nor have we received any additional updates after the release of Circles.

Time will tell.

http://dna-explained.com/2014/11/18/in-anticipation-of-ancestrys-better-mousetrap/

http://dna-explained.com/2014/11/19/ancestrys-better-mousetrap-dna-circles/

DNA Circles 12-29-2014

DNA Circles, while interesting and somewhat useful, is certainly NOT a replacement for a chromosome browser, nor is it a better mousetrap.

http://dna-explained.com/2014/11/30/chromosome-browser-war/

In fact, the first thing you have to do when you find a DNA Circle that you have not verified utilizing raw data and/or chromosome browser tools from either 23andMe, Family Tree DNA or Gedmatch, is to talk your matches into transferring their DNA to Family Tree DNA or download to Gedmatch, or both.

http://dna-explained.com/2014/11/27/sarah-hickerson-c1752-lost-ancestor-found-52-ancestors-48/

I might add that the great irony of finding the Hickerson DNA Circle that led me to confirm that ancestry utilizing both Family Tree DNA and GedMatch is that today, when I checked at Ancestry, the Hickerson DNA Circle is no longer listed.  So, I guess I’ve been somehow pruned from the circle.  I wonder if that is the same as being voted off of the island.  So, word to the wise…check your circles often…they change and not always in the upwards direction.

The Seamy Side – Lies, Snake Oil Salesmen and Bullies

Unfortunately, a seamy side, an underbelly that’s rather ugly, has developed in and around the genetic genealogy industry.  I guess this was to be expected with the rapid acceptance and increasing popularity of DNA testing, but it’s still very unfortunate.

Some of this I expected, but I didn’t expect it to be so…well…blatant.

I don’t watch late night TV, but I’m sure there are now DNA diets and DNA dating and just about anything else that could be sold with the allure of DNA attached to the title.

I googled to see if this was true, and it is, although I’m not about to click on any of those links.

google dna dating

google dna diet

Unfortunately, within the ever-growing genetic genealogy community a rather large rift has developed over the past couple of years.  Obviously everyone can’t get along, but this goes beyond that.  When someone disagrees, a group actively “stalks” the person, trying to cost them their employment, saying hate filled and untrue things and even going so far as to create a Facebook page titled “Against<personname>.”  That page has now been removed, but the fact that a group in the community found it acceptable to create something like that, and their friends joined, is remarkable, to say the least.  That was accompanied by death threats.

Bullying behavior like this does not make others feel particularly safe in expressing their opinions either and is not conducive to free and open discussion. As one of the law enforcement officers said, relative to the events, “This is not about genealogy.  I don’t know what it is about, yet, probably money, but it’s not about genealogy.”

Another phenomenon is that DNA is now a hot topic and is obviously “selling.”  Just this week, this report was published, and it is, as best we can tell, entirely untrue.

http://worldnewsdailyreport.com/usa-archaeologists-discover-remains-of-first-british-settlers-in-north-america/

There were several tip-offs: the city (Lanford) and county (Laurens County) are not in the state to which they are attributed (they’re in SC, not NC), and the name of the institution is incorrect (Johns Hopkins, not John Hopkins).  Additionally, if you google the name of the magazine, you’ll see that they specialize in tabloid “faux reporting.”  It also reads a lot like the King Richard genuine press release.

http://urbanlegends.about.com/od/Fake-News/tp/A-Guide-to-Fake-News-Websites.01.htm

Earlier this year, there was a bogus institutional site created as well.

On one of the DNA forums that I frequent, people often post links to articles they find that are relevant to DNA.  There was an interesting article, which has now been removed, correlating DNA results with latitude and altitude.  I thought to myself, I’ve never heard of that…how interesting.   Here’s part of what the article said:

Researchers at Aberdeen College’s Havering Centre for Genetic Research have discovered an important connection between our DNA and where our ancestors used to live.

Tiny sequence variations in the human genome sometimes called Single Nucleotide Polymorphisms (SNPs) occur with varying frequency in our DNA.  These have been studied for decades to understand the major migrations of large human populations.  Now Aberdeen College’s Dr. Miko Laerton and a team of scientists have developed pioneering research that shows that these differences in our DNA also reveal a detailed map of where our own ancestors lived going back thousands of years.

Dr. Laerton explains:  “Certain DNA sequence variations have always been important signposts in our understanding of human evolution because their ages can be estimated.  We’ve known for years that they occur most frequently in certain regions [of DNA], and that some alleles are more common to certain geographic or ethnic groups, but we have never fully understood the underlying reasons.  What our team found is that the variations in an individual’s DNA correlate with the latitudes and altitudes where their ancestors were living at the time that those genetic variations occurred.  We’re still working towards a complete understanding, but the knowledge that sequence variations are connected to latitude and altitude is a huge breakthrough by itself because those are enough to pinpoint where our ancestors lived at critical moments in history.”

The story goes on, but at the bottom, the traditional link to the publication journal is found.

The full study by Dr. Laerton and her team was published in the September issue of the Journal of Genetic Science.

I thought to myself, that’s odd, I’ve never heard of any of these people or this journal, and then I clicked to find this.

Aberdeen College bogus site

About that time, Debbie Kennett, DNA watchdog of the UK, posted this:

April Fools Day appears to have arrived early! There is no such institution as Aberdeen College founded in 1394. The University of Aberdeen in Scotland was founded in 1495 and is divided into three colleges: http://www.abdn.ac.uk/about/colleges-schools-institutes/colleges-53.php

The picture on the masthead of the “Aberdeen College” website looks very much like a photo of Aberdeen University. This fake news item seems to be the only live page on the Aberdeen College website. If you click on any other links, including the link to the so-called “Journal of Genetic Science”, you get a message that the website is experienced “unusually high traffic”. There appears to be no such journal anyway.

We also realized that Dr. Laerton, reversed, is “not real.”

I still have no idea why someone would invest the time and effort into the fake website emulating the University of Aberdeen, but I’m absolutely positive that their motives were not beneficial to any of us.

What is the take-away of all of this?  Be aware, very aware, skeptical and vigilant.  Stick with the mainstream vendors unless you realize you’re experimenting.

King Richard

King Richard III

The much anticipated and long-awaited DNA results on the remains of King Richard III became available with a very unexpected twist.  While the science team feels that they have positively identified the remains as those of Richard, the Y DNA of Richard and another group of men supposed to have been descended from a common ancestor with Richard carry DNA that does not match.

http://dna-explained.com/2014/12/09/henry-iii-king-of-england-fox-in-the-henhouse-52-ancestors-49/

http://dna-explained.com/2014/12/05/mitochondrial-dna-mutation-rates-and-common-ancestors/

Debbie Kennett wrote a great summary article.

http://cruwys.blogspot.com/2014/12/richard-iii-and-use-of-dna-as-evidence.html

More Alike than Different

One of the life lessons that genetic genealogy has held for me is that we are more closely related than we ever knew, to more people than we ever expected, and we are far more alike than different.  A paper recently published by 23andMe scientists documents that people’s ethnicity reflects the historic events that took place in the part of the country where their ancestors lived, such as slavery, the Trail of Tears and immigration from various worldwide locations.

23andMe European African map

From the 23andMe blog:

The study leverages samples of unprecedented size and precise estimates of ancestry to reveal the rate of ancestry mixing among American populations, and where it has occurred geographically:

  • All three groups – African Americans, European Americans and Latinos – have ancestry from Africa, Europe and the Americas.
  • Approximately 3.5 percent of European Americans have 1 percent or more African ancestry. Many of these European Americans who describe themselves as “white” may be unaware of their African ancestry since the African ancestor may be 5-10 generations in the past.
  • European Americans with African ancestry are found at much higher frequencies in southern states than in other parts of the US.

The ancestry proportions point to the different regional impacts of slavery, immigration, migration and colonization within the United States:

  • The highest levels of African ancestry among self-reported African Americans are found in southern states, especially South Carolina and Georgia.
  • One in every 20 African Americans carries Native American ancestry.
  • More than 14 percent of African Americans from Oklahoma carry at least 2 percent Native American ancestry, likely reflecting the Trail of Tears migration following the Indian Removal Act of 1830.
  • Among self-reported Latinos in the US, those from states in the southwest, especially from states bordering Mexico, have the highest levels of Native American ancestry.

http://news.sciencemag.org/biology/2014/12/genetic-study-reveals-surprising-ancestry-many-americans?utm_campaign=email-news-weekly&utm_source=eloqua

23andMe provides a very nice summary of the graphics in the article at this link:

http://blog.23andme.com/wp-content/uploads/2014/10/Bryc_ASHG2014_textboxes.pdf

The academic article can be found here:

http://www.cell.com/ajhg/home

2015

So what does 2015 hold? I don’t know, but I can’t wait to find out. Hopefully, it holds more ancestors, whether discovered through plain old paper research, cousin DNA testing or virtually raised from the dead!

What would my wish list look like?

  • More ancient genomes sequenced, including ones from North and South America.
  • Ancestor reconstruction on a large scale.
  • The haplotree becoming fleshed out and stable.
  • Big Y sequencing combined with STR panels for enhanced genealogical research.
  • Improved ethnicity reporting.
  • Mitochondrial DNA search by ancestor for descendants who have tested.
  • More tools, always more tools….
  • More time to use the tools!

Here’s wishing you an ancestor filled 2015!

 

Chromosome Browser War

There has been a lot of discussion lately, and I mean REALLY a lot, about chromosome browsers, the need or lack thereof, why, and what the information really means.

For the old timers in the field, we know the story, the reasons, and the backstory, but a lot of people don’t.  Not only are they only getting pieces of the puzzle, they’re confused about why there even is a puzzle.  I’ve been receiving very basic questions about this topic, so I thought I’d write an article about chromosome browsers, what they do for us, why we need them, how we use them and the three vendors, 23andMe, Ancestry and Family Tree DNA, who offer autosomal DNA products that provide a participant matching data base.

The Autosomal Goal

Autosomal DNA, which tests the part of your DNA that recombines between parents every generation, is utilized in genetic genealogy to do a couple of things.

  1. To confirm your connection to a specific ancestor through matches to other descendants.
  2. To break down genealogy brick walls.
  3. To determine ethnicity percentages, which is not the topic of this article.

The same methodology is used for items 1 and 2.

In essence, to confirm that you share a common ancestor with someone, you need to either:

  1. Be a close relative – meaning you tested your mother and/or father and you match as expected. Or, you tested another known relative, like a first cousin, for example, and you also match as expected. These known relationships and matches become important in confirming or eliminating other matches and in mapping your own chromosomes to specific ancestors.
  2. Have a triangulated match to at least two others who share the same distant ancestor. This happens when you match other people whose tree indicates that you share a common ancestor, but they are not previously known to you as family.

Triangulation is the only way you can prove that you do indeed share a common ancestor with someone not previously identified as family.

In essence, triangulation is the process by which you match people who match you genetically with common ancestors through their pedigree charts.  I wrote about the process in this article “Triangulation for Autosomal DNA.”

To prove that you share a common ancestor with another individual, the DNA of three proven descendants of that common ancestor must match at the same location.  I should add a little * to this and the small print would say, “on relatively large segments.”  That little * is rather controversial, and we’ll talk about that in a little bit.  This leads us to the next step, which is if you’re a fourth person, and you match all three of those other people on that same segment, then you too share that common ancestor.  This is the process by which adoptees and those who are searching for the identity of a parent work through their matches to work forward in time from common ancestors to, hopefully, identify candidates for individuals who could be their parents.
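For readers who think in code, the “same location” test at the heart of triangulation can be sketched in a few lines of Python.  This is only an illustration, not any vendor’s algorithm: the `Segment` type, the function names and the sample positions are all invented for the example, and it ignores the small print about relatively large segments.

```python
from typing import List, NamedTuple, Optional


class Segment(NamedTuple):
    chromosome: int
    start: int  # start position in base pairs
    end: int    # end position in base pairs


def shared_region(segments: List[Segment]) -> Optional[Segment]:
    """Return the stretch common to every segment, or None if there is none."""
    if not segments:
        return None
    chromosome = segments[0].chromosome
    if any(s.chromosome != chromosome for s in segments):
        return None
    start = max(s.start for s in segments)
    end = min(s.end for s in segments)
    return Segment(chromosome, start, end) if start < end else None


def is_triangulated(a_b: Segment, a_c: Segment, b_c: Segment) -> bool:
    """True when A-B, A-C and B-C all match on an overlapping stretch."""
    return shared_region([a_b, a_c, b_c]) is not None


# Hypothetical pairwise matches among three proven descendants A, B and C.
a_b = Segment(15, 20_000_000, 45_000_000)
a_c = Segment(15, 25_000_000, 50_000_000)
b_c = Segment(15, 22_000_000, 40_000_000)
print(is_triangulated(a_b, a_c, b_c))  # True: all three overlap on chromosome 15
```

A fourth person would be tested the same way, by adding their matching segments with A, B and C to the list passed to `shared_region`.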

Why do we need to do this?  Isn’t just matching our DNA and seeing a common ancestor in a pedigree chart with one person enough?  No, it isn’t.  I recently wrote about a situation where I had a match with someone and discovered that even though we didn’t know it, and still don’t know exactly how, we unquestionably share two different ancestral lines.

When you look at someone’s pedigree chart, you may see immediately that you share more than one ancestral line.  Your shared DNA could come from either line, both lines, or neither line – meaning from an unidentified common ancestor.  In genealogy parlance, those are known as brick walls!

Blaine Bettinger wrote about this scenario in his now classic article, “Everyone Has Two Family Trees – A Genealogical Tree and a Genetic Tree.”

Proving a Match

The only way to prove that you actually do share a genealogy relative with someone that is not a known family member is to triangulate.  This means searching other matches with the same ancestral surname, preferably finding someone with the same proven ancestral tree, and confirming that the three of you not only share matching DNA, but all three share the same matching DNA segments.  This means that you share the same ancestor.

Triangulation itself is a two-step process followed by a third step of mapping your own DNA so that you know where various segments came from.  The first two triangulation steps are discovering that you match other people on a common segment(s) and then determining if the matches also match each other on those same segments.

Both Family Tree DNA and 23andMe, as vendors, have provided ways to do most of this.  www.gedmatch.com and www.dnagedcom.com both augment the vendor offerings.  Ancestry provides no tools of this type – which is, of course, what has precipitated the chromosome browser war.

Let’s look at how the vendors’ products work in actual practice.

Family Tree DNA

1. Chromosome browser – do they match you?

Family Tree DNA makes it easy to see who you match in common with someone else in their matching tool, by utilizing the ICW crossed X icon.

chromosome browser war13

In the above example, I am seeing who I match in common with my mother.  Sure enough, our three known cousins are the closest matches, shown below.

chromosome browser war14

You can then push up to 5 individuals through to the chromosome browser to see where they match the participant.

The following chromosome browser is an example of a 4 person match showing up on the Family Tree DNA chromosome browser.

This example shows known cousins matching.  But this is exactly the same scenario you’re looking for when you are matching previously unknown cousins – the exact same technique.

In this example, I am the participant, so these matches are matches to me and my chromosome is the background chromosome displayed.  I have switched from my mother’s side to known cousins on my father’s side.

chromosome browser war1

The chromosome browser shows that these three cousins all match the person whose chromosomes are being shown (me, in this case), but it doesn’t tell you if they also match each other.  With known cousins, it’s very unlikely (in my case) that someone would match me from my mother’s side, and someone from my father’s side, but when you’re working with unknown cousins, it’s certainly possible.  If your parents are from the same core population, like Germans or an endogamous population, you may well have people who match you on both sides of your family.  Simply put, you can’t assume they don’t.

It’s also possible that the match is a genuine genealogical match, but you don’t happen to match on the exact same segments, so the ancestor can’t yet be confirmed until more cousins sharing that same ancestral line are found who do match.  It’s also possible that some segments, especially small segments below the match threshold, could be IBS, identical by state, meaning matches by chance.

2. Matrix – do they match each other?

Family Tree DNA also provides a tool called the Matrix where you can see if all of the people who match on the same segment, also match each other at some place on their DNA.

chromosome browser war2

The Matrix tool measures the same level of DNA as the default chromosome browser, so in the situation I’m using for an example, there is no issue.  However, if you drop the threshold of the match level, you may well, and in this case, you will, find matches well below the match threshold.  They are shown as matches because they have at least one segment above the match threshold.  If you don’t have at least one segment above the threshold, you’ll never see these smaller matches.  Just to show you what I mean, this is the same four people, above, with the threshold lowered to 1cM.  All those little confetti pieces of color are smaller matches.

chromosome browser war3

At Family Tree DNA, the match threshold is about 7cM.  Each of the vendors has a different threshold and a different way of calculating that threshold.
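The threshold behavior described above can be sketched as a toy model.  This assumes a single cutoff of 7cM, which simplifies whatever rule a vendor really applies; the function names and the segment sizes are made up for illustration.

```python
from typing import Dict, List

# Approximate figure quoted in the text; the vendor's exact rule is not modeled.
MATCH_THRESHOLD_CM = 7.0


def is_listed_match(segment_cms: List[float],
                    threshold: float = MATCH_THRESHOLD_CM) -> bool:
    """A person is listed as a match only if at least one segment reaches the threshold."""
    return any(cm >= threshold for cm in segment_cms)


def segments_shown(segment_cms: List[float], display_threshold: float) -> List[float]:
    """Segments drawn in the browser at a given display threshold."""
    return [cm for cm in segment_cms if cm >= display_threshold]


# Hypothetical shared segments, in cM, for two people compared with me.
people: Dict[str, List[float]] = {
    "cousin_a": [12.4, 3.1, 1.8],
    "stranger_b": [4.9, 2.2],
}

for name, cms in people.items():
    if is_listed_match(cms):
        # Lowering the display threshold to 1cM reveals the small "confetti" segments.
        print(name, segments_shown(cms, 1.0))
```

Only `cousin_a` is printed, along with all three segments.  `stranger_b` has no segment at or above the threshold, so those smaller segments are never seen at all.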

The only reason I mention this is because if you DON’T match with someone on the matrix, but you also show matches at smaller segments, understand that matrix is not reporting on those, so matrix matches are not negative proof, only positive indications – when you do match, both on the chromosome browser and utilizing the matrix tool.

What you do know at this point is that these individuals all match you on the same segments, and that they match each other someplace on their chromosomes, but what you don’t know is if they match each other on the same locations where they match you.
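The difference between matching “someplace” and matching “on the same locations” is easy to show with a small Python sketch.  The names and positions below are hypothetical, and the two functions are my own illustration of the distinction, not how the Matrix tool is implemented.

```python
from typing import Dict, FrozenSet, List, Tuple

Seg = Tuple[int, int, int]  # (chromosome, start, end)

# Hypothetical pairwise results; "me" is the account holder.
pairwise: Dict[FrozenSet[str], List[Seg]] = {
    frozenset({"me", "ann"}): [(15, 25_000_000, 40_000_000)],
    frozenset({"me", "bob"}): [(15, 27_000_000, 42_000_000)],
    frozenset({"ann", "bob"}): [(3, 5_000_000, 20_000_000)],
}


def match_somewhere(p: str, q: str) -> bool:
    """What a matrix-style view answers: do p and q match anywhere at all?"""
    return bool(pairwise.get(frozenset({p, q})))


def match_in_region(p: str, q: str, region: Seg) -> bool:
    """What triangulation needs: do p and q match inside this specific region?"""
    chromosome, start, end = region
    return any(
        c == chromosome and s < end and e > start
        for c, s, e in pairwise.get(frozenset({p, q}), [])
    )


# The stretch of chromosome 15 where both ann and bob match me.
region = (15, 27_000_000, 40_000_000)
print(match_somewhere("ann", "bob"))          # True: they match on chromosome 3
print(match_in_region("ann", "bob", region))  # False: not where they both match me
```

In this made-up case, a matrix-style check is satisfied, but the three of us are not triangulated on chromosome 15.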

If you are lucky and your matches are cousins or experienced genetic genealogists and are willing to take a look at their accounts, they can tell you if they match the other people on the same segments where they match you – but that’s the only way to know unless they are willing to download their raw data file to GedMatch.  At GedMatch, you can adjust the match thresholds to any level you wish and you can compare one-to-one kits to see where any two kits who have provided you with their kit number match each other.

3. Downloading data – mapping your chromosome.

The “download to Excel” function at Family Tree DNA, located just above the chromosome browser graphic, on the left, provides you with the matching data of the individuals shown on the chromosome browser with their actual segment data shown. (The download button on the right downloads all of your matches, not just the ones shown in the browser comparison.)

The spreadsheet below shows the downloaded data for these four individuals.  You can see on chromosome 15 (yellow) there are three distinct segments that match (pink, yellow and blue,) which is exactly what is reflected on the graphic browser as well.

chromosome browser war4

On the spreadsheet below, I’ve highlighted, in red, the segments which appeared on the original chromosome browser – so these are only the matches at or over the match threshold.

chromosome browser war5

As you can see, there are 13 in total.
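If you would rather not highlight rows by hand, the same filtering can be done with a short Python script.  This is a sketch under assumptions: the column names are hypothetical and may not match the headers in the real download, the rows are invented sample data rather than my actual matches, and the 7cM cutoff is the approximate figure mentioned earlier.

```python
import csv
import io
from collections import defaultdict
from typing import Dict, List

# Invented sample rows; the column names are assumptions, not the vendor's format.
SAMPLE = """match_name,chromosome,start,end,centimorgans
cousin_a,15,25000000,40000000,14.2
cousin_a,15,60000000,61000000,1.3
cousin_b,15,27000000,42000000,12.8
cousin_c,2,1000000,9000000,8.1
cousin_c,7,500000,900000,1.1
"""


def load_segments(text: str) -> List[dict]:
    """Parse the CSV text into a list of typed segment records."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        rows.append({
            "match_name": row["match_name"],
            "chromosome": int(row["chromosome"]),
            "start": int(row["start"]),
            "end": int(row["end"]),
            "cm": float(row["centimorgans"]),
        })
    return rows


def over_threshold(rows: List[dict], threshold: float = 7.0) -> List[dict]:
    """Keep only the segments at or over the match threshold."""
    return [r for r in rows if r["cm"] >= threshold]


def by_chromosome(rows: List[dict]) -> Dict[int, List[dict]]:
    """Group segments by chromosome so overlapping matches sit together."""
    grouped = defaultdict(list)
    for r in rows:
        grouped[r["chromosome"]].append(r)
    return dict(grouped)


kept = over_threshold(load_segments(SAMPLE))
for chromosome, segs in sorted(by_chromosome(kept).items()):
    print(chromosome, [(s["match_name"], s["cm"]) for s in segs])
```

With a real download, you would read the file from disk instead of `SAMPLE` and adjust the column names to whatever the file actually contains.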

Smaller Segments

Up to this point, the process I’ve shared is widely accepted as the gold standard.

In the genetic genealogy community, there are very divergent opinions on how to treat segments below the match threshold, or below even 10cM.  Some people “throw them away,” in essence, disregard them entirely.  Before we look at a real life example, let’s talk about the challenges with small segments.

When smaller segments match, along with larger segments, I don’t delete them, throw them away, or disregard them.  I believe that they are tools and each one carries a message for us.  Those messages can be one of four things.

  1. This is a valid IBD, meaning identical by descent, match where the segment has been passed from one specific ancestor to all of the people who match and can be utilized as such.
  2. This is an IBS match, meaning identical by state, and is called that because we can’t yet identify the common ancestor, but there is one. So this is actually IBD but we can’t yet identify it as such. With more matches, we may well be able to identify it as IBD, but if we throw it away, we never get that chance. As larger data bases and more sophisticated software become available, these matches will fall into place.
  3. This is an IBS match that is a false match, meaning the DNA segments that we receive from our father and mother just happen to align in a way that matches another person. Generally these are relatively easy to determine because the people you match won’t match each other. You also won’t tend to match other people with the same ancestral line, so they will tend to look like lone outliers on your match spreadsheets, but not always.
  4. This is an IBS match that is population based. These are much more difficult to determine, because this is a segment that is found widely in a population. The key to determining these pileup areas, as discussed in the Ancestry article about their new phasing technique, is that you will find this same segment matching different proven lineages. This is the reason that Ancestry has implemented phasing – to identify and remove these match regions from your matches. Ancestry provided a graphic of my pileup areas, although they did not identify for me where on my chromosomes these pileup regions occurred. I do have some idea however, because I’ve found a couple of areas where I have matches from my mother’s side of the family from different ancestors – so these areas must be IBS on a population level. That does not, however, make them completely irrelevant.
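The clue described in the fourth item, the same region matching people from different proven lineages, can be checked with a rough Python sketch.  This is my own simplified heuristic and certainly not Ancestry’s phasing method; the lineage labels, positions and the cutoff of two lineages are all assumptions for illustration.

```python
from typing import List, NamedTuple, Set


class LabeledSegment(NamedTuple):
    chromosome: int
    start: int
    end: int
    lineage: str  # proven ancestral line of the person who matches here


def lineages_overlapping(target: LabeledSegment,
                         segments: List[LabeledSegment]) -> Set[str]:
    """Collect the lineages of every segment that overlaps the target region."""
    return {
        s.lineage
        for s in segments
        if s.chromosome == target.chromosome
        and s.start < target.end
        and s.end > target.start
    }


def looks_like_pileup(target: LabeledSegment,
                      segments: List[LabeledSegment],
                      min_lineages: int = 2) -> bool:
    """Flag a region matched by people from several different proven lineages."""
    return len(lineages_overlapping(target, segments)) >= min_lineages


# Hypothetical matches on one side of the family.
segments = [
    LabeledSegment(6, 30_000_000, 33_000_000, "line A"),
    LabeledSegment(6, 31_000_000, 34_000_000, "line B"),
    LabeledSegment(9, 10_000_000, 20_000_000, "line A"),
]
print(looks_like_pileup(segments[0], segments))  # True: lines A and B overlap here
print(looks_like_pileup(segments[2], segments))  # False: only line A matches here
```

A flagged region is a candidate for a population-level match, not a verdict, which is why I still keep those segments rather than throwing them away.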

genome pileups

The challenge, and problem, is where to make the cutoff when you’re eliminating match areas based on phased data.  For example, I lost all of my Acadian matches at Ancestry.  Of course, you would expect an endogamous population to share lots of the same DNA – and there are a huge number of Acadian descendants today – they are in fact a “population,” but those matches are (were) still useful to me.

I utilize Acadian matches from Family Tree DNA and 23andMe to label that part of my chromosome “Acadian” even if I can’t track it to a specific Acadian ancestor, yet.  I do know from which of my mother’s ancestors it originated, her great-grandfather, who is her Acadian ancestor.  Knowing that much is useful as well.

The same challenge exists for other endogamous groups – people with Jewish, Mennonite/Brethren/Amish, Native American and African American heritage searching for their mixed race roots arising from slavery.  In fact, I’d go so far as to say that this problem exists for anyone looking for ancestors beyond the 5th or 6th generation, because segments inherited from those ancestors, if there are any, will probably be small and fall below the generally accepted match thresholds.  The only way you will be able to find them, today, is the unlikely event that there is one larger segment, and it leads you on a search, like the case with Sarah Hickerson.

I want to be very clear – if you’re looking for only “sure thing” segments – then the larger the matching segment, the better the odds that it’s a sure thing, a positive, indisputable, noncontroversial match.  However, if you’re looking for ancestors in the distant past, in the 5th or 6th generation or further, you’re not likely to find sure thing matches and you’ll have to work with smaller segments. It’s certainly preferable and easier to work with large matches, but it’s not always possible.

In the Ralph and Coop paper, The Geography of Recent Genetic Ancestry Across Europe, they indicated that people who matched on segments of 10cM or larger were more likely to have a common ancestor within the past 500 years.  Blocks of 4cM or larger were estimated to be from populations from 500-1500 years ago.  However, we also know that there are indeed sticky segments that get passed intact from generation to generation, and also that some segments don’t get divided in a generation, they simply disappear and aren’t passed on at all.  I wrote about this in my article titled, Generational Inheritance.

Another paper by Durand et al, Reducing pervasive false positive identical-by-descent segments detected by large-scale pedigree analysis, showed that 67% of the 2-4cM segments were false positives.  Conversely, that also means that 33% of the 2-4cM segments were legitimate IBD segments.

Part of the disagreement within the genetic genealogy community is based on a difference in goals.  People who are looking for the parents of adoptees are looking first and primarily for “sure thing” matches and the bigger the match segment, of course, the better because that means the people are related more closely in time.  For them, smaller segments really are useless.  However, for people who know their recent genealogy and are looking for those brick wall ancestors, several generations back in time, their only hope is utilizing those smaller segments.  This is not black and white but shades of grey.  One size does not fit all.  Nor is what we know today the end of the line.  We learn every single day and many of our learning experiences are by working through our own unique genealogical situations – and sharing our discoveries.

On this next spreadsheet, you can see the smaller segments surrounding the larger segments – in other words, in the same match cluster – highlighted in green.  These are the segments that would be discarded as invalid if you were drawing the line at the match threshold.  Some people draw it even higher, at 10 cM.  I’m not being critical of their methodology or saying they are wrong.  It may well work best for them, but discarding small segments is not the only approach and other approaches do work, depending on the goals of the researcher.  I want my 33% IBD segments, thank you very much.

All of the segments highlighted in purple match between at least three cousins.  By checking the other cousins’ accounts, I can validate that they do all match each other as well, even though I can’t tell this through the Family Tree DNA matrix below the matching threshold.  So, I’ve proven these are valid.  We all received them from our common ancestor.

What about the white rows?  Are those valid matches, from a common ancestor?  We don’t have enough information to make that determination today.

chromosome browser war6

Downloading my data, and confirming segments to this common ancestor allows me to map my own chromosomes.  Now, I know that if someone matches me and any of these three cousins on chromosome 15, for example, between 33,335,760 and 58,455,135 – they are, whether they know it or not, descended from our common ancestral line.
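As a rough illustration of the lookup described above, this Python sketch tests whether a new match's segment overlaps the confirmed chromosome 15 region. The `Segment` type and the `overlap_bp` function are hypothetical names invented for this example, and the new match's coordinates are made up. Keep in mind that overlapping my region is only half the test: the new person also has to match at least one of the confirmed cousins on that same stretch.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    chromosome: int
    start: int  # start position in base pairs
    end: int    # end position in base pairs


# The confirmed region quoted in the text above.
CONFIRMED_REGION = Segment(15, 33_335_760, 58_455_135)


def overlap_bp(a: Segment, b: Segment) -> int:
    """Return the number of base pairs shared by two segments (0 if none)."""
    if a.chromosome != b.chromosome:
        return 0
    return max(0, min(a.end, b.end) - max(a.start, b.start))


# Hypothetical new match; these coordinates are invented for illustration.
new_match = Segment(15, 40_000_000, 60_000_000)
print(overlap_bp(new_match, CONFIRMED_REGION))
```

A result of zero means the new match falls outside the mapped region and says nothing about this ancestral line one way or the other.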

In my opinion, I would think it a shame to discount or throw away all of these matches below 7cM, because you would be discounting 39 of your 52 total matches, or 75% of them.  I would be more conservative in assigning my segments with only one cousin match to any ancestor, but I would certainly note the match and hope that if I added other cousins, that segment would be eventually proven as IBD.
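To make that arithmetic concrete, here is a small Python sketch that counts how many segments a size cutoff would discard. The function name `discarded_by_threshold` is hypothetical, and the segment sizes are placeholders chosen only to mirror the 39 and 52 counts above; they are not the real values from my spreadsheet.

```python
def discarded_by_threshold(sizes_cm, threshold_cm=7.0):
    """Return (number discarded, total, fraction discarded) for a cM cutoff."""
    total = len(sizes_cm)
    if total == 0:
        return 0, 0, 0.0
    discarded = sum(1 for size in sizes_cm if size < threshold_cm)
    return discarded, total, discarded / total


# Placeholder sizes only: 39 segments under 7 cM and 13 at or above it,
# chosen to mirror the counts in the text, not copied from the spreadsheet.
sizes = [3.0] * 39 + [12.0] * 13
discarded, total, fraction = discarded_by_threshold(sizes)
print(f"{discarded} of {total} discarded ({fraction:.0%})")
```

Raising `threshold_cm` to 10, as some people do, can only push that fraction higher.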

I used positively known cousins in this example because there is no disputing the validity of these matches.  They were known as cousins long before DNA testing.

Breaking Down Brick Walls

This is the same technique utilized to break down brick walls – and the more cousins you have tested, so that you can identify the maximum number of chromosome pieces of a particular ancestor – the better.

I used this same technique to identify Sarah Hickerson in my Thanksgiving Day article, utilizing these same cousins, plus several more.

Hey, just for fun, want to see what chromosome 15 looks like in this much larger sample???

In this case, we were trying to break down a brick wall.  We needed to determine if Sarah Hickerson was the mother of Elijah Vannoy.  All of the individuals in the left “Name” column are proven Vannoy cousins from Elijah, or in one case, William, from another child of Sarah Hickerson.  The individuals in the right “Match” column are everyone in the cousin match group plus the people in green who are Hickerson/Higginson descendants.  William, in green, is proven to descend from Sarah Hickerson and her husband, Daniel Vannoy.

chromosome browser war7

The first part of chromosome 15 doesn’t overlap with the rest.  Buster, David and I share another ancestral line as well, so the match in the non-red section of chromosome 15 may well be from that ancestral line.  It becomes an obvious possibility, because none of the people who share the Vannoy/Hickerson/Higginson DNA are in that small match group.

All of the red colored cells do overlap with at least one other individual in that group and together they form a cluster.  The yellow highlighted cells are the ones over the match threshold.  The 6 Hickerson/Higginson descendants are scattered throughout this match group.
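For anyone who keeps their segments in a spreadsheet, grouping overlapping rows into clusters like this can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the `cluster_segments` function is a hypothetical helper, the names and positions are invented, and it treats two rows as belonging together when their positions overlap on my chromosome. Overlap alone is not proof; the people in a cluster still have to be checked against each other.

```python
def cluster_segments(segments):
    """Group (name, start, end) rows on one chromosome into overlap clusters."""
    clusters = []
    current = []
    current_end = None
    for seg in sorted(segments, key=lambda s: s[1]):
        _, start, end = seg
        if current and start < current_end:
            # Overlaps the running cluster, so extend it.
            current.append(seg)
            current_end = max(current_end, end)
        else:
            # No overlap: close the running cluster and start a new one.
            if current:
                clusters.append(current)
            current = [seg]
            current_end = end
    if current:
        clusters.append(current)
    return clusters


# Invented rows for illustration; positions are in arbitrary units.
rows = [("A", 10, 50), ("B", 40, 90), ("C", 200, 300), ("D", 80, 120)]
for group in cluster_segments(rows):
    print([name for name, _, _ in group])
```

In this made-up data, rows A, B and D chain together into one cluster while C stands alone, much like the first part of chromosome 15 stands apart from the red cells.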

And yes, for those who are going to ask, there are many more Vannoy/Hickerson triangulated groups.  This is just one of over 60 matching groups in total, some with matches well above the match threshold. But back to the chromosome browser wars!

23andMe

This example from 23andMe shows why it’s so very important to verify that your matches also match each other.

chromosome browser war8

Blue and purple match segments are to two of the same cousins that I used in the comparison at Family Tree DNA, who are from my father’s side.  Green is my first cousin from my mother’s side.   Note that on chromosome 11, they both match me on a common segment.  I know by working with them that they don’t match each other on that segment, so while they are both related to me, on chromosome 11, it’s not through the same ancestor.  One is from my father’s side and one is from my mother’s side.  If I hadn’t already known that, determining if they matched each other would be the acid test and would separate them into 2 groups.
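That acid test can be written down as a short Python sketch. It assumes you have the segment where each cousin matches you, plus the list of segments where the two cousins match each other. The function names and all of the coordinates are hypothetical, made up for this example.

```python
def shared_region(a, b):
    """Overlap of two (chromosome, start, end) segments, or None."""
    if a[0] != b[0]:
        return None
    start, end = max(a[1], b[1]), min(a[2], b[2])
    return (a[0], start, end) if start < end else None


def same_ancestor_candidate(me_vs_a, me_vs_b, a_vs_b_segments):
    """True only if A and B also match each other where both match me."""
    region = shared_region(me_vs_a, me_vs_b)
    if region is None:
        return False
    return any(shared_region(region, seg) is not None for seg in a_vs_b_segments)


# Invented coordinates: both cousins match me on chromosome 11,
# but they share no segment with each other.
me_vs_paternal = (11, 100, 500)
me_vs_maternal = (11, 300, 700)
print(same_ancestor_candidate(me_vs_paternal, me_vs_maternal, []))
```

A `False` here is the chromosome 11 situation above: both cousins are related to me at that location, but through different sides of my family.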

23andMe provides you with a tool to see who your matches match that you match too.  That’s a tongue twister.

In essence, you can select any individual, meaning you or anyone that you match, on the left hand side of this tool, and compare them to any 5 other people that you match.  In my case above, I compared myself to my cousins, but if I want to know if my cousin on my mother’s side matches my two cousins on my father’s side, I simply select her name on the left and theirs on the right by using the drop down arrows.

chromosome browser war9

I would show you the results, but it’s in essence a blank chromosome browser screen, because she doesn’t match either of them, anyplace, which tells me, if I didn’t already know, that these two matches are from different sides of my family.

However, in other situations, where I match my cousin Daryl, for example, as well as several other people on the same segment, I want to know how many of these people Daryl matches as well.  I can enter Daryl’s name, with my name and their names in the group of 5, and compare.  23andMe facilitates the viewing or download of the results in a matrix as well, along with the segment data.  You can also download your entire list of matches by requesting aggregated data through the link at the bottom of the screen above or the bottom of the chromosome display.

I find it cumbersome to enter each match’s name in the search tool and then enter all of the other matches’ names as well.  By utilizing the tools at www.dnagedcom.com, you can determine who your matches match as well, in common with you, in one spreadsheet.  Here’s an example.  Daryl in the chart below is my match, and this tool shows you who else she matches that I match as well, and the matching segments.  This allows me to correlate my match with Gwen for example, to Daryl’s match to Gwen to see if they are on the same segments.

chromosome browser war10

As you can see, Daryl and I both match Gwen on a common segment.  On my own chromosome mapping spreadsheet, I match several other people as well at that location, at other vendors, but so far, we haven’t been able to find any common genealogy.
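The kind of side-by-side comparison described above can be approximated with a small Python sketch. This is not how DNAGedcom works internally, which I don't know; it is just one way to line up two downloaded match lists. The `in_common_with` function, the placeholder names other than Gwen, and every coordinate are invented for the example.

```python
def in_common_with(my_matches, cousin_matches):
    """For people on both lists, return the regions where their segments overlap.

    Both arguments map a person's name to a list of
    (chromosome, start, end) segments.
    """
    result = {}
    for name in sorted(set(my_matches) & set(cousin_matches)):
        regions = []
        for c1, s1, e1 in my_matches[name]:
            for c2, s2, e2 in cousin_matches[name]:
                if c1 == c2 and max(s1, s2) < min(e1, e2):
                    regions.append((c1, max(s1, s2), min(e1, e2)))
        result[name] = regions
    return result


# Invented data: Gwen appears on both lists; the other names are placeholders.
mine = {"Gwen": [(5, 100, 400)], "Placeholder1": [(2, 10, 20)]}
daryl = {"Gwen": [(5, 250, 600)], "Placeholder2": [(7, 1, 9)]}
print(in_common_with(mine, daryl))
```

An empty list for a shared name would mean that Daryl and I both match that person, but on different segments.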

Ancestry.com

At Ancestry.com, I have exactly the opposite problem.  I have lots of people I DNA match, and some with common genealogy, but no tools to prove the DNA match is to the common ancestor.

Hence, this is the crux of the chromosome browser wars.  I’ve just shown you how we use chromosome browsers and tools to show whether our matches match each other in addition to us, and on which segments.  I’ve also illustrated why.  Neither 23andMe nor Family Tree DNA provides perfect tools, which is why we utilize both GedMatch and DNAGedcom, but they do provide tools.  Ancestry provides no tools of this type.

At Ancestry, you have two kinds of genetic matches – ones without tree matches and ones with tree matches.  Pedigree matching is a service that Ancestry provides that the other vendors don’t.  Unfortunately, it also leads people to believe that because they match these people genetically and share a tree, that the tree shown is THE genetic match and it’s to the ancestor shown in the tree.  In fact, if the tree is wrong, either your tree or their tree, and you match them genetically, you will show up as a pedigree match as well.  Even if both pedigrees are right, that still doesn’t mean that your genetic match is through that ancestor.

How many bad trees are at Ancestry percentagewise?  I don’t know, but it’s a constant complaint and there is absolutely nothing Ancestry can do about it.  All they can do is utilize what they have, which is what their customers provide.  And I’m glad they do.  It does make the process of working through your matches much easier. It’s a starting point.  DNA matches with trees that also match your pedigree are shown with Ancestry’s infamous shakey leaf.

In fact, in my Sarah Hickerson article, it was a shakey leaf match that initially clued me that there was something afoot – maybe. I had to shift to another platform (Family Tree DNA) to prove the match however, where I had tools and lots of known cousins.

At Ancestry, I now have about 3000 matches in total, and of those, I have 33 shakey leaves – or people with whom I also share an ancestor in our pedigree charts.  A few of those are the same old known cousins, just as genealogy crazy as me, and they’ve tested at all 3 companies.

The fly in the ointment, right off the bat, is that I noticed in several of these matches that I ALSO share another ancestral line.

Now, the great news is that Ancestry shows you your surnames in common, and you can click on the surname and see the common individuals in both trees.

The bad news is that you have to notice and click to see that information, found in the lower left hand corner of this screen.

chromosome browser war11

In this case, Cook is an entirely different line, not connected to the McKee line shown.

However, in this next case, we have the same individual entered in our software, but differently.  It wasn’t close enough to connect as an ancestor, but close enough to note.  It turns out that Sarah Cook is the mother of Fairwick Claxton, but her middle name was not Helloms, nor was her maiden name, although that is a long-standing misconception that was proven incorrect with her husband’s War of 1812 documents many years ago. Unfortunately, this misinformation is very widespread in trees on the internet.

chromosome browser war12

Out of curiosity, and now I’m sorry I did this because it’s very disheartening – I looked to see what James Lee Claxton/Clarkson’s wife’s name was shown to be on the first page of Ancestry’s advanced search matches.

Despite extensive genealogical and DNA research, we don’t know who James Lee Claxton/Clarkson’s parents are, although we’ve disproven several possibilities, including the most popular candidate pre-DNA testing.  However, James’ wife was positively Sarah Cook, as given by her, along with her father’s name, and by witnesses to their marriage provided when she applied for a War of 1812 pension and bounty land.  I have the papers from the National Archives.

James Lee Claxton’s wife, Sarah Cook, is identified as follows in the first 50 Ancestry search entries.

Sarah Cook – 4

Incorrect entries:

  • Sarah Cook but with James’ parents listed – 3
  • Sarah Helloms Cook – 2, one with James’ parents
  • Sarah Hillhorns – 15
  • Sarah Cook Hitson – 13, some with various parents for James
  • No wife, but various parents listed for James – 12
  • No wife, no parents – 1

I’d much rather see no wife and no parents than incorrect information.

Judy Russell has expressed her concern about the effects of incorrect trees and DNA as well and we shared this concern with Ancestry during our meeting.

Ancestry themselves, in their paper titled “Identifying groups of descendants using pedigrees and genetically inferred relationships in a large database,” say, “As with all analyses relating to DNA Circles™, tree quality is also an important caveat and limitation.”  So Ancestry is aware, but they are trying to leverage and utilize one of their biggest assets, their trees.

This brings us to DNA Circles.  I reviewed Ancestry’s new product release extensively in my Ancestry’s Better Mousetrap article.  To recap briefly, Ancestry gathers your DNA matches together, and then looks for common ancestors in trees that are public using an intelligent ranking algorithm that takes into account:

  1. The confidence that the match is due to recent genealogical history (versus a match due to older genealogical history or a false match entirely).
  2. The confidence that the identified common recent ancestor represents the same person in both online pedigrees.
  3. The confidence that the individuals have a match due to the shared ancestor in question as opposed to from another ancestor or from more distant genealogical history.

The key here is that Ancestry is looking for what they term “recent genealogical history.”  In their paper they define this as 10 generations, but the beta version of DNA Circles only looks back 7 generations today.  This was also reflected in their phasing paper, “Discovering IBD matches across a large, growing database.”

However, the unfortunate effect has been in many cases to eliminate matches, especially from endogamous groups.  By way of example, I lost my Acadian matches in the Ancestry new product release.  They would have been more than 7 generations back, and because they were endogamous, they may have “looked like” IBS segments, if IBS is defined at Ancestry as more than 7 or 10 generations back.  Hopefully Ancestry will tweak this algorithm in future releases.

Ancestry, according to their paper, “Identifying groups of descendants using pedigrees and genetically inferred relationships in a large database,” then clusters these remaining matching individuals together in Circles based on their pedigree charts.  You will match some of these people genetically, and some of them will not match you but will match each other.  Again, according to the paper, “these confidence levels are calculated by the direct-line pedigree size, the number of shared ancestral couples and the generational depth of the shared MRCA couple.”

Ancestry notes that, “using the concordance of two independent pieces of information, meaning pedigree relationships and patterns of match sharing among a set of individuals, DNA Circles can serve as supporting evidence for documented pedigree lines.”  Notice, Ancestry did NOT SAY proof.  Nothing that Ancestry provides in their DNA product constitutes proof.

Ancestry continues by saying that Circles “opens the possibility for people to identify distant relatives with whom they do not share DNA directly but with whom they still have genetic evidence supporting the relationship.”

In other words, Ancestry is being very clear in this paper, which is provided on the DNA Circles page for anyone with Circles, that they are giving you a tool, not “the answer,” but one more piece of information that you can consider as evidence.

joel vannoy circle

Joel Vannoy circle2

You can see in my Joel Vannoy circle that I match both of these people both genetically and on their tree.

We, in the genetic genealogy community, need proof.  It certainly could be available, technically – because it is with other vendors and third party sites.

We need to be able to prove that our matches also match each other, and utilizing Ancestry’s tools, we can’t.  We also can’t do this at Ancestry by utilizing third party tools, so we’re in essence, stuck.

We can either choose to believe, without substantiation, that we indeed share a common ancestor because we share DNA segments with them plus a pedigree chart from that common ancestor, or we can initiate a conversation with our match that leads to either or both of the following questions:

  1. Have you or would you upload your raw data to GedMatch?
  2. Have you or would you upload your raw data file to Family Tree DNA?

Let the begging begin!!!

The Problem

In a nutshell, the problem is that even if your Ancestry matches do reply and do upload their file to either Family Tree DNA or GedMatch or both, you are losing most of the potential information available, or that would be available, if Ancestry provided a chromosome browser and matrix type tool.

In other words, you’d have to convince all of your matches to upload their files, and then they would have to convince all of the people in the circle whom they match, but you don’t, to upload theirs as well.

Given that, of the 44 private tree shakey leaf matches that I sent messages to about 2 weeks ago, asking only for them to tell me the identity of our common pedigree ancestor, so far only 2 of them have replied, the odds of getting an entire group of people to upload files is infinitesimal.  You’d stand a better chance of winning the lottery.

One of the things Ancestry excels at is marketing.

ancestry ad1

If you’ve seen any of their ads, and they are everyplace, they focus on the “feel good” and they are certainly maximizing the warm fuzzy feelings at the holidays and missing those generations that have gone before us.

ancestry ad2

This is by no means a criticism, but it is why so many people do take the Ancestry DNA test. It’s advertised as easy and you’ll learn more about your family.  And you do, no question – you learn about your ethnicity and you get a list of DNA matches, pedigree matches when possible and DNA Circles.

The list of what you don’t get is every bit as important: a chromosome browser and tools to see whether your matches also match each other.  However, most of their customers will never know that.

Judging by the high percentage of inaccurate trees I found at Ancestry in my little experiment relative to the known and documented wife’s name of James Lee Claxton, which was 96%, based on just the first page of 50 search matches, it would appear that about 96% of Ancestry’s clientele are willing to believe something that someone else tells them without verification.  I doubt that it matters whether that information is a tree or a DNA test where they are shown  matches with common pedigree charts and circles.  I don’t mean this to be critical of those people.  We all began as novices and we need new people to become interested in both genealogy and DNA testing.

I suspect that most of Ancestry’s clients, especially new ones, simply don’t have a clue that there is a problem, let alone the magnitude and scope.  How would they?  They are just happy to find information about their ancestor.  And as someone said to me once – “but there are so many of those trees (with a wrong wife’s name), how can they all be wrong?”  Plus, the ads, at least some of them, certainly suggest that the DNA test grows your family tree for you.

ancestry ad3 signoff

The good news in all of this is that Ancestry’s widespread advertising has made DNA testing just part of the normal things that genealogists do.  Their marketing expertise along with recent television programs have served to bring DNA testing into the limelight. The bad news is that if people test at Ancestry instead of at a vendor who provides tools, we, and they, lose the opportunity to utilize those results to their fullest potential.  We, and they, lose any hope of proving an ancestor utilizing DNA.  And let’s face it, DNA testing and genealogy is about collaboration.  Having a DNA test that you don’t compare against others is pointless for genealogy purposes.

When a small group of bloggers and educators visited Ancestry in October, 2014, for what came to be called DNA Day, we discussed the chromosome browser and Ancestry’s plans for their new DNA Circles product, although it had not yet been named at that time.  I wrote about that meeting, including the fact that we discussed the need for a chromosome browser ad nauseam.  Needless to say, there was no agreement between the genetic genealogy community and the Ancestry folks.

When we discussed the situation with Ancestry they talked about privacy and those types of issues, which you can read about in detail in that article, but I suspect, strongly, that the real reason they aren’t keen on developing a chromosome browser lies in different areas.

  1. Ancestry truly believes that people cannot understand and utilize a chromosome browser and the information it provides. They believe that people who do have access to chromosome browsers are interpreting the results incorrectly today.
  2. They do not want to implement a complex feature for a small percentage of their users…the number bandied about informally was 5%…and I don’t know if that was an off-the-cuff number or based on market research. However, if you compare that number with the number of accurate versus inaccurate pedigree charts in my “James Claxton’s wife’s name” experiment, it’s very close…so I would say that the 5% number is probably close to accurate.
  3. They do not want to increase their support burden trying to explain the results of a chromosome browser to the other 95%. Keep in mind the number of users you’re discussing. They said in their paper they had 500,000 DNA participants. I think it’s well over 700,000 today, and they clearly expect to hit 1 million in 2015. So if you utilize a range – 5% of their users are 25,000-50,000 and 95% of their users are 475,000-950,000.
  4. Their clients have already paid their money for the test, as it is, and there is no financial incentive for Ancestry to invest in an add-on tool from which they generate no incremental revenue and do generate increased development and support costs. The only benefit to them is that we shut up!

So, the bottom line is that most of Ancestry’s clients don’t know or care about a chromosome browser.  There are, however, a very noisy group of us who do.

Many of Ancestry’s clients who purchase the DNA test do so as an impulse purchase with very little, if any, understanding of what they are purchasing, what it can or will do for them, at Ancestry or anyplace else, for that matter.

Any serious genealogist who researched the autosomal testing products would not make Ancestry their only purchase, especially if they could only purchase one test.  Many, if not most, serious genealogists have tested at all three companies in order to fish in different ponds and maximize their reach.  I suspect that most of Ancestry’s customers are looking for simple and immediate answers, not tools and additional work.

The flip side of that, however, is that we are very aware of what we, the genetic genealogy industry, need, and why, and how frustratingly lacking Ancestry’s product is.

Company Focus

It’s easy for us as extremely passionate and focused consumers to forget that all three companies are for-profit corporations.  Let’s take a brief look at their corporate focus, history and goals, because that tells a very big portion of the story.  Every company is responsible first and foremost to their shareholders and owners to be profitable, as profitable as possible which means striking the perfect balance of investment and expenditure with frugality.  In corporate America, everything has to be justified by ROI, or return on investment.

Family Tree DNA

Family Tree DNA was the first one of the companies to offer DNA testing and was formed in 1999 by Bennett Greenspan and Max Blankfeld, both still principals who run Family Tree DNA, now part of Gene by Gene, on a daily basis.  Family Tree DNA’s focus is only on genetic genealogy and they have a wide variety of products that produce a spectrum of information including various Y DNA tests, mitochondrial, autosomal, and genetic traits.  They are now the only commercial company to offer the Y STR and mitochondrial DNA tests, both very important tools for genetic genealogists, with a great deal of information to offer about our ancestors.

In April 2005, National Geographic’s Genographic project was announced in partnership with Family Tree DNA and IBM.  The Genographic project was scheduled to last for 5 years, but is now in its 9th year.  Family Tree DNA and National Geographic announced Geno 2.0 in July of 2012 with a newly designed chip that would test more than 12,000 locations on the Y chromosome, in addition to providing other information to participants.

The Genographic project provided a huge boost to genetic genealogy because it provided assurance of legitimacy and brought DNA testing into the living room of every family who subscribed to National Geographic magazine.  Family Tree DNA’s partnership with National Geographic led to the tipping point where consumer DNA testing became mainstream.

In 2011 the founders expanded the company to include clinical genetics and a research arm by forming Gene by Gene.  This allowed them, among other things, to bring their testing in house by expanding their laboratory facilities.  They have continued to increase their product offerings to include sophisticated high end tests like the Big Y, introduced in 2013.

23andMe

23andMe is also privately held and began offering testing for medical and health information in November 2007, initially offering “estimates of predisposition for more than 90 traits ranging from baldness to blindness.”  Their corporate focus has always been in the medical field, with aggregated customer data being studied by 23andMe and other researchers for various purposes.

In 2009, 23andMe began to offer the autosomal test for genealogists, the first company to provide this service.  Even though, by today’s standards, it was very expensive, genetic genealogists flocked to take this test.

In 2013, after several years of back and forth with 23andMe ultimately failing to reply to the FDA, the FDA forced 23andMe to stop providing the medical results.  Clients purchasing the 23andMe autosomal product since November of 2013 receive only ethnicity results and the genealogical matching services.

In 2014, 23andMe has been plagued by public relations issues and has not upgraded significantly nor provided additional tools for the genetic genealogy community, although they recently formed a liaison with My Heritage.

23andMe is clearly focused on genetics, but not primarily genetic genealogy, and their corporate focus during this last year in particular has been, I suspect, on how to survive, given the FDA action.  If they steer clear of that landmine, I expect that we may see great things in the realm of personalized medicine from them in the future.

Genetic genealogy remains a way for them to attract people to increase their data base size for research purposes.  Right now, until they can again begin providing health information, genetic genealogists are the only people purchasing the test, although 23andMe may have other revenue sources from the research end of the business.

Ancestry.com

Ancestry.com is a privately held company.  They were founded in the 1990s and have been through several ownership and organizational iterations, which you can read about in the wiki article about Ancestry.

During the last several years, Ancestry has purchased several other genealogy companies and is now the largest for-profit genealogy company in the world.  That’s either wonderful or terrible, depending on your experiences and perspective.

Ancestry has had an on-again-off-again relationship with DNA testing since 2002, with more than one foray into DNA testing and subsequent withdrawal from DNA testing.  If you are interested in the specifics, you can read about them in this article.

Ancestry’s goal, as it is with all companies, is profitability.  However, they have given themselves a very large black eye in the genetic genealogy community by doing things that we consider to be civically irresponsible, like destroying the Y and mitochondrial DNA data bases.  This still makes no sense, because while Ancestry spends money on one hand to acquire data bases and digitize existing records, on the other hand, they wiped out a data base containing tens of thousands of irreplaceable DNA records, which are genealogy records of a different type.  This was discussed at DNA Day and the genetic genealogy community retains hope that Ancestry is reconsidering their decision.

Ancestry has been plagued by a history of missteps and mediocrity in their DNA products, beginning with their Y and mitochondrial DNA products and continuing with their autosomal product.  Their first autosomal release included ethnicity results that gave many people very high percentages of Scandinavian heritage.  Ancestry never acknowledged a problem and defended their product to the end…until the day when they announced an update titled….a whole new you.  They are marketing geniuses.  While many people found their updated product much more realistic, not everyone was happy.  Judy Russell wrote a great summary of the situation.

It’s difficult, once a company has lost their credibility, for them to regain it.

I think Ancestry does a bang-up job at their primary corporate goal….genealogy records and subscriptions for people to access those records. I’m a daily user.  Today, with their acquisitions, it would be very difficult to be a serious genealogist without an Ancestry subscription….which is of course what their corporate goal has been.

Ancestry does an outstanding job of making everything look easy.  Their customer interface is intuitive and straightforward, for the most part. In fact, maybe they have made both genealogy and genetic genealogy look a little too easy.  I say this tongue in cheek, full well knowing that the ease of use is how they attract so many people, and those are the same people who ultimately purchase the DNA tests – but the expectation of swabbing and the answer appearing is becoming a problem.  I’m glad that Ancestry has brought DNA testing to so many people but this success makes tools like the chromosome browser/matrix that much more important – because there is so much genealogy information there just waiting to be revealed.  I also feel that their level of success and visibility visits upon them the responsibility for transparency and accuracy in setting expectations properly – from the beginning – with the ads. DNA testing does not “grow your tree” while you’re away.

I’m guessing Ancestry entered the DNA market again because they saw a way to sell an additional product, autosomal DNA testing, that would tie people’s trees together and provide customers with an additional tool, at an additional price, and give them yet another reason to remain subscribed every year.  Nothing wrong with that either.  For the owners, a very reasonable tactic to harness a captive data base whose ear you already have.

But Ancestry’s focus or priority is not now, and never has been, quality, nor genetic genealogy.  Autosomal DNA testing is a tool for their clients, a revenue generation source for them, and that’s it.  Again, not a criticism.  Just the way it is.

In Summary

As I look at the corporate focus of the three players in this space, I see three companies who are indeed following their corporate focus and vision.  That’s not a bad thing, unless the genetic genealogy community focus finds itself in conflict with the results of their corporate focus.

It’s no wonder that Family Tree DNA sponsors events like the International DNA Conference and works hand in hand with genealogists and project administrators.  Their focus is and always has been genetic genealogy.

People do become very frustrated with Family Tree DNA from time to time, but just try to voice those frustrations to upper management at either 23andMe or Ancestry and see how far you get.  My last helpdesk query to 23andMe submitted on October 24th has yet to receive any reply.  At Family Tree DNA, I e-mailed the project administrator liaison today, the Saturday after Thanksgiving, hoping for a response on Monday – but I received one just a couple hours later – on a holiday weekend.

In terms of the chromosome browser war, and that war is between the genetic genealogy community and Ancestry.com, I completely understand both positions.

The genetic genealogy community has been persistent, noisy, and united.  Petitions have been created and signed and sent to Ancestry upper management.  To my knowledge, no confirmation of any communication on this topic has ever been received, with the exception of Ancestry reaching out to the blogging and education community.

This lack of acknowledgement and/or action on the issues at hand frustrates the community terribly and causes reams of rather pointed and very direct replies to Anna Swayne and other Ancestry employees who are charged with interfacing with the public.  I actually feel sorry for Anna.  She is a very nice person.  If I were in her position, I’d certainly be looking for another job and letting someone else take the brunt of the dissatisfaction.  You can read her articles here.

I also understand why Ancestry is doing what they are doing – meaning their decision to not create a chromosome browser/match matrix tool.  It makes sense if you sit in their seat and now have to look at dealing with almost a million people who will wonder why they have to use a chromosome browser and/or other tools when they expected their tree to grow while they were away.

I don’t like Ancestry’s position, even though I understand it, and I hope that we, as a community, can help justify the investment to Ancestry in some manner, because I fully believe that’s the only way we’ll ever get a chromosome browser/match matrix type tool.  There has to be a financial benefit to Ancestry to invest the dollars and time into that development, as opposed to something else.  It’s not like Ancestry has additional DNA products to sell to these people.  The consumers have already spent their money on the only DNA product Ancestry offers, so there is no incentive there.

As long as Ancestry’s typical customer doesn’t know or care, I doubt that development of a chromosome browser will happen unless we, as a community, can, respectfully, be loud enough, long enough, like an irritating burr in their underwear that just won’t go away.

burr

The Future

What we “know” and can do today with our genomes far surpasses what we could do or even dreamed we could do 10 years ago or even 5 or 2 years ago.  We learn every day.

Yes, there are a few warts and issues to iron out.  I always hesitate to use words like “can’t,” “never” and “always” or to use other very strongly opinionated or inflexible words, because those words may well need to be eaten shortly.

There is so much more yet to be done, discovered and learned.  We need to keep open minds and be willing to “unlearn” what we think we knew when new and better information comes along.  That’s how scientific discovery works.  We are on the frontier, the leading edge and yes, sometimes the bleeding edge.  But what a wonderful place to be, to be able to contribute to discovery on a new frontier, our own genes and the keys to our ancestors held in our DNA.

Anzick (12,707-12,556), Ancient One, 52 Ancestors #42

anzick burial location

His name is Anzick, named for the family land, above, where his remains were found, and he is 12,500 years old, or more precisely, born between 12,707 and 12,556 years before the present.  Unfortunately, my genealogy software is not prepared for a birth year with that many digits.  That’s because, until just recently, we had no way to know that we were related to anyone of that age….but now….everything has changed ….thanks to DNA.

Actually, Anzick himself is not my direct ancestor.  We know that definitively, because Anzick was a child when he died, in present day Montana.

anzick on us map

Anzick was loved and cherished, because he was smeared with red ochre before he was buried in a cave, where he would be found more than 12,000 years later, in 1968, just beneath a layer of approximately 100 Clovis stone tools, shown below.  I’m sure his parents then, just as parents today, stood and cried as they laid their son to rest….never suspecting just how important their son would be some 12,500 years later.

anzick clovis tools

From 1968 until 2013, the Anzick family looked after Anzick’s bones, and in 2013, Anzick’s DNA was analyzed.

DNA analysis of Anzick provided us with his mitochondrial haplogroup,  D4h3a, a known Native American grouping, and his Y haplogroup was Q-L54, another known Native American haplogroup.  Haplogroup Q-L54 itself is estimated to be about 16,900 years old, so this finding is certainly within the expected range.  I’m not related to Anzick through Y or mitochondrial DNA.

Utilizing the admixture tools at GedMatch, we can see that Anzick shows most closely with Native American and Arctic with a bit of east Siberian.  This all makes sense.

Anzick MDLP K23b

Full genome sequencing was performed on Anzick, and from that data, it was discovered that Anzick was related to Native Americans, closely related to Mexican, Central and South Americans, and not closely related to Europeans or Africans.  This was an important discovery, because it in essence disproves the Solutrean hypothesis that Clovis predecessors emigrated from Southwest Europe during the last glacial maximum, about 20,000 years ago.

anzick matches

The distribution of these matches was a bit surprising, in that I would have expected the closest matches to be from North America, in particular, near to where Anzick was found, but his closest matches are south of the US border.  Although, in all fairness, few people in Native tribes in the US have DNA tested and many are admixed.

This match distribution tells us a lot about population migration and distribution of the Native people after they left Asia and crossed Beringia on the land bridge, now submerged, into present day Alaska.

This map of Beringia, from the 2008 paper by Tamm et al., shows the migration of Native people into (and back from) the New World.

beringia map

Anzick’s ancestors crossed Beringia during this time, and over the next several thousand years, found their way to Montana.  Some of Anzick’s relatives found their way to Mexico, Central and South America.  The two groups may have split when Anzick’s family group headed east instead of south, possibly following the edges of glaciers, while the south-moving group followed the coastline.

Recently, from Anzick’s full genome data, another citizen scientist extracted the DNA locations that the testing companies use for autosomal DNA results, created an Anzick file, and uploaded the file to the public autosomal matching site, GedMatch.  This allowed everyone to see if they matched Anzick.  We expected no, or few, matches, because after all, Anzick was more than 12,000 years old and all of his DNA would have washed out long ago due to the 50% replacement in every generation….right?  Wrong!!!

What a surprise to discover fairly large segments of DNA matching Anzick in living people, and we’ve spent the past couple of weeks analyzing and discussing just how this has happened and why.  In spite of some technical glitches in terms of just how much individual people carry of the same DNA Anzick carried, one thing is for sure, the GedMatch matches confirm, in spades, the findings of the scientists who wrote the recent paper that describes the Anzick burial and excavation, the subsequent DNA processing and results.

For people who carry known Native heritage, matches, especially relatively large matches to Anzick, confirm not only their Native heritage, but his too.

For people who suspect Native heritage, but can’t yet prove it, an Anzick match provides what amounts to a clue – and it may be a very important clue.

In my case, I have proven Native heritage through the Micmac who intermarried with the Acadians in the 1600s in Nova Scotia.  Given that Anzick’s people were clearly on a west to east movement, from Beringia to wherever they eventually wound up, one might wonder if the Micmac were descended from or otherwise related to Anzick’s people.  Clearly, based on the genetic affinity map, the answer is yes, but not as closely related to Anzick as Mexican, Central and South Americans.

After several attempts utilizing various files, thresholds and factors that produced varying levels of matching to Anzick, one thing is clear – there is a match on several chromosomes.  Someplace, sometime in the past, Anzick and I shared a common ancestor – and it was likely on this continent, or Beringia, since the current school of thought is that all Native people entered the New World through this avenue.  The school of thought is not united in an opinion about whether there was a single migration event, or multiple migrations to the New World.  Regardless, the people came from the same base population in far northeast Asia and intermingled after arriving here if they were in the same location with other immigrants.

In other words, there probably wasn’t much DNA to pass around.  In addition, it’s unlikely that the founding population was a large group – probably just a few people – so in very short order their DNA would be all the same, being passed around and around until they met a new population, which wouldn’t happen until the Europeans arrived on the east side of the continent in the 1400s.  The tribes least admixed today are found south of the US border, not in the US.  So it makes sense that today the least admixed people would match Anzick the most closely – because they carry the most common DNA, which is still the same DNA that was being passed around and around back then.

Many of us with Native ancestors do carry bits and pieces of the same DNA as Anzick.  Anzick can’t be our ancestor, but he is certainly our cousin, about 500 generations ago, using a 25 year generation, so roughly our 500th cousin.  I had to laugh at someone this week, an adoptee who said, “Great, I can’t find my parents but now I have a 12,500 year old cousin.”  Yep, you do!  The ironies of life, and of genealogy, never fail to amaze me.
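For anyone who wants to check the arithmetic, here is a minimal sketch in Python.  It uses the 12,500 year age and the 25 year generation from above, and it assumes the simplified “half from each parent” rule that led us to expect no matches.  The function name is mine, for illustration only.

```python
# Rough arithmetic only: a sketch of the naive expectation, not a genetic model.
YEARS_BEFORE_PRESENT = 12_500   # approximate age of Anzick, from the text
YEARS_PER_GENERATION = 25       # generation length used in the text

generations = YEARS_BEFORE_PRESENT // YEARS_PER_GENERATION


def expected_share(n_generations: int) -> float:
    """Naive share of DNA from one person n generations back,
    assuming an exact 50% halving in every generation."""
    return 0.5 ** n_generations


print(f"Generations back to Anzick's time: {generations}")
print(f"Naive expected share from one person that far back: {expected_share(generations):.3e}")
```

The naive share is so small that it is effectively zero, which is why the matches were such a surprise.  The halving rule treats every ancestor as a separate, unrelated person.  In a small founding population, where the same DNA was passed around and around, that assumption does not hold.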

Utilizing the most conservative matching routine possible, on a phased kit, meaning one that combines the DNA shared by my mother and myself, and only that DNA, we show the following segment matches with Anzick.

Chr   Start Location   End Location   Centimorgans (cM)   SNPs
2     218855489        220351363      2.4                 253
4     1957991          3571907        2.5                 209
17    53111755         56643678       3.4                 293
19    46226843         48568731       2.2                 250
21    35367409         36761280       3.7                 215
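If you keep your own segment matches in a spreadsheet or text file, a few lines of Python can total them up.  This is only a sketch: the rows are copied from the table above, and the `Segment` class and `parse_segments` function are names I made up for the example, not part of any GedMatch tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    chrom: str   # chromosome label, kept as text so "X" works too
    start: int   # start location (base pair position)
    end: int     # end location (base pair position)
    cm: float    # segment size in centimorgans
    snps: int    # number of SNPs in the segment

    @property
    def length_bp(self) -> int:
        """Physical length of the segment in base pairs."""
        return self.end - self.start


# Rows copied from the phased-kit match table above.
ROWS = """\
2 218855489 220351363 2.4 253
4 1957991 3571907 2.5 209
17 53111755 56643678 3.4 293
19 46226843 48568731 2.2 250
21 35367409 36761280 3.7 215
"""


def parse_segments(text: str) -> list[Segment]:
    """Turn whitespace-separated rows into Segment records."""
    segments = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        chrom, start, end, cm, snps = line.split()
        segments.append(Segment(chrom, int(start), int(end), float(cm), int(snps)))
    return segments


segments = parse_segments(ROWS)
total_cm = sum(s.cm for s in segments)
largest = max(segments, key=lambda s: s.cm)

print(f"{len(segments)} segments, {total_cm:.1f} cM in total")
print(f"Largest: chromosome {largest.chrom}, {largest.cm} cM, {largest.length_bp:,} base pairs")
```

Keeping the chromosome as text rather than a number means the same code can hold X chromosome segments as well.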

Being less conservative produces many more matches, some of which are questionable as to whether they are simply convergence, so I haven’t utilized the less restrictive match thresholds.

Of those matches above, the one on chromosome 17 matches to a known Micmac segment from my Acadian lines and the match on chromosome 2 also matches an Acadian line, but I share so many common ancestors with this person that I can’t tell which family line the DNA comes from.

There are also Anzick autosomal matches on my father’s side.  My Native ancestry on his side reaches back to colonial America, in either Virginia or North Carolina, or both, and is unproven as to the precise ancestor and/or tribe, so I can’t correlate the Anzick DNA with proven Native DNA on that side.  Neither can I associate it with a particular family, as most of the Anzick matches aren’t to areas on my chromosome that I’ve mapped positively to a specific ancestor.

Running a special utility at GedMatch that compared Anzick’s X chromosome to mine, I find that we share a startlingly large X segment.  Sometimes, the X chromosome is passed for generations intact.

Interestingly enough, the segment 100,479,869-103,154,989 matches a segment from my mother exactly, but the large 6cM segment does not match my mother, so I’ve inherited that piece of my X from my father’s line.

Chr   Start Location   End Location   Centimorgans (cM)   SNPs
X     100479869        103154989      1.4                 114
X     109322285        113215103      6.0                 123

This tells me immediately that this segment comes from one of the pink or blue lines on the fan chart below that my father inherited from his mother, Ollie Bolton, since men don’t inherit an X chromosome from their father.  Utilizing the X pedigree chart reduces the possible lines of inheritance quite a bit, and is very suggestive of some of those unknown wives.

olliex
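The rule that narrows the chart, that a man gets his X only from his mother while a woman gets one from each parent, can be sketched in a few lines of Python.  This is an illustration, not a GedMatch utility.  It assumes standard ahnentafel numbering (you are 1, a person’s father is twice their number, the mother is twice their number plus one), and the function name is hypothetical.

```python
def x_ancestors(sex: str, generations: int, start: int = 1) -> list[list[int]]:
    """List, generation by generation, the ahnentafel numbers of the
    ancestors who could have passed X chromosome DNA to the starting person.

    sex is "M" or "F" for the starting person.  A male inherits an X only
    from his mother; a female inherits one from each parent.
    """
    current = [(start, sex)]
    result = []
    for _ in range(generations):
        parents = []
        for number, person_sex in current:
            if person_sex == "F":
                parents.append((2 * number, "M"))   # father contributes an X to a daughter
            parents.append((2 * number + 1, "F"))   # mother always contributes an X
        result.append([number for number, _ in parents])
        current = parents
    return result


# A woman's possible X ancestors over three generations.
for depth, numbers in enumerate(x_ancestors("F", 3), start=1):
    print(f"Generation {depth}: {numbers}")

# Starting from a paternal grandmother (ahnentafel number 5), the position
# Ollie Bolton holds on my chart, and going back two more generations.
print(x_ancestors("F", 2, start=5))
```

The number of possible X ancestors grows much more slowly than the full pedigree, which is why the X chart rules out so many lines.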

It’s rather amazing, if you think about it, that anyone today matches Anzick, or that we can map any of our ancestral DNA that both we and Anzick carry to a specific ancestor.

Indeed, we do live in exciting times.

Honoring Anzick

On a rainy Saturday in June 2014, on a sagebrush hillside in Montana, Anzick, in Native parlance our “grandfather,” was reburied, bringing his journey full circle.  Sarah Anzick, a molecular biologist and the daughter of the family that owns the land where the bones were found, who did part of the genetic discovery work on Anzick, returned the box with his bones for reburial.

anzick bones

More than 50 people, including scientists, members of the Anzick family and representatives of six Native American tribes, gathered for the nearly two-hour reburial ceremony. Tribe members said prayers, sang songs, played drums and rang bells to honor the ancient child. The bones were placed in the grave and sprinkled with red ocher, just like when his parents buried him some 12,500 years before.

Participants at the reburial ceremony filled in the grave with handfuls, then shovelfuls of dirt and covered it with stones. A stick tied with feathers marks Anzick’s final resting place.

Sarah Anzick tells us that, “At that point, it stopped raining. The clouds opened up and the sun came out. It was an amazing day.”

I wish I could have been there.  I would have, had I known.  After all, he is part of me, and I of him.

anzick grave

Welcome to the family, Anzick, and thank you, thank you oh so much, for your priceless, unparalleled gift!!!

tobacco

If you want to read about the Anzick matching journey of DNA discovery, here are the articles I’ve written in the past two weeks.  It has been quite a roller coaster ride, but I’m honored and privileged to be doing this research.  And it’s all thanks to an ancient child named Anzick.

Utilizing Ancient DNA at Gedmatch

Analyzing the Native American Anzick Clovis Native American Results

New Native American Mitochondrial DNA Haplogroups Extrapolated from Anzick Match Results

Ancient DNA Matching, A Cautionary Tale

More Ancient DNA Samples for Comparison