Recovering the Past – WayBackMachine

Nothing is forever, especially not on the internet.

Have you ever used a site, only to discover that precious information was gone the next time you wanted to reference it?  And I don’t mean that a piece of data was missing, but that the entire site was AWOL.

We think that in today’s digital world more and more information is becoming available, and while that’s true, some also disappears.  People die; sites and providers become obsolete.  Whatever the reason, you may have some recourse for finding that missing site.

The site WayBackMachine, provided by Internet Archive, “crawls” sites and archives their contents, or at least part of their contents, periodically. They have saved over 308 billion, yes billion, web pages since 1996 – 21 years.

And by the way, Internet Archive is funded by contributions, so if you use the site and find it valuable, please contribute what you can.

Find the Name and URL of the Site You Seek

The first piece of information you need is the actual website address of the site you are seeking. You can obtain that in a number of ways:

  • Check your saved links
  • Look in any document where you may have saved or embedded a link
  • Check old Genforum or Rootsweb lists that might pertain
  • Google for the site name or any other information that might produce a result

Note that each page of a site has its own URL, so you may need a page URL, not just the main site’s URL.  The main site’s URL leads to the cover or landing page, which may or may not lead to the page you actually want.

Let’s say all I can find are links where I can’t actually see the website address.  What then? Let’s step through this process.

Finding the Address of an Embedded Link

Next, go to WayBackMachine at this link:  https://web.archive.org/

I provided the actual link above to illustrate the difference between an embedded link, under the word WayBackMachine, and a link that is spelled out with its actual url.  Sometimes you can “mouse over” or “fly over” the embedded link with your cursor to display the real address.  Sometimes not.

To find the actual address of the embedded link behind the word WayBackMachine, above, click or double click on the link. You may have to control+click. The link will then take you to the address or url.  If the site is there, you’re in luck.  If not, you will receive an error message, but you will then be able to see in the url line the address to which the embedded link tried to resolve.  That’s the address you want, which is the same as the link that is spelled out. Copy that link, because you’ll need it for finding an archived copy in WayBackMachine.
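If you have saved a web page or an HTML email to your computer and are comfortable with a little programming, you can also pull the hidden addresses out directly. What follows is only a rough sketch in Python, using the standard library’s html.parser module. The names LinkCollector and find_link_addresses are my own inventions for illustration, and the sample text is simply the WayBackMachine link from above.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects (visible text, address) pairs from <a href="..."> tags.

    LinkCollector is a hypothetical helper written for this example.
    """

    def __init__(self):
        super().__init__()
        self.links = []     # finished (text, href) pairs
        self._href = None   # address of the link we are currently inside
        self._text = []     # pieces of visible text seen inside that link

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text).strip(), self._href))
            self._href = None


def find_link_addresses(html):
    """Return every embedded link in the HTML as (visible text, address)."""
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return collector.links


sample = '<p>Go to <a href="https://web.archive.org/">WayBackMachine</a> next.</p>'
print(find_link_addresses(sample))
# [('WayBackMachine', 'https://web.archive.org/')]
```

The second item in each pair is the address you would copy and take to WayBackMachine.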

Using WayBackMachine

By now, you should be at WayBackMachine.  Let’s use my own blog address as our guinea pig.  Let’s pretend that for some reason, my blog was suddenly gone.  Yes, in a pique of outrage or a horrible mistake, I could delete all 900+ articles in the blink of an eye by deleting the site itself.  Of course, I’m not planning for that to happen. But life doesn’t always go according to plan.

However, and this is a really big however, should I die unexpectedly, you know, like from that blood clot when chocolate and my ancestors tried to kill me earlier this year, and no one paid the annual fee to WordPress, my blog seriously would be gone. So would anyone else’s in the same situation.  WordPress is free “forever” for unpaid sites, but paid sites are another matter.  And who knows what forever means in reality.

At WayBackMachine, enter the url of the site you want to find.  I’m calling this the target site – the one you are searching for.

If you enter a partial url, WayBackMachine finds candidates from as much as you entered.
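You can also skip the search box and build the WayBackMachine address yourself. The sketch below makes a couple of assumptions: that the calendar view lives at web.archive.org/web/*/ followed by the target address, and that Internet Archive’s CDX search accepts the url, matchType, output, collapse and limit parameters the way I understand it to. The prefix search is my approximation of the partial url lookup, not necessarily what the search box does behind the scenes. The function names are mine, made up for illustration.

```python
from urllib.parse import urlencode

# Assumed base addresses; check Internet Archive's own documentation.
CALENDAR_BASE = "https://web.archive.org/web/*/"
CDX_BASE = "https://web.archive.org/cdx/search/cdx"


def calendar_url(target):
    """Address of the calendar view for one target site or page."""
    return CALENDAR_BASE + target


def prefix_search_url(partial, limit=50):
    """Address of a search listing archived pages that start with `partial`."""
    params = {
        "url": partial,
        "matchType": "prefix",   # match everything beginning with `partial`
        "output": "json",
        "collapse": "urlkey",    # one row per distinct page address
        "limit": limit,
    }
    return CDX_BASE + "?" + urlencode(params)


print(calendar_url("www.dna-explained.com"))
print(prefix_search_url("dna-explained.com"))
```

Paste either printed address into your browser. The first should open the same calendar you would reach by typing the target site into WayBackMachine.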

If you have used this tool before, the format has changed and isn’t terribly intuitive, or wasn’t to me. Let’s step through the results.

What You See

For www.dna-explained.com, you can see that they began crawling, which is a technical term for scanning, my blog in mid-2012.  That’s exactly when I started this blog.

They have scanned the blog often ever since, which makes sense, given that I publish at least twice weekly.

On the top row, you are positioned in the current year, whose calendar is displayed below the year band. To view other years, slide back and forth on the year bar. The year highlighted in yellow is the one whose calendar you are viewing.

On the calendar portion, you will see blue or green dots.

Now, you’re going to laugh, but I could not for the life of me figure out how to actually display the website I was searching for.  In all fairness, the site I was hunting was older and the little colored dots were not visible on my screen, meaning I would have had to scroll down to see them.  This is where you need another set of eyes.  I want to say a very big thank you to my long time friend (and DNA project co-administrator) Janet Crain for figuring out what to do next.

On the calendar, click on the blue and green dots to view actual archived pages from the site you are seeking. If you’re saying “duh,” I know, so was I.  It’s intuitive AFTER you know how it works and you actually see the dots.  In my defense, Janet said it took her a while to figure this out too. Maybe she was just being nice 😊
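If hunting for dots isn’t your thing, Internet Archive also offers an “availability” lookup that returns the archived copy closest to a date you choose. Here is a minimal sketch of how I understand it to work, assuming the address archive.org/wayback/available with url and timestamp parameters and a JSON reply containing archived_snapshots and closest. The function names are hypothetical, and the sample reply is made up for illustration, not a real result for my blog.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed address of the availability lookup.
AVAILABILITY_API = "https://archive.org/wayback/available"


def availability_query(target, timestamp=None):
    """Build the lookup address; timestamp looks like YYYYMMDD (optional)."""
    params = {"url": target}
    if timestamp:
        params["timestamp"] = timestamp
    return AVAILABILITY_API + "?" + urlencode(params)


def closest_snapshot(reply_text):
    """Return the archived copy's address from a JSON reply, or None."""
    data = json.loads(reply_text)
    closest = data.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest["url"]
    return None


def lookup(target, timestamp=None):
    """Ask Internet Archive over the network (not run in this example)."""
    with urlopen(availability_query(target, timestamp), timeout=30) as reply:
        return closest_snapshot(reply.read().decode("utf-8"))


# Made-up reply in the assumed format, for illustration only.
sample_reply = json.dumps({
    "archived_snapshots": {
        "closest": {
            "available": True,
            "url": "http://web.archive.org/web/20170101000000/http://www.dna-explained.com/",
            "timestamp": "20170101000000",
            "status": "200",
        }
    }
})

print(availability_query("www.dna-explained.com", "20170101"))
print(closest_snapshot(sample_reply))
print(closest_snapshot('{"archived_snapshots": {}}'))  # None when nothing is archived
```

The long number in the archived address is the date and time of the scan, which is the same information the dots on the calendar represent.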

Once WayBackMachine brings up the target site for you to view, you can then click on links on that original site, and those links will (sometimes) go to other pages on the site that WayBackMachine has also saved.

Not all target site links are saved, and links that involve applications (like searching for a surname) don’t work, because the application isn’t saved, just the viewing page.  Sometimes search features are just ways to view additional pages, and if that is the case, you may be able to find what you are seeking by poking around. For example, if the search is only making it easier to find your ancestor on a page that is fully displayed on the site, that page may well still be available, even if the search function no longer works. However, if the search only shows you a piece of data from a database behind the scenes, the search will no longer work.

Having said that, WayBackMachine has been my salvation more than once.

By this time, you’ll either have what you were seeking, or many more questions.  For answers to those questions, refer to the WayBackMachine FAQ.

How Does This Affect Genetic Genealogy?

You may be asking yourself how this affects genetic genealogy and why I’m writing about it.

The genetic part of genetic genealogy is only half the equation.  Genetic plus genealogy.  Genealogy is the other half.

If you’ve been doing genealogy more than a few minutes, you’ll surely have needed to retrace your steps to find something you just know you found previously.  And if you’re like me, you’ll be very VERY regretful that you didn’t record more of some resource when you had the chance.  And of course, you’ll discover that too late.

With the recent outage of the Rootsweb archives, trees and homepages, we’re reminded once again how much we depend on resources that we think are permanent, but that really aren’t. Let’s hope that eventually, most of the Rootsweb functionality will be restored.  If not, it wouldn’t be the first time that a free resource we utilize has been discontinued for any variety of reasons.

As it turns out, Judy Russell and I were composing similar articles at the same time, and she specifically addresses finding Rootsweb archived pages utilizing the WayBackMachine, here.

Thank goodness for WayBackMachine.

At least it gives you a prayer.

9 thoughts on “Recovering the Past – WayBackMachine”

  1. MY QUESTION IS SIMPLE WHAT HAS HAPPENED TO THE TRIANGULATION TOOL THAT WAS AT FTDNA FOR A SHORT PERIOD??? I just got started using it went on vacation come back then its no longer accessible ?? Talk to us about what’s going on here!

  2. Do you have any suggestions for those of us who had posted sites on Freepages? I had close to 500 pages there. I am distraught.
