Kimberly Powell

Cache 22 - Has Ancestry.com Gone too Far?

August 28, 2007

Update: On Wednesday, August 29, Ancestry.com removed the new Internet Biographical Collection from its site. According to Ancestry.com officials, the idea is being reevaluated and the database will not return anytime soon - at least not until many changes have been made.

The new Internet Biographical Collection at Ancestry.com has stirred up a storm of controversy this week. This database, available to subscribers only, is basically a collection of cached content: copies of Web pages containing biographical information, taken from a variety of sites across the Internet.

I personally have no problems with Ancestry.com offering a search feature that indexes these Web sites. It offers a useful search service, although I would feel better about it if they didn't include this among their subscription offerings as the content isn't their own, and search engines traditionally do not charge for their service.

The biggest problem I see here, however, is that these pages are being cached. A cache is basically a copy of a Web site, taken at a particular point in time.

Google has been taken to court over its practice of caching Web sites, but the ruling was in Google's favor. Judge Robert C. Jones of the Nevada District Court said, "When a user requests a web page contained in the Google cache by clicking on a 'Cached' link, it is the user, not Google, who creates and downloads a copy of the cached web page. Google is passive in this process. Without the user’s request, the copy would not be created and sent to the user, and the alleged infringement at issue in this case would not occur."

The Ancestry.com database takes things even further, however, serving up the cached pages as the first option and offering only a small link to the "live Web site." On the actual record page for each search result, the cached link is identified as "cached," but it is still the only option open if you want to view the content; there is no link to the live Web page until after you view the cached page. And from the search results, where you are given only the option to "view Web page," you are taken directly to the cached page, with no notice that the page is indeed cached. This is where I feel that this database has stepped over the line, possibly into copyright infringement. Ancestry.com is serving up copies of copyrighted work and, to make matters worse, selling this as one of their subscription databases. Because visitors are shown the cached copies rather than the original pages, Ancestry.com is also depriving the Web site and/or content owner of traffic and potential income.

How do you feel about this issue? Is the new database a useful service, a violation of copyright, or an unethical step in the wrong direction for Ancestry.com? Click on "comments" below and share your thoughts.

Comments
August 28, 2007 at 9:47 am
(1) Randolph Clark says:

Can you link us to an example of this?

August 28, 2007 at 12:39 pm
(2) AncestrySubscriber says:

The idea behind it is good but (a) it should be free, since Ancestry is NOT building the web sites, and (b) it should not be called LIVE since it isn’t. Users should be told how to get to the live site, for example by clicking the site link to see the latest site info. But this should NOT be a subscriber offering. If they want to hook people in, then provide extra benefit to the subscriber, maybe a link to something in the Subscriber database. I have a genealogy site and I figure anything I put there is going to get used. I hope they give me credit, but since I put it where anyone can get to it, it’s not something I can demand.

Of course, lately Ancestry has posted titles such as Directory of Scottish Settlers in North America, 1625-1825. Vol. III. But don’t bother to try looking if you have Scottish ancestors who came to North America during that time. This is limited to the World Deluxe Membership account only. Why? Because the “places” designation has only Scotland in it, 3 times! How wrong is that!

August 28, 2007 at 1:23 pm
(3) ~Kimberly says:

Only subscribers can access this database so I can’t provide an example link for you to see unless you are a subscriber. If you do subscribe to Ancestry.com, then click on Internet Biographical Collection above.

Just wanted to update my post as well. Since I blogged earlier this morning (it’s now 5 hours later), Ancestry.com has already rectified the things I felt were wrong:

1) The record page now includes a link to the cached page as well as to the live page. I’d prefer that the live page be the first link, but this is definitely an improvement.

2) The “view web page” link from the search results which previously took you to a cached page without warning has been removed.

Just so everyone knows, I don’t personally have any Web sites or pages involved in this that I know of. My blog is just commentary on something I felt crossed the line a bit.

August 28, 2007 at 1:49 pm
(4) Janice Brown says:

Please see http://cowhampshire.blogharbor.com/blog/_archives/2007/8/28/3190057.html
or email me at janicebr@earthlink.net for additional screenshots.

Janice Brown

August 28, 2007 at 3:44 pm
(5) Janice Brown says:

As someone with extensive FREE genealogy web site experience, plus someone whose blog (Cow Hampshire) content has been “stolen” by Ancestry.com via their online “Biographical Database,” I can succinctly state that they have violated the copyright to my blog articles at Cow Hampshire. There is no excuse for this. I have contacted their copyright attorney and to date have not heard back. How would you feel if your book or web site was being “sold” to their customers?

Janice Brown
Blog: Cow Hampshire

August 28, 2007 at 4:09 pm
(6) Lynn says:

Well, I have mixed emotions about this. I have frequently found information in online searches that was gone when I went to look for it. My undocumented sense is that it usually is because the person who put it up let it wither through inattention – bit rot in other words.

I do believe that ancestry should honor robots.txt restrictions and takedown requests from content owners.
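For readers unfamiliar with robots.txt, here is a rough sketch of what "honoring" it could look like from a crawler's side, using Python's standard `urllib.robotparser` module. The robots.txt contents, the bot name "ExampleArchiveBot", and the example.com URLs are all made up for illustration; this is not a description of how Ancestry.com's crawler actually works or what it calls itself.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt a site owner might publish. "ExampleArchiveBot" is a
# made-up user-agent name, not the real name of any Ancestry.com crawler.
ROBOTS_TXT = """\
User-agent: ExampleArchiveBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def may_fetch(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # The named bot is shut out of the whole site.
    print(may_fetch("ExampleArchiveBot", "http://example.com/bios/smith.html"))
    # Other bots may fetch public pages but not anything under /private/.
    print(may_fetch("SomeOtherBot", "http://example.com/bios/smith.html"))
    print(may_fetch("SomeOtherBot", "http://example.com/private/notes.html"))
```

A well-behaved crawler would run a check like this before copying a page and skip anything that comes back False. Note that robots.txt is only a request: it works when the crawler chooses to respect it.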

August 28, 2007 at 4:17 pm
(7) Elma says:

It looks like Ancestry has made these pages free now.

August 28, 2007 at 4:23 pm
(8) ~Kimberly says:

Janice,

I’ve been there and I know how it feels. I’m right there with you on this! It looks like Ancestry is listening too, because they have already moved the database into their Free Records section, as well as added links to the live Web page right under the one to the cached Web page. Am I saying they were in the right with this? No. But they are at least listening.

August 28, 2007 at 6:07 pm
(9) Susan Kitchens says:

Some tech questions: What is the form that their spider or bot takes? What is the block of IP addresses that belong to Ancestry.com? I can alter my site to refuse to even allow access from their servers. If your site is hosted by blogspot, typepad, wordpress.com or other free weblog hosting service, blocking IP addresses probably isn’t an option.

August 28, 2007 at 6:20 pm
(10) Sara Binkley Tarpley says:

Well, facts cannot be copyrighted; but original content and Web pages themselves can be. I have some biographical essays on my site that I do not even allow cousins to put on their sites.

Today’s format on Ancestry is a lot better than yesterday’s, but I still feel that it is unethical. If Ancestry wants to provide a search engine that produces links, they should create one, rather than describing search results as being part of a collection and suggesting that the results are owned by or otherwise affiliated with Ancestry.com.

August 28, 2007 at 7:23 pm
(11) Hal Whitmore says:

If Ancestry has moved this to their free pages, I think this will solve most of the legit complaints. Clearly, however, this was a very serious PR mistake that has cost them a lot of good will. And, unless they have fixed this too, I think it is downright tacky to suggest a citation that includes Ancestry. I’ve been pretty neutral about Ancestry/TMG, but this whole mess rates a black mark in my opinion.

August 28, 2007 at 8:04 pm
(12) Janice Brown says:

I’m afraid that Ancestry.com has not satisfied me. They still include my blog in their “database” without giving my blog URL a proper citation (sorry, a “live link” to my blog alone doesn’t cut it), and as far as the free part goes, you must be fooling yourselves if you think it is free right now.

You must register in order to get the “free access,” and although one friend says he gave a fake email address to log in, how many will really think of that? So now Ancestry.com has your email address (if it’s your real one). Do you realize how valuable that is to them from a marketing perspective?

I won’t be satisfied until they 1) remove the database from the paid section of Ancestry, 2) include each blog or web site’s REAL URL and TITLE/NAME as part of the database search results, 3) make the database REALLY FREE (no registration, log-in or subscription needed).

Janice at Blog: Cow Hampshire

August 28, 2007 at 8:21 pm
(13) Becky Wiseman says:

I’m with Janice on this one. The URL to our websites and/or blogs should be included. The way it is now it still appears that this is Ancestry content – and it isn’t, not by a longshot!!

August 28, 2007 at 8:50 pm
(14) Susan Kitchens says:

And now I’ve jumped into the fray. Complete with parody makeover of portions of ancestry.com’s home page.

August 28, 2007 at 8:53 pm
(15) Miriam says:

I’ve blogged about this today, too…with the perspective that there may be images on the blogs or websites’ home pages that have terms of use that require that they are not copied or resized for any reason without the creator’s permission. Thumbnails of websites that appear on Ancestry may be in violation of this, even if Ancestry does not profit.

August 28, 2007 at 10:15 pm
(16) Wal Rutherford says:

Surely when a person puts their family history online they do so to help people find ancestors or links to ancestors, not to help money-grubbing companies make a living from it. Does this mean, as it will for me, that genealogists will not publish anymore? If so, then Ancestry should be brought into line and show the help and generosity shown by the average researcher.
Wal

August 28, 2007 at 10:15 pm
(17) Susan Kitchens says:

Found tech info that’s necessary to make the bot go away. More research is necessary. But the updated info is at the bottom of my post.

August 29, 2007 at 4:04 am
(18) MAD says:

Ancestry, in its zeal to monopolize the genealogy market, stole every site it came across, it appears, as many or more of the sites are non-genealogy and many have XXX content, or content not suitable for families.

Look at this site.

http://search.ancestry.com/cgi-bin/sse.dll?indiv=1&rank=1&gsfn=&gsln=&_82000000=&rg_81000001__date=&rs_81000001__date=0&gskw=boob&prox=1&db=webbiographies&ti=0&ti.si=0&gss=angs-d&fh=41&recid=173722&recoff=414+415

August 29, 2007 at 4:42 am
(19) MAD says:

The pages were not just cached; Ancestry changed the source code and added their own script. My personal sites look like they belong to Ancestry, and my pages can be saved to their shoe box.