July 15, 2008

Find Evidence on Your Opponent's Web Site

One of the best places on the Internet to find information about a company -- such as a litigation adversary -- is the company's own Web site. But while a visitor researches a company, the company may be researching the visitor, revealing more than the researcher would like. In addition, the company may at any time change or remove information on its Web site that may be most valuable to the researcher. This article discusses the information that Web site owners can learn about visitors to their site, and shows ways to see older versions of Web pages that may have been changed or removed.

Web sites routinely collect certain information from visitors to maintain statistics and to enhance the visitor's experience on the Web site. Much of this information may be sent from the visitor's computer to the Web site without the visitor's knowledge, and may reveal more than the visitor expects. A Web site owner can learn many things about visitors through "cookies" and environment variables such as the IP address.

A "cookie" is a small piece of information written on a visitor's computer by a Web site. A cookie might contain the visitor's Web site user name and password, display preferences or even name and address. When a Web site offers to "remember" a visitor, it is offering to write cookies. Cookies stay on the visitor's computer after the visitor has left the Web site, closed the Web browser, disconnected from the Internet and even turned off the computer. If a visitor provides his name and e-mail address to a Web site, that information might be stored in a cookie, and would be available to the Web site on the next visit, which could be months later.
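As a rough illustration of what a cookie looks like in transit, the sketch below uses Python's standard http.cookies module to build the header a site might send when it offers to "remember" a visitor, and to read back what the browser would return on a later visit. The cookie names (username, font_size), the sample values and the 90-day lifetime are hypothetical; real sites choose their own.

```python
from http.cookies import SimpleCookie


def build_remember_me_header(username: str, days: int = 90) -> str:
    """Return a Set-Cookie header value asking the browser to keep the user name."""
    cookie = SimpleCookie()
    cookie["username"] = username
    # Max-Age is in seconds; the cookie survives browser restarts until it expires.
    cookie["username"]["max-age"] = days * 24 * 60 * 60
    cookie["username"]["path"] = "/"
    return cookie["username"].OutputString()


def read_cookies(cookie_header: str) -> dict:
    """Parse the Cookie header a browser sends back on a later visit."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    return {name: morsel.value for name, morsel in cookie.items()}


if __name__ == "__main__":
    # Hypothetical visitor who typed "jdoe" into a login form months earlier.
    print(build_remember_me_header("jdoe"))
    print(read_cookies("username=jdoe; font_size=large"))
```

Note that the site only gets back what it (or the visitor) put in: the parsed dictionary contains nothing beyond the values that were stored earlier.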

Cookies have received a great deal of attention in the media because privacy advocates are concerned about the way advertisers use cookies. However, cookies are probably not a significant concern for those performing covert research on an opposing party's Web site. As a general rule, cookies contain only information that the visitor has provided to the Web site or information that the Web site could have obtained without cookies.

If a visitor is concerned about information that might be stored in cookies, cookies can be erased. In the Internet Explorer Web browser, for example, the visitor can pick Tools menu, Internet Options, General, Delete Cookies. This can be done at any time -- before, during or after the visit to a Web site -- and will immediately delete all cookies. Unfortunately, this will also delete desirable cookies, such as Westlaw or Lexis logins. For those who wish to preserve desirable cookies while deleting undesirable cookies, there is privacy software that provides enhanced cookie management.

A greater concern for those performing covert research is environment variables, particularly the Internet Protocol address. The IP address is a unique identifying set of numbers used to direct communications through a network or the Internet. A Web site always has access to every visitor's IP address: Without that information, the Web site and visitor would not be able to communicate. However, the IP address may reveal more than the visitor realizes.

Most larger businesses, including large law firms, have "static" IP addresses, permanent IP addresses that specifically identify the company. For example, the static IP address 67.200.59.2 can easily be identified as the Young Conaway law firm. Most smaller businesses and residential connections to the Internet use "dynamic" IP addresses, temporary addresses that are assigned when the person connects to the Internet and may be different every time. The dynamic IP address 141.158.235.41 can be identified as a customer of the Verizon Internet service in the Philadelphia area, but cannot be connected to a specific individual or company.

The Web site Broadband Reports has a useful tool to show what can be learned from a person's IP address. When a person visits www.dslreports.com/whois, the page displays the visitor's current IP address. That IP address can then be entered in the WhoIs box to learn what is readily known about that IP address. Another site, www.IP-adress.com, displays the IP address of the current visitor with a map showing the locality associated with the IP address.

Web sites routinely store IP addresses for statistical purposes, but Web site owners do not ordinarily analyze the IP address of every visitor to a Web site, so there is little concern in casually browsing public areas of an opponent's Web site. However, Web site owners are likely to check the IP address when there is suspicious behavior. For example, they might check the IP address of a person who tries to view a confidential, blocked or hidden page. They might check the source of an e-mail requesting information about the company or its products. Users should be aware that the e-mail sender's identity cannot reliably be concealed by using Web e-mail services, such as Hotmail, Gmail or Yahoo Mail -- these services can embed the sender's IP address in the e-mail's headers. The only way to effectively conceal the sender's identity is to send the e-mail from some other location, such as a home computer, a public library or an Internet cafe.
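To show how a recipient might find an IP address embedded in an e-mail, here is a minimal sketch using Python's standard email module. The sample message is invented, the addresses are reserved documentation values, and the header names are assumptions: X-Originating-IP is one header some Web e-mail services have used, but services differ, and some do not include the sender's address at all.

```python
import re
from email import message_from_string

# Hypothetical message as the company's mail administrator might see it.
SAMPLE_MESSAGE = """\
Received: from [203.0.113.7] by mail.example.com via HTTP; Tue, 15 Jul 2008 09:30:00 -0400
X-Originating-IP: [203.0.113.7]
From: curious.customer@example.com
To: info@example.org
Subject: Product question

Do you still sell the older model?
"""

# Loose IPv4 pattern; good enough for a sketch, not a strict validator.
IPV4_PATTERN = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def sender_ip_candidates(raw_message: str) -> list:
    """Collect IPv4 addresses found in headers that may record the sender's origin."""
    msg = message_from_string(raw_message)
    headers = (msg.get_all("X-Originating-IP") or []) + (msg.get_all("Received") or [])
    found = []
    for value in headers:
        for ip in IPV4_PATTERN.findall(value):
            if ip not in found:
                found.append(ip)
    return found


if __name__ == "__main__":
    print(sender_ip_candidates(SAMPLE_MESSAGE))
```

Any address recovered this way could then be looked up with a WhoIs tool like the one described above.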

Web site owners may also track the IP addresses of messages posted on the Web site's message boards or chats conducted through online chat services, and are likely to check the address if the post or chat is suspicious in nature. For example, if a visitor posts a message on a customer support message board asking if any other customers have had a particular problem with the company's product, the site owner might be inclined to check the poster's IP address.

Environment variables can also reveal the last page that the visitor saw before coming to the current page, the page where the visitor clicked a link to come to the current page. Like IP addresses, this is not the sort of thing that a Web site owner normally checks in the absence of suspicious activity. However, if a page on one site links to a page on another site that is supposed to be confidential or hidden, the host of the latter site might look into the former site and into the visitors who clicked that link.

Other information found in environment variables is generally less of a concern for covert research. For example, environment variables reveal the visitor's browser (Internet Explorer, Firefox, Opera, etc.), which is not especially confidential. Hypothetically, environment variables could reveal a visitor's network login, but as a practical matter that information is rarely revealed.
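The environment variables discussed above can be pictured as a simple table of names and values that the Web server hands to the site's software for each request. The sketch below assumes the conventional CGI variable names (REMOTE_ADDR, HTTP_REFERER, HTTP_USER_AGENT, REMOTE_USER); the function name and the sample request are hypothetical, and a given server may expose more or less than this.

```python
def visitor_details(environ: dict) -> dict:
    """Pull out the per-request values a site owner could log about a visitor."""
    return {
        # Always present: the site cannot reply without it.
        "ip_address": environ.get("REMOTE_ADDR"),
        # The page whose link the visitor clicked, if the browser sent it.
        "referring_page": environ.get("HTTP_REFERER"),
        # Browser name and version string.
        "browser": environ.get("HTTP_USER_AGENT"),
        # Only set when the server itself authenticated the visitor; usually absent.
        "network_login": environ.get("REMOTE_USER"),
    }


if __name__ == "__main__":
    # Hypothetical request; 192.0.2.10 is a reserved documentation address.
    sample_request = {
        "REMOTE_ADDR": "192.0.2.10",
        "HTTP_REFERER": "http://www.example.com/links.html",
        "HTTP_USER_AGENT": "Mozilla/5.0 (Windows; U; Windows NT 5.1) Firefox/3.0",
    }
    for name, value in visitor_details(sample_request).items():
        print(name, "=", value)
```

In this sample the network login comes back empty, which matches the point that it is rarely revealed in practice.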

THE WAYBACK MACHINE

Browsing a party's Web site will only show the information that the Web site owner currently wants visitors to see. Sometimes, the most valuable information about an opposing party is the information that has been changed or removed. Fortunately, there are ways to see older versions of Web pages. Pages that were changed recently can be viewed through Google's cache feature. Pages that were changed months or years ago may be available through the Internet Archive, also known as the Wayback Machine. Viewing these older versions of Web pages avoids the privacy risks discussed above: The copied pages are not on the company's Web site, so the company has no record of the researcher's activities.

When Google indexes Web pages, it stores a copy, referred to as a "cached" page. Google provides a link labeled "Cached" that allows researchers to view this copy. This cached version may be a day, a week or a month old, depending on how recently Google indexed the page.

Google's cache is most useful when the page found in a search doesn't fit the search performed. The mismatch occurs because the page has changed since it was indexed. The cached version will show the page as it appeared when it was indexed, with the search terms highlighted. The cache can also be useful when seeking information that is known to have been recently removed. If a researcher recently saw useful information on a Web site but that information is no longer there, a Google search for the missing information could turn up a cached version of the page that would contain the desired information. Google discusses its cache feature in detail in the Google Guide at Cached Pages.

If older versions of Web pages are desired, they may be found in the Internet Archive, better known as the Wayback Machine, a reference to the "Peabody's Improbable History" segment on the classic "Rocky and Bullwinkle" cartoons. The Wayback Machine crawls the Internet and makes copies of Web pages, storing them as they existed at some time in the past. It currently stores more than 85 billion Web pages, comprising two petabytes of information, archived since 1996.

The Wayback Machine does not allow visitors to search the archive's content; it simply retrieves older versions of a page with a known Web address. The page may not look precisely the way it did at that time: Images, formatting or code may be missing from the page. However, the text of the page is as it was on the day it was archived. Links on the page will function, and will take the visitor to archived versions of the linked page, allowing visitors to browse through an older version of the site. This is very useful if the precise address of the desired old page is unknown. Users should be aware, however, that the linked page may not be from precisely the same date as the linking page. It is important to watch the URL (Web address), which indicates the date in a year-month-day format. For example, the Wayback Machine contains a version of the Young Conaway home page archived on Aug. 11, 2007, with this URL: http://web.archive.org/web/20070811170145/http://ycst.com/. The page links to an article about the firm's support for the South Asian Bar Association that was archived on June 29, 2007, with this URL: http://web.archive.org/web/20070629214521/ycst.com/newsart.htm?a=179.
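For researchers who need to record capture dates carefully, the date embedded in a Wayback Machine address can be read mechanically. The sketch below assumes the 14-digit year-month-day-hour-minute-second form seen in the two addresses above; the function name is hypothetical, and addresses in other forms are rejected rather than guessed at.

```python
import re
from datetime import datetime

# Matches addresses of the form http://web.archive.org/web/<14 digits>/<original address>
WAYBACK_PATTERN = re.compile(r"^https?://web\.archive\.org/web/(\d{14})/(.+)$")


def parse_wayback_url(url: str):
    """Return (capture time, original address) for a Wayback Machine URL."""
    match = WAYBACK_PATTERN.match(url)
    if match is None:
        raise ValueError("not a recognized Wayback Machine address: " + url)
    timestamp, original = match.groups()
    return datetime.strptime(timestamp, "%Y%m%d%H%M%S"), original


if __name__ == "__main__":
    home_time, home_page = parse_wayback_url(
        "http://web.archive.org/web/20070811170145/http://ycst.com/"
    )
    article_time, article_page = parse_wayback_url(
        "http://web.archive.org/web/20070629214521/ycst.com/newsart.htm?a=179"
    )
    print(home_page, "captured", home_time.date())
    print(article_page, "captured", article_time.date())
    # The linked page was archived weeks before the page that links to it.
    print("gap in days:", (home_time - article_time).days)
```

Comparing the two capture times this way makes the gap between a linking page and a linked page explicit before either is cited.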

The Wayback Machine can be used to find older versions of guidelines, policies or procedures of an organization that have since been changed. It may contain claims that the company made about its products, services or business prospects that it may now deny. It may show when a company possessed particular information. It may hold older versions of manuals or documentation that are no longer available.

USE AS EVIDENCE IN LITIGATION

The Wayback Machine has been used several times as evidence in trade secret and copyright infringement cases. See Syncsort Inc. v. Innovative Routines International Inc., No. 04-3623, 2008 U.S. Dist. Lexis 35364 (D.N.J. April 30, 2008) (to prove that information was not a trade secret because it was publicly available on the Internet at one time); Allen v. The Ghoulish Gallery, No. 06cv371, 2007 U.S. Dist. Lexis 86224 (S.D. Calif. Nov. 20, 2007) (to prove validity of copyright claim); Telewizja Polska USA Inc. v. Echostar Satellite Corp., No. 02 C 3293, 2004 U.S. Dist. Lexis 20845 (N.D. Ill. Oct. 14, 2004) (to demonstrate inaccurate claims made in opposing party's past advertising).

However, use of the Wayback Machine as evidence has been questioned as hearsay under Fed. R. Evid. 801 and as lacking authentication under Fed. R. Evid. 901. See, e.g., Novak v. Tucows Inc., No. 06-CV-1909, 2007 U.S. Dist. Lexis 21269 (E.D.N.Y. March 26, 2007); Chamilia LLC v. Pandora Jewelry LLC, No. 04-CV-6017, 2007 U.S. Dist. Lexis 71246 (S.D.N.Y. Sept. 24, 2007); and St. Luke's Cataract & Laser Inst. P.A. v. Sanderson, No. 8:06-CV-223, 2006 U.S. Dist. Lexis 28873 (M.D. Fla. May 12, 2006), though one court has permitted its use over such objections. See Telewizja Polska USA, 2004 U.S. Dist. Lexis 20845, at *6 (finding an affidavit to be sufficient authentication, and the information not hearsay as an admission by a party-opponent). Nevertheless, the Wayback Machine remains a valuable research tool, even if its contents cannot be used for evidence.

An opposing party's Web site, both past and present content, can be a valuable source of information. But researchers must remember that if they are looking at their opponent's current Web site, rather than an older copy, the Web site owner may be aware of who they are and what they are doing.

Tracey R. Rich is the library and information services administrator for Young Conaway Stargatt & Taylor in Wilmington, Del., and is the co-author of Bisel's Pennsylvania Damages.
