Net tools to help you 'record history'

By Jay Dougherty

Washington - With the war on Iraq under way, there's no doubt that we're living in a historic time. In decades past, people often recorded history simply by clipping newspaper headlines and stories. Thanks to the Internet, though, people can now follow the events of the war as they unfold. But how do you save records of these events? Tools of the digital age can help.

- Save the screen

Headlines in newspapers of old could easily be clipped out and stored in a shoebox. But how can you save the historic "Iraq under Attack" headlines? The easiest way is to capture the screen, paste the image into a document, and then save that document. Open your browser, and navigate to the website you'd like to save.

When the webpage is displayed as you wish to save it, hold down the Alt key and tap the PrtScn button on your keyboard. Doing so will send an image of your Web browser window, exactly as it currently appears, to the Windows Clipboard, an invisible "holding place" for data. Now switch to your word processor, open the Edit menu, and click Paste. A copy of the image will appear in your word processor. Now save the file normally to a location of your choice.

- Save the webpage

Saving just a screen, of course, has disadvantages. First, you'll be saving it in a "read only" format. Second, in saving just what appears on your screen, you'll likely not be saving all of the information that you want, since most articles require you to scroll or click to read them in their entirety.

So saving a webpage to your hard drive may be what you want. Bookmarking is not satisfactory when the site you are bookmarking is updated frequently; in that case, saving a local copy is essential.

To save the contents of a single Web page, excluding the pages it links to, you can use your Web browser. With Internet Explorer, for example, open the File menu, and click Save As. When you do, the Save As dialog box will prompt you to supply a location on your hard drive where the Web page should be saved. In addition, you will have the opportunity to provide a unique name for the Web page in the File Name text box.

Think carefully about where you wish to save the webpage, knowing that there may be many more in the coming days that you wish to save. You may want to create a special directory structure for such archived events, and save the page to the appropriate place within that structure.
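One way to keep such an archive tidy is to file pages by topic and date. The short Python sketch below shows the idea; the function name archive_dir, the topic label "iraq-war" and the example date are illustrative assumptions, not part of any tool mentioned in this article.

```python
import tempfile
from datetime import date
from pathlib import Path


def archive_dir(base: Path, topic: str, day: date) -> Path:
    """Create and return base/topic/YYYY/MM/DD, a folder for that day's saved pages."""
    target = base / topic / f"{day.year:04d}" / f"{day.month:02d}" / f"{day.day:02d}"
    # parents=True builds the whole chain; exist_ok=True makes repeat calls harmless.
    target.mkdir(parents=True, exist_ok=True)
    return target


if __name__ == "__main__":
    # A temporary directory stands in for your real archive location.
    with tempfile.TemporaryDirectory() as tmp:
        folder = archive_dir(Path(tmp), "iraq-war", date(2003, 3, 20))
        print(folder.relative_to(tmp))
```

In the browser's Save As dialog box you would then navigate to the folder for that day before saving the page.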

Once you click the Save button in the Save As dialog box, the webpage will be saved with the name you supplied. The page will have the extension ".htm," which will allow you to reopen it by navigating to its location on your hard drive and double-clicking the file. The page will open in your default Web browser.

You should note that the graphics that are displayed on the page will typically not be saved in the same directory as the webpage file itself. Internet Explorer and other Web browsers will save graphics in a subdirectory directly below the directory that holds the webpage file. So be sure that you do not delete any of these subdirectories. If you do, the graphics will be missing when you open the page.
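If you are unsure whether a saved page still has its graphics, a small script can check. The Python sketch below reads a saved page, collects the image references in it, and lists those that point to local files which no longer exist. It assumes the page refers to its graphics with relative paths, as described above; the names ImageSources and missing_graphics are made up for this illustration.

```python
from html.parser import HTMLParser
from pathlib import Path
from urllib.parse import unquote, urlparse


class ImageSources(HTMLParser):
    """Collect the src attribute of every <img> tag in a page."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            for name, value in attrs:
                if name == "src" and value:
                    self.sources.append(value)


def missing_graphics(page: Path) -> list[str]:
    """Return the local image references in a saved page whose files are gone."""
    parser = ImageSources()
    parser.feed(page.read_text(encoding="utf-8", errors="replace"))
    missing = []
    for src in parser.sources:
        parts = urlparse(src)
        if parts.scheme or parts.netloc:
            continue  # still points at the live website, not at a local file
        local = page.parent / unquote(parts.path)
        if not local.exists():
            missing.append(src)
    return missing
```

An empty list means every locally stored graphic the page refers to is still in place.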

- Save a site

What happens when the page you wish to save contains links to other pages that you would also like to save? Now things get more complicated, and you'll have to turn to specialised tools other than Web browsers.

These tools escape easy categorisation. Sometimes referred to as Web crawlers, sometimes as Web page savers, they all perform roughly the same tasks: they start with a Web address (or URL) that you supply, analyse the page for links, and then allow you to determine how deeply you wish to pursue the links. Finally, they save all the necessary files to your hard drive so that you can read the Web pages when you are offline.
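The general approach can be sketched in a few lines of Python. This is a simplified illustration, not the method used by any of the products named in this article: it keeps pages in memory rather than writing them to disk, and the fetch argument is a hypothetical stand-in for whatever actually downloads a page. The names LinkCollector and crawl are likewise invented for the example.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class LinkCollector(HTMLParser):
    """Collect the href attribute of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, fetch, max_depth):
    """Return {url: html} for every page within max_depth links of start_url.

    fetch is any function that takes a URL and returns that page's HTML.
    """
    saved = {}
    queue = deque([(start_url, 0)])
    while queue:
        url, depth = queue.popleft()
        if url in saved:
            continue  # already collected, do not download it twice
        saved[url] = fetch(url)
        if depth == max_depth:
            continue  # deep enough: keep the page but do not follow its links
        collector = LinkCollector()
        collector.feed(saved[url])
        for href in collector.links:
            # Resolve relative links and drop any "#section" suffix.
            absolute, _fragment = urldefrag(urljoin(url, href))
            queue.append((absolute, depth + 1))
    return saved
```

A depth of 0 saves only the starting page, a depth of 1 adds the pages it links to, and so on. A real tool would also rewrite the links in each saved page so that they point to the local copies.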

The grandfather of these tools is WebWhacker (http://www.bluesquirrel.com/products/whacker/index.html), which has been around since Web browsers were babies and is now in version 5. You can download and try out a free demonstration version of the product, or you can purchase a full version for around 49 dollars.

With WebWhacker, you can easily create archives of current websites, create CDs with the sites on them, and send copies to friends. Of course, you have to pay attention to any copyright issues when using any information stored online.

SpiderSoft (http://www.spidersoft.com) is another commercial tool that gives WebWhacker a run for its money.

GetBot2 (http://www.getbot.com) is a free tool, not as user-friendly as the commercial options, that will perform the same tasks, downloading all text, graphics, and linked pages of a Web site. You won't be able to work directly from your Web browser, but if you know the Web address of the site you'd like to download, GetBot does the job.

A couple of warnings are in order. First, allowing a Web crawling tool that saves to disk to download many levels deep will end up taking lots of time and consuming lots of your hard drive space.

Think about it for a second: the average commercial webpage contains several dozen links to pages which themselves contain several dozen links to other pages which also contain many links. You could keep going until you end up saving the entire Internet. So it's important to specify that you want to go only a few levels down.
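A rough calculation shows how quickly the numbers grow. The Python sketch below assumes, purely for illustration, that every page links to 30 pages and that no two links lead to the same page, so the results are an upper bound rather than a prediction. The function name pages_within is invented for this example.

```python
def pages_within(links_per_page: int, depth: int) -> int:
    """Upper bound on pages saved: the start page plus every level down to depth."""
    return sum(links_per_page ** level for level in range(depth + 1))


if __name__ == "__main__":
    for depth in range(5):
        print(depth, pages_within(30, depth))
    # 0 1
    # 1 31
    # 2 931
    # 3 27931
    # 4 837931
```

Each extra level multiplies the total by roughly the number of links per page, which is why limiting the depth to two or three levels matters.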

Second, keep in mind that saving lots of Web pages with their associated graphics can eat up your hard drive space pretty quickly, so only save what you'll want later.

Finally, note that many articles on news sites today feature a "print this article" link or button. You can use this to get a one-page version of the article which is easier to save to disk or print out. - Sapa-dpa