When I first started using the Internet in 1986, my friends and I were obsessed with anonymous FTP servers. What a wonderful concept! We could download all sorts of interesting files, such as FAQs, source code, GIF images, and PC shareware. Of course, downloading could be slow, especially from the busy sites like the famous WSMR-SIMTEL20.ARMY.MIL archive.
In order to download files to my PC, I would first ftp them to my Unix account and then use Zmodem to transfer them to my PC through my 1200 bps modem. Usually, I deleted a file after downloading it, but there were certain files—like HOSTS.TXT and the "Anonymous FTP List"—that I kept on the Unix system. After a while, I had some scripts to automatically locate and retrieve a list of files for later download. Since our accounts had disk quotas, I had to carefully remove old, unused files and keep the useful ones. Also, I knew that if I had to delete a useful file, Mark, Mark, Ed, Jay, or Wim probably had a copy in their account.
Although I didn't realize it at the time, I was caching the FTP files. My Unix account provided temporary storage for the files I was downloading. Frequently referenced files were kept as long as possible, subject to disk space limitations. Before retrieving a file from an FTP server, I often checked my friends' "caches" to see if they already had what I was looking for.
Nowadays, the World Wide Web is where it's at, and caching is here too. Caching makes the Web feel faster, especially for popular pages. Requests for cached information come back much faster than requests sent to the content provider. Furthermore, caching reduces network bandwidth consumption, which translates directly into cost savings for many organizations.
In many ways, web caching is similar to what we did in the Good Ol' Days. The basic ideas are the same: retrieve and store files for the user. When the cache becomes full, some files must be deleted. Web caches can cooperate and talk to each other when looking for a particular file before retrieving it from the source.
Of course, web caching is significantly more sophisticated and complicated than what I did in my early Internet years. Caches are tightly integrated into the web architecture, often without the user's knowledge. The Hypertext Transfer Protocol was designed with caching in mind, which gives users and content providers more control (perhaps too much) over the treatment of cached data.
In this book, you'll learn how caches work, how clients and servers can take advantage of caching, what issues are important, how to design a caching service for your organization, and more.