Caching Proxy Servers

An additional feature offered by many proxy server applications is caching; a proxy server that provides it is known as a caching proxy server. Caching enables the proxy server to store the pages that it retrieves as files on disk. Consequently, if the same pages are requested again, they can be provided more quickly from the cache than if the proxy server had to go back to the Web server from which the pages were originally retrieved. This approach has two benefits:

  • Significantly improves performance. The improvement is greatest in environments such as a school, where several users are likely to retrieve the same page.

  • Reduces demands on Internet connections. Because a caching proxy server sends fewer requests to the Internet, there is less demand on the Internet connection. In some cases, this results in a general speed improvement. In extreme cases, it might even be possible to adopt a less expensive Internet connectivity method because of the lower level of demand.
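
Both benefits can be illustrated with a short Python sketch. It is a minimal illustration under stated assumptions, not a description of how any particular proxy product works: the CachingProxy class, the fake_origin function, and the URL are hypothetical names, and the cache is an in-memory dictionary rather than files on disk.

```python
from typing import Callable, Dict


class CachingProxy:
    """Toy caching proxy: keeps fetched pages in a dict keyed by URL."""

    def __init__(self, fetch_from_origin: Callable[[str], str]) -> None:
        self._fetch = fetch_from_origin      # hypothetical origin fetcher
        self._cache: Dict[str, str] = {}     # stands in for the on-disk cache
        self.origin_requests = 0             # requests that left the network

    def get(self, url: str) -> str:
        if url in self._cache:
            # Cache hit: answer locally, no request to the Internet.
            return self._cache[url]
        # Cache miss: go to the origin server once, then remember the page.
        self.origin_requests += 1
        page = self._fetch(url)
        self._cache[url] = page
        return page


def fake_origin(url: str) -> str:
    """Stand-in for a real Web server (assumption for this example)."""
    return f"<html>content of {url}</html>"


proxy = CachingProxy(fake_origin)
# Thirty students in a classroom request the same page.
for _ in range(30):
    proxy.get("http://example.com/lesson1.html")

print(proxy.origin_requests)  # prints 1
```

Thirty client requests produce a single request over the Internet connection; the other twenty-nine are answered from the cache.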

As with any technology, caching proxy servers raise issues that need to be considered. Sometimes a sizable amount of hard disk space is required to store the cached pages. Because the cost of hard disk space has declined significantly in recent years, this is unlikely to be much of a problem, but it still needs to be planned for.

Another factor is that pages held in the cache can become stale. As a result, a user might retrieve a page and believe that it is the latest version when, in fact, the page has since changed on the Web server but the copy in the proxy server cache has not been updated. To prevent this problem, caching proxy servers can implement measures such as aging of cached information so that it is removed from the cache after a certain amount of time. Some proxy applications can also check that the page stored in the cache is the same as the page currently available on the Internet. If the page in the cache is the same as the one on the Internet, it is served to the client from the cache. If the page is not the same, the newer page is retrieved, cached, and supplied to the client.
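
The two measures described above, aging and checking the cached copy against the current page, can be sketched in Python as follows. This is one possible arrangement, not the behavior of a specific proxy application: here an entry younger than max_age is served without any check, and an older entry is compared with the origin before it is reused. The FreshnessCache class, the version tag, and the simulated origin and clock are all assumptions made for the example.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass
class Entry:
    body: str
    version: str       # hypothetical version tag supplied by the origin
    checked_at: float  # when the entry was cached or last confirmed current


class FreshnessCache:
    def __init__(
        self,
        fetch: Callable[[str], Tuple[str, str]],  # returns (body, version)
        current_version: Callable[[str], str],    # cheap "has it changed?" check
        max_age: float,                           # seconds before an entry ages out
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._current_version = current_version
        self._max_age = max_age
        self._clock = clock
        self._entries: Dict[str, Entry] = {}
        self.full_fetches = 0

    def get(self, url: str) -> str:
        now = self._clock()
        entry = self._entries.get(url)
        if entry is not None:
            if now - entry.checked_at < self._max_age:
                return entry.body              # young enough: serve as-is
            if self._current_version(url) == entry.version:
                entry.checked_at = now         # unchanged: restart the age timer
                return entry.body
        # Not cached, or aged out and changed: retrieve, cache, and supply.
        body, version = self._fetch(url)
        self.full_fetches += 1
        self._entries[url] = Entry(body, version, now)
        return body


# Simulated origin server and clock, so the example runs without a network.
url = "http://example.com/news"
origin = {url: ("old headline", "v1")}
now = [0.0]
cache = FreshnessCache(
    fetch=lambda u: origin[u],
    current_version=lambda u: origin[u][1],
    max_age=60.0,
    clock=lambda: now[0],
)

first = cache.get(url)                # miss: fetched from the origin
origin[url] = ("new headline", "v2")  # the page changes on the Web server
second = cache.get(url)               # within max_age: stale copy is served
now[0] = 120.0
third = cache.get(url)                # aged out and changed: fetched again
```

The second request shows the stale-page problem: until the entry ages out, the client still receives the old headline. A shorter max_age narrows that window at the cost of more traffic to the origin. HTTP itself supports this kind of check through conditional requests, in which the proxy asks the server whether its copy is still current.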