Controlling the HTTP Expires Header

By Austin Smith on January 7th, 2009 at 1:33am
Posted in Caching, Drupal and Performance

So I just submitted my first core patch, followed quickly by several revisions. It felt as good as I was told it would. Still tingling.

Anyway, this is an issue that's got to be resolved--it's been stuck for months on one particular problem, which I don't think should be a problem at all. Developers of large-scale sites badly need to be able to adjust the expiration time that Drupal sends to the client, and right now they can't. My goal in this patch is to give developers this ability and intentionally not address the issue--which, again, has delayed this patch for months--of how reverse proxies are going to deal with it. That's not Drupal's job--it's the job of whoever is connecting Drupal with the reverse proxy, and any attempt to solve this at the Drupal core level will require not using PHP sessions for users who aren't authenticated. Turn the page for the proof.

Consider this:

  1. Drupal can't discard sessions for anonymous users. How would shopping carts work?
  2. If we have the session cookie, we can't use Vary: Cookie in the response header (established in the issue thread).
  3. If we can't use Vary: Cookie, we can't control how proxies will deal with the situation where users log in after browsing the site and accumulating pages in their cache.
  4. But individual developers controlling reverse proxies can configure an individual proxy to *not* forward on caching policies to individual browsers or forward proxies, or they can configure their site to redirect authenticated users to a different domain (e.g. www2.mysite.com). Either method will work, and neither can be achieved by any core patch, no matter how transcendentally amazing.
  5. But that requires sending proper caching headers from Drupal.
  6. But that requires a core patch.

The only difference between my patch and kbahey's original patch is that mine implements a hook which allows modules to control caching on a per-URL level, while his implements this ability via a system setting on a site-wide level. I think it's better and more powerful to do it per-URL. For instance, I want to proxy blocks (e.g., top stories) with 300-second caching using AHAH--that's a contrib module I'm building right now for the new Observer.com, and is why I patched drupal_page_header as well as drupal_page_cache_header. I also want to serve recent articles with 600-second caching, and month-old articles with 86400-second caching. I also need to implement the ability to purge content from the CDN on hook_nodeapi, so I'm writing a module anyway--so will most users of CDNs. One caching setting won't work for a site like Observer.com.
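
To make the per-URL idea concrete, here is a rough sketch of the kind of decision a module might make. It is written in Python rather than PHP and is not the patch or any Drupal API: the function names, the /ahah/block/ path prefix, and the 30-day cutoff for "month-old" are all assumptions made for illustration. Only the 300, 600, and 86400 second lifetimes come from the examples above.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime

# Lifetimes in seconds, taken from the examples in the post.
BLOCK_TTL = 300
RECENT_ARTICLE_TTL = 600
OLD_ARTICLE_TTL = 86400


def choose_ttl(path, published, now):
    """Pick a cache lifetime for one URL; 0 means "do not cache".

    The path prefixes are hypothetical, and treating 30 days as
    "month-old" is an assumption.
    """
    if path.startswith("/store"):
        return 0  # session-dependent pages such as a shopping cart
    if path.startswith("/ahah/block/"):
        return BLOCK_TTL
    if published is not None:
        age = now - published
        if age >= timedelta(days=30):
            return OLD_ARTICLE_TTL
        return RECENT_ARTICLE_TTL
    return 0


def cache_headers(ttl, now):
    """Build Expires and Cache-Control values for a lifetime in seconds."""
    if ttl <= 0:
        # An Expires value equal to the response time is already stale.
        return {
            "Expires": format_datetime(now, usegmt=True),
            "Cache-Control": "no-cache, must-revalidate",
        }
    return {
        "Expires": format_datetime(now + timedelta(seconds=ttl), usegmt=True),
        "Cache-Control": "max-age=%d" % ttl,
    }


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    ttl = choose_ttl("/ahah/block/top-stories", None, now)
    for name, value in cache_headers(ttl, now).items():
        print("%s: %s" % (name, value))
```

The point of splitting the lifetime decision from the header formatting is that the first half is site policy, which is exactly what a site-wide setting can't express, while the second half is the same for everybody.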

But that isn't actually the cross I'm bearing. I care more about getting this issue unstuck: it shouldn't be Drupal's problem to assume anything about the existence of proxies between the server and the client, because Drupal isn't in a position to solve the problem. But that's not a good enough reason to continue this delay.

Thoughts?

Discard sessions for anonymous users

> 1. Drupal can't discard sessions for anonymous users.
This is being worked on in http://drupal.org/node/201122. It is on Dries' wish-list for D7, so I assume we can expect some progress in the near future.

but can we count on that?

Even if this does work out, the presence or absence of the session cookie shouldn't be the deciding factor in whether to send an Expires header set in the future--I can absolutely imagine wanting both on one site; again, a store at /store and content at / would necessitate this.

module already written

I wrote a module called httpHeaders to control the expires and modified headers on a per-content-type basis.
http://drupal.org/project/httpheaders

Looks cool, but...

Per-content-type isn't granular enough, and I believe that there should be a way in core to control the expiration time sent to the browser. Also, it seems to have only a Drupal 5 version.

A good rule of thumb

"Make it possible, then get out of the way" has always been a good rule of thumb for how Drupal has approached technical limitations. In many situations, the crazy number of potentially correct solutions for a given technical problem makes covering all the bases in core a doomed task: we're almost certain to get some of the less common cases wrong.

Traditionally this is what we expose hooks for, or include other similar plugin points. It also leaves us more flexible in the future, when innovative new approaches become available. The swappable caching infrastructure is one of the best examples.

So, yeah, big thumbs up to the "Make it possible first, then decide if we want to do something cool by default" approach.

