real ultimate waffe (.net)

Hitting WordPress Attachment Handling


While debugging some stuff here, I dived (once more) into the image-handling guts of WordPress.

A while ago, I revamped WordPress’ inline uploading functionality to be more pluggable, cacheable, maintainable and (hopefully) intuitive. A hundred bugs later, it seems to be working pretty well.

This time around, the issue was not how WP displayed the data, but how it stores it.

In trying to make things leaner and meaner, we reworked some of the behind-the-scenes directory structures on our servers. This should have been totally transparent to the user, but, of course, it wasn’t. It broke our image handling.

Each file uploaded to WordPress is stored as a special type of post called an “attachment”. In the post and postmeta tables, WordPress stores things like file URL, file path on the server, image dimensions, image thumbnail info, etc. So to figure out anything we want about an uploaded file, all we have to do is essentially get_post() and a few get_post_meta()s. Sounds great.

It’s not.

Two reasons:

  1. To answer a question like “Where is this file located on the server?”, we have to call get_post_meta() every time. Because that function is so generic, plugins cannot filter the data on the fly, even for this really simple question.
  2. WordPress stores the data in a really inconvenient way; everything is absolute: paths, URLs, you name it.

Because of the absolute paths, the lack of an API, and the aforementioned directory restructuring, we had tens of thousands (a blind guess) of incorrect file locations hardcoded into our DB and no way to filter them. Doing DB updates across hundreds of thousands of tables (not a blind guess) replicated on dozens of servers was not an option.

Partial solution (now implemented in WordPress core): write a basic API for getting and putting the data. wp_get_attachment_metadata(), wp_update_attachment_metadata(), get_attached_file() and update_attached_file() are all nicely filterable. (The lack of parallelism between the function names, and the ambiguity in those names, are historical artifacts.)
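As a rough sketch of what that filterability buys you, a plugin could repair stale paths at read time instead of touching the DB. Everything named here is an assumption for illustration: the two upload roots and the example_fix_attached_file() callback are made up, and the sketch assumes the get_attached_file filter hands the callback the stored path and the attachment ID.

```php
<?php
// Hypothetical old and new upload roots; the real directories are not given in this post.
const OLD_UPLOAD_ROOT = '/old/root/wp-content/uploads';
const NEW_UPLOAD_ROOT = '/new/root/wp-content/uploads';

/**
 * Rewrite a stored absolute path that still points at the old root.
 * Anything that does not start with the old root is returned untouched.
 */
function example_fix_attached_file($file, $attachment_id = 0)
{
    if (is_string($file) && strpos($file, OLD_UPLOAD_ROOT . '/') === 0) {
        // Keep everything after the old root and graft it onto the new one.
        return NEW_UPLOAD_ROOT . substr($file, strlen(OLD_UPLOAD_ROOT));
    }
    return $file;
}

// Register the callback only when running inside WordPress.
if (function_exists('add_filter')) {
    add_filter('get_attached_file', 'example_fix_attached_file', 10, 2);
}
```

With something like this in place, callers that go through get_attached_file() would see the corrected location while the stored value stays as it is.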

Improvements that still could be made: Don’t store absolute data. This is better for portability too. (Exception: guid – but don’t use it for the URL.)
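To make the "don't store absolute data" idea concrete, here is a minimal sketch of storing paths relative to an upload base and resolving them only when they are read. This is not how WordPress does it; the helper names and the example directories are hypothetical.

```php
<?php
/**
 * Strip the upload base from a path before storing it.
 * Paths outside the base are returned unchanged.
 */
function example_path_to_relative($path, $base_dir)
{
    $base = rtrim($base_dir, '/') . '/';
    if (strpos($path, $base) === 0) {
        return substr($path, strlen($base));
    }
    return $path;
}

/**
 * Resolve a stored value against whatever the upload base is right now.
 * Values that are already absolute (legacy rows) pass through untouched.
 */
function example_path_to_absolute($stored, $base_dir)
{
    if ($stored !== '' && $stored[0] === '/') {
        return $stored;
    }
    return rtrim($base_dir, '/') . '/' . $stored;
}

// Store relative to the old base, resolve later against a new one.
$stored = example_path_to_relative('/srv/a/uploads/2006/12/cat.jpg', '/srv/a/uploads');
$resolved = example_path_to_absolute($stored, '/srv/b/uploads');
```

Since only the part below the upload base is stored, moving the upload directory means changing one base setting rather than rewriting every row.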

Related improvements that could be made:

Vaguely related improvements that could be made:

Update: All of the “related improvements” noted above have been made in the recently released WordPress 2.1. Of the “vaguely related improvements” suggested, 2.1 offers hookable thumbnail creation.