Hitting WordPress Attachment Handling

While debugging some stuff here on WordPress.com, I dived (once more) into the image-handling guts of WordPress.

A while ago, I revamped WordPress’ inline uploading functionality to be more pluggable, cacheable, maintainable and (hopefully) intuitive. A hundred bugs later, it seems to be working pretty well.

This time around, the issue was not how WP displays the data, but how it stores it.

In trying to make things leaner and meaner on WordPress.com, we reworked some of the behind-the-scenes directory structures on our servers. This should have been totally transparent to the user, but, of course, it wasn’t. It broke our image handling.

Each file uploaded to WordPress is stored as a special type of post called an “attachment”. In the post and postmeta tables, WordPress stores things like file URL, file path on the server, image dimensions, image thumbnail info, etc. So to figure out anything we want about an uploaded file, all we have to do is essentially get_post() and a few get_post_meta()s. Sounds great.

It’s not.

Two reasons:

  1. To answer a question like “Where is this file located on the server?”, we have to call get_post_meta() every time. Because that function is so generic, plugins have no way to filter the answer on the fly, even for this really simple question.
  2. WordPress stores the data in a really inconvenient way; everything is absolute: paths, URLs, you name it.

Because of the absolute paths, lack of API, and the aforementioned directory restructuring, we had hardcoded into our DB tens of thousands (a blind guess) of incorrect file locations and no way to filter them. Doing DB updates across hundreds of thousands of tables (not a blind guess) replicated on dozens of servers was not an option.

Partial solution (now implemented in WordPress core): write a basic API for getting and putting the data. wp_get_attachment_metadata(), wp_update_attachment_metadata(), get_attached_file() and update_attached_file() are all nicely filterable. (The lack of parallelism between the function names, and the ambiguity in those names, is a historical artifact.)
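
To illustrate why a dedicated, filterable accessor helps, here is a minimal sketch. It is written in Python as a language-neutral model and is not WordPress code: the hook registry, the in-memory POSTMETA dict, the relocate() callback and the example paths are all hypothetical stand-ins, and only the function name get_attached_file is borrowed from the API above. The point it tries to show is that the stored (stale) value never has to be rewritten in the database; a plugin corrects it as it is read.

```python
# Illustrative model only: not WordPress code. All names and paths are hypothetical.
from collections import defaultdict

_filters = defaultdict(list)  # hook name -> list of callbacks


def add_filter(hook, callback):
    """Register a callback that may rewrite a value passing through `hook`."""
    _filters[hook].append(callback)


def apply_filters(hook, value, *args):
    """Run `value` through each callback registered on `hook`, in order."""
    for callback in _filters[hook]:
        value = callback(value, *args)
    return value


# Stand-in for the postmeta table: attachment ID -> stored (stale) absolute path.
POSTMETA = {42: "/old/root/uploads/2006/11/cat.jpg"}


def get_attached_file(attachment_id):
    """Dedicated accessor: read the stored path, then let plugins adjust it."""
    stored = POSTMETA[attachment_id]
    return apply_filters("get_attached_file", stored, attachment_id)


def relocate(path, attachment_id):
    """Example plugin callback: map the old directory layout onto the new one."""
    old_root, new_root = "/old/root/", "/new/root/"
    if path.startswith(old_root):
        return new_root + path[len(old_root):]
    return path


add_filter("get_attached_file", relocate)
print(get_attached_file(42))
```

With a generic getter there is no single place to hang such a callback; with a purpose-built accessor, one filter can cover every caller.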

Improvements that still could be made: Don’t store absolute data. This is better for portability too. (Exception: guid – but don’t use it for the URL).
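
As a rough sketch of what “don’t store absolute data” could look like, the following Python model stores only the part of the path below the uploads directory and rebuilds the full path and URL at read time. This is an assumption about one possible design, not how WordPress does it: the helper names, the directory layout and the example.com URL are all hypothetical.

```python
# Illustrative model only: not WordPress code. All names and values are hypothetical.
import posixpath


def to_relative(absolute_path, uploads_dir):
    """Strip the uploads directory so only the portable part gets stored."""
    prefix = uploads_dir.rstrip("/") + "/"
    if not absolute_path.startswith(prefix):
        raise ValueError("file is not inside the uploads directory")
    return absolute_path[len(prefix):]


def to_path(relative_path, uploads_dir):
    """Rebuild the server path from the site's current uploads directory."""
    return posixpath.join(uploads_dir, relative_path)


def to_url(relative_path, uploads_url):
    """Rebuild the public URL from the site's current uploads URL."""
    return uploads_url.rstrip("/") + "/" + relative_path


# Store only the relative part at upload time.
stored = to_relative("/old/root/uploads/2006/11/cat.jpg", "/old/root/uploads")

# After a directory restructuring, only the base values change.
print(to_path(stored, "/new/root/uploads"))
print(to_url(stored, "http://example.com/files"))
```

Under this scheme a move like the one described above would mean changing one base setting per site rather than touching every stored row.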

Related improvements that could be made:

  • wp_get_attachment_url()
  • wp_get_thumbnail_file()
  • wp_get_thumbnail_url()
  • wp_attachment_is_image()
  • wp_mime_type_icon()
  • Rework get_attachment_icon() given the above
  • Kill reliance on or rework get_attachment_innerHTML() (see above)

Vaguely related improvements that could be made:

  • wp_handle_upload() is a little awkward (but nice and robust!). Maybe wrap it? That might be silly.
  • wp_generate_thumbnail_filename()
  • Make thumbnail creation hookable and provide above convenience function

Update: All of the “related improvements” noted above have been made in the recently released WordPress 2.1. Of the “vaguely related improvements” suggested, 2.1 offers hookable thumbnail creation.

About Mike Adams (mdawaffe)

I work at Automattic...
This entry was posted in note to self, WordPress, WordPress.com.

5 Responses to Hitting WordPress Attachment Handling

  1. David Esrati says:

    I’ve got a small beef with the way files are uploaded: once you put in a title and a description, you can’t ever get back to edit them. Bad UI. It should be in a manage/attachments area, just like pages, posts, comments etc.
    I’ve also had problems lately with PDFs not building correct links. I end up with the upload going to the right place, /uploads/year/month, but the link to the PDF is just to WP-content or some such.
    The link to file, link to page, link to image interface is also a little convoluted. It seems that we should be able to tag pictures and “caption” them, instead of having to put them on a separate page.
    If there were a one-size-fits-all file upload solution, I’m sure someone would have figured it out, but I know that you guys are really smart and knew all this already.

  2. mdawaffe says:

    David,

    We’ve fixed several of those problems in the upcoming WordPress 2.1 (which is not yet in beta). So you have something to look forward to in the next couple of months!

  3. miyoshi says:

    Hi mdawaffe,

    That was a nice job you did. I like this API. I wrote an attachment management helper plugin for WordPress 2.0 and am now rewriting it for the new 2.1 version. I found the API so helpful.

    By the way, it seems that wp_get_attachment_url() still refers to the guid for the URL. Doesn’t this harm portability? Are there existing patches on Trac for this?

  4. mdawaffe says:

    miyoshi,

    I’m glad you found the API useful.

    The old use of the guid as the URL was one of the main reasons this API was written. In a future release, it will be very easy to rewrite the insides of wp_get_attachment_url() to grab the URL from somewhere else. Since only the insides of the function will change, plugins that use the API will not break.

  5. Pingback: Mtekk’s Crib » WordPress Attachments - Part 1
