WordPress __autoload() idea

PHP 5 introduces autoloading classes. The way WordPress is currently structured, I don’t think we’d get much benefit from switching to autoloading. There aren’t that many classes in core WordPress that don’t need to get loaded on every page load.

With some restructuring, though, we might be able to cut down on the number of bytes of code we load on every page (and we could get rid of some class_exists() and require_once calls).

We’d want an autoloader, though, that could handle plugin files as well as core files.

Idea: an autoloader that can load classes from some deterministic path (e.g., the usual $classname.class.php or what have you) but that can also register classes at specific paths.

if ( !defined( 'WP_AUTOLOAD_CLASSES' ) ) {
    define(
        'WP_AUTOLOAD_CLASSES',
        function_exists( 'spl_autoload_register' )
    );
}

if ( WP_AUTOLOAD_CLASSES ) {
    /**
     * PHP 5.1.2+
     * Called once per class during setup (each time
     * with two args) to tell it where the class is
     * located.
     * Don't need to call for a class if that class
     * exists in some easy deterministic path.
     * Called (with only one arg) when looking for a
     * not-currently-loaded class.
     */
    function wp_autoload( $class, $path = null ) {
        static $classes = array();

        // Being called by PHP's autoloader
        if ( is_null( $path ) ) {
            if ( isset( $classes[$class] ) ) {
                /* Use include, not require.  That
                 * way we get a more meaningful
                 * Fatal error: class does not exist
                 */
                include( $classes[$class] );
            } else {
                /* Look in some default path(s) for
                 * appropriately named files.
                 */
            }
            return;
        }

        // Being called by us
        $classes[$class] = $path;
    }
    
    // Register it
    spl_autoload_register( 'wp_autoload' );
} else {
    /**
     * PHP 4 (PHP 5.1.1-)
     * Just require each file.
     */
    function wp_autoload( $class, $path ) {
        require_once( $path );
    }
}

// Core classes in WPINC probably wouldn't have
// to be explicitly registered, but as an example:
wp_autoload(
    'WP_Error',
    ABSPATH . WPINC . '/classes.php'
);
// OK to call multiple times per path
wp_autoload(
    'Walker',
    ABSPATH . WPINC . '/classes.php'
);
// ...

// In some plugin file
wp_autoload(
    'My_Plugin_Foo',
    plugin_dir_path( __FILE__ ) . 'foo.class.php'
);
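
The default-path branch is left empty in the code above. A minimal sketch of what it might do, assuming the `$classname.class.php` naming mentioned earlier: `wp_autoload_find_default()` and its `$dirs` argument are hypothetical names invented for this illustration, not WordPress functions.

```php
<?php
// Hypothetical helper (not part of WordPress): look for
// "$classname.class.php" in a list of default directories.
function wp_autoload_find_default( $class, $dirs ) {
    // Only accept plain identifiers, so a class name can never be
    // used to climb out of the given directories.
    if ( ! preg_match( '/^[A-Za-z_][A-Za-z0-9_]*$/', $class ) ) {
        return null;
    }

    foreach ( $dirs as $dir ) {
        $file = rtrim( $dir, '/' ) . '/' . $class . '.class.php';
        if ( is_file( $file ) ) {
            return $file;
        }
    }

    // Nothing found: the caller simply does not include anything.
    return null;
}
```

Inside the empty else branch, the result could be passed to include() when it is not null, with something like array( ABSPATH . WPINC ) as the directory list. Every miss costs one is_file() call per directory, which is part of the tradeoff discussed in the comments.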

Update: To clarify, I don’t mean to imply that this idea would improve WordPress performance. It’s just an idea, not a proposal. I haven’t done any benchmarking or even naive experimentation.

Conditionally including files involves tradeoffs that need to be evaluated per app. Also, as Jacob Santos points out in the comments below, conditionally including files from conditionally defined functions is way out there.

I might call the idea clever, but it’s not necessarily good :)

About Mike Adams (mdawaffe)

I work at Automattic...
This entry was posted in WordPress.

21 Responses to WordPress __autoload() idea

  1. Pingback: ¿__autoload() en Wordpress? | aNieto2K

  2. sarsura80 says:

    sounds like a good idea, worth trying it..

  3. banago says:

    If it is worth including it in WordPress core, I hope they will sooner or later. Thanks for sharing this idea.

  4. Danny Archer says:

    if it is not broken, why fix it?

  5. Lloyd Budd says:

    Any performance measurements?

    I recall maybe… way back Cal of the Flickr team telling me about how in their scenario it provided better performance just loading it all, because then it was all served from memcache.

    • mdawaffe says:

      Yeah – as I recall conditionally loading anything screws over various internal PHP optimizations as well as APC performance. (Is that true anyone? I can’t find anything about that at the moment).

      You’d probably need to have an app that loads pretty large chunks of code and that uses lots of different large chunks of code for every page load.

  6. +1 to Lloyd’s comment. On a heavily loaded server, autoloading can have a severe performance hit.

    It’s as if you were adding the following before every call to every class:

    if ( !class_exists( 'foo', false ) ) __autoload( 'foo' );

    It’s sweet for a blog on a shared host, but definitely not that great for higher-end sites.

    D.

    • mdawaffe says:

      Hence the overwriteability of the constant.

      I expected as much about performance. The idea’s in the “huh.” phase :)

      Thanks.

  7. Otto says:

    I wrote a large internal app once that used autoloading classes. I was curious too, so I compared it with an init script that left out that bit and simply loaded every PHP file I had (which was in the 300+ range at one point). In that particular case, autoloading did provide a minor performance improvement on most of my pages, but on a few, it was a noticeable degradation.

    Naturally, I took a closer look at why, and I found that the faster pages didn’t use a lot of different classes, while the slower ones tended to use them all, or close enough to that. Suddenly there was a lot more loading of files, but that loading happened at times scattered throughout my code instead of all at the beginning, thus scattering my file access everywhere and screwing over the file caching. I optimized those cases away by sticking a bunch of preemptive include-onces in whenever a chunk of code used a whole lot of classes (where “whole lot” is not well defined), and that eliminated the performance hit.

    But unless your code is *heavily* class-oriented, you’re not going to get any performance improvements with auto-loading. The improvement all comes from not loading what you don’t need. If everything has to load anyway, you take a major hit from it.

    So, on the whole, I’d sit on autoloading for a while. Focus on making WP class-based and object-oriented, and eliminating the straight procedural code from places where it can be eliminated. Once that is done, then you’ll see some improvements from autoloading. Then it will make some real sense to use it too.

    • mdawaffe says:

      You’ll notice I never actually said this’d make WP better, just that it might cut down on bytes of PHP in memory :) Not a good idea to make performance claims without benchmarks.

      But, you have benchmarks! Makes sense that the more you use the autoloader, the slower your page is (in your scenario, at least. There must be some tradeoff based on how big the classes are that you’re loading. WP Classes would probably be on the small side of that tradeoff curve).

      I was mostly interested in the “dynamic” nature of the above autoloader: that core and plugins can tell it to load classes from wherever and not just from some algorithmically determined path. It also generates a more complicated tradeoff: how many classes you load on a page, how big the classes are, how many total classes you’ve registered (the static $classes’ memory footprint increases).

      Focus on making WP class-based and object-oriented, and eliminating the straight procedural code from places where it can be eliminated.

      That’s the “some restructuring” I was talking about ;)

    • Otto says:

      Looking closer at your idea above, I see that you’re doubling the function to provide a list of classes vs. files, sort of thing. Seems to me that that might be enough of a difference to make it worthwhile. Most of the PHP 5 autoloading tricks I’ve seen involve having that function find the class based on the name and such, and having it search for the class as well, if it can’t find it immediately. That, naturally, adds a big hit, which you can eliminate by making classes register.

      Here’s the problem though. In order to make that valuable for, say, a plugin, then the plugin needs to have lots of classes (of which it only uses a subset most of the time), and then has to register them all in the main plugin code. This adds maintenance overhead in keeping that list up to date, and it’s pretty limited in scope. How many plugins are going to use enough classes to make this worth the trouble?

      You’re probably correct in that big classes will get better results from this, but I think the case where you have a really good object oriented codebase going on (a few major base classes, lots of derived classes from them) will get the most improvement from it. The better you can subclass things, the smaller they become, but also the more numerous they become. Keep them in separate files, load them as needed, and suddenly you have large swaths of code not loading most of the time. However, this case doesn’t jibe with your registration approach either, as like you say, the $classes static var gets large and annoying. Doesn’t affect the core core (since everything will be in wpinc anyway), but for plugins it might be a problem. Hmm.. this requires thought, but there’s an idea in here somewhere. :)

    • Otto says:

      Oops, one more thing I forgot. Make it capable of letting you register a whole bunch of classes at once. If I’ve got a dozen classes in a plugin, then it’d be nice to make an array of classname/filename, pass that off, and have the whole mess added at once with an array addition instead of having 12 separate function calls.
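
A minimal sketch of that bulk-registration suggestion, assuming a registry kept separate from the post's wp_autoload(): `wp_autoload_map()` and `wp_autoload_from_map()` are hypothetical names, and the plugin file names are placeholders.

```php
<?php
// Hypothetical registry (not a WordPress function): returns the
// class => file map, merging in any new entries first.
function wp_autoload_map( $add = array() ) {
    static $classes = array();

    if ( $add ) {
        // With string keys, array_merge() lets later registrations
        // override earlier ones for the same class name.
        $classes = array_merge( $classes, $add );
    }

    return $classes;
}

// Hypothetical autoloader that only consults the registry.
function wp_autoload_from_map( $class ) {
    $classes = wp_autoload_map();

    if ( isset( $classes[$class] ) ) {
        include( $classes[$class] );
    }
}

spl_autoload_register( 'wp_autoload_from_map' );

// In some plugin file: one call registers every class at once.
wp_autoload_map( array(
    'My_Plugin_Foo' => dirname( __FILE__ ) . '/foo.class.php',
    'My_Plugin_Bar' => dirname( __FILE__ ) . '/bar.class.php',
) );
```

This sketch uses array_merge() rather than the + operator, so a later registration replaces an earlier one; with + the first registration would win instead. Which behavior is preferable for plugins is an open design choice.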

  8. netweblogic says:

    Otto, that’s a great comment. Couldn’t agree with you more.

    Autoloading is a great concept for object-oriented php programs, but WP is not OO right now. I think adding that now would be losing focus on priorities such as making WP an OO software.

  9. Oncle Tom says:

    It would be nice and would resolve code dependency problems.

    However, it increases the PHP requirement, as it would now be PHP 5. I hope this move will be considered, again.

  10. gipetto says:

    I :heart: autoload.

  11. Jacob Santos says:

    Damn, it is code like your example that is adversely affecting the performance in WordPress in the first place. WordPress would be better off if code like yours never, ever gets into the core.

    How many times must I say this! Every time PHP has to go back to the compile step when in run-time, it slows down performance more so than just having everything compiled at once.

    In your code PHP goes like this: compile->run-time->compile->run-time. This will be inherently slower than something like, say, compile->run-time->end. You see the difference? Your code is the first; quicker code will be the second.

    Now, as everyone can gather from the comments, your mileage will vary. Given that my servers use XCache and APC, your code will cripple the performance. You are purposely hindering opcode caching performance by forcing them to go back and perform the caching step again. Also, in some cases, depending on the version and opcode caching extension, the code might never be fully compiled, because the opcode caching extension does not know if it should keep the compiled code on the next run.

    This is why, it is generally recommended by Rasmus (the guy who initially wrote PHP and smarter than you and me, you should read his stuff and interviews). Also, all suggestions on compile and run-time performance are geared towards those who are using XCache or APC or some other opcode caching extension.

    This said, it will be faster in the case of shared hosting to only compile the code that is needed. That said, you gain nothing when everything is needed. Also, it will be faster to have a build script that places all of the code in a single file, instead of the 15 to 20 files that we have currently.

    I think the problem is that compile time is not reliably retrievable, so it is not really scientific, as statistics go, to diff $_SERVER['REQUEST_TIME'] against time(). You can do that to tell, but the real picture isn’t that easy to get because you really need to get into the guts of PHP to find where time is being wasted.

    If I sound like I’m flaming you, then it is because I am. You are welcome. I got into this with Ryan, you should talk to him sometime.

    • mdawaffe says:

      Thanks for the info. And no, I don’t think you’re flaming me (or if you are you should try harder). Sounds more like you’re flaming the idea, which is cool.

      I never said there would be performance gains. I never said it would be better. I’m sorry that the phrase “cut down on the number of bytes of code we load” comes across as “improve the performance” or that it sounds like I mean to use it as a performance metric.

      That’s not what I meant nor is it what I propose to do.

      I would never want something game-changing like that to go into core without real benchmarks. (I do not mean time() – time() on some dude’s laptop either.)

      Anyway, that’s neither here nor there. The post really was just an idea – not one I intended to implement or even propose – just one I wanted to jot down.

      I was thinking about how you might approach the issue of autoloading classes in apps supporting third party plugins that can put files in a number of different paths. (Suppose, for the sake of argument, that that issue is one you even want to approach in the first place.) The “dynamic” (can’t think of a more descriptive word) autoloader was one idea I had, but I wasn’t thinking specifically about WordPress when I was considering the autoloader. I could have called this a “Path agnostic __autoload() idea” instead of a “WordPress __autoload() idea”, but decided to ground it in some context.

      You’re right to be concerned if you felt this might work its way into WordPress with no real data to back it up (or in spite of data that knocks it down). Please consider this comment my attempt to assuage that concern.

    • Jacob Santos says:

      The SPL Autoload solution allows for multiple functions to hook into it and allow for them to handle it. Most likely you can use a whitelist or check for namespace in the class name (“WP”), before trying to include the file.
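
A minimal sketch of that prefix check with two autoloaders registered side by side. The function names, the "WP_" and "My_Plugin_" prefixes, and the `$GLOBALS['wp_class_dir']` setting are all assumptions made for this illustration.

```php
<?php
// Assumed location of core class files for this sketch.
$GLOBALS['wp_class_dir'] = dirname( __FILE__ ) . '/classes';

// Hypothetical core autoloader: only handles classes prefixed "WP_".
function wp_prefixed_autoload( $class ) {
    if ( strpos( $class, 'WP_' ) !== 0 ) {
        return; // Not ours: the next registered autoloader gets a turn.
    }

    $file = $GLOBALS['wp_class_dir'] . '/' . $class . '.class.php';
    if ( is_file( $file ) ) {
        include( $file );
    }
}

// Hypothetical plugin autoloader with its own prefix and directory.
function my_plugin_autoload( $class ) {
    if ( strpos( $class, 'My_Plugin_' ) !== 0 ) {
        return;
    }

    $file = dirname( __FILE__ ) . '/' . $class . '.class.php';
    if ( is_file( $file ) ) {
        include( $file );
    }
}

// spl_autoload_register() keeps a queue; PHP calls each function in
// registration order until the class exists.
spl_autoload_register( 'wp_prefixed_autoload' );
spl_autoload_register( 'my_plugin_autoload' );
```

Each loader returns early for classes outside its prefix, so an unrelated class name costs one strpos() call per registered loader and no file access.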

      Anything that has IF(…) : function() { } ENDIF; should never be prototyped. The issue is that many people take code from intelligent people as gold and so we get people doing function() { function() { } }. Functions contained inside other functions are worse than functions in conditionals (if the outer function is called twice, PHP dies with a fatal “Cannot redeclare” error).

      Best practices should always be used when writing examples.

      That said, I do think WordPress would be better served with several loading mechanisms, so that the person who wants to autoload WordPress may do so. I think it should very much be up to the developer to choose how WordPress is included. I would rather have the option to have the entire WordPress Includes in one file and the entire WordPress Administration in another file.

      I suggested before that WordPress could have a build script that creates those files. I understand that having the entire WordPress library in one file would be incredibly difficult to manage. The speed advantage for those using XCache and APC, those like me, would benefit greatly from this.

      This is something that can be done for 2.8 or during 2.9. I have actually done this with notable millisecond differences (I didn’t have XCache or APC in those instances). However, to be honest, it was using the time() – time(), so you may take the diffs with a grain of salt, because they aren’t worth much.

      In fact, I wasn’t testing the compile time, I was testing how long require took, so the time decrease was based on removing the file IO by combining PHP.
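
A rough sketch of that kind of measurement using microtime( true ), which has sub-second resolution where time() only counts whole seconds. `time_includes()` is a hypothetical helper; it measures wall-clock time for file I/O plus compilation together and cannot separate the two.

```php
<?php
// Hypothetical helper: wall-clock seconds spent including a list of
// files. Numbers will differ between a cold and a warm opcode cache.
function time_includes( $files ) {
    $start = microtime( true );

    foreach ( $files as $file ) {
        // Functions and classes defined in the file still end up
        // global, even though include runs inside this function.
        include( $file );
    }

    return microtime( true ) - $start;
}
```

A single run says very little. Repeating the measurement many times, with and without an opcode cache, would be needed before drawing any conclusion.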

      This is partly why __autoload() also decreases performance. Not only do you add the extra File IO, but also a function call on top of it.

      I guess I wasn’t clear. My problem with this idea is foremost performance. The second is that WordPress would not benefit, given the current file structure and placement of classes. Most classes are required anyway, so you don’t gain much by autoloading them, because you’ll be loading 99% of them regardless. No advantage.

      The third reason is that I don’t think you should just have one way to load the files. For development you want to spread the libraries out into their files, so that you can find the functions and classes quicker than browsing a 100000+ line file. This is to say that if you were to implement an __autoload(), then you will want to provide the multiple file approach. Finally, if at all possible (using a build script), having a single file with all of the library code for those using XCache and APC would be great.

      Therefore, it is easy to say, “Be all things to all people,” but that is, in general life and development, difficult to impossible. In this case, besides the single file approach, these two are not difficult to implement side by side.

      I also have problems with themes and other web applications where every file starts with: IF( attempted direct access to this file) die();. The problem is that it is irrelevant, since for some of the files, there is no code that is actually executed. In other cases, you can get away with just if ( ! defined( 'ABSPATH' ) ) die(); which is a lot quicker than the other approach of checking the URL, which in some cases can be spoofed (also depends on how the check works).

      I guess my problem was that I was worried some guy would come along, notice that the code was “The R0ck3rz” and decide to apply it to their project or code. No one should go along thinking that putting procedures and classes in conditionals is a Good Idea(SM). The bar needs to be set as, if this is something you should do, then you should know what you are doing.

      You know what you are doing, but some novice with two months of PHP is not going to know what the problem is.

      Also, I’d rather WordPress were refactored to remove as many of these types of constructs as possible from the WordPress source code. I guess the pluggable functions would have to stay as they are. Plugins hinder this as well.

    • aen says:

      Why in the world would an op-code cache EVER discard the results of parsing/compilation unless the underlying file had changed? I know I’m replying to an old post, but for crying out loud… The point of an op-code cache is to, like, CACHE the semi-compiled bits that result from the PHP parser doing its thing, SO THAT IT DOESN’T HAVE TO PARSE THE SOURCE FILE AGAIN ON THE NEXT PAGE LOAD.

      What did I miss?

  12. Pingback: Top Posts « WordPress.com
