Running Zend Framework modules from a Phar file

Building an MVC application is probably the most common use of Zend Framework. If you have created such an application, you have probably heard about modules: reusable components of your application. Ideally these modules are drop-in and require little or no configuration before you can use them. This way you don’t have to recreate the same old “news module” for every client or customer. In my case, I usually just copy and paste the module from one base project into a new project. That’s easy. But it would be cooler to package your module as a Phar file, and run that file instead.

Bootstrapping Phar modules

Unfortunately, there is no support in Zend Framework 1.x for running a module from a Phar file. But we can build that support ourselves! With a little knowledge about bootstrapping, we can integrate Phar support for modules.

During the bootstrapping process, so-called resources are used to prepare the application for running. A lot of resources are readily available, like a database resource for your database connection, a cachemanager resource for configuring caches, and so on. There is also a “Modules” resource, which takes care of bootstrapping each module. The purpose of bootstrapping a module is mainly to make the module namespace known in your application and to tell the FrontController where it can find the module.

This built-in modules resource loops over each directory it finds inside the specified modules folder, and then finds and executes the Bootstrap.php file inside each module. Since it iterates over directories, any Phar file we put there is simply ignored. We have to make sure that Phar files also get bootstrapped, and we can do this by providing our own modules resource. The built-in one is called Zend_Application_Resource_Modules, and resides in Zend/Application/Resource/Modules.php. There is no need to hack that file to add Phar support: we can create our own Modules.php resource that extends the Zend Framework one, and add the extra functionality on top of the existing code.

Here’s my implementation:

// file: library/MyLib/Application/Resource/Modules.php
class MyLib_Application_Resource_Modules extends Zend_Application_Resource_Modules
{
    /**
     * Initialize modules
     *
     * @return array
     */
    public function init()
    {
		// call parent functionality:
		parent::init();
		
		// find out module directory:
        $bootstrap = $this->getBootstrap();
        $bootstrap->bootstrap('FrontController');
        $front = $bootstrap->getResource('FrontController');
		$modulesDirectory = $front->getModuleDirectory() . '/modules'; // TODO hard coded...
		
        $default = $front->getDefaultModule();
        $curBootstrapClass = get_class($bootstrap);

		// find all PHAR modules, and bootstrap those:	
		$iterator = new DirectoryIterator($modulesDirectory);
		
		foreach ($iterator as $file) {
			// only real files whose extension is exactly "phar" (so not e.g. "news.phar.bak")
			if ($file->isFile() && 'phar' === pathinfo($file->getFilename(), PATHINFO_EXTENSION)) {
				$module = basename($file->getFilename(), '.phar');
				
				// bootstrap the module:
				$this->_bootstrapPharModule($file, $bootstrap, $default, $curBootstrapClass);
				
				// add to the modules in the FC; use a forward slash for the path
				// inside the archive, just like for Bootstrap.php further down
				$front->addControllerDirectory('phar://' . 
						$file->getPath() . 
						DIRECTORY_SEPARATOR . 
						$file->getFilename() . 
						'/controllers', 
				$module);
			}
		}
		
		return $this->_bootstraps;
    }
	
	/**
	 * Bootstraps a single PHAR module
	 *
	 * @param DirectoryIterator $file
	 * @param Zend_Application_Bootstrap_Bootstrapper $bootstrap
	 * @param string $default
	 * @param string $curBootstrapClass
	 * @return void
	 * @throws Zend_Application_Resource_Exception When bootstrap class was not found
	 */
	protected function _bootstrapPharModule (DirectoryIterator $file, $bootstrap, $default, $curBootstrapClass)
	{
		$fullPharPath = $file->getPath() . DIRECTORY_SEPARATOR . $file->getFilename();
		include($fullPharPath);
		// strip only the trailing ".phar" to get the module name
		$module = basename($file->getFilename(), '.phar');
		
		$bootstrapClass = $this->_formatModuleName($module) . '_Bootstrap';
		if (!class_exists($bootstrapClass, false)) {
			$bootstrapPath  = 'phar://' . $fullPharPath . '/Bootstrap.php';
			if (file_exists($bootstrapPath)) {
				$eMsgTpl = 'Bootstrap file found for module "%s" but bootstrap class "%s" not found';
				include_once $bootstrapPath;
				if (($default != $module)
					&& !class_exists($bootstrapClass, false)
				) {
					throw new Zend_Application_Resource_Exception(sprintf(
						$eMsgTpl, $module, $bootstrapClass
					));
				} elseif ($default == $module) {
					if (!class_exists($bootstrapClass, false)) {
						$bootstrapClass = 'Bootstrap';
						if (!class_exists($bootstrapClass, false)) {
							throw new Zend_Application_Resource_Exception(sprintf(
								$eMsgTpl, $module, $bootstrapClass
							));
						}
					}
				}
			} else {
				// nothing to bootstrap, so let's move on
				return;
			}
		}
		
		if ($bootstrapClass == $curBootstrapClass) {
			// If the found bootstrap class matches the one calling this
			// resource, don't re-execute.
			return;
		}

		$moduleBootstrap = new $bootstrapClass($bootstrap);
		$moduleBootstrap->bootstrap();
		$this->_bootstraps[$module] = $moduleBootstrap;
	}
}

What happens is not very difficult: we find each Phar module, find the Bootstrap.php file in it, execute it and let the FrontController know we have bootstrapped a new module. The actual reading from a Phar file is handled transparently by PHP’s Phar extension (through the phar:// stream wrapper), and doesn’t need any special treatment in Zend Framework. Credit for this code should actually go to Zend Framework: I looked into the original file, and almost literally copied what I could reuse.
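The application still has to be told to look for resources in the custom library. The fragment below is a minimal sketch of how that could look in application.ini, assuming the class lives in library/MyLib as in the listing and that modules are stored in application/modules; adjust the paths to your own layout. Plugin paths that are registered later are searched first, so the “modules” resource should then resolve to MyLib_Application_Resource_Modules instead of the built-in one.

```ini
; application/configs/application.ini (paths are assumptions about your layout)
autoloaderNamespaces[] = "MyLib_"
pluginPaths.MyLib_Application_Resource = APPLICATION_PATH "/../library/MyLib/Application/Resource"

resources.frontController.moduleDirectory = APPLICATION_PATH "/modules"
resources.modules[] =
```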

Conclusion

Adding Phar support isn’t very hard. Too bad it isn’t built into the framework, but luckily for us, the framework is flexible enough that we can add it ourselves. Once the module is bootstrapped, all its classes can be used in your application. Once the FrontController knows where to locate your module, it’s accessible through the standard /module/controller/action request scheme. If you want to know more about creating your own Phar file, I recommend reading this excellent article from Cal Evans. It gave me exactly the information I needed for building my own Phar files.
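To give an idea of what packaging a module involves, here is a minimal sketch that uses PHP’s Phar class. The helper name buildModulePhar() and the example paths are made up for illustration; the only assumptions are that the module directory contains Bootstrap.php and a controllers/ folder at its top level (which is where the resource above looks for them), and that phar.readonly is disabled while building.

```php
<?php
/**
 * Packs a module directory into "<module>.phar" inside $targetDir.
 * buildModulePhar() is a hypothetical helper, not part of Zend Framework.
 *
 * @param string $moduleDir directory holding Bootstrap.php, controllers/, ...
 * @param string $targetDir directory where the archive is written
 * @return string full path of the created archive
 */
function buildModulePhar($moduleDir, $targetDir)
{
    if (!Phar::canWrite()) {
        // archives are read-only by default; run with "php -d phar.readonly=0"
        throw new RuntimeException('phar.readonly must be 0 to build archives');
    }

    $module   = basename($moduleDir);
    $pharPath = rtrim($targetDir, '/\\') . DIRECTORY_SEPARATOR . $module . '.phar';

    if (file_exists($pharPath)) {
        unlink($pharPath); // start from a clean archive
    }

    $phar = new Phar($pharPath);

    // files are stored relative to $moduleDir, so Bootstrap.php ends up
    // at phar://.../<module>.phar/Bootstrap.php
    $phar->buildFromDirectory($moduleDir);

    // minimal stub: including the archive itself does nothing
    $phar->setStub('<?php __HALT_COMPILER(); ?>');

    return $pharPath;
}

// Usage (example paths):
// $path = buildModulePhar('/path/to/src/news', '/path/to/application/modules');
// var_dump(file_exists('phar://' . $path . '/Bootstrap.php'));
```

Because the stub is empty, including the archive directly has no side effects, and everything else is reached through phar:// paths.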

A smarter strategy for using Zend_Navigation

What I love about Zend Framework is that it has so many components you can use. One of the components that can make your life easier is Zend_Navigation. According to the documentation, it’s a component “for managing trees of pointers to web-pages”. So basically you can use it for all your navigational needs on your website: menus, breadcrumbs, sitemaps, …

One site menu please, with caching

Most examples of how to use Zend_Navigation show you how to create a relatively simple site structure. The number of pages is limited, which is great for demonstrating the different features of this component. And to be honest, most sites you’ll create will just have a simple site structure.

Sometimes, you’ll build your site menu from a database, and usually this means your site structure will be bigger than what the examples show you. One way of making sure that everything stays fast is to use another great Zend Framework component: Zend_Cache. By caching your site structure, you skip fetching the data from the database and molding it into Zend_Navigation_Page or Zend_Navigation_Container objects. Here’s a small example:

/**
 * Site service
 *
 * @category	Service
 * @package		Site
 */
/**
 * Service for general site related stuff
 *
 * @category	Service
 * @package		Site
 */
class Application_Service_Site
{
	const USE_CACHE_TRUE = true;
	const USE_CACHE_FALSE = false;
	
	/**
	 * Returns the site navigation from the database
	 *
	 * @param boolean $useCache OPTIONAL flag to specify if caching should be used
	 * @return Zend_Navigation 
	 */
    public function getNavigation ($useCache = self::USE_CACHE_TRUE)
    {
		// if caching is enabled, try to load the navigation from the cache:
        if (self::USE_CACHE_TRUE === $useCache) {
            $cache = $this->_getCacheObject('sitenavigation');
			if (!$container = $cache->load('sitenavigation')) {
				// cache was invalid, so regenerate the data, and cache it:
				$container = $this->getNavigation(self::USE_CACHE_FALSE);
				$cache->save($container, 'sitenavigation');
			}
			return $container;
        }
        
		// fetch data from DB:
		$results = $this->getCompleteCategoryTree();
		
		// recursively build the navigation, starting from an empty container:
		$container = $this->_buildNavigationTree(new Zend_Navigation(), $results);

        return $container;
    }
	
	/**
	 * Transforms a tree of data to a tree in Zend_Navigation
	 *
	 * @param Zend_Navigation $container
	 * @param object $tree
	 * @return Zend_Navigation
	 */
	protected function _buildNavigationTree ($container, $tree)
	{
		foreach ($tree as $node) {
			$page = new Zend_Navigation_Page_Mvc(array(
				'label'         => $node->label,
				'controller'    => $node->controller,
				'action'        => $node->action,
				'params'        => unserialize($node->params)
			));
			
			if ($node->hasChildren()) {
				$page = $this->_buildNavigationTree($page, $node->getChildren());
			}
			
			$container->addPage($page);
		}
		
		return $container;
	}
	
	/**
     * Get a cache object from the cache manager
     *
     * @param string $cacheName name of the cache as configured in the cache manager
     * @return Zend_Cache_Core
     */
    protected function _getCacheObject ($cacheName)
    {
		// get the bootstrap:
		$bootstrap = Zend_Controller_Front::getInstance()->getParam('bootstrap');
		
		// get cachemanager from the bootstrap:
    	$cacheManager = $bootstrap->getResource('cachemanager');
		
		// get requested cache:
    	$cache = $cacheManager->getCache($cacheName);
		
    	return $cache;
    }
}

Never mind that the example isn’t really realistic and that the code isn’t complete. I’ve added in-line comments everywhere, so it should be self-explanatory. In short, the code fetches data from the database, and transforms it into a Zend_Navigation object. Basically, you fetch a tree of data, and recursively create a tree in Zend_Navigation for your entire site structure. Optionally, you can specify to use caching or not. For convenience, the cache is managed by the ‘cachemanager’ Application Resource. This makes it easy to have a different setup for production/development environment via your application.ini config:

resources.cachemanager.sitenavigation.frontend.name = Core
resources.cachemanager.sitenavigation.frontend.customFrontendNaming = false
resources.cachemanager.sitenavigation.frontend.options.lifetime = 1209600
resources.cachemanager.sitenavigation.frontend.options.automatic_serialization = true
resources.cachemanager.sitenavigation.backend.name = File
resources.cachemanager.sitenavigation.backend.customBackendNaming = false
resources.cachemanager.sitenavigation.backend.options.cache_dir = APPLICATION_PATH "/../tmp"
resources.cachemanager.sitenavigation.frontendBackendAutoload = false

And then it goes horribly wrong!

The example above works fine, and you just start coding away on other stuff. You fill up the database with some test data, and you’re happy that everything works just fine. But you’ve already made your first mistake: premature optimization. By defaulting to using cache, you don’t really know how fast or slow things are without using a cache mechanism. But no worries, let’s assume you didn’t make that mistake, and you turn on caching after you’ve evaluated the current site speed, and you see that caching really makes things faster.

What went horribly wrong for me is that by using the cache, I saw the page loading time go down from 4 seconds to 1.8 seconds. That’s quite a speed-up, but 1.8 seconds is still slow. And since there was no other optimization option available to me, I had to start looking at the code. It turned out that my navigation was a big time consumer. If I disabled my navigation entirely, the page loading time went down to 0.9 seconds. So there was definitely something fishy going on there.

Who needs a big tree anyway?

A bonsai is nice too! Sorry, bad joke. But that’s exactly what was wrong: there was so much data in the database that the navigation tree just became too big. Everything was read from the file cache and loaded into memory. On each request. Ouch! That hurt the server (and my ego). Loading more than 10,000 pages into memory isn’t that efficient. And when you think about it, who really needs the entire tree in memory on each request anyway? No one does, that’s who.
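If you want a feel for that cost, a rough sketch like the one below compares unserializing 30 pages with unserializing 10,000. It uses plain arrays as a stand-in for Zend_Navigation pages, and the function names are made up, so treat the numbers as an indication only; real page objects are heavier than arrays.

```php
<?php
/**
 * Builds a flat list of $count fake pages (plain arrays, not real
 * Zend_Navigation pages). Hypothetical helper for this sketch only.
 */
function makeFakePages($count)
{
    $pages = array();
    for ($i = 0; $i < $count; $i++) {
        $pages[] = array(
            'label'      => 'Page ' . $i,
            'controller' => 'category',
            'action'     => 'overview',
            'params'     => array('category_id' => $i),
        );
    }
    return $pages;
}

/**
 * Measures how long it takes to unserialize $count pages, which is
 * roughly what a file cache has to do on every request.
 */
function measureUnserialize($count)
{
    $payload = serialize(makeFakePages($count));

    $memBefore = memory_get_usage();
    $start     = microtime(true);
    $pages     = unserialize($payload);
    $elapsed   = microtime(true) - $start;
    $memUsed   = memory_get_usage() - $memBefore;

    return array(
        'pages'   => count($pages),
        'seconds' => $elapsed,
        'bytes'   => $memUsed,
    );
}

foreach (array(30, 10000) as $count) {
    $result = measureUnserialize($count);
    printf(
        "%5d pages: %.4f s, %d KB\n",
        $result['pages'],
        $result['seconds'],
        (int) round($result['bytes'] / 1024)
    );
}
```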

My mistake was assuming that the practices that work for small and static site navigations would just work for large and dynamic ones. My last project was a website for a company that offers training and courses. There were more than 6,000 courses, neatly organized in a category structure that was sometimes 3 levels deep. For the navigation, I always needed the first level of categories, optionally the subcategories of a selected category, and optionally a selected course. This came down to the 30-ish pages I needed. Compare that to the more than 10,000 I initially loaded into memory.

How do you start optimizing your code so you end up with a Zend_Navigation container that’s populated depending on your request, instead of just loading everything? Quite simple really, by just passing some extra parameters from the request to the method that builds the navigation. Here’s an example:

/**
 * Site service
 *
 * @category	Service
 * @package		Site
 */
/**
 * Service for general site related stuff
 *
 * @category	Service
 * @package		Site
 */
class Application_Service_Site
{
	const USE_CACHE_TRUE = true;
	const USE_CACHE_FALSE = false;
	
	/**
	 * Returns the site navigation from the database
	 *
	 * @param array $params request parameters (controller, action, category_id, ...)
	 * @return Zend_Navigation 
	 */
    public function getNavigation (array $params)
    {
		// fetch data from DB:
		$results = $this->getCategoriesAndBranchForParams($params);
		
		// recursively build the navigation, starting from an empty container:
		$container = $this->_buildNavigationTree(new Zend_Navigation(), $results);

        return $container;
    }
	
	/**
	 * Get the path in the tree that leads from the root to current category
	 *
	 * @param array $params 
	 * @return array
	 */
	public function getCategoriesAndBranchForParams (array $params)
	{
		// get the first level of categories:
		$categories = $this->_getCategoriesOnLevel(0);
		
		foreach ($categories as &$category) {
			if ($category->getHasChildWithId((int) $params['category_id'])) {
				// get the leaf of the tree we're currently on:
				$leafNode = $this->_getCategoryById($params["category_id"]);
				
				// get the path from the leaf to the root:
				$category = $this->_retraceToNode($leafNode, $category);
			}
		}
		
		return $categories;
	}
	
	/**
	 * Recursively retrace the path from given node up to $rootNode
	 *
	 * @param object $node
	 * @param object $rootNode first-level category to stop at
	 * @return object the root category, with the retraced branch attached as children
	 */
	protected function _retraceToNode ($node, $rootNode)
	{
		if (null === $node->getParentNodeId()) {
			// $node has no parent, so it is a root itself
			return $node;
		}
		
		$parent = $this->_getCategoryById($node->getParentNodeId());
		$parent->addChild($node);
		
		if ($parent->getCategoryId() == $rootNode->getCategoryId()) {
			// we have reached the desired $rootNode: return it with its branch
			return $parent;
		}
		
		// we haven't reached our desired $rootNode yet, so continue
		return $this->_retraceToNode($parent, $rootNode);
	}
	
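	/**
	 * Returns the navigation from the cache
	 *
	 * @param array $params request parameters, see getNavigation()
	 * @return Zend_Navigation
	 */
	public function getCachedNavigation (array $params)
	{
		// the tree now depends on the requested category, so each category gets
		// its own cache entry (assumption: only 'category_id' influences the tree)
		$cacheId = 'sitenavigation_' . (isset($params['category_id']) ? (int) $params['category_id'] : 0);
		
		$cache = $this->_getCacheObject('sitenavigation');
		if (!$container = $cache->load($cacheId)) {
			// cache was invalid, so regenerate the data, and cache it:
			$container = $this->getNavigation($params);
			$cache->save($container, $cacheId);
		}
		return $container;
	}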
	
	/**
	 * Transforms a tree of data to a tree in Zend_Navigation
	 *
	 * @param Zend_Navigation $container
	 * @param object $tree
	 * @return Zend_Navigation
	 */
	protected function _buildNavigationTree ($container, $tree)
	{
		foreach ($tree as $node) {
			$page = new Zend_Navigation_Page_Mvc(array(
				'label'         => $node->label,
				'controller'    => 'category',
				'action'        => 'overview',
				'params'        => unserialize($node->params)
			));
			
			if ($node->hasChildren()) {
				$page = $this->_buildNavigationTree($page, $node->getChildren());
			}
			
			$container->addPage($page);
		}
		
		return $container;
	}
	
	/**
     * Get a cache object from the cache manager
     *
     * @param string $cacheName name of the cache as configured in the cache manager
     * @return Zend_Cache_Core
     */
    protected function _getCacheObject ($cacheName)
    {
		// get the bootstrap:
		$bootstrap = Zend_Controller_Front::getInstance()->getParam('bootstrap');
		
		// get cachemanager from the bootstrap:
    	$cacheManager = $bootstrap->getResource('cachemanager');
		
		// get requested cache:
    	$cache = $cacheManager->getCache($cacheName);
		
    	return $cache;
    }
}

First, but least important: the caching has been moved to a dedicated method called “getCachedNavigation”. Personally, I find it more readable than working with flags.

Secondly, and much more important, a parameter has been introduced in the “getNavigation” method. This parameter is what you would get from the request object, and contains the controller name, action name, and query parameters. These parameters are then used to fetch the part of the tree that’s relevant for the current page. Instead of fetching the entire tree, we only fetch the first level of categories, plus the one branch we need in order to determine the correct breadcrumb path. All the rest is irrelevant.
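Stripped of Zend Framework and the database, the idea can be sketched with plain arrays. The function names and the flat “id => parent” structure below are assumptions made for this illustration, not the project’s actual data model: keep every first-level category, and below that keep only the nodes that lie on the path to the active category.

```php
<?php
/**
 * Keeps all first-level categories plus the single branch leading to $activeId.
 *
 * @param array    $categories flat list: id => array('parent' => int|null, 'label' => string)
 * @param int|null $activeId   id of the category being viewed
 * @return array nested nodes: array('id' => ..., 'label' => ..., 'children' => array)
 */
function buildBranchTree(array $categories, $activeId)
{
    // walk up from the active category to its root, remembering each id
    $path = array();
    $id   = $activeId;
    while (null !== $id && isset($categories[$id]) && !isset($path[$id])) {
        $path[$id] = true;
        $id = $categories[$id]['parent'];
    }

    return buildLevel($categories, null, $path);
}

/**
 * Builds one level of the tree; below the first level only nodes on $path survive.
 */
function buildLevel(array $categories, $parentId, array $path)
{
    $nodes = array();
    foreach ($categories as $id => $category) {
        if ($category['parent'] !== $parentId) {
            continue; // not a child of the node we are expanding
        }
        $onPath = isset($path[$id]);
        if (null !== $parentId && !$onPath) {
            continue; // deeper levels: skip everything off the active branch
        }
        $nodes[] = array(
            'id'       => $id,
            'label'    => $category['label'],
            'children' => $onPath ? buildLevel($categories, $id, $path) : array(),
        );
    }
    return $nodes;
}

// Example data: two first-level categories, the active page is category 5
$categories = array(
    1 => array('parent' => null, 'label' => 'Languages'),
    2 => array('parent' => null, 'label' => 'IT'),
    3 => array('parent' => 1,    'label' => 'French'),
    4 => array('parent' => 1,    'label' => 'Spanish'),
    5 => array('parent' => 3,    'label' => 'French for beginners'),
    6 => array('parent' => 2,    'label' => 'PHP'),
);

print_r(buildBranchTree($categories, 5));
```

In the real service the walk up the tree is done with database lookups instead of array lookups, but the resulting shape is the same: a handful of nodes instead of the whole tree.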

A thousand words

A picture often says more than a thousand words. Here’s how I have reduced the memory footprint of my navigation, by simply keeping in mind which page is being looked at.

[Image: navigation_tree.png]

This is just a small representation of what was originally in memory for the Zend_Navigation object. Imagine each level 1 category to have at least 2 second level categories. Each second level category has at least 10 pages. After optimizing, the result is:

[Image: navigation_branch.png]

As you can see, the end result is a much simpler and smaller tree. I have kept the first level of categories, as I needed them for my menu. Then I have 1 branch which reaches out to the page I’m currently on. This branch is used for the breadcrumbs.

Conclusion

Needless to say, going from a very big tree in memory to a very small one will dramatically improve your site’s responsiveness. This optimization roughly translated into a 50% speed boost for my project. Best of all, it took only 1 hour of work to achieve this. I have learned from this that caching should never be implemented and used from the beginning. It’s a premature optimization, and it can lead us to think that something else is the bottleneck rather than the code. Taking away the cache forced me to actually think again about the code I had written. Sometimes improvements are obvious, if you open up your mind to them.