Following reports that search on the Cloudworks site was failing with an ungraceful PHP error message for certain search terms, we investigated the cause. It transpired that the Zend Lucene search engine was indexing the full HTML of the cloud, cloudscape and user pages, including recurring content such as navigation, header and footer elements. Not only did this skew search results by giving recurring terms unfair weighting, it also caused the error itself: the sheer number of results made the search process time out. Search terms that caused problems included ‘events’, ‘tags’ and ‘cloud’.
Our approach to solving the problem was to create new cut-down versions of the cloud, cloudscape and user pages that are essentially ‘chrome-less’ and output only the content that needs to be indexed.
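To illustrate why indexing the page chrome matters, here is a minimal sketch. It is written in Python purely for illustration and does not use Zend Lucene or any CloudEngine code. The page contents, the `HEADER` and `FOOTER` strings and every function name are hypothetical. It builds a toy inverted index twice, once from full pages and once from chrome-less ones, and compares the results for a term that also appears in the navigation.

```python
import re

# Hypothetical recurring page chrome; the real templates are not shown in this post.
HEADER = "<div id='nav'>Home Clouds Cloudscapes Events Tags</div>"
FOOTER = "<div id='footer'>About Cloudworks</div>"

# Hypothetical page bodies keyed by an invented page id.
PAGES = {
    "cloud/1": "<h1>Open education</h1><p>Notes on sharing learning designs.</p>",
    "cloud/2": "<h1>Conference events</h1><p>Upcoming events for educators.</p>",
    "user/7": "<h1>Jane Example</h1><p>Interested in pedagogy.</p>",
}


def render(body, chromeless=False):
    """Return the page HTML, with or without the recurring header and footer."""
    return body if chromeless else HEADER + body + FOOTER


def tokens(html):
    """Strip tags crudely and return the set of lower-case words in the page."""
    text = re.sub(r"<[^>]+>", " ", html)
    return set(re.findall(r"[a-z]+", text.lower()))


def build_index(chromeless):
    """Build a toy inverted index: term -> set of page ids containing it."""
    index = {}
    for page_id, body in PAGES.items():
        for term in tokens(render(body, chromeless)):
            index.setdefault(term, set()).add(page_id)
    return index


def search(index, term):
    """Return the sorted page ids that contain the term."""
    return sorted(index.get(term.lower(), set()))


full_index = build_index(chromeless=False)
lean_index = build_index(chromeless=True)

print(search(full_index, "events"))  # ['cloud/1', 'cloud/2', 'user/7']
print(search(lean_index, "events"))  # ['cloud/2']
```

With the chrome included, every page matches ‘events’ because the word appears in the navigation. With the chrome-less pages, only the page that is actually about events matches.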
After the code changes went live, the live site had to be reindexed for them to take effect. Because the search index had been built up cumulatively as content was added to the site, a full reindex of the live site had never been performed. We first did a test run of a full reindex on our approval server to check that the site would remain functional. Once this had run successfully, we ran a full reindex on the live site, which took some hours but hopefully remained transparent to end users. The time-out message now seems to have disappeared, and meaningful results are being returned for the previously problematic search terms. The code change should also prevent similar problems with fresh installs of CloudEngine from v1.1.2.
As part of these changes, we also took the opportunity to add the FirePHP debugging tool to the codebase. It relies on Firefox with the Firebug and FirePHP add-ons as well as the server-side code. A setting has been added to the database that allows debug messages to be switched off, switched on for admin users only, or switched on for all authenticated users. This allows debug or informational messages to be output to admin users on a live site via the Firebug console when needed, while remaining invisible to other users.
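The three-way debug setting described above can be sketched as a simple gate in front of the logging call. This is an illustrative Python sketch only, not the CloudEngine implementation (which is PHP and uses FirePHP). The `DebugMode` values, the function names and the list standing in for the Firebug console are all assumptions.

```python
from enum import Enum


class DebugMode(Enum):
    """Hypothetical values for the database setting; the real ones are not given."""
    OFF = "off"
    ADMIN = "admin"
    AUTHENTICATED = "authenticated"


def debug_enabled_for(mode, is_authenticated, is_admin):
    """Decide whether the current user should receive debug messages."""
    if mode is DebugMode.OFF:
        return False
    if mode is DebugMode.ADMIN:
        return is_authenticated and is_admin
    # DebugMode.AUTHENTICATED: any logged-in user, admin or not.
    return is_authenticated


def debug_log(message, mode, is_authenticated, is_admin, sink):
    """Append the message to `sink` (a stand-in for the console) if allowed."""
    if debug_enabled_for(mode, is_authenticated, is_admin):
        sink.append(message)
        return True
    return False


console = []
debug_log("index rebuilt", DebugMode.ADMIN, True, True, console)   # admin: logged
debug_log("index rebuilt", DebugMode.ADMIN, True, False, console)  # non-admin: dropped
print(console)  # ['index rebuilt']
```

Keeping the decision in one small function means the setting can be changed on a live site without touching the individual logging calls.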