Why we moved from Drupal to CodeIgniter

CloudEngine and the Cloudworks website are based on the CodeIgniter PHP framework. However, this wasn’t originally the case. The first version of the Cloudworks site instead used the Drupal content management system, and we only moved over to CodeIgniter a year or so ago. I have been asked a few times about our reasons for this move, so before I go on maternity leave I thought I should put them down in writing.

Any sort of rewrite is a risky decision to make, but in this case it is one that I am really pleased with. It took about a month for me to migrate the code over, although that time also included implementing a new design for the site. We lost a tiny bit of functionality in the move – you get lots ‘for free’ with Drupal – but nothing significant.

Before I start, I want to say that I am still a fan of Drupal for certain types of projects. It also has a great community that seems to be doing all the right things in terms of the direction in which Drupal is moving. In fact, I have chosen to use Drupal for other projects since. So please don’t take this as an all-out attack on Drupal, but rather as some reasons why it wasn’t right for us in this context.

There are three main reasons why we moved and why I am glad that we made the decision to switch to CodeIgniter:

1) Productivity Curve

With Drupal, you can get the majority of a site built very quickly, but those last details can be agonising. Drupal has a lot of built-in functionality that you can effectively just switch on. It is also straightforward to write simple modules of your own. However, the flip side to this is that if Drupal doesn’t do exactly what you want, for example if you want a slightly different user interface or some subtle change in functionality, you often find yourself back at close to square one. If you have a site that you just want to be ‘good enough’ and to get up and running quickly, none of this is a problem, but if those details do matter to you, the time you saved at the start is quickly eclipsed by the time spent fine-tuning at the end.

Using a framework such as CodeIgniter, you do have to write everything from scratch and it is a little bit more work to get things set up at the start. However, having a framework does speed this up a great deal and writing new functionality with CodeIgniter is, if anything, quicker than writing an equivalent Drupal module. There are also lots of libraries out there that you can use with CodeIgniter, so you often aren’t starting from a completely blank slate. More importantly, when you write some code in CodeIgniter, you can write it so that it does precisely what you want it to from the start.

Some of this difference comes down to the learning curve, of course. I’m certainly not a Drupal novice. I’ve written a fair few modules for Drupal and know a certain amount about how to override core behaviour in Drupal, for example, but I definitely wouldn’t class myself as an expert either. I readily admit that for somebody with more knowledge than me, the difference in productivity between Drupal and CodeIgniter would be less extreme. Indeed, I’d be genuinely curious to know how quickly a Drupal expert could replicate the Cloudworks site. However, any time taken learning more about Drupal is time that would have needed to come out of the time allocated to development of the site. In contrast, you could read all the CodeIgniter documentation in a day and I haven’t yet known a PHP developer who hasn’t been able to pretty much hit the ground running with CodeIgniter.

Aside from the learning curve, I suspect that there are some things that are just fundamentally difficult to do in Drupal. I am in two minds about whether to go into specifics here as I don’t want to get dragged down into micro-issues. If you want a taster, you can see a few examples of my questions on the Drupal forums http://drupal.org/tracker/371512 before I gave up asking there! I know that some of the issues I had have been fixed in later versions of Drupal, but I also have memories of problems with handling images and with configuring the registration form and user profiles quite how I wanted them, and I doubt all of these have been resolved.

2) A different way of working

Working with Drupal felt a bit like building a boat by starting with a car and gradually modifying the car until it became a boat, whereas working with CodeIgniter felt like building a boat from scratch. This changes your mindset significantly when you are working on a site. Now if what you are building is very close to the ‘Drupal model’ then this isn’t too tough, but if it’s not then the mental process of figuring out how to twist Drupal to meet your requirements is much more of an uphill struggle.

Theming is a good case in point when it comes to this different way of working in Drupal versus CodeIgniter. If a designer produces some HTML and CSS for you, you can drop this straight into a site using CodeIgniter but with Drupal, you can have a major piece of work on your hands making it into a theme. Working with a designer on a Drupal version of the site was much trickier as I had to try and explain what would make a design feasible to be turned into a Drupal theme. More generally, I found it trickier discussing work on the site with other people on the project when we were using Drupal compared with when we started using CodeIgniter.

There is also a more subtle effect. Instead of starting out thinking ‘this is what I would really like the site to be like’ you find yourself naturally looking at what modules are out there and whether they might be useful. This is quite a different way to look at building a site and means that you are less open to ideas for which there aren’t easy Drupal solutions.

3) Site structure in the database

The way Drupal works is that in order to be configurable via a nice user interface, lots of the important information about the site sits in the database.  So as well as containing the ‘site data’ such as details of all the users, the text of content and so on, it also contains the ‘site structure data’ for example specifying the types of object (nodes in Drupal-speak) for your site and the ‘views’ of the data that you are using on specific pages. It’s exactly this that makes Drupal so easy and flexible to configure for non-developers.  On a site using a framework like CodeIgniter, such information would be more ‘hardcoded’ into the structure of the database or the actual code for the specific site in question.

This has a few knock-on effects. First, pushing quite simple changes in functionality to a live Drupal site usually isn’t as simple as just updating the code. This was a little while back, so there might be a nice solution to this nowadays (let me know if there is!), but in practice, to update the live Cloudworks site when I was using Drupal, I normally ended up writing a list of all the actions that I had performed via the Drupal admin interface, repeating these on the live site, hoping that I hadn’t made any mistakes and then praying that I didn’t need to roll back for some reason. Secondly, it would have made supporting our software much more complicated once it was open-source. It’s easy to be reasonably confident that somebody is using an unmodified version of your code, but it’s harder to be certain that they haven’t changed something vital in the Drupal admin interface. Finally, the Drupal way of having the site structure information in the database means that database queries can easily get thorny. This isn’t a problem too often, but queries that are now only moderately complex using CodeIgniter were often absolutely horrendous in Drupal.

Finally

I think the cumulative effect of all these things was that I became a much happier developer when we moved to CodeIgniter. I somehow felt in control of my code again and I spent less time tearing my hair out trying to find some Drupal workaround! As a result, I also felt I had more energy and enthusiasm to put into the site. Overall, although the move could be regarded as slightly controversial and I was nervous about it, I have no doubt that it was the right one for us in our situation.

 

Posted in Uncategorized | 18 Comments

Welcome to the CloudEngine blog!

Welcome to the CloudEngine blog. CloudEngine is the software behind the Cloudworks website that we have recently made open-source. You can find the source code, wiki and issue tracker at the CloudEngine BitBucket site. We’re very excited about going open-source and actively looking for volunteers to help contribute.

This is going to be the place where we announce news about CloudEngine and post about why we have made certain decisions and the lessons that we have learned. To kick things off, we have copied over some of the more technical blog posts from the Cloudworks blog.

I’ll be going on maternity leave in a week’s time, although I hope to make a couple more posts here before then. After that my colleague Nick Freear, who has worked on several major CloudEngine features, will be taking over leading the development.


Making Cloudworks open-source

[Originally posted by Juliette Culver on the Cloudworks blog]

My top priority before I go on maternity leave is to make the Cloudworks source code open-source and I thought it would be worth sharing the main things in our plan for this.

Name and branding

It is going to be confusing if the open-source version of the code is also called Cloudworks and if new installs look identical to the Cloudworks site, so we need to come up with a new name and branding. Names are hard because as well as picking a good name, you need to get them trademark-checked, you don’t want to clash with anything else with the same name that might cause confusion and you also want to be able to get a reasonably sensible domain name. We have some ideas but suggestions are very welcome! We’ve already got our graphic designer to produce a new colour scheme and site banner so we’re part of the way with the branding side of things.

Licensing

There are two main things to consider here. First, we have had to check with legal at The Open University whether we can release the code as open-source and if there are any restrictions on licences we can use, as well as check with the funding bodies, JISC and the EU, that have funded parts of the work. We’ve already done this and luckily it seems to be fairly straightforward and we can pick pretty much any licence we would like from that perspective.

Secondly, the code relies on various other open-source components, for example the CodeIgniter framework, jQuery, TinyMCE and Zend Lucene. These are covered by a variety of different licences and we need to make sure that we’re not breaking the terms of any of them when we release our code. I’m hoping that OSSWatch are going to come to the rescue here in helping me make sense of all this!

Governance Model

The OSSWatch website has lots of useful information about governance models. There are some interesting decisions for us here, especially with how the open-source work fits in with existing structures and with me going on maternity leave in October!

Installation and upgrade infrastructure

We need to write and test installation instructions for the code, as well as make sure that an ‘empty’ install of the code behaves reasonably sensibly. We also need to think ahead as to how we are going to manage upgrades and document which versions of PHP/MySQL we have tested with.

Configurability

We are going through the codebase trying to spot anywhere that things have been hardcoded in which wouldn’t make sense in the open-source version. One of the main places is the support and about pages which are currently hardcoded HTML. We also need to allow people to customise the theme and logo.

We have decided however on the whole to provide the code very much ‘as is’ in the first instance and work on improving it later. The admin interface for example is rather primitive, but we’re working on the principle that it is better to get the code out first and make those improvements afterwards.

Hosting, website and documentation

We need to decide where to host the code – at the moment, it is looking like a choice between GitHub and Bitbucket. We also need to put infrastructure in place for tracking bugs (and decide whether to try to import the bugs we currently have in our local bugtracker), and to think about things like a developer mailing list, wiki and website, along with the type of documentation and guidance we want to have on them. We also need to decide how we manage reports of security vulnerabilities.

Get involved

If you’re interested in being an early guinea pig for the open-source version or getting involved in development, please do contact us on cloudworks@open.ac.uk


The Cloudworks API – rationale and lessons

[Originally posted on Nick Freear’s blog]

A few weeks ago I blogged about the forthcoming Cloudworks application programming interface and published a document for review. Today, I thought I would explain some of the decisions behind the design of the API and share some of the lessons we’re learning. We’ll also touch on Javascript widgets and possible next steps. And, we’ll try not to “sell” REST or get into a holy war!


Rationale

We were concerned about creating a usable, easy-to-understand URL scheme. However, I also found that I prefer quite a strict REST approach. So, we spent some time oscillating between what we have today, and variations like “/api?method=cloud.getInfo&cloud_id=123”. There was a feeling that the latter was almost self-documenting. I hope the result we came to is fairly understandable:

  /api/{item}/{term}[/{related}].{format}

In the end, our reasoning was that it is useful to have the things that are at the core of your data model, in our case clouds, cloudscapes, users and tags, actually in the URL, for example, “/api/clouds/123/followers” (get the followers of a cloud). The optional parameters like “count”, “orderby”, or meta-data like an API key are expressed as GET parameters, eg. “?count=5&api_key=12345”. The ability to express the output format like a file extension, eg. “.json” (or “.xml”, soon) was taken from Twitter’s API among others. The above ideas can make the pattern of the API calls more predictable for developers. They also make HTTP caching easier, again as URLs are more patterned and predictable – an added benefit. We made the late decision to use plurals throughout, eg. “/api/cloudscapes …”. This is to allow us to extend the API – we haven’t yet implemented the call “/api/clouds” (get ‘all’ clouds, ordered by…), but we have made it easier to add this to the scheme.
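To make the URL pattern concrete, here is a minimal sketch of a helper that assembles a call from its parts. It is written in Python purely for illustration (the site itself is PHP), and the function name `build_api_url` and its arguments are hypothetical, not part of the Cloudworks code.

```python
from urllib.parse import urlencode


def build_api_url(item, term, related=None, fmt="json", **params):
    """Build a path following /api/{item}/{term}[/{related}].{format}."""
    # Core data-model items (clouds, cloudscapes, users, tags) go in the path.
    segments = ["api", item, str(term)]
    if related is not None:
        segments.append(related)
    path = "/" + "/".join(segments) + "." + fmt
    # Optional parameters and meta-data such as the API key go in the query string.
    if params:
        path += "?" + urlencode(params)
    return path


print(build_api_url("clouds", 123, "followers", count=5, api_key="12345"))
```

Keeping the resource in the path and the options in the query string means two calls for the same resource differ only after the “?”, which is what makes the scheme predictable.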

We chose JSON as the first output format as it was requested by our first client, SocialLearn, and critically it is perhaps the most universal format, for both in-browser scripting and server-side scripting. There is also an as yet undocumented callback parameter to allow for JSON-P, which is needed for cross-domain requests from in-browser scripts, for example when using the jQuery library. For example, “/api/clouds/123.json?callback=My.function2&api_key=...”. I also decided later on to make the outer element of the JSON response always an object. So, “lists” or arrays of items are within an object. This gives greater uniformity and the ability to add more meta-data, while adding a little more complexity. Generally, the response is as simple as possible, while providing links to other parts of the API. An example is “tags”, which changed from a simple array of tag-names:

"tags": [
    "OULDI",
    "Learning design"
]

To an array of objects, each containing an “api_url”:

"tags": [
    {"name":"OULDI", "api_url":"http://cloudworks..."},
    {...}
]
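The two decisions above (an outer object around every list, and an optional JSON-P callback) can be sketched roughly as follows. This is an illustration in Python rather than the site’s PHP; the function name, the “total” field and the example.org URL are all assumptions made for the sketch.

```python
import json
import re

# Accept only dotted identifiers such as "My.function2" as callback names.
CALLBACK_PATTERN = re.compile(r"[A-Za-z_$][\w$]*(\.[A-Za-z_$][\w$]*)*")


def make_response(key, items, callback=None):
    """Wrap a list in an outer object, optionally as a JSON-P call."""
    # The outer element is always an object, so meta-data can be added later
    # without changing the shape that clients already parse.
    body = json.dumps({key: items, "total": len(items)})  # "total" is hypothetical
    if callback is None:
        return body
    if not CALLBACK_PATTERN.fullmatch(callback):
        raise ValueError("invalid callback name")
    # JSON-P: the JSON is wrapped in a call to the client's own function.
    return f"{callback}({body});"


tags = [{"name": "OULDI", "api_url": "http://example.org/api/tags/OULDI.json"}]
print(make_response("tags", tags))
print(make_response("tags", tags, callback="My.function2"))
```

Checking the callback name before echoing it back is a precaution added here; the original post does not say how the callback parameter is validated.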

One issue that caused some, perhaps unexpected, discussion was API keys. These are a common way of monitoring and if necessary controlling the use of an API by client software, typically as a GET parameter, eg. …/api/clouds/active?api_key=1234. So far, so simple. There are at least four issues that have come out. One, how do you make an API open and easy to use, while at the same time not risking overloading servers? We obviously don’t have the vast server farms that Google and others have. Two, if you decide to use API keys, how do you keep them secure, especially for Javascript clients? Anyone can look in the client’s source HTML and Javascript to see the key. Three, how do you write documentation and example code that gives an example API key that actually works(!), to make it easy to get started, while controlling the use of that key? Four, how do you not get bogged down in this and the issue of rate-limiting, and actually get something out there?! Our approach is to require the use of API keys at least initially, to log every API request, with IP address, user-agent string and so on, to not try to implement rate-limiting too early, and to monitor how things progress. And, we may have some answers to the Javascript question – more soon. We also have an idea to allow the API key to be put in the HTTP request headers, like YouTube/Google allow, and Wixi prefer. This may be better for HTTP caching – any feedback on this would be particularly welcome.
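As a rough sketch of the “require a key and log everything” approach, the snippet below reads the key from the GET parameter, falls back to a request header, and builds the record to be logged. It is Python for illustration only; the header name “X-Api-Key”, the function names and the record layout are assumptions, not the Cloudworks implementation.

```python
from urllib.parse import parse_qs, urlparse


def extract_api_key(url, headers):
    """Return the API key from the query string, else from a request header."""
    query = parse_qs(urlparse(url).query)
    if "api_key" in query:
        return query["api_key"][0]
    # "X-Api-Key" is a hypothetical header name chosen for this sketch.
    # Real header lookups should be case-insensitive; a plain dict is not.
    return headers.get("X-Api-Key")


def log_entry(url, headers, remote_addr):
    """Build the record that would be logged for each API request."""
    return {
        "path": urlparse(url).path,
        "api_key": extract_api_key(url, headers),
        "ip": remote_addr,
        "user_agent": headers.get("User-Agent", ""),
    }


print(log_entry("/api/clouds/active.json?api_key=1234",
                {"User-Agent": "curl/7.0"}, "192.0.2.10"))
```

With the key in a header the URL is the same for every client, which is the reason it may work better with HTTP caches.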

Lessons

These are some of the lessons I’ve learnt through the API work. These build on tips from others.

  1. Consider “feature flags” in place of a branch in your code repository (the Flickr developers). This isn’t specific to APIs. The internationalization work I did earlier this year was completed on a branch, which left a minor merge headache at the end. For the API work I committed to ‘head’ and put flag-variables in the application configuration files. So, the code was on the live site before we flicked the switch. (Admittedly, this was easier because the API work touched less of the code.)
  2. Write test scripts early. (Well, I already knew I should write unit tests, but those can be tough, right!) I’ve picked up a number of bugs in the API by developing a test harness for all 24 calls. For example, I’m currently working on an XML output, as an alternative to JSON. This all looked really easy, until regression tests showed some calls failing – I was glad to pick up the bug early (the fix wasn’t too difficult). And, the test script gave me a warm fuzzy feeling before we had a real API user (sad?!). The test script uses cURL and does some basic checks on the response, and some of the tests existed before the implementations. There is also a test of the Javascript widgets (more below).
  3. Handle errors properly. It makes using your API easier if you use the HTTP error codes, and use them correctly. Don’t return a 200 for errors. The exception I found was for Javascript – if you return an error code the script won’t run. (And yes, I’ve coded some error handling into the Javascript.) I also put PHP’s error reporting level quite high and forced the display of errors through the API. I think this helped initially, though of course it’s no good for production.
  4. KISS, keep it simple stupid – start small, both in terms of the number of calls, and by keeping the response simple. Ideally, start with calls based on one or more use-cases. We were perhaps fortunate to start with JSON, which after looking at YouTube’s GData API I deliberately kept quite flat and simple. (YouTube’s GData JSON format directly encodes Atom, so it contains multiple XML namespaces and so on.) My approach to XML is often to add multiple namespaces – not good for the first response format.
  5. Look at other APIs, use them.
  6. Ideally start with a stable database schema and real content. These were two ways in which I think we were fortunate with Cloudworks. So far I’ve only had to make 3 or 4 changes to the data model. If you’re developing an API for a new web site try to delay the API. A few weeks may make all the difference.
  7. Use the API yourself! It makes it easier if you have some use-cases in mind. The Javascript widgets we’ve been working on made me think about the consistency of the response. For more on how we’re starting to use our API, look at ‘Next Steps’ below.
  8. Consider API keys and rate limiting early. As noted above there will be issues to resolve.
  9. Get it out there. That is why we haven’t dealt with authentication, posting clouds and so on. We were concerned at times about the tight time schedule we’ve kept to and the potential for hasty decisions. However, on balance I’m glad we’ve got something out, in the open and it’s starting to be used – thanks guys!
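Lessons 1 and 3 can be illustrated together with a small sketch: a configuration flag that keeps deployed code switched off, and a handler that reports failures through the HTTP status code rather than a 200. This is Python for illustration, not the site’s PHP; the flag name, the in-memory data and the function are all hypothetical.

```python
# Hypothetical application configuration; the flag ships switched off.
CONFIG = {"api_enabled": False}

# Hypothetical in-memory data standing in for the database.
CLOUDS = {123: {"title": "Example cloud"}}


def get_cloud(cloud_id, config=CONFIG):
    """Return (HTTP status, body) for a request for a single cloud."""
    if not config["api_enabled"]:
        # The code is deployed but the feature is not yet switched on.
        return 404, {"error": "Not found"}
    if not isinstance(cloud_id, int):
        return 400, {"error": "cloud_id must be an integer"}
    cloud = CLOUDS.get(cloud_id)
    if cloud is None:
        # Report the failure in the status code, not a 200 with an error body.
        return 404, {"error": "No such cloud"}
    return 200, cloud


print(get_cloud(123))                         # flag off
print(get_cloud(123, {"api_enabled": True}))  # flag on
print(get_cloud(999, {"api_enabled": True}))  # unknown cloud
```

Because the handler returns a plain status and body, a cURL-style test script of the kind described in lesson 2 only needs to compare those two values for each call.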

Next steps

We have a simple XML response format to match the existing JSON response in the pipeline. And we’re working on some Javascript widgets for your blog or web-site, for example, to display the last 5 items from your cloudstream or the clouds associated with a tag. We are taking our cue from Delicious and Twitter which make this really simple for regular users. The idea will be to give every authenticated user a “Get Javascript embed code” button.

Looking further ahead, we’d like to tackle adding clouds and perhaps comments to the site through the API (this is an API after all, not a set of feeds ;). This inevitably means tackling authentication, possibly using OAuth. And we need to deal with paging of large responses. Looking at the Guardian’s API explorer, we think that this would be a really useful way for developers to dip their toes in. So we’d like to do something similar. And performance and caching are on our radar, for the Cloudworks site as a whole and the API specifically. However, the site coped well with the recent OU conference, so this is less of a concern than it was. We will be converting the static PDF API document to Wiki pages when the opportunity arises.

Thank you to SocialLearn, who funded the initial API development, and to the Cloudworks lead developer, Juliette, who worked closely with me. Finally, I must say that we’d love you to use our API! Please look at the document, create an account on Cloudworks and email us for an API key.


Internationalizing Cloudworks

[Originally posted by Nick Freear on the Cloudworks blog]

I’ve been keeping a low profile since Juliette blogged that I had joined the Cloudworks team, back in January. However, as we will shortly be launching a new feature – namely, the Cloudworks site translated into Greek, it seems like a good time to share our experiences. I’ll keep technical detail to a minimum.

First, it should be pointed out that it is the user-interface, for example the main navigation links and form labels, that has been localized (translated), not the dynamic content, Clouds, Cloudscapes and so on. Greek was chosen as the first language as we received European funding for this purpose, but translating into other languages will require less effort (volunteers are welcome to email cloudworks@open.ac.uk).

Screen shot of Cloudworks - Greek preview, on Flickr

Cloudworks is built on top of the CodeIgniter software framework, which has support for internationalization. However, CodeIgniter uses a bespoke method which, though efficient, has drawbacks: for translators, a lack of tools, and for developers, having to edit two files simultaneously when internationalizing a source text. Both these issues were a concern given the size of Cloudworks (700 texts extracted!) and the limited time available. (Tech: by default CodeIgniter stores language packs in arrays, a similar method to that used by the Moodle open-source e-learning software.)

After further research, I settled on GNU Gettext, a set of free/open-source software libraries and file formats, which is used by many Linux distributions, and by web projects including WordPress and Drupal (with subtle variations). In the Gettext method, each text string in the software is manually wrapped by the developer in a function call (a software notation or syntax). A software tool is used to automatically extract each chunk of text to a file (with a .po or .pot file extension).

The translator uses a specialised graphical editor to translate the text. Files can be merged and joined. The resulting file is converted to a binary format (.mo extension), which is deployed to the server. Gettext handles plurals (languages use suffixes differently for zero, one, two…), and to some extent dates.

I developed a system of placeholders for dynamic parts of sentences and phrases – for example the titles of Clouds, names of contributors, dates and so on. This borrowed heavily from Drupal with some variations to help maintain content flow, particularly in the about pages (the only content to be translated). And, the language will be chosen based on your browser software’s configuration, with the option to override this by selecting the language from a menu.
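Both ideas in this paragraph can be sketched briefly: substituting dynamic values into a translated sentence, and picking the language from the browser’s Accept-Language header unless the user has overridden it from the menu. This is Python for illustration; the “!name” token style, the function names and the default language are assumptions, not the actual Cloudworks placeholder syntax.

```python
import re


def fill_placeholders(template, values):
    """Replace !name tokens in a translated string with dynamic values."""
    # The "!name" token style is an assumption made for this sketch.
    def lookup(match):
        name = match.group(1)
        # Unknown tokens are left untouched rather than raising an error.
        return str(values[name]) if name in values else match.group(0)

    return re.sub(r"!(\w+)", lookup, template)


def choose_language(accept_language, supported, override=None, default="en"):
    """Pick a language from a menu override, else the Accept-Language header."""
    if override in supported:
        return override
    ranked = []
    for part in accept_language.split(","):
        piece = part.strip()
        if not piece:
            continue
        tag, _, params = piece.partition(";")
        quality = 1.0
        if params.strip().startswith("q="):
            try:
                quality = float(params.strip()[2:])
            except ValueError:
                quality = 0.0
        ranked.append((quality, tag.strip().lower()))
    # Highest quality first; the sort is stable, so header order breaks ties.
    for quality, tag in sorted(ranked, key=lambda pair: pair[0], reverse=True):
        primary = tag.split("-")[0]
        if quality > 0 and primary in supported:
            return primary
    return default


print(fill_placeholders("!title was created by !user",
                        {"title": "OULDI", "user": "Nick"}))
print(choose_language("el-GR,el;q=0.9,en;q=0.8", {"en", "el"}))
print(choose_language("el-GR,el;q=0.9,en;q=0.8", {"en", "el"}, override="en"))
```

Named placeholders let a translator reorder the dynamic parts of a sentence freely, which positional formats make awkward.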

What lessons have we learnt?

  • Using built-in date/time functionality (Tech: strftime PHP/C function) is not trivial, due to encoding issues on Windows servers. This is one reason why web projects like Drupal handle dates and times themselves.
  • Preparing the template files for translation, writing notes/instructions for translators, and integrating the text from translators takes more time than you might think.
  • Character encoding can be an issue – for historical reasons Cloudworks uses Latin1 (ISO-8859-1), and we still need to convert content with quite a lot of accents to Unicode. (Note, this may require some down time – we’ll keep you posted.)
  • The only drawbacks to Gettext that I have found in the context of a web site/application, are the need to create a binary file, and the difficulty of extending a language using multiple files. Drupal deals with these issues by handling the localization files itself, bypassing the default system functionality.
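The character-encoding point can be illustrated with a small sketch of re-encoding Latin1 content as UTF-8. This is Python for illustration and says nothing about how the actual Cloudworks database conversion will be carried out; the function name and sample text are made up.

```python
def latin1_to_utf8(raw):
    """Re-encode bytes stored as ISO-8859-1 into UTF-8."""
    # Every byte value is valid Latin1, so this decode cannot fail; the risk
    # is silently mangling data that was in fact already UTF-8.
    return raw.decode("iso-8859-1").encode("utf-8")


stored = "Café résumé".encode("iso-8859-1")
converted = latin1_to_utf8(stored)

# Accented characters take one byte in Latin1 but two in UTF-8.
print(len(stored), len(converted))
print(converted.decode("utf-8"))
```

Latin1 has no Greek letters at all, so a conversion of this kind is a precondition for storing Greek content alongside the existing accented text.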

Finally, my thanks to Martha Vasiliadou from Innovade LI Ltd. in Cyprus, who is doing a great job of translating the site to Greek! And we’ll keep you posted as we release the new language functionality.
