Cloud APIs

I was asked a couple of days ago on Twitter whether I have any thoughts on cloud APIs. I do, and rather too many to put into 140 characters. Probably too many for a blog post, but I’ll try to summarise how I think APIs can affect a Service Provider’s choice of cloud platform.

What is the point of API support?

An API is the licensee’s route to interoperability and extensibility. We are big fans of open systems, and interoperability and extensibility in particular, so we are in general big fans of APIs. And that means both our own, and interoperating with other API designs where it’s sensible to do so. The rest of this article should be read in that context.

Which API?

Most attention focuses on the ‘customer API’, by which I mean the API that end users use to manipulate their virtual resources. But this isn’t the only API that matters. As a service provider, it is just as important that you, and your resellers, can programmatically control your own environment. Here are the main types of APIs we have (other cloud infrastructure software should be similar):

  • Customer facing API – allowing end users to orchestrate their own virtual resources.
  • Administrative API – allowing you and your resellers suitably segregated access to customer and billing information, creation and deletion of customers, etc.
  • Cluster Control API – allowing system administrators to manipulate both physical and virtual resources (for instance taking nodes on and off line).

For completeness, we actually have two APIs in the third category, and a number of less important internal APIs. We publish many of our APIs here.

We talk to some licensees who do not plan initially to give their end users access to a customer facing UI at all, let alone an API. But they do plan to automate internally. Some cloud purists will claim this ‘isn’t cloud’. Well, we can have that argument elsewhere, but clearly they are using cloud infrastructure software, and for them the second and third categories of API are more important than the first. Similarly, the fact that service providers actually need to make some money from cloud occasionally gets forgotten, and people making that mistake underplay the importance of the administrative API above.

That said, for the rest of this article, I am going to talk about the customer facing API.

Who uses APIs?

Cloud evangelists sometimes see the world through their cloudy glasses, and assume that every cloud user is a paid-up convert to the cloud religion. This is not the case. Not every cloud user uses cloud techniques in every aspect of their usage of the cloud. In particular, not every end user exclusively runs massively scalable self-sizing autonomic applications written for the cloud. Our observations are that if you give your end users a decent web interface, very few users (well under 10%) will actually use your API at all. That’s not because the API does not have full coverage; indeed our UI uses our customer facing API to perform all virtual resource manipulation, and there are a few things you can do through the API but not the UI (but not vice versa), so it can’t be that. It’s because most users currently orchestrate the resources themselves manually. Point and click is just easier than APIs. The figures are a little different if you measure by percentage of actions taken, as large customers and customers with volatile usage patterns are more likely to have machine-driven automated orchestration. But these are comparatively rare in number (though the larger ones may be a significant proportion of revenue), and even with these, we tend to see a mixture of API and UI use. If you provide a poor or incomplete UI, no doubt more customers will use your API instead. If a cloud service provider tells me most customers use his API rather than his UI, my first thought will be that he must have a dreadful UI, not that he must have a fantastically interesting customer base.
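To make the difference between the two ways of measuring concrete, here is a small sketch. The customer names and counts are invented purely for illustration (they are not our figures): one large customer automates through the API while nineteen small ones click through the UI.

```python
def api_share(actions):
    """Return (share of customers who used the API, share of actions made via the API).

    `actions` is a list of (customer, channel) pairs, where channel is "api" or "ui".
    """
    customers = {customer for customer, _ in actions}
    api_customers = {customer for customer, channel in actions if channel == "api"}
    api_actions = sum(1 for _, channel in actions if channel == "api")
    return len(api_customers) / len(customers), api_actions / len(actions)


# Hypothetical log: 19 small customers make 10 UI actions each;
# one large customer makes 200 automated API calls and 10 UI actions.
log = [(f"small-{i}", "ui") for i in range(19) for _ in range(10)]
log += [("large", "api")] * 200 + [("large", "ui")] * 10

by_customer, by_action = api_share(log)
print(f"{by_customer:.0%} of customers use the API")     # 1 of 20 customers
print(f"{by_action:.0%} of actions go through the API")  # 200 of 400 actions
```

With these made-up numbers, only one customer in twenty touches the API, yet half of all actions arrive through it, and even that customer mixes API and UI use.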

Conversely, we’ve seen some pretty unexpected API use, including an end-user who surprised us with a completely new UI he’d written himself.

My personal take is that the importance of APIs today is sometimes overstated. However, as more end users migrate towards more cloud-like applications, API usage will become more important.

API standards, eco-systems and differentiation

The reason for providing a customer facing API is to allow end users to orchestrate their virtual resources. However, most customers do not want to write the software talking to your API from scratch. They would thus prefer that it follow a standard. The commonly held view is that in terms of deployment prevalence, there is only one API that matters (Amazon’s EC-2 API), so everyone should support that. That’s a huge oversimplification in my view. Here are some things to think about if you are a service provider looking to offer cloud services:

  • The industry does not yet have any standard with any deployment traction for description of virtual resources, let alone standards for those virtual resources themselves. The EC-2 API therefore unsurprisingly allows end users only to provision and deprovision resources that are (essentially) the Amazon EC-2 product set or copies thereof, and only in the manners permitted by Amazon. Therefore, if the EC-2 API is your only API, you cannot differentiate your product from Amazon’s. In fact, you can only offer a subset of their product range. And every API change will mean you need to run to keep up. Being able to offer an undifferentiated subset of the market leader’s product range without their economies of scale seems to me a commercially unattractive proposition.
  • The world would be a better place if there were an open, extensible standard for cloud provisioning. There are some in development. None has any significant deployment traction. OpenStack is probably the most interesting, but it’s not yet stable.
  • Supporting other APIs (or rather a subset of other APIs sufficient to allow common tools to work) is, however, a useful goal. Lots of EC-2 toolsets make a limited number of calls (“is this server still running?”, “start this”, “stop that”) and an EC-2 compatibility layer would allow these to work on other platforms. We are looking at adding an EC-2 compatibility layer, and may well do something with OpenStack too. However, we would not recommend our customers tell their end users to use this as their main interface, as necessarily it will only give access to a subset of the features of their platform, i.e. it will offer limited richness. It will, just like everyone else’s EC-2 compatibility interfaces, only offer a subset of the functionality of the EC-2 API, snapshotted at one particular point in time. However, this does provide access to existing toolset eco-systems.
  • Another way to get access to eco-systems is to have an easy to use API that allows use of the platform API to be integrated into other products, particularly open source ones. As we license our product, rather than run a single instance of it, this increases the incentive for this to happen. For a recent example, see our integration with Standing Cloud.
  • An edit: After I first published this, it was pointed out to me that I hadn’t mentioned yet another way of gaining access: using an abstraction layer, like libcloud/cloudkick, jclouds, or Deltacloud. The idea here is that the service provider (or software vendor in the case of those licensing software) provides code for an abstraction layer which offers a cloud-backend-API independent API to tool chains and orchestration systems. However, any service provider or software vendor faces two main problems here. First, the number of potentially useful looking abstraction layers (more than one) exceeds the number of potentially useful looking APIs to emulate (one at present); even by implementing support in all of these, you get access only to a fraction of the universe of cloud API users, because most don’t use these libraries. If one (or even two) of these abstraction layers wins out as the common standard, that will be great, but I don’t yet know which one if any will (and neither does anyone else), so the effort to implement each has to be really very low indeed to make it worth spending time on this rather than simply emulating EC-2 (which means we get some degree of support for free in the abstraction layers anyway). Second, there is a risk that you end up only presenting a ‘lowest common denominator’ of functionality to the end user (i.e. it exacerbates the ‘subset of functionality’ problem above). There’s also the more philosophical question: is a competing ecosystem of cloud abstraction layers on top of a competing ecosystem of cloud APIs really better than a single, widely accepted, open, extensible industry-standard protocol? My feeling is it complicates things hugely. I would really like to be proved wrong on this one; this is an area we have our eyes on too.
  • The API is not the only technical barrier to cloud ecosystems. For images to be compatible, you need the metadata service to be compatible (we emulate most parts of EC-2’s metadata service as well as providing our own), you need the disk storage models to be compatible (ours are a little different from Amazon’s, more like a standard server, but many images can be made to work), and you need your image to be constructed in a compatible manner. On the last point, EC-2 uses the Xen hypervisor in PV mode, with kernels provided outside the image. We use Xen in HVM mode, KVM, or VMware, and expect kernels to be provided within the image. It’s perfectly possible to provide an image that works with each of these (Ubuntu’s UEC images are a case in point), but some image providers don’t. In some cases, the image provider has had matters made more difficult by factors outside her control and largely outside the control of the Service Provider or IaaS software vendor. For the technically minded, look, for instance, at the oscillating drivers for and device names of paravirtualised disks under Xen in Linux; it is, I believe, impossible to provide a single image that provides paravirtualised disks reliably on every Xen version and presents the same disk names (and that’s just one hypervisor). I don’t want to over-emphasise the difficulty here, because it is possible (see our integration with Standing Cloud, or, to pick another example, RightScale and Rackspace); it’s just not as easy as it might be. To my mind, image compatibility is possibly more important for ecosystems than API compatibility at this stage. Awareness by the Linux distribution vendors of different cloud stacks is improving this situation.
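As a rough illustration of the compatibility-layer point above, here is a minimal sketch of a layer that maps a handful of EC-2 action names onto a platform’s own API and refuses everything else. StartInstances, StopInstances, DescribeInstances and RequestSpotInstances are real EC-2 action names; everything else (the NativeCloud class, its methods, the dictionary returned in place of EC-2’s XML responses) is hypothetical and is not our actual API or implementation.

```python
class UnsupportedAction(Exception):
    """Raised for an EC-2 action that the compatibility layer does not emulate."""


class NativeCloud:
    """Hypothetical stand-in for a platform's own, richer API."""

    def __init__(self):
        self._state = {}  # server id -> "running" or "stopped"

    def create_server(self, server_id):
        self._state[server_id] = "stopped"

    def power_on(self, server_id):
        self._state[server_id] = "running"

    def power_off(self, server_id):
        self._state[server_id] = "stopped"

    def status(self, server_id):
        return self._state[server_id]


class Ec2CompatLayer:
    """Translates a small subset of EC-2 actions into native calls."""

    def __init__(self, backend):
        self._backend = backend
        # Only the calls that common toolsets make are mapped.
        self._handlers = {
            "StartInstances": backend.power_on,
            "StopInstances": backend.power_off,
            "DescribeInstances": lambda server_id: None,  # read-only, no change
        }

    def handle(self, action, instance_ids):
        if action not in self._handlers:
            raise UnsupportedAction(action)
        for instance_id in instance_ids:
            self._handlers[action](instance_id)
        # Simplified response: current state per instance (real EC-2 returns XML).
        return {i: self._backend.status(i) for i in instance_ids}


cloud = NativeCloud()
cloud.create_server("i-0001")
layer = Ec2CompatLayer(cloud)

print(layer.handle("StartInstances", ["i-0001"]))     # {'i-0001': 'running'}
print(layer.handle("DescribeInstances", ["i-0001"]))  # {'i-0001': 'running'}
try:
    layer.handle("RequestSpotInstances", [])
except UnsupportedAction as exc:
    print("not emulated:", exc)
```

The explicit table of handlers is the point of the sketch: tools that only ask “is this server still running?”, “start this” or “stop that” work unchanged, while anything outside the table fails loudly, and anything the native API offers beyond the table is simply invisible through this interface.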

Conclusion

API interoperability is hugely important. But my own view is that cloud technology has not yet reached a level of maturity where we should as an industry be nailing our standard to the mast of any particular cloud API. We do not yet have industry-wide standards for the resources themselves, so it seems to me that APIs will continue to be in a state of flux. However, give it a couple of years, and things may be different. I hope by then as an industry we will have a sensible, neutral, open, extensible standard. In the meantime, we’d love to hear from people who are interested in working with our APIs. And we’ll be adding support for a useful subset of the EC-2 API (the de facto standard), whilst keeping our eye on other opportunities, particularly OpenStack.