Mattias Holmqvist interviews Adrian Cole, the creator of jclouds, to learn more about how jclouds helps organizations move their applications into the cloud.
After being involved with the open source project jclouds for almost two years, Mattias wanted to learn more about the history of the project and how it helps organizations move their applications into the cloud. He also often gets questions about jclouds, and to answer these better he called up Adrian Cole, the creator of jclouds, for an interview.
Mattias:
What is jclouds and why did you start it?
Adrian:
jclouds is an open source library that aims to end cloud vendor lock-in by providing portable functionality for common services such as compute and storage. jclouds was first a plugin to JBoss Infinispan, to address non-blocking access to Amazon S3. Starting with Infinispan came from a desire to stop my cycle of closed source coding, and instead live the dream of merit over job title. jclouds happened because the S3 plugin got too big to be a plugin anymore 🙂
Mattias:
The project now has a massive community and a vibrant mailing list and IRC channel. How did you succeed in getting the dev community involved in this way?
Adrian:
It was not easy starting the jclouds community, and it would have been impossible without Manik Surtani, who mentored me through the early days. He taught me a lot, including soft things like the merit of doing meetups, how to nag nicely, etc. Tenacity is probably the only way the group, myself included, got through the first several months. It’s all been open since day 1, so you’ll see a lot of email with no community answers, then one, then two… You have to have faith, but you also have to remain relevant: seek out the community and help them scratch their issues.
Mattias:
What technologies have been key to successfully driving jclouds forward?
Adrian:
The first joy (and pain) of jclouds technology has been Google Guice. It kept the dream of a small library alive, adding dependency injection with very little overhead and high testability. Guice attracted interest, and also helped us cope with difficult environments such as Android (which is still iffy) and Google App Engine. I admit not everyone groks it, and those that do need to spend a little time to get there. The first year was difficult as we were stuck on a pre-release version of Guice 2.0, which sucked, all the more so since other libraries were clashing with us. Guice 3 has been great, especially to those willing to understand it 🙂
Google Guava (which is the second important library we use) has been a consistent joy. Since we also have Clojure users, using a functional style in Java makes adapting much easier. The Guava team is also just a plain awesome group to work with, even if it isn’t easy for non-Googlers to get changes in. Using Guava correctly leads to very testable code.
Gson is another key component. So many APIs are JSON, and Gson has a fantastically clean and understandable way to do parsing. Jesse’s been flat out awesome here, too.
Each of the three Google libraries above is reason enough to use on its own, but another reason is the community of like-minded folk. For example, Tim Peierls often comes up with interesting ideas like a distributed BlobStore cache, and Matt Stephenson could quickly figure out how we do things, as they use a similar approach in galaxy-java. Those using the G’libraries tend to have a faster track to helping make jclouds better, and the same goes for us helping them!
Mattias:
There are two major parts of jclouds: BlobStore and ComputeService. Could you explain in simple terms what these are and which implementations of these abstractions are available today?
Adrian:
BlobStore is a means to bind Key/Value structures like java.util.Map to remote storage clouds like Amazon S3 or OpenStack Swift. It is not a replacement for JPA, but could be set behind a cache or Map view to deal with data that doesn’t need the structure of object mapping. Some use BlobStore as their only data source, and do all relational logic in the application. This is pretty powerful, as you could take a war file and move it anywhere, literally.
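To make the key/value idea concrete, here is a minimal sketch in plain Java. It is not the jclouds API: `KeyValueStore`, `InMemoryStore` and `DocumentRepository` are hypothetical names invented for illustration, and the in-memory class merely stands in for a remote service such as S3 or Swift. The point it tries to show is that application code written against a small key/value contract does not care which store sits behind it.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative only: every name here is hypothetical, not part of jclouds.
class BlobStoreSketch {

    /** Minimal key/value contract that a storage cloud could sit behind. */
    interface KeyValueStore {
        void put(String container, String key, byte[] payload);
        Optional<byte[]> get(String container, String key);
        boolean remove(String container, String key);
    }

    /** In-memory stand-in for a remote storage service. */
    static final class InMemoryStore implements KeyValueStore {
        private final Map<String, Map<String, byte[]>> containers = new ConcurrentHashMap<>();

        @Override
        public void put(String container, String key, byte[] payload) {
            // Copy the bytes so later changes by the caller do not leak into the store.
            containers.computeIfAbsent(container, c -> new ConcurrentHashMap<>())
                      .put(key, payload.clone());
        }

        @Override
        public Optional<byte[]> get(String container, String key) {
            Map<String, byte[]> blobs = containers.get(container);
            if (blobs == null) {
                return Optional.empty();
            }
            byte[] payload = blobs.get(key);
            return payload == null ? Optional.empty() : Optional.of(payload.clone());
        }

        @Override
        public boolean remove(String container, String key) {
            Map<String, byte[]> blobs = containers.get(container);
            return blobs != null && blobs.remove(key) != null;
        }
    }

    /** Application code depends only on the contract, so the backing store can be swapped. */
    static final class DocumentRepository {
        private final KeyValueStore store;
        private final String container;

        DocumentRepository(KeyValueStore store, String container) {
            this.store = store;
            this.container = container;
        }

        void save(String id, String text) {
            store.put(container, id, text.getBytes(StandardCharsets.UTF_8));
        }

        Optional<String> load(String id) {
            return store.get(container, id)
                        .map(bytes -> new String(bytes, StandardCharsets.UTF_8));
        }
    }

    public static void main(String[] args) {
        DocumentRepository repo = new DocumentRepository(new InMemoryStore(), "docs");
        repo.save("readme", "hello cloud");
        System.out.println(repo.load("readme").orElse("<missing>"));
    }
}
```

Swapping `InMemoryStore` for an implementation backed by a real storage cloud would leave `DocumentRepository` untouched, which is the kind of portability described above.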
ComputeService is a way to help manage provisioning of virtual machines. It allows you to boot a group of virtual machines and pass a shell script to execute, for example, a Tomcat installation. As a substrate, ComputeService is very effective, and rides underneath some really cool application management tools such as Pallet, or is embedded in dynamic scaling systems like what EnterpriseDB does for PostgreSQL.
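The "boot a group and hand it a script" pattern can be sketched as follows. Again, this is plain Java with hypothetical names (`Provisioner`, `FakeProvisioner`, `Node`), not the jclouds ComputeService API, and the fake provisioner only records what a real provider would be asked to do; nothing is actually booted and the script is a placeholder.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Illustrative only: every name here is hypothetical, not part of jclouds.
class ComputeSketch {

    /** What a caller learns about a machine after it has been requested. */
    static final class Node {
        final String id;
        final String group;
        final String bootScript;

        Node(String id, String group, String bootScript) {
            this.id = id;
            this.group = group;
            this.bootScript = bootScript;
        }
    }

    /** Provider-neutral contract: start `count` machines in a named group, each running the script. */
    interface Provisioner {
        List<Node> bootGroup(String group, int count, String bootScript);
    }

    /** Stand-in for a cloud provider; it only records the request. */
    static final class FakeProvisioner implements Provisioner {
        private int nextId = 1;

        @Override
        public List<Node> bootGroup(String group, int count, String bootScript) {
            if (count < 1) {
                throw new IllegalArgumentException("count must be at least 1");
            }
            List<Node> nodes = new ArrayList<>();
            for (int i = 0; i < count; i++) {
                nodes.add(new Node(group + "-" + nextId++, group, bootScript));
            }
            return Collections.unmodifiableList(nodes);
        }
    }

    public static void main(String[] args) {
        // Placeholder script; a real one would install and start the application server.
        String script = "#!/bin/sh\necho 'install Tomcat here'\n";
        Provisioner provisioner = new FakeProvisioner();
        for (Node node : provisioner.bootGroup("web", 3, script)) {
            System.out.println(node.id + " in group " + node.group);
        }
    }
}
```

Tools that sit on top of such a substrate only need the `Provisioner` contract, so the same orchestration logic could target different providers by supplying a different implementation.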
Mattias:
Do you see any new abstractions making their way into jclouds in the future? (Hypervisor? CDN? Others?)
Adrian:
Well, we have a nearly complete abstraction for Load Balancing, and probably will have things like DNS available soon. VirtualBox is pretty interesting, thanks to you and others. Lightweight management of VMs on people’s laptops is a fairly common ask. Besides new abstractions, we have extensions on the way for existing ones, e.g. image manipulation for ComputeService.
Mattias:
Now we have been talking about jclouds in particular for a while. Let’s dive into the more general area of cloud development. Since you have a lot of experience talking to companies that are actively working in the cloud (both cloud providers and application development organizations) it will be really interesting to hear your side of the story.
When you help companies with jclouds/cloud development, what is the most common issue that they run into? What do you see as the major stumbling blocks for businesses that move into the cloud?
Adrian:
If you are creating a service or a product that can be run as a service, how to develop the API is a very common concern. What form should it take (e.g. REST vs Amazon-like Query API)? How do you test it? How do you keep it clean? Those adapting to clouds typically find that even in jclouds there’s a leak in the abstraction. For example, metadata about things like maintenance windows, network congestion and pricing is still not fully explored in APIs, so businesses have interesting challenges trying to take on multi-cloud.
Companies looking to use cloud, as opposed to building towards it, struggle. There is a gap to fill, as few cloud tools desire to, or happen to, support old school stuff. This can lead to some decisions about whether to brute force, partner, etc. In other words, not everything can realistically be found in a box, so multi-vendor, professional services approaches are often needed.
Mattias:
Which kinds of businesses/apps would gain most from a transition to the cloud? Is there a danger in moving/developing some kinds of applications for cloud?
Adrian:
Depends on what cloud means. I think all businesses can benefit from SaaS (Software as a Service). Not all have to deal with infrastructure directly, as that burden can be placed under the SaaS or PaaS (Platform as a Service) layers. The biggest danger folks often express is lock-in, and I sometimes hear fear about the lack of empirical data on cloud’s ROI. In any new system, the earliest ones take the hardest punches. It is no longer the early days of cloud, so I think there are fewer fears overall. Apps that are easy to reproduce, from companies who do not desire to run them, are the best candidates for cloud.
Mattias:
Writing apps for the cloud is a bit different from regular application development. What do you think are the most important things developers should think about and learn when starting to write apps for the cloud?
Adrian:
Embrace change. Throw away assumptions. It isn’t helpful to bring expectations of six-month stable release cycles to cloud. In cloud, your dependency isn’t just a library; it is a service. Keep this in mind, and remember you have to change alongside your app. If you’ve not gotten on the agile, test or even business driven bandwagon, now is a good time.
Mattias:
Cloud services come in a number of different flavors and at different levels of abstraction. PaaS is a higher level of abstraction where you have access to a number of services (such as a database) that are available in the cloud. IaaS (Infrastructure as a Service) is a lower level of abstraction where you have access to virtual (or physical) hardware. What are the major drawbacks and advantages of using PaaS or IaaS today? Are there certain situations where one is preferable to the other? What services do you think are currently missing from the cloud providers?
Adrian:
IaaS is getting more precise, and then more so. For example, the IaaS of today offers a lot more control over networky things than ever before. It looks more and more like a virtual model of a datacenter. Use IaaS when your business requires control at that level, or the products you use don’t yet integrate this for you.
PaaS is more broad, and increasingly so, as we start to see Integration services (e.g. Mule, Camel), more messaging, etc. Use PaaS when you have a good sense of the architecture your applications require, but aren’t necessarily interested in how that works at a machine or network level.
Nearly all IaaS or PaaS services could get better at QoS (Quality of Service) and things like chargeback. I’d like to see cleaner integrations of configuration management with IaaS, currently something only Abiquo do.
Mattias:
Where do you see the cloud one year from now? What trends will get significant traction?
Adrian:
We will see more distributed configuration systems like CloudFormation, CloudBand, and Nimbula. With OSS CloudFoundry and OpenShift leading the charge, I’d expect the PaaS ecosystem to really take off, with a lot of traction in enterprises in the next year.
Mattias: Thank you very much!
Adrian: Sure!