Java/J2EE

July 13, 2009

Platform as a Service: The Next Generation Application Server?

Platform as a Service is a term that many people find confusing. It is normally associated with Google App Engine from Google and Force.com from Salesforce.com, the main references for this model. From a technical point of view, it aims to provide the same type of value that many application servers provide today, i.e. a generic container that can host different applications and shield them from the details of the underlying operating system, network and database implementation. Unlike most existing application servers, it was designed for massive scaling from day one. Another big difference is in the way it is consumed. With PaaS you don’t need to install any software or jump through all the hoops of setting up a cluster environment. PaaS is provided as a hosted service that is pre-configured and installed. You get a production-ready environment right at the start.

Key Characteristics of a Cloud/SaaS Enabled Application Platform

Last week I came across a presentation by David Mitchell Smith of Gartner. David provided a good definition of the main PaaS characteristics. A snippet from his presentation covering this area is given below:


image

What’s interesting is the great emphasis on multitenancy support. The fact that the platform is going to be shared between multiple applications, and potentially even different customers, requires various levels of tenancy support and isolation. Even though we’re taking advantage of the fact that we can share resources between applications, each application needs to be able to use the platform as if it were running on its own dedicated resources. Another interesting point is the need for XTP support. Many would view XTP as a niche that is normally referenced in the high end of the market, so it is fair to ask why XTP would fit into a general PaaS solution. XTP represents a model for supporting enterprise transaction processing applications in an extremely scalable environment. In our case, scalability is not necessarily driven by the demand of a particular application but by the fact that many applications are going to run on a shared environment. A PaaS targeted at enterprise applications would need to provide this level of scalable transaction processing as a core service.

Can you run your existing business applications with GAE or Force.com?

No. Unfortunately, as with anything in life, reality can really spoil the party.

Force.com was designed to make it simple to run business applications that are database centric, e.g. CRM, reporting etc. It provides a rich set of high level services that make building such applications extremely simple. Google provides a more generic application platform and recently announced support for Java, which is a big step towards reducing vendor lock-in concerns. Google seems to be geared for consumer based applications. Force.com offers a more high level platform that is based on its proprietary services. This means that in order to take advantage of their service you will need to go through a complete re-write.

Force.com is based on a database centric architecture. They also seem to be limited in scalability as they partition their database per application. This means that if your application needs to scale more than what a single database can provide, you can find yourself pretty much locked.

Google's recent support for Java makes their offering closer to standard JEE application servers, however their current support imposes a lot of limitations due to their sandbox model. These limitations mean that at the end of the day GAE is applicable only to a small set of relatively simple applications. Since there is no guarantee or control over the resources that you are going to receive from their underlying infrastructure, it is likely that application performance will be unpredictable and will be affected by other applications that are sharing the same hardware.

The fact that the platform as a service shields you from the details of the underlying infrastructure is what makes it simple and that is both the advantage and limitation. You can’t control the environment, you can’t choose your operating system, you can’t install your own set of services and you can’t control the performance characteristics of this platform.

This creates a huge adoption barrier for most current enterprises.

IaaS vs PaaS

An Infrastructure as a Service provider offers a hosted service model with plain machine level access. This model is also known as Server as a Service. The fact that you get access to the bare metal allows you to run almost any application on this hosted environment. Unlike PaaS, IaaS gives you extreme flexibility. You can choose your own operating system, install any package that you want, and set up your own firewall, security etc. Amazon is known to be the leader in this space and also provides a set of services on top of their infrastructure such as SimpleDB, SQS and MapReduce. Having said all that, this flexibility comes at the cost of complexity. In many cases you will need to install your own software, configure it, tune it etc. before you can make it run effectively on the cloud. Many application developers don’t have the skill-set to do that. This exposes many operational challenges, as many organizations are not geared to support this type of environment from their own IT. The high level services provided by Amazon are still proprietary and would require a complete re-write if you plan to use such services.

PaaS for Enterprise applications – doing it right

The ideal solution would be to combine the best of the two worlds i.e. the flexibility of IaaS and simplicity of PaaS, and here is how:

  • Build a Generic PaaS on top of AWS – To build a PaaS we don’t need to re-invent the wheel. Unlike Google and Force.com we don’t need to own the infrastructure, we can actually use Amazon infrastructure or even better build our PaaS such that it can be portable between Amazon IaaS and a VMware IaaS. By doing so, the PaaS can provide us the ability to deploy applications in a simple way just as in GAE but would still enable us to control the environment, install our own software and get the full flexibility that IaaS provides.
  • JEE as first class citizen – Many of the existing enterprise applications are built in JEE. Making JEE a first class citizen within our PaaS environment will enable you to leverage the existing skill-sets within those organizations. JEE is also a standard model that has more than one implementation out there, which reduces the lock-in factor significantly.
  • Pre-configured for extreme scalability – All the services provided through the PaaS will need to be implemented and pre-configured for extreme scaling and come with a production ready setup to enable dynamic scaling and fault-tolerance.

Next generation application Server?

Yes. In my opinion, PaaS represents the next wave in middleware technology. One that is targeted at virtualized enterprises, one that was designed for scale-out from the get go, one that fits the new way of delivering SaaS applications and one that can be extremely simple to use. The current PaaS players, i.e. Google and Salesforce and to a lesser degree Microsoft, represent one type of player: those that own it all, i.e. the infrastructure and the platform. There is already a new emerging category of players in the PaaS market, Application PaaS players. Application PaaS players would specialize in delivering only the platform and not the hardware and network infrastructure. They act as the bridge that will enable portability between different infrastructure providers, including the internal IT (a.k.a. private cloud). The application PaaS players will also be segmented in similar ways to the way application servers are segmented today, i.e. there would be the ones targeting the low-end consumer market in similar ways to Google App Engine, the ones aimed at the high end of the market and those specializing in certain languages or development frameworks, i.e. Ruby, Java, .Net etc. In the PaaS type of world, those who are able to provide a holistic solution that works smoothly across all the application tiers would have an advantage over those who provide point solutions.

What about my existing applications?

Most people would categorize PaaS as a platform that is delivered through the internet. That statement would make the idea behind PaaS irrelevant for a large part of the existing enterprise applications, as a majority of enterprises are not ready to run their applications in a hosted service over the internet. Let’s examine that statement:

Many existing IT organizations run farms of application servers in-house. Those application servers are running in an internal data center that is not that different from any other hosted service, only that it is a specialized hosted service tailored to the needs of the specific organization. I would therefore argue that organizations that are already running such application server farms would find it easier to evolve those farms to a PaaS model than to change their entire IT infrastructure into an internal cloud. The reason is that those applications were already written to run in an application container model, therefore a large part of the transition work can be done within the container implementation and outside the application code. Targeting them first would therefore potentially be an easier transition than trying to transform all your other applications into a virtualized environment.

I know that there are many people out there who would argue that internal PaaS is not a PaaS because it doesn’t meet the exact definition of PaaS. However, in my view the similarities exceed the differences, and the value to the enterprise would be almost identical to the value it would receive from any of the internet based PaaS platforms.

Will I lose control?

PaaS will provide the internal IT with much better control over the applications that are running in their environment. You can have high visibility into the ways the applications consume resources. In the same way, you can control fairly tightly the application security, scalability, resource management and fault-tolerance, which can finally be managed in a consistent way across all the applications.

Final words

Platform as a Service represents the next generation application server IMO. GAE and Force.com are the best known references in the market for that model. Both try to offer the complete stack and that’s both their advantage and limitation. There already is a new category of Application PaaS providers that specialize in providing PaaS on top of existing infrastructure providers. There would be generic PaaS providers (similar to GAE) geared for the low end of the market and those that are geared for the high end of the market. There would also be more vertical PaaS providers (similar to Force.com), i.e. those that will provide PaaS for a certain segment of applications or segment of the market such as Online Gaming PaaS, Telco PaaS etc. A good example of such a service is Twilio. As I outlined in my recent post (Google App Engine plus Amazon AWS: Best of both worlds), this is not just a theory but a reality in the making. GigaSpaces' contribution to this new reality is our new cloud framework on Amazon EC2. Geva Perry provides an excellent overview of other good examples that follow that same line in his post What's Really Exciting About Cloud Computing.


June 25, 2009

Google App Engine plus Amazon AWS: Best of both worlds

George Lawton wrote a good summary of my JavaOne talk in his article titled Google App Engine plus Amazon AWS: Best of both worlds:

Google App Engine (GAE) is focused on making development easy, but limits your options. Amazon Web Services is focused on making development flexible, but complicates the development process. Real enterprise applications require both of these paradigms to achieve success… What we really want is the flexibility and performance of AWS and the simplicity and ease of use of GAE.

This is exactly what we have been working on for the past year, leading us to the launch of our new cloud platform. With this platform we leverage GigaSpaces XAP as the high performance scale-out application server and Amazon as the robust and flexible IaaS. Together they form an alternative Platform as a Service geared for enterprise grade applications. This allows the cloud environment to inherit the extreme performance, low latency and scalability of the XAP platform, which in turn enables you to achieve your performance and scaling targets with fewer machines, implying a lower cost.


Real-life case study: Primatics Financial – Risk analysis as a service

Francis de la Cruz and Argyn Kuketayev from Primatics Financial joined me for the presentation. In their part of the session they described their experience in developing a SaaS application for real-time analytics.

Kuketayev described how Primatics used this approach to create a new automatically scaling cloud version of an existing banking application. Primatics initially developed a mortgage securities application that allows banks to estimate the value of a basket of hundreds of thousands of loans. The value of these loans fluctuates as economic conditions change and some portion of home owners cannot afford to make payments on their loans. Banks normally only need to assess the value of these loans at the end of each month, making them an ideal candidate for cloud services like AWS.

From a scalability perspective, the challenge is to provide a highly multi-tenant application that needs to serve many firms and many users in the same firm, each running many jobs at the same time. Implementing such a model can be fairly complex, as you will need to be able to manage the life cycle of each job and each user independently and in isolation from one another.
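To make the isolation requirement a bit more concrete, here is a minimal sketch in Python. It is not how Primatics or GigaSpaces implement it; the `JobRegistry` class and the firm, user and job names are hypothetical and exist only for illustration. It shows job life cycles tracked per (firm, user) pair, so that cancelling one user's jobs never touches another tenant.

```python
from collections import defaultdict


class JobRegistry:
    """Hypothetical registry: job state is keyed by (firm, user), never shared."""

    def __init__(self):
        # (firm, user) -> {job_id: status}
        self._jobs = defaultdict(dict)

    def submit(self, firm, user, job_id):
        self._jobs[(firm, user)][job_id] = "running"

    def cancel_all(self, firm, user):
        # Only the jobs owned by this firm/user pair are touched.
        owned = self._jobs[(firm, user)]
        for job_id in owned:
            owned[job_id] = "cancelled"

    def jobs_for(self, firm, user):
        # Return a copy so callers cannot mutate another tenant's view.
        return dict(self._jobs[(firm, user)])


registry = JobRegistry()
registry.submit("bank-a", "alice", "job-1")
registry.submit("bank-a", "alice", "job-2")
registry.submit("bank-a", "bob", "job-3")
registry.submit("bank-b", "carol", "job-4")

registry.cancel_all("bank-a", "alice")

alice_jobs = registry.jobs_for("bank-a", "alice")
bob_jobs = registry.jobs_for("bank-a", "bob")
carol_jobs = registry.jobs_for("bank-b", "carol")
```

In a real deployment the registry itself would have to be distributed and fault tolerant, which is exactly where the complexity mentioned above comes from.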


The need for scale 

Trying to build such a service directly on Amazon is going to be fairly complex, as you can learn from George’s summary below:

Primatics wrote the first version of EVOLV:Risk as a hosted web application for a regional bank.. The application needed to be fault tolerant so that if one node crashed, they did not have to restart the application over again from the beginning. Kuketayev said that it is not just about the loss of four hours, but the office is trying to close out the month and needs to access data to end the monthly cycle so they can go home.

Using GigaSpaces' toolset they rewrote the entire application infrastructure in about four-months to run on top of AWS. Now they can kick off as many instances as required for different banking customers, and each instance runs significantly faster than before. Kuketayev said that it is important for banks that none of their applications run on the same infrastructure as another bank.


The diagram below shows the specific architecture that Primatics ended up using. Those who are familiar with Space Based Architecture will find it fairly straightforward:

The application is built out of a set of processing units. Each processing unit contains the compute agents in the form of a polling-container. The compute agents get a reference to a remote Data Grid that is shared by all processing units. Each agent gets the job injected into it by the polling container and gets a reference to the data it requires to process the job. Once the job is completed, the result is stored back in the space. The results are flushed back to a database through a mirror service.

In case of a failure, other compute agents are able to pick up from the exact point of failure and continue the job processing as if nothing happened. This is because the state of the job is kept safe in the data-grid and not in the agent’s memory.
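The failover behavior described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the GigaSpaces API: a plain dict stands in for the data grid, and `process_job`, `SimulatedCrash` and the loan values are all hypothetical names and numbers.

```python
class SimulatedCrash(Exception):
    """Raised to imitate a compute agent dying in the middle of a job."""


def process_job(space, job_id, loans, crash_after=None):
    """Value loans one at a time, checkpointing progress in the shared space."""
    state = space.setdefault(job_id, {"next_index": 0, "total": 0.0})
    handled = 0
    while state["next_index"] < len(loans):
        if crash_after is not None and handled == crash_after:
            raise SimulatedCrash(job_id)
        # The agent keeps no private state: every step is written to the space.
        state["total"] += loans[state["next_index"]]
        state["next_index"] += 1
        handled += 1
    return state["total"]


space = {}                        # stand-in for the shared data grid
loans = [100.0, 250.0, 75.0, 300.0]

try:
    # First agent dies after handling two loans.
    process_job(space, "job-1", loans, crash_after=2)
except SimulatedCrash:
    pass

checkpoint = dict(space["job-1"])             # what survived the crash
result = process_job(space, "job-1", loans)   # a second agent resumes the job
```

The second call does not start from scratch; it continues from the checkpoint left in the shared store. A real data grid would also make the two state updates atomic (for example within a transaction), which this sketch does not attempt.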


image

Kuketayev from Primatics nicely summarized the lessons he learned after going through the experience of trying to build it on his own vs. using GigaSpaces:

Kuketayev said that one of the biggest lessons is that you need to have your infrastructure do the provisioning for you automatically, or otherwise you end up spending a lot of time just turning things on and off. He said they are now using configuration APIs to automate this process, whereas before they were using scripts. This allows for automatic throttling and failover recovery without human intervention.

Kuketayev advised "You need to make sure you use the right tools … You don't want to have to worry about provisioning and reliability. Make sure you have provisioning, failover, monitoring and SLA out of the box."


The full JavaOne presentation is available here:


Final words

For solution providers the size of Primatics, building a risk analysis application as a service wouldn’t be possible without cloud computing. Cloud enabled them to offer their solution as a service without the need to go through the major investment of building a data center to support it.

Primatics’ experience is not special. One of the benefits of building Software as a Service is that you have one shared environment for all your customers. At the same time, one of the challenges is that in a shared environment, failure becomes more public and will impact ALL your clients. If the system doesn’t scale well, you’re going to be hit twice as hard as in a standalone application.

Building a robust and scalable SaaS application can be fairly complex. A good cloud infrastructure will get you a first class data center, but it won’t solve your application requirements. What’s interesting with cloud computing is that it forces you to think about the cost and efficiency of your application more than ever before. In the Primatics example, running a simulation of 100 nodes for 3 hours is very likely to fail at some point. A failure during such a simulation will immediately cost you 300 machine hours, not to mention the fact that you might lose the simulation window for the day and the reputation challenge you’ll be facing with your customers. In addition, putting the data in-memory and making the application run 3-5 times faster means that you would need between 1/3 and 1/5 of the machine power, which saves roughly 67% to 80% of the cost of running the application.
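The arithmetic behind these figures can be sketched as follows. The sketch assumes that cost scales linearly with machine hours (the pay-per-use model) and that a failure forces the whole simulation to be rerun; the function names are mine and purely illustrative.

```python
def machine_hours(nodes, hours):
    """Machine hours consumed (or lost, if the run has to be repeated)."""
    return nodes * hours


def cost_saving(speedup):
    """Fraction of cost saved when the same work runs `speedup` times faster."""
    return 1 - 1 / speedup


lost_hours = machine_hours(100, 3)   # a failed 100-node, 3-hour simulation
saving_at_3x = cost_saving(3)        # about two thirds of the cost
saving_at_5x = cost_saving(5)        # the 80% figure corresponds to a 5x speedup
```

Note that the 80% saving applies only at the upper end of the range; at a 3x speedup the saving is closer to 67%.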

I believe that the challenges imposed by cloud computing force us to focus on what we do best and avoid investing in areas which are not core to our business. Because the pay-per-use model significantly lowers the cost barrier, going down the path of writing your own infrastructure, as many have tried to do before, will be much more expensive and risky than ever before.


May 25, 2009

Interesting talks, and free drinks in JavaOne

It's been two years since I last visited the JavaOne conference. This year is going to be particularly interesting as it's going to be the first major Java event following the Oracle acquisition of Sun.

I will have a Technical Session on Tuesday titled Alternative to Google Application Engine for Java™ Technology-Based Applications, where I'm going to outline the difference between the Google and Amazon approaches to cloud computing and discuss how we can combine the best of the two approaches. Argyn Kuketayev and Francis de la Cruz from Primatics are going to join me to present their experience in deploying a risk management application as a service and provide some of the technical details on how they were able to scale out their application on the cloud.


Daniel Templeton from Sun Microsystems will have a lab session: PetClinic in the Clouds: Scaling a Classic Enterprise Application. In this hands-on lab, participants will take a popular Web application (the Spring PetClinic sample application) and modify it so that it can be deployed on the Amazon EC2 cloud computing infrastructure. They will be exposed to using the GigaSpaces platform as a service, in-memory data grid concepts, the OpenSpaces framework, cloud computing concepts, and persistence as a service using Sun's MySQL™ database technology.

image

We are also co-hosting an event Tuesday June 2 at 8PM with our partner Webtide with whom we've done a great integration for Jetty.

Those who attend the party will have a chance to win a free copy of the book Savvy Guide for cloud computing by Jim Liddle.

*Note: Space is going to be limited, so if you want to ensure your place make sure to register on the online registration site that we set up for this event.

The list below includes all the sessions and labs that I hope to see – any recommendations on other interesting talks would be appreciated.


| Session ID | Session Title | Session Type | Speakers and Company | Date/Time | Venue Room |
|---|---|---|---|---|---|
| TS-4605 | Enterprise JavaBeans™ 3.1 (EJB™ 3.1) Technology Overview | Technical Session | Kenneth Saks, Sun Microsystems, Inc.; Marina Vatkina, Sun Microsystems, Inc. | Tuesday, June 02, 10:50 AM - 11:50 AM | Hall E 134 |
| TS-4308 | Architecting Robust Applications for Amazon EC2 | Technical Session | Chris Richardson, Chris Richardson Consulting | Tuesday, June 02, 12:10 PM - 1:10 PM | Esplanade 307-310 |
| TS-4390 | Castle in the Clouds: SaaS Enabling JavaServer™ Faces Applications | Technical Session | Lucas Jellema, AMIS | Tuesday, June 02, 12:10 PM - 1:10 PM | Esplanade 302 |
| TS-3817 | Google App Engine: Java™ Technology in the Cloud | Technical Session | Toby Reyelts, Google; Max Ross, Google; Don Schwarz, Google | Tuesday, June 02, 3:20 PM - 4:20 PM | Hall E 135 |
| TS-5454 | Alternative to Google Application Engine for Java™ Technology-Based Applications | Technical Session | Nati Shalom, GigaSpaces; Argyn Kuketayev, Primatics; Francis de la Cruz, Primatics | Tuesday, June 02, 4:40 PM - 5:40 PM | Esplanade 302 |
| LAB-5564BYOL | PetClinic in the Clouds: Scaling a Classic Enterprise Application | Hands On Lab | Michal Bachorik, Sun Microsystems, Inc.; Shay Hassidim, GigaSpaces; Daniel Templeton, Sun Microsystems, Inc. | Wednesday, June 03, 1:35 PM - 3:15 PM | Hall E 132 |
| TS-5214 | Java™ Persistence API 2.0: What's New ? | Technical Session | Linda DeMichiel, Sun Microsystems, Inc.; Anil Gaur, Sun Microsystems, Inc. | Wednesday, June 03, 2:50 PM - 3:50 PM | Hall E 134 |
| BOF-1304 | Meet The App Engine (Java™) Team | BOF | Kevin Gibbs, Google; Toby Reyelts, Google; Max Ross, Google; Don Schwarz, Google | Wednesday, June 03, 7:45 PM - 8:35 PM | Hall E 135 |
| PAN-5366 | Cloud Computing: Show Me the Money | Panel Session | Jeff Barr, Amazon.com; Jeff Collins, Intuit; Chris Fry, Salesforce; Simon Guest, Microsoft; Gregor Hohpe, Google, Inc.; Raghavan Srinivas, Self; Lew Tucker, Sun Microsystems, Inc. | Thursday, June 04, 9:30 AM - 10:30 AM | Gateway 102-103 |
| BOF-5392 | Grails Integration Strategies | BOF | Dave Klein, Contegix | Thursday, June 04, 6:30 PM - 7:20 PM | Esplanade 307-310 |
| TS-5307 | Building Next-Generation Web Applications with the Spring 3.0 Web Stack | Technical Session | Keith Donald, SpringSource; Jeremy Grelle, SpringSource | Friday, June 05, 12:10 PM - 1:10 PM | Esplanade 307-310 |
| LAB-5960 | Storing Data in the Cloud | Hands On Lab | Craig Hubbard, Sun Microsystems, Inc.; Chris Kutler, Sun Microsystems, Inc.; Craig McClanahan, Sun Microsystems, Inc. | Thursday, June 04, 9:30 AM - 11:10 AM | Hall E 130-131 |

Tip* - If you want to find your own sessions I would highly recommend using the JavaOne search tool.

See you next week!

April 14, 2009

Challenges for Developing Enterprise Application on the Cloud

In the past few weeks I found myself  involved in various discussions centered around the challenges that enterprises face today when they want to deploy their application on the cloud.

These discussions were very timely as they gave me some interesting ideas for my talk next week Practical Guide for Developing Enterprise Application on the Cloud, taking place on Monday, April 20 at 11:00 am Eastern Daylight Time (GMT -04:00, New York), at the CloudSlam online conference.  

I thought that some of those discussions are worth sharing. In this post I'll try to summarize the main highlights from those discussions.

I’ll start by pointing to the following discussion thread on the cloud mailing list:  Challenges faced by developers and architects when moving to the cloud:

Kent Langley responded by listing the following challenges:

  • Dealing with a lack of persistence in some cases
  • Dealing with distributed programming models (prob. one of the most important ones imho)
  • Having to think about the whole stack. Not just the code.
  • Caching considerations
  • Messaging
  • Using Memory Data Grids
  • Understanding configuration management tools that might be involved
  • Working more closely with the operations group in some case

Kevin Apte added the following comment:

There are many challenges- There is no out-of-the-box infrastructure for
hosting the typical J2EE and SOA Stack in the cloud. There is no Weblogic,
WebSphere, ALBPM, Message Bus like Tibco available in the cloud.

A development team could certainly move all of this into the cloud, but the
configuration, licensing issues etc. are all something the team would have
to solve on its own.  This is far too bleeding edge for many people.

Robert Hankel added other challenges:

..   The problem of adding additional resources dynamically (e.g. more WebCache instances, or WebLogic servers) requires sophisticated distributed system management infrastructure where the entity being managed is no longer a physical or virtual box, but rather an array of boxes acting collectively as a single system..


Grig Gheorghiu posted an interesting write-up, Experiences deploying a large-scale infrastructure in Amazon EC2, where he provides a very insightful summary of the lessons he learned from deploying a large scale application on EC2.

Below are the main takeaways from Grig's summary, which I found relevant for this discussion:

1) Deploy multiple Web servers
2) Deploy multiple load balancers
3) Deploy several database servers.
4) Another way of dealing with databases is to not use them

Challenges summary?

It's easy to see that there is a common theme behind all those comments. Taking existing enterprise applications to the cloud can be very difficult simply because a) most of today's enterprise applications were built using frameworks and technologies not yet supported as first class citizen by cloud providers and b) most of those applications were not designed to take advantage of the cloud's elasticity.

Rather than pointing to my direct response to each of those challenges, I thought that it would be better to provide a short summary of the main possible solutions that came out of this discussion.

Does it have to be that difficult?

No. Below are two main approaches to those challenges.

- Packaging static images

The simplest approach would obviously be to package your local IT environment into images that could be easily ported to the cloud in the exact same way they run in your local IT environment; right? Well, yes, you can package anything in an image bundle and host your virtual machines in a reserved mode with fixed IP configuration. However, being able to technically do that doesn't mean that it makes sense. I would question what's the difference between this environment and any other hosting environment, and what do you expect to get by moving to such a hosted environment vs. running it in your local IT environment.

If you are going to try deploying your existing IT application on the cloud using static images, then most likely you'll end up "porting" not just the application but also the problems you were facing in your local IT environment; i.e. your application will be over-provisioned based on the peak load and you’ll end up with poorly utilized environment.

- Fully elastic application

The main driver for moving to cloud based environment in the first place was to be able to grow as you need and pay for what you use.

The question is whether you can deploy your application without changes to the application while at the same time leverage the elasticity that cloud brings.

Sounds impossible? Well, a good example that does just that is storage. With storage you can take your existing application, run it with your local (static) disk, and then plug in network storage and run the same application on that network device, without changing the application. In that world, instead of taking your existing local disk and virtualizing it, you take the application and plug it into another device that has virtualization built in.

We can use the same approach as with storage; i.e. move your existing application code and run on top of a different underlying implementation that will enable you to capture the elasticity of the cloud without forcing you to re-write your entire application. If you're running in a JEE environment, it should be fairly easy. If your application has strong "ties" to back-end systems, you can use the hybrid model where your application front-end is running on the cloud, while being connected to the back-end system, using secured communication channel.
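The storage analogy can be illustrated with a small sketch. Everything here is hypothetical and written in Python only for brevity: `SessionStore`, `LocalStore`, `PartitionedStore` and `record_login` are names I made up, and the partitions are plain dicts standing in for cloud nodes. The point is that the application code depends only on an interface, so the underlying implementation can be swapped without touching it.

```python
from abc import ABC, abstractmethod


class SessionStore(ABC):
    """The only thing the application code knows about."""

    @abstractmethod
    def put(self, key, value): ...

    @abstractmethod
    def get(self, key): ...


class LocalStore(SessionStore):
    """A single in-process dict, comparable to the local (static) disk."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)


class PartitionedStore(SessionStore):
    """Spreads keys over several partitions; each dict stands in for a node."""

    def __init__(self, partitions):
        self._parts = [{} for _ in range(partitions)]

    def _part(self, key):
        # Route each key to one partition by its hash.
        return self._parts[hash(key) % len(self._parts)]

    def put(self, key, value):
        self._part(key)[key] = value

    def get(self, key):
        return self._part(key).get(key)


def record_login(store, user):
    """Unchanged application code: works against any SessionStore."""
    count = (store.get(user) or 0) + 1
    store.put(user, count)
    return count


local_counts = [record_login(s, u) for s in [LocalStore()] for u in ("alice", "alice", "bob")]
cloud_counts = [record_login(s, u) for s in [PartitionedStore(4)] for u in ("alice", "alice", "bob")]
```

Both runs produce the same results, even though the second one spreads its data across partitions that could, in principle, be added or removed on demand.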

Is it really that simple?

Yes. My experience with the integration work that we did between our Cloud Computing Framework (CCF) and EC2 showed me that getting a production-ready JEE application, including a load-balancer, self healing, auto scaling, security, database and even data grid capabilities plugged in, is actually much simpler than with any other environment I'm aware of. This is due to the built-in automation, predefined images, and the fact that I don't need to download and set up anything to get the entire system up and running. In fact, it's so simple that we decided to build our entire Demo as a Service framework around it; a framework that is used quite successfully and constantly with customers, prospects, and now with partners.

Are there any production references?

Yes. Read Jim Liddle's blog post and see a good example of an Enterprise JEE deployment that is already running on EC2, in production, on top of our new Cloud Computing Framework. In his post, Jim describes how this Telco operator was able to address some of the common challenges mentioned above, such as security, flexibility, cost, development complexity and lock-in, as well as high availability and scalability, in a relatively simple manner.


What next?

In my discussion at the CloudSlam event I’ll try to provide a more detailed practical guide demonstrating how you can take a step by step approach to porting an existing JEE application to the cloud. I hope to get lots of questions and feedback during the discussion so that I can share those with you in one of my follow-up posts.


image


January 22, 2009

Saving cost using Application/Middleware virtualization

Earlier this week, I gave a joint webinar with James Liddle, where we outlined
practical guidelines for saving costs using middleware and application-level Virtualization:

  • Saving the cost of peak/static provisioning using on-demand scaling
  • Saving the downtime cost
  • Saving costs through outsourcing part of our application and operations to the cloud
  • Saving costs using application level optimization (doing more with less)
  • Saving costs using platform consolidation to reduce the number of software components as well as utilize OpenSource and more commodity Software packages

Towards the end (Slide 20), Jim Liddle presented real life case studies from the iPhone launch in the UK and how some of those principles were applied to enable a successful launch.

Additionally, Jim went through some of the motivations and case studies that led different Telco, Online Gaming and Start-up companies to utilize our cloud offering to gain better cost effectiveness.

For those who did not have the chance to participate in the webinar, we uploaded a recorded version of that presentation for you to view.





I would like to point out a few specific items from this presentation:

Beyond Server Side Consolidation (Slide 5)

Server-side-consalidation
Server-Side-Consolidation (SSC) played an important role in bringing the concept of virtualization to mass adoption. It forced organizations to map their applications into concrete packages and to look at machines as logical entities rather than just physical ones. Server-Side-Consolidation also brought a relatively simple model for cost saving: instead of running applications on dedicated hardware, you can consolidate them onto one machine. By doing so, you can reduce the number of servers and save the hardware and operational costs associated with them.

Having said that, SSC is only one *very narrow* aspect of virtualization that unfortunately became too coupled with Virtualization.

The next step is obviously to move from SSC to application and middleware level Virtualization. Application Virtualization refers to the opposite scenario. Instead of putting multiple applications on the same hardware, we are taking a given application and spreading it on a pool of machines. This holds significant potential for making applications more efficient. For example, just think of the saving potential gained by moving applications from static peak-load provisioning to on-demand provisioning. In addition to that, we can utilize commodity hardware resources to get the power of high end machines.


Practical steps (Slide 19)

One of the main concerns that most people have with respect to application-level virtualization is the effort associated with applying those principles.

The table below captures the value versus the effort associated with each part of our middleware and application-level virtualization.
Practical-steps 

As you can see, there are steps that require no changes, or very few changes, to our application:
For example, take an existing web application that requires 10 machines to meet a certain peak load but needs only 3 machines on average. Normally we would statically provision 10 machines to meet the peak load, which means that on average those machines would be poorly utilized. Instead, we can provision our web container on demand and use a pay-per-use model to pay only for what is consumed. This would enable us to save roughly 7 machines (10 - 3) and wouldn't require any changes to our code. (You can learn how you can do that with GigaSpaces here)

The same applies to computational business logic or even a rendering application (good examples are SlideShare or YouTube). These types of applications tend to deal with fluctuating loads. So if on average we consume 10 machines and at peak we need up to 90 machines, we will be able to save ~80 machines by provisioning the computational/rendering machines on demand. (See here on how to use GigaSpaces for MapReduce computation and here to learn how to use the Actor model)
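The arithmetic behind both examples can be sketched in a few lines of Java. This is only an illustration: the class and method names are made up for this post, and the sketch compares machine counts at average load while ignoring any price difference between dedicated and pay-per-use machines.

```java
// Illustrative only: compares static peak provisioning with on-demand provisioning.
class ProvisioningSavings {

    // Machines that sit idle on average when we statically provision for peak load.
    static int machinesSaved(int peakMachines, int averageMachines) {
        if (averageMachines > peakMachines) {
            throw new IllegalArgumentException("average load cannot exceed peak load");
        }
        return peakMachines - averageMachines;
    }

    // Average utilization of a statically provisioned pool, as a percentage.
    static double staticUtilizationPercent(int peakMachines, int averageMachines) {
        return 100.0 * averageMachines / peakMachines;
    }

    public static void main(String[] args) {
        // Web application example: peak of 10 machines, average of 3.
        System.out.println("Web tier saves " + machinesSaved(10, 3) + " machines");
        // Rendering example: peak of 90 machines, average of 10.
        System.out.println("Rendering tier saves " + machinesSaved(90, 10) + " machines");
        System.out.println("Static web tier utilization: "
                + staticUtilizationPercent(10, 3) + "%");
    }
}
```

The wider the gap between peak and average, the lower the utilization of a statically provisioned pool, which is why the rendering case benefits so much more than the web tier.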

On the messaging and data side, we can reduce the number of machines and the cost of scaling by partitioning our messaging or data, thus enabling linear scaling of those layers. In addition, using memory resources instead of files provides a significant boost in performance. The combination of the two enables us to use commodity software and hardware resources instead of high-end resources. For example, we don't need to rely on high-end databases; we can simply use MySQL.
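To make the partitioning idea concrete, here is a minimal sketch of hash-based routing in Java. It is not the GigaSpaces API: the class name is hypothetical and plain in-memory maps stand in for the partitions. The point is only that each key is always routed to the same partition, so adding partitions spreads both data and load across more commodity machines.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: an in-memory store split into partitions by key hash.
class PartitionedStore {
    // Each map stands in for one partition that would live on its own machine.
    private final List<Map<String, String>> partitions = new ArrayList<>();

    PartitionedStore(int partitionCount) {
        if (partitionCount <= 0) {
            throw new IllegalArgumentException("need at least one partition");
        }
        for (int i = 0; i < partitionCount; i++) {
            partitions.add(new HashMap<>());
        }
    }

    // Math.floorMod keeps the index non-negative even for negative hash codes.
    int partitionFor(String key) {
        return Math.floorMod(key.hashCode(), partitions.size());
    }

    void put(String key, String value) {
        partitions.get(partitionFor(key)).put(key, value);
    }

    String get(String key) {
        return partitions.get(partitionFor(key)).get(key);
    }

    public static void main(String[] args) {
        PartitionedStore store = new PartitionedStore(4);
        store.put("order-17", "pending");
        System.out.println("order-17 lives in partition " + store.partitionFor("order-17"));
        System.out.println("value: " + store.get("order-17"));
    }
}
```

A simple modulo scheme like this reshuffles keys when the partition count changes; a production system would need a strategy for that, which is outside the scope of this sketch.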

Final Words

In this presentation we tried to gather most of the knowledge based on our past experience. Most of these lessons are generic and not necessarily specific to GigaSpaces.

I also tried to provide some practical implementation guidelines (Slide 19):

  • Avoid radical change, enabling a gradual process
  • Choose an architecture supporting linear scalability
  • Minimize vendor lock-in
    • Enable application portability and freedom of choice of:
      • Cloud provider, Web container, Programming language, Database
    • Minimize API lock in:
      • Use of standards
      • API Abstractions – when standards are not available
  • Future proof your application
    • Don’t make decisions today, but be ready to make one without major effort
    • Avoid long-term commitment – choose the right licensing model
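As one way to picture the "API abstractions" guideline above, the sketch below hides a storage API behind a small interface. All names here (KeyValueStore, InMemoryStore, OrderService) are hypothetical and exist only for this illustration; a real adapter would wrap whichever provider API you actually use.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical abstraction: application code depends on this interface only.
interface KeyValueStore {
    void put(String key, String value);
    Optional<String> get(String key);
}

// One possible implementation; a cloud-specific adapter would implement the same interface.
class InMemoryStore implements KeyValueStore {
    private final Map<String, String> data = new HashMap<>();

    @Override
    public void put(String key, String value) {
        data.put(key, value);
    }

    @Override
    public Optional<String> get(String key) {
        return Optional.ofNullable(data.get(key));
    }
}

// Business logic never names a concrete provider, so swapping one is a wiring change.
class OrderService {
    private final KeyValueStore store;

    OrderService(KeyValueStore store) {
        this.store = store;
    }

    void recordStatus(String orderId, String status) {
        store.put("order:" + orderId, status);
    }

    String statusOf(String orderId) {
        return store.get("order:" + orderId).orElse("unknown");
    }
}

class ApiAbstractionDemo {
    public static void main(String[] args) {
        OrderService orders = new OrderService(new InMemoryStore());
        orders.recordStatus("42", "shipped");
        System.out.println(orders.statusOf("42"));
        System.out.println(orders.statusOf("43"));
    }
}
```

The abstraction has a cost of its own, so it is worth keeping it as thin as the lack of a standard allows.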

Real life case studies:

I would like to encourage you again to listen to the case studies (Slide 20) and learn how others apply some of those principles in their local IT and cloud environments.

Additional references:

December 22, 2008

Reducing latency with Sun Real Time JVM

Frederic Pariente, Engineering Manager at Sun Microsystems, posted an interesting summary of a case study with GigaSpaces on the Sun blog: Gigaspaces curbs latency outliers with Java Real Time

"In the context of a customer proof-of-concept this summer and in the light of the 2.0 release of Java Real Time System --JRTS 1.0 had the bad prerequisite of source code changes--, Gigaspaces revisited the opportunity for Java Real Time to serve the low-latency requirements of trading applications. Gigaspaces XAP 6.5, Solaris 10 and both Java 5.0 standard and real-time JVMs were used for the benchmark. The test scenario included a trade matching engine and multiple clients injecting messages at extreme speed. The success criteria was to get guaranteed latency per message under 10 msec, with no code modification to the matching engine"

"..The first lesson learned was that msec latency was achievable with the standard JVM, through some advanced tuning of the JVM command-line options. While the customer had reported application freezes up to 20 sec during garbage collection under heavy load --he was running the JVM with no particular flags, unfortunately default JVM options optimize for throughput--, latencies could be brought down to milliseconds by switching to the Concurrent Garbage Collector"

"..The second lesson learned was that the number of outliers can be reduced by an order of magnitude by using the real-time JVM. At a small cost in terms of application throughput --lower-- and CPU usage --higher-- of course"


You can see the full detailed benchmark and JVM options in the original post.
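The benchmark's success criterion, every message under 10 msec, amounts to counting outliers in the measured latencies. A minimal sketch of that check in Java follows. The sample values are invented for illustration and are not taken from the Sun benchmark, and the class name is hypothetical.

```java
import java.util.Arrays;

// Illustrative only: checks measured latencies against a latency budget.
class LatencyOutliers {

    // Number of samples strictly above the budget.
    static long countOutliers(double[] latenciesMs, double budgetMs) {
        return Arrays.stream(latenciesMs).filter(l -> l > budgetMs).count();
    }

    // Worst observed latency; with a guaranteed-latency goal this is the number that matters.
    static double worstCase(double[] latenciesMs) {
        return Arrays.stream(latenciesMs).max()
                .orElseThrow(() -> new IllegalArgumentException("no samples"));
    }

    public static void main(String[] args) {
        // Made-up samples in milliseconds, including one long pause of the kind GC can cause.
        double[] samples = {1.2, 0.9, 3.4, 250.0, 1.1, 12.5, 0.8};
        double budgetMs = 10.0; // the 10 msec target from the case study
        System.out.println("outliers: " + countOutliers(samples, budgetMs));
        System.out.println("worst case (ms): " + worstCase(samples));
    }
}
```

Looking at the worst case and the outlier count, rather than the average, is what separates a latency guarantee from a throughput measurement.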

How does the Real Time JVM work?



For those who are not familiar with the Real Time JVM, Frederic points in his post to a very detailed presentation, Java Real Time for latency-critical banking applications, which I'd recommend looking at. I took one slide from the presentation which I found useful for understanding the general concept behind the Real Time VM.

RT-Java

As can be seen in the above diagram, the RT-JVM introduces a new type of thread, "Critical RT threads". It makes sure that GC will not run while those threads are running, and in that way provides more predictable behavior.

Other references:

Latency is Everywhere and it Costs You Sales - How to Crush it - My Take

November 14, 2008

Private/Public Cloud

Most data centers of today run applications on dedicated machines. This is often referred to as static provisioning. In addition, applications are typically provisioned to handle expected peak loads. Both lead to over-provisioning and low resource utilization.

John Foley wrote an article back in August, Private Clouds Take Shape, in which he describes how data centers are reshaping themselves by taking ideas from public cloud providers, such as Amazon and Google. The idea is to make the data center more cost-effective by enabling on-demand utility-based computing rather than dedicated machines.

The shift towards a utility data-center is a game changer. It will change the way IT operates, the way applications run in the data center as well as the culture of IT organizations.

The push to private clouds has strong momentum these days, as all the major players, starting with the hardware vendors and ending with virtualization vendors, realize that their future rests on how well they fit into this model. Microsoft's Azure announcement was one of the most significant announcements from a major vendor so far.

The need for Private/Public Clouds

At the same time, it is clear that to make IT operations more effective, it doesn't make sense to run all the applications that are currently hosted in a company's data center in the private cloud.  Not all applications in the data center are mission-critical or production systems. For example, take staging or testing environments. Such environments are supposed to be a mirror of the production environment. This is reasonable when our production system runs on a single dedicated server, but what if it runs on 10 or even 100 servers? Does it make sense to have another 10 or 100 dedicated servers just for that purpose? Another good example is disaster recovery. Disaster recovery sites require us to double our resources, let alone the cost associated with maintaining two separate data centers. These are classic scenarios in which running applications on a public cloud could lead to huge cost savings.

A recent InformationWeek survey (which Foley mentions in his piece) provides a more detailed view on the types of applications likely to move from private clouds to public clouds.

Upcoming cloud computing analysis (InformationWeek)

Making your application ready for the private/public cloud

The challenges

There are a few challenges to be aware of if we want to ready applications for a hybrid private/public cloud:

1. How do we design applications to be cloud-agnostic? How do we perform application testing on a public cloud and then run that exact same application in production on a private cloud? For the application to be cloud-agnostic we need to ensure that neither our application code nor our configuration is going to change in the transition, and that our application is going to behave the same in both environments.

2. How do we enable seamless fail-over to a public cloud? To enable a disaster recovery scenario, the public and private clouds need to be connected in a way that enables seamless fail-over from the private to the public cloud.

3. Future-proofing: There are many cases in which we can't make a clear decision as to where our application should be running at the time of writing or developing the application. We would like to be in a position to change the decision as to where our application will be running even after our application has been completely developed.

The solution

1. Enterprise-ready Platform-as-a-Service (PaaS)

Many recent discussions on cloud computing have been centered on the low-level infrastructure, such as virtualization. This is sometimes referred to as Infrastructure-as-a-Service (e.g., Amazon EC2). It is clear that to address the first and the third challenges mentioned above we need a new middleware stack that will provide generic services for running applications in a virtualized environment, or a platform-as-a-service.

You can read more about this layer in GigaSpaces as Alternative to Google AppEngine for the Enterprise. The role of the enterprise-ready PaaS will be similar in nature to that of the application server of today, only that it will broaden to support the needs of private cloud environments, as I outlined in my earlier post.

2. CohesiveFT's VPN Cubed

While enterprise-ready PaaS shields application code from the underlying cloud infrastructure, CohesiveFT's VPN Cubed is responsible for connecting one or more cloud networks through a secured channel in a way that makes them all appear as one big cloud, even if they are not owned by the same provider.

See for yourself how it works live!

My colleague Dekel Tankel blogged about the joint solution by GigaSpaces and CohesiveFT aimed at addressing these challenges. The solution will be presented in a webinar next week, Making Cloud Portability a Practical Reality.

We will show how you can write and deploy standard applications on top of GigaSpaces' Cloud Framework and use CFT's VPN Cubed for a seamless transition across clouds. Even more interesting is that using this solution you can even use multicast discovery across clouds.

By "standard application" I mean that you can take a standard JEE web application packaged as a WAR and deploy it in the multi-cloud environment. It doesn't need to be a "GigaSpaces application". In the webinar next week we will show live how we can deploy an application across both Amazon EC2 and Flexiscale, kill one of the machines and see how the application fails over seamlessly between the clouds with zero downtime.

The webinar will take place next week: November 18, 2008, 1:00 PM EST.

You can register online here




November 05, 2008

Managing applications on the cloud using a JMX Fabric

One of the challenges of managing applications in a distributed environment such as a Cloud/Grid is that collecting or finding the management information of each part of the application is a relatively complex task.

JMX provides a standard way to expose the management information (MBean) of a particular server. However, the way the client-side finds all the MBeans that comprise the application, or the way a single client might interact with the distributed parts of the application, is left open.

Steve Colwill from PSJ wrote a detailed blog, JMX for Grid Based Applications, where he outlines a solution that uses JMX JSR-160 connectors and GigaSpaces to create a JMX Fabric. According to the proposed solution, the managed agent (server side) uses the connector to add a reference to each MBean stub to the space. The client uses a FederatedMBeanServerConnection class that picks up those references from the space, connects to them and then delegates operations to the set of MBean servers, effectively acting as a multiplexer.


Federated-jmx2

Using the space as a JMX directory service

The above diagram illustrates how the model described by Steve works. The client is abstracted from the physical location of each server and can easily discover services that join the network. The connection from the client to the servers uses peer-to-peer communication, which means that once the service is discovered, no additional overhead is needed for communication between the client and the managed service. In this case, the space is used as a directory service. We leverage the fact that it can be distributed and dynamically discovered to simplify the discovery process in a distributed environment.
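To give a feel for the multiplexing part, here is a minimal sketch using only the standard javax.management API. It is not Steve's FederatedMBeanServerConnection and does not use GigaSpaces: the class name is hypothetical, a plain list stands in for the references that would be discovered through the space, and two local MBean servers play the role of remote agents.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import javax.management.JMException;
import javax.management.MBeanServerConnection;
import javax.management.MBeanServerDelegate;
import javax.management.MBeanServerFactory;
import javax.management.ObjectName;

// Illustrative only: fans a single read out to every known MBean server.
class FederatedConnectionSketch {
    // Stand-in for the connection references that would be picked up from the space.
    private final List<MBeanServerConnection> connections;

    FederatedConnectionSketch(List<MBeanServerConnection> connections) {
        this.connections = new ArrayList<>(connections);
    }

    // Reads the same attribute from each server and collects the results in order.
    List<Object> getAttributeFromAll(ObjectName name, String attribute) {
        List<Object> values = new ArrayList<>();
        for (MBeanServerConnection connection : connections) {
            try {
                values.add(connection.getAttribute(name, attribute));
            } catch (JMException | IOException e) {
                // A real fabric would decide per agent whether to skip or fail; we just fail.
                throw new IllegalStateException("could not read " + attribute, e);
            }
        }
        return values;
    }

    public static void main(String[] args) {
        // Two local MBean servers play the role of two managed agents.
        List<MBeanServerConnection> agents = new ArrayList<>();
        agents.add(MBeanServerFactory.newMBeanServer());
        agents.add(MBeanServerFactory.newMBeanServer());

        FederatedConnectionSketch federated = new FederatedConnectionSketch(agents);
        // Every MBean server exposes a delegate MBean with an MBeanServerId attribute.
        for (Object id : federated.getAttributeFromAll(
                MBeanServerDelegate.DELEGATE_NAME, "MBeanServerId")) {
            System.out.println("agent id: " + id);
        }
    }
}
```

In a real deployment the list would hold JSR-160 remote connections and would change as agents join and leave, which is exactly the part the space-based directory takes care of.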

Using the space as a management data repository

The above model is quite useful for cases in which we want to expose federated services which have an existing remote interface. But this is not always the case. If it isn't, we can use the space as a management data repository, which contains full management information for each agent and exposes that information to the client or to any management application. In this method too, the client application is abstracted from the managed service. But unlike the first option, the client gets the information about the managed entity directly from the space, and doesn't need to maintain a connection with the managed service. The space in this case is used as a distributed database, so the application can not only obtain management information about an individual server but can also gather aggregate statistics and perform other aggregate data queries, directly on the data model.
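The repository variant can be sketched in the same spirit. Again, this is not the GigaSpaces API: a plain map stands in for the distributed space, and the AgentStats record and its fields are hypothetical. What it shows is that once agents publish their state as data, aggregate questions become simple queries that never touch the agents themselves.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: management records that agents would write to the space.
class ManagementRepositorySketch {

    // Hypothetical record published by each managed agent.
    static final class AgentStats {
        final String agentId;
        final long requestsHandled;
        final double cpuPercent;

        AgentStats(String agentId, long requestsHandled, double cpuPercent) {
            this.agentId = agentId;
            this.requestsHandled = requestsHandled;
            this.cpuPercent = cpuPercent;
        }
    }

    // A plain map stands in for the space; the latest record per agent replaces the old one.
    private final Map<String, AgentStats> entries = new LinkedHashMap<>();

    void publish(AgentStats stats) {
        entries.put(stats.agentId, stats);
    }

    // Aggregate query: total requests across all agents.
    long totalRequests() {
        return entries.values().stream().mapToLong(s -> s.requestsHandled).sum();
    }

    // Aggregate query: average CPU usage, or 0 when no agent has reported yet.
    double averageCpuPercent() {
        return entries.values().stream().mapToDouble(s -> s.cpuPercent).average().orElse(0.0);
    }

    public static void main(String[] args) {
        ManagementRepositorySketch repository = new ManagementRepositorySketch();
        repository.publish(new AgentStats("agent-a", 100, 20.0));
        repository.publish(new AgentStats("agent-b", 300, 40.0));
        System.out.println("total requests: " + repository.totalRequests());
        System.out.println("average CPU %: " + repository.averageCpuPercent());
    }
}
```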

Summary

Steve's solution to managing applications in a distributed environment is an interesting one, as it enables applications that are already using a standard JMX interface to use a new federated model without changing the application and without adding a performance overhead. This is achieved just by plugging in a new space-based connector. It is a good example that shows how a space can be used as a distributed directory service. It is important to note this is only one pattern in which a space can be used to solve this type of challenge. There are other ways; using the space as a management data repository, as I suggested in this post, is just one of them. The nice thing is that implementing any of these patterns becomes fairly simple once the space is brought into the picture.

I would like to end this post by thanking Steve specifically and PSJ in general for being a great partner for such a long period of time, and for sharing your experience in such meticulous detail.

October 29, 2008

Need scalability? Don't forget pricing

In most discussions about scalability, we often approach the topic as a pure technical/architecture challenge, and ignore cost issues. The problem is that when we truly scale our application, and want to benefit from economies of scale, we're going to end up with scale limitations, not because of technical issues, but because of the pricing  and licensing models.

Scalable pricing

Scalable pricing means a pricing scheme that provides the benefits of economies of scale. Below are pricing models commonly used for software products and how they fit in the new dynamically-scalable world.

  • Free - while this certainly sounds like the best option (and may very well be) the customer needs to be aware of the following:
- The free license of a software product typically does not include support: not an option for most mission critical applications.
- When you do pay extra for support, you will typically be charged just like any other run-time license on a per CPU basis.
- Make sure that the company behind the product has a sustainable business model, otherwise there is a good chance that it will either die when its funding dries up or change its license model to monetize its user base. That's fine, but all it means is that it's not really a free offering in the long run, and you don't know what the pricing model will be exactly.
- In terms of total cost of ownership (TCO), free products are not necessarily the cheapest option. TCO is dependent on many factors, for example, dependency on other products (and their license costs), the need for integration and maintenance, etc. See my post, Economies of non scale, for more on the topic.
  • Subscription model - With a subscription model you pay a fixed periodical fee, typically on an annual basis for infrastructure software, and on a monthly basis for SaaS. Subscription pricing is suitable for on-demand scalability as it provides the flexibility to grow or reduce cost based on the annual use of the product.
  • Pay per use - this model is even more flexible than the subscription model as it gives you finer granularity. Pay per use is provided in various forms, where usage can be measured by CPU utilization or bandwidth utilization. Amazon, for example, charges per machine utilization for its EC2 services and per data utilization for its data services.
  • Perpetual license - This model is used to buy licenses in advance and pay for support separately (normally 15-20% on top of the per CPU license). This is the most commonly used model with commercial software products; however, due to the large initial investment it requires, it doesn't fit well with on-demand environments.
  • Enterprise unlimited license - This model enables you to pay a premium price in advance (based on potential future usage) and gives you the freedom to use the software without any limit. This model fits environments where you anticipate that over a fairly short period of time the usage of the product will become widespread, and therefore pay-per-use or any of the other models mentioned above would become more expensive.

Which model to choose?

Each of the models has pros and cons and therefore the answer depends on your situation. Also, over time, as the situation changes, you will probably realize you need a different license model, and so it becomes equally important that the product you choose will give you the freedom to move from one model to another in the future.

GigaSpaces scalable pricing

With GigaSpaces we continuously look into ways to make our software license cost fit the on-demand world. For example, we launched a free Start-Up program that provides a totally FREE version of GigaSpaces for startups (hundreds of start-ups have already signed up for this program since we launched it last year). We also provide a Pay-Per-Use model for those running on Amazon EC2.

We felt that even though this is a fairly flexible pricing model, we could do better. As of our 6.6 release, we added the option to buy our software at a yearly subscription price, and we also launched a new package called XAP Standard Edition, which is sold at a very low price of $9,500 per package (not per CPU), where the package includes two servers, 4 GigaSpaces nodes and up to 50 clients or remote servers.

These changes were designed to address the needs of developers looking to start running their applications at a relatively low scale, who need the full functionality of the product, but cannot afford the full XAP price. Another principle that we kept when we designed this package is that moving from Standard to Premium edition wouldn't require any change in your architecture or code - which means that you could always scale to the premium edition just by changing the license key.
More details about the new pricing model are available here.

Other references:
GigaSpaces and the Economics of Cloud Computing

Economies of Non-Scale

October 19, 2008

GigaSpaces as Alternative to GoogleAppEngine for the Enterprise

I just came across an interesting post by Josh Heitzman, who writes about his negative experience with Google App Engine, which led him to examine a list of Alternatives to Google App Engine. He points out GigaSpaces XAP as one of them:

One particularly interesting EC2 third party provider is GigaSpaces with their XAP platform that provides in memory transactions backed up to a database. The in memory transactions appear to scale linearly across machines thus providing a distributed in-memory datastore that gets backed up to persistent storage. A lot of the docs reference Java, but the page returned by the aforementioned link states “…deploy applications that use Java, .Net, C++, or even scripting languages…” so after a cursory investigation it is not clear what aspects of their platform is only accessible via Java and which aspects are generally accessible. Bears more investigation.

[See my comment to the post relating to the question about .Net and C++ support]

Josh's post raises the question of what is Platform-as-a-Service (PaaS)?

Platform-as-a-Service is a term used to describe a new set of development platforms that are typically accessible through the web. These platforms enable you to develop new applications easily, without the need to install any software or set up a development and deployment environment. A good example of this is Force.com from Salesforce.com. Other SaaS providers have similar platforms. It seems that the common motivation behind this trend is to give SaaS providers a way to expose their internal framework to partners and users and build an ecosystem around their SaaS product. This led to the emergence of dedicated, proprietary platforms. Google App Engine is a similar initiative from Google to expose some of its own platform to external users.

These platforms, as Josh experienced, were not designed as a general purpose enterprise platform, and therefore, it is not surprising that they lack many of the elements that you would normally expect from enterprise middleware, such as transaction support, security and standard APIs.

Unlike such Internet-based platforms, GigaSpaces XAP was primarily designed as an enterprise middleware platform. It is used in the most demanding mission-critical applications that require extreme scalability and low latency. During the past year we have extended our middleware platform to the Internet cloud, starting with tight integration with Amazon EC2 and followed by partnerships and integration with leading players in the market, including GoGrid, Joyent, RightScale, CohesiveFT and others. We recently launched our cloud framework in private beta. It enables building enterprise applications on GigaSpaces XAP via the Internet. In this way, you can run an application on a hosted GigaSpaces environment, without even downloading the GigaSpaces software.

What makes GigaSpaces XAP an alternative to Google App Engine for the enterprise?

  • Support for existing enterprise applications:
One of my previous posts discussed: Google App Engine - what about existing applications?

In this post, I want to reiterate this point, which goes to the heart of one of the main differences between GigaSpaces XAP and most Internet based PaaS, including Google App Engine and Force.com. Many enterprise applications are already built in Java, .Net or C++. To support enterprise applications, you first need support for these core languages. GigaSpaces XAP not only supports these languages, but also enables efficient interoperability among them.

  • No vendor lock-in: 
While I would argue that lock-in is unavoidable at some level, and that every platform imposes some lock-in, I'd also argue that it is important to examine the nature of the lock-in, and how easy it is to migrate from one platform to another. With XAP, we invested heavily in making lock-in minimal through abstraction, aspects, support for standard APIs and more. As of 6.6, users can take existing web applications and deploy them on our platform without touching the application code. You can read more about this in: Can scaling be made seamless?
To make things easy, we published a migration guide that shows step-by-step how you can take existing transactional JEE applications and deploy them on our platform (locally or on the public cloud). We measured the performance and scalability gain you get by running JEE applications on GigaSpaces XAP versus traditional JEE application servers. We ran the exact same application code on both platforms and measured at least 5 times better scalability efficiency, as well as a performance increase, as outlined here. We also published a new "pet clinic" demo that comes with source code, configuration and documentation, and can be used as a reference guide for running standard JEE applications with zero or minimal code changes. This reference implementation is available here.

  • Designed to support both local enterprise clouds and Internet clouds:
Although GigaSpaces can now run entirely on a public cloud, such as EC2, it is clear that many enterprises are not ready to run on public clouds, but would rather run their apps on an on-premise, private cloud. Supporting this requires the existing development tools used to develop enterprise applications. XAP enables use of common development frameworks (e.g., Eclipse, Maven, Ant, in Java; and Visual Studio in .Net). You can write, test and debug your application locally and then deploy it on the cloud for testing or for production. You can decide at any point where you want to deploy your application, whether on a public Internet cloud, or on a private cloud in the corporate data center. You can also create a hybrid model that involves both a public and private cloud simultaneously.

  •  Enterprise-grade reliability and scalability:
Most Internet-based PaaS impose a radical shift in the way applications are built, and specifically on scalability and reliability. Many of them leave you to deal with failure scenarios on your own, or alternatively, force you to accept the fact that you may lose data if you want to achieve scalability. They also require that you re-write your application if you want to make it scalable.
Many of the assumptions the platforms operate under are not acceptable to enterprise-grade applications. GigaSpaces XAP was designed to meet the most demanding requirements for maintaining both scalability and 24/7 high-availability without losing any data and without compromising scalability or performance.

This is only a partial list of differences, but I think that it makes clear how GigaSpaces differs from most Internet-based PaaS offerings, including Google App Engine.

You can read more about our cloud offering on gigaspaces.com/cloud. If you are interested, try out our new Cloud Framework and see for yourself how easy it is to set up a production-ready cluster with load balancing and scalability within minutes.
