OpenSpaces

May 19, 2009

GigaSpaces-based solution is a finalist in the Cisco Developer Contest

I was very pleased to read an email from Leonardo, the winner of the OpenSpaces Developer Challenge (a worldwide programming contest using the GigaSpaces application server, held last year), saying that he is now a finalist in the Cisco Developer Contest. Here's a bit about him and the application he submitted:

About Leonardo

Leonardo has worked in various roles: as a network administrator for several ISPs, as a Java programmer for IT consulting firms, and finally as a software architect on high-performance Java EE based projects. He is passionate about parallel programming, distributed computing and, more recently, the semantic web and its applications to software engineering.

Leonardo was the winner of the OpenSpaces Developer Challenge. He enjoys reading about various technologies in the field of computer science. When he is not developing code, he prefers to spend time with family and friends, walk in the park, or watch a movie.

About the application

Resource Management Platform is a proposal to develop an event-based platform that leverages AXP, Open Services Gateway initiative (OSGi), Jini and JavaSpaces technologies to enable deployment of IP Multimedia Subsystem (IMS) applications based on the Session Initiation Protocol (SIP); more specifically, the Call Session Control Function (CSCF) components. It will have admission control mechanisms to manage call processing.

This solution improves infrastructure manageability for large scale IMS applications. Such a platform will potentially be useful to enable deployment of high-performance, network-based SaaS (Software as a Service) or Cloud Computing solutions at the network edge by leveraging AXP.


You can find the full details about his project here.

Leonardo's project is interesting because it shows how you can use Space-Based Architecture (SBA) to implement a scalable telco application and offer it as a SaaS application on the cloud.

Interestingly enough, I got another email the week before from Amin Abbaspour, who presented another case study illustrating how you can build a scalable SMS service using SBA, as shown in this diagram:

[Diagram: scalable SMS service built on SBA]

What the two projects have in common, from an architecture perspective, is that they both represent a highly scalable event-driven design. What sets event-driven applications apart is that they require a combination of messaging, data and service interaction that must be tightly orchestrated to meet high-performance, low-latency requirements without compromising on consistency, ordering (FIFO) and reliability. This combination of requirements represents one of the hardest challenges in building scalable architectures.

Trying to meet this challenge the traditional way, by integrating a messaging system for event delivery, a database or simple cache (such as Memcached or TC) for data, and a traditional application server for business logic, leads to a fairly complex architecture. Reaching linear scalability while keeping latency low with so many moving parts is close to impossible.

This is what makes SBA such a good fit. What distinguishes SBA is that it recognizes the strong dependency between messaging, data and business logic. The key is to have one shared model of clustering, high availability and scalability for all three components of the architecture. This makes it possible to reduce the number of moving parts and network hops associated with each business transaction, thereby increasing reliability.
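To make the "fewer moving parts" argument concrete, here is a minimal, hypothetical sketch in plain Java. It is not the GigaSpaces API: `SmsEvent` and `ProcessingUnit` are illustrative names, and in-memory collections stand in for a clustered space. The point it tries to show is that the event queue (messaging), the state (data) and the handler (business logic) live in one unit, so an event is processed in FIFO order with no network hop between the three.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

// Demo entry point; every name in this file is hypothetical.
class SbaSketchDemo {
    public static void main(String[] args) {
        ProcessingUnit unit = new ProcessingUnit();
        unit.write(new SmsEvent("alice", "hello"));
        unit.write(new SmsEvent("bob", "hi"));
        unit.write(new SmsEvent("alice", "bye"));
        System.out.println("processed: " + unit.processAll());
        System.out.println("delivery order: " + unit.deliveryLog());
    }
}

final class SmsEvent {
    final String subscriber;
    final String text;

    SmsEvent(String subscriber, String text) {
        this.subscriber = subscriber;
        this.text = text;
    }
}

final class ProcessingUnit {
    private final Queue<SmsEvent> inbox = new ArrayDeque<>();        // messaging
    private final Map<String, Integer> sentCount = new HashMap<>();  // data
    private final List<String> deliveryLog = new ArrayList<>();      // data

    // Writing an event is a local, in-memory operation.
    void write(SmsEvent event) {
        inbox.add(event);
    }

    // Business logic: consume events in FIFO order and update co-located state.
    int processAll() {
        int processed = 0;
        SmsEvent event;
        while ((event = inbox.poll()) != null) {
            sentCount.merge(event.subscriber, 1, Integer::sum);
            deliveryLog.add(event.text);
            processed++;
        }
        return processed;
    }

    int sentTo(String subscriber) {
        return sentCount.getOrDefault(subscriber, 0);
    }

    List<String> deliveryLog() {
        return new ArrayList<>(deliveryLog);
    }
}
```

In a real deployment each such unit would be one partition of a cluster with its own backup; the sketch leaves that out on purpose.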

On a personal level, I was very pleased to see that the software we are developing is helping people like Leonardo and Amin build their own careers and put themselves in a unique spot in a highly competitive market.

Good luck Leonardo and Amin!


August 18, 2008

GigaSpaces XAP 6.5/6.6 new releases

GigaSpaces 6.5 was released at the end of June, and we are now working on the 6.6 release, with the first milestone already publicly available. These are major milestones in a series of upcoming releases all aimed at strengthening our proposition as a Scale-Out App Server. Our main goal is to significantly simplify the process of achieving scalability, including scaling an EXISTING application within days and without enforcing a complete re-architecture.

I refer to this as our "Seamless Scaling" or "Simple Scaling" initiative. You can read some of the rationale behind this initiative in my previous post, Can scalability be made seamless. This is a very ambitious goal, and we are by no means finished. In addition to the tremendous enhancements already put in place, we have a long-term roadmap that covers many aspects of the product.

The efforts we have undertaken (as well as those on our roadmap) involve enhancements to our development frameworks in Java, .Net and C++, mostly around the abstraction layer, including supporting standard APIs that enable us to inject many of our Data Grid and event-driven capabilities through annotations and configuration, with zero or minimal changes to application code.

They also involve increasing robustness and making large-scale deployments simpler to deploy and manage. Other efforts include extensive integration with popular frameworks, such as Spring and Mule, and recently we also added web framework integration with Jetty. All this is designed to make the end-to-end scaling experience extremely simple and native. Judging by recent feedback we received, some of it publicly referenceable, it looks like we're making great strides toward our goal. I particularly liked the following quote from one of our existing customers, Monte Paschi Group, who built a new pricing system with GigaSpaces. Their full case study is available here. I chose this quote because it highlights a benefit that we don't always emphasize enough - development simplicity:

The development team is happy, too, since the architecture has been
greatly simplified compared to the multi-layered application server system.
"We're not a software company, we're a financial company," Santini
explains. "We didn't have weeks or months to study the technology. Our
main goal was to use it to achieve our goals. GigaSpaces XAP allowed us
to do that right out of the box."

You can read the full details of the 6.5 release here. For convenience, we grouped the long list of features into separate Java, .Net and C++ categories, and provided detailed descriptions that outline the rationale and the value behind each feature.

In this post I'll try to highlight some of the important features and provide insight into our future plans.

Seamless scaling using the Service Virtualization Framework

At the application layer, the most notable feature is the Service Virtualization Framework. The Service Virtualization Framework (SVF) can be seen as a major enhancement to Session Beans in EJB3 and Spring Remoting. This framework enables you to write your business logic as a POJO and deploy the services across a cluster of machines, while providing a single client proxy that virtualizes all those instances as if they were a single server. For more details, I recommend reading a new white paper covering the concept behind this framework and how it can help you build scalable, high-performance SOA and event-driven applications simply. The white paper is available here.
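The "single client proxy over many POJO instances" idea can be sketched in plain Java. This is not the SVF API: `PricingService`, `SimplePricingService` and `VirtualProxy` are hypothetical names, the instances live in one JVM rather than on a cluster, and round-robin dispatch through `java.lang.reflect.Proxy` is an assumption standing in for real remoting.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Demo entry point; every name in this file is hypothetical.
class ServiceVirtualizationDemo {
    public static void main(String[] args) {
        List<PricingService> instances = List.of(
                new SimplePricingService("node-1"),
                new SimplePricingService("node-2"));
        PricingService pricing = VirtualProxy.of(PricingService.class, instances);
        System.out.println(pricing.quote("ACME")); // served by node-1
        System.out.println(pricing.quote("ACME")); // served by node-2
    }
}

// The business interface; the implementation is a plain POJO.
interface PricingService {
    String quote(String symbol);
}

final class SimplePricingService implements PricingService {
    private final String node;

    SimplePricingService(String node) {
        this.node = node;
    }

    @Override
    public String quote(String symbol) {
        return symbol + "@" + node;
    }
}

final class VirtualProxy {
    private VirtualProxy() {}

    // Returns one proxy that spreads calls over all instances, round-robin.
    @SuppressWarnings("unchecked")
    static <T> T of(Class<T> type, List<? extends T> instances) {
        AtomicInteger next = new AtomicInteger();
        InvocationHandler handler = (proxy, method, methodArgs) -> {
            int index = Math.floorMod(next.getAndIncrement(), instances.size());
            try {
                return method.invoke(instances.get(index), methodArgs);
            } catch (InvocationTargetException e) {
                throw e.getCause(); // surface the service's own exception
            }
        };
        return (T) Proxy.newProxyInstance(
                type.getClassLoader(), new Class<?>[] {type}, handler);
    }
}
```

The caller only ever sees `PricingService`; how many instances sit behind it, and how a call is dispatched to them, is decided in one place outside the business code.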

Seamless scaling of popular development frameworks

We enhanced and expanded our integration with popular development frameworks. The purpose of the integration is to provide end-to-end seamless scaling to such frameworks in a way that doesn't require changes to the application. A good example is our Mule-ESB support. Mule users can take existing Mule 2.0 applications and significantly improve performance and scalability by plugging in the GigaSpaces runtime into the Mule Framework. The good news is that the wiring happens outside of your application code at a couple of levels:

  1. Connector level – leveraging our messaging layer as the transport for Mule
  2. Clustering level – taking advantage of our clustering capabilities, enabling the internal Mule data structure to span across multiple machines for scalability and high-availability

This integration is provided as part of our open source framework, OpenSpaces. We hope and anticipate that these integrations will be used as a reference by other frameworks looking for ways to provide similar levels of scalability and reliability. A good example of this already happening is the work David Greco did integrating the Camel open source ESB with GigaSpaces.

With 6.6 we also added out-of-the-box integration with Jetty. This was done in collaboration with the Webtide team (the company behind Jetty), who have given us excellent support throughout the process. What I like about this integration is that it enables taking an EXISTING web application packaged as a WAR and dynamically scaling it across a pool of machines. With this approach, you also get session replication injected into your existing application without touching your code or WAR package. If you're willing to make slight configuration changes, you can get a caching reference injected into your application. There is a new example that shows what it takes to scale an existing web application. The example uses the Spring Pet Clinic application and deploys it on a GigaSpaces cluster. The full example is available here.

Removing the language barrier

For decades, languages have been treated almost as a religion by developers. As an ex-CORBA guy, I know how painful language interoperability is to deal with, and how often it requires compromises on functionality, performance and complexity. At GigaSpaces, we realized that there is no reason that different languages shouldn't be treated simply as different forms of writing business logic. Each offers different value. For example, .Net provides better integration with Windows applications. C++ provides better performance in certain areas and provides low-level APIs for integration with many third-party libraries.

Persistence: 6.5 has some major enhancements for implementing our Persistence-as-a-Service model in .Net through support of nHibernate. This enables .Net applications to integrate with existing databases and have their own database mapping layer with a native .Net API. This comes along with quite extensive performance improvements. You can see some of the results here for .Net, and here for C++.

One of our goals for 6.5 was to bring the same level of scalability and simplicity we provide for Java to .Net and to C++, without compromising performance. Unlike some of the alternatives in the market, we don't just provide remote access to our Java runtime, but provide complete application server capabilities in these two languages -- as well as complete interoperability among all three. Java, C++ and .Net services and clients can run in and share the same process, and leverage that to remove the network call overhead often required when a call crosses language boundaries. An immediate benefit is that you're able to run your C++ and .Net business logic where the data is. Furthermore, you can now leverage our existing SLA-driven deployment to automate the deployment of Java and .Net applications. This means that instead of starting each server process manually, you have a single deployment command that makes sure that your servers are running on the appropriate machines, that your backups are running on different hosts from your primaries, and that if one machine goes down a new instance takes over immediately (or, if no machine is available, as soon as one becomes available) -- all without any human intervention!
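The "backups on different hosts from primaries" rule is easy to state as a placement constraint. The following is a hypothetical sketch of that one rule only, not the GigaSpaces deployment mechanism: `Placement` and `SlaPlanner` are made-up names, the host list is assumed to contain distinct host names, and the round-robin strategy is just one simple way to satisfy the constraint.

```java
import java.util.ArrayList;
import java.util.List;

// Demo entry point; every name in this file is hypothetical.
class SlaPlacementDemo {
    public static void main(String[] args) {
        for (Placement p : SlaPlanner.plan(3, List.of("host-a", "host-b"))) {
            System.out.println(p);
        }
    }
}

final class Placement {
    final int partition;
    final String primaryHost;
    final String backupHost;

    Placement(int partition, String primaryHost, String backupHost) {
        this.partition = partition;
        this.primaryHost = primaryHost;
        this.backupHost = backupHost;
    }

    @Override
    public String toString() {
        return "partition " + partition + ": primary=" + primaryHost + ", backup=" + backupHost;
    }
}

final class SlaPlanner {
    private SlaPlanner() {}

    // Spreads primaries round-robin and puts each backup on the next host,
    // so a primary and its backup never share a machine.
    // Assumes the host names in the list are distinct.
    static List<Placement> plan(int partitions, List<String> hosts) {
        if (hosts.size() < 2) {
            throw new IllegalArgumentException(
                    "need at least two hosts to separate primary and backup");
        }
        List<Placement> plan = new ArrayList<>();
        for (int i = 0; i < partitions; i++) {
            String primary = hosts.get(i % hosts.size());
            String backup = hosts.get((i + 1) % hosts.size());
            plan.add(new Placement(i, primary, backup));
        }
        return plan;
    }
}
```

A real SLA-driven container would also re-run this kind of planning when a machine fails or joins; the sketch covers only the initial placement.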

Dynamic language support

The Java framework guys realized the need to support dynamic languages as part of Java, making the JVM a common platform for running various languages. GigaSpaces XAP 6.5 leverages this, and provides enhanced support for Groovy, JRuby and JavaScript.

Dynamic language support enables writing business logic in Groovy/JRuby/JavaScript and executing it on the GigaSpaces cluster. One of the common use cases for this capability is to provide an elegant alternative to stored procedures. This means you can write business logic in Groovy, for example, that will be executed directly on the data grid nodes. With this, you can write your own custom data queries and aggregation functions, and execute them where the data is. Beyond the performance benefit that you gain from running the logic collocated with the data, you gain the benefit of using dynamic languages, i.e., you can add new functions on the fly without the need to deal with class-version and class-loading issues, and without the need to bring the data down whenever you do that. In this way you can add new functions while the system is running and continues to serve other applications.

This feature leverages the SVF mentioned above. This means that you can choose to run these dynamic procedures synchronously, asynchronously, in parallel, etc. Now isn't that cool?

Click here for code snippets and detailed descriptions of this feature.
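As a rough illustration of "send the function to the data, then aggregate", here is a hypothetical sketch in plain Java. It is not the GigaSpaces API, and a Java lambda stands in for a Groovy/JRuby/JavaScript script: `DataGridSketch` is a made-up class whose partitions are in-memory lists, the task runs once per partition against local data only, and the partial results are combined by a reducer (which is assumed to be associative).

```java
import java.util.List;
import java.util.function.BinaryOperator;
import java.util.function.Function;

// Demo entry point; every name in this file is hypothetical.
class CollocatedExecutionDemo {
    public static void main(String[] args) {
        DataGridSketch grid = new DataGridSketch(List.of(
                List.of(1, 2, 3), List.of(4, 5), List.of(6)));

        // A "procedure" defined at call time, with no change to the grid class.
        Function<List<Integer>, Integer> localSum =
                local -> local.stream().mapToInt(Integer::intValue).sum();

        int total = grid.execute(localSum, Integer::sum);
        System.out.println("total = " + total);
    }
}

final class DataGridSketch {
    private final List<List<Integer>> partitions;

    DataGridSketch(List<List<Integer>> partitions) {
        this.partitions = partitions;
    }

    // Runs the task once per partition, against that partition's data only,
    // possibly in parallel, then combines the partial results.
    <R> R execute(Function<List<Integer>, R> task, BinaryOperator<R> reducer) {
        return partitions.parallelStream()
                .map(task)
                .reduce(reducer)
                .orElseThrow(() -> new IllegalStateException("grid has no partitions"));
    }
}
```

Only the small partial results cross partition boundaries; the data itself never moves, which is the performance argument made above.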

Data awareness everywhere

Throughout all of our development efforts, we are making sure data-awareness is maintained across the entire stack. Data-awareness means that invoking a method on the new Service Virtualization Framework can be routed to a particular service instance based on the data associated with that service instance. It also means that when you send a message through our JMS implementation, we will be able to route it to the JMS partition that manages the relevant data. Unlike alternative solutions, this is native to our environment, meaning that there is no need for external integration and complexity to achieve this behavior.

Click here to view a code snippet and detailed description of how routing is handled in the Service Virtualization Framework.
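Independently of the linked snippet, the general idea of data-aware routing can be sketched in a few lines of plain Java. Everything here is a hypothetical illustration rather than the framework's API: `Partition` and `DataAwareRouter` are made-up names, and hashing the routing key modulo the partition count is an assumed, commonly used routing rule.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Demo entry point; every name in this file is hypothetical.
class DataAwareRoutingDemo {
    public static void main(String[] args) {
        DataAwareRouter router = new DataAwareRouter(4);
        router.deposit("acct-1", 10.0);
        double balance = router.deposit("acct-1", 5.0);
        System.out.println("acct-1 balance " + balance
                + " on partition " + router.route("acct-1").id);
    }
}

// One partition owns a slice of the data plus the service logic next to it.
final class Partition {
    final int id;
    final Map<String, Double> balances = new HashMap<>();

    Partition(int id) {
        this.id = id;
    }

    // Business logic runs next to the data it touches.
    double deposit(String accountId, double amount) {
        return balances.merge(accountId, amount, Double::sum);
    }
}

final class DataAwareRouter {
    private final List<Partition> partitions = new ArrayList<>();

    DataAwareRouter(int partitionCount) {
        for (int i = 0; i < partitionCount; i++) {
            partitions.add(new Partition(i));
        }
    }

    // The routing key alone decides which partition handles the call.
    Partition route(String routingKey) {
        return partitions.get(Math.floorMod(routingKey.hashCode(), partitions.size()));
    }

    double deposit(String accountId, double amount) {
        return route(accountId).deposit(accountId, amount);
    }
}
```

Because the same key always maps to the same partition, a service call and the data it needs meet in one place, and the same rule can route a message to the partition that manages the relevant data.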

Performance, Performance and more Performance...

Improving performance remains a constant goal for all of our releases. As the product matures, finding places in the product where performance can be further optimized is getting harder, and I therefore am always surprised when one of the developers comes up with some creative idea around performance.

In this release we improved performance on several fronts -- including Java, .Net and C++ -- which involved significant optimizations of object serialization and multi-core scalability. For the latter we are working with Azul, and making it part of our testing environment, as well as other multi-core systems such as Sun Niagara. You can see some of the figures and details here, here (comparison with the previous release of .Net) and here (C++).

We conducted detailed comparisons of the latency and throughput of a "classic" transactional application based on the standard JEE model (using JBoss, JMS, Spring and Hibernate) against the same application using GigaSpaces as the messaging and data layer -- and eventually replaced the entire JEE stack with a GigaSpaces + Spring stack. It is important to note that throughout this process, the business logic code remained untouched. The initial results of these tests can be found here:

[Charts: latency and throughput comparison]

You can find the details of the code that was used in this test and the migration steps in a new white paper that is now available on our site here. Uri Cohen wrote up in his blog (The Space as a Messaging Backbone) some of the interesting findings from this analysis, which showed the difference between end-to-end measurement and point optimization, and why in some cases putting a distributed cache in front of a database is not going to be enough.

What's next:

For obvious reasons I can't expose our entire roadmap as of yet. What I can say for sure is that we're going to continue improving the level of seamless and simple scalability provided by our platform.

We view the partnership and integration with other frameworks as strategic, and we're going to continue with that effort. One of the frameworks we are planning on working on is GlassFish.

We already announced our first cloud offering, designed to run on Amazon EC2, and including partnerships and integration with RightScale and Cohesive FT. If I'm not mistaken, this was the first Java application server available in a pay-per-use model, designed to meet the needs of enterprises and ISVs that want to offer their applications on the cloud, including as Software-as-a-Service. We're going to put a lot of effort into making cloud deployments simpler, enabling our customers to use it on their local virtualized environments (private clouds) and on public clouds (Amazon EC2, GoGrid, FlexiScale, AppNexus and others), or even a combination of the two, without changing their applications. We're now working on a new version that enables provisioning a cluster of machines, deploying the application on said cluster and opening up an administrative console for the cluster -- all with a single click. This is already working in an internal beta. We're planning to provide a preview release by next month. With the availability of Windows-based clouds, we will be providing our .Net application platform as a cloud offering as well.

On the API and Standards front, we recently joined the OSGi Alliance, where we expect to play an active role. We are also looking into ways through which we can strengthen our compliance with some of the latest standards on the JEE stack, such as EJB 3.0 and JPA. The challenge is not just basic API mapping, but how to do it in a way that doesn't break our scale-out architecture and doesn’t create complexity. Unfortunately, previous versions of the EJB spec weren't a good fit. EJB 3.0 looks much more promising.

On the .Net front, we're going to continue with our performance optimization project. We're also working on making our .Net offering fit natively within a .Net development environment by providing better development and installation packages that fit better with the .Net spirit. We are also looking into ways to simplify the testing and debugging process. For pure .Net users we will make the .Net version available as a standalone package at a reduced price (details will follow).

On the C++ front, we're going to provide our customers with an open source version of our C++ binding and a complete package that will enable them to compile and build our C++ binding with their own set of dependencies, libraries, compiler versions and flags. This will also allow using the current C++ framework as a broad integration framework for third-party tools and languages.

There's much more than I could cover in this post. I tried to put together what I thought are the highlights of the release. As it's impractical to cover such a wide spectrum of topics in a single post, we started a process in which different people from our R&D and field engineering teams will post on specific aspects of the product and best-practices for using GigaSpaces in existing web, financial, online gaming and other applications.


Be part of our next release:

As we are now deciding what to include in our 7.0 release, it would be nice to hear your feedback and specific requests for enhancements or new features. You can either send me a direct email or send it to PM at GigaSpaces.com. Alternatively, if you think that you have a good idea that other users might be interested in, you can implement it on our community site – OpenSpaces.org.

The new GigaSpaces XAP 6.5 is available for download here.

July 29, 2008

Can scaling be made seamless?

Putting together the two words "seamless scaling" in front of a technical audience is a very dangerous thing to do. The technically savvy folks are walking around with plenty of scars from previous attempts to scale their systems - enough to know that "scaling" and "seamless" couldn't be further apart. But nevertheless, in this post I'm going to take the risk and do just that :)

Basically what I'm going to try and argue is that while scaling can't be made seamless across the board, there are different techniques to make scaling seamless in certain scenarios, or at least very close to seamless. I will use GigaSpaces as an example of how to achieve seamless migration of existing JEE applications into a scale-out model, with zero or minimal change to the code. I'll also outline our general principles, which I believe are applicable to any application seeking seamless scaling.

The seamless scaling dogma

There has been a lot of discussion over the past year about different patterns of scalability. I devoted quite a few of my posts to this topic. Most of them centered around architecture - how we can use partitioning to avoid a data bottleneck, how we can use in-memory implementations to get better performance and concurrency compared to implementations based on the file system, and how we can use an asynchronous event-driven architecture as a better way to scale our business logic.

Randy Shoup outlined these principles nicely in his InfoQ article, Scalability Best Practices: Lessons from eBay. The dogma behind all these discussions and panels was that scaling requires a very rare set of skills, which average developers don't have, and that's why we're still seeing plenty of online system failures. The most recent was the iPhone launch failure.

Does scaling really have to be complex?

Well, if you look at Network Attached Storage (NAS) as an example, you'll see there are alternatives to the traditional dogma around scaling. With storage systems, we don't really think about scaling that much. Moreover, our applications don't really need to be aware of whether they run over a local disk or a network-attached device. We can scale by adding disks, in some cases even hot-swapping them while our application is still running.

Now imagine what the world would look like if it wasn't that simple: if our application needed to be aware of what's behind the scenes of these storage devices and had to be rewritten to deal with these scaling issues. It's not that hard to imagine, is it? Most likely we would still be talking about storage-related system failures resulting from bad architecture and implementation issues. But we don't have anything to talk about, because storage gave us a level of abstraction that enabled almost everyone, regardless of their skill set, to deal with scaling without being an expert at it, or really even thinking about it much at all.

Can we learn any lessons from NAS about our ability to achieve seamless scalability?

Let's look at the conditions that made seamless scaling with storage possible:

  1. Well-defined interface (or abstraction)
  2. Interface that fits the share-nothing approach to make it suitable for scaling
  3. Simple interface
  4. Widely-used interface

Now if we examine these principles as they apply to other layers of the application stack, we'll get a decent answer as to why we haven't been able to apply the same level of seamless scaling - which storage already provides - to these other layers.

In the data layer, the most commonly used interface is SQL. SQL fits criteria (1) and (4) well but doesn't meet (2) and (3). HashTable fits (2) and (3) well but unfortunately is less commonly used in distributed systems. JavaSpaces, like HashTable, fits (2) and (3) but is even less commonly used than HashTable. In the messaging tier, JMS fits (1), (3) and (4) well but doesn't lend itself to (2), and so on. And these are the cases where there is a well-defined standard. Unfortunately, in other layers of our applications it's even harder to find a well-defined standard that fits all of these criteria.

To overcome this complexity, there have been other attempts to use the JVM bytecode as a lowest common denominator and introduce seamless scaling not at the middleware API level, but at the JVM level using bytecode manipulation. This seems like an elegant solution to the problem; however, most existing distributed systems were not written as standalone Java applications that get distributed by some sort of magic, so it fails mainly on the 4th criterion - it fits mainly new applications that were designed with certain assumptions in mind about how the standalone Java code would behave in a distributed environment.

Now to the point - can we scale seamlessly?

Those who expect a simple yes-or-no answer to this question are going to be disappointed - there is no clear answer, because it depends on the specific application scenario, the way the application was written and the maturity of the various standards around these applications.

In general I would say that Java-framework-based applications are in better condition than applications based on other frameworks, due to the maturity of the standards and the advanced layers of abstraction that are now available as part of frameworks such as Spring and Mule.

Seamless scaling at the application layer would most likely mean the ability to plug in different underlying scalable implementations at the middleware layer (data, messaging, business logic, presentation). The use of abstraction layers such as IoC in Spring/Mule and the new EJB3 abstraction gives more freedom to plug in different implementations that don't necessarily conform to the exact same standard API. That means that your code can remain intact when you plug in a different messaging implementation, for example, whether it is a JMS implementation, space-based messaging, or remoting.
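Here is a minimal sketch of that "code remains intact" claim, in plain Java with constructor injection standing in for an IoC container. All names are hypothetical: `MessageChannel` is an application-level abstraction, `QueueChannel` is an in-memory stand-in for a JMS-style queue, and `SpaceChannel` is an in-memory stand-in for space-based messaging (write an entry, take an entry). None of this is Spring, Mule or GigaSpaces API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Demo entry point; every name in this file is hypothetical.
class PluggableMessagingDemo {
    public static void main(String[] args) {
        List<MessageChannel> channels = List.of(new QueueChannel(), new SpaceChannel());
        for (MessageChannel channel : channels) {
            OrderService service = new OrderService(channel); // wiring, not business code
            service.placeOrder("42");
            System.out.println(channel.getClass().getSimpleName()
                    + " -> " + channel.poll().orElse("<empty>"));
        }
    }
}

// The only messaging type the business code knows about.
interface MessageChannel {
    void send(String message);

    Optional<String> poll();
}

// In-memory stand-in for a JMS-style queue.
final class QueueChannel implements MessageChannel {
    private final Queue<String> queue = new ConcurrentLinkedQueue<>();

    @Override
    public void send(String message) {
        queue.add(message);
    }

    @Override
    public Optional<String> poll() {
        return Optional.ofNullable(queue.poll());
    }
}

// In-memory stand-in for space-based messaging: send = write, poll = take.
final class SpaceChannel implements MessageChannel {
    private final List<String> entries = new ArrayList<>();

    @Override
    public synchronized void send(String message) {
        entries.add(message);
    }

    @Override
    public synchronized Optional<String> poll() {
        return entries.isEmpty() ? Optional.empty() : Optional.of(entries.remove(0));
    }
}

// Business logic: unchanged no matter which implementation is injected.
final class OrderService {
    private final MessageChannel channel;

    OrderService(MessageChannel channel) {
        this.channel = channel;
    }

    void placeOrder(String orderId) {
        channel.send("order:" + orderId);
    }
}
```

With an IoC container, the choice between the two implementations would move out of `main` and into configuration, which is what makes the swap a configuration change rather than a code change.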

Some cases are going to be easier than others. For example, taking a SessionBean and scaling it by having multiple instances of that service running over a pool of machines, while viewing them all as if they were a single server, can be done through configuration changes only. We can do pretty much the same thing with the messaging layer, where we will have a virtual queue and topic rather than a centralized server.

On the data layer things are trickier, as most of the commonly used standards in this area don't fit criterion (2) very well. If our data model is built with a complex object graph, or if our queries depend on complex joins, then we're not going to be able to scale it out without changes to the code or to the domain model. But even in these more difficult cases, it's possible to minimize the scope of change by using the DAO pattern, declarative transactions and annotations as a mapping layer on top of the domain model. This means that even if the change can't be completely seamless, it will nevertheless be quite simple to achieve.

Learning from the GigaSpaces experience

At this point I'd like to use our specific experience at GigaSpaces to describe the methods we used to enable seamless scaling:

  • Use standard APIs, but only when it makes sense. For years we chose not to implement large parts of the JEE standard, such as EJB and Entity beans, because they didn't fit the scale-out environment and were too tightly bound to the database. What I'm trying to say is that implementing a standard API is not always going to make the transition to a scale-out model seamless, so you should be careful which standard you pick.
  • Leverage existing abstractions to plug in different implementations that are based on other APIs or technologies than the one originally used. We use this principle quite extensively in our OpenSpaces framework: to map our own transaction handlers and remoting abstraction, to enable seamless scaling of SessionBeans, etc.
  • Use annotations for mapping between different models.
  • Use aspects to add new behavior when it makes sense. We use aspects in several cases such as filters/remoting aspects and security aspects. We will probably be using aspects more to address a more advanced level of serialization.
  • Apply more tightly coupled integration to specific products/frameworks. A good example of that is our Spring, Mule and upcoming web tier integration. This sort of integration enables an end-to-end seamless scaling story that makes the user experience significantly better. On the .Net side our integration with Office and Excel enables something equivalent.
  • Use open source as a tool to open up the framework for extensions and other integration work. This is something that we introduced quite recently through our new OpenSpaces.org community site and found it to be a useful tool with many extensions already available. GigaSpaces users implemented their own extensions and made them available through the community site. The most recent one has been Camel integration.

Real life examples

Of course, this isn't just a theoretical discussion - we've been attempting to achieve this level of seamless scaling in practice since we introduced our middleware virtualization stack, which was our first attempt to address scaling of existing applications and not just new applications.

We have been involved in numerous scenarios of scaling out existing applications. An interesting example is detailed in Mickey's recent blog post, in which he describes how he was able to scale out a JBoss/Oracle RAC-based application. Mickey provides a good description with code snippets that show the before and after effects, both in terms of code changes and obviously scaling and performance. You can find the details of that experience here. The bottom line of this case study is the fact that he was able to get that application from 15tx/sec to 1500tx/sec in less than 4 days! For me, measuring the time it takes to move your EXISTING application and see the immediate results is the ultimate measure. You have to agree that if the transition to a scale-out model wasn't seamless, it wouldn't have been possible to do in such a short time, and more importantly, without ripping and replacing the entire application. In Mickey's case, we started with decoupling of the database to get the initial scaling, and replaced the other layers incrementally.

Summary

Storage taught us the lesson of seamless scaling. Seamless scaling can be achieved on other layers of our application as well, using a combination of Standard APIs, Abstractions, Aspects and tailored integration. In most cases, seamless scaling would mean no changes to our application code but would require changes to configuration and packaging. Not all layers can make a fully seamless transition. But in those more difficult cases, we can use the same principles to significantly minimize the changes required for scaling.

In this post I wanted to share some of our GigaSpaces experience in that area, as I believe many of the lessons and principles are pretty generic and can be applied to any project/product. At this point it is also important to note that this is not a one-off proposition. It's a continuous effort and requires a long-term roadmap and commitment. We've been struggling with this for years and applied every possible method to achieve this goal. Some required significant refactoring of our entire infrastructure. The latest one has been the addition of our OpenSpaces framework as an open source development framework based on Spring. With this change, we can easily support more APIs and frameworks, as well as build an entire ecosystem around it that will enable others to apply the same model to even more frameworks and applications very easily.

You may wonder why we, as a commercial company, would want to do this - after all it also means that GigaSpaces can be replaced much more easily. Well, the reason is fairly simple - we believe that our success and adoption will be much higher if we can get to the point where scaling any application through GigaSpaces won't require any changes to code. It took a few years and an intensive effort to get to a point where I feel comfortable using the two words "Seamless Scaling". Now we're starting to see the fruits of that effort - just see the recent post by Seon Lee, who appears to be one of the Mule users: Mule 2.0 + GigaSpaces 6.5 = Pure Sex:

Gigaspaces released 6.5 with API integration with Mule 2.0 … this is just plain awesome. You can use Gigaspaces as the transport (e.g. in place of JMS) and quickly get a SBA up and running utilizing the same concepts I used at RHG when we were servicing B2B problems. You also get the advantage of the clustering ability and fault tolerance that comes with Gigaspaces – which is just pure sex – not to mention all the other great features that come with this advanced Javaspaces implementation (i.e. management tools, monitoring tools, data partitioning, performance features like batching).

I expect to see even more along those lines with our latest 6.6 release, which includes Seamless Scaling of Web applications - check that out!

June 27, 2008

TSSJS Prague: my take-aways

Once again TSSJS was a well-organized event with lots of interesting content. Hot topics that I took notice of were RIA, new languages, and obviously distributed computing and scalability.

I arrived on Tuesday morning, which gave me a chance to meet John Davies, Ted Neward, Kirk Pepperdine and Holly Cummins. We found a nice spot not too far from Charles Bridge. At some point we started discussing the reasons we're seeing a burst of new languages. The discussion about languages is thought-provoking. Ted Neward (one of my favorite presenters) seems to be spending a lot of his time recently thinking about this topic. He explained his view over dinner (while he was completely jet-lagged!). I'll try to summarize the main points:

  1. One size doesn't fit all - we shouldn't try to force one language to do everything and expect it to be good at it all. The concept of using multiple languages in the same application is actually something we've been practicing for a while by using a combination of HTML, CSS, XML, JavaScript, Java, etc. Each language serves a specific purpose.

  2. Different semantics require different expressions, i.e., different languages. An example that was given was Scala and Erlang and the notion of parallel programming as a first-class citizen in the language (as opposed to a set of libraries and explicit APIs in Java). The argument (brought up by Kirk) is that you can't leverage multi-core platforms without languages that were designed to do so. It reminds me that multi-threaded programming indeed wasn't common until threading became native in the languages. Now you can't think of writing even a simple application without threads. So I think that Ted and Kirk have a valid point.

  3. Usability and productivity - how many lines of code are required to express a certain idea? There are many examples that show how different use cases in Java can be relatively verbose and complex in comparison to the "new" languages.

  4. JVM/CLR makes it easy to introduce new languages as just new views and perspectives running on the same platform. Previously, languages such as Perl and TCL had to be built with an entire stack, typically based on C or C++, and had to be ported to various platforms and operating systems. This approach made the choice of language and language interoperability quite difficult, as the decision to choose one language over another was considered a "catholic marriage". Today, the JVM in Java and the CLR in .Net enable better separation of concerns. They provide a common platform that can easily support multiple languages. This simplifies interoperability of different languages within the same application. A good example is the new support for dynamic languages in Java 6 and in .Net. This makes the language decision simpler, as the impact of this decision on our project is less drastic and less risky than it was before.
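To make the last point a bit more concrete, here is a minimal sketch of the scripting API (javax.script) that Java 6 introduced for hosting other languages on the JVM. The class and method names (ScriptingSketch, availableLanguages, evalIfAvailable) are mine, chosen for illustration only. Which engines are actually installed depends on the JDK and the classpath, so the code makes no assumption that any particular language is present.

```java
import javax.script.ScriptEngine;
import javax.script.ScriptEngineFactory;
import javax.script.ScriptEngineManager;
import javax.script.ScriptException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, for illustration only.
final class ScriptingSketch {

    // Lists the languages of all script engines discoverable on this JVM.
    // The list may be empty: engines are plug-ins, not a fixed part of the platform.
    static List<String> availableLanguages() {
        List<String> languages = new ArrayList<>();
        for (ScriptEngineFactory factory : new ScriptEngineManager().getEngineFactories()) {
            languages.add(factory.getLanguageName());
        }
        return languages;
    }

    // Evaluates a script with the named engine, or returns null if no such
    // engine is installed. Script errors are rethrown as unchecked exceptions.
    static Object evalIfAvailable(String engineName, String script) {
        ScriptEngine engine = new ScriptEngineManager().getEngineByName(engineName);
        if (engine == null) {
            return null;
        }
        try {
            return engine.eval(script);
        } catch (ScriptException e) {
            throw new IllegalStateException("Script failed: " + e.getMessage(), e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Languages on this JVM: " + availableLanguages());
        // "groovy" is only an example name; the result is null unless a Groovy engine is on the classpath.
        System.out.println("Result: " + evalIfAvailable("groovy", "1 + 1"));
    }
}
```

The point is that the host application talks to one small API, and the language behind it becomes a deployment decision rather than a rewrite.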

While I think that all of the points are valid, I couldn't avoid thinking that we're forgetting past experience. For example, you could easily argue that lines-of-code is only one measurement of productivity. Another measure of productivity is maintenance, i.e., how simple it is to read the code and understand it, transfer it to another programmer, etc. My concern is that if the language becomes too flexible and enables each of us to write our own extension, we're going to find ourselves in a position where the only person that understands the code is the person who wrote it, and even that would hold true only for a certain period of time. Think about C++ templates, macros, operator overloading, multiple inheritance -- a lot of "nice" features that made our code very flexible, but less readable due to the large number of indirections we had to go through to parse a single class.

One of the things I liked when I switched to Java from C++ was the fact that to understand my colleagues' code all I needed was to read their .java files. In most cases I didn't really need documentation and it was fairly simple to parse the code because Java restricted much of the flexibility that I just mentioned. Trying to do the exact same thing with C++ requires parsing of header files, macros and typedefs. Another issue is that introducing multiple languages can be quite complex and a barrier to productivity, due to limited skill sets within a certain project, even if choosing a language is less risky than before.

I think the concurrency argument is only a temporary one. I'd hate to choose a different language just for that, because it's something that I expect to see native in Java. So far we managed to deal with multi-core and parallel programming quite effectively with Java using event-driven architecture (EDA) and master/worker patterns, and abstract a lot of the concurrent programming with things such as Futures, Remoting, etc. Surely having some of the features of Scala or Erlang as part of Java would have made our life simpler, but if I measure the value vs. risk involved, I'm not sure it justifies using it right now in a real project.
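As a rough illustration of what I mean by master/worker with Futures in plain Java, here is a minimal sketch using only java.util.concurrent. It is not GigaSpaces code; the class name MasterWorkerSketch and the sum-of-squares task are made up for the example. The master splits the work into chunks, submits each chunk to a pool of workers, and combines the partial results as the Futures complete.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical example class, for illustration only.
final class MasterWorkerSketch {

    // Master: partitions the input, hands each partition to a worker thread,
    // then sums the partial results returned through Futures.
    static long sumOfSquares(List<Integer> values, int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            // Chunk size rounded up so that every element lands in some chunk.
            int chunk = Math.max(1, (values.size() + workers - 1) / workers);
            List<Future<Long>> partials = new ArrayList<>();
            for (int start = 0; start < values.size(); start += chunk) {
                List<Integer> slice = values.subList(start, Math.min(start + chunk, values.size()));
                // Worker: computes the sum of squares of its own slice only.
                Callable<Long> task = () -> {
                    long sum = 0;
                    for (int v : slice) {
                        sum += (long) v * v;
                    }
                    return sum;
                };
                partials.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Long> partial : partials) {
                total += partial.get(); // blocks until that worker is done
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(List.of(1, 2, 3, 4, 5, 6), 3)); // prints 91
    }
}
```

The caller never touches a thread or a lock directly, which is the kind of abstraction I have in mind. A distributed version would replace the local thread pool with remote workers, but the shape of the code stays much the same.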

Don't get me wrong. I'm not saying that there is anything wrong with these languages. What I'm arguing is that we need to be very careful before we choose them and make sure that we're measuring the right value, rather than assuming that any of the above arguments applies to our application without proper analysis. Ted was able to convince me to look further into this topic - so I'm probably going to give Scala a try and get a real feel of it.

The event started on Wednesday morning with a very good presentation by Stephan Janssen. Stephan is the founder of Parleys.com. He is also the founder of the JavaPolis conference held annually in Belgium. He talked about his experience with a wide range of RIA platforms: DHTML, Adobe Flex/Air, JavaFX, Google Web Toolkit (GWT) and Microsoft Silverlight. He discussed his personal experience in using the various technologies as part of Parleys.com.

The combination of a general overview with real-life examples made the discussion quite interesting and lively. The bottom line of this part of the talk was to use Adobe Flex if you're building a site in the short term, and JavaFX if you're planning on launching your site in about a year's time - due to the maturity cycle and the gaps between the two technologies. Personally I found the fact that there are so many options to do the same thing quite confusing. I wish that we could press the fast-forward button on the maturity cycle of these technologies. Having worked with previous versions of Parleys.com, I must say that I was very impressed with the progress and the *right* use of technologies to build the new version of the site.

Another interesting and quite innovative idea that Stephan presented was about hosting services and collaboration with academic partners. The hosting service will enable companies like ours to host their live presentations on the Parleys.com site. In addition, you can embed the presentation in a blog entry. You can also record your talk online, using a web-based application. The partnership with academic institutions enables scaling not just the content, but also the bandwidth, similar to the way downloads work. IMO Parleys.com could easily become the YouTube of online presentations. If you missed the presentation I'd recommend watching this interview here:

My own presentation, Getting ready for the cloud, seems to have been well-received, although I had some concern that it might be too high-level for some of the audience. You can read some of the comments posted by others here and here.

The presentation included a live demo of a web 2.0 application (displaying market data) running live on Amazon EC2. Although the demo ran over a wireless line, it went surprisingly smoothly and I was able to easily redeploy and relocate instances through a simple drag & drop using our UI, which was hosted on one of the EC2 machines. The following day, Uri Cohen gave a session in which he showed the details of what's going on behind the scenes and reviewed the actual code and API used in the demo. If you're interested in experiencing it yourself, you can try out the same demo on our new EC2 version.

TSSJS was a good opportunity to meet in person the winners of our OpenSpaces developer competition. I heard interesting stories about what drove them to write their projects. The common theme was the technology challenge - they heard about our technology and scaling pattern and wanted to get a feel for themselves of how it works. You can listen to some of the stories in recent podcasts we published here, here and here.

BTW, Jason Carreira, one of the winners, has since worked on another project: a scalable Twitter-like application using GigaSpaces and EC2 (an alpha version already exists, and he is now looking for hosting opportunities). And Leonardo Goncalves -- the first prize winner -- is already thinking of the next version of his project. The third winner -- Kirill Ishanov -- is also planning to participate in next year's contest. At the end of the first day we showed a video of some of the judges (John Davies and Julian Browne were missing from the video). It's a light-hearted video in which the judges also make fun of Joe Ottinger :)

   
Two of the talks I very much enjoyed were given by John Davies, formerly the founder and CTO of C24 (which was sold to IONA), who has recently started a new venture called Incept5. I've worked with John for many years now and we often have excellent chats about ideas in our respective markets. John's first talk was one I'd heard before, but as always, he updated it with new anecdotes and ideas. He talked about extreme enterprise architectures, specifically ESBs and grid in the low-latency, high-volume, complex environments of investment banking. John started by explaining the value of a millisecond to the high-end institutions, literally in terms of dollars, something like $100 million per ms. He went on to talk about compiled languages compared to Java for this sort of processing. It was interesting to see John walk through a very high performance matching and reconciliation engine we had designed together for a client a few years ago, and it's exciting to hear that his new company will specialize in this area. John talked about some of the clever coding patterns that had to be implemented to provide linear scalability, and although master-worker was the pattern of choice for scaling, it wasn't as simple as just writing lots of workers.

John's second talk was new to me, and although we discussed the ideas in it in the past, it was fascinating to hear them presented. The room was packed -- standing room only -- as it was a topic near and dear to the hearts of many developers and architects: "The Enterprise without a Database". I thought this would just be an extension of caching, but John went on to emphasize the huge amounts of time and energy (human) being lost on Object-Relational Mapping (ORM). Why do we still persist our well-established Object-Oriented models into a relational database? While ORM is simple at the example level, it doesn't scale given the levels of complexity in today's messaging standards. John made this very clear by example. I got the feeling that he was holding back the solution, perhaps to be released by his new company, but it was clear that there are alternatives to ORM: from caching objects to using CLOBs in classic databases. This is obviously an area to watch, as John always has a good vision for this sort of thing.

At the end of the day I had the chance to have a beer with different people in a nice Mexican restaurant in Prague (courtesy of Jodie, the cameraman, and his local friend). After a few beers, mojitos and lots of peanuts (courtesy of John Davies :)), the topic of open source software (OSS) came up. I think that we all agreed that being open is a no-brainer, and that's the way software products should be built. Being open doesn't necessarily mean free - take Jive and Atlassian, for example. They sell commercial products, but they provide customers with the source code.

Another model is the dual-license model such as that used by Red Hat and MySQL. It's sometimes referred to as the Fedora model. It means that you have a choice to use a free version but if you do, you're on your own. If you choose to use the supported version, you're going to be charged a subscription, for which you'll get extra features and better packaging/documentation.

I argued that it is important to have a solid business model behind a product/project. It should be as important to the users as to the company developing the product. If a product doesn't have a solid business model, two things might happen: the project/product is going to be abandoned at some point due to lack of funding, or the owner of the product will change the licensing model to monetize the IP and established user base. We've seen both scenarios happening already.

I also argued that the Fedora model is usually successful only as part of a commoditization strategy. For example, JBoss's strategy was to go after the lower end of WebLogic/WebSphere accounts. The same applies to MySQL. This strategy seems to only work to a certain limit. I argued that this model is not proven in an emerging product category, where large investments in market education and innovation are required to achieve massive and sustainable adoption. In such cases, the Jive/Confluence model seems to be a better fit. Anyway, this topic is worth a separate discussion, so I'll leave it at that for now.

Unfortunately I had to leave on Friday (to be at my daughter's end-of-year party at school), so I missed Shay Banon's presentation. Based on what I heard it went very well. You can view Shay's presentation here.

Anyway, it was a really fun event and I look forward to next year.

June 16, 2008

Meet us @ TSSJS Prague

TheServerSide has a good record of picking nice spots for their conferences, and this year's Java Symposium in Prague is no exception.

It's looking to be a fun event, as I'm going to meet not just lots of old friends, but also the winners of our first OpenSpaces developer contest. I've already written about the contest and some of the submissions in a previous post. If you haven't already, check it out, as we decided to continue with the contest next year and use TSSJS as an opportunity for attendees to apply for "early bird scholarships" worth $1,000 each -- all this at an award ceremony we're holding in Prague during the show. Besides free booze and food at this event, we're going to show a nice video featuring the judges of the competition. Those who worked with Alit through the current competition will probably be happy to know that she is going to lead this part of the show together with John Davies, who was one of the judges.

I'm going to be there with some of my colleagues from GigaSpaces, namely Shay Banon and Uri Cohen.
My presentation is titled Getting ready for the cloud, but it really talks about the next wave in distributed computing, in which clouds play an important role and have the potential to change many of the things we used to do in the past.

Shay will be talking about some of the work that he's been doing with Mule, Lucene/Compass and Spring in his session Beyond Data Grids. I've seen him discussing some of these topics in Las Vegas this year, so I know it's going to be really interesting. Last time it sparked many questions about how clustering technologies can deal with scaling challenges, how in-memory data grids can replace or co-exist with traditional databases, and how they can be applied to different frameworks given real-life examples.

Uri is going to talk about his experience in building scalable web 2.0 applications using Ajax, Tomcat and Spring MVC, and running on the Amazon EC2 cloud. He will discuss specific patterns for dealing with Ajax scaling issues, and also provide patterns and tips for moving from a tier-based to a scale-out model based on recent work he's done with JBoss and, of course, GigaSpaces.

The TSS event is also going to be a good opportunity for us to expose some of the latest developments in our upcoming 6.5 release, such as the new Service Virtualization Framework (based on Spring Remoting), dynamic language support, extended support for Hibernate and enhanced database integration, built-in Maven support, support for Spring 2.5 annotations, and enhanced administrative and real-time monitoring.

Mule users will also benefit from our extensive support for the Mule ESB. We're also going to show some of the latest developments with EC2 and cloud computing environments. Even though TSS events tend to be Java-centric, I believe that Java users will be happy to learn about our interoperability among Java, C++ and .Net. For those unfamiliar with it, I would recommend giving it a closer look, as it provides a high-performance and extremely simple alternative that makes the language barrier pretty much obsolete.

There is much more to it than I can cover in this post. In fact, we realized that an entire post will not be enough to cover all the relevant content of our 6.5 release, so expect to see several dedicated posts in the coming weeks -- here and on the GigaSpaces Blog -- covering different aspects of new features, including some "behind-the-scenes" stories. Stay tuned!

April 30, 2008

Cool Projects on OpenSpaces.org

The OpenSpaces.org community site launched in January. I was surprised by the rapid adoption of OpenSpaces since then, with lots of interesting innovations on things I didn't even think of. I'm sure that some of the projects will be very useful to many OpenSpaces users. This shows the value behind an ecosystem and community. Given the right tools, people will start collaborating and sharing things that otherwise would be buried on their hard disks, or in their minds.

The OpenSpaces.org site also provides a great tool for GigaSpaces partners and individuals in the general developer community to expose their skills by publishing valuable content. A good example is GridDynamics, a GigaSpaces partner, who invested time and effort in producing high quality, well-documented projects.

The same goes for various people on the GigaSpaces team who came up with great ideas based on work that they did with customers. They use the OpenSpaces.org platform to share the tools they developed with other users in the community who might have similar needs. For example, the OpenSpaces demos project shows how to integrate Ajax, Spring MVC and OpenSpaces to scale a typical web application (market data front end, in this specific case).

Another good example is TGris, an extension of the testing grid framework that we use internally at GigaSpaces, and which several customers showed interest in for automating the testing of their own applications (note that the tool is not specific to OpenSpaces).

Another class of interesting projects is those that integrate OpenSpaces with various frameworks and APIs. These projects simplify the integration and adoption process, and shorten time-to-value. Good examples are the projects that provide integration for OpenSpaces/GigaSpaces with Amazon SimpleDB, JPA, and Memcached, as well as the Cache Integration project, which enables OpenSpaces/GigaSpaces support for many frameworks, such as Acegi Security, Cocoon, Jetty, iBatis, OpenJPA, Velocity and others.

Other people built entire functional applications, such as Leonardo Goncalves's GoDo - Goods Donation System (see details below), and Jim Liddle's MobileGSFeed, which provides a scalable solution for handling Atom feeds through the iPhone. Jim actually runs our Sales in the UK & Ireland. Never in my dreams did I imagine that OpenSpaces.org would be used by sales guys :-)

Anyway, I'm very pleased to let you know that we reached an important milestone for OpenSpaces two weeks ago when we reached the deadline of the developer contest. Fourteen candidates made it to the final stages. Only three will be finalists. A distinguished panel of judges interviewed each contestant. The judges are Adrian Colyer, CTO, SpringSource; Joe Ottinger, Editor, TheServerSide.com; John Davies; Julian Browne, Architecture Consultant, RWE; Keerat Sharma, Platform Engineer, Gallup; and Ross Mason, Co-founder and CTO, MuleSource.

All of the candidates put up a really good fight and made it very hard for the judges to reach their final decision. The winners of the contest will be announced in a nice venue in Prague during TheServerSide Java Symposium event. Stay tuned for updates on the exact date and venue here and on the GigaSpaces Blog and web site. We also intend to publish interviews with each of the finalist project owners and post them in a blog.

Here are some of the interesting projects (in alphabetical order). The full list of projects can be found here.

Please join one of the projects or start a new one yourself. If you already developed something, but are concerned about the time it will take to initiate a new project -- don't be! It is extremely easy and quick to start a new project and if you need any help, we're ready to support you.

 

 

 

 
