The appearance of the first commercial Cloud Computing platforms represented a significant step towards realizing the vision of utility computing.
However, the promise of infinite scalability that drives much of the recent interest in Cloud Computing is still threatened by one major pitfall: the lack of programming paradigms and abstractions capable of bringing the power of distributed programming into the hands of ordinary programmers, sheltering them from the complexity of developing systems deployed over large-scale, elastic cloud platforms.
A crucial issue that we have tackled in the Cloud-TM project has been developing innovative mechanisms and abstractions aimed at ensuring adequate consistency levels while being:
1. simple and familiar for the programmers
2. highly efficient and scalable
3. fault-tolerant and highly available.
Decades of research and field experience in this area have led to the development of a plethora of different approaches to ensuring state consistency in distributed platforms, and have taught a fundamental, general lesson: no universal, one-size-fits-all solution exists, as the efficiency of individual state management approaches is strongly affected by both:
1. the characteristics of the incoming workload, such as the ratio of read/write operations, as well as the spatial/temporal locality in the data access patterns, and
2. the scale of the system (e.g. low vs high number of nodes, local vs geographical distribution) on which these mechanisms are deployed.
This problem is therefore particularly acute in cloud computing platforms, precisely because of the feature that is regarded as one of the key advantages of the cloud: its ability to elastically acquire or release resources, varying the scale of the platform in real time to meet the demands of varying workloads.
The Cloud-TM approach
The Cloud-TM project addressed these issues by building a highly innovative data-centric middleware platform. The Cloud-TM platform is designed from the ground up to meet the scalability and dynamicity requirements of cloud infrastructures, while providing intuitive yet powerful abstractions aimed at masking complexity and allowing ordinary programmers to unlock the potential of large-scale Cloud platforms.
Most cloud computing infrastructures embrace weak consistency models that achieve scalability at the cost of increased complexity for programmers. This leads to a significant growth in software development costs and time to market, ultimately hindering competitiveness.
Conversely, Cloud-TM adopts an intuitive yet scalable programming paradigm. The Cloud-TM programming paradigm integrates the familiar abstraction of the atomic transaction as a first-class programming construct, sheltering programmers from having to deal with the idiosyncrasies of weak consistency models. Strong consistency and scalability, two properties often seen as antagonists, are reconciled thanks to innovative transactional consistency schemes designed precisely to meet the scalability and elasticity requirements of typical cloud infrastructures.
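To make the idea of an atomic transaction as a programming construct more concrete, here is a minimal single-node sketch in Java. It is not the Cloud-TM API: the class ToyTransactionalStore, its atomic and read methods, and the transfer example are hypothetical names introduced only for illustration, and a single lock stands in for the distributed consistency schemes of the real platform. The point it illustrates is the all-or-nothing behaviour the programmer relies on.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

/** Toy, single-node illustration of an atomic transaction; NOT the Cloud-TM API. */
class ToyTransactionalStore {
    private final Map<String, Integer> committed = new HashMap<>();

    /** Hypothetical view handed to the transaction body; reads see the body's own writes. */
    final class Tx {
        private final Map<String, Integer> writes = new HashMap<>();

        int get(String key) {
            Integer v = writes.containsKey(key) ? writes.get(key) : committed.get(key);
            return v == null ? 0 : v; // missing keys read as 0 in this toy
        }

        void put(String key, int value) {
            writes.put(key, value); // buffered until commit
        }
    }

    /** Runs body atomically: either all buffered writes are applied, or none if it throws. */
    synchronized void atomic(Consumer<Tx> body) {
        Tx tx = new Tx();
        body.accept(tx);              // an exception here discards the buffered writes
        committed.putAll(tx.writes);  // commit point
    }

    synchronized int read(String key) {
        return committed.getOrDefault(key, 0);
    }

    /** Example business operation written as one atomic block. */
    static void transfer(ToyTransactionalStore store, String from, String to, int amount) {
        store.atomic(tx -> {
            tx.put(from, tx.get(from) - amount);
            if (tx.get(from) < 0) {
                throw new IllegalStateException("insufficient funds");
            }
            tx.put(to, tx.get(to) + amount);
        });
    }

    public static void main(String[] args) {
        ToyTransactionalStore store = new ToyTransactionalStore();
        store.atomic(tx -> { tx.put("alice", 100); tx.put("bob", 0); });
        transfer(store, "alice", "bob", 30);
        System.out.println(store.read("alice") + " " + store.read("bob")); // prints "70 30"
    }
}
```

In this sketch a transfer that would overdraw the source account throws inside the atomic block, so neither balance changes. The application code never has to reason about partially applied updates, which is the convenience that weak consistency models do not offer.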
Beyond transactional consistency, the Cloud-TM programming model provides transparent support for object orientation and queries, concurrency-friendly data structures and frameworks to control distributed execution of tasks, hiding issues such as fault-tolerance, load distribution and data placement.
Finally, Cloud-TM aims to minimize the other major source of costs for cloud-based applications, namely operational costs, in a twofold way:
Overview of the Cloud-TM Platform
The high-level architecture of the Cloud-TM Platform is depicted in the following figure. It comprises two main parts: the Data Platform and the Autonomic Manager.
Data Platform. The Data Platform is responsible for storing, retrieving and manipulating data across a dynamic set of distributed nodes, elastically acquired from the underlying IaaS Cloud provider(s).
The Data Platform Programming APIs have been designed to simplify the development of large-scale, data-centric applications deployed on cloud infrastructures. They include the Object Grid Mapper, the Search API and the Distributed Execution Framework.
To this end, the programmatic interfaces offered by the Cloud-TM Data Platform allow developers to:
Lower in the stack we find a highly scalable, adaptive In-memory Distributed Transactional Key-Value Store/Distributed Transactional Memory (DTM), which represents the backbone of the Cloud-TM Data Platform. In order to maximize the visibility, impact and future exploitation of the results of the Cloud-TM project, the consortium agreed to use Red Hat's Infinispan as the starting point for developing this essential component of the Cloud-TM Platform. Throughout the project, Infinispan has been extended with innovative data management algorithms (in particular for data replication and distribution), as well as with real-time self-reconfiguration schemes aimed at guaranteeing optimal performance even in highly dynamic cloud environments.
Autonomic Manager. The Autonomic Manager is the component in charge of the self-tuning of the Data Platform. In the Cloud-TM Platform, self-optimization is a pervasive property that is pursued across multiple layers of the platform.
Specifically, the Cloud-TM Platform leverages a number of complementary self-tuning mechanisms that aim to automatically optimize, on the basis of user-specified Quality of Service (QoS) levels and cost constraints, the following functionalities/parameters:
The following figure illustrates an example scenario highlighting the self-optimizing capabilities of the Cloud-TM platform. Depending on the current workload characteristics, Cloud-TM can autonomously acquire or release resources from the Cloud, and adjust, in a transparent manner, its internal consistency mechanisms to maximize performance and efficiency.
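As a rough illustration of this kind of workload-driven decision, the following Java sketch computes a target cluster size from the observed load and picks a replication mode from the read/write mix. Everything in it is an assumption made for illustration: the class ToyAutonomicPolicy, the two mode labels, the capacity-per-node model and the 0.5 write-ratio threshold are hypothetical and are not taken from the Cloud-TM Autonomic Manager.

```java
/** Toy self-tuning policy; names, model and thresholds are illustrative assumptions. */
class ToyAutonomicPolicy {
    /** Hypothetical labels for two internal consistency configurations. */
    enum ReplicationMode { READ_OPTIMIZED, WRITE_OPTIMIZED }

    /**
     * Number of nodes needed to serve the load, assuming each node sustains
     * capacityPerNode requests per second, clamped to [minNodes, maxNodes].
     */
    static int targetNodes(double requestsPerSec, double capacityPerNode,
                           int minNodes, int maxNodes) {
        int needed = (int) Math.ceil(requestsPerSec / capacityPerNode);
        return Math.max(minNodes, Math.min(maxNodes, needed));
    }

    /** Picks a mode from the observed read/write mix (0.5 is an arbitrary threshold). */
    static ReplicationMode chooseMode(long reads, long writes) {
        long total = reads + writes;
        double writeRatio = total == 0 ? 0.0 : (double) writes / total;
        return writeRatio > 0.5 ? ReplicationMode.WRITE_OPTIMIZED
                                : ReplicationMode.READ_OPTIMIZED;
    }

    public static void main(String[] args) {
        // 4500 req/s at 1000 req/s per node needs 5 nodes (within the 2..10 budget).
        System.out.println(targetNodes(4500, 1000, 2, 10));
        // 90 reads vs 10 writes is a read-dominated workload.
        System.out.println(chooseMode(90, 10));
    }
}
```

A real controller would base such decisions on performance models and on the user-specified QoS and cost constraints rather than on fixed thresholds. The sketch only shows the shape of the decision: workload measurements in, scale and configuration out.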
The YouTube channel of the project contains several videos demonstrating a number of features of the Cloud-TM platform.
The Open Source Way
Since the early stages of the project, the academic partners have worked in close collaboration with Red Hat, the leading company in the open-source software arena. This has made it possible to integrate a number of innovative solutions into highly visible open source projects, such as Infinispan, JGroups, Hibernate Search and Hibernate OGM.
The choice of embracing open source, and the integration of the best-of-breed research results in popular Red Hat projects, have strongly amplified the impact and visibility of the project's achievements, and paved the way for their immediate industrial exploitation.
The choice of open source also means that the Cloud-TM platform is freely available to the broad community of SMEs that find cloud computing a highly attractive model, not only from an economic perspective (thanks to its advantageous pay-only-for-what-you-use billing scheme), but also because of its simplicity and scalability.