Hazelcast’s streaming data in-memory software features hopping, tumbling, and tiers

Hazelcast has added more SQL streaming data capabilities and tiering to its in-memory data grid software so that real-time and older information can be queried simultaneously.

The company stores large amounts of data in memory so it can be accessed, processed, and analysed much faster than by sequentially reading it from SSDs or disk drives. It unified the IMDG (in-memory data grid) and Jet streaming data processing software last year to make the v5.0 Hazelcast Platform. The tiering means that older data can be automatically tiered out to SSD initially and later to disk or the public cloud.

Manish Devgan, Hazelcast’s chief product officer, said in a statement: “When Hazelcast announced its platform last year, the ability to merge real-time data with historical context opened new possibilities to deliver the right offer or insights to the end-user at the right time. By being able to work with datasets at scale within the same data platform, businesses can now enable even better outcomes in a much shorter window.”

Hazelcast says that enterprises need to operate in a real-time economy, going beyond batch processing and into a state of continuous processing of data as it’s created. That requires a real-time data platform that combines stream processing with in-memory latencies, can operate anywhere, and can pull data from any source, including databases, data lakes, and data warehouses.

The latest v5.1 Hazelcast Platform release includes streaming aggregation over fixed, tumbling, and hopping windows, additional SQL expressions, improved JOIN support and better performance. It also adds SQL support for JSON, so that enterprises can store and query data in this format when adding real-time processing capabilities to their applications.

With hopping windows the analytics function hops forward across a dataset by a fixed time period and the windows can overlap. Fixed windows don’t move. Tumbling windows move forward repeatedly by fixed time periods and don’t overlap. JSON (JavaScript Object Notation) is a JavaScript-derived data format used for data transfers between web applications and servers.
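To make the difference between tumbling and hopping windows concrete, here is a minimal Python sketch of how a single event timestamp maps to windows. It is an illustration only and not Hazelcast's implementation or API: the function names, the use of integer seconds, and the alignment of window starts to multiples of the hop are all assumptions made for this example.

```python
from typing import List, Tuple

# A window is a half-open interval [start, end), in seconds.
Window = Tuple[int, int]


def tumbling_windows(ts: int, size: int) -> List[Window]:
    """Return the single tumbling window that contains timestamp ts.

    Tumbling windows are back to back and never overlap, so every
    event falls into exactly one of them.
    """
    start = (ts // size) * size
    return [(start, start + size)]


def hopping_windows(ts: int, size: int, hop: int) -> List[Window]:
    """Return every hopping window that contains timestamp ts.

    Window starts are assumed to be aligned to multiples of hop.
    When hop < size the windows overlap, so one event can belong
    to several windows.
    """
    windows = []
    start = (ts // hop) * hop  # latest aligned start at or before ts
    while start > ts - size:   # window [start, start + size) still covers ts
        windows.append((start, start + size))
        start -= hop
    return sorted(windows)


if __name__ == "__main__":
    # An event at t=12s with 10-second windows.
    print(tumbling_windows(12, 10))    # one window
    print(hopping_windows(12, 10, 5))  # two overlapping windows
```

With a hop equal to the window size, the hopping case reduces to the tumbling case, which is one way to see that tumbling windows are the non-overlapping special case.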

v5.1 adds the ability to create views and indexes, and to run explain plans in SQL for troubleshooting. It also improves availability by reducing the need for maintenance downtime.

Hazelcast would have us believe that its v5.1 software enables enterprises to build applications that take automated, immediate action on data, without the wait times associated with database writes and human intervention. The apps have, as it were, greater contextual awareness.

A blog post by Hazelcast’s product management VP, David Brimley, discusses the v5.1 software release. He writes: “Our goal is for the Hazelcast Platform to become the last-mile data storage and processing layer for real-time applications, be they transactional, batch, or streaming. Incorporating data processing, data storage and data integration in one distributed, easy to scale, highly available cluster that is capable of handling terabytes of data and is deployable anywhere.”

This has much in common with Imply and Apache Druid, where the combination of real-time streaming and historical data also plays an important role.

Hazelcast Platform 5.1 is generally available via Hazelcast Cloud or as software to be deployed on-premises or within customers’ AWS, Azure or GCP environments. The tiered storage feature is currently in beta and will be generally available for production use in a v5.2 version of the Hazelcast Platform.