Iguazio emits storage-integrated parallelised, real-time AI/machine learning workflows

face popping out of computer chip

Workflow-integrated storage supplier Iguazio has received $24m in C-round funding and announced its Data Science Platform. This is deeply integrated into AI and machine learning processes, and accelerates them to real-time speeds through parallel access to multi-protocol views of a single storage silo using data container tech.

The firm said digital payment platform provider Payoneer is using it for proactive fraud prevention with real-time machine learning and predictive analytics.

Real-time fraud detection

Yaron Weiss, VP Corporate Security and Global IT Operations (CISO) at Payoneer, said of Iguazio’s Data Science Platform: “We’ve tackled one of our most elusive challenges with real-time predictive models, making fraud attacks almost impossible on Payoneer.”

He said Payoneer had built a system which adapts to new threats and enables is to prevent fraud with minimum false positives.  The system’s predictive machine learning models identify suspicious  fraud and money laundering patterns continuously.

Weiss said fraud was detected retroactively with offline machine learning models; customers could only block users after damage had already been done. Now it can take the same models and serve them in real time against fresh data.

The Iguazio system uses a low latency serverless framework, a real-time multi-model data engine and a Python eco-system running over Kubernetes. Iguazio claims an estimated 87 per cent of data science models which have shown promise in the lab never make it to production because of difficulties in making them operational and able to scale.

Data containers

It is based on so-called data containers that store normalised data from multiple sources; incoming stream records, files, binary objects, and table items. The data is indexed,  and encoded by a parallel processing engine. It’s stored in the most efficient way to reduce data footprint while maximising search and scan performance for each data type.

Data containers are accessed through a V310 API and can be read as any type regardless of how it was ingested. Applications can read, update, search, and manipulate data objects, while the data service ensures data consistency, durability, and availability.

Customers can submit SQL or API queries for file metadata, to identify or manipulate specific objects without long and resource-consuming directory traversals, eliminating any need for separate and non-synchronised file-metadata databases.

So-called API engines engine uses offload techniques for common transactions, analytics queries, real-time streaming, time-series, and machine-learning logic. They accept data and metadata queries, distribute them across all CPUs, and leverage data encoding and indexing schemes to eliminate I/O operations. Iguazio claims this provides magnitudes faster analytics and eliminates network chatter.

The Iguazio software is claimed to be able to accelerate the performance of tools such as Apache Hadoop and Spark by up to 100 times without requiring any software changes.

This DataScience Platform can run on-premises or in the public cloud. The Iguazio website contains much detail about its components and organisation.

Iguazio will use the $24m to fund product innovation and support global expansion into new and existing markets. The round was led by INCapital Ventures, with participation from existing and new investors, including Samsung SDS, Kensington Capital Partners, Plaza Ventures and Silverton Capital Ventures.