MinIO plugs its object storage into Snowflake

MinIO has followed Dell and Pure Storage by making its data stores available to Snowflake’s cloud data warehouse processing algorithms.

Satish Ramakrishnan, executive at MinIO, says in a blog that Snowflake has added external table support so Snowflake SQL compute can come to the data, so to speak, instead of customers having to select and move their data from an on-premises store into Snowflake’s cloud.

Ramakrishnan says this “will not change the locations where Snowflake runs – it will still run exclusively in the three major public clouds (AWS, GCP, Azure). It will, however, remove the requirement that all the data be stored in Snowflake for Snowflake to operate on it.”

If, for example, you have an object bucket in your MinIO store, you can configure it so that any SnowSQL command can access it, just as if it were a local Snowflake table. The data a query touches still has to travel up into the Snowflake cloud at query time, though. How long does that take?

“The time taken by the initial query will depend on the amount of data that is being transferred, but reads are cached and subsequent queries may be faster till the next refresh,” says Ramakrishnan. “Using the external table approach the data does not need to be copied and the bucket can be used as an external table for queries, joins, etc.”

The benefits, he claims, are:

  • It extends the capabilities of the warehouse without incurring the cost of the move
  • The ability to run analysis on real-time data is now available
  • Moving data only to run an ad hoc query can be completely avoided
  • Analysis is possible in instances when the data cannot be moved for compliance or other business reasons
  • You still get all the advantages of the Snowflake capabilities with the same resources who are already familiar with the Snowflake platform

His post contains MinIO CLI code examples to accomplish this:

[Image: MinIO Snowflake external table code]
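As a rough illustration of the kind of statements involved (this is not a reproduction of Ramakrishnan's listing), the Python sketch below assembles the Snowflake SQL for a stage pointing at an S3-compatible endpoint, an external table on top of it, a manual refresh and a test query. The bucket, endpoint, credentials, and the stage and table names are all hypothetical placeholders, and the `s3compat://` URL scheme and `ENDPOINT` parameter reflect our understanding of Snowflake's S3-compatible storage support, so check them against the current Snowflake documentation before relying on them.

```python
def external_table_statements(bucket, endpoint, key_id, secret_key,
                              stage="minio_stage", table="minio_ext_table",
                              file_format="PARQUET"):
    """Build illustrative Snowflake SQL for querying a MinIO bucket in place.

    All names and values are hypothetical. Inputs are interpolated directly
    into the SQL text, so this is only suitable for trusted, hand-typed values.
    """
    # A stage tells Snowflake where the bucket lives and how to authenticate.
    create_stage = (
        f"CREATE OR REPLACE STAGE {stage}\n"
        f"  URL = 's3compat://{bucket}/'\n"
        f"  ENDPOINT = '{endpoint}'\n"
        f"  CREDENTIALS = (AWS_KEY_ID = '{key_id}' AWS_SECRET_KEY = '{secret_key}');"
    )
    # The external table exposes the staged files to SQL without loading them.
    create_table = (
        f"CREATE OR REPLACE EXTERNAL TABLE {table}\n"
        f"  WITH LOCATION = @{stage}\n"
        f"  FILE_FORMAT = (TYPE = {file_format})\n"
        f"  AUTO_REFRESH = FALSE\n"
        f"  REFRESH_ON_CREATE = TRUE;"
    )
    # With automatic refresh off, new objects are picked up by a manual refresh.
    refresh = f"ALTER EXTERNAL TABLE {table} REFRESH;"
    # Once defined, the table can be queried or joined like any other.
    query = f"SELECT COUNT(*) FROM {table};"
    return [create_stage, create_table, refresh, query]


# Placeholder values only; substitute a real bucket, endpoint and keys.
statements = external_table_statements(
    bucket="sales-data",
    endpoint="minio.example.com",
    key_id="<access-key>",
    secret_key="<secret-key>",
)
for statement in statements:
    print(statement, end="\n\n")
```

The printed statements could then be pasted into a SnowSQL session or saved to a file and run from there. Nothing in the sketch talks to Snowflake or MinIO itself; it only generates the text.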

Despite the Dell and Pure precursors, “MinIO can become the global datastore for Snowflake customers – wherever their data sits.” This is because “MinIO is a high-performance, cloud-native object store. It will work on all flavors of Kubernetes (upstream, EKS, AKS, GKE, etc.) as well as on virtual machines like the public cloud VMs, VMWare, etc. and on bare-metal hardware.”

The Dell and Pure alternatives work on those suppliers’ own hardware too. 

Ramakrishnan also claims there are limitations in other object stores accessible through Snowflake’s S3 endpoint support: “While Snowflake will support S3 endpoints (which naturally includes other object stores), those object stores are not capable of running in all of the places that an enterprise keeps its data… To achieve a consistent, data anywhere strategy, enterprises will need to adopt MinIO.”

A Snowflake document explains how to download and install SnowSQL, the command-line client used to set up external tables on a MinIO store.