Infinidat has decided to use Arrow Electronics to manufacture its InfiniBox-based storage arrays.
Arrow Electronics provides global product manufacturing services as well as being a value-added distributor. It says customers can invest their working capital elsewhere in their business with Arrow’s full product manufacturing support. Arrow supplies the components and subcontracts the manufacturing to an established network of qualified, global Electronics Manufacturing Services partners. Arrow is huge – it had sales of $37 billion in 2022.
Phil Bullinger.
Infinidat CEO Phil Bullinger explained in a statement: “Our collaboration with Arrow spans across our business from manufacturing and fulfillment services to global commercial distribution.”
The deal is, he said, “accelerating our capabilities to deliver compelling business and technical value to enterprise customers globally with leading cyber storage resilience, storage consolidation, autonomous automation, and a powerful ROI.”
Prior to this arrangement, Infinidat operated with a local contract manufacturing partner in Israel. Bullinger told us “the global transition of these services to Arrow will expand our production operations with much greater purchasing power to multiple manufacturing sites in two different continents, including in the United States with closer proximity to our largest customer base. This will give us the ability to balance manufacturing activity and respond faster to global customer demand.”
It’s generally reckoned that manufacturing in-house entails significant working capital resources, management oversight, and supply chain overhead. A specialist contract manufacturer can combine sub-contracts into a large-scale manufacturing operation and gain efficiencies denied to a smaller-scale manufacturer.
The pandemic has caused severe supply chain difficulties across the IT industry and Arrow will be better placed to navigate them than Infinidat – particularly with global supply chains involved.
Salesh Rampersad, president of Arrow’s intelligent solutions business, provided his thoughts: “This collaboration is a testament to Infinidat’s commitment to providing reliable and innovative solutions to the industry and showcases Arrow’s integration services and global supply chain capabilities.”
Contract manufacturers can typically produce products at higher volume and lower cost than a business like Infinidat. We asked Bullinger: “What savings does Infinidat anticipate?”
He replied: “We expect to realize an overall positive business impact driven by more efficient upstream supply chain and global logistics, leveraging Arrow’s broader scale and capabilities. This significant expansion of our relationship with Arrow will help power our continued growth and expanding scale in the market.”
Generally a contract manufacturer does not reveal who its customers are, so Infinidat’s public deal with Arrow is unusual in that regard. Qumulo, Silk and VAST Data have each exited the hardware business recently, but Infinidat’s Arrow Electronics deal is not a step toward a similar exit.
Arrow will remain Infinidat’s primary commercial distributor globally, through its enterprise computing solutions business, and Infinidat will utilize Arrow’s intelligent solutions business for streamlined supply chain management, integration, and global logistics.
Profile: French startup Inspeere believes a backup stored across six to nine peer sites in encoded and compressed fragments is effectively impervious to ransomware attacks and can meet sovereignty requirements as well as being green.
Inspeere’s technology is based on computer science research and a patent by University Cote d’Azur professor Olivier Dalle. He left academia after 20 years to co-found Inspeere with French serial entrepreneur Michaël Ferrec in 2018. Inspeere was initially self-funded by its founders, then helped by French institutions, and has just received a €600,000 ($647,240) investment from the Business Angels of the Grandes Écoles, Defense Angels, Arts et Métiers Business Angels, the NACO fund managed by M/Capital, BPI France, Airbus Développement and the University of Côte d’Azur.
The background is that organizations should back up their data using the 3-2-1 principle – meaning three copies of the data on two different media types with one copy stored externally. But a backup copy is inherently vulnerable to malware attacks. If, instead, the external copy is split into fragments which are encoded and compressed and stored on different systems, then an attacker would have to recombine, decode and decompress all the fragments to capture the data.
Dalle presented Inspeere’s Datis software product to an IT Press Tour audience in Madrid. It uses blockchain-based (distributed ledger technology) peer-to-peer technology to accomplish this, holding a local copy and distributing the external copy among six to nine peer systems. Each member of the peer group provides disk space for the fragment copies and uses the other peer group members as recipients of its own encoded, compressed and fragmented backup files.
Essentially, Datis is a peer-to-peer storage system marketed as a backup target. Each peer has an x64 processor and runs Debian Linux with a ZFS filesystem. An incoming backup file is received by the system, then compressed, encoded and split into fragments. The ZFS read-only snapshot and replication capability is used to distribute the fragments to the peer systems with Reed-Solomon error correction.
This can be configured as, say, four data nodes with two error-correcting parity nodes, such that any four of the six peer systems can reconstitute the whole file – thus protecting against peer system and fragment loss. Recovery from a permanent peer failure resembles a RAID rebuild process.
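The arithmetic behind this fragment-and-parity scheme can be sketched in a few lines. The illustration below is not Inspeere’s code: it uses a single XOR parity fragment (four data fragments plus one parity), so it tolerates the loss of any one fragment, whereas the Reed-Solomon 4+2 configuration described above tolerates the loss of any two.

```python
from functools import reduce

def split_with_parity(data: bytes, k: int = 4):
    """Split data into k equal-size fragments plus one XOR parity fragment."""
    pad = (-len(data)) % k                     # pad so length divides evenly
    padded = data + b"\x00" * pad
    size = len(padded) // k
    frags = [padded[i * size:(i + 1) * size] for i in range(k)]
    # Parity byte = XOR of the corresponding byte in every data fragment
    parity = bytes(reduce(lambda x, y: x ^ y, col) for col in zip(*frags))
    return frags, parity, len(data)

def reconstruct(frags, parity, orig_len: int, lost: int) -> bytes:
    """Rebuild the fragment at index `lost` from the survivors, then reassemble.

    frags[lost] is ignored, simulating a failed peer."""
    survivors = [f for i, f in enumerate(frags) if i != lost] + [parity]
    rebuilt = bytes(reduce(lambda x, y: x ^ y, col) for col in zip(*survivors))
    whole = list(frags)
    whole[lost] = rebuilt
    return b"".join(whole)[:orig_len]          # strip the padding
```

A backup split this way can be restored even if one peer disappears, e.g. `reconstruct(frags, parity, n, lost=2)` returns the original bytes with fragment 2 missing. Real Reed-Solomon coding generalizes the same XOR idea over a Galois field to support multiple parity fragments.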
Dalle explained: “ZFS does the compression and encoding and replication. We add the splitting.” Datis itself has no access to customers’ data and its backup style is incremental forever.
Inspeere has tweaked ZFS so that a local ZFS system is actually based on dispersed and independent ZFS peers. It has technology to divide traffic streams in a network (SPLIT-IT), dynamic load balancing (SAVVY), and bandwidth optimization (DATASMOOTH).
A receiving peer site treats stream slices as files. Restoration means the peer group replicates slices back to the originator – just enough to rebuild the data, not all the distributed slices. So a restore can take less time than the original backup.
Datis can enforce sovereignty, as peers’ geo locations can be restricted.
Dalle said Inspeere can support S3 and does support UrBackup, which is popular in universities.
The startup has around 80 customers and is edge-site focussed. It says that, as peers support each other, no actual datacenter is needed to store the backup. Therefore Datis, compared to other datacenter-using backup systems, uses less energy in its operation.
“We target small vertical market communities – like schools,” Dalle said, adding: “We want to move up market to mid-market companies. We will need to be able to back up NetApp filers.” It is working through reselling partners and has one MSP. It may expand that channel.
We have here a small French backup target supplier with unique technology that is related to the web3, distributed storage technology used by Storj and Cubbit. Although it uses blockchain technology, it’s not into crypto and is quite happy using euros. Pricing for small customers starts at €100/month.
Commissioned: As the curtain falls on 2023, IT organizations are looking toward the new year with a mix of renewed enthusiasm and cautious optimism. The enthusiasm stems from the arrival of generative AI services a year ago.
Generative AI (GenAI) has emerged as perhaps the biggest productivity booster for knowledge work since the proliferation of word processing and spreadsheet software in the 1990s. In elevating customer operations, sales and marketing and software engineering, GenAI could add up to $4.4 trillion annually in productivity value to the global economy, according to McKinsey Global Institute.
The cautious optimism comes from IT leaders’ opportunity for modernizing corporate datacenters to accommodate GenAI and other data-hungry workloads. It’s complex work but executed well it can help organizations improve application performance and drive operational efficiency while curbing costs.
Without further ado, check out these IT management trends for 2024.
GenAI drives workload placement decisions
GenAI can shave hours off tasks that knowledge workers complete each day to do their jobs, potentially transforming industries. This is a big reason 52 percent of IT organizations are already building or deploying GenAI solutions, according to a Generative AI Pulse Survey conducted by Dell Technologies earlier this year.
In 2024, GenAI will accelerate workload placement trends, with organizations reckoning with how and where to run large language models (LLMs) that fuel their GenAI applications. Some IT decision makers will choose public services.
Bringing AI to their data in this way will help organizations dictate security policies and access, create guardrails that reduce reputational risk, and enjoy cost efficiencies.
Multicloud management becomes more seamless
Craving flexibility, more organizations will further abstract software functions from operating environments to run workloads in locations of their choosing.
These approaches are part of a broader trend of trying to manage multicloud environments as seamless systems. This means treating the entire infrastructure estate as one entity to deliver greater operational efficiency and business value.
Edge operations consolidation
Infrastructure that supports edge environments has historically been highly fragmented, with organizations stitching together solutions they hope will keep applications running in near real time. Reducing latency has also been a big bugbear.
In 2024, you’ll see more organizations embrace edge operations approaches that simplify, optimize and secure deployment across complex multicloud estates, ensuring better uptime and service.
Most IT staff are as comfortable with public cloud experiences as they are with one-day shipping. Both offer agility and rapid service. Yet it’s also true that most organizations are weary of wrangling cloud services from different vendors, as well as the unpredictable costs associated with consuming said services.
In 2024, more organizations will seek to enjoy the same pay-as-you-go subscription models for infrastructure services but delivered on premises to their datacenters or colos of their choosing.
Such as-a-Service solutions balance flexibility with control, helping IT leaders pay only for what they require to run their business. This will help curb rising costs associated with resource-intensive workloads – such as GenAI and HPC apps – while affording IT more control over how it consumes compute and storage.
Multicloud-by-Design will evolve
Over the years, organizations have watched their applications sprawl across a number of operating locations, based on requirements for performance, latency, security, data portability or even whims.
As such, most IT organizations run apps on premises, in public and private clouds, in colos and at the edge – a kind of de facto multicloud estate. The location variance will grow significantly, with 87 percent of organizations expecting their application environment to become even more distributed over the next two years, according to a report – Unlocking the Power of Multicloud with Workload Optimization – published by the Enterprise Strategy Group in May 2023.
In 2024, more IT leaders will build multicloud-by-design estates, or intentionally constructed architectures intended to improve application performance and operational efficiency. This will also help meet regulatory requirements, control and secure assets, and optimize costs.
Also: Given the large volumes of data they create, GenAI apps will have an outsize influence over how IT leaders design their infrastructure, including shepherding staff as they build and train LLMs.
The key takeaway
You may have noted that GenAI is the thread woven throughout these trends. In fact, the most disruptive force in technology in 2023 will remain the hottest workload in 2024.
IT leaders will have some critical decisions to make about what GenAI applications they run, as well as whether to operate them internally, externally or across multiple locations.
This will require careful consideration of the compute and storage, as well as the architecture that will situate and run them. A multicloud-by-design approach to IT infrastructure provides a smart, responsible path. And a trusted partner can light the way along their journey.
A briefing from Nasuni revealed that the cloud file services supplier’s growth is continuing unabated and that its deals are getting larger.
Jim Liddle.
B&F met Jim Liddle, Nasuni’s chief innovation officer and the ex-CEO and founder of Storage Made Easy – the UK startup acquired by Nasuni in mid-2022. Nasuni is one of four enterprise cloud file services suppliers – the other three being CTERA, Egnyte and Panzura. The common factors are a cloud (or on-prem) object storage base, with file services made available to edge sites and users on a file sync ‘n’ share foundation, plus lots of services on top – such as ransomware attack detection and recovery.
Nasuni passed the $100 million ARR mark in January this year and Liddle told us: “We’re still growing at the same rate.”
There were about 750 Nasuni customers at the start of the year and the number has risen to more than 800. Liddle told us: “Nasuni has more than 4,300 AEC locations worldwide” – AEC being the Architecture, Engineering and Construction market. Egnyte reported it had more than 3,000 AEC customers in September, when it claimed more than 17,000 enterprise customers.
Liddle explained that Nasuni was “continuing to see seven-figure deals and has signed an eight-figure deal in the last 30 days.” That’s a deal worth between $10 million and just under $100 million, and it’s one of the largest in Nasuni’s history. This comes after Nasuni hired a CRO, Pete Agresta, in January, with a brief to grow its enterprise business. Agresta was Pure Storage’s VP for enterprise sales.
The sales effort has been strengthened with the appointment of Matthew Grantham as head of worldwide partners, and Curt Douglas as VP of sales for its western region.
Liddle noted that Nasuni is replacing NetApp in some deals and Panzura in others. It does not meet CTERA or Egnyte in competitive bids, suggesting that these two companies operate in different market sectors.
The surge in the AI market, led by generative AI, is encouraging organizations to take a more encompassing view of their data assets so that they can be organized and curated for presentation to AI analysis and processing. They have data visibility challenges, with one contact telling Liddle: “I can’t even see past the shadow cast by the data mountain.”
He reckons that some of the biggest obstacles are data silos: “Large organizations have hundreds of NAS devices and file servers and other legacy systems implemented during the pandemic. They need to consolidate in order to get to a single source of truth and to turn data into an opportunity vs just a cost. AI is proving to be a catalyst for this change.”
“We have focused on a strategy for companies to combat these challenges and be able to leverage their unstructured file data for AI – to be announced in Q1 2024 with complimentary tools and services.”
This involves making it so that customers “can easily integrate Nasuni stored data into their AI pipelines/workflow. The destination and resources that underpin that can be decided upon by the company.” For example, “Nasuni’s global file system can be accessible through a single cloud (or on-premises) located edge device that can be accessed to facilitate the data required for an AI pipeline process.”
Comment
Our supposition is that, given continued growth in 2024, Nasuni could be considering an IPO for the 2025 period.
Data protection supplier Acronis has been highlighted in Frost & Sullivan’s Frost Radar report for Endpoint Security as one of the top companies in the space. The report highlights the innovation and growth potential of Acronis, and why it should be strongly considered by organizations looking to invest in or augment the protection of their endpoints. Get a copy of the Acronis section of the report here.
GigaOm will enjoy Frost & Sullivan copying its Radar concept.
…
Cohesity announced that, according to IDC’s Semiannual Software Tracker for Data Replication and Protection Software, it had the fastest year-over-year worldwide revenue growth among the Data Replication and Protection Software market’s top ten largest competitors in the first half of calendar 2023. The IDC report is not publicly available.
…
Commvault has appointed Michel Borst, based out of Singapore, as Area VP for Asia and Joanne Dean, based in Perth, as Area VP for Channels and Alliances, APAC, expanding its Asia Pacific leadership team.
Appliance-based deployments, mostly integrated appliances, will continue to see growth;
Flash-based backup storage will increasingly replace hard drive capacity;
Appliances with value-added services will see increasing adoption.
The listed representative vendors are: Acronis, Arcserve, Cohesity, Commvault, Dell, ExaGrid, HPE, Infinidat, Object First, Pure Storage, Quantum, Rubrik and Veritas. Gartner suggests organizations can use a Market Guide to understand how the status of an emerging market aligns with future plans. Seeing Gartner call the backup target market an “emerging market” is unintentionally funny – the thing has been “emerging” for almost 20 years, ever since Data Domain launched its first system in 2004.
…
Fivetran announced that its data integration platform is being utilized by residential furniture supplier La-Z-Boy to break down data silos and accelerate data-driven decision-making across the organization.
Nasuni supports over 800 enterprise customers, including numerous Fortune 500 enterprises, in over 70 countries. Last year, it surpassed $100 million in annual recurring revenue (ARR) and continued its growth in 2023. Get a report copy here.
…
David Bennett, CEO at Object First, made two 2024 predictions. (1) In 2024, immutable backups will be a requirement for companies covered by cyber insurance. Cyber insurance underwriters will bring reality to the market. The average ransomware demand increased by 74% this year and cyber claims have already jumped 12%. In 2024, cyber insurers will have no choice but to raise premiums – a lot – in order to reel in losses, or take it upon themselves to advocate for better cyberattack preparation amongst their customers.
(2) Investments in data recovery and resiliency will increase massively in 2024. As immutability and object storage are elevated to the security stack, we will see a massive increase in data recovery and resiliency. Companies will realize they’re in over their head when it comes to ransomware and effectively protecting their data, and will look to their channel partners and vendors for help. The answer many channel partners/vendors will come back with is to focus on simple and secure backup storage solutions.
…
Quantum announced that MR Datentechnik, a German IT solutions and managed services provider, has implemented Quantum ActiveScale object storage as the foundation of its new MR S3 Storage Service for backup and recovery, archiving, and data security. MR Datentechnik wanted to support S3 applications and workflows for integration with its cloud storage solution, and to have the latest version of Veeam Backup & Replication supported.
…
Scality says Spanish VAD V-Valley, a subsidiary of the Esprinet Group, will distribute Scality’s RING and ARTESCA software-defined object storage products in Spain.
…
Seagate has launched a 24TB SkyHawk AI surveillance disk drive based on its 24TB Exos HDD technology announced in October. The existing SkyHawk generation tops out at 20TB. The MTBF rating jumps from 2 million hours to 2.5 million hours, the cache doubles to 512MB, and the sustained data transfer rate increases from 260MB/sec to 285MB/sec. It has a RAID RapidRebuild feature that rebuilds volumes up to three times faster than standard hard drives. It also includes a five-year limited product warranty and three years of Seagate’s Rescue Data Recovery Service. Shipping this month, the SkyHawk AI 24TB is available for $599.99. Datasheet here.
…
Veeam announced new Backup-as-a-Service (BaaS) capabilities for Veeam Backup for Microsoft 365 with Cirrus by Veeam and support for Microsoft 365 Backup Storage. Customers have three options in how they wish to utilize Veeam and manage Microsoft 365:
Cirrus by Veeam: Delivers a simple, seamless SaaS experience, without having to manage the infrastructure or storage within Microsoft 365 Backup Storage,
Veeam Backup for Microsoft 365: Deploy Veeam’s existing software solutions for Microsoft 365 data protection and manage the infrastructure,
A backup service from a Veeam Cloud & Service Provider (VCSP) partner: Built on top of the Veeam platform, with value-added services according to the provider’s area of expertise.
…
Wasabi issued its 2024 predictions from Kevin Dunn, country manager UK/Ireland & Nordics. The first prediction is that, unfortunately, not much will change in the cloud market – or in the tech market more generally. The big players will continue to dominate at the cost of consumers and businesses, despite the regulations being finalized, planned, or promised which aim to redistribute power and influence away from the heaviest hitters. There will be a growing number of regulations to create a fairer market. AI will increase demand for storage providers and multi-cloud will become more common. All the new data being collected and stored to train AI should be copied and stored with another cloud provider for security.
Profile. IBM mainframe data access company VirtualZ Computing has a Native American heritage woven into its backstory, connected to two of its founders: CEO Jeanne Glass and CTO Vince Re.
VirtualZ, founded in Minneapolis in 2018, has two products: Lozen, which provides real-time, read-write, peer-to-peer, access to IBM z mainframe data; and PropelZ, which enables the creation of a one-time copy of z mainframe data and loading it into a database in the public cloud or on-premises. This is said to be a fraction of the cost of alternative mainframe extract, transform and load (ETL) applications.
A third product, Zaac, is in development. It will do the opposite of Lozen, in that it will provide mainframe applications with real-time, read-write access to external data. We don’t know what external data sources will be supported yet, but it would be no surprise to find out that they include the Lozen targets.
VirtualZ Lozen diagram.
Lozen uses a TCP/IP network link from the mainframe to the destination system and the software is bi-directional. It does not remove the data from the mainframe – providing, as it were, a real-time snapshot instead, with the single version of the data truth staying on the mainframe.
The software runs on the mainframe’s non-billable zIIP processor, meaning its mainframe compute is free. It connects to apps in the hybrid cloud with some pre-built connectors – such as a Mule ESB connector for MuleSoft LLC, or via OpenAPI standards.
This is different from BMC-acquired Model9’s mainframe data access, which essentially takes backed-up data from the mainframe, transfers it across a TCP/IP link, and uses it to feed a Virtual Tape Library (VTL) or object storage vaults.
Lozen enables applications running on open systems servers or in the public cloud to use real-time mainframe data and so be more accurate and up to date in their processing. It is not envisaged as a way to move data off mainframes, as a migration tool. Rather, it brings the mainframe into a hybrid cloud and provides a regular pipeline for open server workloads needing mainframe data, or vice versa.
Pricing is impressive. Lozen, and the coming Zaac, cost $150,000 for an initial terabyte of data. PropelZ will set you back $50,000/year, which is claimed to be a fraction of the cost of mainframe ETL offerings.
Bootnote
VirtualZ founders. From left to right: Dustin Froyum, Jeanne Glass and Vince Re.
VirtualZ has three co-founders: CEO Jeanne Glass, CTO Vince Re and SVP global alliances Dustin Froyum. We should note it is the only mainframe software developer in history with a female co-founder and female CEO.
It has a Native American and diverse heritage across its team. VirtualZ says of the Lozen product: “Our strategic data access solution was inspired by a female Apache warrior and battlefield strategist named Lozen.” She was a member of the Chihenne Chiricahua Apache, born in the 1840s and sister to the warrior chief Victorio.
VirtualZ says: “Our CTO, Vince Re, grew up hearing stories from his grandmother about her mother – a Blackfoot Indian.” Also: “Jeanne Glass, our founder and CEO, has Native American heritage as well … Jeanne’s family is from the White Earth Reservation, where her grandfather grew up and where her grandmother was well-known for her beadwork.”
Startup PEAK:AIO has had its top-tier GPU data delivery performance validated in testing by Hewlett Packard Enterprise.
PEAK:AIO is a UK startup providing storage software for the AI market based on storage servers from vendors such as Dell, HPE, Supermicro, and Gigabyte. It aims to deliver the same or better performance than multi-node parallel file systems such as IBM’s Storage Scale and DDN’s Lustre with basic NFS, NVMe SSDs and rewritten RAID software. HPE assessed a single PEAK:AIO AI Data Server and found it to surpass the recently published benchmarks of other AI GPUDirect storage vendors.
Mark Klarzynski
Mark Klarzynski, CEO and co-founder of PEAK:AIO, issued a statement: “AI is changing futures; it deserves more than a force fit square peg. PEAK:AIO’s view is AI has unique demands and requires an entirely new approach to storage.
“AI has thoroughly disrupted [the] traditional approach. After observing storage vendors attempting to adapt to the evolving AI market by merely rebranding existing solutions as AI-specific – akin to trying to fit a square peg in a round hole. I believed that AI required and deserved a reset, a fresh perspective and new thinking to match its entirely new use of technology.”
An HPE white paper evaluated PEAK:AIO’s AI Data Server software, executing it on a ProLiant DL380 Gen11 server configured with 16x Samsung 3.2TB SFF PCIe 4 NVMe drives, directly attached via four lanes each. This enabled PEAK:PROTECT RAID to control the drives directly with no intervening controller software. There were 5x Nvidia ConnectX-6 dual 100 GbE port HBAs for external connectivity.
HPE ProLiant Gen 11 server configuration diagram for PEAK:AIO test run
HPE tested this configuration with Nvidia’s GPUDirect protocol, hooking up a single Nvidia DGX A100 GPU server via five 200 GbE/HDR ports, and recording a total read bandwidth of 118GB/sec. PEAK:AIO points out that the load on the ProLiant CPU “was not significant, leaving room for growth with … generation 5 NVMe drives coupled with ConnectX-7 cards and how they will perform when they are more available.”
We added these numbers to a spreadsheet (see below) we maintain of suppliers’ GPUDirect performance numbers and charted the results:
With the PCIe 4 SSDs, the PEAK:AIO performance on a per node basis (118GB/sec) was second to IBM’s ESS3500 Storage Scale’s 126GB/sec, with DDN third at 90GB/sec. The tester also looked at external RDMA NFS performance, achieving 119GB/sec.
PEAK:AIO tested the server with gen 5 SSDs and ConnectX-7 hardware, and it went faster still, recording 202GB/sec with GPUDirect and NVMe-oF, and 162GB/sec with GPUDirect and RDMA NFS. That gives the other suppliers something to aim for.
Klarzynski tells us “HPE utilized a genuine single-node setup” and not a multi-node system like the other suppliers. It’s more affordable and simpler to set up and manage, in other words. He reckons: “In terms of AI GPUDirect performance, our solution stands ahead in the field because that is its entire focus.”
Bootnote
Our spreadsheet table of GPUDirect supplier performance numbers:
IBM has updated Ceph with object lock immutability for ransomware protection and previews of NVMe-oF and NFS to object ingest.
Storage Ceph is what IBM now calls Red Hat Ceph, the massively scalable open-source object, file and block storage software it inherited when it acquired Red Hat. The object storage part of Ceph is called RADOS (Reliable Autonomous Distributed Object Store). A Ceph Object Gateway, also known as RADOS Gateway (RGW), is an object storage interface built on top of the librados library to provide applications with a RESTful gateway to Ceph storage clusters.
Ex-Red Hat and now IBM Product Manager Marcel Hergaarden says in a LinkedIn post that Storage Ceph v7.0 is now generally available. It includes certification of Object Lock functionality by Cohasset, enabling SEC and FINRA WORM compliance for object storage and meeting the requirements of CFTC Rule 1.31(c)-(d).
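Because RGW exposes an S3-compatible API, Object Lock is driven through the standard S3 calls. The sketch below builds the COMPLIANCE-mode default-retention payload and shows (commented out, since it needs a live cluster) how it would be applied with boto3; the endpoint and bucket names are hypothetical.

```python
# Sketch: configuring WORM-style default retention on a Ceph RGW bucket via
# the S3 Object Lock API. Endpoint and bucket names below are invented for
# illustration only.

def compliance_lock_config(years: int) -> dict:
    """Build the S3 Object Lock configuration payload for COMPLIANCE-mode WORM.

    COMPLIANCE mode means no user, including root, can shorten or remove the
    retention period once set."""
    return {
        "ObjectLockEnabled": "Enabled",
        "Rule": {"DefaultRetention": {"Mode": "COMPLIANCE", "Years": years}},
    }

# Against a live RGW endpoint it would be applied like this (requires boto3):
# import boto3
# s3 = boto3.client("s3", endpoint_url="https://rgw.example.com")
# s3.create_bucket(Bucket="worm-archive", ObjectLockEnabledForBucket=True)
# s3.put_object_lock_configuration(
#     Bucket="worm-archive",
#     ObjectLockConfiguration=compliance_lock_config(years=7),
# )
```

Note that Object Lock must be enabled at bucket creation time; it cannot be retrofitted onto an existing bucket.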
There is NFS support for the Ceph filesystem, meaning customers can now create, edit, and delete NFS exports from within the Ceph dashboard after configuring the Ceph filesystem. Hergaarden said: “CephFS namespaces can be exported over the NFS protocol, using the NFS Ganesha service. Storage Ceph Linux clients can mount CephFS natively because the driver for CephFS is integrated in the Linux kernel by default. With this new functionality, non-Linux clients can now also access CephFS, by using the NFS 4.1 protocol, [via the] NFS Ganesha service.”
RGW, the RADOS gateway, can now be set up and configured in multi-site mode from the dashboard. The dashboard supports object bucket level interaction, provides multi-site synchronization status details and can be used for CephFS volume management and monitoring.
He said Storage Ceph provides improved performance for Presto and Trino applications, by pushing down S3select queries onto the RADOS Gateway (RGW). V7.0 supports CSV, JSON and Parquet defined S3select data formats.
It also has RGW policy-based data archive and migration to the public cloud. Users can “create policies and move data that meets policy criteria to an AWS-compatible S3 bucket for archive, for cost, and manageability reasons.” Targets could be AWS S3 or Azure Blob buckets. RGW gets better multi-site performance with object storage geo-replication. There is “improved performance of data replication and metadata operations” plus “increased operational parallelism by optimizing and increasing the RadosGW daemon count, allowing for improved horizontal scalability.” But no performance numbers are supplied.
The minimum node count for erasure coding has dropped to four with 2+2 erasure-coded pools.
Preview additions
Three Ceph additions are provided in technical preview mode with v7.0, meaning don’t use them yet in production applications:
NVMe over Fabrics block storage. Clients interact with an NVMe-oF initiator and connect to an IBM Storage Ceph NVMe-oF gateway, which accepts initiator connections on its north end and connects into RADOS on its south end. The performance equals native RBD (RADOS Block Device) block storage usage.
An object archive zone that keeps every version of every object, providing the user with an object catalogue that contains the full history of the object. It provides immutable objects that cannot be deleted nor modified from RADOS gateway (RGW) endpoints and enables the recovery of any version of any object that existed on production sites. This is good for ransomware and disaster recovery.
To limit what goes into the archive, archive zone bucket granularity can be used to enable or disable replication to the archive zone on a per-bucket basis.
A third preview is of an NFS to RADOS Gateway back-end which allows for data ingest via NFS into the Ceph object store. Hergaarden says: “This can be useful for easy ingests of object data from legacy applications which do not natively support the S3 object API.”
Analysis. VAST Data is set on building a transformative data computing platform it hopes could be the foundation of AI-assisted discovery.
A blog by Global Systems Engineering Lead Subramanian Kartik, The quest to build thinking machines, starts by saying: “The idea behind VAST has always been a seemingly simple one: What if we could give computers the ability to think and discover for themselves?”
His next paragraph builds on the idea: “If computers were capable of original thought the process of discovery, which has catalyzed all human progress since the earliest moments of civilization, could be accelerated significantly. We could build revolutionary artificial intelligence capabilities that wildly surpass our own potential, solving the world’s biggest challenges and moving humanity forward.”
He seems to be aiming his pitch at future models, adding: “ChatGPT and AI [large] language models do not discover things or generate new ideas.”
Kartik notes: “AI-driven discovery and deep learning goes far beyond processing unstructured data like documents, images, or text … It’s processing real-world, analog data from sensors or genome sequencers or video feeds or autonomous vehicles, interpreting it in the context of the body of human knowledge, and making connections with ideas that we haven’t imagined yet.”
Kartik envisages self-discovering computers: “We think a data platform that provides neural networks with broad access to such natural data at tremendous speed and scale will deliver much more sophisticated AI than what we’ve seen to date. And as datasets grow larger, as algorithms get smarter, and as processors get stronger, self-discovering computers – thinking machines – will no longer be science fiction.”
In September, VAST Data co-founder Jeff Denworth wrote: “CoreWeave, like VAST, is focused on a future where deep learning will improve humanity by accelerating the pace of discovery.” A VAST announcement said in August: “The true promise of AI will be realized when machines can recreate the process of discovery by capturing, synthesizing and learning from data – achieving a level of specialization that used to take decades in a matter of days.”
It went on to say: “The era of AI-driven discovery will accelerate humanity’s quest to solve its biggest challenges. AI can help industries find treatments for disease and cancers, forge new paths to tackle climate change, pioneer revolutionary approaches to agriculture, and uncover new fields of science and mathematics that the world has not yet even considered.”
That last point, about uncovering “new fields of science and mathematics that the world has not yet even considered” certainly seems like eureka-style discovery. But an organization may not like what has been discovered. Consider Galileo and his discovery that the Earth orbited the Sun. This infuriated the Catholic Church hierarchy who believed the Sun orbited the Earth, and told Galileo to abandon his heliocentric theory and not teach it.
I think VAST has a more constrained idea of discovery: that its system will help customers improve their operations, not disrupt or damage them. The idea of an AGI, or a VAST Data system, discovering new things seems tremendously impressive, but unless those discoveries help an organization they will not be wanted.
Imagine Kodak back in 1975 being told by an AI that digital cameras, not film and chemistry-based cameras, were the future; would this have made any difference to Kodak’s future business failure? Probably not.
Bootnote
The first digital camera was invented by Kodak engineer Steve Sasson in 1975. His management told him not to talk about it. Kodak stayed in denial for more than 25 years, working to get film good enough to compete with digital. It obviously failed. The point is that discovery, on its own, is not enough. The impact and relevance of the discovery has to be understood by the humans overseeing the system that produced the discovery.
The Storage Newsletter reports that Atempo SAS has requested the opening of a judicial recovery procedure to finalize the restructuring of its activities and accelerate its business development. Atempo intended to raise funds to move forward, but an investor exit dispute with a long-time financial partner has hindered this effort, so Atempo asked for the protection of the Commercial Court. In recent years Atempo has built resale and distribution agreements with suppliers such as DDN, Quantum, Huawei, Panasas, and Eviden, and cloud service providers like OVH, Outscale, and Scaleway. It also has a network of national and regional system integrators.
…
AWS has announced Amazon Redshift integrations with Aurora PostgreSQL, DynamoDB, and Relational Database Service (RDS) for MySQL, letting customers connect and analyze data without building and managing complex extract, transform, and load (ETL) data pipelines. Customers can also now use Amazon OpenSearch Service to perform full-text and vector search on DynamoDB data in near real time.
…
What is the status of Velero container backup business CloudCasa? It has ambitions for funding and a spin off from Catalogic. Ken Barth, Catalogic CEO, told us: “Catalogic Software is very bullish on the future of CloudCasa. As you know, we were seeking funding partners for a spin out earlier in the year but with the slowdown in VC funding and exits, we haven’t seen anything that has excited us yet.
“CloudCasa has seen good market traction from the April recent release of CloudCasa for Velero. In November, based on customer demand, we added the option for organizations to self-host CloudCasa on-premises or in a cloud, or stay with our original BaaS architecture. We are in a great position to take advantage of the increasing adoption of Kubernetes in production environments, and CloudCasa placed well in the recent IDC MarketScape: Worldwide Container Data Management 2023 Vendor Assessment.
“We are looking forward to the opportunities that 2024 will bring for Catalogic and CloudCasa, both in terms of growth and strategic partnerships.”
…
Data observability supplier Cribl says it has surpassed $100 million in ARR, becoming the fourth-fastest infrastructure company to reach centaur status (behind Wiz, HashiCorp, and Snowflake). It also just launched its international cloud region, making its full suite of products available to customers in Europe. Cribl has closed its Series D of $150 million, bringing its total funding to date over $400 million from investors including Tiger Global, IVP, CRV, Redpoint Ventures, Sequoia, and Greylock Partners.
…
Cancer Research UK Cambridge Institute is using storage supplied by Zstor GmbH for its Lustre Object Storage Servers (OSS). The Zstor-designed CIB224NVG4 is a 2-node server in a 2U format with 24 DapuStor Roealsen5 NVMe Gen4 SSDs built with TLC 3D NAND.
…
Lakehouse supplier Databricks has launched a suite of Retrieval Augmented Generation (RAG) tools to help enterprise users overcome quality challenges and build high-quality production LLM apps using their own data. RAG is also a powerful way of incorporating proprietary, real-time data into LLM applications. There is a public preview of:
A vector search service to power semantic search on existing tables in your lakehouse.
Online feature and function serving to make structured context available to RAG apps.
Fully managed foundation models providing pay-per-token base LLMs.
A flexible quality monitoring interface to observe production performance of RAG apps.
A set of LLM development tools to compare and evaluate various LLMs.
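The vector search piece of the list above can be illustrated with a toy example of what such a service does under the hood: rank stored documents by cosine similarity between their embeddings and a query embedding. This is not Databricks’ API; the documents and vectors are made up, and a real service manages the index and embedding generation for you.

```python
import math

def cosine(a, b):
    # cosine similarity between two equal-length vectors
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Hypothetical document embeddings (a real system would compute these with a model)
docs = {
    "invoice policy": [0.9, 0.1, 0.0],
    "vacation policy": [0.1, 0.9, 0.2],
    "gpu cluster setup": [0.0, 0.2, 0.9],
}
query = [0.05, 0.85, 0.3]  # embedding of, say, "how much leave do I get?"

ranked = sorted(docs, key=lambda name: cosine(query, docs[name]), reverse=True)
print(ranked[0])  # the semantically closest document
```

A RAG app would retrieve the top-ranked documents and pass them to the LLM as context alongside the user’s question.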
DataCore’s Swarm v16 object storage release has added a single-server deployment for edge workloads. Swarm has been containerized and streamlined onto a single server powered by Kubernetes for compact configurations suited for Edge and Remote Office/Branch Office (ROBO) locations. Swarm 16 includes integration with Veritas NetBackup via immutable S3 object locking, which complements similar joint developments with Veeam and Commvault.
…
Data integration supplier Dataddo has launched its Data Quality Firewall, which ensures the accuracy of any data the platform extracts to storages like BigQuery, Snowflake, and S3. It claims the firewall is the first of its kind to be fully embedded in an ETL platform at the pipeline level. Dataddo says it performs checks on null values, zero values, and anomalies, and can be configured individually for each column, enabling granular quality control. It offers multi-mode operation to accommodate various fault tolerance standards, and rules are easy to configure and test in the platform’s no-code interface.
…
Datadobi’s StorageMAP v6.6 release has object storage support. It can now analyze object data stored on any S3-compliant platform, offering users a complete view of their unstructured data, including both File (SMB and NFS) and Object (S3) data. v6.6 also enables users to search for files based on specific metadata criteria and copy those files to a file or object target. For instance, they can easily copy files from a file server to a data lake for normalization and aggregation prior to copying (or moving) training datasets to apps for analytics and/or AI processing.
…
Data lakehouse supplier Dremio announced the public preview of Dremio Cloud for Microsoft Azure. It offers companies self-service analytics coupled with data warehouse functionality and the flexibility of a data lake. It’s built on Apache Arrow’s columnar foundation, and offers organizations rapid and scalable query performance for analytical workloads. A native columnar cloud cache (C3) provides fast throughput and rapid response times on Azure Data Lake Storage (ADLS). The software delivers sub-second response times for BI workloads.
Dremio Cloud can combine data located in cloud data lakes, leveraging modern table formats like Iceberg and Delta Lake, with existing RDBMSes. By delivering a semantic layer across all data, it provides customers with a consistent and secure view of data and business metadata that can be understood and applied by all users.
…
Data integrator Fivetran announced support for Delta Lake on Amazon S3. Fivetran customers can land data in Amazon S3 and access their Delta Lake tables. In April Fivetran announced support for Amazon S3 with Apache Iceberg. Fivetran says its no-code platform offers enterprises a simple, flexible way to move data from nearly any data source to any destination.
…
Hitachi Vantara announced the launch of Hitachi Unified Compute Platform (UCP) for GKE Enterprise, a new, integrated hybrid solution, with long-time partner Google Cloud. Through Google Distributed Cloud Virtual (GDCV), Hitachi UCP for GKE Enterprise offers businesses a unified platform to manage hybrid cloud operations. GDCV on Hitachi UCP empowers enterprises to modernize applications, optimize infrastructure and enhance security across hybrid cloud environments by combining the flexible cloud infrastructure of Hitachi UCP with the versatility and scalability of GDCV. The system can deploy and manage workloads within on-premises data centers, cloud environments or edge locations. As part of its launch, GDCV on Hitachi UCP has also been included in Google’s Anthos Ready platform partners program, which validates hardware that works seamlessly with GDCV.
…
Angela Heindl-Schober.
HYCU appointed Angela Heindl-Schober as SVP Global Marketing. Her CV includes global technology companies such as Vectra AI (her most recent role, held for eight years), Riverbed Technology, Infor, and Electronic Data Systems (now part of HPE). HYCU says she has a proven track record spanning over 28 years in global marketing roles. She effectively replaces HYCU CMO Kelly Hopping, who went part-time in August this year and became full-time CMO at DemandBase in September.
…
TD SYNNEX subsidiary Hyve Solutions recently earned a top annual rating in the Human Rights Campaign Foundation’s assessment of LGBTQ+ workplace equality in the 2023-2024 Corporate Equality Index and has been recognized as a leader of workplace inclusivity in the United States. Hyve sells storage hardware to hyperscaler customers.
…
Informatica has launched enhanced Databricks-validated Unity Catalog integrations. Informatica’s no-code data ingestion and transformation pipelines run natively on Databricks for use with Databricks and Databricks Unity Catalog. The integration gives joint customers a best-in-class offering for onboarding data from 300+ data sources and rapidly prepares data for consumption with an extensive library of out-of-the-box, no-code, proven, repeatable data transformation capabilities.
…
Kinetica announced the availability of Kinetica SQL-GPT for Telecom, the industry’s only real-time offering that leverages generative AI and vectorized processing to enable telco professionals to have an interactive conversation with their data using natural language, simplifying data exploration and analysis. The Large Language Model (LLM) utilized is native to Kinetica, ensuring robust security measures that address concerns often associated with public LLMs, like OpenAI.
…
Lenovo said its latest quarterly (Q2 2024) results made it the number three worldwide storage supplier, having been number four last quarter. Its storage products are sold by its ISG business unit, which recorded $2 billion in revenues. Our records of the latest supplier quarterly storage results are:
Dell – $3.84 billion
NetApp – $1.56 billion
HPE – $1.11 billion
Pure – $762.84 million
Lenovo’s storage revenues thus fall somewhere between $1.56 billion and $1.11 billion. The company said it is seeing clear signs of recovery across the technology sector and will “leverage the opportunities created by AI, where it is uniquely positioned to succeed given its hybrid AI model, pocket-to-cloud portfolio, strong ecosystem and partnerships, and growing portfolio of AI technologies and capabilities.”
…
File collaboration startup LucidLink is launching a film series called Unbound, profiling how creatives are unlocking new possibilities without needing to work from the same office. The first episode will kick off with the creative studio Versus. It will show how they are collaborating across eight time zones and producing their best work with the support of LucidLink.
The series will run throughout 2024, with each episode spotlighting a diverse group of creatives across creators, media, advertising, and gaming. LucidLink says it serves as an indispensable component of their workflow and process.
…
Object First has issued a downloadable case study of its customer Centerbase, a cloud-based law firm management platform. Centerbase chose Object First’s Ootbi appliance to protect its backups with out-of-the-box immutability and to reduce restore times from its cloud repository – and ended up decreasing its recovery point objective (RPO) by 50 percent, from eight hours to just four.
…
The SNIA is planning an open, collaborative Plugfest colocated with the SNIA Storage Developer Conference (SDC) in September 2024, aimed at improving cross-implementation compatibility between client and server implementations of private and public cloud object storage solutions. This is designed to be an independent, vendor-neutral effort with broad industry support, covering a variety of solutions, both on-premises and in the cloud.
An SNIA source tells us: “There are many proprietary and open source implementations claiming compatibility, but in reality there are plenty of discrepancies which lead to customer surprises when they’re transitioning from one object storage solution to another.” Some examples were mentioned at SDC.
…
Data analytics supplier Starburst, which is based on open source Trino, announced new features in Starburst Galaxy to help customers simplify development on the data lake by unifying data ingestion, data governance, and data sharing on a single platform. It added support for:
● Near real-time analytics with streaming ingestion: Customers can leverage Kafka to hydrate their data lake in near real-time. Upcoming support for fully managed solutions, such as Confluent Cloud, is also planned.
● Automated data governance: As new data lands in the lake, machine learning models in Gravity – a universal discovery, governance, and sharing layer in Starburst Galaxy – will automatically apply classifications for certain categories. Depending on the class, Gravity will apply policies granting or restricting access. Now, as soon as PII (personally identifiable information) lands in the lake, Gravity will be smart enough to identify and restrict access to that data.
● Automated data maintenance: abstracts away common management tasks like data compaction and data vacuuming.
● Universal data sharing with built-in observability: With Gravity, users can easily package datasets into shareable data products to power end-user applications, regardless of source, format, or cloud provider. New functionality will allow users to securely share these data products with third parties, such as partners, suppliers, or customers.
● Self-service analytics powered by AI: New AI-powered experiences in Galaxy, like text-to-SQL processing, will enable data teams to offload basic exploratory analytics to business users, freeing up their time to build and scale data pipelines.
…
Distributed storage supplier Storj announced the launch of Storj Select, which delivers customizable distributed storage to meet the specific security and compliance requirements of organizations with sensitive data, such as healthcare and financial institutions. With Storj Select, customer data is stored only on points of presence that meet customer-specified qualifications, such as SOC 2, GDPR, and HIPAA. It is also announcing CloudWave, a provider of healthcare data security offerings, as a customer that chose Storj Select to provide compliant data storage.
…
UK IT services company StorTrec has announced a partnership with Taiwanese company QNAP to expand its services to global manufacturers. StorTrec is focused on delivering end-to-end support to data storage manufacturers and its resellers. Prior to the partnership, QNAP only offered warranties with its products. But this partnership will allow QNAP to offer customers a value-added option for full support for their storage solutions, with StorTrec managing all of the support services.
…
Analyst TrendForce released enterprise SSD supplier revenue shares for Q3 2023. It said there has been an uptick in enterprise SSD purchases by server OEMs in 4Q23. TrendForce notes that Micron’s strategic focus on PCIe SSDs in recent years is now paying off, allowing the company to continue increasing its market share even amid subdued enterprise SSD demand. It notes that as other enterprise SSD suppliers shift to technologies above 176 layers, Kioxia’s continued use of 112-layer technology presents challenges in increasing supply flexibility and optimizing profitability. WDC’s enterprise SSD sales were primarily focused on North American CSP clients, who generally maintained conservative purchasing strategies, leading to a continued decline in overall purchases in Q3.
As WDC undergoes corporate restructuring, it is anticipated that the production timeline for its PCIe 5.0 enterprise SSDs will slip. This constraint could result in WDC’s enterprise SSD revenue growth falling behind that of other suppliers in subsequent periods.
Samsung is anticipated to lead in revenue performance growth among suppliers as the first to mass-produce the PCIe 5.0 interface.
…
AI-enabled data observability and FinOps platform supplier Unravel Data has joined the Databricks Partner Program to deliver AI-powered data observability into Databricks for granular visibility, performance optimizations, and cost governance of data pipelines and applications. Unravel and Databricks will collaborate on go-to-market efforts to enable Databricks customers to leverage Unravel’s purpose-built AI for the Lakehouse for real-time, continuous insights and recommendations to speed time to value of data and AI products and ensure optimal ROI.
GigaOm has evaluated 17 suppliers of scaleout file storage systems and has placed NetApp out in front with a substantial lead.
The GigaOm Radar report evaluates products’ technical capabilities and feature sets against 14 criteria to produce a circular chart, divided into concentric inner Leader, mid-way Challenger, and outer New Entrant rings. These are separated into quarter circle segments by two opposed axes: Maturity vs Innovation, and Feature Play vs Platform Play. Suppliers are also rated on their forecast speed of development over the next year to 18 months as forward mover, fast mover, and outperformer.
GigaOm analysts Max Mortillaro and Arjan Timmerman said: “The scaleout file storage market is very active. Roadmaps show a general trend toward expanding hybrid-cloud use cases, implementing AI-based analytics, rolling out more data management capabilities, and strengthening ransomware protection.”
The hybrid cloud use cases include integrating file and object storage, and also integrating on-premises and public cloud file storage in a data fabric with the capability of storing files in specific cloud regions.
Here’s the scaleout file storage (SCOFS) Radar chart:
NetApp, Pure Storage, VAST Data, and Weka are the four leaders. The bulk of the suppliers are challengers with Scality, ThinkParQ (BeeGFS), and long-term supplier Panasas classed as entrants.
There are four main groups of the 17 suppliers on the chart:
Mature platform plays: DDN (Lustre) and IBM (Storage Scale)
Mature feature plays: Scality, OS Nexus, and ThinkParQ (BeeGFS)
Innovative feature plays: Quobyte, Quantum, and Panasas
The mainstreamers with innovative platform plays: NetApp, Pure Storage, Weka, VAST Data, Cohesity, Qumulo, Nutanix, Dell Technologies, and Hammerspace.
The analysts sub-divide the mainstream group into outperformers and challengers. They point out: “It’s important to keep in mind that there are no universal ‘best’ or ‘worst’ offerings; there are aspects of every solution that might make it a better or worse fit for specific customer requirements.”
They say NetApp is ahead because of its “broad portfolio options to implement and consume a modern scaleout file system solution based on ONTAP with a complete enterprise-grade feature set, flexible deployment models, and ubiquitous service availability across public clouds.”
The report includes a description of each supplier’s offering and technology situation and its scores out of three on a variety of different criteria.
Changes from last year
This year’s scaleout file storage report recognizes that there are two market segments; enterprise and high-performance. Last year GigaOm produced a high-performance SCOFS radar plus a separate enterprise SCOFS report. The changes from a year ago are quite startling. Here’s last year’s high-performance SCOFS:
Last year’s leaders, DDN and IBM, have moved backwards and are now challengers. Pavilion and SoftIron have gone away. Scality has entered the scene. Quantum, Hammerspace, Panasas, and Quobyte have all swapped hemispheres. Qumulo has gone backwards. NetApp, despite not being classed as an outperformer in 2022, has wildly outperformed every other supplier. Last year’s outperformer, VAST Data, has hardly moved its position at all, ditto Weka.
And here’s the enterprise SCOFS from last year:
Again, the changes are dramatic with VAST Data, Cohesity, and Hammerspace swapping east and west hemispheres, and Quantum swapping the north for the south hemisphere. Commvault goes away, as does Red Hat (with IBM’s acquisition).
UK-based Disk Archive Corporation, known for its spun-down disk archives, says it now has 350 broadcast and media company customers. Its pitch is that its performance can beat both tape and optical disk archives.
The company was founded in 2008 and developed and sells its ALTO appliance – ALTO standing for Alternative to LTO – a 4RU chassis holding 60 disk drives and an embedded server, expandable with up to ten 60-HDD expansion chassis for 660 drives in total; with 24 TB HDDs that is 15.8 PB of raw capacity. The system can scale out to over 200 PB, and with 26 TB SMR drives the capacity would be higher still.
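The raw-capacity figure follows directly from the chassis counts, as this arithmetic sketch shows:

```python
# Fully expanded ALTO system: 1 head unit + 10 expansion chassis, 60 HDDs each.
drives = (1 + 10) * 60          # 660 drives
tb_per_drive = 24               # 24 TB HDDs as quoted
raw_tb = drives * tb_per_drive  # 15,840 TB
raw_pb = raw_tb / 1000          # ~15.8 PB raw
print(drives, raw_pb)
```

Swapping in 26 TB SMR drives lifts the same 660-drive system to about 17.2 PB.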
Alan Hoggarth, CEO and co-founder, briefed an IT Press Tour in Madrid, and talked about the company’s beginnings. “We wanted to break the mold – transcending traditional data tape, video tape, optical disk, and cloud solutions. A large broadcaster could have a million video tapes. Bypass them and stick them on disk,” he said.
ALTO head unit and expansion chassis
His thinking was that Copan’s original MAID (Massive Array of Idle Disks) concept was flawed. This was a disk-based archive that delivered faster access to data than tape archives, and saved on energy costs by spinning down the drives until their content was needed. But it used RAID sets so several drives had to be spun up when data was required, and data could conceivably be lost, Hoggarth said. A v2 MAID based on slowed-down but not spun-down drives wouldn’t deliver the power savings and environmental benefits of fully spun-down drives. However, if you didn’t use RAID sets and fully spun down the disks, a disk archive could provide a lower TCO than a tape archive because there is no need for tape generation migrations or robotic libraries, saving energy and cooling costs.
A tape library needs robotics to bring cartridges to the read/write drives. But a disk archive has no need for robots because each disk is a drive as well.
An optical disk archive is subject to the whims and happenstance events of the restricted set of suppliers, such as Sony closing down optical disk product lines. Similarly, the tape market can be upset by disputes between the duopoly suppliers Sony and FujiFilm, as happened in 2019 with LTO-8 media supply seriously disrupted.
The ALTO active archive system was designed specifically for film production, TV broadcast, and legal evidence (court recording) markets where video and audio assets have to be kept for decades with faster-than-tape access to files when required. It relies on SATA disk drives, which do not suffer from generational media access problems that affect both tape and optical disk. The affordable SATA interface is used because faster SAS or even NVMe access is not needed in the archive access situation. When it takes 60 seconds to spin up a quiescent drive, having a faster interface in the milliseconds area is simply not relevant.
The disk drives in an ALTO system may only spin for 50 hours a year. This vastly reduced active period extends their usable life out to 15 years or longer. Hoggarth noted that helium-filled drives prevent disk motor and spindle lubricant evaporating, which did occur with some air-filled drives.
ALTO is sold as an unpopulated chassis, with customers buying their own disk drives. Hoggarth said: “Data reduction is not feasible with video and audio images. We talk raw data.”
It writes raw data files to a pair of disk drives, and drives can be removed for storage on a shelf on-premises or in an external vault. That means if Disk Archive Corporation goes bust, customers can still read the data from the disk drives. It also provides a physical air-gap, preventing ransomware access.
Hoggarth told B&F: “We write two copies to two identical disks. There is no RAID controller so they’re not mirrored. One disk stays on the appliance. The other is put on the shelf and the pair put in read-only mode.” A person takes out the second drive in a pair, puts a generated QR code on it, then moves it to a shelf location.
Every disk drive has a UUID, a Universal Unique Identifier, and the ALTO software knows and tracks this. The QR code label includes the UUID and the ALTO system tracks QR codes, UUIDs, and shelf locations to know which files/folders are on which disks and where. A management system has a GUI facilitating this.
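The bookkeeping described above can be sketched as a small catalog mapping each drive’s UUID to its QR label, shelf location, and contents. This is a hypothetical illustration, not ALTO’s actual software; all field names and the QR-label scheme are invented.

```python
import uuid

catalog = {}  # drive UUID -> {"qr": ..., "shelf": ..., "files": [...]}

def register_drive(shelf: str, files: list) -> str:
    """Record a shelved drive: generate its UUID, QR label, and file list."""
    drive_id = str(uuid.uuid4())
    catalog[drive_id] = {
        "qr": f"QR-{drive_id[:8]}",   # QR label encodes the drive UUID
        "shelf": shelf,
        "files": list(files),
    }
    return drive_id

def locate(filename: str):
    """Return (shelf, drive UUID) pairs for every drive holding this file."""
    return [(rec["shelf"], did) for did, rec in catalog.items()
            if filename in rec["files"]]

d1 = register_drive("rack2-shelfB", ["match_final.mxf", "news_2019.mxf"])
print(locate("match_final.mxf"))
```

Given a filename, the operator can look up which shelved drive to fetch, scan its QR code to confirm the UUID, and mount it for restore.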
Alternatively, both disks could stay in the appliance, providing a media-redundant archive. You could make it more resilient against failures with a media-redundant high-availability archive by putting the second disk drive in a pair in a separate chassis.
ALTO systems need human admins because, unlike tape libraries, they have no robotics. Hoggarth said tape robotics struggle with humidity and need air conditioning, making them difficult to run in humid environments such as much of Asia. Disk Archive Corporation’s first customers were in Asia, and the no-robotics characteristic has proved a strong advantage when selling into tropical countries. Virtually every country has a state or private TV broadcaster, a film company, and a legal system, and they all need video/audio archives.
The system’s electricity consumption is low, even with a management system running in a server. Hoggarth said: “We keep a flat file record of everything stored in a server, which adds background power consumption.” The power consumption of an ALTO disk archive system is 210 watts per PB with 22-24 TB disks. As disk capacities rise, the watts-per-PB number goes down.
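The falling watts-per-PB claim follows from the power draw being roughly fixed per system while capacity scales with drive size. An illustrative back-of-envelope calculation, assuming a hypothetical fixed power budget sized to match the quoted 210 W/PB at 24 TB drives:

```python
def watts_per_pb(system_watts: float, drives: int, tb_per_drive: float) -> float:
    # power divided by raw capacity in PB
    return system_watts / (drives * tb_per_drive / 1000)

# Back out a hypothetical fixed system budget from 210 W/PB at 660 x 24 TB drives
budget_w = 210 * (660 * 24 / 1000)             # ~3,326 W for a full system
print(round(watts_per_pb(budget_w, 660, 24)))  # 210 W/PB today
print(round(watts_per_pb(budget_w, 660, 48)))  # halves if drive capacity doubles
```

The same chassis power spread over twice the petabytes gives half the W/PB figure, which is the trend Hoggarth points to.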
Active Archive Alliance members
According to Oracle documentation: “A ten module library with 20 tape drives and 20 power supplies has a total idle power of 338 W (1154 Btu/hr) and a steady state maximum of 751 W (2564 Btu/hr).” The ALTO spun-down disk archive library has lower power consumption than this.
The ALTO system can support a third copy of a disk stored in the cloud as a last resort. The Disk Archive Corporation website has various case studies illustrating different ALTO configurations. There is a reduced depth or short-rack alternative for space-restricted datacenters.
The Active Archive Alliance trade association was set up to advance the idea of an active archive, one with a higher than normal archive data access rate. It is a vendor-neutral forum with members such as FujiFilm, Quantum, SpectraLogic, and others. Curiously, Disk Archive Corporation is not a member, having never seen the need in its clearly defined market niche.
The company has just 15 employees in the UK, Italy, and India. The biggest worldwide market is India by a wide margin. It was Russia but then “bad things happened,” Hoggarth said.
Amazingly, the company owns no IP. Everything is open. The mainstream archive companies ignore this market for spun-down disks. Hoggarth says many of them sell tapes and tape systems and make a lot of money from them. “They have no incentive to change.”
There are around 450 Disk Archive customers and the company is involved in a 100 PB archive project. The rough price range for an ALTO head unit is $24,000-$40,000, depending on features.
In these steadily greening days, the idea of a cheaper, faster-than-tape archive with greater longevity, one that better satisfies environmental, social, and corporate governance criteria, could look like an attractive proposition.