
ExaGrid says 48% of its business now outside of USA

Privately-owned ExaGrid claims to have notched up record bookings and revenue during the final three months of 2023.

ExaGrid supplies a two-tier appliance with incoming backups written as-is to a landing area, from which they can be restored. Once the ingest is finished, the data is deduplicated and sent to disk for longer-term storage. The business says it added 175 new customers in the quarter and completed 59 six-figure deals and three seven-figure deals.
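
To make that two-tier flow concrete, here is a minimal, hypothetical Python sketch of the ingest-then-deduplicate pattern described above. The class, the method names and the fixed-size chunking are our own illustration of the general idea, not ExaGrid's implementation.

```python
import hashlib

class TwoTierBackupTarget:
    """Illustrative model of a landing-area-plus-repository design:
    backups land undeduplicated for fast ingest and restore, then are
    deduplicated into a longer-term repository tier."""

    def __init__(self):
        self.landing_zone = {}   # backup_id -> raw bytes (undeduplicated)
        self.repository = {}     # chunk hash -> chunk bytes (deduplicated)
        self.manifests = {}      # backup_id -> ordered list of chunk hashes

    def ingest(self, backup_id, data):
        # Written as-is: no dedupe work during the backup window.
        self.landing_zone[backup_id] = data

    def restore_recent(self, backup_id):
        # Recent backups restore straight from the landing area.
        return self.landing_zone[backup_id]

    def deduplicate(self, backup_id, chunk_size=4096):
        # After ingest completes, chunk the backup and keep only unique chunks.
        data = self.landing_zone.pop(backup_id)
        hashes = []
        for i in range(0, len(data), chunk_size):
            chunk = data[i:i + chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            self.repository.setdefault(digest, chunk)
            hashes.append(digest)
        self.manifests[backup_id] = hashes

    def restore_old(self, backup_id):
        # Older backups are rebuilt from the deduplicated repository.
        return b"".join(self.repository[h] for h in self.manifests[backup_id])
```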

The total customer count has now surpassed 4,100, we’re told, and the company has been free cash flow- and EBITDA-positive for 12 quarters in a row. ExaGrid has increased its reach beyond the continental USA, a move that appears to be paying off.

President and CEO Bill Andrews said: “ExaGrid is continuing to expand its reach and now has sales teams in over 30 countries worldwide, with customer installations in over 80 countries. Outside of the United States, our business in Canada, Latin America, Europe, the Middle East, Africa, and Asia Pacific is rapidly growing. We now do 48 percent of our business outside of the United States.”

We don’t have the customer count number for Q3 2023

Andrews highlighted “our 95 percent net customer retention, NPS score of +81, the fact that 92 percent of our customers have our Retention Time-Lock for Ransomware Recovery feature turned on, and 99 percent of our customers are on an active yearly maintenance and support plan.”

We understand that, in about 65-70 percent of its deals, ExaGrid finds standard primary storage disk being used behind the backup application. In the remaining cases it mostly meets scale-up dedupe appliance competition from Dell PowerProtect (the former Data Domain) and HPE StoreOnce, with some Veritas customers using FlexScale appliances. It rarely competes with any other dedupe backup supplier.

ExaGrid customers retain their existing backup application in 85 percent of cases. When the backup software is being changed, ExaGrid finds that Veeam and Commvault are preferred, with Rubrik and Cohesity also present.

According to our sources, ExaGrid has moved upmarket, deals are getting bigger, and around three-quarters of its bookings involved $100,000-plus purchase orders. The company has more than 30 open sales positions, and has just hired dedicated sales reps to go after global systems integrators such as Tata, Wipro, Tech Mahindra, HCL, Accenture, Capgemini, and more.

Pure Storage shifts focus to as-a-service model


Pure Storage expects more than half of its revenue to come from subscriptions as it shifts from product sales to become an as-a-service supplier.

Charlie Giancarlo, Pure Storage

This change in viewpoint is detailed in an interview with Pure CEO Charlie Giancarlo in The Technology Letter, with extracts detailed by Wells Fargo analyst Aaron Rakers.

Pure’s Evergreen//One Storage-as-a-Service (STaaS) subscription provides usage-based array storage on-premises, in a colocation facility, or in the public cloud with Cloud Block Store. In its latest quarterly earnings report, for the three months ended November 2023, Pure reported $1.3 billion in annual recurring revenue and $309.6 million in subscription services revenue, a 26 percent year-on-year rise. Evergreen//One revenues more than doubled.

Giancarlo stated at the time: “The outperformance of Evergreen//One this year has been significantly above our prior expectations, and we now expect this strong level of demand to continue through Q4.”

He told The Technology Letter: “Subscription is now, roughly, 40 percent of our reported revenue and it’s reached a level where it’s more than doubling [annually].” He said Pure’s marketing would increasingly emphasize Evergreen//One in the future.

That means FlashArray and FlashBlade systems provided on a consumption basis like its Cloud Block Store offering, which provides block-based storage through Purity OS in the AWS and Azure public clouds.

Giancarlo said Pure customers should assume that it will add Purity file services to the public cloud as well. That suggests a Cloud FileStore-type offering providing a consistent file services approach across on-premises/colo Pure arrays and the public cloud.

The subscription emphasis parallels Dell with APEX, HPE with GreenLake, and NetApp with Keystone.

Giancarlo added more detail to Pure’s claim that flash storage will replace disk storage and no new disk drives will be sold after 2028, saying: “There won’t be any new disk systems sold in five years.” In other words, he believes there won’t be any new disk-based arrays sold by the end of 2028, implying existing disk and hybrid array customers could still be buying disk drives from then on, just not new arrays. This viewpoint is not supported by disk drive manufacturers or disk-based array suppliers.

Giancarlo said Pure will add more details about the internal financial aspects of its subscription services in its forthcoming quarterly reports.

Huawei overtakes Dell EMC as leading AFA supplier


Gartner analysis of third quarter 2023 all-flash array (AFA) sales shows Huawei has leapfrogged Dell EMC to become the world’s biggest producer of this type of storage infrastructure.

The AFA market totalled $2.658 billion in the quarter, down four percent year-on-year. All-flash storage represented 53.9 percent of the $4.98 billion external storage market, which declined 13 percent annually. Within that, primary storage declined 15 percent annually, secondary storage was up 2 percent, and backup and recovery went down 27 percent.

As reported by Wells Fargo analyst Aaron Rakers, Huawei, with $548 million in revenues, had a 20 percent AFA share while Dell EMC’s $516 million revenues gave it a 19 percent share. A pie chart from Rakers shows market share among the top seven suppliers:

Dell’s AFA revenue was down 23 percent year-on-year, with HPE’s AFA revenue decreasing 20 percent over the same period. NetApp’s AFA revenue went down 9 percent year-on-year, but Pure Storage did much better than these three competitors as its AFA revenue grew 2 percent year-on-year. Huawei’s AFA market share obviously rose year-on-year but we don’t have access to Gartner’s exact figures. We suspect it was close to a 25 percent gain.

In the second quarter of 2023, Huawei’s share was 16.3 percent giving it second place, with Dell EMC leading at 21.6 percent. NetApp was third with 14.8 percent. Rakers does not report all of Gartner’s numbers and we have calculated IBM’s share to be 14.2 percent. Pure Storage was fifth at 14 percent. HPE had a 6.9 percent share with Hitachi tail-ending at 3.4 percent.

We have charted the last three quarters of AFA supplier market share percentages to show the changes visually:

Huawei’s rise in Q3 was as sudden as Dell EMC’s fall – particularly striking given that Huawei does not sell its products in North America. Dell must be seeing strong competition from Huawei in the Asia Pacific territory, as well as in the Middle East and Africa regions.

Pure Storage has shown a steady increase in market share – given the overall sales climate – overtaking NetApp, with HPE also showing a rise over the period, unlike Hitachi with its 3 percent share at $78 million. At this rate we expect Hitachi to decline into the Others category in a quarter or two, with VAST Data or Lenovo emerging to take its place in the rankings.

The total external storage capacity shipped in the quarter went down 7 percent annually. Primary storage capacity declined 4 percent, secondary storage capacity decreased 7 percent with backup storage slumping 26 percent. Flash storage represented 21.5 percent of all storage capacity shipped, up strongly from the year-ago 15.6 percent.

Gartner said the hyperconverged infrastructure (HCI) market totaled $2.03 billion, 2 percent lower than a year ago. Rakers produced an HCI supplier market share pie chart based on the Gartner numbers:

Dell EMC was well in front, with a 33 percent share ($666.1 million), Nutanix second with 13 percent ($267.7 million), and VMware (now part of Broadcom) pulling in $188 million to get third place. 

Ferroelectric RAM update and Micron

Micron has revealed a substantial interest in Ferro-electric RAM (FeRAM) technology with a paper presented at the December 2023 IEDM event. This could be its Optane storage-class memory replacement.

We briefly mentioned Micron’s latest FeRAM activity in a Storage Ticker roundup and will look at it in more detail here, with three items in mind. What is FeRAM? Where does it fit in the memory market? What are Micron’s prospects?

FeRAM tutorial

Ferro-electric RAM is a non-volatile random-access memory that stores binary data as an electric polarity in a dipole in a ferro-electric capacitor, generally composed of lead zirconate titanate (PZT). The dipole polarization direction (P positive or P negative) is caused by the movement of a zirconium (Zr) or titanium (Ti) atom (cation) within the lead (Pb) and oxygen perovskite crystals making up the capacitor. There are two stable crystal states, notionally with the Zr or Ti atom up or down from a charge center. The position can be discerned as a difference in voltage, which indicates a binary 1 or 0. The up or down polarization stays constant when the electric field is removed.

Ramtron FeRAM crystal diagram. The grey spheres are lead atoms. The blue spheres are oxygen atoms. The red sphere is the Zr or Ti cation.

The polarity state is changed by applying a specific voltage to the ferroelectric capacitor. It is not changeable by magnetism or external radiation. The capacitor is not made of iron, so the ‘ferro’ part of the name is misleading, but its polarity change behaves like a magnetic polarity change, which is why the ‘ferro’ term is used.

A FeRAM cell is made from an access transistor and a ferroelectric layer (capacitor) with a word line and bit line atop a silicon substrate:

Image from Techovedas

The cell’s binary content is read by the transistor applying a voltage to it and measuring the output. This overwrites the cell’s content, meaning it is a destructive read, and the content then needs to be rewritten.
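
That read-then-rewrite behavior can be modeled in a few lines. This is a conceptual Python sketch of a destructive-read cell, assuming nothing about any particular vendor's circuit design.

```python
class FeRAMCell:
    """Toy model of a ferroelectric memory cell with destructive reads."""

    def __init__(self, value=0):
        self.polarization = value      # 1 or 0, retained without power

    def write(self, value):
        # Applying a voltage of the right sign sets the dipole polarization.
        self.polarization = value

    def read(self):
        # Sensing forces the cell into a known state, so the stored value
        # is destroyed by the read and must immediately be written back.
        value = self.polarization
        self.polarization = 0          # state lost by the read pulse
        self.write(value)              # rewrite restores the original data
        return value
```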

The FeRAM market

Up until now, FeRAM has been made in small-capacity products by suppliers such as Fujitsu, Infineon, SK hynix and Toshiba. Fujitsu has made FeRAM products for more than 20 years. FeRAM is much faster than flash, but has lower density and higher cost, which has restricted its use to applications such as smart electricity meters, programmable logic controllers, airbags, printers, RAID controllers, and RFID tags. Sony’s PlayStation 2, launched in 2000, used 32Kb of Fujitsu FeRAM. Micron’s technology could change this, making FeRAM more generally applicable in the data storage sphere.

FeRAM, as storage-class memory, fits between DRAM and NAND in data access speed and endurance terms. TLC NAND natively has 3,000 to 5,000 write cycle endurance, whereas FeRAM can be rewritten from 100 billion (SK hynix) to 100 trillion (Infineon) times with Micron’s technology supporting 10¹⁵ (10 quadrillion) write cycles. FeRAM’s write cycle time is in the 70 – 120ns area whereas Micron suggests a 300µs write cycle time for NAND.
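
To put those endurance figures in perspective – our own arithmetic, using the cycle times quoted above – continuously rewriting a single FeRAM cell at a 100ns cycle time would take $10^{15} \times 100\,\mathrm{ns} = 10^{8}\,\mathrm{s}$, roughly 3.2 years of nonstop writes to one cell, whereas 5,000 NAND program cycles at a 300µs write time would be used up in about 1.5 seconds.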

Micron NVDRAM prospects

Micron’s paper, which is not publicly available, was presented in a Generative AI Focus Session. It discusses a 32Gb die. This is much larger than the Fujitsu and SK hynix 8Mb products, Infineon’s 16Mb parts, and Toshiba’s 128Mb products. However, the gen 1 Optane product had a 128Gb die, four times larger still. The paper abstract says the Micron die has a 2-layer stack, like the gen 1 Optane product. Data is retained for more than 10 years.

Image from Micron IEDM paper abstract.

Read and write times are faster than NAND but not as fast as DRAM. Calling the technology NVDRAM (non-volatile dynamic random access memory) is something of an oxymoron; NVRAM would be more appropriate. But NVRAM is a generic term covering NOR and NAND flash, FeRAM, MRAM and other non-volatile memory technologies, and we guess Micron was looking for a distinctive name.

The paper abstract mentions a “near-term opportunity … to outfit existing, traditional compute architectures with more efficient memory for faster data movement and to accommodate larger models.” This suggests that Micron sees NVDRAM product being used to hold Gen AI datasets and provide faster access to them than SSD storage. Micron’s NVDRAM has an LPDDR5 command protocol interface, which suggests a socket-type interface could be used.

We would envisage that Micron is already exploring commercial possibilities with potential server partners and large scale Gen AI users.

Back in March 2021, when Micron exited the Optane 3D XPoint market, it said: “Micron plans to apply the knowledge it has gained from the breakthroughs achieved through its 3D XPoint initiative, as well as related engineering expertise and resources, to new types of memory-centric products that target the memory-storage hierarchy.” FeRAM was probably in its sights then. It also said it “will increase investment in new memory products that leverage the Compute Express Link (CXL) … With immediate effect, Micron will cease development of 3D XPoint and shift resources to focus on accelerating market introduction of CXL-enabled memory products.”

We might well see CXL-accessed NVDRAM product.

B&F has asked Micron for access to the NVDRAM paper and for a briefing on the technology.

Storage news ticker – January 4

DataStax 2024 predictions:

  • AI will become more deeply regulated in the wake of consumer and regulator backlash.
  • The rise of Dark AI will cause societal and/or business disruption.
  • AI companies will shake out and only those capable of managing the governance, risk, and compliance requirements will remain.
  • GenAI will drive deeper operational efficiencies in the SMB market, fueling transformation and disruption.
  • The “Instagrams” of GenAI apps will emerge in 2024. 

IBM released Cloud Pak for Data v4.8.1 just before Christmas. It introduces support for Red Hat OpenShift Container Platform Version 4.14, and has new features for services such as Cognos Analytics, DataStage, IBM Match 360, and watsonx.ai. Full details here.

Intel has recruited Justin Hotard, HPE EVP and GM of High-Performance Computing, AI and Labs, to be EVP and GM of its Data Center and AI Group (DCAI), effective February 1. He’ll report directly to Intel CEO Pat Gelsinger. Hotard, who replaces Sandra Rivera, will be responsible for Intel’s suite of datacenter products spanning enterprise and cloud, including the Xeon processor family, graphics processing units (GPUs) and accelerators. He is expected to play an integral role in driving the company’s effort to introduce AI everywhere. Rivera became the CEO of Intel’s standalone Programmable Solutions Group on January 1.

Micron presented its 32Gbit dual-layer stackable ferroelectric nonvolatile memory tech, NVDRAM, at the December 2023 IEDM meeting. It has FeRAM non-volatility and endurance, better-than-NAND retention, and read/write speed similar to DRAM. A paper entitled “NVDRAM: A 32Gbit Dual Layer 3D Stacked Non-Volatile Ferroelectric Memory with Near-DRAM Performance for Demanding AI Workloads” was presented by lead author Nirmal Ramaswamy, Micron VP of advanced DRAM (thanks to Techovedas for the information).

Micron NVDRAM

This is an alternative to the failed Optane storage-class memory, and like Optane comes in a stack of two layers. According to Bald Engineering, it uses a 5.7nm ferroelectric capacitor for charge retention in a 1T1C DRAM structure, while dual-gated polycrystalline silicon transistors control access. The stacked double memory layer resides above a CMOS access circuit layer on a 48nm pitch. The paper’s abstract states: “To achieve high memory density, two memory layers are fabricated above CMOS circuitry in a 48nm pitch, 4F2 architecture. Full package yield is demonstrated from -40°C to 95°C, along with reliability of 10 years (for both endurance and retention).”

NVDRAM utilizes the LPDDR5 command protocol. It achieves a bit density of 0.45Gb/mm², higher than Micron’s 1b planar DRAM technology. The cost was not mentioned but could be high.
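
As a rough sanity check on that density figure – our arithmetic, not Micron’s – $32\,\mathrm{Gbit} \div 0.45\,\mathrm{Gbit/mm^2} \approx 71\,\mathrm{mm^2}$ of memory array area for the 32Gbit die, before any peripheral circuitry is counted (the CMOS access circuitry sits underneath the memory layers).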

Storage array supplier Nexsan thinks that in 2024 there will be more cloud repatriation as organizations face inflated cloud costs. They will need immutable data stores for protection, and increasingly adopt NVMe and NVMe-oF. Storage options that can decrease energy consumption per TB will become increasingly important. With the growth in unstructured data and data archives, cloud remorse, and extended retention policies, Andy Hill, EVP of worldwide pre-sales, says: “We see HDD continuing to play an essential role as a complement to solid-state options such as NVMe for organizations.”

Samsung is expected to scale its V-NAND 3D NAND technology by >100 layers per generation and reach 1,000 layers by 2030.

SK hynix plans to provide HBM3E, the memory product it developed in August 2023, to AI technology companies, with mass production starting in the first half of 2024. The company will showcase Compute Express Link (CXL) interface technology, a test Computational Memory Solution (CMS) product based on CXL, and an Accelerator-in-Memory AiMX project – a low-cost, high-efficiency processing-in-memory chip-based accelerator card for generative AI – at CES 2024 in Las Vegas, January 9-12.

SK hynix plans to commercialize 96 GB and 128 GB CXL 2.0 memory solutions based on DDR5 in the second half of 2024 for shipments to AI customers. AiMX is SK hynix’s accelerator card product that specializes in large language models using GDDR6-AiM chips.

Reuters reports SK hynix aims to raise about $1 billion in a dollar bond deal. The company reported combined operating losses of 8.1 trillion won ($6.19 billion) during the first three quarters of 2023, caused by a demand slowdown for memory and NAND chips in smartphones and computers. Memory chip prices are expected to recover this year after supplier production cuts. SK hynix is investing to retain its lead in high bandwidth memory (HBM) chips.

Nearline drives will be last HDD holdout by 2028

SSDs, which are predicted to render hard disk drives obsolete within four years, are set to grow capacity shipped faster than nearline disk drives between now and 2027, with their cost/GB also falling faster than HDDs’.

Disk drives, with a single set of read-write heads and seek time data access delays, are slower to access data than SSDs. They can stream data about as fast and are a fifth of the cost of SSDs on a per-TB basis. They are, however, smaller in capacity than SSDs, with the latest 24-28 TB HDDs comparing unfavorably to the newest 30.7 TB TLC and 61.4 TB QLC SSDs.

Disk drive markets outside the enterprise mass-capacity nearline storage sector are seeing increased cannibalization by SSDs. Thus notebook and PC workstation storage is increasingly moving to SSDs, and mission-critical 10,000 rpm 2.5-inch drives are being replaced by SSDs as well. Mass-capacity nearline drives are the holdout area, with capacities set to rise to 40 TB and beyond as HAMR technology is deployed.

These nearline drives are the main HDD product now, according to Wells Fargo’s Aaron Rakers, who says they account for approximately 70-75 percent of total HDD exabytes shipped, and an estimated 60 percent-plus of total HDD industry revenue. Sales have slumped in recent quarters as the hyperscalers, Chinese customers, and large enterprises slowed purchases due to the general economic situation. 

The most recent figures show five successive quarters of decreasing nearline HDD capacity shipped

However, Rakers detects signs that nearline disk shipments are going to rise. He writes: “Reports from TrendFocus indicate that C1Q24 supply has been sold out with a macroeconomic-driven tightening taking place at the end of 2023 / early 2024 vs. prior expectations of 2H24.” Disk drive manufacturer Western Digital says it’s seeing signs of a nearline demand pick up.

Rakers expects a strong recovery in the nearline HDD market in 2024 as hyperscale demand returns in the second half. Gartner forecasts that the nearline HDD market will see capacity increase 22 percent year-on-year in 2024.

Nearline is also expected to continue to rise as a share of total HDD revenue, Rakers said, again citing Gartner figures. Nearline drives are tipped to account for around 72 percent of HDD revenues this year and grow to 93 percent in 2027 as consumer devices move to SSDs and enterprises continue to shift away from HDDs outside the nearline storage sector.

SSD capacity shipped is touted to grow at a 21 percent CAGR from 2022 to 2027, while nearline HDD capacity shipped is calculated to grow at a 19 percent CAGR in that period.

Tellingly, SSD cost/GB, we’re told, will fall 13 percent from 2022 to 2027, while nearline HDD cost/GB will decrease less, 8 percent over the same period. That means the SSD 5x price premium over HDDs should fall.
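
How much it falls depends on how those percentages are read – this is our arithmetic, not Rakers’. Treated as total declines over the period, the premium narrows only slightly, to about $5 \times 0.87 / 0.92 \approx 4.7\times$; treated as annual rates compounded over five years, it narrows to roughly $5 \times (0.87/0.92)^{5} \approx 3.8\times$.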

The SSD percentage of total (SSD+HDD) enterprise storage capacity shipped is stable at 31-32% from 2023 to 2027, with no inflection

While the disk drive slump has been taking place, Rakers has seen no sign of increased SSD cannibalization in the enterprise storage market. He writes: “We see SSD capacity shipped as accounting for approximately 12 percent of total enterprise storage (HDD + SSD) capacity shipped. This has remained relatively unchanged over the past few years, or rather not reflected any signs of a materializing inflection despite the severe price declines seen in NAND Flash over the past several quarters.”

He does think that SSD enterprise storage sales could rise in the future and will be watching out for it: “We will be focused on whether we see any signs of an inflection in SSD/flash capacity deployment elasticity in the enterprise market.”

He points out that he now thinks $/GB SSD pricing will increase during 2024, which would deter further SSD cannibalization of HDDs, but three factors could encourage it:

  • AI-optimized accelerated compute requires scale-out flash-based storage.
  • The power consumption and datacenter footprint of all-flash is a long-term advantage versus traditional HDD-based architectures.
  • Increasing flash/SSD density such as Pure Storage Direct Flash Modules (DFMs) “moving from today’s 48 TB DFMs to 75 TB and a target of 150 TB DFMs in 2024 and then to 300 TB capacities by 2026.”

Overall, we think Rakers is detecting signals that the SSD shipment share of enterprise storage could rise as we progress to 2027. Whether it will rise enough to prevent new HDD sales after 2028, though, is another matter entirely.

StorageX wants to end storage and compute separation… but for AI

Profile: Little-known Chinese startup StorageX has developed a Lake Ti computational storage processor it says enables compute to move to the data, allowing for fast processing of big data for Generative AI-class work.

This company has nothing to do with Data Dynamics and its StorageX unstructured data migration product. We have gleaned the information in this profile from the company’s website and other resources. As a Chinese startup, information about it is less readily available than it would be for a Silicon Valley startup.

StorageX is based in Shanghai, China, and was founded in 2020 as Shenzhen Technology by CEO Yuan Jingfeng, who also goes by Stephen Yuan. StorageX has a second China office in Wuxi, and a third office in San Jose. It says its team has an average of more than 20 years of storage industry experience, encompassing memory and controller chips, data acceleration, GPU and datacenter architectures. Core team members come from Western Digital, Micron, HPE, Intel, Microsoft Azure, Nvidia, Tencent and other industry big hitters.

Yuan is said to have worked on Micron’s first NAND memory chip and has an R&D and product development background. He has two decades of industry experience in memory, enterprise storage and datacenter architecture, and holds more than a dozen patents in the fields of memory, storage and system architecture.

The CEO says: “Moving compute is easier and more efficient than moving data, which will generate huge benefit [for] future data heavy applications. … Back to the first principle, the key to improve computing efficiency is to improve the efficiency of data movements, moving compute will be more effective than moving data.”

Market areas include datacenters, streaming media services, autonomous driving and others. StorageX says its computational storage processing (CSP) chips and systems enable the deployment of high-performance near-data computing, providing AI computing power, data acceleration, I/O acceleration capabilities and software platforms. They are claimed to be better suited than traditional separated compute and storage systems to current datacenters focused on massive data sets and high performance.

StorageX has joined the SNIA’s Computing, Memory and Storage Initiative (CMSI) group, and has become a voting member. Yuan is a member of the CMSI’s governing board, along with Nicolas Maigne (Micron), Scott Shadley (Solidigm), and James Borden (Kioxia). Bill Martin (Samsung) is the chairperson, Leah Schoeb (AMD) the vice chair, and Willie Nelson (Intel) the treasurer.

Computational Storage background

The SNIA defines three computational storage categories: 

  • Computational Storage Drive (CSD) – i.e. a computational SSD including persistent storage and computing modules, 
  • Computational Storage Processor (CSP) – which does not include persistent storage, but deploys computing power on the data side for processing, 
  • Computational Storage Array (CSA) – a complete hybrid or converged system that integrates application compute and storage.

So far we have seen computational storage drives, from companies like Eideticom and ScaleFlux, and some computational arrays from startups like Coho Data.

In general, computational storage has underwhelmed the market for two main reasons. First, although it says it moves compute to the data, it only moves a lightweight portion of compute to the data to do trivial stuff like compression or video transcoding. When we think of moving compute to the data we intuitively think of moving x86 class processing to the data. But we don’t get that with computational storage drives. We only get restricted, limited compute based on a few Arm cores, ASICs and so forth.

Secondly, moving compute to the data has been done at a drive level, not an array level. That limits the amount of physical space available for the compute hardware; drives don’t have much empty space inside their shells and also don’t have excess electrical power waiting to be used. Any processor mounted inside a drive can only see the data on that drive. 

If an application needs its compute resource to access 500TB of data, and a disk drive can only hold 24TB and an SSD 30TB, then you would need 17 of the SSDs to hold the data. The processors in each one would need to communicate with the others to enable full dataset processing visibility. The added networking complexity of moving partial results between these drive-level processors is a nonsense: it merely claws back what you lose by not moving the dataset to X86-class compute in the first place.
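
The drive count is simple arithmetic: $\lceil 500\,\mathrm{TB} / 30\,\mathrm{TB} \rceil = \lceil 16.7 \rceil = 17$ SSDs, each holding only a one-seventeenth slice of the dataset that its on-drive processor can see.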

Doing compute at the array level, with the array controller processing power boosted to run app processing as well as storage array operation processing, has been tried and failed. 

Remember Coho Data back in 2015? It developed DataStream MicroArrays, which combined Xeon-based server processing, PCIe NVMe flash cards, and disk storage for closely-coupled storage tasks such as video stream transcoding and Splunk-style data analysis. The compute was not for general application execution and the DataStream system was not positioned as a hyperconverged appliance. 

The firm closed down in 2017. In general, compute inside storage arrays has failed to take off because large data sets need processing by multiple processors and their cores. The processing required to simply operate a large storage array is quite enough without then adding in a whole hardware layer of application processors and their memory and IO paths into the storage stack mix as well. 

This is the background context to what StorageX is trying to achieve.

StorageX technology

StorageX defines the computational storage array as a system combining computational storage processors, control software and other devices. Yuan believes that AI workloads require new processor designs to solve systemic problems faced by X86 and GPU processors when dealing with large amounts of data. Data-centric computing (storage nodes plus CSPs) will follow on from GPU-based processing acceleration and CPUs buttressed by DPUs and network acceleration.

Its Lake Ti (Titanium lake) P100 product incorporates an AI engine (CSP/XPU), data [flow] acceleration and I/O acceleration in one package, which is closely integrated with a storage system and data node. It has dual 100Gbps networking and is designed for data-centric computing, however that is defined. 

In other words, Lake Ti is an ancillary near-data processor attached to a storage system. It is said to empower data-aware applications, with the storage array plus Lake Ti functioning, as we understand it, as a smart data lake. StorageX wants to lower the entry barrier for large language model AI computing by reducing costs and increasing efficiency at the computing hardware level.

Yuan said: “Our biggest difference is that we have taken computing power to the extreme, integrating acceleration and high performance AI in the CSP near the storage node to achieve the goal of data-centric computing and acceleration.”

We understand that StorageX (Shenzhen Technology) has applied for almost 50 patents, focused on AI, large models, near-data computing, SoC chips, data acceleration, IO acceleration and high-speed interconnect technology. It has customers who are trying out its product and we expect to hear more from the company in 2024.

Tiger Technology takes surveillance storage to the cloud

Tiger Technology has adapted file-tiering-to-the-cloud technology and applied it to surveillance videos, enabling public cloud playback and disaster recovery.

Video surveillance storage needs are growing as more and more cameras are installed, and recording resolution increases. Full HD video is 1,920 horizontal pixels and 1,080 vertical pixels, while 4K video increases that to 3,840 horizontal pixels and 2,160 vertical pixels – per frame. Of course most video surveillance cameras don’t operate at these levels but the trend is towards higher resolutions, which means more storage capacity is needed.

Enterprises using video surveillance may generally store the videos on an on-premises digital video recorder system with local storage. According to a Western Digital online calculator, you will need 14TB of capacity to store the output of six 4K cameras running at ten frames per second, active for 12 hours a day, for 30 days’ retention time, with medium video quality and scene activity. Doubling the camera count to 12 and the recording period to 24 hours pushes the capacity need to 56TB. It mounts up quickly – and some customers can have more than a thousand cameras.
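
As a rough illustration of how those numbers scale, here is a back-of-the-envelope Python sketch. The per-camera bitrate is our assumption – roughly 14 Mbit/s, which is what the 14TB figure above implies – rather than the exact model behind Western Digital’s calculator.

```python
def surveillance_capacity_tb(cameras, mbit_per_sec, hours_per_day, retention_days):
    """Estimate raw storage for continuous video recording, in decimal TB.

    mbit_per_sec is the assumed average encoded bitrate per camera; real
    figures vary with codec, frame rate, quality and scene activity.
    """
    bytes_per_day = cameras * (mbit_per_sec * 1_000_000 / 8) * hours_per_day * 3600
    return bytes_per_day * retention_days / 1e12


# Six 4K cameras, ~14.4 Mbit/s each, 12 hours a day, 30 days' retention: ~14 TB.
print(round(surveillance_capacity_tb(6, 14.4, 12, 30), 1))
# Twelve cameras recording 24 hours a day quadruples that to ~56 TB.
print(round(surveillance_capacity_tb(12, 14.4, 24, 30), 1))
```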

Tiger Technology provided video surveillance storage to a US airport operator that had 4,000 cameras and 4PB of data. Due to government regulations, it could not afford any camera data loss. It had a second datacenter constructed with an active-passive failover system. That standby datacenter was flooded in 2019 and the airport lost two thirds of its data. It needed even higher availability and greater data resilience.

Many customers have an on-premises-first policy for mission-critical data, and safeguarding it has cost and other implications.

CEO and founder Alexander Lefterov told an IT Press Tour in Madrid that an active-passive setup for the primary and secondary datacenters, with data replicated from the former to the latter, was insufficient for mission-critical data. If the primary datacenter failed, it could take two to five minutes for the switchover, and camera data would be lost.

It’s better to have an active-active system, with data written to both servers and synchronized – so that no data is lost if or when a datacenter failure occurs. It provides continuous data protection. But you need more storage capacity for this, meaning more cost.

If older, less mission-critical data is sent to the public cloud to save on-premises capacity, it still has to be available. Tiger Technology’s Tiger Bridge software is a general cloud storage gateway and tiering product, coded as a Windows Server kernel-level file system filter driver that can satisfy the data availability requirement. It monitors an on-premises file set and moves low-access rate files to cheaper public cloud storage to save on-premises storage capacity and cost. Surveillance Bridge is an optimized version of the Tiger Bridge software and works with video surveillance workflows.

A customer’s Video Surveillance Management System (VMS) records video to and plays it back from the local storage. As this fills up, configurable policies are used by the Surveillance Bridge software to replicate infrequently accessed data to lower-cost and scalable public cloud storage tiers. This reclaims local storage capacity. A metadata stub is left behind indicating the file’s new location. The VMS always sees data as being stored locally on-site so there are no changes to its operations. The stubs enable fast video search and identification.
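
We do not have sight of Tiger’s code, but the replicate-then-stub pattern described above looks broadly like the following hypothetical Python sketch. The stub-file convention, function names and age-based policy are our own illustration; the real product works as an NTFS filter driver leaving metadata stubs, with configurable policies.

```python
import os
import time

STUB_SUFFIX = ".cloudstub"   # illustrative marker; the real product keeps an
                             # NTFS metadata stub rather than a sidecar file

def tier_old_recordings(local_dir, upload_to_cloud, max_age_days=30):
    """Replicate infrequently accessed recordings to cloud storage and leave
    a small stub behind so applications still 'see' the file locally."""
    cutoff = time.time() - max_age_days * 86400
    for name in os.listdir(local_dir):
        path = os.path.join(local_dir, name)
        if not os.path.isfile(path) or name.endswith(STUB_SUFFIX):
            continue
        if os.path.getatime(path) < cutoff:       # not accessed recently
            cloud_url = upload_to_cloud(path)     # copy the data off-site
            with open(path + STUB_SUFFIX, "w") as stub:
                stub.write(cloud_url)             # metadata pointing at the new location
            os.remove(path)                       # reclaim local capacity
```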

Tiger Technology diagram

During the public cloud upload, Surveillance Bridge splits a video file into small chunks and an MD5 check ensures no video frames are lost during the transfer. In effect, Surveillance Bridge extends the NTFS file system to the cloud, helping to maintain the access controls, encryption, and auditing capabilities inherent to NTFS. The cloud data also functions as a disaster recovery resource.
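
In outline, the chunk-and-verify step might look like this – our sketch, with an assumed chunk size and a hypothetical put_chunk upload callback; the real driver does this inside the NTFS filter stack rather than in Python.

```python
import hashlib

CHUNK_SIZE = 8 * 1024 * 1024      # assumed 8 MiB upload chunks

def upload_with_md5(path, put_chunk):
    """Split a video file into chunks, upload each, and verify with MD5."""
    with open(path, "rb") as f:
        index = 0
        while True:
            chunk = f.read(CHUNK_SIZE)
            if not chunk:
                break
            local_md5 = hashlib.md5(chunk).hexdigest()
            remote_md5 = put_chunk(path, index, chunk)   # cloud returns its own checksum
            if remote_md5 != local_md5:
                raise IOError(f"chunk {index} of {path} corrupted in transit")
            index += 1
```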

The US airport using Surveillance Bridge met three criteria:

  • Functional resilience – ability to have full system functionality with hardware or software component failure.
  • Data resilience – full data availability with system component disruption.
  • Maintenance complexity – ability to regularly test and maintain the system without affecting business operations and engaging complex failover protocols.

It has survived four events that would previously have caused data loss. Servers have been upgraded with no loss of service. There is no need to schedule maintenance at night, and there has been zero data loss and zero downtime.

Surveillance Bridge supports all major public cloud providers and storage tiers, including archive for long-term retention. When surveillance data is stored in an archive tier, playback has an added step. Admin staff can use Surveillance Bridge’s Job Manager utility to rehydrate specific date ranges or, in some cases, selected cameras. Once rehydrated, the recordings can be played back directly from within the VMS. 

The software does not use any proprietary formats when replicating data to the cloud. Customers don’t suffer from vendor lock-in and their data can always be retrieved. Find out more and access data sheets here.

Bootnote

Tiger Technology is based in Sofia, Bulgaria, with offices in the US, France, and UK. Its focus is producing software for hybrid cloud file data services. Tiger has around 70 employees and more than 11,000 customers in Media & Entertainment, surveillance, healthcare, and general IT around the globe.

Storage news ticker – January 2, 2024

TrendForce’s investigation into the impact of the recent 7.5 magnitude earthquake in the Noto region of Ishikawa Prefecture, Japan, reveals that several semiconductor-related facilities are located within the affected area. This includes MLCC manufacturer Taiyo Yuden, silicon wafer (raw wafer) producers Shin-Etsu and GlobalWafers, and fabs such as Toshiba and TPSCo (a joint venture between Tower and Nuvoton). Given the current downturn in the semiconductor industry and the off-peak season, along with existing component inventories and the fact that most factories are located in areas with seismic intensities of level 4 to 5 – within the structural tolerance of these plants – preliminary inspections indicate no significant damage to the machinery, suggesting the impact is manageable.

DRAM and NAND supplier Micron has settled a lawsuit between it and Fujian Jinhua Integrated Circuits, a Chinese state-supported chipmaker. The legal dispute included accusations of IP theft by each against the other. According to YiCai Global, three Micron execs quit in 2016 and joined United Microelectronics in Taiwan, which subsequently signed a 32nm DRAM technology supply deal with Fujian in China. Micron sued United Microelectronics and Fujian for alleged IP infringement, and the two sued Micron in return, alleging infringement of their IP in Micron products sold in China. United Microelectronics made a reconciliation payment to Micron in late 2021 when that element of the case was settled. The Micron-Fujian settlement is global in scope and ends all legal disputes between them.

China’s Cyberspace Administration (CAC) declared in May 2023 that Micron’s products represent a security risk in the country and should not be bought by “operators of critical information infrastructure.” This affected 14 percent of Micron’s global sales ($2.2 billion). Micron recently said it will invest $602 million (4.3 billion yuan) in its Chinese chip-packaging facility. It looks as if Micron’s relationship with China is improving.

A Panmnesia video explains the memory specialist’s CXL-Enabled Accelerator technology, as used in vector embedding searches for AI, which will be showcased at CES 2024.

Panmnesia video screengrab.

Samsung has verified CXL memory expander operations with Red Hat Enterprise Linux in a customer environment. Samsung optimized its CXL memory for RHEL 9.3 and verified memory recognition, read and write operations in Red Hat’s KVM and Podman environments. “The successful verification of Samsung’s CXL Memory Expander interoperability with Red Hat Enterprise Linux is significant because it opens up the applicability of the CXL Memory Expander to IaaS and PaaS-based software provided by Red Hat,” explained Marjet Andriesse, SVP and head of Red Hat Asia Pacific. Samsung and Red Hat are jointly working on a “RHEL 9.3 CXL Memory Enabling Guide.” Their ongoing agreement covers a range of storage and memory products, including NVMe SSDs, CXL memory, computational memory/storage and fabrics.

SSD controller, object and block cloud storage predictions

Storage suppliers’ predictions generally foresee the suppliers’ own products being successful – because they represent a view of the market through the supplier’s own lens. No supplier is going to predict that its product strategy is misplaced or will fail.

Successful suppliers will have a view of the market that matches what customers in the market are thinking. See if you agree with these forecasts from Backblaze (cloud storage), Lightbits (cloud block storage) and Phison (SSD controllers).

Backblaze 

Nilay Patel, VP of Sales at Backblaze, predicts:

1: Cloud operations budgets get grilled. We’re at the end of a cycle that started in March 2020, when IT departments had to make work from home (WFH) a thing that could work practically overnight. There was a lot of overspending that folks just accepted. And of course, that emergency spending continued on for a couple of years. … Organizations will have to decide whether to solidify this spending as part of their everyday budget or start cost cutting in other areas.

Vendors that sell collaboration storage such as Box and Microsoft are starting to cut unlimited plans to save on their own operational costs. … The combination of cost-cutting at both ends will help start an exodus from those traditional vendors as anxious and budget-strapped organizations search for less expensive options. 

2: Ransomware protection increases. Organizations have moved past panicking that cyberattacks are inevitable, and are realizing protection is not as complicated nor as expensive as they thought it would be. … IT departments will be able to justify the investment as far less expensive than the average downtime and mitigation costs that result from a ransomware attack.

3: Easier object storage options will fuel AI innovation. Recently, Amazon introduced S3 Express One Zone, object storage designed for massive AI applications with higher speed, higher scalability, and lower latency. This will enable new applications that in the past developers wouldn’t have been able to write without spending a huge amount of money—but not just for AWS customers.

Whenever there’s new innovation that comes out of AWS, ecosystems are created around it and, typically – thanks to AWS’s complexity and high prices—alternative approaches from other companies bring more value to businesses. The expectation is a broader impact where the tooling, model training, AI inference, and other AI oriented workflows will support data stored in object storage as a matter of course. Organizations who are competing with AWS, or looking for a less expensive approach, will be able to unlock these new performance capabilities via using object storage from other providers. 

Lightbits 

Eran Kirzner, co-founder and CEO, thinks:

1. NVMe over Fabrics (NVMe-oF) will gain more momentum as the main Tier-1 storage connectivity. Kirzner predicts NVMe-oF will continue to replace iSCSI and Fibre Channel (FC) and become the de facto standard for cloud storage technology and the underlying storage access protocol to support modern-day applications with a thirst for higher performance. 

Industry tech leaders like Microsoft have recognized the convergence of enterprise and modern cloud storage platforms by jumping into the mix and democratizing the NVMe protocol with their announcement at 2023’s Microsoft Ignite to support inbox NVMe/TCP, making it available now on all datacenter operating systems.

2. The number and market size of AI Cloud providers will grow exponentially. We’ll see more native AI services offered by the hyperscalers and a proliferation of AI Cloud service providers offering specialized services. AI Cloud service providers with specialized platforms, like Crusoe Cloud, are being launched where the speed and scalability of GPUs, compute, and storage will play a key role in enterprise organizations’ successful AI initiatives.

3. Hybrid cloud is here to stay and enabled by software-defined cloud architectures. Hybrid cloud implementations are becoming universal, with organizations using multiple clouds to support diverse application workloads. … Business leaders with a hybrid- or multi-cloud strategy want the flexibility of moving workloads to the cloud platform that offers the best cost-efficiencies, without compromising on performance, scalability, or data services.

They want the same look, feel, and capabilities from their cloud storage across any deployment platform – on-premises or public clouds, plus the fast provisioning of storage resources to where and when it’s needed. … They should prioritize a software-defined architecture that offers license portability across cloud platforms.

4. Data security will be a critical capability of cloud storage systems. Organizations will require detection, protection, and prediction from advanced AI-driven monitoring systems (AIOps) integrated into their cloud storage systems. Encryption at rest and in flight are table stakes for any cloud storage supplier. Many business leaders are moving away from hardware-based encryption with Self-Encrypting Drives (SEDs) and shifting to software-based encryption to reduce storage costs and lead time, eliminate hardware vendor lock-in, and enable cloud and hybrid cloud portability.

5. Legacy storage appliances with their monolithic architectures are going the way of the dinosaur, incapable of keeping pace with performance-sensitive, cloud-native applications or enabling organizations’ cloud-first, hybrid cloud strategies. Public cloud usage is ubiquitous; cloud storage is no longer a barrier to migrating legacy performance-sensitive applications. In tandem, Kirzner sees many organizations building cloud-native infrastructure within their on-premises data centers and the continued proliferation of CSPs offering specialized platforms. 

The common thread for successfully building a cloud service is a modern storage system that is software-defined and NVMe-based. This combination delivers fast, simple storage provisioning, with the flexibility to move storage services where and when they are needed, as well as lower storage TCO. 

Phison

SSD controller supplier Phison’s 2024 predictions:

  1. SSD, GPU, DRAM and other essential data center components will increasingly include device-level cryptographic identification, attestation, and data encryption to help better guard data against attack as AI deployments expose new digital threats.  
  2. Private, on-premise deployment of infrastructure for LLMs to run AI model training on proprietary data without exposure to cloud security vulnerabilities. 
  3. Ultra-rapid advancements in AI and LLMs will challenge AI infrastructure reliance on GPU and DRAM, resulting in new approaches to architecture that take greater advantage of high-capacity NAND flash. 
  4. In these systems, PCIe 5.0 NAND flash will gain wider adoption to power applications in production environments at top speed and efficiency, freeing GPU and DRAM to separately run AI inference models, maximizing resource efficiency and productivity.
  5. Private LLMs will focus initially on essential activities that are not held to strict time-to-market deadlines, such as improved chatbot interactions for professionals and incremental advancements for patented products.
  6. As these private deployments accrue positive results, applications will be adapted for adjacent operations and procedures, furthering the proliferation of these everyday infrastructural solutions for AI.  

Dr. Wei Lin, CTO, Phison HQ, Head of Phison AI R&D and Assistant Professor at the College of Artificial Intelligence, National Yang Ming Chiao Tung University, said in a statement: “As critical infrastructure evolves to support rapid advancements in AI, NAND flash storage solutions will take a central role, enabling greater architectural balance against GPU and DRAM for balanced systems built to maximize the benefits of ongoing, long-term AI deployment.”  

UltiHash – making better dedupe hash technology 

UltiHash, a German startup that has devised a byte-level deduplication algorithm which it claims dedupes better than existing alternatives and offers faster data access, has raised a $2.5 million pre-seed funding round.

Its deduplication software is deployed in an S3-compatible object storage cluster that can run on-premises or in AWS. The cluster has a head node and data nodes. Clusters can scale horizontally, with variable-sized data nodes supporting petabyte-scale volumes.

Incoming data is scanned and repeated variable-sized byte-level sequences are replaced by markers. UltiHash says its dedupe operates both within and across datasets and is independent of data type – structured, semi-structured or unstructured – covering text, images, videos, audio files, database records and so forth.
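
UltiHash has not published its algorithm, but byte-level deduplication of this kind is typically built on content-defined chunking plus hashing of the resulting chunks. The rolling-sum chunker below is a generic, simplified Python illustration of that idea, not UltiHash’s code.

```python
import hashlib

def chunk_boundaries(data, mask=0x0FFF, window=48):
    """Tiny content-defined chunker: cut wherever a rolling sum of the last
    `window` bytes hits a bit pattern, so identical byte runs tend to produce
    identical chunks even when they shift position between files."""
    cuts, rolling = [], 0
    for i, byte in enumerate(data):
        rolling += byte
        if i >= window:
            rolling -= data[i - window]
        if i > 0 and (rolling & mask) == 0:
            cuts.append(i)
    return cuts

def deduplicate(data, store):
    """Replace repeated chunks with markers (hashes) into a shared chunk store."""
    refs, start = [], 0
    for cut in chunk_boundaries(data) + [len(data)]:
        chunk = data[start:cut]
        digest = hashlib.sha256(chunk).hexdigest()
        store.setdefault(digest, chunk)   # each unique chunk is stored once
        refs.append(digest)               # the marker stands in for the bytes
        start = cut
    return refs
```

In a real system the chunker would use a proper rolling hash with bounded chunk sizes, and the chunk store would be the S3-compatible object repository; the markers are what gets written in place of the repeated byte ranges.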

UltiHash diagram colour inverted for readability.

An UltiHash diagram (above) explaining its dedupe says “Files and folders are analyzed at the byte level.” However, the UltiHash storage repository is an S3-compatible object storage construct, as a second diagram indicates:

The “Deb package” is UltiHash’s software inside a Debian Linux package.

This means that file and folder ingestion is carried out by some kind of input routine and, we understand, the data is sent to the UltiHash software using its API. 

UltiHash integrates with S3-native applications and services, helping with its adoption. It says it has “built-in features for data backup and recovery, ensuring high availability and business continuity.” The software supports “multi-tenant environments with robust access control, ensuring secure segmentation and user management” and has “monitoring for real-time insights into storage usage, performance, and operational trends.”

The company’s lossless dedupe is claimed to cut storage costs by up to 50 percent, an overall 2:1 dedupe ratio. Its website has details of storage capacity savings vs AWS S3 for several file types: 

Customers’ overall dedupe ratios with UltiHash will vary with the data types used. UltiHash also claims speed advantages, saying its software is up to 50 percent faster on reads (GETs) than Amazon S3 when benchmarked on TIFF files: 

The company’s read throughput and time numbers show UltiHash is slightly slower than S3 when reading RAW files and very much faster than S3 when reading CSV and PNG files, and also XML files.

But the performance picture versus S3 places UltiHash at a disadvantage on writes (PUTs) for RAW, TIFF, CSV, PNG and XML files, as its own AWS performance benchmark numbers show:

We asked the company about the dedupe load on CPUs and were told: “Some CPU is used only during the “write” phase. Since UltiHash, in general, divides between “write” and “read” activities and can provide separate nodes for them (standard practice for any high-load IO solution). For on-premise it would mean that you need one CPU heavier machine only, and the rest can be more general-purpose nodes.”

The software is available in a public beta and in the AWS Marketplace. It is set up in a VPC with the UltiHash AMI and a CloudFormation template. 

Pricing is straightforward, with customers being charged $6/TB/month. A 30-day free trial is available. Request a demo here.

UltiHash was co-founded in 2022 by CEO Tom Lüdersdorf, a Berlin-based entrepreneur, and original CTO Benjamin-Elias Probst. The current CTO, Katja Belova, was recruited in December 2023. UltiHash has a development office in Berlin, some remote workers in the Berlin timezone, and a head office in San Francisco. Overall it has a team of around 10 people.

The pre-seed round was led by Inventure, with participation from PreSeedVentures, Tiny VC, Futuristic VC, The Nordic Web, Antti Karjalainen (founder and angel scout for Sequoia Capital), and other private investors.

Bootnote

Startup StorReduce developed variable-length deduplication technology for data stored in Amazon S3 buckets. It was founded in 2014 and the technology seemed so promising that Pure Storage snapped up StorReduce in 2018. Pure ran the software in an ObjectEngine appliance and stored the deduped data in an underlying FlashBlade array. But the system was not a success and Pure closed it down in 2020, saying it would work with backup vendors, who each had their own dedupe technology, to have them use FlashBlade as a target.

Nowadays Cohesity, ExaGrid, Quantum, Veritas and other suppliers use variable-length deduplication. It is not clear how UltiHash’s dedupe ratio compares in efficiency to other variable-length dedupe suppliers’ technology. Its use case is different with a focus on workloads needing fast access to S3 data. Lüdersdorf stated: “The exponential increase in data storage resources is not sustainable from either a business or environmental perspective. Resource optimization is the only way forward to manage data growth and continuously use data as the lever to solve challenges. In this industry, speed is a must, and we’re here to make hot storage resource-efficient.”

B&F Christmas crossword answers

Here are the answers to the Christmas crossword:

Notes on the answers which, except for 28 down, come from the B&F Glossary:

Across

 1. Resilvering.

 5. ILM – Information Lifecycle Management.

 9. Disk drive.

10. AWS for Amazon Web Services.

11. PB standing for Petabyte with Peta standing for People in favor of Ethically Treating Animals.

12. RAID.

13. Snap as in snapshot.

14. YB standing for Yottabyte.

16. TiB short for Tebibyte.

17. Tebibyte.

19. Get, the opposite of Put.

20. HA short for High Availability.

21. Host Bus Adapter.

26. AFA standing for All-Flash Array.

27. Hudi – Apache Hudi (Hadoop Upserts Deletes and Incrementals).

29. VM as in Virtual Machine.

31. Kubernetes.

33. 3D.

34. S3 as in Amazon’s Simple Storage Service.

Down

1. Read Write Head as used in hard disk drives.

2. SaaS as in Software-as-a-Service.

3. B as in InfiniBand.

4. Gigabyte.

5. iSCSI as in Internet Small Computer Systems Interface.

6. Metadata.

7. IDE as in Integrated Drive Electronics.

8. S I Unit.

15. Bit.

18. Erasure.

22. SMART as in Self-Monitoring, Analysis, and Reporting Technology which is used by disk drive or SSD controllers.

23. Blocks.

24. TSV as in Through Silicon Via.

25. M.2 which, with no dot, is M2.

28. Ion – this is not in the glossary.

30. CSI standing for Container Storage Interface.

32. Ei standing for Exbibyte or 1,024 pebibytes.