Scandal-struck data replicator WANdisco was readmitted to the AIM stock market yesterday after shareholders approved proposals to raise fresh capital via a new share issue.
The company's stock was suspended from AIM in early March when WANdisco claimed that a single senior sales rep had fabricated sales in calendar 2022: the reported $24 million revenue for the year turned out to be just $9.7 million worth of genuine turnover.
A wholesale restructuring of the board and the C-suite followed, with co-founder, chairman and CEO Dave Richards and CFO Erik Miller both leaving. WANdisco was forecast to run out of cash earlier this month and shareholders were asked to approve a capital infusion through issuing new shares. This they did on July 24 and the company raised $30 million at £0.50/share.
Some 99.97 percent of shareholders approved this resolution at a General Meeting yesterday. Chairman Ken Lever, who was installed earlier this year, said in a statement:
“We are absolutely delighted that the resolution to increase the authorised share capital has received such overwhelming support, which will enable the company to conclude the fundraise. The board and the executive management can now concentrate on driving the business forward to achieve growth in value for shareholders and all stakeholders.”
The shares are now trading at 46.74 GBX (£0.4674), compared to their 1,310 GBX (£13.10) value immediately before the suspension. Lever's exec team have a mountain to climb.
Share price chart from Google Finance
Management has hatched a turnaround plan and this could involve a name change at some point.
Just last month, WANdisco filed its 2022 annual report, which contained a going concern warning and showed the company losing roughly $3 for every dollar of revenue: turnover was $9.7 million and the net loss came in at $29.7 million.
Anthony Miller, co-founder and managing partner at TechMarketView, said the overwhelming shareholder support WANdisco received at its AGM was a “show of confidence worthy of an autocratic regime”.
“As a result, WANdisco’s shares were re-listed on AIM this morning at 50p. They last traded at £13.10,” he added. “WANdisco shares are now no more than casino chips. Place your bets and see if you can beat the banker.”
VAST Data has secured a contract from the Texas Advanced Computing Center (TACC) for its single-tier, NFS-based flash storage system, edging out traditional disk-based HPC parallel filesystems such as Lustre.
TACC, located at the University of Texas in Austin, is in the process of building Stampede3, an open-access supercomputer powered by Intel Max Series CPU and GPU nodes that can deliver 10 petaFLOPS of performance. The hardware was funded by a $10 million grant from the National Science Foundation (NSF) and will be used by its scientific supercomputing research community.
Jeff Denworth
Jeff Denworth, co-founder and CMO of VAST Data, said in a blog post: “TACC has selected VAST as the data platform for Stampede3, their next generation large-scale research computer that will power applications for 10,000+ of the United States’ scientists and engineers. The Stampede purchase precedes selection for their next really big system, which will likely be announced later this year.”
With over 1,858 compute nodes, more than 140,000 cores, over 330 terabytes of RAM, and 13 petabytes of VAST Data storage, Stampede3 is poised to provide a significant boost in computing power for the scientific community. The VAST flash system is providing both scratch and nearline storage, replacing a DDN Lustre disk-based system in the prior Stampede2.
The VAST storage will offer 450GBps of read bandwidth, serving as a combined scratch and nearline storage tier. Despite considering various storage options, including Lustre, DAOS, BeeGFS, and Weka, TACC opted for VAST Data due to its ability to handle anticipated AI/ML workloads that require fast random reads. Thanks to data reduction, with a 2:1 reduction ratio, and QLC NAND, TACC found VAST’s flash cost affordable compared to traditional disk storage.
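To make that flash-versus-disk economics point concrete, here is a minimal back-of-envelope sketch of how a 2:1 data reduction ratio halves the effective cost per usable terabyte. The dollar-per-terabyte inputs are hypothetical placeholders, not TACC or VAST pricing.

```python
# Illustrative arithmetic only: the $/TB inputs are hypothetical, not
# TACC or VAST pricing. The point is how a 2:1 reduction ratio halves
# the effective cost per usable terabyte of QLC flash.
def effective_cost_per_tb(raw_cost_per_tb: float, reduction_ratio: float) -> float:
    """Cost per terabyte of logical (post-reduction) capacity."""
    return raw_cost_per_tb / reduction_ratio

qlc_flash = effective_cost_per_tb(raw_cost_per_tb=100.0, reduction_ratio=2.0)
nearline_disk = effective_cost_per_tb(raw_cost_per_tb=40.0, reduction_ratio=1.0)

print(f"QLC flash: ${qlc_flash:.0f}/usable TB, nearline disk: ${nearline_disk:.0f}/usable TB")
# QLC flash: $50/usable TB, nearline disk: $40/usable TB - a much narrower gap
```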
Stampede3
Stampede3 is a hybrid or heterogeneous setup with several subsystems:
High-end simulation from 560 Xeon Max Series CPU nodes (c63,000 cores); these are Sapphire Rapids gen 4 Xeon SPs with high bandwidth memory (HBM),
AI/ML and graphics subsystem using 40 Max Series GPUs (Ponte Vecchio as was) in 10 Dell PowerEdge XE9640 servers, each with 128GB of HBM2e RAM,
High memory-dependent computation from 224 Gen 3 Xeon SP nodes incorporated from earlier Stampede2 system,
Legacy throughput and interactive computing from >1,000 Stampede2 Gen 2 Xeon SP nodes.
Dan Stanzione
TACC executive director Dan Stanzione commented: “We believe the high bandwidth memory of the Xeon Max CPU nodes will deliver better performance than any CPU that our users have ever seen. They offer more than double the memory bandwidth performance per core over the current 2nd and 3rd Gen Intel Xeon Scalable processor nodes on Stampede 2.”
These processing systems and the storage facilities will be interconnected with an Omni-Path Fabric 400 Gbps network, offering a smooth transition for existing Stampede2 users as it transforms into Stampede3. This upgraded system is projected to operate from this year through 2029.
Supercomputer simulations and atomic resolution microscopes were used to directly observe the signatures of electron orbitals in two different transition-metal atoms, iron (Fe) and cobalt (Co). This new knowledge can help make advancements in fields such as materials science, nanotechnology, and catalysis. Credit: Chen, P., Fan, D., Selloni, A. et al. DOI: 10.1038/s41467-023-37023-9
Not Lustre
Stampede2 used a Lustre parallel file system running on 35 Seagate disk-based ClusterStor 300 arrays (Scalable Storage Units or SSUs). There was a 33 x SSU scratch system and a 2 x SSU home system. These were supported by a 25PB backend DDN Lustre storage system called Stockyard, which operates across the TACC site.
Lustre is used by more than half the top 500 supercomputing systems and is a near-standard supercomputing file system because of its ability to deliver read/write IO to thousands of compute nodes simultaneously. TACC chose VAST’s NFS over Lustre because VAST Data’s architecture delivers parallel file system performance without the inherent complexities of a special parallel file system like Lustre. It also out-scales Lustre, we’re told, having proven capable of handling an extremely IO-intensive nuclear physics workload faster and more efficiently.
One TACC nuclear physics workload is extremely IO-intensive, and Lustre can only cope with 350 nodes running it before the Lustre metadata server runs out of steam. TACC tested a VAST Data system and found it also supported 350 client nodes on this workload, running 30 percent faster than the Lustre storage. It then connected the VAST storage to Frontera and scaled the client count through 500, 1,000 and 2,000 nodes to 4,000 clients, and the VAST storage kept up.
Denworth noted that “20U of hardware running VAST software could stand up to 50 racks of Dell servers.” During a VAST software upgrade, one of the VAST storage servers suffered a hardware failure, which left the overall system running with two versions of the VAST operating system. There was also a software installer bug. The storage kept functioning, minus one server and with two VAST OS versions in play, and the upgrade process completed once the installer bug and the failed hardware were fixed.
Denworth said: “HPC file system updates are largely done offline (which causes downtime) and it’d be crazy to think about running in production with multiple versions of software on a common data platform.” There was no VAST system downtime during the failures, TACC said.
The VAST storage should be installed in September and Stampede3 should be operating in production mode in March next year.
TACC has also been evaluating a flash file system software upgrade for Frontera, looking at Weka and VAST. Frontera is set to be superseded by the next flagship TACC supercomputer, the exascale-class Horizon, and VAST is now in with a chance of being selected as a Horizon storage supplier.
Denworth commented: “Stampede3 will kick off a bright partnership that’s forming between VAST and TACC, and we want to thank them for their support and guidance as we chart a new path to exascale AI and HPC.”
TACC Systems
TACC has several supercomputer systems:
$60 million Dell Frontera; TACC’s flagship system, performing at 23.5 Linpack petaFLOPS from 8,008 Intel Xeon Cascade Lake-based nodes plus specialized subsystems. It is 21st in the Top 500 supercomputer list. It has 56PB of capacity with 4 x DDN 18K Exascaler storage disk-based arrays providing 300GBps bandwidth. There are also 72 x DDN IME flash servers with 3PB capacity and 1.5TBps bandwidth.
$30 million Dell-based Stampede2 provides 10.7 petaFLOPS using 4,200 Intel Knights Landing-based nodes and 1,736 Intel Xeon Skylake-based nodes. It is a capacity-class system ranked number 56 in the Top 500 list; it is the second-generation Stampede system and is being superseded by Stampede3. Stampede2 is a 2017 system, and followed on from the initial 2012 Stampede system.
Lonestar5 for HPC and remote visualization jobs running at 301.8 teraFLOPS with >1,800 nodes and >22,000 cores with a 12PB Dell BeeGFS file storage system. Now superseded by Lonestar6.
Wrangler is a smaller supercomputer of 62 teraFLOPS for data-intensive work, such as Hadoop, with 96 Intel Haswell nodes (24 cores and a minimum of 128GB DRAM per node), a 500TB high-speed flash-based object storage system, and a 10PB disk-based mass storage subsystem with a replicated site in Indiana.
Stockyard2 is a global file system providing a shared 10PB DDN Lustre project workspace with 1TB/user and 80GBps bandwidth.
Ranch is a 100PB tape archive using a Quantum tape library.
Stampede2 has been successful as an open science system, with more than 11,000 users working on more than 3,000 funded projects running more than 11 million simulations and data analysis jobs since it was started in 2017. This replicated the success of Stampede1, which ran more than 8 million simulations with more than 3 billion compute hours delivered to 13,000-plus users on over 3,500 projects.
At one point in 2018, Stampede2 was fitted with 3D XPoint NVDIMMs as an experimental component in a small subset of the system.
Analysis: DataCore is actively developing a platform that integrates block- and file-based storage across edge sites, datacenters, and the public cloud. It caters to various applications, including virtual machines, containers, media and entertainment, as well as AI.
DataCore is an established supplier in the field of software-defined storage, founded in 1998. The company has gained significant traction with its SANsymphony block storage, Swarm object storage (acquired from Caringo), and Bolt Kubernetes-orchestrated container storage (acquired from MayaData). Bolt has been notably involved in storage capacity bids reaching 100PB.
Dave Zabrowski
The business is well established, having traded profitably for 14 years. Its client base is robust, with strong ties to medium and small enterprises, as well as departmental and line-of-business applications in larger businesses. CEO Dave Zabrowski, who joined DataCore in 2018, plans to maintain this approach. His vision is to tap into larger enterprises through vertical market applications, thereby circumventing the need for CIO-level selling efforts.
Perifery
DataCore also runs a division known as Perifery, which provides edge storage solutions. This division offers the Perifery archiving appliance and Transporter physical data transfer products. These products cater to vertical edge markets such as media and entertainment, and health and life sciences. The key goal is to integrate these products into workflows rather than selling them as standalone products.
Perifery also oversees the Swarm object storage product and the media and entertainment object storage supplier Object Matrix, which DataCore acquired in January. Currently, Perifery is working on an AI product designed to preprocess media content generated at the edge before it is transferred to the public cloud or a datacenter.
Zabrowski shared a hypothetical scenario where a surgeon is operating on a patient and a camera has taken images of polyps. These are tissue growths inside the body that could be benign, potentially cancerous, or actually cancerous. The image is fed to a local server/workstation, where a machine learning inference system analyzes it and tells the surgeon what percentage of such polyps are, or could become, cancerous. In effect, the ML inference software, operating at the edge (in the hospital), gives the surgeon cut-it-out or leave-it-in information in real time.
We think that DataCore’s Perifery division could be developing AI stack infrastructure components that could help in such work.
Containers and objects, not files
DataCore’s focus has shifted from vFilo, its file storage software product, to object storage for managing unstructured data storage. Zabrowski has expressed confidence in Swarm’s technology for its superior scalability when compared to MinIO object storage.
To run DataCore’s software platform components on the edge, in datacenters, and the public cloud, containerization is necessary. However, Zabrowski has cautioned that it is not possible to containerize monolithic applications like SANsymphony without rewriting them. There are different considerations for distributed software like the acquired Caringo Swarm object storage, which will likely be containerized.
The long run
DataCore’s primary investor, Insight Partners, is a long-term stakeholder and, as Zabrowski states, is not looking to exit the investment soon. DataCore is a 25-year-old firm that is evolving to cater to an edge-led future. This evolution includes its division Perifery and AI applications that facilitate more processing at the edge.
The future of DataCore involves more than just providing storage. It aims to enable data and integrate it into workflows by facilitating data movement between workflow stages and manipulating and processing it for efficiency. This evolution will likely involve data transfer to datacenters and the public cloud for additional processing and long-term storage. In essence, DataCore aims to be a comprehensive data storage infrastructure supplier rather than just a provider of storage.
A report tracking enterprise VARs (value-added resellers) by analyst William Blair provides insights into the current state of the enterprise IT infrastructure market and the prospects and challenges for industry players like Commvault, NetApp, Nutanix, and Pure Storage.
The report, authored by Jason Ader and Sebastien Naji, shares perspectives of 64 resellers on quarterly spending. The recent findings on Q2 2023 reveal “growing optimism that the worst is behind us, especially as the macro environment has been more resilient than many expected and customers’ IT needs are steadily expanding.”
“Though there was an uptick in positive sentiment (34 percent of respondents felt better than three months ago), the general consensus was that things are staying the same (78 percent of respondents),” they say. “The pipeline for the third quarter looks solid with 42 percent of respondents expecting better results versus the second quarter. Finally, pricing pressure rose from the prior three quarters as vendors saw increased pushback from customers.”
The main customer priorities, according to VARs, revolve around cloud apps/services, security, data protection, and networking. There has been a small degree of workload repatriation from the cloud, but this is balanced by ongoing migration in the other direction, which is weighing on on-premises refreshes. Ader writes: “VARs assert that the size and number of on-premises datacenter refreshes are being negatively impacted by the continued migration to cloud (less on-premises infrastructure needed going forward).”
Although the security and networking areas were strong, there were weak areas in servers and storage “as customers digest prior purchases, sweat assets, and move more workloads to the cloud … VARs are starting to see green shoots of recovery in the second half for on-premises compute and storage, especially for larger projects that have been on hold.”
Specific vendors mentioned in the survey were Commvault, NetApp, Nutanix, Pure Storage, and Varonis, each offering a unique perspective on the state of the market.
Commvault
The report indicates that Commvault resellers have observed steady customer renewal and expansion activity, given that data protection is a customer priority.
The Federal market was a strong point, with Metallic’s FedRAMP status a contributor. Metallic in general is doing well with its SaaS backup capabilities, both for new customers and as a cross-sell opportunity in existing Commvault accounts. Microsoft 365 and Salesforce backup featured as focus areas for Metallic.
The two analysts noted that Commvault is fighting for business in a tough market: “The data protection market remains highly fragmented and competitive, with newer vendors like Rubrik, Cohesity, Veeam, Druva, HYCU, and OwnBackup all seeing good traction … often viewed as less expensive and easier to manage and sell than incumbent offerings from Veritas, IBM, Dell-EMC, and Commvault.”
”We think Commvault is faring better than Veritas and Dell-EMC, which VARs say are both donating significant market share.”
NetApp
NetApp VARs reported mixed views, some doing well, others seeing lower demand. The VARs think “NetApp still has the best storage technology for public cloud and hybrid cloud use cases. Its cloud-native solutions (wherein NetApp’s file storage is a first-party service offered by the three big CSPs) continue to be a major differentiator for the company – e.g. Azure NetApp Files (ANF) and Cloud Volumes Service (CVS) for Google Cloud … as well as the newer FSx for ONTAP in AWS.”
However, “several VARs noted that adoption remains somewhat narrow, supporting NetApp’s commentary that cloud revenue is concentrated among a small number of large customers.”
In the cloud ops area, “Spot is not performing as well as it could be amid the current focus on reducing cloud costs and cloud optimization. Following the departure of Anthony Lye in August 2022, the NetApp cloud portfolio has had mixed results – though NetApp management believes with the appointment of Haiyan Song as General Manager of the CloudOps business, this portfolio could get a new boost.”
Nutanix
The VARs noted that Nutanix was doing well and gaining share. “Nutanix renewals continued at a solid pace in the second quarter particularly for its Core HCI (hyper-converged infrastructure) platform as customers expand nodes and move new workloads to Nutanix. With HPE’s SimpliVity and Microsoft’s Hyper-V solutions continuing to lose share, Nutanix has seen some good replacement opportunities within certain enterprise accounts (we heard of an eight-figure win with Micron in the quarter).”
Uncertainty about VMware’s future with Broadcom may mean Nutanix will gain share in HCI over VMware’s vSAN. But “HCI demand appears to have been more muted of late as customers strive to move more workloads to the public cloud (causing some downsizing of HCI projects).” However, Nutanix and HPE are doing well, with Nutanix’s HCI software on HPE hardware winning against Dell’s VxRail, “which is losing traction with customers given the ongoing brouhaha around VMware’s acquisition.”
Some VARs said Nutanix is struggling to upsell customers to its add-on offerings which are thought to be expensive.
The two analysts worry that “a secular shift to cloud for midmarket customers will weigh on the company’s ability to meet its growth and profitability targets.”
Pure Storage
In the Pure Storage all-flash array area VARs noted that after “strong storage growth in 2021 and 2022, this year has seen deal activity come down as customers digest previously purchased capacity.” In particular, “IDC estimates the external enterprise storage systems market will contract by 1.8 percent in 2023 before returning to 5.5 percent growth in 2024.” On that basis, Ader said: “We are modeling a 7 percent year-over-year decline in Pure product sales in its July quarter, reflecting the tough comps from last year.”
He says on-premises demand remains resilient, driven by large unstructured data sets and the fact that larger customers are hesitant to move their tier-1 apps to the cloud.
The VARs are seeing continued expansion and refresh activity with backup, ransomware protection, AI/ML, and analytics use cases mentioned. In the data protection area “Pure FlashRecover (which is powered by Cohesity) has been the prominent product here, with Pure reps promoting its near instant restore capabilities in the event of a ransomware disaster.”
The VARs like the lower-cost appeal of the QLC-based FlashBlade and FlashArray//E products but “many customers still see Pure as a pricey alternative.” Reflecting this, “many VARs continue to highlight that Pure is still a pricier solution for backup use-cases, and that success in data protection is mainly driven by existing Pure customers being upsold to an additional use-case.”
Varonis
VARs selling data security vendor Varonis’s offerings observed: “The rapid growth in unstructured data across different, distributed file stores (both on-premises and in the cloud) has made managing data access complex. This complexity has led to more customer demand, with multiple VARs noting that data security remains a high priority spending area for customers.”
Data loss prevention (DLP) measures are supposed to prevent data loss, yet they did nothing to stop the large-scale file exfiltration caused by malware that disrupted the widely used Fortra and MOVEit file transfer services. Code42 CEO Joe Payne claims this is because DLP has a blind spot and cannot see the attacks. New approaches are needed, he thinks.
Blocks and Files: Could existing DLP measures have foiled the Fortra and MOVEit file exfiltration attacks?
Joe Payne: Modern data protection solutions, like Code42 Incydr, do see and help stop exfiltration caused by malware when a user is compromised on the company’s endpoint or on their corporate OneDrive account, for example. Malware attack red flags go up when clients see large volumes of files moving to untrusted destinations.
Blocks and Files: How does DLP technology work? Does it rely on a list of don’t-export files and a separate list of acceptable file transfer destinations?
Joe Payne: Traditional DLP tends to focus on the content itself – relying on labeling and tagging data to determine what employees can and can’t do with it. There are several problems with the traditional DLP approach, though.
First, accurately labeling all data at scale is practically impossible because of the huge volume of digital products that employees are working with; and because most traditional DLP tools only oversee data that’s been labeled “confidential,” they can miss risky activities.
Others analyze content to find patterns, like recognition of credit card number strings, which insiders can easily obfuscate by adding an extra number. When faced with these systems, users who plan to exfiltrate a file can easily bypass their organization’s DLP system by simply not labeling it as confidential or by changing the recognizable pattern.
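As an illustration of the weakness Payne describes, here is a minimal sketch of a pattern-based content rule of the kind traditional DLP relies on, and how one extra digit defeats it. The regex is an invented simplification for illustration, not any vendor's actual detection logic.

```python
import re

# Simplified stand-in for a DLP content rule that flags 16-digit card
# numbers; real products use richer rules, but the weakness is the same.
CARD_PATTERN = re.compile(r"\b(?:\d{4}[- ]?){3}\d{4}\b")

legit = "Card on file: 4111-1111-1111-1111"
obfuscated = "Card on file: 4111-1111-1111-11111"   # one extra digit appended

print(bool(CARD_PATTERN.search(legit)))        # True  - the rule fires
print(bool(CARD_PATTERN.search(obfuscated)))   # False - the extra digit defeats it
```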
Modern data protection solutions created for structured and unstructured data, on the other hand, focus on user behaviors and data destinations, rather than content. They aim to secure sensitive data across endpoints and in the cloud, quickly detecting when data goes to untrusted locations.
Blocks and Files: Could you see DLP suppliers offer guarantees like the anti-ransomware guarantees of some backup vendors?
Joe Payne: Because they take a very basic approach to content inspection, traditional DLP misses a lot of risky activities. So a guarantee like this would put a lot of DLP suppliers with an outdated approach out of business.
Joe Payne: This is a great analogy. Companies need solutions that allow traffic to flow – allowing employees to collaborate. Solutions that constantly throw up barricades or only monitor some traffic are not only ineffective when it comes to security, but also hamper efficiency and hinder the innovative collaboration that today’s businesses rely on for a competitive edge.
At Code42, we believe in securing the collaboration culture – meaning that we automate response to everyday mistakes via microlearning modules for accidental and non-malicious risk, block the unacceptable risks, and thus free up security teams’ time to investigate truly problematic activity.
Comment
The statistic that 76 percent of organizations have suffered a data breach despite having a DLP solution in place is noteworthy. A chart from a Code42-commissioned Ponemon report shows that in 2021, only 19 percent of data breaches were caused by a third-party mishap.
That description, third-party mishap, is exactly what happened in the Fortra and MOVEit data breaches. Even Code42’s expanded data loss prevention focus on insider behaviors would not have prevented them because the Fortra and MOVEit code was effectively treated as if it were a privileged outsider by each customer organization.
It seems to us that relying on third parties like Fortra and MOVEit to prevent data breaches through their software and services is not enough. We need some way for vendors to be able to inspect such privileged outsider-initiated file movements and prevent them from happening if they are unwarranted.
Comment: Research house Gartner has published its 2023 storage hype cycle, highlighting the evolving landscape of various technologies in the industry, and some have dropped completely out of sight.
The hype cycle serves as a five-stage technology progression chart, starting with an Innovation Trigger phase, followed by a Peak of Inflated Expectations, a Trough of Disillusionment, a Slope of Enlightenment, and finally reaching a Plateau of Productivity.
Gartner Research VP Julia Palmer, who shared the update on LinkedIn, said: “It’s like having a roadmap to navigate through the hype and identify the most promising innovations.” You need to log into a Gartner account to see it in detail. Here we present a screen grab of the highlights:
Technologies on the chart are represented by dot colors: dark blue indicates 5 to 10 years to reach the plateau, light blue represents two to five years, and a white dot indicates less than two years to plateau.
Upon initial analysis, some placements seem unusual. For instance, NVMe-oF should likely be positioned further to the right, and the positions of object storage and Distributed File Systems could potentially be exchanged.
Flash Memory Summit organizer Jay Kramer pointed out some notable omissions from the chart, including Storage Class Memory, Multi-Cloud, and Converged Infrastructure. Here’s last year’s storage hype cycle chart:
Storage Class Memory was rising up the Slope of Enlightenment in 2022 but faced setbacks after Intel discontinued Optane. However, technologies like SCM-class SSDs, MRAM, and ReRAM continue to be developed. Notable trends from last year’s storage hype cycle include the disappearance of Copy Data Management, Enterprise Information Archiving, Persistent Memory DIMMs, and Management Software-Defined Storage.
Hybrid Cloud File Data Services have come from nowhere and are now descending the Trough of Disillusionment. There is no entry for CXL, which is expected to play a significant role in storage products. Additionally, we believe that Data Orchestration and SaaS Application Backup deserve a place on the chart.
While the storage hype cycle can be an intriguing insight into the industry’s evolving landscape, it should perhaps be regarded as a fun and less formal representation of the market’s evolution.
In its latest storage market review for the first quarter of 2023, Gartner reports that both NetApp and Pure Storage experienced a decline in market share.
The total all-flash market reached $2.523 billion, representing 50.4 percent of the overall storage market in monetary terms, with a year-on-year growth of 4 percent. However, it was down 22 percent compared to the previous quarter. Flash storage accounted for 31 percent of the total capacity shipped, marking a significant increase from the prior quarter’s 18.2 percent and a notable rise from the 16.8 percent recorded a year ago.
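As a quick sanity check on the quoted figures (straight arithmetic, using nothing from Gartner beyond the numbers above), the all-flash total and its share imply the overall market size cited later in this piece.

```python
# Cross-checking the quoted figures: $2.523B of all-flash revenue at a
# 50.4 percent share implies the overall market total cited below.
all_flash_revenue = 2.523e9   # $
all_flash_share = 0.504       # fraction of overall external storage market

total_market = all_flash_revenue / all_flash_share
print(f"Implied total market: ${total_market/1e9:.3f}B")   # ≈ $5.006B
```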
The supplier revenue shares are shown in a pie chart compiled by Gartner:
According to Wells Fargo analyst Aaron Rakers, Dell’s flash revenue rose by 17 percent year-on-year and 12 percent quarter-on-quarter, while HPE’s all-flash revenue also saw a 17 percent year-on-year bounce. However, Pure Storage’s revenue suffered a decline of 10 percent year-on-year and a substantial 24 percent plunge quarter-on-quarter, leading to a decrease in its market share from 17.7 percent a year ago to the current 15.4 percent in the all-flash array (AFA) market.
Rakers further mentioned that Pure Storage’s shipped all-flash capacity of 624EB grew by 5 percent year-on-year, though its capacity share remained unchanged from a year earlier at 4.7 percent.
While there is no detailed insight into the share changes of other suppliers, prior data from B&F articles indicates that NetApp’s market share dropped from 22 percent a year ago to 16 percent in Q1 2023. Pure Storage’s share also experienced a decline, although not as significant, meaning it narrowed the gap with NetApp. IBM, on the other hand, has been catching up with Huawei and is gradually distancing itself from HPE.
Among the Y/Y revenue share gainers were Dell EMC, IBM, Huawei, HPE, and the “Others” category. These changes in revenue share have been charted to illustrate their magnitude.
Overall, both NetApp and Pure Storage lost share to other players in the market.
The rest of the storage market
As for the total storage market, Gartner reports that it was worth a little more than $5.005 billion, showing a modest one percent year-on-year growth. Primary storage increased by one percent year-on-year, while secondary storage and backup and recovery rose by 4 percent and 9 percent, respectively.
Hybrid flash/disk and disk revenues declined 2 percent year-on-year. The total external storage capacity shipped experienced a 24 percent year-on-year growth, with secondary storage capacity witnessing a massive 70 percent increase, backup storage rising by 18 percent, and primary storage capacity seeing a modest 3 percent rise.
Managed infrastructure systems provider 11:11 Systems has joined the AWS Partner Network (APN) and will offer customers 11:11 Cloud Backup for Veeam Cloud Connect, 11:11 Cloud Backup for Microsoft 365, and 11:11 Cloud Object Storage.
Vitali Edrenkine.
…
Data protector Arcserve has appointed Vitali Edrenkine as its chief marketing officer (CMO). Edrenkine was most recently SVP Growth Marketing at Vendr, and prior to that, he worked as SVP of Demand Generation and Digital Platforms at DataRobot. Prior to that he was at Rackspace.
…
Catalogic’s CloudCasa for Velero is now available in the Microsoft Azure marketplace. CloudCasa is a cloud-native data protection service that also fully supports Velero and detects and alerts on vulnerabilities in Kubernetes clusters.
…
Dell is recommending customers use Cirrus Data block migration with its Dell APEX-as-a-Service to move block data to the APEX block service, which is based on PowerFlex. Cirrus claims Dell APEX customers with Cirrus Data can perform block data migrations 4 to 6x faster than traditional approaches, meaning public cloud migration tools (see Azure and AWS) and specific vendor storage tools. The transfer uses thin reduction; data is compressed and all zeros are eliminated. Cirrus Data claimed one customer moving 9PB of data from AWS’s EBS to Dell PowerFlex realized savings of $3.2 million annually, 60 percent less than what they were paying for raw native storage.
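For what it's worth, here is a back-of-envelope reading of those numbers. It rests on our assumption that the $3.2 million saving equals 60 percent of the prior annual bill, which is not a figure Cirrus Data or Dell has broken out.

```python
# Back-of-envelope check, assuming the $3.2M annual saving equals 60
# percent of the prior raw-storage spend. Derived figures are ours,
# not Cirrus Data's or Dell's.
annual_saving = 3.2e6      # $ per year, as quoted
saving_fraction = 0.60     # quoted "60% less"

prior_spend = annual_saving / saving_fraction
new_spend = prior_spend - annual_saving
print(f"Prior ≈ ${prior_spend/1e6:.2f}M/yr, new ≈ ${new_spend/1e6:.2f}M/yr")
# Prior ≈ $5.33M/yr, new ≈ $2.13M/yr
```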
…
DataCore’s Swarm object storage has been certified as a backup target for v12 Veeam Backup & Replication and v7 Veeam Backup for Microsoft 365.
…
Dell says it has enhanced PowerProtect Data Manager and PowerProtect Data Manager Appliance, PowerProtect Cyber Recovery and PowerProtect DD Operating System. The vendor lists the features below:
Customers can directly back up PowerStore data to PowerProtect DD. PowerProtect Data Manager provides volume and volume group snapshot backups directly to PowerProtect appliances from PowerStore arrays using the Data Manager UI.
PowerProtect Data Manager adds Recovery Orchestration for VMware VMs.
Customers can vault data from the Data Manager Appliance to a PowerProtect DD-based cyber vault on-premises or in cloud, facilitating orchestrated recovery checks and the restoration of backup and appliance configuration data.
Dell has added retention lock compliance on the Data Manager Appliance.
Customers can replicate data to/from the cloud using PowerProtect DD Virtual Edition on-premises or APEX Protection Storage for Public Cloud for AWS, Azure, Google Cloud and Alibaba Cloud.
PowerProtect DDOS includes Smart Scale workload management with automatic identification and inclusion of affinity groups when performing migrations between appliances.
PowerProtect DDOS updates have support for KVM VirtIO disks with PowerProtect DD Virtual Edition and enhanced cloud security of APEX Protection Storage for Public Cloud with support for retention lock compliance on AWS.
Storage Scale eBook.
…
IBM has claimed a performance breakthrough with up to 125GBps per node throughput (from 91GBps) via the Storage Scale System 3500 that scales to 1000s of parallel access nodes. It has improved caching performance and added automated page pools to enhance the system’s ability to provide more balanced and improved performance for AI applications. Read a Storage Scale eBook to find out more.
…
Marvell QLE2800 HBA.
Marvell says it’s the provider of the quad-port FC target host bus adapters (HBAs) in the HPE Alletra Storage MP hardware. The QLogic QLE2800 HBA has a port-isolated architecture, is capable of delivering up to 64GFC bandwidth, and supports concurrent FC-SCSI and FC-NVMe communications. It includes support for virtual machine identification (VM-ID), automatic congestion management using Fabric Performance Impact Notifications (FPIN), and enhanced security with encrypted data in flight (EDiF) that HPE can potentially use in the future.
…
Cloud file services supplier Nasuni is integrating Microsoft’s Sentinel security information and event management (SIEM) platform with its offering. It allows customers to automatically spot threat activity and immediately initiate the appropriate responses. Microsoft Log Analytics Workspace gathers and shares Nasuni event and audit logs at any Nasuni distributed Edge device for constant monitoring with the Sentinel platform. Nasuni also has new targeted restore capabilities for its Ransomware Protection service.
…
HPC and AI storage supplier Panasas has listed some academic customers for its ActiveStor products: the Minnesota Supercomputing Institute (MSI), the UC San Diego Center for Microbiome Innovation (CMI), LES MINES ParisTech, TU Bergakademie Freiberg state technical university in Germany, and Rutherford Appleton Laboratory in the UK. According to Hyperion Research, total HPC spending in 2022 reached $37 billion and is projected to exceed $52 billion in 2026.
…
SSD controller developer Phison says it has found a way for SSDs using its controllers to hold AI application data that would otherwise sit in DRAM, compensating for DRAM being expensive and consequently under-provisioned in AI-processing servers. Based on the application timeline, Phison’s aiDAPTIV+ structurally divides large-scale AI models and collaboratively runs the model parameters with SSD offload support. This approach maximizes the executable AI models within the limited GPU and DRAM resources, which Phison claims reduces the hardware infrastructure cost required to provide AI services.
…
Swiss-based Proton has launched Proton Drive for Windows, a secure, end-to-end encrypted cloud storage app that offers users a way to sync and store their files to Proton Drive cloud storage right from their Windows device. Like Dropbox it ensures that files and folders are always up-to-date across all connected devices. Any changes made to files on a Windows PC will be automatically reflected on a user’s other devices that have Proton Drive installed.
…
Scale-out filer supplier Qumulo announced GA of Azure Native Qumulo Scalable File Service (ANQ) in 11 new regions: France Central, Germany West Central, North Europe, Norway East, Sweden Central, Switzerland North, UK South, UK West, West Europe, Canada Central, and Canada East.
…
RightData announced DataMarket, a user-friendly way to act on all data within an organization, including understanding definitions, viewing metadata, controlling access, and getting direct access to APIs, connectors, and natural language-based data analysis. Data consumers can use natural language search to find out about data products and see quality ratings and reviews. Users can then request access and use the data through its provided API, JDBC connectors, downloads, or the rendering of data visualizations directly within the DataMarket.
…
Serve The Home reports Sabrent has a Rocket X5 PCIe gen 5 M.2 2280 form factor SSD in development. It has 1TB and 2TB capacity points and is packaged with a fan-driven heatsink. The 2TB model so far delivers >14GBps sequential reads and 12GBps sequential writes. These numbers would put the X5 on a par with Kioxia’s read-intensive CM7-R.
…
Samsung has an automotive UFS 3.1 NAND product, with 128GB, 256GB, and 512GB variants, for in-vehicle infotainment (IVI) systems. It satisfies the -40°C to 105°C temperature requirements of the AEC-Q100 Grade 2 semiconductor quality standard for vehicles. The 256GB model provides a sequential write speed of 700MBps and a sequential read speed of 2,000MBps. Samsung claims the product offers the industry’s lowest energy consumption.
…
SMART Modular’s DC4800 data center SSD, in E1.S and U.2 formats, has been accepted as an OCP Inspired product and will be featured on the OCP Marketplace. It is a PCIe gen 4-connected drive available in capacities up to 7.68TB.
…
The SNIA’s Storage Management Initiative (SMI) announced that SNIA Swordfish has achieved a milestone with the publication of Swordfish v1.2.4a by ISO/IEC as ISO/IEC 5965:2023. The new ISO/IEC-published version replaces ISO/IEC 5965:2021, with the addition of functionality to manage NVMe and NVMe-oF.
…
SuperWomen in Flash today shared figures from a recent study by the National Center for Women & Information Technology (NCWIT), a non-profit community founded in 2004 which is funded by the National Science Foundation:
In the last five years, approximately 10% of U.S. IT patents included women as inventors.
In the US, IT patenting overall increased almost 17-fold between 1980-84 and 2016-2020.
Patents with women inventors increased 56-fold from 1980-84 to 2017-2020, even as the percentage of women employed in IT either remained flat or decreased slightly.
In 2022, 27% of professional computing occupations in the U.S. workforce were held by women.
In 2022, 23% of total tech C-suite positions in Fortune 500 companies were held by women.
…
UC3400 (top) and SA3400D (bottom). Looks pretty much like the same hardware with different labels.
Synology has launched two enterprise storage units, the UC3400 and SA3400D, both 12-bay dual-controller devices focused on business continuity. The SA3400D is focused on file and application hosting and provides minute-level failover in case of component failure. It provides more than 3,500/2,900 MBps sequential read/write throughput and 500 TB maximum storage capacity. The UC3400 is an active-active SAN delivering in excess of 180,000 4K random write IOPS and up to 576 TB of capacity for iSCSI and Fibre Channel workloads such as VM storage.
…
Scale-out parallel filesystem vendor WekaIO announced what it terms as two guarantees: the WEKA Half Price Guarantee for cloud deployments; and the WEKA 2X Performance Guarantee for on-premises deployments. The Half Price Guarantee promises the WEKA Data Platform can help cloud customers achieve up to 50 percent cost savings over their current equivalent cloud storage solution with zero performance impact. Its 2x Performance Guarantee says on-prem customers will achieve a 2x performance increase over their all-flash arrays for the same cost. Eligible customers using a hybrid cloud deployment configuration can participate in both schemes. Learn more here.
…
Data protector Veeam has published an AWS Data Backup for Dummies eBook. Get it here.
…
Veritas and Kyndryl have partnered to launch two new services: Data Protection Risk Assessment with Veritas; and Incident Recovery with Veritas. These services will help enterprises protect and recover data across on-premises, hybrid and multi-cloud environments. Data Protection Risk Assessment with Veritas will rely on Kyndryl’s cyber resilience framework and Veritas’ data management solutions to offer unified insights across on-premises, hybrid and cloud environments. Incident Recovery with Veritas will use AI-based autonomous data management capabilities to give customers a fully managed service encompassing backup, disaster recovery and cyber recovery.
…
Phil Venables, CISO at Google Cloud, has joined the Board of Directors at identity security company Veza.
…
Tom’s Hardware reports China’s YMTC is manufacturing 128-layer 3D NAND with Xtacking 3.0 technology, which is based on placing the 3D NAND flash wafer underneath a separately fabbed CMOS peripheral circuit logic chip, bonded to it with millions of small connectors. It had previously built its 128-layer NAND with Xtacking 2.0 tech. Xtacking 3.0 was developed for YMTC’s 232-layer chip, but US tech export restrictions have halted that product’s development, so YMTC is back to last-generation tech rejuvenated with new peripheral logic. Research house Yole’s analysts have determined that YMTC’s 232L NAND has two separate components or decks: one with 128 layers and the other 125.
Case study. Oregon Health and Science University (OHSU) researchers in Portland are using cryogenically cooled electron microscopes to build 3D images of biological molecules as they investigate how proteins affect the brain, the COVID virus, serotonin neural processes, ageing and the myriad other aspects of human and other organisms’ biology.
Data from the instruments are stored in a Quobyte file system – chosen when OHSU decided it needed its own local storage rather than relying on an external system provided by the Pacific Northwest National Laboratory (PNNL). The storage system has to support high-performance computing applications used by researchers as well as being a vault for the instrument-generated data.
Craig Yoshioka and cryo-EM equipment.
The OHSU Cryo-EM center has four telephone kiosk-size Titan Krios 300-kiloelectron volt Transmission Electron Microscopes (TEM) to visualize proteins and other biological molecules in 3D and at near-atomic scale. They examine samples cooled down to liquid nitrogen temperature and use high-energy electrons to view their structures down to a resolution of 2.8 Å.
Each Krios has a direct-electron detection camera fitted, and is mounted on a vibration absorbing pad inside its booth to keep its detectors as still as possible. Without this, they would be affected by vibrations as small as that caused by a person’s voice, a breeze, or currents in the adjacent Willamette river.
Cryo-EM image of a virus.
OHSU also has other microscopes, such as a Glacios Cryo-TEM, to pre-screen samples before using a Krios to get near-atomic resolution of selected images. Using a multi-million dollar Krios for pre-screening is overkill.
Each Krios generates an average of 3TB of data a day, meaning 8 to 16TB/day for OHSU’s four-scope cryo-EM facility. We’re looking at around 120TB/week, up to 6.2PB/year, and that data has to be kept for use over the following 12 months or so. A proportion of it may be noise rather than signal, but the researchers may be able to pull more signal out of the noise with better algorithms in the future. So the data is kept.
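Those volumes hang together on simple arithmetic; a minimal check, using only the figures quoted above:

```python
# Sanity check on the data volumes quoted above, using only those figures.
scopes = 4                      # Titan Krios instruments
avg_tb_per_scope_day = 3        # average TB/day per Krios

daily_avg = scopes * avg_tb_per_scope_day    # 12 TB/day, inside the quoted 8-16TB range
weekly_tb = 120                              # TB/week, as quoted
yearly_pb = weekly_tb * 52 / 1000            # ≈ 6.24 PB/year, matching the ~6.2PB figure

print(daily_avg, yearly_pb)   # 12 6.24
```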
There are 900 researchers using the cryo-EMs and accessing the data with up to 200 active projects at any given time.
Craig Yoshioka, a PhD and Research Associate Professor at OHSU, directs its cryo-EM center. He said the original storage system was based on ZFS running on Linux servers. This could accept the data from the instruments fine, but could be slow at delivering it to the HPC apps.
The possible options he considered for fixing the slowness included simply scaling it up, or moving to a BeeGFS alternative, a Panasas system, a Quobyte cluster setup, a VAST Data array, or a WekaIO file system. He specifically wanted a distributed file system accessed via a centralized web interface with a single namespace and intuitive management utilities.
Original NAS system (top) and new Quobyte cluster system (bottom) with Krios telephone kiosk-sized booths on the left of each image.
He inspected Quobyte in November last year, liked what he saw, and decided to use its software running on a mix of hard disk and solid state drives. His team had migrated and loaded 1.5PB of data onto the Quobyte system by January this year and the system is working just fine.
DataStax has tweaked its Astra DB Database-as-a-Service (DBaaS) by incorporating vector search capabilities in response to the increasing demand for generative AI-driven data storage solutions.
The foundation of vector search lies in vector embeddings, which are representations of aspects of various types of data, including text, images, audio, and video, presented as strings of floating-point numbers.
DataStax uses an API to feed text data to a neural network, which transforms the input into a fixed-length vector. This technology enables search input that closely matches existing database entries (vector embeddings) to produce output vectors in close geometric proximity, while inputs that are dissimilar are positioned further apart.
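For readers unfamiliar with the mechanics, here is a minimal, illustrative sketch of the nearest-neighbour idea behind vector search. The three-dimensional "embeddings" are invented for this example; real embeddings have hundreds or thousands of dimensions and come from a neural network, and Astra DB's own API works differently.

```python
import numpy as np

# Toy illustration of vector search: embeddings close in geometric space
# are treated as semantically similar. The vectors below are invented.
def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

database = {
    "invoice for cloud storage": np.array([0.9, 0.1, 0.2]),
    "cat photos from holiday":   np.array([0.1, 0.8, 0.3]),
}
query = np.array([0.85, 0.15, 0.25])   # pretend embedding of "storage bill"

best_match = max(database, key=lambda key: cosine_similarity(query, database[key]))
print(best_match)   # "invoice for cloud storage" - the geometrically closest entry
```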
Ed Anuff, DataStax chief product officer, emphasized the significance of vector databases in enabling companies to transform the potential of generative AI into sustainable business initiatives. “Databases that support vectors – the ‘language’ of large language models – are crucial to making this happen.”
He said massive-scale databases are needed because “an enterprise will need trillions of vectors for generative AI so vector databases must deliver limitless horizontal scale. Astra DB is the only vector database on the market today that can support massive-scale AI projects, with enterprise-grade security, and on any cloud platform. And, it’s built on the open source technology that’s already been proven by AI leaders like Netflix and Uber.”
Astra DB is available on major cloud platforms, including AWS, Azure, and GCP, and has been integrated into LangChain, an open source framework for developing large language models essential for generative AI.
Matt Aslett, VP and Research Director, Ventana Research, said in DataStax’s announcement: “The ability to trust the output of generative AI models will be critical to adoption by enterprises. The addition of vector embeddings and vector search to existing data platforms enables organizations to augment generic models with enterprise information and data, reducing concerns about accuracy and trust.”
Astra DB adheres to the PCI Security Council’s standards for payment protection and safeguards Protected Health Information (PHI) and Personally Identifiable Information (PII), we’re told.
Other database and lakehouse providers – such as SingleStore, Databricks, Dremio, Pinecone, Zilliz, and Snowflake – are also actively supporting vector embeddings, demonstrating the growing demand for these features in the generative AI data storage landscape.
Try DataStax’s vector search here (registration required) and download a white paper on Astra DB vector search here (more registration required).
Additionally, customers using DataStax Enterprise, the company’s on-premises, self-managed offering, will have access to vector search within the coming month.
Solidigm has unveiled its latest product, the D5-P5336, featuring storage capacity of 61.44TB. This drive is currently the largest capacity PCIe drive available on the market and utilizes QLC (4bits/cell) flash technology while delivering performance comparable to TLC (3bits/cell) drives.
Building on its predecessor, the D5-P5430, the D5-P5336 is designed as a value endurance read-intensive product. The D5-P5430 remains a mainstream read-intensive drive, topping out at 30.72TB. Both drives are available in U.2, E1.L, and E3.S formats, built on 192-layer 3D NAND technology. The D5-P5336 incorporates a 16KB indirection unit, an upgrade from the previous 4KB unit, resulting in faster read operations through logical block address to physical block address mapping.
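One way to see the effect of the larger indirection unit is on the size of the logical-to-physical map a 61.44TB drive has to maintain. This is a rough sketch; the four-bytes-per-entry figure is our assumption for illustration, not a Solidigm specification.

```python
# How the indirection unit (IU) size affects the LBA-to-PBA map on a
# 61.44TB drive. The 4-byte entry size is an illustrative assumption,
# not a Solidigm figure.
CAPACITY_BYTES = 61_440_000_000_000   # 61.44 TB, decimal
ENTRY_BYTES = 4                       # assumed size of one map entry

for iu_bytes in (4096, 16384):
    entries = CAPACITY_BYTES // iu_bytes
    map_gb = entries * ENTRY_BYTES / 1e9
    print(f"{iu_bytes // 1024}KB IU: {entries/1e9:.2f} billion entries, ≈{map_gb:.0f} GB of map")
# 4KB IU: 15.00 billion entries, ≈60 GB of map
# 16KB IU: 3.75 billion entries, ≈15 GB of map
```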
Greg Matson
Greg Matson, VP of Strategic Planning and Marketing at Solidigm said: “Businesses need storage in more places that is inexpensive, able to store massive data sets efficiently and access the data at speed. The D5-P5336 delivers on all three – value, density and performance. With QLC, the economics are compelling – imagine storing 6X more data than HDDs and 2X more data than TLC SSDs, all in the same space at TLC speed.”
The drive in U.2 format starts with a 7.68TB capacity, doubling in three steps to 61.44TB. The physically smaller E3.S format has a 7.68TB to 30.72TB range, while the physically larger E1.L ruler format starts at 15.36TB and maxes out at 61.44TB like the U.2 product. This PCIe gen 4-connected drive boasts maximum speeds of 1,005,000/43,000 random read/write IOPS, 7GBps sequential read bandwidth, and 3.3GBps sequential write bandwidth, making it well suited to read-intensive workloads, particularly when contrasted with the D5-P5430, which delivers 971,000/120,000 random read/write IOPS, 7GBps sequential read and 3GBps sequential write bandwidth.
Exploded view of Solidigm D5-P5336 E1.S format drive
Compared to some datacenter TLC drives, the D5-P5336 excels in read-intensive performance, even though TLC flash is generally faster than QLC NAND. For example, the SK hynix PE811, a PCIe gen 4 drive utilizing 128-layer TLC NAND, outputs 700,000/100,000 random read/write IOPS and 3.4/3.0 GBps sequential read/write bandwidth.
Solidigm conducted a comparative analysis, normalizing to Micron’s 7450 Pro:
The five-year endurance is up to 0.58 drive writes per day (DWPD), varying with capacity: 7.68TB – 0.42; 15.36TB – 0.51; 30.72TB – 0.56; and 61.44TB – 0.58 DWPD.
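Read the usual way, those DWPD figures translate into total bytes written over the five-year warranty period; this is straightforward arithmetic on the numbers above, nothing vendor-supplied beyond them.

```python
# Converting the quoted DWPD figures into total data written over the
# five-year endurance period. Only the numbers quoted above are used.
DAYS = 5 * 365
dwpd_by_capacity_tb = {7.68: 0.42, 15.36: 0.51, 30.72: 0.56, 61.44: 0.58}

for capacity_tb, dwpd in dwpd_by_capacity_tb.items():
    written_pb = capacity_tb * dwpd * DAYS / 1000
    print(f"{capacity_tb}TB at {dwpd} DWPD ≈ {written_pb:.1f} PB written over five years")
# e.g. 61.44TB at 0.58 DWPD ≈ 65.0 PB written over five years
```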
Given the speed, endurance, and pricing information, Solidigm created comparison grids for 100PB object storage disk and TLC flash arrays against its D5-P5336. The results indicated a significant cost advantage for the D5-P5336, offering a 47 percent lower 5-year total cost of ownership (TCO) when compared to an all-disk setup with 106 racks of 20TB HDDs. Moreover, compared to a 100PB Micron 6500 ION rack, the D5-P5336 showed a 17 percent lower cost.
The results are based on Solidigm’s own testing and we cannot independently verify the figures.
The D5-P5336 is shipping now in the E1.L form factor, supporting up to 30.72TB, with plans to extend the product line to U.2 and E1.L formats up to 61.44TB later this year. Furthermore, Solidigm intends to introduce E3.S form factor drives, featuring capacities of up to 30.72TB, in the first half of 2024.
Comment
Given the D5-P5336’s reported cost-effectiveness, high performance, and adoption by OEMs such as VAST Data, we suspect that other SSD manufacturers will likely follow suit and introduce their own 60TB-class drives in the next six months. This move will enable OEMs to compete at a drive capacity level and narrow the gap between current SSD offerings, which max out at 30TB, and Pure Storage’s 75TB Direct Flash Modules.
It will be interesting to see how costs compare between Solidigm’s D5-P5336 object storage array and equivalent HDD arrays at lower capacities, such as 75PB, 50PB, and 25PB. The potential shift to 30TB+ HAMR drives may also play a crucial role in influencing this landscape. The HDD suppliers are expected to closely scrutinize these factors and share relevant information in response.
Cisco and HPE, along with their channel partners, have entered into agreements to resell a suite of data protection managed services from Cohesity, using the AWS and Azure clouds.
Cohesity Cloud Services (CCS) comprises four fully managed, cloud-native as-a-service solutions: backup, cyber vaulting, threat defense, and disaster recovery. These are part of Cohesity’s Data Cloud software portfolio, which includes DataProtect backup, FortKnox cyber vaulting, DataHawk threat detection, SmartFiles, and SiteContinuity disaster recovery, all provided in a managed service format. It’s worth noting that SmartFiles is not a data protection product and isn’t included in the suite.
Chris Kent
Cohesity’s Vice President for Product and Solutions Marketing, Chris Kent, emphasized the company’s focus on partnerships, stating: “We are partner-focused, and these agreements add significant resources to our ability to reach even more customers worldwide.”
Both Cisco and HPE already have established agreements with Cohesity. The collaboration has resulted in over 460 joint customers for Cisco and Cohesity, and more than 600 joint customers for HPE and Cohesity. Cisco, HPE, and their respective resellers will now be able to offer Cohesity Cloud Services, which are hosted on the AWS and Azure platforms. The pair reckon that adopting CCS for data protection offers a simpler and more efficient alternative to on-premises solutions, reducing management requirements and freeing up valuable staff time.
Currently, one in five Cohesity customers utilize CCS, and approximately a quarter of all new business for Cohesity is attributed to CCS. Larger enterprises are showing significant interest, with nearly two-thirds of CCS’s annual recurring revenue over the past 12 months coming from companies with revenue exceeding $1 billion, we’re told.
Furthermore, the company has a sales partnership with OwnBackup, and there is potential for integrating OwnBackup’s SaaS application backup services for Microsoft Dynamics 365, Salesforce, and ServiceNow into the CCS backup-as-a-service product. However, Cohesity is not a launch partner for the Microsoft 365 Backup service via API integration. Data protection competitors such as Commvault, Rubrik, Veeam, and Veritas are API integration-level partners for Microsoft 365 Backup.
CCS is available in America, Canada, Europe, Singapore and southeast Asia, Japan, and the Middle East. It is being sold today by Cohesity, HPE and HPE resellers, and will be available from Cisco and its channel crew later this year.