
Amberflo on SaaS storage: Subscription billing is an aberration

Customers getting sticker shock from their monthly cloud bills is a cliche of the public cloud business. Startup Amberflo claims its metered usage billing service can stop that.

Puneet Gupta

Founder and CEO Puneet Gupta told a Santa Clara IT Press Tour audience that usage-based billing was the best way to consume and pay for cloud services, so long as the metering was open, accurate and carried out by a scalable process.

“Subscription billing,” he said, “is an aberration,” as it doesn’t reflect usage. But: “Meter readings should be shared with customers, 100 percent, and customers then can’t complain about predictability.”

Amberflo provides a SaaS-based metering and billing offering. A SaaS supplier needs to define meterable events in its cloud offering: compute instance usage, storage capacity usage, IO events and so forth.

The Amberflo service detects these and collects them in its own time-series database, which uses AWS S3 as its data store. At billing time these events are priced using the supplier’s pricing method and invoices are sent out.
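To make that flow concrete, here is a minimal, hypothetical sketch of a usage-metering pipeline of this general kind. The event fields, prices, customer IDs and the in-memory "store" are all illustrative assumptions, not Amberflo's actual API or schema.

```python
from dataclasses import dataclass
from collections import defaultdict

# Hypothetical illustration of metered billing; field names, prices and the
# in-memory "store" are assumptions, not Amberflo's actual API or schema.

@dataclass
class MeterEvent:
    customer_id: str
    meter: str        # e.g. "compute-hours", "storage-gb"
    value: float      # quantity measured for this event
    timestamp: float  # epoch seconds

PRICE_PER_UNIT = {"compute-hours": 0.12, "storage-gb": 0.023}

events: list[MeterEvent] = []   # stands in for the time-series event store

def record(event: MeterEvent) -> None:
    """Ingest a meter event (in production, an append to a time-series DB)."""
    events.append(event)

def invoice(customer_id: str) -> dict[str, float]:
    """At billing time, aggregate raw events per meter and apply the pricing model."""
    usage = defaultdict(float)
    for e in events:
        if e.customer_id == customer_id:
            usage[e.meter] += e.value
    return {meter: qty * PRICE_PER_UNIT[meter] for meter, qty in usage.items()}

record(MeterEvent("acme", "compute-hours", 3.5, 1_677_000_000))
record(MeterEvent("acme", "storage-gb", 250.0, 1_677_000_100))
print(invoice("acme"))   # {'compute-hours': 0.42, 'storage-gb': 5.75}
```

The point of the pattern is that raw, immutable meter readings are kept separately from pricing, so both supplier and customer can inspect the same usage record that the invoice was computed from.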

It sounds simple and basic, but the technology behind it has to be reliable enough not to lose data, since it functions as a system of record.

A SaaS supplier writing the code for its own metering and billing system has to be aware that scalability could be a real problem. This is a real-time application and can have a very high event record ingest rate. Gupta said: “AWS has its own billing service. Three years back it was running at 3 billion events per second.” Amberflo itself is processing tens of billions of meter events per day.

Gupta says that SaaS customers need to pay attention to their usage of cloud services and understand the billing metrics. He says: “Customers need to own their own metering. … Most customers have not been careful with their cloud usage.”

Being careful is key and checking bills without understanding metrics and usage is the wrong approach: “You will never get ahead of cloud cost management if you’re parsing the bill. It’s after the fact.”

Gupta says Amberflo will add more services: “Soon you will see us tackle cloud cost management the way it should be done.”

Bootnote

Puneet Gupta co-founded Amberflo in August 2020. He resigned as Product Development VP for Oracle’s cloud infrastructure the year before, having joined Oracle in 2015. Prior to that he spent three years at Amazon as a general manager for AWS.

His co-founder was Lior Mechlovich, Amberflo’s CTO and an ex-lead SW engineer at AWS, with 10 years at Informatica in his CV before joining AWS in 2018.

Amberflo has around 25 employees and raised $15 million in A-round funding in January this year, led by Norwest Venture Partners. This followed a January 2022 $5 million seed round led by Homebrew. Crunchbase records a $4 million pre-seed round in December 2020 as well, making a total of $24 million raised. With around 25 employees, Amberflo is a well-funded operation.

Storage news roundup – February 22

Microsoft’s Azure cloud has announced a preview of Azure HPC Cache Premium Read-Write, designed to provide high-bandwidth, low-latency access to files similar to local NVMe storage. It provides lower latency than the Standard HPC Cache for compute-intensive enterprise workloads. You can provision up to 84TB (80+TiB) of capacity in a single cache and point thousands of compute clients at it to get up to 20GBps of read throughput, reduced latency (150 µsec for reads, 1 msec for writes, a 24.5 percent increase in write operations) and IOPS scalability (170,000 random 4KiB writes, 450,000 random 4KiB reads). Premium HPC Cache provides Azure’s best file-based performance for time-sensitive workloads like media rendering, simulations for genomics and financial models, and chip design. It’s currently available as a public preview in select regions. If interested, you can create a new cache and choose the Premium (Preview) option.

Cloud backup and storage service provider Backblaze has published an IPO playbook in series form for other startups. It says: “The IPO process… is as cryptic as the most jealously guarded algorithms for anyone who hasn’t been through it, and for no good reason.” It’s now “open sourcing” how it went through its IPO. Check out the first installment here.

Storage company Backblaze report

Privately-owned block data migrator Cirrus Data claims it has shifted over an exabyte of block data and says 2022 was a record year for the company, with bookings nearly doubling for the full year and profitability reaching new levels. Its Cirrus Migrate Cloud integrates with all the major public clouds. Customers can purchase data migration capacity directly through the Amazon Web Services marketplace, and soon from the Microsoft Azure and Oracle Cloud marketplaces as well.

Containerized app data protector CloudCasa and container storage provider Ondat have announced a bundled offering which provides customers with a unified solution to run their stateful applications on Kubernetes without worrying about availability, performance, protection, or data management and recovery. DevOps and platform teams get a simple, easy-to-consume software-as-a-service solution that gives them the flexibility to store, manage, and back up applications across on-premises, hybrid, or cloud environments. The bundle is available to existing Portworx customers at a 50 percent discount off the current pricing of Ondat and CloudCasa in the first year.

Data lakehouse supplier Databricks has announced the VS Code Extension for Databricks. It enables developers to write code locally, using authoring capabilities of the IDE, while connecting to Databricks clusters to run code remotely. IDEs let developers employ best practices that become necessary with large codebases, such as source code control, modular code layouts, refactoring support, and integrated unit testing. Databricks will be rolling out support for other IDEs and additional tools.

Dataddo, a SaaS startup with an automated, no-code data integration platform, has joined the Snowflake Partner Network to help mutual customers streamline centralization and distribution of data across their organizations using fewer tools. Dataddo CEO Petr Nemeth claimed: “Customers can use any of our hundreds of connectors to reliably sync data from online services to Snowflake, and from Snowflake to online services. Plus, the transformation and anomaly detection functionalities built into the Dataddo platform ensure that any data it loads into Snowflake meets an essential standard of quality.”

IBM has dropped the Spectrum brand prefix from two of its product websites: the ones for Spectrum Scale and Spectrum Fusion. They are now called IBM Storage Scale and Storage Fusion. But the Storage Scale datasheet still refers to Spectrum Scale. The Storage Fusion solution brief document refers to IBM Storage Fusion, though.

IBM Elastic Storage System (ESS) 3500 offers support for the next-generation Nvidia ConnectX-7 InfiniBand dual-port NDR 200 Virtual Protocol Interconnect (VPI) network adapter and related network cables. ESS 3500 is also adding support for a 10TB self-encrypting drive (SED) option to enable key-managed SED encryption. SED functionality is supported by ESS 3500 solution version 6.1.5. To enable SED, all drives (NVMe and HDD) in a recovery group in the hybrid or capacity configurations must be SED-capable. IBM Spectrum Scale offers file system software encryption and SED support; both can be combined. More info here.

Lenovo says that while PC revenues declined substantially in its Q3 FY23 results, announced February 17, storage revenue more than tripled (+345 percent) and it is now #5 in the world. Lenovo says it experienced record sales in Server, Storage, Software and AI Edge, which led to 156 percent year-over-year growth in operating profit. CFO Wai Ming said in the earnings call: “The latest quarterly third-party statistics indicated that ISG (Infrastructure Solutions Group) market share by revenue in the global storage market nearly doubled year-on-year.” ISG president Kirk Skaugen said: “Within Storage, we had records for hyperconverged, traditional entry and midrange and cloud storage. So it was very broad-based.”

Lenovo storage revenues
Lenovo segment revenue history. Storage and servers are included in the ISG segment, with PCs in the Intelligent Devices Group

MSP-focused data protector N-able has launched N-able Managed Endpoint Detection and Response (Managed EDR), a threat monitoring, hunting, and response service designed for MSPs that have standardized on N-able Endpoint Detection and Response (EDR). MSPs can reinforce their IT security or Security Operations Center (SOC) team with enterprise security specialists through N-able Managed Endpoint Detection and Response.

Data protector NAKIVO ended calendar Q4 2022 with year-on-year revenue growth of 15 percent in EMEA and Asia. Of the total revenue in Q4 2022, 63 percent came from the EMEA region, 28 percent came from the Americas, and 9 percent came from the Asia-Pacific region. The US, Germany, and South Africa were the highest revenue contributors in the quarter. NAKIVO has over 25,000 paid customers in 177 countries. The customer base grew by 16 percent in Q4 2022 versus Q4 2021. The number of new customers grew by 16 percent in the EMEA region, by 18 percent in the Asia-Pacific region, and by 14 percent in the Americas.

Next Pathway’s SHIFT Cloud is available as a SaaS offering on the Azure Marketplace. Customers now have access to a robust and performant code translation engine that allows them to translate legacy SQL code and ETL pipelines to Microsoft Azure in a self-service capacity, we’re told.

Chris Hetner, Panzura
Chris Hetner

Cloud file services supplier Panzura has formed a Customer Security Advisory Council, chaired by Chris Hetner, a respected leader in cybersecurity. The council will provide education and awareness around data resiliency with a mission of advancing business, operational, and financial alignment to cybersecurity risk governance. Hetner served as the Senior Cybersecurity Advisor to the Chair of the United States Securities and Exchange Commission and as Head of Cybersecurity for the Office of Compliance Inspections and Examinations at the SEC. He will provide ongoing education to Panzura customers about cybersecurity and operational resiliency.

PNY Technologies announced the launch of its CS2230 M.2 NVMe portable SSD designed to replace SATA SSDs and provide faster speeds for gamers and business users. Available in 500GB and 1TB versions, the SSD can achieve read speeds of up to 3,300MB/s and write speeds of up to 2,500MB/s for the 500GB version and 2,600MB/s for the 1TB version. It runs across PCIe gen 3 x4 and includes Acronis data protection. USB-C to USB-C and USB-C to USB-A cables are included.

PNY storage

Release 7.0 of Chinese supplier Vinchin‘s Backup & Recovery is coming soon with features including NAS backup, physical server backup for Linux and Windows servers, VMware backup verification, MariaDB protection, and more advanced Hyper-V protection. Vinchin Backup & Recovery 7.0 also delivers a more intuitive and dynamic data visualization interface with attached statistics from daily operations for enhanced and straightforward data management.

Vinchin backup dashboard

Weebit Nano will be rolling out its first production-ready Resistive RAM (ReRAM) IP at Embedded World next month (Hall 4, booth 650a). Weebit ReRAM is silicon-proven, non-volatile memory (NVM) that provides a needed solution for a broad range of embedded applications, with ultra-low power consumption, excellent retention even at high temperatures, fast access time, high tolerance to radiation and electromagnetic interference (EMI), and numerous other advantages, we’re told.

Western Digital has announced the end of support for the Discovery for My Cloud Home utility which provides the ability to mount the My Cloud Home device as a local desktop drive. From June 23 it will no longer be supporting the WD Discovery software for the My Cloud Home Desktop App. The company said: “There will also no longer be any new releases or security updates for My Cloud Home on WD Discovery. We will discontinue all development, including critical security updates and technical support. As new versions of operating systems are released, these features might stop working and will not be fixed. WD Discovery will continue to function and be supported for other compatible WD storage products.” 

DataStax clones decentralized blockchains into centralized AstraDB database

Cassandra NoSQL database supplier DataStax has announced an Astra Block service to clone and store Ethereum blockchains in its Astra DB cloud database. 

A blockchain is a decentralized distributed ledger recording transactions in a shared, immutable way in a peer-to-peer network with no central authority. Cryptographically protected chain linkages allow ledgers to be updated and viewed. An entire Ethereum blockchain can be cloned and stored in AstraDB, and is then updated in real time as new blocks are mined. DataStax claims this streamlines and transforms the process of building and scaling Web3 applications. It plans to expand this Astra Block service to other blockchains in the future, based on user demand.

Ed Anuff.

Ed Anuff, chief product officer at DataStax, said: “These distributed ledgers open up a whole new world of innovation similar to what we saw with digital content 20 years ago or social media 15 years ago – that is, the possibilities are only limited by our imaginations. Crypto currencies, non-fungible tokens, and smart contracts have drawn a lot of attention, but there are many other areas that will benefit from blockchain innovation: healthcare, real estate, IoT, cybersecurity, music, identity management, and logistics, to name a few.”

DataStax says its new service allows advanced querying and real-time analytics to be run at sub-second speeds, enabling developers to build blockchain-based functionalities into their applications. For example, developers can build applications with the capability to analyze any transaction from the entire blockchain history, including crypto or NFTs, for instant, accurate insights. 

Blockchain computations are compute-intensive and it can take seconds to access blockchain data. Analyzing and tracking blockchain transactions is difficult, making many use cases untenable – particularly real-time ones. DataStax says that, according to Gartner’s 2022 Hype Cycle for Blockchain and Web3, “By 2024, 25 percent of enterprises will interact with their customers or partners using decentralized Web3 applications.” But developers have struggled to access this data, having to resort to hundreds of API connections, building their own indexers, and manually managing the data infrastructure.

Astra Block removes these problems by, ironically, providing a centralized copy of the decentralized blockchain – thus subverting blockchain’s design philosophy.

Centralized vs decentralized

Peter Greiff, data architect leader at DataStax, said in answer to this point: “Astra Block is not about centralizing blockchain data but addressing some of the dilemmas that developers of Web3 distributed applications, or dApps, have with using those distributed ledgers – accessing that data is hard, slow and expensive.”

Greiff reckons: “Astra Block provides a solid base for a hybrid architecture for your dApp, using Astra Block for very low latency reads to access data and then writing info and data back to the blockchain distributed ledger directly. Astra Block manages the cloning of the chain into an operational database for dApp reads to be performed. Then, you’re only writing back the absolute minimum necessary to the chain. Those transactions might be slow and expensive compared to the transactions taking place in Astra Block, but you are still using that distributed approach where you need it.”

So: “DataStax operates blockchain nodes for its customers, and whenever a new block is mined, Astra Block detects that event, processes it, and does all the enrichment. Your Astra account is kept up to date with that data via built-in CDC (change data capture) synchronization. Block is able to use CDC for Astra DB to propagate any further change events to your Astra Block database.”

He asserted that: “You get all the features you would expect from a cloud-managed database, like a multi-tenant system, globally distributed, push button cloud clusters, intelligent auto scaling, and Data API Gateways, and then you can combine that with the fully distributed and trusted approach that blockchain or distributed ledger deployments can offer.”

Customer Elie Hamouche, founder of Clearhash, says: “AstraDB with Astra Block removes much of the complexity that comes with collecting and storing unbounded blockchain datasets to power real-time analytics workloads.”

The free tier of AstraDB offers a 20GB partial blockchain dataset to get started. The paid tier gives developers the ability to clone the entire blockchain, which is then updated in real time as new blocks are mined.

Microsoft and Ankr

Separately, Ankr and Microsoft have partnered to provide a reliable, easy-to-use blockchain node hosting service. The enterprise node deployment service will offer global, low-latency blockchain connections for any Web3 project or developer. Ankr and Microsoft intend to make this service available soon through Microsoft’s Azure marketplace, providing a readily accessible gateway to blockchain infrastructure.

Customers will be able to launch enterprise-grade blockchain nodes with custom specifications for global location, memory, and bandwidth. When it’s launched, customers will be able to optimize data querying for high levels of speed and reliability on their choice of dozens of different blockchains with serverless functionality utilizing GeoIP, failovers, caching rules, and monitoring. They can track the performance of their nodes anytime, anywhere. Enterprise RPC clients can access usage data and advanced telemetry across 38+ blockchains.

 

On-prem file storage catching up to object, SAN

Just over half of Blocks & Files readers work for organizations with more than a petabyte of on-premises data and the amount of file storage is increasing considerably more than SAN and object storage.

These are the headline results detailed in a report looking at how organizations are managing and storing ever more data on-premises. We asked more than 600 organizations worldwide from a range of industries, via a reader survey, to indicate how much data they were managing, and what technologies they were using to do so.

One of the first questions we asked was about the amount of stored data:  

Data storage

Just under half stored less than a petabyte while 14 percent had more than 100PB – quite a range. Twenty percent of our sample worked for organizations with 10,000 employees or more. A third of them managed 100PB or more of data.

About 86 percent of the organizations represented in our survey said that their digital data storage was increasing, 10 percent said it was static, and just under 5 percent said it was decreasing. We were surprised anyone said that.

We looked at on-premises storage types and found that 48.9 percent were increasing external block (SAN) use. However, results were skewed: 60 percent of the larger organizations said they were increasing SAN usage.

Just over half of our respondents said on-prem object storage use was increasing. Again, 65 percent of the larger organizations were increasing object storage use.

The on-prem file storage picture was more clear cut; almost 70 percent were increasing that and just under 24 percent were decreasing it. Three-quarters of the larger organizations were increasing it. 

On-prem files are increasing faster than objects, which are growing faster than block storage. The big are getting bigger, so to speak, and preferring files over objects and blocks.

We compared server SAN (HCI or hyperconverged infrastructure) use with actual SAN usage. Just over 40 percent used SANs, and 25.5 percent used HCI, with a third using neither. It seems the HCI replacement of SANs is not happening.

Download a copy of the report here.

Komprise CEO talks up agnostic information lifecycle management

On-prem and public cloud storage vendors can add information lifecycle management (ILM) features to their products – but that only confirms the need for a global, supplier-agnostic ILM capability. So says Komprise CEO Kumar Goswami.

Goswami has written one blog about whether customers should use public clouds, and another about optimizing cloud usage. We took these as starting points for a set of questions about Komprise’s place in the ILM area.

Blocks & Files: Komprise analyzes your file and object data regardless of where it lives. It can transparently tier and migrate data based on your policies to maximize cost savings without affecting user access. Can Komprise’s offering provide a way to optimize cloud storage costs between the public clouds; arbitraging so to speak between AWS, Azure and GCP?

Kumar Goswami, Komprise
Kumar Goswami

Kumar Goswami: Komprise enables customers to test different scenarios by simply changing the policies for data movement and visualizing the cost impacts of these choices. By running such what-if analysis, customers can choose between different destinations and then make their decision. 

And yes, we support AWS, Azure and GCP as destinations. Komprise also supports cloud-to-cloud data migrations. Komprise does not arbitrage across different options. Customers make these choices and Komprise gives them the analytics to make informed decisions and data mobility to implement their choices.

Blocks & Files: How would you view AWS’s S3 Intelligent Tiering and the AWS Storage Gateway, with integration between the cloud and on-premises, offline AWS-compatible storage? What does Komprise offer that is better than this?

Kumar Goswami: AWS has a rich variety of file and object storage tiers to meet the various demands of data, with significant cost and performance differences across them. Most customers have a lot of cold data that is infrequently accessed and can leverage highly cost-efficient tiers like Glacier Instant Retrieval that are 40 percent to 60 percent cheaper. 

Yet customers still want to be able to see the data in Glacier IR transparently and access it without rehydrating it up to a more expensive tier, and this requires a data management solution like Komprise that preserves native access at every tier with transparency from the original location. 

Komprise manages the data lifecycle across AWS FSX, FSXN, EFS, S3 tiers and Glacier tiers and preserves file-object duality so you can transparently access the data as a file from the original source and as an object from the destination tiers – without rehydration – using our patented Transparent Move Technology (TMT). For customers with hundreds of terabytes to petabytes of data, they want the flexibility, native access and analytics-driven automation that Komprise delivers.

AWS S3 Intelligent Tiering will move data not used for 30 to 90 days to its own internal lower cost tier, but if you access the object, it will bring it back up to the expensive tier and keep it there for another 30 to 90 days before tiering it again. S3 Intelligent Tiering has its own internal tiers so you cannot use it to put data in the regular S3 tiers like Glacier Instant Retrieval for instance. 

You also cannot directly access the tiered data from the lower tier without rehydration. Nor can you set different policies for different data; it automatically manages data within itself and is like a black box. AWS S3 Intelligent Tiering is useful if you have small amounts of S3 data, no file data, no analytics-based data management, no need for policy-based automation, and are worried about irregular access patterns.

With Komprise you can migrate, tier and manage data at scale across the hybrid cloud to AWS. When tiering to S3, Komprise leverages AWS ILM to automatically tier data from higher-cost object tiers to lower-cost tiers including Glacier and Glacier IR using policies set within Komprise. This gives customers transparent access from the source while getting the cost benefits of ILM. 

We also support options like AWS Snowball for customers who need to move data offline but still want transparent access. Most customers want to continue using their enterprise NAS environments like NetApp and Dell EMC as they tier data to AWS, and Komprise supports this. Komprise also supports tiering to Amazon S3 Storage Gateway which replaces a customer’s existing NAS with a gateway.
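For context, an AWS lifecycle (ILM) rule of the kind Goswami references can be expressed with the standard S3 lifecycle API. The boto3 sketch below is illustrative only: the bucket name, prefix and day counts are placeholders, and it shows the underlying cloud mechanism rather than anything Komprise-specific.

```python
import boto3

# Illustrative S3 lifecycle (ILM) rule; bucket name, prefix and day counts
# are placeholder assumptions, not values taken from the article.
s3 = boto3.client("s3")

s3.put_bucket_lifecycle_configuration(
    Bucket="example-tiering-bucket",
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "tier-cold-data",
                "Filter": {"Prefix": "cold/"},
                "Status": "Enabled",
                "Transitions": [
                    # Move objects to Glacier Instant Retrieval after 30 days...
                    {"Days": 30, "StorageClass": "GLACIER_IR"},
                    # ...and to Glacier Deep Archive after a year.
                    {"Days": 365, "StorageClass": "DEEP_ARCHIVE"},
                ],
            }
        ]
    },
)
```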

Blocks & Files: Azure’s File Sync Tiering stores only frequently accessed (hot) files on your local server. Infrequently accessed (cool) files are split into namespace (file and folder structure) and file content. The namespace is stored locally and the file content stored in an Azure file share in the cloud. How does this compare to Komprise’s technology?

Kumar Goswami: Hybrid cloud storage gateway solutions like Azure File Sync Tiering are useful if you want to replace your existing NAS with a hybrid cloud storage appliance. Komprise is complementary to these solutions and transparently tiers data from any NAS including NetApp, Dell, and Windows Servers to Azure. This means you can still see and use the tiered data as if they were local files while also being able to access the data as native objects in the cloud without requiring a move to a new storage system.

Blocks & Files: Google Cloud has lifecycle management rules in its storage buckets. What does Komprise offer that is better than this?

Kumar Goswami: Komprise leverages cloud lifecycle management policies and provides a systematic way to use these ILM policies on both file and object data with transparent access to the data from the original location as well as direct access to the data from the new location. So, Komprise adds transparency, file to object tiering and native access to any cloud ILM solution including that provided by Google.
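Again for context, a bucket-level lifecycle rule in Google Cloud Storage can be set in a few lines with the google-cloud-storage client. This is a sketch of the cloud's own ILM mechanism that Komprise says it layers on top of, not Komprise itself; the bucket name, target storage class and age threshold are placeholder assumptions.

```python
from google.cloud import storage

# Illustrative GCS lifecycle rule; the bucket name, storage class and age
# are placeholder assumptions, not values from the article.
client = storage.Client()
bucket = client.get_bucket("example-archive-bucket")

# Move objects older than 90 days to the cheaper Coldline class.
bucket.add_lifecycle_set_storage_class_rule("COLDLINE", age=90)
bucket.patch()   # apply the updated lifecycle configuration to the bucket
```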

Blocks & Files: NetApp’s BlueXP provides a Cloud Tiering Service that can automatically detect infrequently used data and move it seamlessly to AWS S3, Azure Blob or Google Cloud Storage – it is a multi-cloud capability. When data is needed for performant use, it is automatically shifted back to the performance tier on-prem. How does Komprise position its offering compared to BlueXP?

Kumar Goswami: NetApp BlueXP provides an integrated console to manage NetApp arrays across on-premises, hybrid cloud and cloud environments. So it’s a good integration of NetApp consoles to manage NetApp environments, but it does not tier other NAS data. 

Also, NetApp’s tiering is proprietary to ONTAP and is block-based, not file-based. Block-based tiering is good for storage-intrinsic elements like snapshots because they are internal to the system – but it causes expensive egress, limits data access and creates rehydration costs plus lock-in for file data. To see the differences between NetApp block-based tiering and Komprise file tiering, please see here.

Blocks & Files: Komprise is layering a set of file-based services (cloud cost-optimization, global file index, smart data workflows) on top of existing on-prem filers and cloud-based file services. The on-prem filer hardware and software suppliers are making their software operate in the hybrid multi-cloud environment, such as NetApp and Qumulo and also Dell with PowerScale. As part of this they are adding facilities that overlap with Komprise features. NetApp’s BlueXP is probably the most advanced such service. How will Komprise build on and extend its relevancy as hybrid-multi-cloud file software and service suppliers encroach on its market?

Kumar Goswami: Customers want data storage cost savings and ongoing cost optimization. Customers want flexibility of where their data lives and customers want to run workflows and monetize their data. This demand is driving the market, and it is excellent validation of the Komprise vision and value proposition. 

Storage vendors who stand to gain by having customers in their most expensive tiers are recognizing the demand and starting to offer some data management. But the storage vendor business model is still derived from driving more revenues from their storage operating system – they offer features for their own storage stack and tie the customer’s data into their proprietary operating system. 

Customers want choice, flexibility and control of their own data, which is what we provide. And data management is much more than just data placement for efficiency. This is why customers want a global index of all their data no matter which vendor’s system the data lives on, why customers want native format of data without proprietary block-based lock-in, and why customers want data workflow automation across environments. 

We are barely scratching the surface of unstructured data management and its potential. Think about the edge. Think about AI and ML. Think about all the different possibilities that no single storage vendor will be able to deliver. We are focused on creating an unstructured data management solution that solves our customers’ pain-points around cost and value today while bridging them seamlessly into the future.

Blocks & Files: There are many suppliers offering products and services in the file and object data management space. Do you think there will be a consolidation phase as the cloud file services suppliers (CTERA, LucidLink, Nasuni, Panzura), data migrators (Datadobi, Data Dynamics), file services metadata-based suppliers (Hammerspace), ILM suppliers such as yourselves, and filesystem and services suppliers (Dell, NetApp, Qumulo, WekaIO) reach a stage where there is overlapping functionality?

Kumar Goswami: As data growth continues, customers will always need places to store the data and they will need ways to manage the data. Both of these needs are distinct and getting more urgent as the scale of data gets larger and more complex with the edge, AI and data analytics. The market across these is large, and data management is already estimated to be about an $18 billion market.

Typically for such big markets, you will see multiple solutions targeting different parts of the puzzle, as the market supports multiple independent players. Our focus is storage-agnostic analytics-driven data management to help customers cut costs and realize value from their file and object data no matter where it lives. We see our focus as broader than ILM. It is unstructured data management, which will broaden even further to data services.

Blocks & Files: If you think that such consolidation is possible, how will Komprise’s strategy develop to ensure its future?

Kumar Goswami: Komprise is focused on being the fastest, easiest and most flexible unstructured data management solution to serve our customers’ needs. To do this effectively, we continue to innovate our solution and we partner with the storage and cloud ecosystem. We have and will continue to build the necessary relationships to offer our customers the best value proposition. Having built two prior businesses, we have learned that focusing on the customer value proposition and continually improving what you can deliver is the best way to build a business.

Blocks & Files: If consolidation is not coming do you think an era of more overlapping functionality is coming? And why or why not?

Kumar Goswami: In our experience, consolidation is tough to predict because it can be influenced by many conditions such as competitive positioning, market conditions and innovation pipelines. Given the large size and growing demand for unstructured data and given the fact that we are still early in the unstructured data explosion, there is enough market demand to support multiple independent players. It is hard for us to predict beyond that.

App storage engines speed up data IO, search

Equipping high IO applications with internal storage engine code makes their data ingest, search and read faster than just leaving these operations to external SAN, filer or object storage controllers.

We’re thinking of Airbnb, LinkedIn, Meta, MinIO and WhatsApp here, and the realization above was sparked by reading a LinkedIn article written by Bamiyan Gobets, CRO of Speedb. B&F described this startup in April last year and now Gobets has filled in some technical background. Gobets’s career history includes time spent as a Unix support engineer for Sun Microsystems, a support engineer at Auspex, and a variety of engineering roles at NetApp. He has a solid support background.

Bamiyan Gobets, Speedb
Bamiyan Gobets

Gobets says it’s all about IO for “creating, reading, updating, and deleting (CRUD) data on physical disk and external disk arrays. When done right, this creates dramatic efficiency gains for the application by eliminating read, write and space amplification from compactions on an LSM-Tree, or inserts on writes for a B-Tree.”

So what are LSM-trees and B-trees?

Imagine an application has a high IO rate, which might be biased towards writes or towards reads. An LSM-tree is a data structure optimized for writes while a B-tree is better structured for reads. The data here is typically an item with a value, a key:value pair such as [pen-2] or [phone-56]. “Pen” and “phone” are keys and “2” and “56” are values. The need is, first, to store them on persistent media, disk or SSD, after they arrive in the application’s memory and, secondly, to find and read them quickly if they are subsequently needed.

A LinkedIn key:value pair could be [Mellor – CRO job role details], and this could be ingested when I enter a new role in my LinkedIn entry, and read when someone looks at my entry.

A problem is that data arrives in an application’s memory in random order, needs to be written to disk sequentially to optimize disk storage, and indexed so it can be found quickly. An application could leave all this to a SAN or filer, relying on that external device’s storage controllers and software to do the job. But that device can be shared with other applications and so not optimized for any particular one, slowing down an app’s IO.

If the app has an in-built or embedded storage engine, it can present better-organized data to the array, making its storage and subsequent retrieval more efficient and faster.

B-Tree

One way of writing data is just to append it to existing data. Then you search for it by reading each piece of data, checking if it’s the right one, and reading the next item if it is not. This can, however, take a very long time if your database or file has a million entries. The longer the file, the worse the average search and read time. We need to index the file or database so we can find entries faster.

A binary search tree and a B-Tree both store index or key:values for data items. A binary tree has a starting point or root, with a few layers of nodes. Each layer is laid out from left to right and a node has a key:value that is greater than nodes to the left and smaller than nodes to the right. To find a particular key, you progress down through the layers in the tree until you arrive at the node with the desired value. The deeper the tree and the more nodes, the longer this takes.

A B-Tree fixes this problem. It has a starting point or root with a small number of node layers called leaves and children. We’ve drawn a simplified diagram to show this. Note that there are different definitions of what is a leaf node and what is not. We have kept things simple and may be oversimplifying. Bear with us, the point is to contrast this with the LSM-Tree.

B&F B-Tree diagram

A node can hold one or more keys, up to a limit, which are sorted. The numbers in each node in our diagram are range limits. The left-most child node, for example, has a range of keys between 5 and 15. If we have a leaf node with a range of 20 to 150 then we know all key:values below 20 will be in its left-most child, key:values between 20 and 150 will be in the middle child, and keys valued at more than 150, but less than 500, will be in its rightmost child.
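To illustrate the range-guided descent just described, here is a toy Python sketch of searching a simplified B-tree layout. The node structure is our own teaching example, not any particular database's implementation.

```python
# Toy B-tree search illustrating the range-guided descent described above.
# The node layout is a simplified example, not a production implementation.

class Node:
    def __init__(self, keys, children=None, values=None):
        self.keys = keys              # sorted separator keys, e.g. [20, 150]
        self.children = children      # child nodes (None for leaf nodes)
        self.values = values or {}    # key -> value mapping held in leaves

def search(node, key):
    """Descend from the root, picking the child whose key range covers `key`."""
    if node.children is None:                     # leaf: the key is here or it's absent
        return node.values.get(key)
    for i, separator in enumerate(node.keys):
        if key < separator:                       # smaller than this separator,
            return search(node.children[i], key)  # so it lives in the child to its left
    return search(node.children[-1], key)         # larger than all separators: rightmost child

# Two-level example: root separators 20 and 150 route lookups to three leaves.
leaves = [
    Node([], values={5: "a", 15: "b"}),
    Node([], values={20: "c", 99: "d"}),
    Node([], values={150: "e", 400: "f"}),
]
root = Node([20, 150], children=leaves)
print(search(root, 99))    # "d"
print(search(root, 400))   # "f"
```

With wide nodes, each level eliminates most of the tree, which is why only a handful of node reads are needed even for very large key counts.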

As you add more data, a B-tree gets more index values which have to be added. This can lead to more leaf nodes and child nodes, with nodes being split in two, and reorganization going on in the background.

The B-tree has far fewer layers than a binary tree, which reduces the number of disk accesses needed to search it. This makes a B-tree scheme good for read-intensive workloads. However, Gobets says: “In general, considering IO ingestion is most challenging from a physical perspective, LSM-Tree has been most widely adopted, and also has the most theoretical potential to optimize reads through innovation, therefore ideal for modern, large scale workloads.”

LSM-Tree

The Log-Structured Merge Tree concept has ordered key:value pairs which are held in two or more places, such as an in-memory structure and a storage structure. We can envisage three such structures: (1) an in-memory container that holds incoming data requests as a hash table or K:V pair table; (2) one or more sorted and indexed in-memory delta containers, into which the first container’s contents are written when it fills up, freeing it for new data; (3) a base storage structure into which the delta container contents are batch written or merged in a way that is consistent with the disk structure.

The merging or compaction is a background operation. Once deltas are merged into the base they are deleted. A search is then a three-step process, looking into each data structure in turn until the item is found.
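Here is a minimal Python sketch of that three-step lookup, with a dict standing in for the in-memory container and sorted lists for the deltas and the base. It is a toy illustration of the idea under our own assumed names and sizes, not how LevelDB, RocksDB or Speedb actually implement it.

```python
import bisect

# Toy LSM-style store: an in-memory table, flushed delta runs, and a merged base.
# This illustrates the three-step lookup described above, not any real engine.

MEMTABLE_LIMIT = 2
memtable = {}        # (1) in-memory container for incoming writes
deltas = []          # (2) sorted delta runs flushed from the memtable
base = []            # (3) base storage structure (sorted list of (key, value))

def put(key, value):
    memtable[key] = value
    if len(memtable) >= MEMTABLE_LIMIT:          # flush when the memtable fills up
        deltas.append(sorted(memtable.items()))
        memtable.clear()

def get(key):
    if key in memtable:                          # step 1: newest data first
        return memtable[key]
    for run in reversed(deltas):                 # step 2: most recent delta runs next
        i = bisect.bisect_left(run, (key,))
        if i < len(run) and run[i][0] == key:
            return run[i][1]
    i = bisect.bisect_left(base, (key,))         # step 3: finally the base structure
    if i < len(base) and base[i][0] == key:
        return base[i][1]
    return None

def compact():
    """Background merge: fold delta runs into the base, then delete them."""
    global base
    merged = dict(base)
    for run in deltas:
        merged.update(run)
    base = sorted(merged.items())
    deltas.clear()

put("pen", 2); put("phone", 56); put("pen", 3)   # later "pen" write supersedes the first
compact()
print(get("pen"))    # 3
```

Writes only ever append to in-memory structures and sorted runs, which is what makes the scheme so friendly to high ingest rates; the cost is that reads may have to consult several structures, which compaction keeps in check.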

Gobets says that the code for an LSM-Tree does not have to be written from scratch. Developers can link to libraries of such code. There are three popular open source libraries for this – LevelDB, RocksDB and Speedb. Speedb is a rewritten RocksDB implementation that, the company claims, is significantly faster than RocksDB.

He says thousands of enterprise apps use LSM-Tree-based storage engines and lists some prominent LSM-Tree app users:

LSM-Tree users

He says: “These embedded storage engines offer the highest-performance data storage capabilities, enabling massive scale and stability under pressure, which is critical to the success of these companies and our experiences with their products.

“This approach is the most efficient for massive ingest of data because it allows new data to be quickly written from sorted arrays in memory, perfectly aligning aggregated bytes (in memory) to fill complete blocks (on disk), while still allowing for efficient retrieval of new data from memory, and old data from disk.”

Of course Gobets is a CRO and has software and support to sell. Even so, his article is educational and informative about the need for B-Trees and LSM-Trees.

Storage news collection – 17 February


Ranga Rajagopalan

Data protector Commvault has released its latest platform, PR 2023. This is one of Commvault’s long-term support releases with a support lifetime of up to three years. They are different from maintenance releases which are available on a monthly schedule. PR 2023 features hybrid-multi-cloud improvements, plus additions in the Kubernetes, security and cost-savings areas.

It includes AWS-Azure cross-cloud replication and recovery with the Complete Data Protection (CDP) and Disaster Recovery (CDR) products. There is new data protection for the Couchbase Big Data NoSQL database with CDP and Commvault’s Backup & Recovery (CBAR) products. A Warm Site Replication feature in CDP and CDR allows periodic replication for lower-priority applications with moderate disaster recovery requirements. There’s more on it in a post by the company’s SVP Products Ranga Rajagopalan here.

GigaOm has published an emerging technology Sonar report looking at DPUs and examining AMD, Fungible (its research predates Fungible’s acquisition by Microsoft), Intel, Kalray, Marvell, NVIDIA and Pliops. Here is the Sonar diagram:

NVIDIA is clearly in the lead with its BlueField technology and VMware support

Analyst Justin Warren writes: “It is difficult to distinguish between the leading vendors on common criteria as they all have quite different focuses. Some have chosen to concentrate on certain use cases, such as storage or networking, while others are focused on specific market segments, particularly the high-end hyperscale cloud market. This is a function of the DPU market being in the very early stages of development.” Full content is available to GigaOm subscribers.

Japan’s Kioxia is reporting [PDF] an operating loss in its fourth 2022 calendar quarter (Kioxia’s fiscal 2022 Q3) due to lower demand for NAND flash from its fabs. Kioxia customers are using up their NAND chip inventories and not buying as much as before due to uncertain economic conditions. A table from Kioxia’s announcement shows changes from Q2 fy2022, with revenue falling from ¥391.4 billion ($2.9 billion) to ¥278.2 billion ($2.07 billion) and a dramatic swing from a ¥34.8 billion ($258 million) profit to an ¥84.6 billion ($629 million) loss.

Kioxia Q3 fy2022 results summary table

There was weak demand in PC and smartphone markets and a sharp decline in NAND average sales prices with Kioxia writing down the value of its own inventory and lowering production volume. Demand for datacenter and enterprise SSDs is weakening sharply. 

Architecting IT has published an Ondat-sponsored analysis report looking at “Container-Native Storage Performance in the AWS Public Cloud”, which can be downloaded from Ondat’s website. It comprises 23 pages of analysis. Analyst Chris Evans writes: “the choice of storage for CNS is affected by storage performance, cost, and resiliency. We will dig deep into the details of the implementation of each storage solution, helping the end user pick the right choice for their application.” And dig deep he does, looking at latency, IOPS and throughput for EBS, local SSDs (Instance Store Volumes), and the Elastic File System. A conclusion is: “The results from our testing indicate that NVMe drives offer the best balance of performance and cost, with good predictability when deploying Ondat container-native storage.”

Scale-out filer supplier Qumulo has hired Ryan Farris as VP of Product and Brandon Whitelaw as VP of Strategic Partnerships. Qumulo says it’s entering a new chapter of growth. The new hires come just after the company wrapped up its fiscal year 2023 with a 22 percent increase in customers.

Budapest-based post-production media business Origo Studios is using Qumulo’s software running on HPE ProLiant DL325 Gen10 Plus servers via a GreenLake contract. Origo Studios has worked on films such as Dune, Dune: Part Two and Blade Runner 2049 since opening a decade ago. Using Qumulo’s file data software delivered as a service through the HPE GreenLake edge-to-cloud platform, the Origo Studios team says it was able to process Scanity-generated images at 4K resolution at almost twice the previous speed.

Samsung and SK hynix have reported a rise in orders for high-bandwidth memory (HBM) products, the ones that have a faster DRAM-CPU interface than the standard server CPU socket interface. The aim is to have CPUs process data faster for AI/ML applications including ChatGPT. The method is to get data closer to compute so less time has to be spent moving the data into the processor’s caches.

SSD controller supplier Silicon Motion is sampling its new SM2268XT controller to OEMs. It supports PCIe gen 4 with 4 lanes. The device is built to support 200+ layer 3D NAND with TLC and QLC cells. It has a host memory buffer design, with Silicon Motion’s 8th generation NANDXtend ECC technology using a performance-optimized 4KB LDPC engine and RAID. This delivers 1.2 million random read and write IOPS, 7.4GBps sequential reads and 6.5GBps sequential writes capability. A clock gating mechanism can power down areas of unused blocks. 

Silicon Motion SM2268XT controller on a board

The SNIA has released its Smart Data Accelerator Interface (SDXI) Specification v1.0, a standard for a memory-to-memory data mover and acceleration interface. It provides data movement between different address spaces, including user address spaces located within different virtual machines, and without mediation by privileged software once a connection has been established. The interface and architecture can be abstracted or virtualized by privileged software to allow greater compatibility of workloads or virtual machines across different servers. Click here to download the complimentary version of the SDXI v1.0 specification.

Scale-out, parallel filesystem SW provider WekaIO announced a new global channel partner program, WEKA X, saying it has a channel-led, partner-first sales strategy encompassing VARs, SIs and MSPs. They get the usual mix of tiers (Pro, Prime and Premier) with training, certifications, exclusive pricing, and incentives to streamline deal registration. Partners can buy WekaIO’s WEKA Data Platform software direct from WekaIO or from other WekaIO server and cloud partners.

IDC has published a VAST Data-sponsored (paid for) white paper on High-Performance NFS Storage for HPC and AI. It claims that the HPC community has moved away from NFS in the recent past because it “presented inherent scaling challenges in parallel computing environments where performance trumps everything else. … As traditional implementations of NFS-based systems were never designed for atomic parallel I/O, many practitioners in the HPC community shifted to other parallel (distributed) file systems such as Lustre that were specifically designed for parallel, clustered computing environments.” Now most HPC environments rely on parallel file systems.  

The paper identifies disadvantages with this approach – skilled staff needs, SW version control, implementation planning intricacy, change management and networking complexity – all making costs higher. “IDC believes that combining the benefits of NFS, flash media, and parallel scale-out architecture on the server side provides maximum benefits for HPC-AI environments.” VAST Data claims it can provide this and lower costs – although no numbers are cited – and claims: “VAST’s NFS-based approach to HPC-AI presents a strong and most compelling case for consideration.” 

Datacenter as a service equipment supplier Zadara said its object storage can be a target for Veeam Backup & Replication v12’s direct write to object storage capability, joining Cloudian, Object First, MinIO, Pure Storage and Scality in that regard. Zadara said Veeam Data Platform v12’s multiple object storage extents, combined with Zadara Object Storage, enable greater flexibility over configurations, such as dedicated object storage separated and isolated for certain workloads or customers. A Zadara blog goes into more detail.

Two open source data lakehouse platforms will emerge, says dbInsight

Research house dbInsight says it’s likely Delta Lake and Iceberg will become the main data lakehouse platforms, with proprietary alternatives being a transitional stage.

Tony Baer.

This prediction is made by analyst Tony Baer in a research report entitled “Data Lakehouse open source market landscape.” This 13-page document looks at the data lakehouse technology area, defining the technology and comparing three open source projects – Delta Lake, Apache Hudi and Apache Iceberg – with proprietary alternatives such as those from AWS, Oracle and Teradata.

Baer writes that “Over the past five years, a new construct designed to combine the best of the Data Warehouse and Data Lake worlds has emerged: the Data Lakehouse.” This construct is based on a key enabling technology: “new table formats overlaying atop cloud object storage that deliver the performance, governance, granular security, and ACID transaction support of data warehouses, combined with the economics of scale and the analytical flexibility of data lakes.”

In his view, “Data lakehouses will enable data lakes to perform and be controlled, governed, and secured like data warehouses.”

Such warehouses support ACID (Atomicity, Consistency, Isolation, and Durability) as a way of having their data integrity maintained. ACID transaction support will become the lynchpin of data lakehouses because it will give enterprises confidence in the consistency of the data. Baer says: “In the long run, open source will prevail because ACID support will be table stakes, not a competitive differentiator.”

He thinks there is 80 percent functional parity between Delta Lake, Hudi and Iceberg. It will be “the breadth and depth of the commercial ecosystem and depth of support that will determine the winners” and there are likely to be just two. He thinks: “Today, Delta Lake and Iceberg have the clear momentum and are clearly the early favorites to be the lakehouses left standing.”

And Hudi (which stands for Hadoop Upserts Deletes Incrementals): “The challenge is building a commercial ecosystem beyond the long tail, with pressing need to line up a major data platform heavyweight” such as IBM, Oracle or SAP for example.

Baer says Data Lakehouses will eventually co-opt the enterprise data warehouse because they provide many of the same capabilities for multi-function analytics, but they will not replace data lakes or purpose-built warehouses or data marts. Cloud data warehouses with support for polyglot data types, Python and AutoML capabilities will be, he thinks, the gateway drugs for data lakehouses.

This is a readable and informative report that is free to download.

Backblaze fans cloud storage fires higher

Backblaze grew its calendar Q4 revenues faster than analysts expected as the cloud storage business outpaced the wider backup sector.

Revenues in the quarter ended December 31 were $22.9 million, 23 percent higher year-on-year, but there was a loss of $14.8 million, 54 percent deeper than last year’s $9.6 million loss. Costs are rising faster than revenues. Full 2022 revenues were $85.2 million, 26 percent higher year-on-year, with a loss of $51.7 million. 

CEO Gleb Budman said: “We were pleased to finish 2022 with strong Q4 overall revenue growth of 23 percent driven by an increasing proportion of our B2 Cloud Storage service, which grew 44 percent in Q4. We are excited as we begin 2023 as we see the opportunity to help more small and large businesses reduce the cost of their cloud infrastructure.”

Backblaze revenues

He added: “As we continue to grow revenue, we’re moderating expense growth and are targeting to approach adjusted EBITDA breakeven in Q4 of this year.”

Backup storage brought in $13.1 million, 11 percent higher than a year ago, while generic cloud storage revenues rose 44 percent, four times faster, to $9.5 million.

A chart looking at the revenue history of the classic backup storage business versus the cloud storage business shows the lower backup storage growth rate:

Backblaze segments

At these growth rates, cloud storage revenues could exceed backup storage revenues in the second or third 2023 quarters. The cloud storage business operates well below Amazon Web Services’ S3 price umbrella, and that gives it a tailwind.

William Blair analyst Jason Ader told subscribers that the cloud storage business “remains well positioned as a low-cost cloud storage alternative … providing a tailwind for the business as customers look to reduce cloud costs.” Backblaze, Ader said, has a “large opportunity to provide cloud storage to the underserved midmarket.” Possibly Backblaze is thinking about how it might revitalize the backup storage business.

Backblaze expects next quarter’s revenues to be between $23.1 million and $23.5 million, 20 percent higher year-on-year at the midpoint, and full 2023 revenues are being guided to be between $98 million and $102 million, up 17.4 percent at the midpoint.

Western Digital My Book drive reaches 44TB

Western Digital has enabled massive external storage for desktop, notebook and workstation users with a 44TB dual-drive My Book product.

The My Book product line is used to back up desktop files and folders as well as acting as a central store for data from smart devices such as digital cameras. It is also found in small office/home office environments.

Western Digital My Book single drive version
My Book single drive version

My Book Pro single drive and twin drive systems arrived in 2006 as desktop and small office/home office external file storage systems. They provided remote access over the internet as well as local access, and had miserly 500GB and 1TB capacities respectively. There were later My Cloud Home and Home Duo variants providing a home NAS which synced with other devices for storing, streaming and sharing across an internet connection. Twin drive My Book systems could be used as two separate drives, as a faster combined logical drive with data striped across the two disks in a RAID 0 setup, or as a single logical drive in a RAID 1 mirroring scheme.

Capacities were steadily uprated as the component SATA disk drives went from generation to generation. By August 2017 there were 4, 6, 8, 12, 16 and 20TB capacities in the Duo configuration, using WD Red 5,400rpm disk drives. WD’s Red Pro drives now extend to the 22TB capacity level from a 2TB start point. And the USB 3.2 gen 1-connected My Book product reaches that level too, from a 4TB starter priced at $99.99, through 6, 8, 12, 14, 16 and 18TB variants to the 22TB range topper at $599.99. That’s a 44x capacity increase in 17 years. What capacity level will we see in another 17 years?

Western Digital My Book Duo
My Book Duo

The dual-drive My Book Duo starts at $439.99 for 16TB, and goes through 20, 24, 28 and 36TB capacity points to reach the giddy heights of 44TB for $1,499.99, 2.5x the single drive product’s price. It can be used as two independent drives, like a twin disk JBOD, or as a single RAID 1 device for redundancy, but not RAID 0.

WD notes the average US household has more than 10 devices storing data, such as portable SSDs and HDDs, memory cards and USB flash drives. Product management VP Susan Parks said: “With multiple devices used in our everyday life, we have the ability to instantly create, consume and generate massive amounts of content.” The new high-capacity My Books can be used to back up personal devices or an entire household’s devices.

In June 2021, attackers exploited a vulnerability in WD’s My Book Live NAS devices, remotely resetting them and making customers’ files inaccessible. This fault has since been fixed.

WD’s announcement quotes John Rydning, an IDC research VP: “While many people rely on the cloud, we know consumers are looking for local storage at their fingertips to help them preserve and readily control their growing amount of personal and business data.”

Get a My Book datasheet here and a My Book Duo datasheet here.

Both products are available at select Western Digital retailers, etailers and on the Western Digital store.

Storage security toughen-up for compliance and cyberwar in 2023

Sponsored Feature Cybercriminals tend not to discriminate when it comes to the type of data they steal. Structured or unstructured, both formats contain valuable information that will bring them a profit. From a cybersecurity practitioner’s perspective, however, structural state presents specific challenges when it comes to storing and moving sensitive data assets around.

Generally speaking, structured – quantitative – data is stored in an organized model, like a database, and easily read and manipulated by a standard application.

Unstructured – qualitative – data can be harder to manipulate and analyze using standard data processing tools. Typically it is stored in scattered, unorganized ways, sliced across silos, applications and access control systems, without formalized information about its state or location.

To complicate matters, unstructured data is increasing in importance, as it is needed to drive business growth and planning. Some projections indicate that unstructured data constitutes more than 90% of all enterprise data and continues to grow at 21% per year, so the ability to store it securely over the long term is imperative.
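To put that growth rate in perspective, a short projection sketch: at 21 percent a year an unstructured data estate doubles roughly every three and a half years. The 21 percent rate is the projection cited above; the 1PB starting volume is purely illustrative.

```python
import math

# Project unstructured data growth at the 21% annual rate cited above.
growth_rate = 0.21   # projection cited above
start_pb = 1.0       # purely illustrative starting volume, in petabytes

doubling_years = math.log(2) / math.log(1 + growth_rate)
print(f"Doubling time: {doubling_years:.1f} years")        # roughly 3.6 years

for year in (5, 10):
    volume = start_pb * (1 + growth_rate) ** year
    print(f"After {year} years: {volume:.1f}PB")           # ~2.6PB, ~6.7PB
```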

It’s a challenge for IT security chiefs because unstructured data’s decentralized nature makes it harder to maintain effective and consistent security controls that govern access to it.

Complying with Executive Orders

The challenge is compounded by the regulatory requirements pertaining to cyber-governance that organizations globally must now comply with. It’s no longer solely a matter of risking penalties for non-compliance. Compliance is becoming a condition of business, and has been formalized in the US by Presidential Executive Order 14028.

Importantly, EO 14028 implies that suppliers lacking comprehensive security will not be doing business with the US government. In effect, cybersecurity responsibility is being shifted onto solutions providers and away from their customers, which is having a game-changing impact on procurement procedures – and on cybersecurity provisioning.

EO 14028 was spurred into effect by 2020’s near-catastrophic cybersecurity breach, when hackers – suspected to be operating under the auspices of Russian espionage agencies – targeted IT management software vendor SolarWinds by inserting malicious code into its Orion monitoring and management software.

The SolarWinds hack triggered a much larger supply chain incident that affected 18,000 of its end-user organizations, including both government agencies and enterprises. It’s assumed that the former were the primary targets, with some enterprise users suffering ‘collateral damage’.

“Concepts of best practice in data storage have evolved rapidly since the SolarWinds hack,” says Kevin Noreen, Senior Product Manager – Unstructured Data Storage Security at Dell Technologies. “This and other cyberbreaches involving ransomware have accelerated that evolution at both the tech vendors and their customers. At Dell, the recent feature focus for our PowerScale OneFS family of scale-out file storage systems reflects those changes, and in doing so, orients the platform’s future development.”

Securing stored unstructured data poses specific challenges, especially when it comes to provisioning high-performance data access for applications in science and analytics, or video rendering, adds Phillip Nordwall, Senior Principal Engineer, Software Engineering at Dell Technologies.

“Cybercriminals have growing interest in these fields. Intellectual property in life sciences, for example, holds high transferable resale value,” Nordwall reports. “Streamed entertainment data is also highly saleable. So effective security while that data is being managed, at rest or in flight, is now critical.”

The way in which such security directives have influenced specification development for Dell’s newly released PowerScale OneFS 9.5 has helped inform five broader trends that Dell’s experts foresee for 2023, as Noreen and Nordwall explain.

Finding closures

The first prediction is that datacenter infrastructure vendors will close out cyber vulnerabilities by engineering safeguards directly into their products, such as network-attached storage systems.

“While in 2023 cybercriminals will continue to exploit what they’ve exploited before, they’ll find their efforts increasingly frustrated as new built-in security features are introduced across the datacenter infrastructure, closing off their customary attack vectors,” says Noreen. That closing-off will happen gradually, but it will happen nonetheless.

To that end, the latest version of Dell’s PowerScale scale-out NAS solution – OneFS 9.5 – brings an array of enhanced security features and functionality. These include multi-factor authentication (MFA), single sign-on support, data encryption in-flight and at rest, TLS 1.2, USGv6R1/IPv6 support, SED Master Key rekey, plus a new host-based firewall.

“In the past, security requirements were always viewed as important, but are now being emphasized to be more proactive as opposed to being reactive,” explains Noreen. “In addition, PowerScale OneFS 9.5’s latest specification scopes the growing range of security enhancements required by US Federal and Department of Defense mandates such as FIPS 140-2, Common Criteria and DISA STIG.”

PowerScale is undergoing testing for government approval on the DoD Information Network Approved Products List, for example.

“Having enhanced security compliance very evidently in place wherever and whenever possible across the spec serves a dual role,” says Noreen. “It reinforces your cyber-defensive posture, and it speaks a message to would-be attackers: ‘we are protected – go focus your attacks elsewhere’.”

Caught in the cyber crossfire

Dell’s second 2023 prediction is that commercial entities will find themselves caught in the crossfire of cyberwar offensives if geopolitical conflicts trigger renewed attacks from nation state-sponsored threat actors – especially as those actors probe the effectiveness of new governmental regulatory security compliance frameworks.

“Gartner has declared that geopolitics and cybersecurity are inextricably linked, and hybrid warfare is a new reality,” says Noreen. “Because of the increased interconnectedness between economies and societies, definitions of critical infrastructure have extended to include many commercial operations such as shipping, logistics and supply chains. Geopolitical conflict escalates cyber-risk, but will also accelerate the introduction and criticality of zero trust adoption over the coming 12 months.”

Zero-ing in on trust models

Following from this, the third prediction from Dell is that zero trust models will continue to reinforce enterprise cybersecurity strategies in 2023 as they are integrated into product platform technologies that work in sync with enterprise zero trust procedures and practices.

“Zero trust security models allow organizations to better align their cybersecurity strategy across the datacenter, clouds, and at the edge,” says Nordwall. “Our aim is to serve as a catalyst for Dell customers to achieve zero trust outcomes by making the design and integration of this architecture easier.”

“We designate zero trust as a journey,” says Noreen. “We need different implementations of zero trust that work together. Organizations now have to think about their IT infrastructure from multi-cloud to edge, their user-base – including supply chain partners – and think about how zero trust applies at a process level. In the datacenter that means component by component – servers, networking and, of course, NAS.”

Noreen adds: “Another thing we believe will step-up in 2023 is the notion of zero trust, and also resiliency borne of enhanced systems that ‘patrol’ data assets in order to detect attacks before they’ve had the opportunity to cause damage. These will be cybersecurity gamechangers.”

It will likely be mid-way into 2023 before the full benefits of end-to-end zero trust feed through. In the meantime, systems must be ‘patrolled’ for malware attacks that manage to infiltrate networks.

To perform this task PowerScale integrates with Superna’s Ransomware Defender module [part of the Superna Eyeglass Data Protection Suite], which uses per-user behavior analytics to detect abnormal file access and protect the file system.

“Superna’s Ransomware Defender solution minimizes the cost and impact of ransomware by protecting data from attacks originating inside the network,” Nordwall explains. “The Ransomware Defender module uses automatic snapshots, identifies compromised files, and denies infected users’ accounts from attacking data by locking the users out.”
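Superna doesn’t publish its detection internals, but the per-user behavior analytics Nordwall describes can be sketched in outline: track each account’s file-activity rate and lock out accounts whose behavior spikes the way mass encryption does. The window, threshold and lockout hook below are hypothetical illustrations, not Superna’s actual API.

```python
from collections import defaultdict, deque
import time

# Illustrative sketch of per-user file-access anomaly detection, in the spirit
# of the behavior analytics described above. Thresholds, window size and the
# lockout hook are hypothetical, not Superna's implementation.
WINDOW_SECS = 60              # sliding window for counting file events
MAX_EVENTS_PER_WINDOW = 500   # "normal" ceiling; a real system learns this per user

events: dict[str, deque] = defaultdict(deque)

def lock_out(user: str) -> None:
    # Hypothetical hook: a real system would deny the account access to file shares.
    print(f"Locking out {user}: abnormal file-access rate detected")

def record_file_event(user: str, now: float | None = None) -> bool:
    """Record one file operation for `user`; return True if the account was
    locked out because its activity looks like mass encryption."""
    now = time.time() if now is None else now
    q = events[user]
    q.append(now)
    while q and now - q[0] > WINDOW_SECS:   # drop events outside the window
        q.popleft()
    if len(q) > MAX_EVENTS_PER_WINDOW:
        lock_out(user)
        return True
    return False
```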

Demands on supply

The combined benefits of enhanced security built into storage platforms, plus compliance with emergent regulatory mandates, will remediate longstanding cybersecurity weak links in supply chains in 2023, Dell predicts.

“We will see things continue to improve in supply chain resiliency in terms of better safeguards to ensure security of solutions as shipped,” says Noreen. “These measures eliminate any opportunity for vendor integrity to be compromised by intermediate interference.”

Noreen explains the safeguards: “When a PowerScale unit is assembled in our factory, we put an immutable certificate that sits on its system. That system is then shipped from the factory to the customer site. When ready to be installed there’s a software product that customers run against that hardware. It validates that what was shipped from the factory is what was delivered to the customer site. It attests that the system hasn’t been booted in the interim, nobody has installed additional memory – or anything that could relay malware.”
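Dell hasn’t detailed the mechanism beyond Noreen’s description, but the check resembles standard supply chain attestation: the factory signs a record of the as-built configuration, and a customer-side tool re-derives that record from the delivered hardware and compares the two. A simplified, illustrative sketch follows; the keys, fields and checks are hypothetical, not Dell’s actual process.

```python
import hashlib, hmac, json

# Simplified, illustrative ship-to-site attestation in the spirit of the
# process described above. The real mechanism uses an immutable factory-
# installed certificate; keys, fields and checks here are hypothetical.
def factory_sign(inventory: dict, factory_key: bytes) -> str:
    """Factory side: sign a canonical record of the as-built configuration."""
    payload = json.dumps(inventory, sort_keys=True).encode()
    return hmac.new(factory_key, payload, hashlib.sha256).hexdigest()

def site_verify(observed: dict, signature: str, factory_key: bytes) -> bool:
    """Customer side: re-derive the record from the delivered hardware and
    compare it with what the factory signed."""
    return hmac.compare_digest(factory_sign(observed, factory_key), signature)

key = b"factory-secret"   # stands in for the certificate's key material
as_built = {"serial": "PS-1234", "dimm_slots": 8, "boot_count": 0}
sig = factory_sign(as_built, key)

# Extra memory installed in transit changes the record and fails the check.
tampered = {**as_built, "dimm_slots": 16}
print(site_verify(as_built, sig, key))   # True
print(site_verify(tampered, sig, key))   # False
```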

Workforce enforcement

Fifth on the Dell list of 2023 predictions is that organizations will place even more emphasis on employee ransomware awareness training, leveraging tools and guidance that pinpoint patterns in cyber-defense weak spots.

“From one perspective organizations are compelled to step-up the focus on workforce education and training because employees continue to constitute a non-technological vulnerability in enterprise security, despite past investments in cyber-threat education and training,” says Noreen. “We expect them to make more extensive use of tools such as Superna Eyeglass’s Ransomware Defender to close-out these kinds of vulnerability.”

“If Ransomware Defender detects ransomware attack behavior, it initiates multiple defensive measures, including locking users from file shares – either in real-time or delayed,” adds Nordwall. “There are also timed auto-lockout rules such that action is taken even if an administrator is not available, as well as automatic response escalation if multiple infections are detected.”

Sponsored by Dell.

Spectra Logic targets on-prem archive market

Spectra Logic has announced a Spectra Digital Archive offering for on-premises archiving using its StorCycle, BlackPearl and tape library products.

StorCycle is Spectra Logic’s file/object lifecycle management software. It scans primary on-premises or public cloud storage and moves older, less-accessed data from primary storage tiers to less expensive ones, such as nearline disk (NAS or object), tape, or archive tiers in the public cloud – both AWS and Azure. BlackPearl is a hybrid flash/disk front-end cache that stores files as objects on back-end tape devices. Spectra’s tape library products include the TFinity line, which can hold an exabyte or more of data.
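StorCycle’s policy engine isn’t public, but the scan-and-migrate idea it implements is easy to illustrate: walk the primary tier, pick files untouched for longer than a policy threshold, and move them to a cheaper target. The paths and one-year threshold below are illustrative, not StorCycle’s actual behavior.

```python
import shutil, time
from pathlib import Path

# Illustrative age-based tiering pass, in the spirit of the scan-and-migrate
# workflow described above (not StorCycle's actual implementation).
PRIMARY = Path("/mnt/primary")    # hypothetical primary NAS mount
ARCHIVE = Path("/mnt/archive")    # hypothetical nearline or archive target
MAX_AGE_DAYS = 365                # policy: untouched for a year

def migrate_cold_files(dry_run: bool = True) -> None:
    """Scan the primary tier and move files not accessed within the policy window."""
    if not PRIMARY.is_dir():
        print(f"Primary path {PRIMARY} not found")
        return
    cutoff = time.time() - MAX_AGE_DAYS * 86400
    for path in PRIMARY.rglob("*"):
        if path.is_file() and path.stat().st_atime < cutoff:
            dest = ARCHIVE / path.relative_to(PRIMARY)
            print(f"{'Would move' if dry_run else 'Moving'} {path} -> {dest}")
            if not dry_run:
                dest.parent.mkdir(parents=True, exist_ok=True)
                shutil.move(str(path), str(dest))

migrate_cold_files(dry_run=True)   # report what would move without touching anything
```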

David Feller, Spectra Logic VP of product management and solutions engineering, said: “Organizations are becoming increasingly overwhelmed by the size, scope and cost of their data growth. We designed the Spectra Digital Archive… to help data-driven customers easily and affordably manage, preserve and access this data by migrating it seamlessly to less costly storage tiers for long-term digital preservation, usage and monetization.”

Spectra Logic StorCycle graphic

The Spectra Digital Archive offering doesn’t do anything more than what is already available from Spectra. Indeed, StorCycle is already positioned as a kind of ingest gateway for Spectra’s secondary and tertiary storage tiers. These can be either on-premises, in the form of BlackPearl and tape respectively, or in the public cloud.

Nathan Thompson, Spectra Logic

Spectra Logic CEO and founder Nathan Thompson told us: “To date, over 90 percent of StorCycle installations have been in the paired software and hardware configuration. It’s become apparent that it’s a unique offering from Spectra so we wanted to make our customers aware of this combination of products that solve the management and archival of massive amounts of data, including software, hardware, professional services and support, all from one organization.

“When StorCycle was announced we thought the use case would largely be for archiving to public clouds, but our market experience has shown more interest in long-term, on-prem archival. Making Spectra Digital Archive available as a complete solution simplifies the design and buying process for our channel and this segment of customers.”

New product features are coming: “We will be adding a number of features and capabilities to the total solution over the next six to 18 months.”

One of the existing capabilities is project archiving. When an organization completes a project, all of its files and directories can be archived as a grouped entity and managed and accessed at that level. We can imagine that could be an appreciated on-premises archive feature.

In general, Spectra has just announced it’s upping its archiving game. Competitors such as XenData, PoINT Software and Quantum with its ActiveScale product line are on notice that Spectra is gunning for the on-prem archive software and hardware market.