‘Reverse CDN’ is huge storage vendor opportunity for self-driving cars

Autonomous and near-autonomous vehicles will need black box facilities to help with accident cause analysis in the case of a crash. They may also need a ‘reverse’ content delivery network (CDN) to handle generated data upload to manufacturers and fleet operators.

These are some of the findings from our interviews with three self-driving car experts on autonomous and near-autonomous vehicle (AV, NAV) data storage. We have previously looked at AV, NAV storage in general, taken a first look at the specifics, and zoomed into self-driving car data generation.

In this article we take a closer look at the requirements for in-vehicle storage and data transmission to remote operational sites.

To recap, autonomous and near-autonomous vehicles need on-board data storage to drive along the highway, find their way using maps, and enable tactical avoidance of other vehicles, street objects and pedestrians. They also need to communicate with manufacturers and fleet operators to upload and download information.

Consumer AV, NAV on-board data generation ranges from 1TB to 15TB per day, and a robo-taxi will create anything from 60TB to 450TB a day.

Our panel

Our expert panel members are Christian Renaud, an analyst at 451 Research; Robert Bielby, senior director of Automotive System Architecture at Micron; and Thaddeus Fortenberry, who spent four years at Tesla working on Autopilot architecture. 

Left to right: Christian Renaud, Robert Bielby, Thaddeus Fortenberry.

Blocks & Files: Will the AV-generated data have to be stored in the vehicle and, if so, for how long?

Christian Renaud, 451: Some of it is ephemeral, used for real-time decision making and so relevant only within a 10-15 second window, and a subset of summarized data will be sent off the vehicle to the OEM for training/inferencing. That data storage interval is an unknown to us right now as no one is shipping a full AV yet.

Robert Bielby, Micron: Except when the vehicle is in a training or data collection phase, only a nominal amount of AV-generated data will be collected and stored in the vehicle.

Nominal amounts of data that capture and log overall vehicle system performance, usage statistics and the like will be maintained, similar to the data logging that exists in today’s vehicles.

One exception to this is the black box, which is an emerging requirement for all vehicles with level 3 and above capabilities. In this case, a 30-second snapshot of all the relevant system data prior to an accident, or an event that caused the automatic emergency brakes to be applied, must be saved to memory. In some instances, the 30 seconds after the event must also be stored, for a total of one minute.

Additionally, it may be required that up to eight different incidents be stored. While this application area is still evolving, there are different strategies and philosophies regarding the use of compression to reduce the number of bits that get stored.

With system data rates in the range of 3 Gbit/s to 40 Gbit/s, a considerable amount of storage could be required for the black box if eight incidents of one minute each at 40 Gbit/s need to be stored. This data does not need to be retained for as long as in other applications, as it is expected that the contents of the black box, when of interest, will be written to another storage medium for more in-depth analysis.
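
[Blocks & Files: Bielby's black box figures can be turned into a rough capacity estimate. The sketch below assumes uncompressed data, decimal units (8 Gbit = 1GB), and the worst case of eight one-minute incidents; the function name is ours, for illustration only.]

```python
# Back-of-envelope black box sizing from the figures quoted above.
# Assumptions (ours): no compression, decimal units, 8 Gbit = 1 GB.

def black_box_gb(rate_gbit_s, seconds_per_incident=60, incidents=8):
    """Raw storage in GB needed to hold the retained incident snapshots."""
    total_gbit = rate_gbit_s * seconds_per_incident * incidents
    return total_gbit / 8  # convert gigabits to gigabytes

low_gb = black_box_gb(3)    # bottom of the quoted 3-40 Gbit/s range
high_gb = black_box_gb(40)  # top of the quoted range
print(f"{low_gb:.0f} GB to {high_gb:.0f} GB")  # prints "180 GB to 2400 GB"
```

On these assumptions the black box needs somewhere between 180GB and 2.4TB before any compression is applied.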

Thaddeus Fortenberry: Connectivity will continue to be the main constraint for getting data off customers’ vehicles. However, getting data and learnings off cars is critical to support the rapid publication of relevant data. We will see cars with increasing onboard storage to support optimal transfers, pre-processing, and intelligent network utilisation (low-orbit, Wi-Fi, 5G, etc.).

Blocks & Files: How will the data be uploaded to a cloud data centre? How often?

Renaud: Cellular radios (4G LTE/5G) [and] real-time.

Bielby: The two primary times when data will be uploaded to the data centre will be in the creation and updating of real-time maps, which is a relatively low bandwidth operation, and in instances where there is a disparity detected in the results of the [on-board] AI algorithm. General vehicle health and maintenance information will also be transmitted to the cloud, along with driver profile data.

Uploading will occur through cellular connections for map updates and detected algorithm disparities. For other non-time-critical data, local Wi-Fi connections will be used for over-the-air uploads and downloads when the car is parked and not in use.
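
[Blocks & Files: The upload split Bielby describes amounts to a simple routing rule. The sketch below is one hypothetical way to express it; the data kind labels and the `choose_link` function are our own illustrative names, not part of any vendor's software.]

```python
from enum import Enum

class Link(Enum):
    CELLULAR = "cellular"
    WIFI = "wifi"
    DEFER = "defer"  # hold in on-board storage until a suitable link appears

# Hypothetical labels for the two time-critical upload types Bielby names.
TIME_CRITICAL = {"map_update", "algorithm_disparity"}

def choose_link(kind, parked, wifi_available):
    """Pick an upload path for one item of vehicle data."""
    if kind in TIME_CRITICAL:
        return Link.CELLULAR
    if parked and wifi_available:
        return Link.WIFI
    return Link.DEFER

print(choose_link("algorithm_disparity", parked=False, wifi_available=False))
print(choose_link("vehicle_health", parked=True, wifi_available=True))
print(choose_link("driver_profile", parked=False, wifi_available=True))
```

Anything that is not time-critical simply waits in local storage until the car is parked within reach of Wi-Fi.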

Fortenberry: The best way to get the fleet driving more accurately is to build a data portfolio of localisation, routes and environmental conditions. Obviously, there will be locations and events that some cars will care about and many that they will not. Therefore we will see a quality of service (QoS) parameter attached to data.

My belief is that a well-designed data ingest infrastructure is both crucial and key to a successful autonomous vehicle solution. Storage vendors should realise that creating a policy-managed gateway/accessor solution with storage caching is a huge opportunity. [More on this below].

Blocks & Files: What is the maximum amount of storage capacity that will be needed in an AV to cope with the data generation load and the worst case data transmission capability?

Renaud: Excellent question. Honest answer is it’s too soon to tell. If you were to take the average duty cycle I mentioned before (2 hours), take the average data generated during that time and break it down into that 10-15 second relevancy window, that would answer the on-board storage question. Less than 500GB certainly, possibly less than 50GB.
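
[Blocks & Files: This calculation can be sketched using the consumer data generation range quoted earlier in this article. We assume, and it is only our assumption, that the 1TB to 15TB daily figure is generated evenly across the two-hour duty cycle; the function name is illustrative.]

```python
# Data held in one relevancy window, following the method described above.
# Assumption (ours): the daily total is produced evenly over the duty cycle.

def window_gb(tb_per_day, duty_hours=2.0, window_s=15.0):
    """GB generated during one relevancy window."""
    rate_gb_per_s = tb_per_day * 1000 / (duty_hours * 3600)
    return rate_gb_per_s * window_s

for tb in (1, 15):
    print(f"{tb} TB/day: {window_gb(tb):.1f} GB per 15-second window")
```

On those assumptions a single 15-second window holds roughly 2GB to 31GB, which sits comfortably inside the "possibly less than 50GB" estimate.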

Bielby: Looking at Mobileye’s REM technology, which provides the basis for creating real-time mapping, the associated data rates are on the order of 10KB/km. Other real-time mapping technologies are also based on this type of sparse data generation.

Assuming a cellular shadow region that extends for 1km or even several kilometres, the amount of data that would need to be buffered until connectivity is restored is modest at best, on the order of tens of kilobytes. In addition to the real-time mapping information, an HD map database is permanently stored in the car, with sizes up to 160GB. Those maps are usually updated every couple of months, either when the car is connected to a Wi-Fi station or over the air.

Another event that creates heavy data traffic to the cloud is when the AI inference process suspects that a certain road situation needs to be retrained. In that case sensor data is captured in local storage, typically for one minute around the event, and uploaded to the cloud for AI retraining. Such an event requires up to 300GB of local storage.
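
[Blocks & Files: Bielby's three components can be added up in a few lines. The sketch below assumes decimal units and that the one-minute retraining capture runs at the top of the 3 Gbit/s to 40 Gbit/s system data rate range he quoted earlier, which happens to reproduce the 300GB figure. The 5km shadow region is an arbitrary example of ours.]

```python
KB_PER_KM = 10           # REM-style sparse mapping data, as quoted
HD_MAP_GB = 160          # upper size quoted for the on-board HD map
SENSOR_RATE_GBIT_S = 40  # assumption: top of the quoted system data rate range

def shadow_buffer_kb(shadow_km):
    """Mapping data buffered while driving through a cellular shadow."""
    return KB_PER_KM * shadow_km

def capture_gb(seconds=60, rate_gbit_s=SENSOR_RATE_GBIT_S):
    """Raw sensor capture around a retraining event, in GB."""
    return rate_gbit_s * seconds / 8  # gigabits to gigabytes

# Convert the KB buffer to GB (1 GB = 1e6 KB) before summing.
total_gb = HD_MAP_GB + capture_gb() + shadow_buffer_kb(5) / 1e6
print(f"capture: {capture_gb():.0f} GB, buffer: {shadow_buffer_kb(5)} KB")
print(f"total: {total_gb:.2f} GB")
```

The mapping buffer is negligible next to the map database and the sensor capture, so the total lands at about 460GB.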

Fortenberry: The only real way to answer this is to establish the value of data for the car company. This is currently a tough one to come up with because the ML training process is too disconnected from incoming data. In fact, almost all companies are leveraging engineering vehicle data for their development.

The answer would also be dependent on the type and regularity of networks the vehicle connects to. We will be able to use quite a lot of local storage, but the vehicle BOM pressure will be substantial for some years.

Blocks & Files: Will disk drives or flash storage be used or a combination?

Renaud: Flash storage.

Bielby: While there are platforms today that are based on HDD, the continued decline in cost per bit of solid state-based storage and the inherent robustness that SSDs offer over rotating storage mediums ultimately are driving an aggressive trajectory to displace HDDs.  

Last year, Micron announced a 1TB BGA SSD which provides the requisite storage capacity for today’s and tomorrow’s projected storage requirements in an area of only 16x20mm. It’s clear that the lower power, smaller area, and higher reliability of the SSD provide compelling benefits that are today driving traditional HDDs out of designs.

Fortenberry: I see no scenario where disk storage will be leveraged except for archiving in the data center. Currently disk makes sense for object storage as the final tier before archiving, but I see this as short-lived. Performance in handling vast amounts of data is more important.

Blocks & Files: Assuming flash storage is used will the workload be a mixed read/write one? If so, how much endurance should the flash have? (AVs could have a 15+ year working life.)

Renaud: Yes, mixed read/write. There is a spec for this, whose name I haven’t been able to find right away, that dictates performance, operating conditions, and endurance.

[Blocks & Files: We understand this is an AEC-Q100 document, where AEC is the Automotive Electronics Council. It publishes Q100 and Q200 series documents relating to automotive components.]

Bielby: This value is highly dependent upon the application space where the flash memory is used, as well as other factors including the efficiency of the flash file system. In the extreme case, a black box can, depending upon the architecture, drive endurance requirements reaching double-digit petabytes, whereas for applications with more modest workloads, 30 to 400 terabytes written (TBW) represents a reasonable endurance requirement over the lifetime of an autonomous vehicle.
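
[Blocks & Files: To put these endurance figures in context, a lifetime TBW rating can be converted into an average daily write budget and a drive-writes-per-day (DWPD) value. The sketch below assumes the 15-year working life from our question and, purely for illustration, a 1TB device like the BGA SSD mentioned above; neither the pairing nor the function names come from Micron.]

```python
# Convert a lifetime endurance rating into per-day figures.
# Assumptions (ours): 15-year life, 365-day years, 1 TB device capacity.

def daily_writes_gb(tbw, years=15):
    """Average GB that can be written per day over the vehicle's life."""
    return tbw * 1000 / (years * 365)

def dwpd(tbw, capacity_tb=1.0, years=15):
    """Drive writes per day implied by a TBW rating."""
    return tbw / (capacity_tb * years * 365)

for tbw in (30, 400, 10_000):  # 10,000 TBW = 10 PB, the low end of "double-digit petabytes"
    print(f"{tbw} TBW: {daily_writes_gb(tbw):.1f} GB/day, {dwpd(tbw):.3f} DWPD")
```

On these assumptions the modest workloads come to roughly 5GB to 73GB written per day, while a 10PB black box requirement would mean rewriting a 1TB device almost twice a day for 15 years.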

Fortenberry: Everything in the vehicle should be automotive-grade and allow for fairly high endurance. It is likely we will see design choices being made based on usage (personal vehicle vs Fleet vehicles), but collectively the value is in data.

Blocks & Files: Will the flash have to be ruggedised to cope with the AV environment with its vibrations and temperature/moisture variations?

Renaud: Yes, same spec.

Bielby: Basically yes. All Micron memory devices that are designed into an automobile are designed, tested, and qualified to operate within the harsh, demanding environment of the automobile. Micron has over 28 years of experience and market leadership supporting and delivering the richest portfolio of memory solutions to the automotive market, supporting the following:

  • ISO 9001/IATF 16949 Quality Management Systems
  • AEC-Q100 qualification methodology
  • Zero defect target approach 
  • -40 to 105°C: NAND, SSD, UFS, and e.MMC, Multi-Chip

In addition to providing products that are compliant with industry requirements, Micron also has multiple labs worldwide to help automotive customers design their applications successfully and minimise the risk of getting into production.

Fortenberry: Components will probably need to comply with automotive-grade standards (AEC-Q100). Certainly, some companies will use non-AEC grade parts, but I think in the long run it will be cheaper for manufacturers to avoid service events.

Reverse Akamai

Fortenberry offered this thought: “I would imagine a substantial storage opportunity in what amounts to a reverse content delivery network for ADAS (Advanced Driver Assistance System) data.”

A content delivery network (CDN) provides video and other content produced by relatively few producers to thousands, if not tens of thousands, of end systems that consume the content. A reverse CDN would provide content upload facilities from tens of thousands of endpoints to a relatively small number of manufacturers and fleet operators.

The reverse CDN will provide an effective pipeline for vast amounts of incoming data that is:

  • Policy compliant. Supports international policy, contract policies, QoS, etc.
  • Transfer optimised. Performs routines to reduce and optimise all data.
  • Cached. Focuses on last-mile optimisation to the vehicle by enabling gateway caching.
  • Auditable. Reproducibility will become increasingly important as ADAS matures.
  • Encrypted. All ADAS data should be considered sensitive.
  • Operationally manageable.
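
To make the list above more concrete, here is a minimal sketch of how a policy-managed ingest gateway with caching might be structured. It is our illustration rather than anything Fortenberry specified: the class, its methods and the region-based policy check are all hypothetical, and encryption is left out of the sketch to keep it self-contained.

```python
import hashlib
import heapq
import zlib
from dataclasses import dataclass, field

@dataclass(order=True)
class Upload:
    priority: int   # QoS: lower number means more urgent
    seq: int        # arrival order, keeps FIFO within a priority
    vehicle_id: str = field(compare=False)
    payload: bytes = field(compare=False)  # stored compressed

class ReverseCdnGateway:
    """Hypothetical edge gateway that caches vehicle uploads for the OEM."""

    def __init__(self, allowed_regions):
        self.allowed_regions = set(allowed_regions)
        self._cache = []     # priority queue of Upload items
        self._seq = 0
        self.audit_log = []  # (event, vehicle_id, sha256 of the raw payload)

    def ingest(self, vehicle_id, region, payload, priority):
        # Policy compliance: refuse data from regions the policy excludes.
        if region not in self.allowed_regions:
            self.audit_log.append(("rejected", vehicle_id, None))
            return False
        digest = hashlib.sha256(payload).hexdigest()  # auditable record
        compressed = zlib.compress(payload)           # reduce transfer size
        heapq.heappush(self._cache, Upload(priority, self._seq, vehicle_id, compressed))
        self._seq += 1
        self.audit_log.append(("cached", vehicle_id, digest))
        return True

    def drain(self, max_items):
        """Forward the most urgent cached uploads to the data centre."""
        out = []
        while self._cache and len(out) < max_items:
            item = heapq.heappop(self._cache)
            out.append((item.vehicle_id, zlib.decompress(item.payload)))
        return out

gw = ReverseCdnGateway(allowed_regions={"EU"})
gw.ingest("car-1", "EU", b"routine telemetry" * 100, priority=5)
gw.ingest("car-2", "EU", b"algorithm disparity clip", priority=0)
gw.ingest("car-3", "XX", b"blocked by policy", priority=0)
print([vid for vid, _ in gw.drain(max_items=2)])  # prints ['car-2', 'car-1']
```

The priority queue stands in for QoS handling and the hash log for auditability; a real service would also need encryption, persistence and operational tooling.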


The amount of in-car storage for AVs/NAVs ranges from 50GB to 500GB, according to Christian Renaud. Bielby’s figures suggest around 460GB, made up of 160GB of map data, 300GB for sensor-generated data and tens of kilobytes to cover cellular coverage gaps.

Fortenberry doesn’t yet feel there is enough data to reliably estimate a realistic number. However, a 500GB on-board storage capacity does not sound particularly onerous.

All agree it will be based on automotive-grade (AEC-Q100 spec) NAND, with disk ruled out completely. Some of this storage will be an aircraft-equivalent black box to be examined in case of accidents. It will need to be stored in a way that is resistant to high impact pressures, explosions, fire, flooding and extreme cold.

Lastly, Fortenberry is emphatic that a reverse content-delivery gateway service will be needed for AV/NAV data upload to cloud-based AV/NAV manufacturers and fleet operators. This is a good shout.