DNA storage and science: archiving unreality

DNA storage researchers have devised a neat way to record events inside cells down to a one-minute granularity, but are wildly off-beam when predicting this could become useful as a technology to store general archival data.

Identifiable segments of DNA can be sequentially added to a base strand and represent ones or zeros. These indicate a particular change in the environment inside a living cell and so function as in-vivo ‘ticker tape’ data recorders. But the rate of ingest is so slow – three eighths of a byte per hour – that suggestions it could scale up for general archive use are little short of ridiculous.

Researchers led by Associate Professor of Chemical and Biological Engineering Keith Tyo at the McCormick School of Engineering, Northwestern University, Illinois synthesized DNA using a method involving enzymes.

A Northwestern University announcement declares: “Existing methods to record intracellular molecular and digital data to DNA rely on multipart processes that add new data to existing sequences of DNA. To produce an accurate recording, researchers must stimulate and repress expression of specific proteins, which can take over 10 hours to complete.”

It is faster to add DNA nucleotide bases to the end of a single DNA strand. These bases — adenine (A), thymine (T), guanine (G) and cytosine (C) — can be grouped together in different combinations. The researchers’ Time-sensitive Untemplated Recording using TdT for Local Environmental Signals (TURTLES) method uses a DNA polymerase called TdT, standing for Terminal deoxynucleotidyl Transferase, to add the nucleotide bases. 

The composition of the added bases can be affected by the cell’s internal environment. According to a paper published in the Journal of the American Chemical Society, the researchers “show that TdT can encode various physiologically relevant signals such as Co2+ (cobalt), Ca2+ (calcium), and Zn2+ (zinc) ion concentrations and temperature changes in vitro” (in glass, meaning the test tube). So cobalt presence could cause more A and fewer G bases, and cobalt absence could cause the reverse, giving us a binary situation: more A bases equals 1 and more G bases equals 0.
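To make that binary reading concrete, here is a toy sketch in Python. It is our illustration of the simplified "more A equals 1, more G equals 0" idea only, not the researchers' analysis, which works statistically across many strands. The fixed window size and the example sequence are assumptions invented for the demonstration.

```python
# Toy decoder for the simplified rule in the text: A-rich -> 1, G-rich -> 0.
# The window size and the example strand are hypothetical, not from the paper.

def decode_bit(window: str) -> int:
    """Map one stretch of added bases to a bit by comparing A and G counts."""
    a_count = window.upper().count("A")
    g_count = window.upper().count("G")
    if a_count == g_count:
        raise ValueError(f"ambiguous window {window!r}: equal A and G counts")
    return 1 if a_count > g_count else 0


def decode_strand(strand: str, window_size: int = 8) -> list:
    """Split a strand into fixed-size windows and decode each window to a bit."""
    windows = [strand[i:i + window_size] for i in range(0, len(strand), window_size)]
    return [decode_bit(w) for w in windows]


# Three made-up windows: A-rich, G-rich, A-rich.
example = "AATACAAG" + "GGCGTGGA" + "ACAAATAA"
print(decode_strand(example))  # prints [1, 0, 1]
```

A window with equal A and G counts is rejected rather than guessed at, which seems the safer choice for a recorder whose whole value is in being trusted.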

They were able to record sequential changes down to the minute level. “Further, by considering the average rate of nucleotide incorporation, we show that the resulting ssDNA functions as a molecular ticker tape. With this method we accurately encode a temporal record of fluctuations in Co2+ concentration to within 1 min over a 60 min period.”
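The ticker-tape idea is that position along the strand stands in for elapsed time: divide a base's position by the average incorporation rate and you get a rough timestamp. The sketch below is a hedged, single-strand illustration of that reasoning. The `bases_per_minute` rate, the example strand and the split-scoring rule are all our assumptions, not values or methods taken from the paper.

```python
# Illustrative only: estimate when a strand turns from G-rich to A-rich,
# then convert that position into minutes using an assumed average rate.

def estimate_switch_minute(strand: str, bases_per_minute: float) -> float:
    """Return the estimated minute at which composition switches from G-rich to A-rich."""
    if bases_per_minute <= 0:
        raise ValueError("bases_per_minute must be positive")
    best_split, best_score = 0, -1
    for split in range(len(strand) + 1):
        # Score a candidate split by G bases before it plus A bases after it.
        score = strand[:split].count("G") + strand[split:].count("A")
        if score > best_score:
            best_split, best_score = split, score
    return best_split / bases_per_minute


# Hypothetical strand: 8 mostly-G bases followed by 8 mostly-A bases,
# written at an assumed average of 2 bases per minute.
print(estimate_switch_minute("GGTGGCGGAAATAACA", bases_per_minute=2.0))  # prints 4.0
```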

They were also able “to develop a two-polymerase system capable of recording a single-step change in the Ca2+ signal to within 1 min over a 60 min period”. This means they can look at the timing of changes inside the cell as well as the individual step changes.

The researchers say their method is more than 10x faster than other intracellular solutions, transferring information to DNA in seconds instead of hours. So far so very good.

Brain research

This discovery could, they say, change the way scientists study and record neurons inside the brain — by implanting such DNA digital molecular data recorders into brain cells and looking at changes within and between millions of brain cells. The university’s announcement says that “By placing recorders inside all the cells in the brain, scientists could map responses to stimuli with single-cell resolution across many (million) neurons.”

Scientific team member and paper co-author Alec Callisto said: “If you look at how current technology scales over time, it could be decades before we can even record an entire cockroach brain simultaneously with existing technologies — let alone the tens of billions of neurons in human brains. So that’s something we’d really like to accelerate.”

But they would have to extract the DNA and ‘read’ it — and we are talking about a massive operation overall.

Archiving application

The university says this about the TURTLES method: “It’s particularly good for long-term archival data applications such as storing closed-circuit security footage, which the team refers to as data that you ‘write once and read never,’ but need to have accessible in the event an incident occurs. With technology developed by engineers, hard drives and disk drives that hold years of beloved camera memories also could be replaced by bits of DNA.”

But almost certainly not by the TURTLES technology, because its speed is devastatingly slow. Talking to Genomics Research, Tyo said the researchers’ process wrote data at a rate of three bits an hour!

We read this four or five times to make certain it was that slow. The exact phrasing Tyo used is “up to 3/8 of a byte of information in one hour.”

Image: Tyo’s speed phrasing in the Genomics Research article.

Tyo also said that running millions of these processes in parallel would enable more data to be stored and written faster.

That’s true, but how much faster? Let’s try some simple math and assume we could accelerate the rate using five million parallel instances. That would be 15 million bits/hour, meaning 1.875 million bytes/hour, or 1.875MB/hour. That means, in turn, 31.25KB/minute and thus about 0.52KB/sec.

This is a paralysingly slow write rate compared to modern archiving technology. A Western Digital 18TB Purple disk drive transfers data at up to 512MB/sec, roughly 983,000 times faster. We would need to accelerate the single-instance TURTLES speed by a factor of about 4.9 trillion to achieve this HDD write speed. It seems an unrealistic, ludicrous idea.
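The back-of-the-envelope sums above can be reproduced with a short script. This is a minimal sketch that takes only the figures quoted in this article (3 bits/hour per instance, five million parallel instances, a 512MB/sec drive) and assumes decimal units, so 1MB is 1,000,000 bytes. The function names are our own.

```python
# Sanity check for the throughput arithmetic, using the article's figures.
# Assumes decimal units: 1 MB = 1,000,000 bytes.

BITS_PER_HOUR_PER_INSTANCE = 3.0  # "up to 3/8 of a byte of information in one hour"


def aggregate_bytes_per_second(instances: float,
                               bits_per_hour: float = BITS_PER_HOUR_PER_INSTANCE) -> float:
    """Combined write rate of many parallel recorders, in bytes per second."""
    return instances * bits_per_hour / 8 / 3600


def instances_needed(target_bytes_per_second: float,
                     bits_per_hour: float = BITS_PER_HOUR_PER_INSTANCE) -> float:
    """Number of parallel recorders needed to match a target write rate."""
    return target_bytes_per_second * 8 * 3600 / bits_per_hour


parallel_rate = aggregate_bytes_per_second(5_000_000)
hdd_rate = 512e6  # the article's 512MB/sec figure for the disk drive

print(f"5 million instances: {parallel_rate:,.1f} bytes/sec")
print(f"HDD advantage: {hdd_rate / parallel_rate:,.0f}x")
print(f"Instances needed to match the HDD: {instances_needed(hdd_rate):.3e}")
```

Passing a different target rate to `instances_needed` gives the equivalent figure for any other device.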

Comparing TURTLES to tape makes for even more depressing reading. LTO-8 tape transfers compressed data at 900MB/sec — faster than the disk drive. LTO-9 operates at up to 1GB/sec with compressed data. We didn’t bother working out the TURTLES parallelisation factor needed to achieve these speeds.

Using TURTLES DNA storage for general archival data would appear to be unrealistic. On the other hand, using it to record event streams inside cells is an exciting prospect indeed.

Paper details

Bhan N, Callisto A, Strutz J, et al. Recording temporal signals with minutes resolution using enzymatic DNA synthesis. J Am Chem Soc. 2021;143(40):16630-16640. doi:10.1021/jacs.1c07331.
