
Abstract

DNA storage has shown potential to transcend current silicon-based data storage technologies in storage density, longevity and energy consumption [1,2,3]. However, writing large-scale data directly into DNA sequences by de novo synthesis remains uneconomical in time and cost [4]. We present an alternative, parallel strategy that enables the writing of arbitrary data on DNA using premade nucleic acids. Through self-assembly guided enzymatic methylation, epigenetic modifications, as information bits, can be introduced precisely onto universal DNA templates to enact molecular movable-type printing. By programming with a finite set of 700 DNA movable types and five templates, we achieved the synthesis-free writing of approximately 275,000 bits on an automated platform with 350 bits written per reaction. The data encoded in complex epigenetic patterns were retrieved high-throughput by nanopore sequencing, and algorithms were developed to finely resolve 240 modification patterns per sequencing reaction. With the epigenetic information bits framework, distributed and bespoke DNA storage was implemented by 60 volunteers lacking professional biolab experience. Our framework presents a new modality of DNA data storage that is parallel, programmable, stable and scalable. Such an unconventional modality opens up avenues towards practical data storage and dual-mode data functions in biomolecular systems.
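For a sense of scale, the abstract's figures can be turned into a quick back-of-envelope estimate. The sketch below assumes every reaction writes the full 350 bits, which the abstract does not state outright, and the function name is made up for illustration.

```python
import math

# Figures quoted in the abstract.
TOTAL_BITS = 275_000      # "approximately 275,000 bits"
BITS_PER_REACTION = 350   # "350 bits written per reaction"


def reactions_needed(total_bits: int, bits_per_reaction: int) -> int:
    """Estimate how many reactions are needed, assuming each one is full.

    This is an illustrative assumption; the paper may have used
    partially filled reactions or redundancy.
    """
    return math.ceil(total_bits / bits_per_reaction)


n_reactions = reactions_needed(TOTAL_BITS, BITS_PER_REACTION)
size_in_bytes = TOTAL_BITS / 8

print(f"~{n_reactions} reactions, ~{size_in_bytes / 1000:.1f} kB of data")
```

Under that assumption the demonstration comes to roughly 786 reactions and about 34 kB of data, so it is a proof of concept and nowhere near zettabyte scale.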

How can you write data to DNA without changing the base sequence?

A new method lets anyone with a kit write data to DNA with just one enzyme.
Zettabytes (that’s 10^21 bytes) of data are currently generated every year. All of those cat videos have to be stored somewhere, and DNA is a great storage medium; it has amazing data density and is stable over millennia.
To date, people have encoded information into DNA the same way nature has, by linking the four nucleotide bases comprising DNA (A, T, C, and G) into a particular genetic sequence. Making these sequences is time-consuming and expensive, though, and the longer your sequence, the higher the chance that errors will creep in.
But DNA has an added layer of information encoded on top of the nucleotide sequence, known as epigenetics. This layer consists of chemical modifications to the nucleotides, specifically methylation of a C when it comes before a G. In cells, these modifications function kind of like stage directions; they can tell the cell when to use a particular DNA sequence without altering the “text” of the sequence itself. A new paper in Nature describes using epigenetics to store information in DNA without needing to synthesize new DNA sequences every time.
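The core idea, bits stored as marks on a fixed sequence, can be shown with a toy model. This is only an illustrative sketch and not the paper's actual encoding scheme: the template sequence and all function names are invented, and each CpG site is simply treated as one bit (methylated = 1, unmethylated = 0).

```python
def cpg_sites(template: str) -> list[int]:
    """Return the index of every C that is immediately followed by a G."""
    return [i for i in range(len(template) - 1) if template[i:i + 2] == "CG"]


def write_bits(template: str, bits: list[int]) -> set[int]:
    """Return the set of CpG positions to methylate for the given bits.

    The template string itself is never modified; only the set of
    marked positions carries the data.
    """
    sites = cpg_sites(template)
    if len(bits) > len(sites):
        raise ValueError("template has too few CpG sites for this message")
    return {site for site, bit in zip(sites, bits) if bit == 1}


def read_bits(template: str, methylated: set[int], n_bits: int) -> list[int]:
    """Recover the bits by checking which CpG sites carry a mark."""
    return [1 if site in methylated else 0 for site in cpg_sites(template)[:n_bits]]


# Hypothetical 16-base template with four CpG sites (made up for this example).
TEMPLATE = "ACGTTCGAACGGTCGA"

message = [1, 0, 1, 1]
marks = write_bits(TEMPLATE, message)
recovered = read_bits(TEMPLATE, marks, len(message))
```

The same template can be reused for any four-bit message, because writing only changes which sites are marked. That reuse is what removes the need to synthesize a new sequence for each piece of data.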