In September 1957, Francis Crick proposed the ‘central dogma of molecular biology’. He suggested that information always flows in living beings from DNA – a stable, inheritable molecule – through a relatively unstable intermediate, the messenger RNA, and then onto proteins, which are the workhorses of all life functions. And everywhere scientists looked, they realised all organisms followed this dogma – until 1970.
That year, Howard Temin and David Baltimore found something odd in one group of viruses.
Viruses, like other living beings, come in all shapes and sizes, and are classified into different families. However, viruses are not classified the same way as other life forms. This is because they can be both alive and not alive – a feature that demands that taxonomists also consider other attributes that make viruses different.
One such attribute is their genetic material.
Viruses are the only known life-forms that can use RNA as their genetic material. There are different kinds of RNA-containing viruses. To propagate itself, each virus makes a copy of the information in its genetic material to pass on to its ‘daughter’ viruses. Some viruses contain the machinery to make copies of their RNA, and they don’t have a DNA component in their life cycle whatsoever. The influenza, hepatitis C and SARS-CoV-2 viruses are in this category. These viruses deviate from the central dogma only slightly: there is no DNA step, and information flows directly from the RNA to proteins.
But what Temin and Baltimore discovered in 1970 was a proper exception to the central dogma. They found viruses that could make a DNA copy of their RNA using an enzyme called reverse transcriptase, in a process called reverse transcription. Such a virus then integrates this DNA into the DNA of its host, thus becoming a permanent part of the host’s genome. These viruses – called retroviruses – violate the central dogma because information first flows from RNA to DNA, and only then from the DNA to RNA to proteins.
Viruses like HIV and the Rous sarcoma virus belong to this family.
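The direction of information flow in reverse transcription can be illustrated with a toy sketch. It treats sequences as plain strings of letters and uses a made-up nine-letter RNA; the function names are hypothetical and nothing here models the actual enzyme chemistry.

```python
# Toy model only: sequences are strings, written 5'->3'.
# Base-pairing rules for copying RNA into DNA and DNA into RNA.
RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def reverse_transcribe(rna: str) -> str:
    """RNA -> complementary DNA (the retroviral step), written 5'->3'."""
    return "".join(RNA_TO_DNA[base] for base in reversed(rna))


def transcribe(template_dna: str) -> str:
    """DNA template -> RNA (the usual central-dogma step), written 5'->3'."""
    return "".join(DNA_TO_RNA[base] for base in reversed(template_dna))


viral_rna = "AUGGCUUAA"  # hypothetical sequence, for illustration only
cdna = reverse_transcribe(viral_rna)

# Information goes RNA -> DNA, and transcribing that DNA recovers the RNA.
print(cdna)
print(transcribe(cdna) == viral_rna)
```

The round trip in the last two lines is the point of the sketch: the DNA copy carries the same information as the original RNA, which is what lets a retrovirus park its genome in the host's DNA and have it read out again later.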
In all, there are seven families, or groups, of viruses, and each group has its own special adaptations, refined over years of evolution, often through several hosts. It’s also unusual – maybe even impossible – for members of one group of viruses to show fundamental properties associated with another.
This is why a preprint paper uploaded to the bioRxiv preprint server on December 13 caught the scientific community by surprise. The paper claimed, outlandishly, that parts of the SARS-CoV-2 viral RNA could be reverse transcribed into DNA and integrated into the human genome.
The paper’s authors said they were attempting to explain why some COVID-19 patients showed signs of the virus in RT-PCR tests even weeks after recovering from the disease. Their explanation is based on a group of genetic entities called long interspersed nuclear elements (LINEs). The human genome has multiple LINEs – effectively, parts of our DNA responsible for reverse-transcribing human RNA into DNA and integrating it at a different location in the human genome. The paper claims these LINEs do the same thing with parts of the novel coronavirus’s RNA as well.
This process differs from what retroviruses like HIV do routinely: they use their own proteins to convert their RNA into DNA and integrate it into the host’s DNA.
The authors’ claims are based largely on one primary observation and one experiment. The observation relies on a powerful tool called RNA-seq, which provides the sequences of all the RNA molecules produced by a cell. So an RNA-seq output is a sort of measure of all the genes that are active in the target cell. The authors reported that in cells infected with SARS-CoV-2, there were some viral RNA sequences interspersed between RNA sequences of human genes.
This data may seem convincing at first glance, but the devil is in the details. The authors appear to have overlooked the fact that in the process of preparing a sample for RNA-seq, the scientist must herself artificially reverse transcribe RNA into DNA – because only DNA can be sequenced (for further study). So the chimeric viral and human RNA could just be an artefact of the RNA-seq process, since reverse transcriptases are known to mix and match target sequences.
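A toy sketch can show why a chimeric read is not, by itself, proof of a chimeric molecule in the cell. It assumes two made-up RNA sequences, models the artefact as a copying enzyme that starts on one template and finishes on another, and uses a naive shared-substring (k-mer) check to label reads. All names and sequences are hypothetical, and real read classification relies on proper alignment tools.

```python
def kmers(seq: str, k: int) -> set:
    """All substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def copy_with_template_switch(first: str, second: str, switch_at: int) -> str:
    """Toy artefact: copy `switch_at` letters from one RNA, then jump to another."""
    return first[:switch_at] + second[switch_at:]


def classify_read(read: str, human_ref: str, viral_ref: str, k: int = 6) -> str:
    """Label a read by which reference(s) it shares at least one k-mer with."""
    read_kmers = kmers(read, k)
    hits_human = bool(read_kmers & kmers(human_ref, k))
    hits_viral = bool(read_kmers & kmers(viral_ref, k))
    if hits_human and hits_viral:
        return "chimeric"
    if hits_human:
        return "human"
    if hits_viral:
        return "viral"
    return "unknown"


# Hypothetical sequences; the two RNAs are separate molecules in the "cell".
human_rna = "GGCAUCGAUCGGAUCCAUGG"
viral_rna = "AUUAAAGGUUUAUACCUUCC"

# The chimera is created during copying, not inside the cell.
artefact = copy_with_template_switch(human_rna, viral_rna, switch_at=10)

for read in (human_rna, viral_rna, artefact):
    print(classify_read(read, human_rna, viral_rna))
```

In this sketch the third read is labelled chimeric even though no human-viral hybrid existed before the copying step, which is the scenario the RNA-seq observation would need to rule out.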
To prove their claims in an experimental setup, the authors genetically altered cells to make proteins that can perform reverse transcription. Then they infected these cells with the SARS-CoV-2 virus, and reported that the SARS-CoV-2 viral RNA was converted into DNA.
They performed the experiment by forcing cells to make unnatural quantities of two kinds of proteins: HIV reverse transcriptase (RT) and the proteins encoded by LINEs. The problem with the LINE proteins is that they are rarely produced naturally in the same quantities as those in the experiment, raising doubts about whether the results reflect what is realistically possible. And the problem with HIV RT is that there is no chance it is naturally present in a cell infected with SARS-CoV-2, because the two viruses do not infect the same cell types. So the experimental evidence has big gaps and does not justify what the authors claim.
Instead, the authors could have provided data from an older technique: the Southern blot. In 1975, the English molecular biologist Edwin Southern reported a very simple way to check if a particular fragment of DNA is present in a given sample. A DNA molecule has two strands (the ‘double helix’), and the string of nucleobases on one strand can only pair with a specific string of nucleobases on the other. So Southern figured that by studying one strand, researchers could know what the other strand looked like.
The way to do this – for example – is to synthesise a single strand of DNA corresponding to part of the SARS-CoV-2 sequence, mix it with DNA from the human cells, and check for signs of binding. If the viral sequence has been integrated into the human DNA, the synthetic strand will find its matching partner and bind to it.
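The pairing logic behind this check can be sketched in a few lines. This is a deliberately simplified model that assumes the probe binds only to a perfect complementary match and skips the laboratory steps of a real Southern blot (cutting the DNA, separating fragments on a gel, transferring them to a membrane and detecting a labelled probe). The sequences and function names are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(dna: str) -> str:
    """The strand that pairs with `dna`, written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(dna))


def probe_binds(probe: str, sample_strand: str) -> bool:
    """True if the sample contains a stretch the probe can pair with exactly."""
    return reverse_complement(probe) in sample_strand


# Hypothetical sequences, for illustration only.
viral_fragment = "ATTAAAGGTT"
host_dna = "GGATCCCCATGG"
host_dna_with_insert = "GGATCC" + viral_fragment + "CCATGG"

# The probe is the strand complementary to the viral fragment.
probe = reverse_complement(viral_fragment)

print(probe_binds(probe, host_dna))              # no integrated fragment
print(probe_binds(probe, host_dna_with_insert))  # integrated fragment present
```

Unlike the RNA-seq result, a check of this kind looks at the cell's DNA directly, so it does not depend on a reverse transcription step carried out in the lab.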
The preprint paper’s lack of convincing evidence has opened it up to criticism from scientists for its erroneous assertions and unproven claims. At the same time, David Baltimore, who won a Nobel Prize for helping discover the reverse transcriptase enzyme, told Science magazine the study was “impressive”, and other news outlets have amplified his comments.
Such words have given the study a higher profile than it deserves, in the middle of a pandemic already scarred by misinformation and pseudoscience. The manuscript’s bioRxiv page itself includes numerous comments from researchers around the world demanding that it be taken down.
To be clear, what the preprint’s authors have claimed is still within the realm of possibility, but their experiments and interpretations aren’t convincing. The claim is extraordinary: the first report of reverse transcription by a non-retrovirus. It would mean there’s a chance that your body keeps a record of all RNA viruses that ever infected it, and it would open up a whole new angle to immune memory. But extraordinary claims require extraordinary evidence – which the preprint paper doesn’t have. So for now, we wait for proof.
Arun Panchapakesan is a molecular biologist working in the HIV-AIDS laboratory at the Jawaharlal Nehru Centre for Advanced Scientific Research, Bengaluru.