A view of tubes each carrying a sample to be tested slotted inside a thermal cycler. Photo: kOchstudiO/Wikimedia Commons.
The new coronavirus SARS-CoV-2 is so named because of its similarity to the SARS virus, which caused an outbreak of severe acute respiratory syndrome (SARS) in 2002-2003. Specifically, the new virus’s genome is about a 79% match to that of the SARS virus. So using the SARS virus’s genome as a reference, scientists could use genetic sequencing to determine whether the virus causing the current outbreak is the earlier SARS virus or a new strain.
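A "match" figure of this kind is essentially a percent identity: the share of positions at which two aligned sequences carry the same nucleobase. The sketch below is only an illustration of that arithmetic. The function name `percent_identity` and the two ten-base fragments are invented for this example, and it assumes the sequences have already been aligned to equal length, which is the hard part that real alignment software handles.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of positions at which two pre-aligned, equal-length sequences agree."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(1 for a, b in zip(seq_a, seq_b) if a == b)
    return 100.0 * matches / len(seq_a)


# Toy fragments invented for illustration; these are NOT real viral sequences.
reference = "AUGGCUAGCU"
sample = "AUGGCAAGGU"

print(percent_identity(reference, sample))  # 8 of 10 positions agree -> 80.0
```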
Crucially, scientists in China were able to sequence the full genome of the virus only four days after the first case of infection was reported, paving the way for scientists around the world to design rapid molecular genetic tests for COVID-19.
Using a technology called high-throughput sequencing, scientists today can sequence many DNA fragments in parallel and then align them against a reference genome from a related organism to build a full genome sequence.
The genomes of most organisms are made of DNA, but some viruses – like the new coronavirus – have genomes of RNA. SARS-CoV-2’s RNA genome has about 30,000 nucleobases.
Tracts of nucleobases make up genes, and combinations of genes make up a genome. Genes carry the instructions for the virus to synthesise different proteins, including those that make the virus infectious.
DNA is usually double-stranded while RNA is usually single-stranded. Both DNA and RNA are made of four nucleobases; three of them – adenine, cytosine and guanine – are common to both. The fourth is thymine in DNA and uracil in RNA.
After docking on a human cell, the virus first releases its RNA inside the cell and uses the cell’s resources to synthesise an enzyme called RNA-dependent RNA polymerase (RdRp). RdRp replicates the virus’s genetic material inside the cell, and the copies are then used to produce the viral proteins. The newly reproduced genetic material and proteins then coalesce into new viral particles that ooze out from the host cell, ready to infect neighbouring cells. This way, the virus perpetuates itself within our cells at the expense of the human cellular machinery.
The molecular diagnosis of infectious diseases is one of the pillars of modern medicine. And one test that makes this possible is the real-time reverse transcription polymerase chain reaction (rRT-PCR) test.
The rRT-PCR test is used to detect the presence of SARS-CoV-2 in a sample. If SARS-CoV-2 is present in a sample, it means the person from whom the sample was obtained likely has COVID-19, which is the name of the disease caused by the new coronavirus.
First, a technician isolates the genetic material of the virus from a nasopharyngeal sample (obtained from a person by a swab of the upper respiratory tract). This RNA is then converted to complementary DNA, or cDNA, using an enzyme called reverse transcriptase.
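The conversion of RNA to cDNA follows the base-pairing rules: each base in the RNA template is matched with its partner in DNA, and the new strand runs in the opposite direction. Here is a minimal sketch of that rule, assuming a clean template containing only A, U, G and C. The function name `reverse_transcribe` and the six-base template are made up for illustration and do not model the enzyme itself.

```python
# Pairing from an RNA template base to the DNA base placed opposite it.
RNA_TO_DNA_COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}


def reverse_transcribe(rna: str) -> str:
    """Return the cDNA strand complementary to an RNA template, read 5' to 3'."""
    try:
        complement = [RNA_TO_DNA_COMPLEMENT[base] for base in rna.upper()]
    except KeyError as err:
        raise ValueError(f"not an RNA base: {err.args[0]}") from None
    # The complementary strand is antiparallel, so reverse it to read 5' to 3'.
    return "".join(reversed(complement))


print(reverse_transcribe("AUGGCU"))  # AGCCAT
```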
The diagnostic panel for COVID-19 comprises four target genes. Three genes are specific to the new coronavirus and one is a human gene, used as an internal control.
The reactions to look for them are set up in four different tubes and amplified inside a thermal cycler, which cycles the samples through the different temperatures the reaction requires. The progress of the reaction can be tracked in real time.
Two molecules attach to each cDNA strand: a primer and a probe. The primer is a short stretch of nucleobases that latches on to a location on the cDNA by forming hydrogen bonds with complementary bases there. The probe is also a short tract of nucleobases, but one that can attach itself to only one of the three coronavirus-specific genes.
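Amplification is what makes a tiny amount of viral material detectable. In an idealised reaction, each thermal cycle copies every target strand once, so the number of copies doubles per cycle. The sketch below works through that arithmetic. The function name `copies_after` and the `efficiency` parameter are assumptions made for this illustration; real reactions fall short of perfect doubling and eventually plateau, which this simple model ignores.

```python
def copies_after(cycles: int, starting_copies: int = 1, efficiency: float = 1.0) -> float:
    """Idealised number of target copies after a given number of PCR cycles.

    efficiency = 1.0 means every strand is copied in every cycle (perfect doubling).
    """
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must lie between 0 and 1")
    return starting_copies * (1.0 + efficiency) ** cycles


# Growth from a single starting copy under perfect doubling.
for n in (10, 20, 40):
    print(n, copies_after(n))
```

Under perfect doubling, ten cycles turn one copy into 1,024, and forty cycles into roughly a trillion.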
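Because binding depends on complementary bases, where a primer or probe can latch on is, to a first approximation, a string-matching question. The following sketch assumes perfect base-for-base pairing and ignores temperature, mismatches and everything else that matters in a real reaction. The sequences and the function names `reverse_complement` and `find_binding_site` are invented for illustration.

```python
DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(dna: str) -> str:
    """Complement each base and reverse the strand so it still reads 5' to 3'."""
    return "".join(DNA_COMPLEMENT[base] for base in reversed(dna.upper()))


def find_binding_site(template: str, oligo: str) -> int:
    """Index in the template (read 5' to 3') where the oligo pairs perfectly, or -1."""
    return template.upper().find(reverse_complement(oligo))


# Toy sequences invented for illustration; these are NOT real primers or genes.
template = "GGATCCATGCAAGT"
primer = "TTGCAT"

print(find_binding_site(template, primer))    # 6
print(find_binding_site(template, "AAAAAA"))  # -1, no complementary stretch
```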
After this step, the technician introduces an enzyme called a DNA polymerase (the ‘P’ of rRT-PCR). The polymerase enables the primer to elongate itself with more and more nucleobases provided in the solution, weaving a strand of nucleobases complementary to the cDNA.
When the elongating primer strikes the probe, the latter disintegrates and releases into the solution a molecule that fluoresces (or glows) – indicating the presence of a specific gene. Depending on the test, the technician may use up to four probes, corresponding to all four genes of interest in the COVID-19 diagnostic panel.
Of the three coronavirus-specific genes, one codes for the envelope protein and is common to all coronaviruses. The other two, ORF-1a and RdRp, are targeted in a way that is specific to SARS-CoV-2. So finding only the envelope protein gene would mean the virus is a coronavirus, and finding ORF-1a and RdRp as well would indicate that the coronavirus is SARS-CoV-2.
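Since the glow builds up as more probes are broken down, tracking the reaction in real time amounts to reading the fluorescence after every cycle and noting when it rises above the background. The sketch below shows that idea under some loud assumptions: the readings, the threshold value and the function name `first_cycle_above` are all invented, and actual instruments use their own calibrated methods to set a threshold.

```python
from typing import Optional, Sequence


def first_cycle_above(
    readings: Sequence[float], threshold: float, max_cycles: int = 40
) -> Optional[int]:
    """1-based cycle at which fluorescence first exceeds the threshold, else None."""
    for cycle, value in enumerate(readings[:max_cycles], start=1):
        if value > threshold:
            return cycle
    return None


# Hypothetical per-cycle fluorescence readings, in arbitrary units.
readings = [0.01, 0.01, 0.02, 0.05, 0.12, 0.30, 0.70]

print(first_cycle_above(readings, threshold=0.1))  # 5
print(first_cycle_above(readings, threshold=5.0))  # None, signal never appears
```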
If the rRT-PCR test is able to detect the presence of all three genes after 40 thermal cycles in the cycler, irrespective of the human control gene, the patient is presumptively said to have tested positive for the new coronavirus.
If only one or two genes are detected but not the third, the test is marked as ‘inconclusive’ and has to be repeated.
How does the technician make sure the reaction happened properly?
This is the purpose of including the human control gene. While the rRT-PCR reactions run for the envelope protein, ORF-1a and RdRp genes, a primer and probe are introduced for the human gene too. As the reaction progresses, the human gene must be amplified as well. If it is not, the whole test has to be repeated from the beginning – either from an old sample or with a new sample obtained from the swab.
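The reading rules described above can be summarised as a small decision procedure. What follows is one possible reading of them, not an official protocol: the names `Result`, `VIRAL_TARGETS` and `interpret` are invented here, and the "not detected" outcome (no viral gene found while the control works) is an assumption, since the rules above do not spell that case out.

```python
from enum import Enum
from typing import Mapping


class Result(Enum):
    POSITIVE = "presumptive positive"
    INCONCLUSIVE = "inconclusive, repeat the test"
    INVALID = "control failed, repeat the test"
    NEGATIVE = "not detected"  # assumption: this case is not stated in the text


VIRAL_TARGETS = ("E", "ORF-1a", "RdRp")


def interpret(detected: Mapping[str, bool], control_detected: bool) -> Result:
    """Classify one test from which viral genes and the human control were detected."""
    hits = sum(bool(detected.get(target, False)) for target in VIRAL_TARGETS)
    if hits == len(VIRAL_TARGETS):
        # All three viral genes found: positive irrespective of the control.
        return Result.POSITIVE
    if not control_detected:
        # The reaction cannot be trusted if the human gene was not amplified.
        return Result.INVALID
    if hits > 0:
        return Result.INCONCLUSIVE
    return Result.NEGATIVE


print(interpret({"E": True, "ORF-1a": True, "RdRp": True}, control_detected=False).value)
print(interpret({"E": True, "ORF-1a": False, "RdRp": False}, control_detected=True).value)
```

The first call prints "presumptive positive" and the second "inconclusive, repeat the test".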
Raees ul Hamid Paul is a senior research fellow at the department of medical microbiology and Imran Ibni Gani Rather is a PhD scholar at the department of clinical pharmacology – both at PGIMER, Chandigarh.