Phi X 174
The phi X 174 (ΦX174) bacteriophage is a single-stranded DNA virus that infects Escherichia coli, and its genome was the first DNA-based genome to be sequenced. This work was completed by Fred Sanger and his team in 1977. In 1962, Walter Fiers and Robert Sinsheimer had already demonstrated the physical, covalently closed circularity of ΦX174 DNA. Nobel Prize winner Arthur Kornberg used ΦX174 as a model to first prove that DNA synthesized in a test tube by purified enzymes could produce all the features of a natural virus, ushering in the age of synthetic biology. From 1972 to 1974, Jerard Hurwitz, Sue Wickner, and Reed Wickner, with collaborators, identified the genes required to produce the enzymes that catalyze conversion of the single-stranded form of the virus to the double-stranded replicative form. In 2003, Craig Venter's group reported that the genome of ΦX174 was the first to be completely assembled in vitro from synthesized oligonucleotides. The ΦX174 virus particle has also been successfully assembled in vitro. In 2012, it was shown that its highly overlapping genome can be fully decompressed and still remain functional.
Genome
This bacteriophage has a positive-sense, circular, single-stranded DNA genome of 5,386 nucleotides. The genome GC-content is 44%, and 95% of nucleotides belong to coding genes. Because of the balanced base composition of the genome, it is used as the control DNA for Illumina sequencers.
Genes
ΦX174 encodes 11 genes, named as consecutive letters of the alphabet in the order they were discovered, with the exception of A*, which arises from an alternative start codon within the large A gene. Only genes A* and K are thought to be non-essential, although there is some doubt about A* because its start codon could be changed to ATT but not to any other sequence. It is now known that ATT is still capable of initiating protein production within E. coli, and therefore this gene may in fact be essential.
Phage ΦX174 has been used to try to establish the absence of undiscovered genetic information through a "proof by synthesis" approach.
Transcriptome
In 2020, the transcriptome of ΦX174 was generated. Notable features of the ΦX174 transcriptome are a series of up to four relatively weak promoters in series with up to four Rho-independent terminators and one Rho-dependent terminator.
Proteins
ΦX174 encodes 11 proteins.
Protein | Copies | Function |
A | -- | Nicks RF DNA to initiate rolling circle replication; ligates ends of linear phage DNA to form single-stranded circular DNA |
A* | -- | Inhibits host cell DNA replication; blocks superinfecting phage; not essential |
B | 60 in procapsid | Internal scaffolding protein involved in procapsid assembly |
C | -- | DNA packaging |
D | 240 in procapsid | External scaffolding protein involved in procapsid assembly |
E | -- | Host cell lysis |
F | 60 in virion | Major capsid protein |
G | 60 in virion | Major spike protein |
H | 12 in virion | DNA pilot protein |
J | 60 in virion | Binds to new single-stranded phage DNA; accompanies phage DNA into procapsid |
K | -- | Optimizes burst size; not essential |
Proteome
Identification of all ΦX174 proteins using mass spectrometry has recently been reported.
Infection Cycle
Infection begins when G protein binds to lipopolysaccharides on the bacterial host cell surface. H protein pilots the viral genome through the bacterial membrane of E. coli, most likely via a predicted N-terminal transmembrane helix. However, it has become apparent that H protein is multifunctional. It is the only ΦX174 capsid protein that lacks a crystal structure, for two reasons: its low aromatic content and high glycine content make the protein very flexible, and individual hydrogen atoms are difficult to detect in protein crystallography. In addition, H protein induces lysis of the bacterial host at high concentrations, as the predicted N-terminal transmembrane helix easily pokes holes through the bacterial wall. Bioinformatic analysis predicts four coiled-coil domains in this protein, which have significant homology to known transcription factors. It was also determined that de novo H protein is required for optimal synthesis of other viral proteins. Mutations in H protein that prevent its incorporation into the virion can be overcome when excess amounts of protein B, the internal scaffolding protein, are supplied.
The DNA is ejected through a hydrophilic channel at the 5-fold vertex. H protein is thought to reside in this area, but experimental evidence has not verified its exact location. Once inside the host bacterium, replication of the ssDNA genome proceeds via a negative-sense DNA intermediate. The phage genome supercoils, and the secondary structure formed by this supercoiling attracts a primosome protein complex, which translocates once around the genome and synthesizes a complementary negative-sense strand from the original positive-sense genome. Single-stranded genomes for packaging into new virions are then produced from this double-stranded form by a rolling circle mechanism.
In rolling circle replication, the double-stranded supercoiled genome is nicked on the positive strand by the virus-encoded A protein, which also attracts a bacterial DNA polymerase to the site of cleavage. The DNA polymerase uses the negative strand as a template to make positive-sense DNA. As it translocates around the genome, it displaces the outer strand of already-synthesised DNA, which is immediately coated by single-stranded DNA-binding (SSB) proteins. The A protein cleaves off a complete genome every time it recognises the origin sequence.
Because the D gene transcript is the most abundant, D is the most abundant protein in the viral procapsid. Similarly, gene transcripts for F, J, and G are more abundant than for H, as the stoichiometry of these structural proteins is 5:5:5:1. Primosomes are protein complexes that load the enzyme helicase onto the template and provide the RNA primers needed for DNA synthesis.
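The 5:5:5:1 stoichiometry follows directly from the virion copy numbers listed in the protein table (60 each of F, J, and G, and 12 of H). As a minimal sketch, the reduction can be checked by dividing each copy number by their greatest common divisor; the names `VIRION_COPIES` and `simplest_ratio` are illustrative and not taken from any published tool.

```python
from functools import reduce
from math import gcd

# Copies per virion, taken from the protein table in this article.
VIRION_COPIES = {"F": 60, "J": 60, "G": 60, "H": 12}


def simplest_ratio(copies):
    """Divide every copy number by the greatest common divisor of all of them."""
    divisor = reduce(gcd, copies.values())
    return {protein: count // divisor for protein, count in copies.items()}


ratio = simplest_ratio(VIRION_COPIES)
print(":".join(str(ratio[p]) for p in ("F", "J", "G", "H")))  # prints 5:5:5:1
```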
Uses
Evolution
ΦX174 has been used as a model organism in many evolution experiments.
Biotechnology
ΦX174 is regularly used as a positive control in DNA sequencing because of its small genome relative to other organisms and its relatively balanced nucleotide content: about 23% G, 22% C, 24% A, and 31% T, i.e., 45% G+C and 55% A+T (see accession NC_001422.1 for its 5,386-nucleotide sequence). Illumina's sequencing instruments use ΦX174 as a positive control, and a single Illumina sequencing run can cover the ΦX174 genome several million times over, making this very likely the most heavily sequenced genome in history.
ΦX174 is also used to test the resistance of personal protective equipment to bloodborne viruses.
ΦX174 has also been modified to enable peptide display from the viral capsid G protein.
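The sequencing-control figures quoted in this section can be sanity-checked with a few lines of arithmetic. The sketch below uses the approximate base percentages and the 5,386-nucleotide genome length given in the text. The run yield of 15 billion bases, all assumed to come from ΦX174, is a hypothetical value chosen only to illustrate the calculation and is not the specification of any particular instrument.

```python
# Approximate base composition quoted in the text (percent of the genome).
COMPOSITION = {"G": 23, "C": 22, "A": 24, "T": 31}
GENOME_LENGTH = 5386  # nucleotides, as given in the text


def gc_percent(composition):
    """Return the combined G+C percentage."""
    return composition["G"] + composition["C"]


def fold_coverage(total_bases_sequenced, genome_length=GENOME_LENGTH):
    """Average number of times each genome position is read."""
    return total_bases_sequenced / genome_length


# Hypothetical run: 15 billion bases, all from the ΦX174 control (assumption).
HYPOTHETICAL_RUN_BASES = 15_000_000_000

print(f"G+C: {gc_percent(COMPOSITION)}%")
print(f"A+T: {100 - gc_percent(COMPOSITION)}%")
print(f"Coverage: {fold_coverage(HYPOTHETICAL_RUN_BASES):,.0f}x")
```

Under that assumed yield the coverage comes out in the millions, consistent with the "several million times over" statement. A run in which ΦX174 is only a small spike-in fraction would give proportionally lower coverage.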