Vol. 12, Issue 10, 3114-3125, October 2001



*Department of Immunology and Infectious Diseases, Harvard School
of Public Health, Harvard University, Boston, Massachusetts 02115; and
The Institute for Genomic Research, Rockville, Maryland
20850
ABSTRACT
Serial analysis of gene expression (SAGE) was applied to the malarial parasite Plasmodium falciparum to characterize the comprehensive transcriptional profile of erythrocytic stages. A SAGE library of ~8335 tags representing 4866 different genes was generated from 3D7 strain parasites. Basic local alignment search tool analysis of high abundance SAGE tags revealed that a majority (88%) corresponded to 3D7 sequence, and despite the low complexity of the genome, 70% of these highly abundant tags matched unique loci. Characterization of these suggested the major metabolic pathways that are used by the organism under normal culture conditions. Furthermore, several tags expressed at high abundance (30% of tags matching to unique loci of the 3D7 genome) were derived from previously uncharacterized open reading frames, demonstrating the use of SAGE in genome annotation. The open platform "profiling" nature of SAGE also led to the important discovery of a novel transcriptional phenomenon in the malarial pathogen: a significant number of highly abundant tags that were derived from annotated genes (17%) corresponded to antisense transcripts. These SAGE data were validated by two independent means, strand-specific reverse transcription-polymerase chain reaction and Northern analysis, where antisense messages were detected in both asexual and sexual stages. This finding has implications for transcriptional regulation of Plasmodium gene expression.
INTRODUCTION
Malaria, an infectious disease caused by the protozoan parasite
Plasmodium falciparum, affects 300-500 million people
globally each year (WHO, 1997). Increasing drug resistance in the
parasite and insecticide resistance in the Anopheles vector
have exacerbated this substantial public health problem. Against this
backdrop, effective strategies to combat the disease require a
fundamental knowledge of the basic biology of Plasmodium to
develop new pharmacotherapeutics and vaccines that target the parasite.
Most studies of Plasmodium biology have been directed at
single genes thought to be important for pathogenesis. With the advent of genomic technologies, however, new approaches to combat the disease,
such as identifying entire repertoires of transcripts expressed under
different conditions, have now become available. Genomic approaches
were initiated with the sequencing of the P. falciparum (3D7
strain) genome, a collaborative project undertaken by the Malaria
Genome Consortium that is already close to completion (Butler, 1997;
O'Brien, 1997; Craig et al., 1999). Chromosomes 2 and 3 have been fully sequenced (Gardner et al., 1998; Bowman et al., 1999), whereas 80 to 90% of the estimated 6000 open
reading frames (ORFs) in the 3D7 genome are now available as raw
sequence data. The next challenge is to use this vast amount of data to study the functional relevance of various genes. For example, it is now
possible to identify genes that are transcribed in different stages of
the parasite's development and also genes that are induced or
repressed in response to various stimuli such as immune or drug
pressure. For this reason, whole genome expression analyses with the
use of high-density microarrays (Hayward et al., 2000) and
serial analysis of gene expression (SAGE) (Munasinghe et al., 2000) have been developed for P. falciparum. These
new approaches will complement each other to generate data for the
Plasmodium research community. Genome sequence will expedite
the microarray and SAGE analyses; conversely, open platform profiling
techniques such as SAGE will help the Malaria Genome Project with
annotation of previously uncharacterized ORFs and with novel gene discovery.
SAGE provides a sensitive and highly quantitative description of the
transcript profile of a given cell type (Velculescu et al., 1995, 1997). The SAGE technology samples short sequence tags (14 bases)
from mRNA transcripts in the population of interest. These tags contain
sufficient sequence information to identify, by basic local alignment
search tool (BLAST) analysis, the transcript from which each tag was
derived (Munasinghe et al., 2000). The frequency of each tag
in the SAGE library is an accurate estimate of the abundance of its
corresponding mRNA transcript. Numerous groups have used this technique
successfully and described the SAGE protocol in detail (Velculescu
et al., 1995, 1997; Madden et al., 1997; Polyak
et al., 1997; Matsumura et al., 1999; Virlon et al., 1999).
In this report, we show that SAGE can be used to study gene expression of the asexual stages of P. falciparum. Asexual parasites express many virulence factors and are the targets of antimalarials such as chloroquine; hence, an in-depth understanding of their transcriptional profiles will set the stage for future experiments addressing responses to immune or drug pressure.
SAGE was successfully applied to erythrocytic stage parasites (3D7 strain) of P. falciparum at baseline culturing conditions, and a SAGE library of ~8000 tags was generated. A majority of these corresponded to unique parasite genes, as demonstrated by BLAST analysis of a subset of tags. The SAGE data were validated by Northern and reverse transcription-polymerase chain reaction (RT-PCR) analysis of genes predicted to be highly expressed based on tag counts. BLAST analysis of highly abundant tags also provided insight into networks of major metabolic pathways that are used by the parasite under normal culture conditions. These pathways include mitochondrial, glucose, polyamine, and deoxy-D-xylulose 5-phosphate (DOXP) metabolism. Finally, SAGE also revealed the presence of antisense transcription in the malarial parasite, a phenomenon that has been previously missed by other methods of transcriptional analysis. These SAGE data were also validated by two independent methods, RT-PCR and Northern analysis; here antisense transcripts for genes expressed in asexual as well as sexual stage parasites were found. In summary, SAGE in Plasmodium has revealed many facets of the basic functioning of the parasite in culture, and it sets the stage for future comparisons of the transcriptional responses of P. falciparum to different stimuli.
MATERIALS AND METHODS
Parasite Culture and RNA Extraction
3D7 strain parasites were maintained under standard culturing
conditions (Trager and Jensen, 1976) with modifications as previously described (Munasinghe et al., 2000). Polyadenylated RNA was
harvested from cultures at 8% parasitemia (1% rings, 5%
trophozoites, and 2% schizonts) and used in the SAGE procedure as
previously described (Munasinghe et al., 2000).
Data Analysis
SAGE tags from 3D7 asexual stages were analyzed with the use of the SAGE software (Johns Hopkins University, Baltimore, MD, and Genzyme, Cambridge, MA), which extracts 14-bp tag counts from sequence files. To assign gene identity to each tag, the 3D7 experimental tag list was matched against a P. falciparum tag database. This database was created by extracting 14-bp tags from P. falciparum sequence deposited in GenBank (as of July 13, 2000), as well as from a compiled database of recently deposited 3D7 genome sequence (obtained from The Institute For Genomic Research (Rockville, MD), Sanger (Cambridge, UK), and Stanford (Palo Alto, CA), sequencing centers, and compiled at the University of Pennsylvania (Philadelphia, PA) as of July 26, 2000; kindly provided by Drs. Jessica Kissinger and David Roos). Because the P. falciparum genome is not fully annotated, all potential SAGE tags from both sense and antisense strands were extracted (i.e., tags were extracted from each database in the "genomic mode" rather than the "cDNA mode").
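The "genomic mode" extraction described above can be sketched as follows. This is a minimal illustration and not the SAGE software itself: the function names (`reverse_complement`, `extract_tags`) and the toy sequence are assumptions introduced here. The sketch collects every CATG site together with the 10 downstream bases on both strands of a sequence.

```python
# Sketch only: enumerate candidate 14-bp SAGE tags (CATG + 10 downstream bases)
# on both strands of a sequence. Names and the toy sequence are hypothetical.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def extract_tags(seq, anchor="CATG", tag_len=14):
    """Return (strand, position, tag) for every anchor site with a full-length tag."""
    seq = seq.upper()
    found = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        start = s.find(anchor)
        while start != -1:
            tag = s[start:start + tag_len]
            if len(tag) == tag_len:  # skip sites too close to the sequence end
                found.append((strand, start, tag))
            start = s.find(anchor, start + 1)
    return found


toy_sequence = "A" * 10 + "CATG" + "C" * 10
tags = extract_tags(toy_sequence)
```

Because CATG is its own reverse complement, each site can contribute one candidate tag per strand, which is why extracting from both strands roughly doubles the size of the tag database.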
The software output files are organized in such a way that matches to a
single locus, matches to multiple loci, and no matches to database
sequences can be readily determined. For genomic sequence that is
annotated, it is possible to assign gene identification to each tag in
the manner outlined above; however, most of the available P. falciparum genome sequence is not annotated. Therefore, the 187 most abundant tags (abundance level of >4 hits) were characterized by
manual BLASTx analysis; see flow chart in Figure
1. Here, for tags derived from
unannotated reads, a 500-1000-bp sequence surrounding the tag was
translated in all six reading frames and compared with the entire
National Center for Biotechnology Information protein database.
Fourteen bp tags that failed to match either database were analyzed
with the use of only the first 13 bp of the tag sequence in the manner
outlined above.
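The matching logic in this section (single locus, multiple loci, or no match, with a fallback to the first 13 bp) can be illustrated with a short sketch. The tag database here is a hypothetical dictionary mapping 14-bp tags to locus names; the locus names and the function `classify_tag` are assumptions made for illustration and do not reflect the actual output format of the SAGE software.

```python
# Sketch only: classify an experimental tag against a tag database,
# retrying with the first 13 bp when the full 14-bp tag has no match.
def classify_tag(tag, tag_db):
    """Return (category, loci, length_used) for one experimental tag."""
    for length in (14, 13):
        key = tag[:length]
        loci = set()
        for db_tag, names in tag_db.items():
            if db_tag[:length] == key:
                loci |= names
        if loci:
            category = "single locus" if len(loci) == 1 else "multiple loci"
            return category, loci, length
    return "no match", set(), None


# Hypothetical database: 14-bp tag -> set of loci containing it.
tag_db = {
    "CATG" + "A" * 10: {"locusA"},
    "CATG" + "C" * 10: {"locusB", "locusC"},
    "CATG" + "G" * 9 + "T": {"locusD"},
}

single = classify_tag("CATG" + "A" * 10, tag_db)
multiple = classify_tag("CATG" + "C" * 10, tag_db)
fallback = classify_tag("CATG" + "G" * 9 + "A", tag_db)
unmatched = classify_tag("CATG" + "T" * 10, tag_db)
```

Shortening the query to 13 bp tolerates an error in the final base of the tag, at the cost of a somewhat higher chance of matching more than one locus.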
RT-PCR
RT-PCR was performed with the use of the 3' RACE kit (Invitrogen, Carlsbad, CA) according to the manufacturer's protocol. First-strand cDNA synthesis was primed with oligo(dT)18 (0.5 µg of mRNA was used per reaction), whereas PCR was performed with the gene-specific primers listed below. All primers anneal within coding regions of the genes and result in 350-790-bp PCR products.
calmodulin (sense): 5' GTCCATCACCATCAATATCAGC 3'
calmodulin (antisense): 5' CTAAGGAGTTAGGAACGGTCATG 3'
msp-3 (sense): 5' TTTTTGTGTTCTGGAACGCCTCCTCC 3'
msp-3 (antisense): 5' GCTTCCGAAGATGCTGAAAAAGCTGC 3'
pfg27/25 (sense): 5' TCTTGTCGTTCATGATACGCTTC 3'
pfg27/25 (antisense): 5' GTACAAAAGGATAGTGCCAAGCCC 3'
rap-1 (sense): 5' CTTTGAAGAAATCTCTGATTTCAGC 3'
rap-1 (antisense): 5' GCTTTAGAAGGTGTCTGTTCATATC 3'
hsp86 (sense): 5' CCGAATTACTCCGATTCCAAACCTC 3'
hsp86 (antisense): 5' CTTCTTCCATTTTAGAATCGGTTGC 3'
PCR reactions were carried out according to the manufacturer's protocol (3' RACE kit; Invitrogen). Initial denaturation of the template occurred at 94°C for 3 min. Amplification was performed for five cycles at 94°C for 45 s, 52-54°C for 45 s, and 72°C for 45 s, followed by 21-26 cycles of identical amplification where the annealing temperature was increased to 55-57°C. Finally, extension of partial PCR products was completed at 72°C for 6 min.
Strand-specific RT-PCR used 1 µg of total RNA per reaction and was
performed with the express purpose of distinguishing sense RNA from
antisense RNA (Yu et al., 1995). RT-PCR was performed with
the use of the 3' RACE kit (Invitrogen); however, first-strand cDNA was
primed with gene-specific primers that hybridize to either sense or
antisense messages, rather than with an
oligo(dT)18 primer. The gene-specific primers are
identical to the primers listed above. One-tenth of the cDNA sample was
PCR amplified with the same set of gene-specific primers and
amplification conditions described above.
PCR products were electrophoresed on 1.2% agarose gels. All resultant PCR products were cloned into the pCRII vector with the use of the TA cloning kit (Invitrogen) and sequenced to confirm the identity of the amplified cDNA.
Northern Blots
Northern analysis was performed according to standard protocols.
Briefly, 1 µg of mRNA from 3D7 cultures was gel electrophoresed, blotted onto BA85 nitrocellulose membranes (Schleicher & Schuell, Keene, NH), and probed with gene-specific DNA probes. All probes (calmodulin, msp-3, rap-1, and pfg27/25) were
derived from the RT-PCR products described in the previous section. DNA
probes were radiolabeled with [α-32P]dATP
with the use of random hexanucleotides and the Klenow fragment of DNA
polymerase. Blots were visualized by autoradiography.
For strand-specific Northerns, 20 µg of total RNA were used per blot
as described above. Synthetic RNAs corresponding to a sense or
antisense fragment (~300 bp) of either calmodulin or msp-3
were used as probes for strand-specific Northern analysis. The
synthetic RNAs were generated in the following manner. Briefly, RT-PCR
products of calmodulin and msp-3 cDNAs (~300-bp fragment; see previous section) were cloned into pBluescript. The orientations of
calmodulin and msp-3 genes within pBluescript were
determined by sequencing. Each plasmid (pBluescript-msp-3
and pBluescript-calmodulin) was linearized with either BamHI
or XhoI. These plasmids were then used as DNA templates for
in vitro transcription reactions with the use of T3 or T7 RNA
polymerase to generate synthetic sense or antisense RNA fragments for
each gene. Plasmids digested with BamHI were incubated with
T7 RNA polymerase (following standard protocols) to produce antisense RNAs
for both genes. Similarly, plasmids digested with XhoI were
incubated with T3 polymerase to produce sense RNAs. Strand-specific RNA
probes were also obtained under the same conditions in the presence of
[α-32P]ATP.
Quantitative Northern analysis was carried out for calmodulin and msp-3 to determine whether the ratio of their transcripts was comparable with that determined by SAGE. Northern blots and gene-specific DNA probes were prepared as described above. Known amounts of synthetic 300-bp RNA fragments (in the sense orientation) from each gene were run alongside the mRNA sample as markers for quantification. Blots were exposed to x-ray film (Kodak X-OMAT) such that the intensity of the signal was within the linear range of the film. Signal intensities for each of the transcripts in the mRNA sample were converted to molar amounts by reference to those of the synthetic RNAs. Signal intensities were measured by scanning the x-ray film into Adobe Photoshop (Adobe Systems, Mountain View, CA) and by using NIH Image software to quantify bands by pixel density.
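The conversion of band intensities to molar amounts by reference to synthetic RNA standards amounts to reading values off a standard curve. The sketch below assumes a linear response (as required by working within the linear range of the film) and uses an ordinary least-squares fit. All numbers are hypothetical placeholders chosen for easy arithmetic, not measurements from this study, and the function names are introduced here for illustration.

```python
# Sketch only: standard-curve quantification of Northern band signals.
# Every numeric value below is a made-up placeholder, not data from the study.
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def signal_to_amount(signal, slope, intercept):
    """Invert the standard curve to convert a band signal into an amount."""
    return (signal - intercept) / slope


# Hypothetical standards: known amounts of synthetic RNA (fmol) and band signals.
standard_fmol = [1.0, 2.0, 4.0, 8.0]
standard_signal = [100.0, 200.0, 400.0, 800.0]
slope, intercept = fit_line(standard_fmol, standard_signal)

# Hypothetical sample band signals, assumed to lie within the standards' range.
amount_a = signal_to_amount(600.0, slope, intercept)
amount_b = signal_to_amount(200.0, slope, intercept)
molar_ratio = amount_a / amount_b
```

The resulting molar ratio of two transcripts can then be compared directly with the ratio of their SAGE tag counts.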
Plasmodium gallinaceum Gamete Preparation and RNA Isolation
P. gallinaceum parasites were propagated in White
Leghorn chickens by serial injection into wing veins. At parasitemia of 50-70%, blood was withdrawn by heart puncture. Gametogenesis was induced as described previously (Goonewardene et al., 1993),
with the inclusion of xanthurenic acid (Sigma, St. Louis, MO) at a final concentration of 50 µM in the exflagellation buffer. Gametes and zygotes were purified, also as described previously (Goonewardene et al., 1993), and 1 × 10^7 cells
were incubated at 25°C in Medium 199 (Invitrogen) and harvested for
analysis at 0, 24, and 48 h after isolation. Total RNA was isolated with the use of Tri reagent (Molecular Research Center, Cincinnati, OH) according to the manufacturer's protocol. Total RNA
obtained from 1 × 10^7 parasites was used
for each RT-PCR reaction. Strand-specific RT-PCR was performed as
described previously with the following primers: pgs28
(sense): 5' CATCTAGCATAGTCAGCACAAGGTTTATTTG 3' and pgs28
(antisense): 5' CAAACGAAGATTATTTAGTCAAAC 3'.
RESULTS
3D7 SAGE Tag Library from Asexual Blood Stage Parasites
A total of 8335 SAGE tags was analyzed from the asexual blood
stages of P. falciparum, 3D7 strain. A preliminary analysis showed that these 8335 tags corresponded to 4866 unique genes (Figure 2A). Of these, 1254 genes were present at
an abundance of two hits (or counts) or greater. The 537 tags expressed
at abundance levels ≥20 tags (percent abundance of 0.2) accounted for
6.4% of the total collection of tags but only 0.3% (15) of the total
number of unique genes. As expected, these abundance groups had the
highest percentage of matches to GenBank entries (Figure 2B), implying
that many highly expressed messages have been readily cloned and
studied. The lower abundance tags (abundance of <20 tags) accounted
for 93.6% of the total collection of tags, and represented a vast
majority of the unique genes expressed in the parasite. Moreover, these
tags gave many fewer matches to GenBank; hence, SAGE in P. falciparum will aid in the discovery of novel malarial genes.
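The abundance-class figures above follow from simple counting over the tag library. The sketch below shows one way to compute them, assuming the library is available as a mapping from tag sequence to count; the function name `abundance_summary` and the toy counts are hypothetical. The final line only re-checks the reported share of the 537 high-abundance tags among the 8335 total.

```python
# Sketch only: summarize an abundance class from a tag -> count mapping.
def abundance_summary(tag_counts, threshold):
    """Share of total tags and of distinct tags at or above a count threshold."""
    total_tags = sum(tag_counts.values())
    high = {tag: n for tag, n in tag_counts.items() if n >= threshold}
    tags_in_class = sum(high.values())
    return {
        "tags_in_class": tags_in_class,
        "distinct_in_class": len(high),
        "pct_of_tags": 100 * tags_in_class / total_tags,
        "pct_of_distinct": 100 * len(high) / len(tag_counts),
    }


# Toy library (hypothetical counts, not the 3D7 data).
toy_counts = {"tagA": 25, "tagB": 20, "tagC": 3, "tagD": 1, "tagE": 1}
toy_summary = abundance_summary(toy_counts, threshold=20)

# Consistency check on the figures reported in the text.
reported_pct_of_tags = round(100 * 537 / 8335, 1)
```

A small number of distinct tags can therefore account for a disproportionate share of the library, which is the pattern described for the highest abundance class.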
BLAST Analysis of SAGE Tags
To assess whether 14-bp tags could uniquely identify genes in the highly A-T-rich Plasmodium genome, these SAGE tags were searched against 3D7 genome sequence. We decided that for an accurate estimate of the "tag to gene" mapping in Plasmodium, all available sequence data, both cDNA and genomic, would provide the most complete picture. Sequencing of the P. falciparum genome is close to completion; however, much of the newly available P. falciparum sequence data has yet to be annotated. Therefore, the 187 most abundant SAGE tags were analyzed in a more rigorous manner by BLASTx analysis. A schematic of the BLAST analysis is shown in Figure 1. This analysis revealed that a majority of the SAGE tags (88%) corresponded to P. falciparum genome sequence. Most of the tags that match to single loci (70%) lie within known genes; hence, SAGE tags can be used to uniquely identify genes in Plasmodium. The other 30% of tags that match single sites correspond to unknown genes and hypothetical open reading frames. Thus, SAGE data reveal not only predicted ORFs that are expressed but also previously uncharacterized transcripts; hence, SAGE in Plasmodium has the capacity to assist in annotation of the genome.
Approximately 10% of the 187 most abundant SAGE tags did not match parasite sequence. We expect this number to decrease as the genome project nears completion. The percentage of SAGE tags that gave multiple matches within the P. falciparum genome was also calculated and found to be 18%. In the present study, the 35 tags that matched more than one locus were further investigated; of these tags, 21 (60%) matched two or three genes, whereas 14 (40%) matched more than three genes. The latter set of tag sequences was of lower complexity in general. Northern blot analysis should help resolve whether tags that match multiple genes indeed represent multiple transcripts.
Abundant Transcripts Expressed in P. falciparum Grown in Culture
The BLAST analysis described above enabled us to assign genes to
highly abundant SAGE tags; examples of these are listed in Table
1. This analysis provided a snapshot of
the major transcripts expressed by the parasite. A complete picture of
metabolic pathways used by P. falciparum growing in culture
will incorporate protein expression and stability; nevertheless, BLAST
analysis of abundant SAGE tags provides the first global description of
genes and hence, metabolic pathways that might be transcriptionally
regulated at the level of expression. The most abundant transcripts
were grouped into functional categories to reveal the transcriptional
profile of 3D7 parasites grown in culture (Figure
3). Many tags represented housekeeping
functions carried out by all prokaryotic and eukaryotic cells
(transcription, translation, chaperones, cytoskeleton), whereas some
functional classes were highly specific for the unique life cycle of
Plasmodium (membrane-associated proteins involved in
invasion, DOXP pathway).
Interestingly, many of the highly abundant messages (5.3%) appear to be transcribed from the 6-kb mitochondrial genome, and another 2.1% (thioredoxin, vacuolar ATPase subunit B, ATPase transporter, ubiquinol cytochrome c reductase-like protein) are probably involved in oxidative metabolism. Therefore, a significant proportion of abundant transcripts encodes proteins that play a role in oxidative metabolism.
Stage-specific transcripts are highly represented in the list of
abundant messages, reflecting the different developmental stages
present in the culture. For example, mRNAs encoding cell surface
proteins involved in merozoite invasion (Cowman et al., 2000) comprise 8% of the most abundant transcripts. These include merozoite surface proteins 3 and 4 (MSP-3 and -4), rhoptry-associated protein-1 (RAP-1), and merozoite capping protein. Tags corresponding to
serine repeat antigen, a soluble protein that is associated with the
parasitophorous vacuole, were found at high abundance (0.32%).
Surprisingly, a tag representing the gametocyte surface antigen
Pfg27/25, shown to be essential for gametogenesis (Lobo et al., 1999), was also present at high abundance (0.25%)
in this SAGE library derived from asexual parasites.
Abundant SAGE tags represented major metabolic pathways of the malarial
parasite. Because asexual blood stages of Plasmodium do not
store energy reserves in the form of glycogen or lipids, glucose taken
up from plasma is the primary source of energy (Sherman, 1991).
Therefore, glucose metabolism is a prominent aspect of intracellular
growth and not unexpectedly, proteins required for glucose metabolism
were represented among the abundant tags (aldolase, phosphoenolpyruvate
carboxykinase, and triosephosphate isomerase).
Although lipids are not used as a major source of energy by P. falciparum, there is a significant increase in levels of
phospholipids, diacylglycerol, and triacylglycerol, within the red
blood cell upon merozoite invasion (Vial and Ancelin, 1998). This
increase in the total lipid content is associated with a biosynthetic
requirement for lipids during formation of the membranes surrounding
the parasite (the parasitophorous vacuolar membrane and the
tubovesicular membrane). N-Myristoyl transferase, an enzyme
that plays a role in the formation of lipoproteins, was found among the
187 most abundant tags; however, tags representing proteins involved in
lipid biosynthesis were not present.
Intraerythrocytic P. falciparum parasites are capable of de
novo synthesis of pyrimidines from precursor molecules (Walsh and
Sherman, 1968), with a requirement for para-aminobenzoic acid and folate cofactors. Unlike their hosts, malarial parasites do not use
exogenous folate cofactors, but instead synthesize these de novo
(Scheibel and Sherman, 1988). SAGE data revealed tags corresponding to
ribonucleotide reductase, an enzyme of the pyrimidine biosynthetic
pathway, and dihydrofolate synthase, an enzyme of the folate pathway.
Polyamine biosynthetic enzymes were also represented among the SAGE
tags (ornithine decarboxylase and ornithine aminotransferase).
The unique intracellular niche of malarial parasites results in the
expression of many parasite-specific metabolic pathways. For example,
growth of the asexual parasites within red blood cells is accompanied
by degradation of hemoglobin and the subsequent detoxification of heme
by-products (Foley and Tilley, 1998; Krogstad and De, 1998; Rosenthal
and Meshnick, 1998). Tags representing proteins implicated in the
detoxification of heme (histidine-rich proteins I and II, glutathione
reductase) were found at high abundance in the SAGE library.
Surprisingly, the plasmepsin and falcipain proteases that play a role
in hemoglobin degradation were not found in the list of highly
expressed genes. This may be because their transcription
occurs at an earlier stage in the parasite life cycle than the
trophozoite stage, which was the predominant stage in the study
population. Alternatively, these transcripts may be present at a very
low abundance.
Finally, SAGE data revealed the expression of mRNA encoding DOXP
synthase at high levels (0.09%). The DOXP pathway was recently identified as a parasite-specific metabolic pathway important for
isoprenoid biosynthesis (Jomaa et al., 1999). Because this pathway is localized in the apicoplast, a plant-derived organelle of
Plasmodium, DOXP metabolism provides a novel target for
antimalarial drug development.
Validation of SAGE Data
To confirm the expression data in asexual-stage parasites as
determined by SAGE, RT-PCR and Northern analysis of several genes with
highly abundant SAGE tag counts (calmodulin, msp-3,
rap-1, and pfg27/25; Figure
4) were performed. Pfg27/25
represents a gametocyte-specific antigen, whereas the other three are
predicted to be expressed in asexual stages. Because the SAGE library
was derived from a culture that contained no detectable gametocytes, pfg27/25 was specifically chosen for RT-PCR and Northern
analysis. RT-PCR products for all four genes were generated from
asexual-stage mRNA (Figure 4A). These were cloned, sequenced, and found
to correspond to the expected gene. Transcripts at the predicted length
for all four genes were also detected by Northern blotting (our
unpublished results; Figure 4B). The presence of pfg27/25
transcripts in the asexual stages of P. falciparum has been
reported in another genome-wide expression analysis with the use of
microarrays (Hayward et al., 2000).
For a more quantitative estimate of gene expression, quantitative Northern analysis of two highly expressed genes (msp-3 and calmodulin) was performed (our unpublished results; Figure 4B). Here, the molar ratio of msp-3 to calmodulin was ~3:1, which is similar to the ratio of their SAGE tag counts (Figure 4B). Hence, SAGE tag data appear to correlate well with relative levels of mRNA within the cells.
Antisense Transcripts
A surprising observation of SAGE in P. falciparum was the large proportion of tags corresponding to antisense transcripts. Unlike microarrays, SAGE is able to detect antisense transcription because the orientation of the SAGE tag on the mRNA can be readily determined. A SAGE tag consists of the 4-bp recognition sequence (CATG) of the restriction enzyme NlaIII (this enzyme defines the position of each tag in an mRNA transcript) and 10 bp of adjacent sequence in the direction of the 3' poly(A) tail of the RNA molecule. Among 45 annotated genes whose 5' and 3' ends are clearly denoted, 17% of the tags consisted of a CATG and the 3' adjacent 10 bp, in the direction of the 5' end of the transcript, on the noncoding strand of cDNA. This result was unexpected; hence, we wanted independent confirmation of the SAGE data. This was accomplished by strand-specific RT-PCR analysis of asexual as well as sexual blood stages, and strand-specific Northern analysis in erythrocytic stage parasites.
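The orientation test described above can be expressed compactly: a tag that occurs in the annotated mRNA sequence is a sense tag, whereas a tag that occurs only in its reverse complement derives from an antisense transcript. The following sketch assumes the annotated transcript is available as a 5'-to-3' sense-strand string; the function names and the toy transcript are hypothetical.

```python
# Sketch only: decide whether a SAGE tag lies on the sense or antisense
# strand of an annotated transcript. Names and sequences are hypothetical.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def tag_orientation(tag, mrna_sense):
    """Classify a tag relative to a sense-strand transcript sequence."""
    in_sense = tag in mrna_sense
    in_antisense = tag in reverse_complement(mrna_sense)
    if in_sense and in_antisense:
        return "ambiguous"
    if in_sense:
        return "sense"
    if in_antisense:
        return "antisense"
    return "no match"


# Toy transcript with a single CATG site flanked by distinct sequence.
toy_mrna = "G" * 10 + "CATG" + "A" * 10
sense_call = tag_orientation("CATG" + "A" * 10, toy_mrna)
antisense_call = tag_orientation("CATG" + "C" * 10, toy_mrna)
```

In the toy transcript the same CATG site yields both calls: the sense tag extends toward the 3' end of the mRNA, and the antisense tag is the reverse complement of CATG plus the 10 bases on its 5' side.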
We confirmed the presence of antisense transcripts from erythrocytic
stages by strand-specific RT-PCR analysis of the three genes
calmodulin, rap-1, and msp-3, and subsequent
sequencing of the RT-PCR products to establish gene identity. Based on
SAGE data, we expected all three transcripts to be present in both the
sense and antisense orientations, a prediction that was confirmed by
RT-PCR (Figure 5A, lanes 1-12) and
sequence analysis. On the other hand, a PCR product for
hsp86 was detected only for sense RNA (Figure 5A, lane 15),
consistent with the absence of an antisense SAGE tag for this gene.
Importantly, control experiments that excluded reverse transcriptase
(lanes 2, 4, 6, 8, 10, 12, 14, and 16) indicated a lack of
contaminating genomic DNA, showing that the PCR products obtained
during strand-specific RT-PCR were indeed derived from RNA. These data
validate the antisense transcripts predicted by SAGE.
The presence of antisense transcripts was also confirmed by strand-specific Northern analysis for calmodulin and msp-3. To control for the specificity of the strand-specific RNA probes, synthetic RNA corresponding to the sense or antisense strands of each gene was included in the experiment. This synthetic RNA consisted of short transcripts (250-300 bp within the coding regions) derived from each gene in vitro. Figure 5B shows that strand-specific probes can specifically detect synthetic antisense RNA (lanes 1 and 2 for calmodulin; lanes 7 and 8 for msp-3) or synthetic sense RNA (lanes 4 and 5 for calmodulin; lanes 10 and 11 for msp-3). With the use of these strand-specific probes, total RNA isolated from asexual stage parasites was shown to contain both antisense (~1 kb) and sense (~1.2 kb) transcripts for both calmodulin (lanes 3 and 6) and msp-3 (~2 kb) (lanes 9 and 12). Therefore, as confirmed by two independent techniques, the presence of antisense tags in the SAGE library reflects antisense transcription in asexual stages of the malarial parasite.
We wondered whether genes expressed in other stages of the
Plasmodium life cycle also exhibited antisense
transcription. To address this, the sexual stages (zygotes and
ookinetes) of the chicken malarial parasite P. gallinaceum
were tested for the presence of antisense RNAs. Pgs28 is a
major surface antigen of P. gallinaceum sexual stages (Duffy
et al., 1993), and transcription of the pgs28 gene has been studied previously. Strand-specific RT-PCR of total RNA
from zygotes (0 h) and mature ookinetes (48 h) showed that the
pgs28 gene expressed both sense and antisense transcripts (Figure 6) at different stages of in
vitro development (lanes 1, 5, and 9 show antisense PCR product).
DISCUSSION
This report demonstrates the application of SAGE in P. falciparum. Despite the low complexity of the genome, SAGE tags as short as 14 bp can uniquely identify a majority of genes in P. falciparum. This observation has been exploited to study transcription in the asexual stages of the parasite, resulting in new insights into the biology of the pathogen. First, we provide a description of the transcriptional profile of the 3D7 strain of P. falciparum that builds upon the extensive data generated by the Malaria Genome Project. Second, the major metabolic pathways present in blood stage parasites are delineated; modulation of these pathways in response to stimuli such as drug and immune pressure can now be studied. Finally, this report shows that Plasmodium parasites express antisense RNAs at multiple stages during the developmental cycle, a finding that has implications for transcriptional regulation of Plasmodium gene expression.
Analysis of SAGE Tags
Of the tags that matched to single loci, 70% matched to known
genes, whereas 30% matched to unknown genes or hypothetical ORFs. This
distribution is in stark contrast to genome sequencing data, where 60%
of the putative ORFs were of unknown function, whereas 40% were genes
encoding proteins of known functions (Gardner et al., 1998).
This discrepancy could be explained by the fact that the asexual blood
stages are more amenable to cultivation and experimental manipulation
in the laboratory than other stages; hence, many of the transcripts
expressed in these stages at high abundance have been previously
studied and are of known functions. It is also likely that a majority
of the transcripts expressed during laboratory culture of asexual blood
stages encode proteins that serve housekeeping functions conserved
within organisms widely separated on the phylogenetic tree. The genes
of unknown function identified by the Malaria Genome Project may turn
out to be of importance in host-parasite interactions and disease;
however, under culturing conditions only relatively few may be
expressed at high levels. Alternatively, the higher percentage of
uncharacterized, putative ORFs in the sequence data might be due to
overprediction of genes. Because SAGE data reveal genes that are
actually expressed in asexual stage parasites, identification of tags
that correspond to unknown genes and hypothetical ORFs will be of
tremendous use in annotation of the P. falciparum genome.
Some tags (10%) did not match to the Plasmodium databases.
Because the P. falciparum genome is 80-90% complete, these
tags should prove to be informative as the genome project proceeds to
completion. Alternatively, tags that do not match genome sequence may
turn out to span splice junctions. These questions should be resolved
as more genome sequence becomes available. Nevertheless, SAGE in
P. falciparum is comparable with other studies where tags with no matches to the genome were as high as 20% (Matsumura et al., 1999) and 23% (Yamashita et al., 2000) of the
total tags.
Finally, of the 187 most abundant tags, 18% gave multiple matches to
Plasmodium databases, a number that is fourfold higher than
that obtained from human pancreatic SAGE libraries, where ~5% of
tags gave multiple matches (Velculescu et al., 1995).
However, pancreatic SAGE tags were only searched against RNA sequence
databases, in contrast to our more extensive analysis that surveyed all
available Plasmodium genome sequence. Hence, the higher
percentage of multiple matches to the genome may reflect the method of
analysis rather than any limitation of the technique when applied to
the A-T rich genome of Plasmodium. Alternatively, the higher
percentage of tags giving multiple matches may be a consequence of the
lower complexity of the Plasmodium genome. Ambiguous tags of
interest can be investigated further on an individual basis by Northern analysis.
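The classification of tags discussed above (no match, unique match, multiple matches) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses exact string matching on both strands rather than the BLAST searches used in this study, and the function names and the input format (a list of genomic sequence strings) are hypothetical.

```python
from collections import Counter

# Lookup table for complementing DNA bases (assumes only A, C, G, T).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def count_occurrences(seq, pattern):
    """Count occurrences of pattern in seq, including overlapping ones."""
    count = 0
    start = seq.find(pattern)
    while start != -1:
        count += 1
        start = seq.find(pattern, start + 1)
    return count


def genome_hits(tag, contigs):
    """Count exact matches of a tag on either strand of the given contigs."""
    tag = tag.upper()
    # A set avoids double counting if a tag equals its own reverse complement.
    patterns = {tag, reverse_complement(tag)}
    return sum(count_occurrences(c.upper(), p) for c in contigs for p in patterns)


def classify_tag(tag, contigs):
    """Label a tag as 'no match', 'unique', or 'multiple'."""
    hits = genome_hits(tag, contigs)
    if hits == 0:
        return "no match"
    return "unique" if hits == 1 else "multiple"


def summarize(tags, contigs):
    """Tally how many tags fall into each match class."""
    return Counter(classify_tag(t, contigs) for t in tags)
```

Tags that land in the "multiple" class are the ambiguous ones that would need individual follow-up, for example by Northern analysis.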
Metabolic Pathways Defined by SAGE
Other reports on SAGE have revealed metabolic profiles that are
highly specific to the organism or tissue under study. For example,
SAGE of mouse kidney revealed a preponderance of ion channels and
mitochondrial enzymes, consistent with the role of the kidney in
filtration and solute transport and the high energy requirement of
these processes (El-Meanawy et al., 2000). Transcriptional profiling of
the 100 most abundant SAGE tags derived from seedlings of the rice
plant, Oryza sativa L., demonstrated a prevalence of
prolamin, a storage protein expressed in seeds (Matsumura et al., 1999). As expected, other highly abundant transcripts
included those encoding water channels and respiratory metabolism enzymes.
SAGE data from P. falciparum shed light on the
transcriptional profile of blood stage parasites and hence reveal the
classes of proteins and metabolic pathways that are probably used
during asexual growth. For example, membrane-associated proteins form the most abundant category of expressed proteins. This is not surprising, because the parasite is separated from its
extracellular environment by three separate membranes: the host red
blood cell membrane, the parasitophorous vacuole membrane, and the
parasite plasma membrane (Torii and Masamichi, 1998). Many of
these highly expressed proteins are stage specific and have been
previously shown to be important in invasion of the red blood cell
(MSP-3 and -4) (Barnwell and Galinski, 1998); others are
transporters that may import nutrients into the parasite cell (importin
-subunit). Hence, the unique niche of the malarial parasite within
the red blood cell requires the high expression of specific surface proteins.
A significant proportion (7.4%) of the most abundant tags was derived
either from transcripts encoded on the 6-kb mitochondrial genome or
from nuclear encoded transcripts involved in oxidative metabolism. High
levels of RNA synthesis from the 6-kb element may reflect the fact that
this episomally replicating molecule is present at ~20 copies per
cell (Preiser et al., 1996). However, a high demand for
mitochondrial function and oxidative metabolism is suggested by the
abundance of nuclear transcripts encoding proteins (thioredoxin,
ubiquinol cytochrome c reductase-like protein) probably
involved in the maintenance of intracellular oxidative homeostasis.
Moreover, SAGE data show that transcripts encoding the molecular
chaperones hsp-60 and -70, which may be involved in import of nuclear-encoded proteins into the mitochondria (Das et al., 1997), are also expressed at high levels. Hence,
mitochondrial functions are most highly represented in the abundant
classes of SAGE tags, probably reflecting the microaerophilic lifestyle of the parasite within the red blood cell. The robust expression of
genes involved in mitochondrial physiology may explain why mitochondrial pathways have been excellent targets for antimalarial drugs.
The major transcriptional pathways in the parasite as revealed by SAGE
will help to identify potential drug targets and lead compounds. For
example, atovaquone inhibits erythrocytic growth by targeting the
mitochondrial cytochrome bc1 complex (Fry and Pudney, 1992). Further evidence that other highly expressed
metabolic pathways could also serve as drug targets is found in the
following studies: the antimalarial drug fosmidomycin has been shown to target DOXP metabolism (Jomaa et al., 1999); the ornithine
decarboxylase inhibitor, difluoro-methylornithine, inhibits
erythrocytic growth of P. falciparum in culture (Assaraf et al., 1984); and folate antagonists such as pyrimethamine
and cycloguanil target dihydrofolate reductase (Ferone et al., 1969). Other major transcriptional patterns uncovered by SAGE
in the parasite (proteasome, chaperones, unknown ORFs) may provide new
targets for antimalarial drug development.
SAGE Reveals Novel Transcriptional Phenomena in P. falciparum
Most techniques for global analysis of gene expression are unable to distinguish sense from antisense transcripts. Because of the directional nature of SAGE tags (4 bp representing the NlaIII site that is closest to the 3' end of each transcript, and 10 bp downstream on the coding strand), we were able to identify numerous antisense transcripts in the transcriptional repertoire of P. falciparum asexual-stage parasites. Strand-specific RT-PCR and Northern analysis confirmed this observation for three of the genes (msp-3, rap-1, and calmodulin) predicted to give rise to antisense messages. The fact that antisense transcription can be detected in Plasmodium by three independent methods suggests that this is a bona fide biological phenomenon and not an artifact of the SAGE procedure.
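The strand information carried by a SAGE tag can be illustrated with a short Python sketch. It assumes the tag layout described above (the 3'-most CATG site recognized by NlaIII plus the 10 bp immediately downstream) and assigns orientation by exact matching against the coding strand of a single annotated gene. The function names and the toy sequence are hypothetical, and transcripts with fewer than 10 bases after the last site are simply skipped, which is a simplification.

```python
NLAIII_SITE = "CATG"   # recognition sequence of NlaIII
TAG_BODY_LEN = 10      # bases taken downstream of the site
TAG_LEN = len(NLAIII_SITE) + TAG_BODY_LEN

# Lookup table for complementing DNA bases (assumes only A, C, G, T).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def sage_tag(transcript):
    """Return the 14-bp tag at the 3'-most NlaIII site, or None.

    The transcript is given 5'->3' as DNA. None is returned if there is
    no CATG site or if fewer than 10 bases follow the last site.
    """
    seq = transcript.upper()
    pos = seq.rfind(NLAIII_SITE)
    if pos == -1 or pos + TAG_LEN > len(seq):
        return None
    return seq[pos:pos + TAG_LEN]


def tag_orientation(tag, coding_strand):
    """Classify a tag relative to the coding strand of one gene."""
    coding = coding_strand.upper()
    if tag.upper() in coding:
        return "sense"
    if reverse_complement(tag) in coding:
        return "antisense"
    return "no match"


# Toy coding-strand sequence, invented for illustration only.
coding = "GGATTCAGGTAACATGTTTGACCGATAAGG"
sense_tag = sage_tag(coding)                          # "CATGTTTGACCGAT"
antisense_tag = sage_tag(reverse_complement(coding))  # "CATGTTACCTGAAT"
print(sense_tag, tag_orientation(sense_tag, coding))
print(antisense_tag, tag_orientation(antisense_tag, coding))
```

Because CATG is its own reverse complement, a sense and an antisense transcript of the same gene can share an anchoring site yet yield different tags, since each tag extends 10 bp toward the 3' end of its own strand.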
Whether antisense RNAs transcribed by the malarial parasite have
poly(A) tails is unclear. However, their representation in the SAGE
library suggests that, because of the long poly(A) tracts in the
genome of P. falciparum, SAGE samples both polyadenylated RNAs and
those lacking poly(A) tails, similar to our observations with
mitochondrial transcripts (Munasinghe et al., 2000).
Our data also demonstrate that antisense transcripts are expressed in
other stages of Plasmodium development. The pgs28
gene, which encodes a major surface antigen of P. gallinaceum
sexual stages, has been studied extensively (Duffy et al., 1993). Transcription of pgs28 is restricted to the zygotes
and ookinetes. Strand-specific RT-PCR shows that pgs28
expresses both sense and antisense transcripts in both stages. Hence,
the presence of antisense transcripts may be a widespread phenomenon in
multiple stages of Plasmodium development and should be
tested further. For example, a family of genes (var) encodes
variable surface proteins involved in host-parasite interactions; var genes are transcribed during erythrocytic growth,
resulting in the expression of the PfEMP-1 protein (Wahlgren et al., 1999). Several var genes are transcribed in the
ring stages, whereas a single var gene is transcribed in
trophozoites (Chen et al., 1998; Scherf et al., 1998). It would be interesting to test whether any of the ring stage
var transcripts are antisense.
What is the biological significance of antisense transcription in
P. falciparum? Antisense transcripts may reflect mechanisms of transcriptional initiation in a parasite with a highly A-T-rich genome (86% A-T in noncoding regions and 76% in coding sequence) (Bowman et al., 1999). Numerous studies have shown that
transcription in P. falciparum is initiated from the
A-T-rich 5' upstream region of genes, resulting in sense transcripts
(Horrocks et al., 1998; Dechering et al., 1999; Horrocks and Lanzer, 1999). The presence of antisense transcripts for
17% of annotated genes implies novel mechanisms of transcriptional
initiation and termination and suggests potential roles for antisense RNA in
posttranscriptional control of protein expression.
In conclusion, we have shown that SAGE can be readily adapted for the study of global transcription in P. falciparum. SAGE of 3D7 asexual parasites sheds light on the prominent metabolic pathways used in these stages. Because blood stages are the targets of both antimalarial drugs and the host immune system, this comprehensive transcriptional profile generated by SAGE will form the basis for future comparisons of gene expression under drug or immune pressure. Finally, the directional nature of SAGE revealed a novel phenomenon that had previously been missed: antisense transcription.
| |
ACKNOWLEDGMENTS |
|---|
We thank Drs. J. Kissinger and D. Roos (University of Pennsylvania, Philadelphia, PA; http://www.plasmoDB.org) for providing access to assembled P. falciparum genomic sequences. We also thank Dr. Connie Chow for insightful comments about this manuscript, and Dr. Kevin Militello for providing the hsp86 primers and valuable suggestions. We acknowledge the invaluable data provided by the Malaria Genome Consortium. Sequence data for P. falciparum chromosomes 1, 3, 4, 5, 6, 7, 8, 9, and 13 were obtained from The Sanger Center Web site at http://www.sanger.ac.uk/Projects/P_falciparum/. Sequencing of P. falciparum chromosomes 1, 3, 4, 5, 6, 7, 8, 9, and 13 was accomplished as part of the Malaria Genome Project with support by The Wellcome Trust. Sequence data for P. falciparum chromosome 12 were obtained from the Stanford DNA Sequencing and Technology Center Web site at http://www-sequence.stanford.edu/group/malaria. Sequencing of P. falciparum chromosome 12 was accomplished as part of the Malaria Genome Project with support by the Burroughs Wellcome Fund. Preliminary sequence data for P. falciparum chromosomes 2, 10, 11, and 14 were obtained from The Institute for Genomic Research Web site (www.tigr.org). Sequencing of chromosomes 2, 10, 11, and 14 was part of the International Malaria Genome Sequencing Project and was supported by awards from the Burroughs Wellcome Fund and the U.S. Department of Defense. The Chromosome 2 Sequencing Project was a collaborative effort by The Institute for Genomic Research (TIGR) and the Naval Medical Research Center. This work was supported by the Burroughs Wellcome Fund and by the U.S. Department of Defense.
| |
FOOTNOTES |
|---|
§ Corresponding author. E-mail address: dfwirth{at}hsph.harvard.edu.
These authors contributed equally to this work.
| |
ABBREVIATIONS |
|---|
Abbreviations used: BLAST, basic local alignment search tool; SAGE, serial analysis of gene expression; RT-PCR, reverse transcription-polymerase chain reaction.
| |
REFERENCES |
|---|