Award Abstract:

This pilot project will develop a high-throughput strategy to analyze native expression patterns and subcellular localization of Arabidopsis gene products of unknown function. This strategy, Fluorescent Tagging of Full-Length Proteins (FTFLP), will comprise five major steps: (1) selection of functionally unassigned Arabidopsis genes and prediction of their protein structure and a suitable site for fluorescent tag insertion; (2) amplification of each gene in two parts, with the junction between the two parts corresponding to the chosen insertion site for the fluorescent tag; (3) introduction of the fluorescent tag, yellow fluorescent protein (YFP), using a triple overlap PCR approach; (4) insertion of PCR products into binary vectors; and (5) production of transgenic Arabidopsis lines and analysis of expression pattern and intracellular localization for each tagged protein. As a pilot approach, the project aims to analyze a statistically significant number of genes to support the applicability of the approach to a subsequent wider study. To this end, approximately 800 genes were selected from a total of ca. 8,000 unknown genes. This pilot list was chosen based on the following sequentially applied criteria: the genes (1) have a matching full-length cDNA, (2) are annotated as unknown protein or putative protein, and (3) do not have any Gene Ontology annotations. The selected genes reflect the diversity of all the unknown Arabidopsis genes with respect to plant specificity, predicted domain and/or gene family information, and availability of matching full-length cDNA sequences.
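The three sequentially applied selection criteria can be illustrated with a minimal sketch. This is not the project's actual selection pipeline: the `Gene` record, its field names, the `select_pilot_candidates` function, and the placeholder locus names are all hypothetical and serve only to show how each filter narrows the output of the previous one.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical record layout; the field names are assumptions, not the project's schema.
@dataclass
class Gene:
    locus: str
    has_full_length_cdna: bool
    annotation: str
    go_terms: List[str] = field(default_factory=list)

# Annotation labels named in the abstract as marking a gene as "unknown".
UNKNOWN_ANNOTATIONS = {"unknown protein", "putative protein"}

def select_pilot_candidates(genes: List[Gene]) -> List[Gene]:
    """Apply the three criteria in order, each on the survivors of the last."""
    # Criterion 1: a matching full-length cDNA exists.
    with_cdna = [g for g in genes if g.has_full_length_cdna]
    # Criterion 2: annotated as unknown protein or putative protein.
    unknown = [g for g in with_cdna
               if g.annotation.strip().lower() in UNKNOWN_ANNOTATIONS]
    # Criterion 3: no Gene Ontology annotations at all.
    return [g for g in unknown if not g.go_terms]

# Placeholder loci for illustration only (not real Arabidopsis identifiers).
example_genes = [
    Gene("GENE_A", True, "unknown protein"),
    Gene("GENE_B", False, "unknown protein"),             # fails criterion 1
    Gene("GENE_C", True, "kinase"),                        # fails criterion 2
    Gene("GENE_D", True, "putative protein", ["GO:x"]),    # fails criterion 3
    Gene("GENE_E", True, "Putative protein"),
]

selected = select_pilot_candidates(example_genes)
print([g.locus for g in selected])
```

Because every criterion here is a simple pass/fail test, the order of the filters does not change the final set; applying them in sequence mainly makes it easy to report how many genes survive each step.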

FTFLP as a tool for functional proteomics offers four significant advantages: it focuses on genes of unknown function, it produces tagged full-length proteins that are more likely to exhibit faithful intracellular localization, it expresses fusion proteins at native expression levels to minimize artifacts due to over-expression and ectopic expression, and its use of native promoters allows determination of tissue specificity. Three deliverables will be offered to the research community:

1) Expression vectors harboring full-length sequences for each gene under its native promoter and tagged with YFP flanked by unique restriction sites,

2) Arabidopsis transgenic lines expressing each construct, and

3) A website and a searchable database containing information about the lines and constructs, including the gene sequences highlighted with positions of primers and tagging sites, vector construct information, images and text descriptions of the protein expression pattern and intracellular localization, and protocols and standard operating procedures for experimentation, analysis, and interpretation. In addition, a Reference Protein Subcellular Localization Map will be constructed using fluorescently tagged proteins with known intracellular targeting.

These resources will be available to the public through two unrestricted venues: DNA constructs and transgenic seeds will be distributed through the Arabidopsis Biological Resource Center (ABRC) whereas gene sequences and expression and subcellular localization data, including fluorescence microscopy images, will be disseminated via the project website integrated into The Arabidopsis Information Resource (TAIR). Importantly, this sharing of the resources and results of this project through ABRC and TAIR, respectively, will take place on a continuous basis as the deliverables become available. Announcements on the availability of new resources will be made through such electronic media as the Bionet USENET newsgroups and parallel e-mail lists.

This project significantly advances the overall objectives of the 2010 Project by characterizing on a large scale the expression and subcellular localization of unknown Arabidopsis gene products. Our understanding of Arabidopsis biology will be significantly incomplete without such knowledge. In addition, this project has a broader impact on society and science. Once this pilot project demonstrates the feasibility of the proposed approach, it will serve as a basis for developing a laboratory curriculum for use in cell biology training of high school students and teachers as well as beginning investigators at the CSHL DNA Learning Center, the annual Arabidopsis Molecular Genetics Course, and the biannual UCR Plant Cell Biology course. Finally, a teaching outreach program with community colleges will involve undergraduates in summer research. Thus, our program will bridge genomic approaches with cell biology in the laboratory and classroom, and generate important novel information and tools to characterize the Arabidopsis proteome.