Gramene Diversity Module: Sharing The Data Behind Germplasm, QTL, And Breeding Studies

Terry Casstevens3, Peter Bradbury3, Payan Canaran4, Dallas Kroon3, David E. Matthews1,2, Susan McCouch2,3, Junjian Ni2, Doreen Ware1,4, Immanuel Yap2, Wei Zhao4, Edward S. Buckler1,2,3

1 USDA-ARS
2 Dept. of Plant Breeding and Genetics, Cornell University
3 Institute for Genomic Diversity, Cornell University
4 Cold Spring Harbor Laboratory, NY, USA

Understanding genetic and phenotypic diversity is an exciting field. Thousands of studies have already been conducted to unravel the connections between observable traits and the genome. These studies produce large quantities of data (e.g., QTL, germplasm, and molecular data). Supported by the NSF and USDA-ARS, the Gramene Diversity project is building an infrastructure to share these basic molecular and phenotypic diversity data. This infrastructure includes a database schema (GDPDM; www.maizegenetics.net/gdpdm), Java XML-SOAP middleware (GDPC; www.maizegenetics.net/gdpc), a sequence alignment and SNP viewer (www.panzea.org/db/snp_alignment/snp_cgi.dev.pl), and an association and diversity analysis tool (TASSEL; www.maizegenetics.net/bioinformatics/tasselindex.htm). All of these components are free, open-source tools. Species-specific tools for accessing diversity data sets are currently available for maize (www.panzea.org) and rice (rice-evolution.plbr.cornell.edu). We are currently integrating diversity data from maize, wheat, and rice diversity projects into one database. Diversity data in that newly combined database and in the USDA Germplasm Resources Information Network (GRIN; www.ars-grin.gov) will soon be publicly shared via the GDPC middleware. We are also working (1) to incorporate more data from diversity projects, (2) to develop community upload tools, and (3) to create enhanced query, display, and analysis tools. We encourage community input and collaboration on this effort, so that the largest possible community can access and productively use diversity data.