Estimation of Syntenic Spans between Oryza sativa and other species of the Oryza Genus

Step 1 - Loading the FPContig Map

The OMAP HME FPC map for O.rufipogon was loaded into the Gramene 'mappings' relational database such that the data model captured the locations of FPContigs on chromosomes, and clones on FPContigs.

Step 2 - Sequence Loading

All GenBank GSS sequences for O.rufipogon were loaded into the 'mappings' database, and cross references were made between the sequence entries and their parent clones. Pairs of clone end reads could therefore be readily identified.

Step 3 - Sequence Alignment

The GSS sequences were aligned to the O.sativa pseudomolecules (TIGR v4 assembly) using BLAT with its default parameters, and the alignments for the 10 top-scoring hits were loaded into the database.
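
The hit-filtering part of this step can be illustrated with a minimal Python sketch. It assumes that "top 10" is applied per query sequence and that each alignment has already been parsed into a dictionary with hypothetical `query`, `target` and `score` keys; neither the record layout nor the function name comes from the original pipeline.

```python
import heapq
from collections import defaultdict

def top_hits(alignments, n=10):
    """Keep the n highest-scoring alignments for each query sequence.

    `alignments` is an iterable of dicts with hypothetical keys
    'query', 'target' and 'score' (assumed, not the pipeline's schema).
    """
    by_query = defaultdict(list)
    for aln in alignments:
        by_query[aln["query"]].append(aln)
    # nlargest returns the hits sorted from highest to lowest score
    return {
        query: heapq.nlargest(n, hits, key=lambda a: a["score"])
        for query, hits in by_query.items()
    }
```

Only the retained hits would then be written to the database; how ties at the tenth position were handled is not stated in the text, and this sketch does not treat them specially.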

Step 4 - Clone Mapping

Mappings between clones and the O. sativa genome were inferred where both end sequences (BES) of a given clone aligned to the same chromosome and the distance they spanned was within two standard deviations of the mean clone size. The mean clone size was determined by examining the band counts for the clones in the FPC map and extrapolating the clone size in base pairs using an estimated band size of 1195 bp. The two BES for a clone could map in either the same or different orientations, to account for possible inversions, but could not map to different chromosomes or to regions on the same chromosome that fell outside the expected clone size range (+/- 2 standard deviations from the mean clone size). Clones could map to multiple regions of the genome if the scores for the hits in those regions were the same (multiple top hits).
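
The size test described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the original implementation: the function names and the `(chromosome, start, end)` hit tuples are hypothetical, the sample standard deviation is assumed, and the span is measured between the outermost coordinates of the two end alignments.

```python
from statistics import mean, stdev

BAND_SIZE_BP = 1195  # estimated band size given in the text

def clone_size_stats(band_counts):
    """Extrapolate clone sizes (bp) from FPC band counts; return (mean, sd)."""
    sizes = [n * BAND_SIZE_BP for n in band_counts]
    # Sample standard deviation is an assumption; the text does not say which was used
    return mean(sizes), stdev(sizes)

def ends_consistent(hit_a, hit_b, mean_size, sd_size, n_sd=2):
    """Check whether two end-read hits are compatible with one clone.

    Each hit is a hypothetical (chromosome, start, end) tuple.
    Orientation is deliberately ignored, to allow for inversions.
    """
    chrom_a, start_a, end_a = hit_a
    chrom_b, start_b, end_b = hit_b
    if chrom_a != chrom_b:
        return False
    # Genomic distance covered from the outer edge of one end to the other
    span = max(end_a, end_b) - min(start_a, start_b)
    return abs(span - mean_size) <= n_sd * sd_size
```

For example, band counts of 100, 110 and 120 extrapolate to clone sizes of 119500, 131450 and 143400 bp, so a pair of end hits on the same chromosome spanning roughly 130 kb would be accepted, while a pair spanning 500 kb would be rejected.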

Step 5 - FPContig Mapping

FPContig mappings on the genome were inferred where three or more overlapping clones on a given contig were located on the genome. The outermost start and end of these clones became the start and end points of each region. If a gap existed between located clones, the clones beyond the gap formed a separate region. Discrete sections of each FPContig were therefore free to map to multiple locations on the genome.
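
One way to picture the region-building rule is the interval-merging sketch below. It assumes that "overlapping" means overlap of the clones' genome coordinates and that a gap is any point where the next clone starts beyond the current region's end; the function name and the `(chromosome, start, end)` tuples for the clones of a single FPContig are hypothetical.

```python
def build_regions(placements, min_clones=3):
    """Merge overlapping clone placements of one FPContig into regions.

    `placements` is a list of hypothetical (chromosome, start, end) tuples.
    Returns (chromosome, start, end, clone_count) for every run of
    overlapping clones that contains at least `min_clones` clones.
    """
    regions = []
    current = None
    for chrom, start, end in sorted(placements):
        if current and chrom == current["chrom"] and start <= current["end"]:
            # Clone overlaps the open region: extend it
            current["end"] = max(current["end"], end)
            current["n"] += 1
        else:
            # Gap or new chromosome: close the open region and start another
            if current and current["n"] >= min_clones:
                regions.append(current)
            current = {"chrom": chrom, "start": start, "end": end, "n": 1}
    if current and current["n"] >= min_clones:
        regions.append(current)
    return [(r["chrom"], r["start"], r["end"], r["n"]) for r in regions]
```

Because each run of overlapping clones is evaluated separately, a single FPContig can yield several regions, possibly on different chromosomes, which matches the statement that discrete sections of a contig may map to multiple locations.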

Step 6 - O.rufipogon Genome Database

A representation of the O.rufipogon FPC map was transferred from the 'mappings' database to an 'ensembl-core' database. The FPContig-to-chromosome locations were represented as the 'assembly', and the clone locations as features annotated on the assembly. The annotated assembly can be viewed using Gramene's Genome Browser. Clones whose mapping to the O.sativa genome was supported by adjacent clones are outlined in red (synteny_group). Clones that were placed without support are outlined in green (synteny_singleton), and clones that could not be located are outlined in grey (no_synteny).

Step 7 - O.sativa Genome Database

The alignments of O.rufipogon GSS sequences to the O.sativa pseudomolecules were transferred to the existing Gramene 'ensembl-core' database as features annotated on the pseudomolecules. The clone-to-pseudomolecule locations were similarly transferred. The GSS and clone annotations can be viewed as tracks on Gramene's Oryza sativa Genome Browser.

Step 8 - Compara Database

The sections of O.rufipogon FPContigs that were located on the O.sativa pseudomolecules were used directly as syntenic anchors between the assemblies in the respective genome databases, and loaded accordingly into the 'ensembl-compara' database. These annotations can be viewed as tracks on each Genome Browser, or as whole-chromosome overviews from the Browser's SyntenyView application.