411	Conserved Domain Database models.	CDD	1	{"type": "domain"}
150	Covariance models from Rfam (release 12.2), aligned to the genome with 'cmscan' from the Infernal suite of programs. Models are restricted to those observed in species that share a last common ancestor (LCA).	Rfam Models (LCA)	1	{"type": "rna"}
22	Density of coding genes, calculated by dividing the chromosome into 150 "bins" and counting the genes in each. (For very short chromosomes, e.g. MT, some genes contribute to multiple bins.)	Coding genes (density)	1	\N
153	Dust is a program that identifies low-complexity sequences (regions of the genome with a biased distribution of nucleotides, such as a repeat). The Dust module is widely used with BLAST to prevent 'sticky' regions from determining false hits.	Low complexity (Dust)	1	\N
2	Protein coding genes annotated in ENA	Genes	1	{"label_key": "[biotype]", "caption": "Genes", "colour_key": "[biotype]", "default": {"cytoview": "gene_label", "MultiBottom": "collapsed_label", "alignsliceviewbottom": "as_collapsed_label", "MultiTop": "gene_label", "contigviewbottom": "transcript_label", "contigviewtop": "gene_label"}, "name": "Genes", "key": "ensembl"}
1	Assembly gap feature annotated in ENA	Assembly gap (ENA)	1	\N
430	CATH/Gene3D families.	Gene3D	1	{"type": "domain"}
3	Cross-references attached by GenomeLoader	GenomeLoader cross-references	0	\N
436	\N	GOA annotation	0	\N
187	GO term derived transitively from a UniProt record	UniProt-derived GO term	0	\N
194	Gene encoding an enzyme annotated at the Plant Reactome.	Plant Reactome	1	\N
409	HAMAP families.	HAMAP	1	{"type": "domain"}
408	PANTHER families.	PANTHER	1	{"type": "domain"}
432	InterPro2GO mapping, defined by InterPro.	InterPro2GO mapping	0	\N
71	Density of long non-coding RNA genes, calculated by dividing the chromosome into 150 "bins" and counting the genes in each. (For very short chromosomes, e.g. MT, some genes contribute to multiple bins.)	Long non-coding genes (density)	1	\N
152	MicroRNA from miRBase.	miRBase miRNA	1	{"type": "rna"}
160	RNA genes imported from miRBase.	RNA genes	1	{"label_key": "[biotype]", "caption": "Genes", "colour_key": "[biotype]", "default": {"cytoview": "gene_label", "MultiBottom": "collapsed_label", "alignsliceviewbottom": "as_collapsed_label", "MultiTop": "gene_label", "contigviewbottom": "transcript_label", "contigviewtop": "gene_label"}, "name": "Genes", "key": "ensembl"}
420	Intrinsically disordered regions predicted by MobiDB lite.	MobiDB lite	1	{"type": "feature"}
421	Coiled-coil regions predicted by Ncoils.	Coiled-coils (Ncoils)	1	{"type": "feature"}
73	Percentage of repetitive elements for top level sequences (such as chromosomes, scaffolds, etc.)	Repeats (percent)	1	\N
18	Percentage of G/C bases in the sequence.	GC content	1	\N
410	Protein domains and motifs from the Pfam database.	Pfam	1	{"type": "domain"}
431	Protein domains and motifs from the PROSITE profiles database.	PROSITE profiles	1	{"type": "domain"}
415	Protein domains and motifs from the PIR (Protein Information Resource) Superfamily database.	PIRSF	1	{"type": "domain"}
414	Protein fingerprints (groups of conserved motifs) from the PRINTS database.	Prints	1	{"type": "domain"}
69	Density of pseudogenes, calculated by dividing the chromosome into 150 "bins" and counting the genes in each. (For very short chromosomes, e.g. MT, some genes contribute to multiple bins.)	Pseudogenes (density)	1	\N
428	Repeats detected using Red (REPeatDetector)	Repeats: Red	1	\N
155	Repeats detected using the MIPS Repeat Database (REdat) using RepeatMasker.	Repeats: REdat	1	\N
156	Repeats identified by RepeatMasker, using the Repbase library of repeat profiles.	Repeats: Repbase	1	\N
137	RNA genes produced by filtering alignments of Rfam (release 12.2) covariance models.	RNA genes	1	{"label_key": "[biotype]", "caption": "Genes", "colour_key": "[biotype]", "default": {"cytoview": "gene_label", "MultiBottom": "collapsed_label", "alignsliceviewbottom": "as_collapsed_label", "MultiTop": "gene_label", "contigviewbottom": "transcript_label", "contigviewtop": "gene_label"}, "name": "Genes", "key": "ensembl"}
433	Protein domains and motifs from the PROSITE patterns database.	PROSITE patterns	1	{"type": "domain"}
426	Low complexity peptide sequences identified by Seg.	Low complexity (Seg)	1	{"type": "feature"}
417	Structure-Function Linkage Database families.	SFLD	1	{"type": "domain"}
70	Density of short non-coding RNA genes, calculated by dividing the chromosome into 150 "bins" and counting the genes in each. (For very short chromosomes, e.g. MT, some genes contribute to multiple bins.)	Short non-coding genes (density)	1	\N
422	Signal peptide cleavage sites predicted by SignalP.	Cleavage site (Signalp)	1	{"type": "feature"}
416	Protein domains and motifs from the SMART database.	SMART	1	{"type": "domain"}
418	Protein domains and motifs from the SUPERFAMILY database.	Superfamily	1	{"type": "domain"}
419	Protein domains and motifs from the TIGRFAM database.	TIGRFAM	1	{"type": "domain"}
423	Transmembrane helices predicted by TMHMM.	Transmembrane helices	1	{"type": "feature"}
154	Tandem Repeats Finder locates adjacent copies of a pattern of nucleotides.	Tandem repeats (TRF)	1	\N
151	tRNA models predicted with tRNAscan-SE (release 1.3.1).	tRNA Models	1	{"type": "rna"}
159	RNA genes produced by filtering predictions from tRNAscan-SE v1.23.	RNA genes	1	{"label_key": "[biotype]", "caption": "Genes", "colour_key": "[biotype]", "default": {"cytoview": "gene_label", "MultiBottom": "collapsed_label", "alignsliceviewbottom": "as_collapsed_label", "MultiTop": "gene_label", "contigviewbottom": "transcript_label", "contigviewtop": "gene_label"}, "name": "Genes", "key": "ensembl"}
185	UniParc mapping based on sequence checksums	UniParc cross-reference	0	\N
63	Sequences from various databases are matched to Ensembl transcripts using Exonerate. These are external references, or 'Xrefs'.	DNA match	0	\N
186	UniProt cross-reference derived transitively from a UniParc identifier	UniParc-derived cross-reference	0	\N
188	Cross-reference derived transitively from a UniProt record	UniProt-derived cross-reference	0	\N
191	Cross references to RefSeq nucleotide sequences, determined by alignment against the transcriptome with blastx.	RefSeq transcripts	0	\N
190	Cross references to RefSeq peptide sequences, determined by alignment against the proteome with blastp.	RefSeq peptides	0	\N
189	Cross references to UniProt Swiss-Prot (reviewed) proteins, determined by alignment against the proteome with blastp.	UniProt reviewed proteins	0	\N
192	Cross references to UniProt TrEMBL (unreviewed) proteins, determined by alignment against the proteome with blastp.	UniProt unreviewed proteins	0	\N