130	Conserved Domain Database models.	CDD	1	{"type": "domain"}
30	Density of coding genes, calculated by dividing the chromosome into 150 "bins" and counting the genes in each. (For very short chromosomes, e.g. MT, some genes contribute to multiple bins.)	Coding genes (density)	1	\N
34	Dust is a program that identifies low-complexity sequences (regions of the genome with a biased distribution of nucleotides, such as a repeat). The Dust module is widely used with BLAST to prevent 'sticky' regions from determining false hits.	Low complexity (Dust)	1	\N
1	Protein coding genes annotated in ENA	Genes	1	{"label_key": "[biotype]", "caption": "Genes", "colour_key": "[biotype]", "default": {"cytoview": "gene_label", "MultiBottom": "collapsed_label", "alignsliceviewbottom": "as_collapsed_label", "MultiTop": "gene_label", "contigviewbottom": "transcript_label", "contigviewtop": "gene_label"}, "name": "Genes", "key": "ensembl"}
7	3'UTR feature annotated in ENA	3'UTR (ENA)	1	{"caption": "Genomic features", "key": "ena_features", "label_key": "[text_label] [display_label]", "multi_name": "Genomic features", "name": "Genomic features"}
6	5'UTR feature annotated in ENA	5'UTR (ENA)	1	{"caption": "Genomic features", "key": "ena_features", "label_key": "[text_label] [display_label]", "multi_name": "Genomic features", "name": "Genomic features"}
21	Assembly gap feature annotated in ENA	Assembly gap (ENA)	1	\N
20	gene feature annotated in ENA	gene (ENA)	1	{"caption": "Genomic features", "key": "ena_features", "label_key": "[text_label] [display_label]", "multi_name": "Genomic features", "name": "Genomic features"}
4	ncRNA genes annotated in ENA	ncRNA genes (ENA)	1	{"label_key": "[biotype]", "caption": "Genes", "colour_key": "[biotype]", "default": {"cytoview": "gene_label", "MultiBottom": "collapsed_label", "alignsliceviewbottom": "as_collapsed_label", "MultiTop": "gene_label", "contigviewbottom": "transcript_label", "contigviewtop": "gene_label"}, "name": "Genes", "key": "ensembl"}
153	CATH/Gene3D families.	Gene3D	1	{"type": "domain"}
157	\N	GOA annotation	0	\N
40	GO term derived transitively from a UniProt record	UniProt-derived GO term	0	\N
128	HAMAP families.	HAMAP	1	{"type": "domain"}
129	PANTHER families.	PANTHER	1	{"type": "domain"}
151	InterPro2GO mapping, defined by InterPro.	InterPro2GO mapping	0	\N
44	Density of long non-coding RNA genes, calculated by dividing the chromosome into 150 "bins" and counting the genes in each. (For very short chromosomes, e.g. MT, some genes contribute to multiple bins.)	Long non-coding genes (density)	1	\N
139	Intrinsically disordered regions predicted by MobiDB lite.	MobiDB lite	1	{"type": "feature"}
140	Coiled-coil regions predicted by Ncoils.	Coiled-coils (Ncoils)	1	{"type": "feature"}
45	Percentage of repetitive elements for top level sequences (such as chromosomes, scaffolds, etc.)	Repeats (percent)	1	\N
22	Percentage of G/C bases in the sequence.	GC content	1	\N
132	Protein domains and motifs from the Pfam database.	Pfam	1	{"type": "domain"}
152	Protein domains and motifs from the PROSITE profiles database.	PROSITE profiles	1	{"type": "domain"}
135	Protein domains and motifs from the PIR (Protein Information Resource) Superfamily database.	PIRSF	1	{"type": "domain"}
136	Protein fingerprints (groups of conserved motifs) from the PRINTS database.	Prints	1	{"type": "domain"}
26	Density of pseudogenes, calculated by dividing the chromosome into 150 "bins" and counting the genes in each. (For very short chromosomes, e.g. MT, some genes contribute to multiple bins.)	Pseudogenes (density)	1	\N
149	Repeats detected using Red (REPeatDetector)	Repeats: Red	1	\N
36	Repeats detected using the MIPS Repeat Database (REdat) using RepeatMasker.	Repeats: REdat	1	\N
37	Repeats identified by RepeatMasker, using the Repbase library of repeat profiles.	Repeats: Repbase	1	\N
154	Protein domains and motifs from the PROSITE patterns database.	PROSITE patterns	1	{"type": "domain"}
147	Low complexity peptide sequences identified by Seg.	Low complexity (Seg)	1	{"type": "feature"}
137	Structure-Function Linkage Database families.	SFLD	1	{"type": "domain"}
28	Density of short non-coding RNA genes, calculated by dividing the chromosome into 150 "bins" and counting the genes in each. (For very short chromosomes, e.g. MT, some genes contribute to multiple bins.)	Short non-coding genes (density)	1	\N
144	Signal peptide cleavage sites predicted by SignalP.	Cleavage site (Signalp)	1	{"type": "feature"}
138	Protein domains and motifs from the SMART database.	SMART	1	{"type": "domain"}
142	Protein domains and motifs from the SUPERFAMILY database.	Superfamily	1	{"type": "domain"}
141	Protein domains and motifs from the TIGRFAM database.	TIGRFAM	1	{"type": "domain"}
143	Transmembrane helices predicted by TMHMM.	Transmembrane helices	1	{"type": "feature"}
35	Tandem Repeats Finder locates adjacent copies of a pattern of nucleotides.	Tandem repeats (TRF)	1	\N
38	UniParc mapping based on sequence checksums	UniParc cross-reference	0	\N
39	UniProt cross-reference derived transitively from a UniParc identifier	UniParc-derived cross-reference	0	\N
41	Cross-reference derived transitively from a UniProt record	UniProt-derived cross-reference	0	\N
42	Cross references to UniProt Swiss-Prot (reviewed) proteins, determined by alignment against the proteome with blastp.	UniProt reviewed proteins	0	\N
43	Cross references to UniProt TrEMBL (unreviewed) proteins, determined by alignment against the proteome with blastp.	UniProt unreviewed proteins	0	\N