51	Protein features that represent the mapping between Ensembl proteins (ENSP) and AlphaFoldDB protein structures (including their corresponding chains). Imported via UniProt mappings	AFDB-ENSP mappings	1	\N
23	Conserved Domain Database models.	CDD	1	{"type": "domain"}
14	Density of coding genes, calculated by dividing the chromosome into 150 "bins" and counting the genes in each. (For very short chromosomes, e.g. MT, some genes contribute to multiple bins.)	Coding genes (density)	1	\N
7	Gene annotations from the CSHL gene annotation pipeline developed under NAM project. It is performed through an automated, evidence-based method combining 3rd party software including Mikado, BRAKER and PASA. Further filtered by conservation and AED score, classified into coding and non-coding sets.	Genes	1	{"label_key": "[biotype]", "caption": "Genes", "default": {"contigviewtop": "gene_label", "MultiTop": "gene_label", "alignsliceviewbottom": "as_collapsed_label", "cytoview": "gene_label", "MultiBottom": "collapsed_label", "contigviewbottom": "transcript_label"}, "name": "Coding genes annotated by CSHL", "key": "ensembl", "colour_key": "[gene.logic_name]_[gene.biotype]"}
8	Gene annotations from the CSHL gene annotation pipeline developed under NAM project. It is performed through an automated, evidence-based method combining 3rd party software including Mikado, BRAKER and PASA. Further filtered by conservation and AED score, classified into coding and non-coding sets.	Noncoding genes	1	{"label_key": "[biotype]", "caption": "Genes", "default": {"contigviewtop": "gene_label", "MultiTop": "gene_label", "alignsliceviewbottom": "as_collapsed_label", "cytoview": "gene_label", "MultiBottom": "collapsed_label", "contigviewbottom": "transcript_label"}, "name": "Non-coding genes annotated by CSHL", "key": "ensembl", "colour_key": "[gene.logic_name]_[gene.biotype]"}
3	Dust is a program that identifies low-complexity sequences (regions of the genome with a biased distribution of nucleotides, such as a repeat). The Dust module is widely used with BLAST to prevent 'sticky' regions from determining false hits.	Low complexity (Dust)	1	\N
46	CATH/Gene3D families.	Gene3D	1	{"type": "domain"}
53	\N	GOA annotation	0	\N
19	GO term derived transitively from a UniProt record	UniProt-derived GO term	0	\N
45	Gene encoding an enzyme annotated at the Plant Reactome.	Plant Reactome	1	\N
24	HAMAP families.	HAMAP	1	{"type": "domain"}
29	PANTHER families.	PANTHER	1	{"type": "domain"}
47	InterPro2GO mapping, defined by InterPro.	InterPro2GO mapping	0	\N
15	Density of long non-coding RNA genes, calculated by dividing the chromosome into 150 "bins" and counting the genes in each. (For very short chromosomes, e.g. MT, some genes contribute to multiple bins.)	Long non-coding genes (density)	1	\N
34	Intrinsically disordered regions predicted by MobiDB lite.	MobiDB lite	1	{"type": "feature"}
35	Coiled-coil regions predicted by Ncoils.	Coiled-coils (Ncoils)	1	{"type": "feature"}
16	Percentage of repetitive elements for top level sequences (such as chromosomes, scaffolds, etc.)	Repeats (percent)	1	\N
12	Percentage of G/C bases in the sequence.	GC content	1	\N
37	Protein domains and motifs from the Pfam database.	Pfam	1	{"type": "domain"}
49	Protein domains and motifs from the PROSITE profiles database.	PROSITE profiles	1	{"type": "domain"}
27	Protein domains and motifs from the PIR (Protein Information Resource) Superfamily database.	PIRSF	1	{"type": "domain"}
26	Protein fingerprints (groups of conserved motifs) from the PRINTS database.	Prints	1	{"type": "domain"}
11	Density of pseudogenes, calculated by dividing the chromosome into 150 "bins" and counting the genes in each. (For very short chromosomes, e.g. MT, some genes contribute to multiple bins.)	Pseudogenes (density)	1	\N
5	RepeatMasker is used to find repeats and low-complexity sequences. This track usually shows repeats alone (not low-complexity sequences).	Repeats	1	\N
2	Repeats identified by RepeatMasker, using a custom library of ab initio repeat profiles for this species.	Repeats: Custom library	1	\N
48	Protein domains and motifs from the PROSITE patterns database.	PROSITE patterns	1	{"type": "domain"}
42	Low complexity peptide sequences identified by Seg.	Low complexity (Seg)	1	{"type": "feature"}
31	Structure-Function Linkage Database families.	SFLD	1	{"type": "domain"}
13	Density of short non-coding RNA genes, calculated by dividing the chromosome into 150 "bins" and counting the genes in each. (For very short chromosomes, e.g. MT, some genes contribute to multiple bins.)	Short non-coding genes (density)	1	\N
39	Signal peptide cleavage sites predicted by SignalP.	Cleavage site (Signalp)	1	{"type": "feature"}
32	Protein domains and motifs from the SMART database.	SMART	1	{"type": "domain"}
43	Density of single nucleotide polymorphisms (SNPs), calculated by dividing the chromosome into 150 "bins" and counting the SNPs in each. (For very short chromosomes, e.g. MT, some SNPs contribute to multiple bins.)	SNP Density	1	\N
33	Protein domains and motifs from the SUPERFAMILY database.	Superfamily	1	{"type": "domain"}
36	Protein domains and motifs from the TIGRFAM database.	TIGRFAM	1	{"type": "domain"}
38	Transmembrane helices predicted by TMHMM.	Transmembrane helices	1	{"type": "feature"}
4	Tandem Repeats Finder locates adjacent copies of a pattern of nucleotides.	Tandem repeats (TRF)	1	\N
17	UniParc mapping based on sequence checksums	UniParc cross-reference	0	\N
18	UniProt cross-reference derived transitively from a UniParc identifier	UniParc-derived cross-reference	0	\N
20	Cross-reference derived transitively from a UniProt record	UniProt-derived cross-reference	0	\N
21	Cross references to UniProt Swiss-Prot (reviewed) proteins, determined by alignment against the proteome with blastp.	UniProt reviewed proteins	0	\N
22	Cross references to UniProt TrEMBL (unreviewed) proteins, determined by alignment against the proteome with blastp.	UniProt unreviewed proteins	0	\N