This report describes the protocol used to align the Rice_CDS features to Maize_BACs_20060126.
Mon Feb 13 12:39:16 2006

Source of Rice_CDS : Downloaded from Genbank with query
'(txid4530[ORGN] AND complete[TITL] AND cds[TITL]) NOT (Mitochondrion[ALL] OR Chloroplast[ALL] OR Mitochondrial[ALL]) )'

Alignment procedure details
---------------------------
2423 Rice_CDS are aligned to Maize_BACs_20060126 using blat with blat parameters -minIdentity=50,
followed by PslReps with -singleHit. This was followed by the filtering procedure described below,
which is applied in general to 'CrossSpecies-Coding' data sets.

Initial summary
# alignments                                      : 857
# unique Features these alignments represent      : 780
% of total features these alignments represent    : 32.19 %

The lengths of the matches are distributed as follows

Hit_Length   # alignments
----------   ------------
100          190
150          168
200          82
250          65
300          45
350          37
400          23
450          23
500          15
550          19
600          18
650          18
700          13
750          9
800          12
10000        120

Alignments with matches less than 150 bp are deleted.
# remaining Alignments                                      : 500
# unique Features these remaining alignments represent      : 466
% of total features these alignments represent              : 19.23 %

Frequency distribution of the remaining features

# hits       # features
----------   ------------
1            433
2            32
3            1
4            0
5            0
6            0
8            0
9            0
10           0
20           0
30           0
40           0
50           0
100          0

Features that hit more than thrice are deleted.
# remaining Alignments                                      : 500
# unique Features these remaining alignments represent      : 466
% of total features these alignments represent              : 19.23 %

% Identity distribution of the remaining features

% Identity   # features
----------   ------------
10           0
20           0
30           0
40           1
50           3
60           10
70           19
80           99
90           302
95           64
100          2

Following is the distribution of Gaps

Gaps         # features
----------   ------------
1000         335
2000         67
3000         35
4000         21
5000         10
6000         6
7000         3
8000         7
9000         2
10000        1

Following is the final summary
# alignments                                      : 500
# unique Features these alignments represent      : 466
% of total features these alignments represent    : 19.23 %
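
Illustrative sketch of the filtering steps
The two filters reported above (delete alignments with matches shorter than 150 bp, then delete features that hit more than three times) can be sketched in Python as follows. This is not the script that produced this report: the function names, the (feature_id, match_length) tuple layout, and the constant names are assumptions made for illustration only. The thresholds and the total of 2423 features are taken from the report.

```python
from collections import Counter

# Thresholds taken from the report text; the names are hypothetical.
MIN_MATCH_BP = 150   # alignments with matches shorter than this are deleted
MAX_HITS = 3         # features with more hits than this are deleted


def filter_alignments(alignments, min_match=MIN_MATCH_BP, max_hits=MAX_HITS):
    """Apply the two filters to a list of (feature_id, match_length) tuples.

    The tuple layout is an assumption; real input would come from parsed
    PSL records.
    """
    # Step 1: drop alignments whose match length is below the minimum.
    long_enough = [a for a in alignments if a[1] >= min_match]

    # Step 2: count surviving hits per feature and drop every alignment
    # belonging to a feature that hits more than max_hits times.
    hits_per_feature = Counter(feature_id for feature_id, _ in long_enough)
    return [a for a in long_enough if hits_per_feature[a[0]] <= max_hits]


def summarize(alignments, total_features):
    """Return (# alignments, # unique features, % of total features)."""
    unique_features = {feature_id for feature_id, _ in alignments}
    percent = round(100.0 * len(unique_features) / total_features, 2)
    return len(alignments), len(unique_features), percent


if __name__ == "__main__":
    # Toy data: feature "c" hits four times and is removed entirely;
    # the 100 bp alignment of feature "a" is removed by the length filter.
    toy = [("a", 100), ("a", 200), ("b", 150),
           ("c", 300), ("c", 300), ("c", 300), ("c", 300)]
    kept = filter_alignments(toy)
    print(kept)
    print(summarize(kept, total_features=4))
```

The percentages in the summaries follow the same arithmetic as summarize: 780 of 2423 features is 32.19 %, and 466 of 2423 is 19.23 %.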