This report describes the protocol used to align the Sugarcane_EST features to Maize_BACs_20060126.
Mon Feb 13 15:26:16 2006

Source of Sugarcane_EST : Downloaded from GenBank dbEST with query 'saccharum AND gbdiv_est[PROP]'

Alignment procedure details
---------------------------
255984 Sugarcane_EST were aligned to Maize_BACs_20060126 using blat with the parameter -minIdentity=50, followed by pslReps with -singleHit. The results were then filtered by the procedure described below, which is applied in general to 'CrossSpecies-Coding' data sets.

Initial summary
# alignments : 47936
# unique Features these alignments represent : 41527
% of total features these alignments represent : 16.22 %

The lengths of the matches are distributed as follows

Hit_Length   # alignments
--------     --------
100          9044
150          5497
200          4668
250          3995
300          3666
350          3185
400          3395
450          3174
500          2926
550          2988
600          2744
650          1732
700          664
750          173
800          51
10000        34

Alignments with matches shorter than 150 bp are deleted.
# remaining Alignments : 33499
# unique Features these remaining alignments represent : 28983
% of total features these alignments represent : 11.32 %

Frequency distribution of the remaining features

# hits       # features
--------     --------
1            25499
2            3043
3            206
4            61
5            64
6            62
8            48
9            0
10           0
20           0
30           0
40           0
50           0
100          0

Features with more than three hits are deleted.
# remaining Alignments : 32203
# unique Features these remaining alignments represent : 28748
% of total features these alignments represent : 11.23 %

% Identity distribution of the remaining features

% Identity   # features
--------     --------
10           0
20           2
30           0
40           1
50           20
60           63
70           386
80           1781
90           8952
95           17393
100          3605

Following is the distribution of Gaps

Gaps         # features
--------     --------
1000         25891
2000         2903
3000         1128
4000         452
5000         169
6000         85
7000         78
8000         55
9000         74
10000        80

Following is the final summary
# alignments : 32203
# unique Features these alignments represent : 28748
% of total features these alignments represent : 11.23 %
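
The two deletion steps in this report (drop alignments with matches shorter than 150 bp, then drop features with more than three hits) can be sketched in Python as below. This is a minimal illustration, not the pipeline's actual code: it assumes the alignments have already been parsed into (feature name, match length) pairs, and the names filter_alignments and summarize, as well as the toy input, are hypothetical. The final assertions only re-check arithmetic on figures printed in the report.

```python
from collections import Counter


def summarize(alignments, total_features):
    """Return (# alignments, # unique features, % of total features)."""
    features = {name for name, _ in alignments}
    pct = round(100.0 * len(features) / total_features, 2)
    return len(alignments), len(features), pct


def filter_alignments(alignments, min_match=150, max_hits=3):
    """Apply the two deletion steps in the order the report lists them."""
    # Step 1: delete alignments whose match length is below min_match.
    kept = [(name, length) for name, length in alignments if length >= min_match]
    # Step 2: delete every alignment of a feature that hits more than max_hits times.
    hits = Counter(name for name, _ in kept)
    return [(name, length) for name, length in kept if hits[name] <= max_hits]


# Toy input, made up for illustration (not taken from the report).
toy = [
    ("est1", 420),
    ("est2", 90),                     # removed: match shorter than 150 bp
    ("est3", 200), ("est3", 310),
    ("est4", 150), ("est4", 160),     # removed: est4 has four hits
    ("est4", 170), ("est4", 180),
]
filtered = filter_alignments(toy)
print(summarize(filtered, total_features=10))

# Consistency check on figures from the report: features with 1, 2 or 3 hits.
hit_histogram = {1: 25499, 2: 3043, 3: 206}
assert sum(hit_histogram.values()) == 28748                    # unique features kept
assert sum(k * v for k, v in hit_histogram.items()) == 32203   # alignments kept
assert round(100.0 * 28748 / 255984, 2) == 11.23               # % of total features
```

The hit-count step runs after the length step, so a feature is judged only on the alignments that survived the 150 bp cutoff, which matches the order of the summaries in the report.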