This reports the protocol used to align the Rice_EST features to Maize_BACs_20060126.
Mon Feb 13 13:21:14 2006

Source of Rice_EST : Downloaded from GenBank with query 'txid4530[orgn] AND gbdiv_est[PROP]'

Alignment procedure details
---------------------------
298857 Rice_EST sequences were aligned to Maize_BACs_20060126 using blat with the parameter -minIdentity=50, followed by pslReps with -singleHit. The alignments were then passed through the filtering procedure described below, which is applied in general to 'CrossSpecies-Coding' data sets.

Initial summary
# alignments : 39150
# unique Features these alignments represent : 27694
% of total features these alignments represent : 9.27 %

The lengths of the matches are distributed as follows

Hit_Length   # alignments
--------     --------
100          7917
150          5406
200          5071
250          4525
300          3416
350          3211
400          2652
450          2695
500          1644
550          960
600          534
650          343
700          218
750          182
800          277
10000        99

Alignments with matches of less than 150 bp are deleted.
# remaining Alignments : 25948
# unique Features these remaining alignments represent : 17089
% of total features these alignments represent : 5.72 %

Frequency distribution of the remaining features

# hits       # features
--------     --------
1            13922
2            1360
3            166
4            457
5            597
6            216
8            371
9            0
10           0
20           0
30           0
40           0
50           0
100          0

Features with more than three hits are deleted.
# remaining Alignments : 17140
# unique Features these remaining alignments represent : 15448
% of total features these alignments represent : 5.17 %

% Identity distribution of the remaining features

% Identity   # features
--------     --------
10           0
20           0
30           2
40           10
50           13
60           58
70           200
80           1228
90           9773
95           3618
100          2238

Following is the distribution of Gaps

Gaps         # features
--------     --------
1000         14369
2000         897
3000         397
4000         105
5000         1108
6000         95
7000         25
8000         11
9000         26
10000        19

Following is the final summary
# alignments : 17140
# unique Features these alignments represent : 15448
% of total features these alignments represent : 5.17 %
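
The two filtering steps reported above (delete alignments with matches of less than 150 bp, then delete features with more than three remaining hits) can be sketched in a few lines of Python. This is only an illustration, not the script that produced this report: the (feature_id, match_length) tuple layout and the function names filter_alignments, percent_of_total and summarize are assumptions made for the example.

```python
from collections import Counter

MIN_MATCH_BP = 150  # alignments with matches shorter than this are deleted
MAX_HITS = 3        # features with more remaining hits than this are deleted


def filter_alignments(alignments, min_match=MIN_MATCH_BP, max_hits=MAX_HITS):
    """Apply the two filters to (feature_id, match_length) tuples.

    The tuple layout is a hypothetical simplification of a PSL record.
    """
    # Step 1: keep only alignments whose match length reaches the threshold.
    long_enough = [a for a in alignments if a[1] >= min_match]
    # Step 2: count hits per feature among the survivors and drop every
    # alignment of a feature that still hits more than max_hits times.
    hits = Counter(feature_id for feature_id, _ in long_enough)
    return [a for a in long_enough if hits[a[0]] <= max_hits]


def percent_of_total(n_features, total_features):
    """Percentage of all input features, rounded to two decimals."""
    return round(100.0 * n_features / total_features, 2)


def summarize(alignments, total_features):
    """Return (# alignments, # unique features, % of total features)."""
    features = {feature_id for feature_id, _ in alignments}
    return (len(alignments), len(features),
            percent_of_total(len(features), total_features))


if __name__ == "__main__":
    # Toy input, not data from this report.
    toy = [("A", 200), ("A", 120), ("B", 300), ("B", 310),
           ("B", 320), ("B", 330), ("C", 149), ("D", 150)]
    kept = filter_alignments(toy)
    print(kept)
    print(summarize(kept, total_features=4))
```

The same percentage arithmetic reproduces the figures in the summaries: 27694, 17089 and 15448 unique features out of 298857 give 9.27 %, 5.72 % and 5.17 % respectively.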