This reports the protocol used to align the Sorghum_ESTcluster3_Pratt features to tigrv4-genome.
Fri Apr 14 11:57:18 2006

Source of Sorghum_ESTcluster3_Pratt : from the Gramene markers database, originally from the Pratt lab

Alignment procedure details
---------------------------
27436 Sorghum_ESTcluster3_Pratt features were aligned to tigrv4-genome using blat with the parameter -minIdentity=50, followed by pslReps with -singleHit. This was followed by the filtering procedure described below, which is applied in general to 'CrossSpecies-Coding' data sets.

Initial summary
# alignments : 17721
# unique Features these alignments represent : 16594
% of total features these alignments represent : 60.48 %

The lengths of the matches are distributed as follows

Hit_Length    # alignments
--------      --------
100           1918
150           2045
200           2352
250           2332
300           2268
350           2078
400           1742
450           1240
500           766
550           446
600           286
650           132
700           46
750           25
800           22
10000         23

Alignments with matches of less than 150 bp are deleted.
# remaining Alignments : 13814
# unique Features these remaining alignments represent : 12946
% of total features these alignments represent : 47.19 %

Frequency distribution of the remaining features

# hits        # features
--------      --------
1             12543
2             238
3             41
4             44
5             53
6             5
8             8
9             6
10            4
20            4
30            0
40            0
50            0
100           0

Features that hit more than thrice are deleted.
# remaining Alignments : 13142
# unique Features these remaining alignments represent : 12822
% of total features these alignments represent : 46.73 %

% Identity distribution of the remaining features

% Identity    # features
--------      --------
10            0
20            0
30            0
40            3
50            5
60            24
70            151
80            1157
90            9068
95            2589
100           145

Following is the distribution of gaps

Gaps          # features
--------      --------
1000          11199
2000          1202
3000          302
4000          84
5000          37
6000          33
7000          17
8000          28
9000          16
10000         14

Following is the final summary
# alignments : 13142
# unique Features these alignments represent : 12822
% of total features these alignments represent : 46.73 %
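
The two filtering steps described above (deleting alignments with matches of less than 150 bp, then deleting features that hit more than thrice) can be sketched in Python as follows. This is only an illustrative sketch, not the script that produced this report: the (feature_id, match_length) tuple layout and all function names are assumptions. The second half re-derives the post-filter counts from the hit-frequency table, assuming that rows 1 to 3 of that table are exact hit counts.

```python
from collections import Counter

TOTAL_FEATURES = 27436  # number of input features, taken from the report


def filter_alignments(alignments, min_match=150, max_hits=3):
    """Apply the two filters to (feature_id, match_length) tuples.

    The tuple layout is hypothetical; real input would be parsed from PSL.
    """
    # Step 1: delete alignments with matches of less than min_match bp.
    long_enough = [(f, m) for f, m in alignments if m >= min_match]
    # Step 2: delete every feature that still hits more than max_hits times.
    hits = Counter(f for f, _ in long_enough)
    return [(f, m) for f, m in long_enough if hits[f] <= max_hits]


def summarize(alignments, total_features=TOTAL_FEATURES):
    """Return (# alignments, # unique features, % of total features)."""
    features = {f for f, _ in alignments}
    pct = round(100.0 * len(features) / total_features, 2)
    return len(alignments), len(features), pct


# Hit-frequency table from the report: hits per feature -> number of features.
HIT_TABLE = {1: 12543, 2: 238, 3: 41, 4: 44, 5: 53,
             6: 5, 8: 8, 9: 6, 10: 4, 20: 4}


def after_hit_filter(hit_table, max_hits=3, total_features=TOTAL_FEATURES):
    """Recompute the summary left after dropping features with > max_hits hits.

    Assumes the rows kept (1..max_hits) are exact hit counts, not bin bounds.
    """
    kept = {h: n for h, n in hit_table.items() if h <= max_hits}
    n_features = sum(kept.values())
    n_alignments = sum(h * n for h, n in kept.items())
    pct = round(100.0 * n_features / total_features, 2)
    return n_alignments, n_features, pct
```

Under these assumptions, after_hit_filter(HIT_TABLE) returns (13142, 12822, 46.73), which agrees with the final summary reported above.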