This report describes the protocol used to align the Sorghum_EST features to Maize_BACs_20060126.
Mon Feb 13 15:01:33 2006


Source of Sorghum_EST : Downloaded from GenBank with the query 'txid4557[orgn] AND gbdiv_est[PROP]'

Alignment procedure details 
--------------------------- 

231225 Sorghum_EST sequences were aligned to Maize_BACs_20060126 using blat with the parameter -minIdentity=50, followed by pslReps with -singleHit. The resulting alignments were then filtered as described below; this filtering procedure is applied in general to 'CrossSpecies-Coding' data sets.
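
The two alignment steps can be sketched as command lines assembled in Python. This is only an illustration: the file names (Maize_BACs_20060126.fa, Sorghum_EST.fa, raw.psl, best.psl, best.psr) are hypothetical, and only the two options named in this report are passed; any other options used in the original run are not recorded here.

```python
def build_alignment_commands(database_fa, query_fa, raw_psl, best_psl, psr_out):
    """Return the blat and pslReps invocations as argument lists."""
    # blat usage: blat database query [options] output.psl
    blat_cmd = ["blat", database_fa, query_fa, "-minIdentity=50", raw_psl]
    # pslReps usage: pslReps [options] in.psl out.psl out.psr
    psl_reps_cmd = ["pslReps", "-singleHit", raw_psl, best_psl, psr_out]
    return blat_cmd, psl_reps_cmd


# Hypothetical file names, chosen for illustration only.
blat_cmd, psl_reps_cmd = build_alignment_commands(
    "Maize_BACs_20060126.fa", "Sorghum_EST.fa", "raw.psl", "best.psl", "best.psr"
)
print(" ".join(blat_cmd))
print(" ".join(psl_reps_cmd))
```

To execute the commands rather than print them, each list could be passed to subprocess.run(cmd, check=True), provided blat and pslReps are installed and on the PATH.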

Initial summary
# alignments : 45512
# unique Features these alignments represent: 35913
% of total features these alignments represent : 15.53 %

The match lengths are distributed as follows
Hit_Length	# alignments
--------	--------
100	 7404
150	 5485
200	 4517
250	 3823
300	 3657
350	 3508
400	 3711
450	 3307
500	 3324
550	 2902
600	 1910
650	 976
700	 598
750	 267
800	 91
10000	 32
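
As a reading aid, the sketch below shows one way such a histogram could be produced and checks that the table adds up to the 45512 alignments reported above. It assumes that each row is labeled with the inclusive upper edge of its bin (so 150 covers lengths 101 to 150, and 10000 is a catch-all). That reading is consistent with the counts in this report but is not stated in it, and bin_label is a hypothetical helper.

```python
from bisect import bisect_left

# Bin labels and counts copied from the table above.
BIN_EDGES = [100, 150, 200, 250, 300, 350, 400, 450,
             500, 550, 600, 650, 700, 750, 800, 10000]
BIN_COUNTS = [7404, 5485, 4517, 3823, 3657, 3508, 3711, 3307,
              3324, 2902, 1910, 976, 598, 267, 91, 32]


def bin_label(hit_length, edges=BIN_EDGES):
    """Return the label of the first bin whose upper edge is >= hit_length.

    Assumption: labels are inclusive upper edges; lengths above the last
    edge fall into the last bin.
    """
    index = bisect_left(edges, hit_length)
    return edges[min(index, len(edges) - 1)]


total_alignments = sum(BIN_COUNTS)
print(total_alignments)   # 45512, the initial number of alignments
print(bin_label(150))     # 150
print(bin_label(151))     # 200
```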

Alignments with matches shorter than 150 bp are deleted.
# remaining Alignments : 32698
# unique Features these remaining alignments represent: 25205
% of total features these alignments represent : 10.90 %
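
The length filter can be sketched as a pass over PSL records. This is a minimal illustration, not the script that produced this report: it assumes the length is taken from the PSL "matches" column (the first of the 21 columns) and that the feature name is the qName column (the tenth). The helper names and the demo rows are made up.

```python
def filter_by_match_length(psl_lines, min_matches=150):
    """Keep PSL rows whose 'matches' column is at least min_matches."""
    kept = []
    for line in psl_lines:
        fields = line.rstrip("\n").split("\t")
        # Skip PSL header lines and anything that is not a 21-column record.
        if len(fields) < 21 or not fields[0].isdigit():
            continue
        if int(fields[0]) >= min_matches:
            kept.append(line)
    return kept


def make_row(matches, q_name):
    """Build a synthetic 21-column PSL row; only matches and qName matter here."""
    fields = [str(matches)] + ["0"] * 7 + ["+", q_name] + ["0"] * 11
    return "\t".join(fields)


# Made-up demo rows, not data from this report.
rows = [make_row(149, "est_a"), make_row(150, "est_b"), make_row(400, "est_c")]
kept_rows = filter_by_match_length(rows)
kept_features = {row.split("\t")[9] for row in kept_rows}
print(len(kept_rows), sorted(kept_features))
```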

Frequency distribution of the remaining features
# hits	# features
--------	--------
1	 21739
2	 2235
3	 220
4	 136
5	 319
6	 337
8	 219
9	 0
10	 0
20	 0
30	 0
40	 0
50	 0
100	 0

Features with more than three hits are deleted.
# remaining Alignments : 26869
# unique Features these remaining alignments represent: 24194
% of total features these alignments represent : 10.46 %

% Identity distribution of the remaining alignments
% Identity	# alignments
--------	--------
10	 0
20	 0
30	 0
40	 7
50	 9
60	 82
70	 373
80	 1320
90	 7225
95	 14026
100	 3827

The gaps are distributed as follows
Gaps	# features
--------	--------
1000	 23311
2000	 1995
3000	 687
4000	 313
5000	 189
6000	 68
7000	 47
8000	 22
9000	 33
10000	 27

Final summary
# alignments : 26869
# unique Features these alignments represent: 24194
% of total features these alignments represent : 10.46 %
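
The percentages reported at each stage are the unique feature counts divided by the 231225 input sequences. A short check of that arithmetic, using only numbers from this report (percent_of_total is a hypothetical helper):

```python
TOTAL_FEATURES = 231225  # Sorghum_EST sequences submitted for alignment

# Unique features remaining at each stage, as reported above.
STAGES = {
    "initial": 35913,
    "after length filter": 25205,
    "after multi-hit filter (final)": 24194,
}


def percent_of_total(count, total=TOTAL_FEATURES):
    """Return count as a percentage of total, formatted to two decimals."""
    return f"{100.0 * count / total:.2f}"


for stage, count in STAGES.items():
    print(f"{stage}: {count} features = {percent_of_total(count)} %")
# initial: 35913 features = 15.53 %
# after length filter: 25205 features = 10.90 %
# after multi-hit filter (final): 24194 features = 10.46 %
```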