This report describes the protocol used to align the Sugarcane_EST features to Maize_BACs_20060126.
Mon Feb 13 15:26:16 2006


Source of Sugarcane_EST : Downloaded from GenBank dbEST with the query 'saccharum AND gbdiv_est[PROP]'

Alignment procedure details 
--------------------------- 

255984 Sugarcane_EST sequences were aligned to Maize_BACs_20060126 using blat with the parameter -minIdentity=50, followed by pslReps with -singleHit. The resulting alignments were then filtered with the procedure described below, which is applied in general to 'CrossSpecies-Coding' data sets.
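
The two-step alignment described above can be sketched as a pair of command lines, built here in Python without being executed. This is only an illustration: the file names (maize_bacs.fa, sugarcane_est.fa, raw.psl, best.psl, best.psr) are hypothetical placeholders, and only the options named in this report are included. Any other options used in the original run are not recorded here.

```python
def build_alignment_commands(target_fa, query_fa, raw_psl, best_psl, psr_out):
    """Return the blat and pslReps argument lists for the two steps.

    All file names are hypothetical; only the options stated in the
    report (-minIdentity=50 and -singleHit) are included.
    """
    # blat usage: blat database query [options] output.psl
    blat_cmd = ["blat", target_fa, query_fa, "-minIdentity=50", raw_psl]
    # pslReps usage: pslReps [options] in.psl out.psl out.psr
    psl_reps_cmd = ["pslReps", "-singleHit", raw_psl, best_psl, psr_out]
    return blat_cmd, psl_reps_cmd


blat_cmd, psl_reps_cmd = build_alignment_commands(
    "maize_bacs.fa", "sugarcane_est.fa", "raw.psl", "best.psl", "best.psr"
)
print(" ".join(blat_cmd))
print(" ".join(psl_reps_cmd))
```

The lists could be passed to subprocess.run once real input files are in place.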

Initial summary
# alignments : 47936
# unique Features these alignments represent: 41527
% of total features these alignments represent : 16.22 %
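
The percentage above is the number of unique features divided by the 255984 input sequences. A minimal sketch of that arithmetic follows; the function name percent_of_total is an illustrative choice, not part of the original pipeline.

```python
TOTAL_FEATURES = 255984  # Sugarcane_EST sequences submitted to blat


def percent_of_total(unique_features, total=TOTAL_FEATURES):
    """Percentage of all input features represented, rounded to 2 places."""
    return round(100.0 * unique_features / total, 2)


print(percent_of_total(41527))  # 16.22
```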

The lengths of the matches are distributed as follows
Hit_Length	# alignments
--------	--------
100	 9044
150	 5497
200	 4668
250	 3995
300	 3666
350	 3185
400	 3395
450	 3174
500	 2926
550	 2988
600	 2744
650	 1732
700	 664
750	 173
800	 51
10000	 34
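
As a consistency check, the bin counts in the table above can be summed and compared with the initial number of alignments. The dictionary below simply transcribes the table; the variable names are illustrative.

```python
# Hit_Length bin label -> number of alignments, copied from the table above
length_bins = {
    100: 9044, 150: 5497, 200: 4668, 250: 3995,
    300: 3666, 350: 3185, 400: 3395, 450: 3174,
    500: 2926, 550: 2988, 600: 2744, 650: 1732,
    700: 664, 750: 173, 800: 51, 10000: 34,
}

total_alignments = sum(length_bins.values())
print(total_alignments)  # 47936, the initial number of alignments
```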

Alignments with matches shorter than 150 bp are deleted
# remaining Alignments : 33499
# unique Features these remaining alignments represent: 28983
% of total features these alignments represent : 11.32 %
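
The length filter can be sketched as follows for alignments stored in PSL format. This assumes the match length is taken from the first PSL column (matches); the report does not say which column the original procedure used, and the helper make_psl_line only builds synthetic lines for the demonstration.

```python
MIN_MATCH_BP = 150  # threshold stated in the report


def filter_by_match_length(psl_lines, min_match=MIN_MATCH_BP):
    """Yield PSL data lines whose 'matches' column is at least min_match."""
    for line in psl_lines:
        fields = line.rstrip("\n").split("\t")
        # Skip blank lines and psLayout header lines (non-numeric first field).
        if len(fields) < 21 or not fields[0].isdigit():
            continue
        if int(fields[0]) >= min_match:
            yield line


def make_psl_line(matches, q_name):
    """Build a synthetic 21-column PSL line (illustrative values only)."""
    fields = [str(matches)] + ["0"] * 7 + [
        "+", q_name, "600", "0", str(matches),
        "bac1", "100000", "0", str(matches),
        "1", f"{matches},", "0,", "0,",
    ]
    return "\t".join(fields) + "\n"


demo = [
    "psLayout version 3\n",
    make_psl_line(149, "est_a"),  # shorter than 150 bp: dropped
    make_psl_line(150, "est_b"),  # exactly 150 bp: kept
    make_psl_line(400, "est_c"),  # kept
]
kept = list(filter_by_match_length(demo))
print(len(kept))  # 2
```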

Frequency distribution of the remaining features
# hits	# features
--------	--------
1	 25499
2	 3043
3	 206
4	 61
5	 64
6	 62
8	 48
9	 0
10	 0
20	 0
30	 0
40	 0
50	 0
100	 0

Features with more than three hits are deleted.
# remaining Alignments : 32203
# unique Features these remaining alignments represent: 28748
% of total features these alignments represent : 11.23 %
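
The multi-hit filter and the resulting counts can be reproduced with a short sketch. The function drop_multi_hit_features and the (feature name, record) pair representation are illustrative assumptions; the counts in the second half come directly from the frequency table above.

```python
from collections import Counter

MAX_HITS = 3  # features with more than three hits are removed


def drop_multi_hit_features(alignments, max_hits=MAX_HITS):
    """Keep alignments whose feature has at most max_hits alignments.

    alignments: list of (feature_name, record) pairs (hypothetical layout).
    """
    hits = Counter(name for name, _ in alignments)
    return [(name, rec) for name, rec in alignments if hits[name] <= max_hits]


demo = [("est_a", 1), ("est_b", 1), ("est_b", 2)] + [("est_c", i) for i in range(4)]
print(len(drop_multi_hit_features(demo)))  # 3: est_c has 4 hits and is dropped

# Cross-check against the report: features with 1, 2, or 3 hits are kept.
kept_by_hits = {1: 25499, 2: 3043, 3: 206}
kept_features = sum(kept_by_hits.values())
kept_alignments = sum(n_hits * n for n_hits, n in kept_by_hits.items())
print(kept_features, kept_alignments)  # 28748 32203
```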

% Identity distribution of the remaining alignments
% Identity	# alignments
--------	--------
10	 0
20	 2
30	 0
40	 1
50	 20
60	 63
70	 386
80	 1781
90	 8952
95	 17393
100	 3605

Following is the distribution of Gaps
Gaps	# features
--------	--------
1000	 25891
2000	 2903
3000	 1128
4000	 452
5000	 169
6000	 85
7000	 78
8000	 55
9000	 74
10000	 80

Following is the final summary
# alignments : 32203
# unique Features these alignments represent: 28748
% of total features these alignments represent : 11.23 %