This report describes the protocol used to align the Wheat_EST features to Maize_BACs_20060126.
Mon Feb 13 15:52:12 2006


Source of Wheat_EST : Downloaded from GenBank with the query 'txid4564[orgn] AND gbdiv_est[PROP]'

Alignment procedure details 
--------------------------- 

608301 Wheat_EST sequences were aligned to Maize_BACs_20060126 using blat with the parameter -minIdentity=50, followed by pslReps with -singleHit. The resulting alignments were then passed through the filtering procedure described below, which is applied in general to 'CrossSpecies-Coding' data sets.
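
The two alignment steps can be sketched as command lines assembled in Python. Only -minIdentity=50 and -singleHit come from this report. The file names are hypothetical placeholders, and the argument order assumes the usual blat (database, query, output) and pslReps (input psl, output psl, output psr) usage. Nothing is executed here.

```python
def build_alignment_commands(database, query, raw_psl, best_psl, psr):
    """Return argv lists for the blat and pslReps steps (not executed)."""
    # blat <database> <query> [options] <output.psl>
    blat_cmd = ["blat", database, query, "-minIdentity=50", raw_psl]
    # pslReps [options] <in.psl> <out.psl> <out.psr>
    psl_reps_cmd = ["pslReps", "-singleHit", raw_psl, best_psl, psr]
    return blat_cmd, psl_reps_cmd


# Hypothetical file names, for illustration only.
blat_cmd, psl_reps_cmd = build_alignment_commands(
    "maize_bacs.fa", "wheat_est.fa", "raw.psl", "best.psl", "best.psr"
)
print(" ".join(blat_cmd))
print(" ".join(psl_reps_cmd))
```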

Initial summary
# alignments : 100082
# unique Features these alignments represent: 78149
% of total features these alignments represent : 12.85 %
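
The percentage above is the number of unique features divided by the 608301 input sequences. A minimal sketch of that arithmetic follows; the function name is illustrative.

```python
TOTAL_FEATURES = 608301  # Wheat_EST sequences submitted to blat


def percent_of_total(unique_features, total=TOTAL_FEATURES):
    """Share of all input features, as a percentage rounded to 2 places."""
    return round(100.0 * unique_features / total, 2)


print(percent_of_total(78149))  # 12.85
```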

The match lengths are distributed as follows
Hit_Length	# alignments
--------	--------
100	 28100
150	 13506
200	 13071
250	 9322
300	 9511
350	 8632
400	 6045
450	 3902
500	 2845
550	 2238
600	 1680
650	 700
700	 336
750	 97
800	 18
10000	 79

Alignments with matches shorter than 150 bp are deleted.
# remaining Alignments : 58748
# unique Features these remaining alignments represent: 43136
% of total features these alignments represent : 7.09 %
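
A length filter of this kind could look like the sketch below. It assumes the alignments are tab-separated PSL rows and that the match length is the first PSL column (matches). The report does not state which column was used, so treat that as an assumption.

```python
def filter_by_match_length(psl_lines, min_length=150):
    """Keep PSL rows whose matches column (column 1) is >= min_length."""
    kept = []
    for line in psl_lines:
        fields = line.rstrip("\n").split("\t")
        # Skip header or malformed lines: a PSL row has 21 columns
        # and starts with an integer match count.
        if len(fields) < 21 or not fields[0].isdigit():
            continue
        if int(fields[0]) >= min_length:
            kept.append(line)
    return kept


# Synthetic rows for illustration: only the first column matters here.
short_row = "\t".join(["120"] + ["0"] * 20)
long_row = "\t".join(["150"] + ["0"] * 20)
print(len(filter_by_match_length([short_row, long_row])))  # 1
```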

Frequency distribution of the remaining features
# hits	# features
--------	--------
1	 35393
2	 5235
3	 451
4	 209
5	 869
6	 679
8	 298
9	 0
10	 0
20	 1
30	 1
40	 0
50	 0
100	 0

Features with more than three hits are deleted.
# remaining Alignments : 47216
# unique Features these remaining alignments represent: 41079
% of total features these alignments represent : 6.75 %
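
The remaining counts follow from the frequency table: features with 1, 2, or 3 hits are kept, and each contributes that many alignments. The sketch below shows one way to apply such a filter and cross-checks the totals. The (feature, target) pair representation is an assumption made for illustration.

```python
from collections import Counter


def drop_multi_hit_features(alignments, max_hits=3):
    """alignments: list of (feature_name, target_name) pairs."""
    hits = Counter(feature for feature, _ in alignments)
    return [a for a in alignments if hits[a[0]] <= max_hits]


# Cross-check against the frequency table (# hits -> # features).
features_by_hits = {1: 35393, 2: 5235, 3: 451}
remaining_features = sum(features_by_hits.values())
remaining_alignments = sum(h * n for h, n in features_by_hits.items())
print(remaining_features, remaining_alignments)  # 41079 47216
```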

% Identity distribution of the remaining alignments
% Identity	# alignments
--------	--------
10	 0
20	 1
30	 3
40	 15
50	 28
60	 189
70	 690
80	 3737
90	 26510
95	 11391
100	 4652
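
As a consistency check, the counts in the identity table can be summed. They add up to 47216, which is the number of remaining alignments rather than the number of unique features (41079). The sketch below hard-codes the rows from this report; the function name is illustrative.

```python
IDENTITY_TABLE = """\
10 0
20 1
30 3
40 15
50 28
60 189
70 690
80 3737
90 26510
95 11391
100 4652
"""


def parse_histogram(text):
    """Parse 'bin count' rows into a list of (bin, count) integer pairs."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            rows.append((int(parts[0]), int(parts[1])))
    return rows


rows = parse_histogram(IDENTITY_TABLE)
total = sum(count for _, count in rows)
print(total)  # 47216
```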

The gaps are distributed as follows
Gaps	# features
--------	--------
1000	 41765
2000	 3223
3000	 927
4000	 431
5000	 110
6000	 72
7000	 34
8000	 41
9000	 38
10000	 71

Final summary
# alignments : 47216
# unique Features these alignments represent: 41079
% of total features these alignments represent : 6.75 %