This report describes the protocol used to align the Wheat_ESTcluster_PlantGDB features to Maize_BACs_20060126.
Mon Feb 13 16:04:05 2006


Source of Wheat_ESTcluster_PlantGDB: a set of EST clusters and singletons downloaded from the PlantGDB website.
http://www.plantgdb.org/download/Download/Sequence/ESTcontig/Triticum_aestivum/Triticum_aestivum.PUT.fasta.bz2

Alignment procedure details 
--------------------------- 

197761 Wheat_ESTcluster_PlantGDB features were aligned to Maize_BACs_20060126 using blat with the parameter -minIdentity=50, followed by pslReps with -singleHit. The resulting alignments were then filtered by the procedure described below, which is applied in general to 'CrossSpecies-Coding' data sets.
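
The two alignment steps above can be sketched as command lines. This is only an illustration: the file names are hypothetical placeholders, not the paths used for this run, and the helper merely builds the argument lists without executing anything. It assumes the usual argument order of the two tools (blat: database, query, output; pslReps: input PSL, output PSL, output PSR).

```python
# Sketch only: builds the blat and pslReps command lines described in the report.
# All file names passed in are hypothetical; only -minIdentity=50 and -singleHit
# come from the report itself.

def build_alignment_commands(database, query, raw_psl, filtered_psl, psr,
                             min_identity=50):
    """Return (blat_cmd, psl_reps_cmd) as argument lists."""
    # blat <database> <query> [options] <output.psl>
    blat_cmd = ["blat", database, query,
                f"-minIdentity={min_identity}", raw_psl]
    # pslReps [options] <in.psl> <out.psl> <out.psr>
    psl_reps_cmd = ["pslReps", "-singleHit", raw_psl, filtered_psl, psr]
    return blat_cmd, psl_reps_cmd


if __name__ == "__main__":
    commands = build_alignment_commands(
        "maize_bacs.fa",        # hypothetical target (database) file
        "wheat_put.fa",         # hypothetical query file
        "wheat_vs_maize.psl",   # hypothetical raw blat output
        "wheat_vs_maize.reps.psl",
        "wheat_vs_maize.psr",
    )
    for cmd in commands:
        print(" ".join(cmd))
```

The lists can be handed to subprocess.run(cmd, check=True) if the tools are installed and on the PATH.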

Initial summary
# alignments : 20108
# unique Features these alignments represent: 16917
% of total features these alignments represent : 8.55 %

The lengths of the matches are distributed as follows
Hit_Length	# alignments
--------	--------
100	 5545
150	 3544
200	 2809
250	 2238
300	 1911
350	 1320
400	 884
450	 570
500	 415
550	 259
600	 169
650	 111
700	 88
750	 61
800	 31
10000	 153
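
A table like the one above can be produced by binning the hit lengths. The sketch below assumes that each Hit_Length label is an inclusive upper bound of its bin (for example, the 150 row counts lengths from 101 to 150); the report does not state this explicitly, so treat the convention as an assumption.

```python
# Sketch only: bin hit lengths under the ASSUMPTION that each label in the
# report's table is an inclusive upper bound of its bin.
from bisect import bisect_left
from collections import Counter

BIN_EDGES = [100, 150, 200, 250, 300, 350, 400, 450,
             500, 550, 600, 650, 700, 750, 800, 10000]


def bin_hit_lengths(lengths, edges=BIN_EDGES):
    """Return a list of (edge, count) pairs, one per bin edge."""
    counts = Counter()
    for length in lengths:
        # Index of the first edge that is >= length.
        idx = bisect_left(edges, length)
        if idx < len(edges):
            counts[edges[idx]] += 1
        # Lengths above the last edge are not counted in this sketch.
    return [(edge, counts[edge]) for edge in edges]


if __name__ == "__main__":
    for edge, count in bin_hit_lengths([90, 100, 101, 150, 799, 5000]):
        print(f"{edge}\t{count}")
```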

Alignments with matches shorter than 150 bp are deleted.
# remaining Alignments : 11090
# unique Features these remaining alignments represent: 9245
% of total features these alignments represent : 4.67 %
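
The length filter and the three summary figures can be sketched as follows. Alignments are represented here as hypothetical (feature_id, match_length) pairs; how the match length is derived from the PSL output is not stated in the report, so that step is left out.

```python
# Sketch only: length filter plus summary counts. The (feature_id, match_length)
# record layout is a hypothetical simplification of a PSL alignment.

MIN_MATCH_BP = 150  # alignments shorter than this are deleted


def filter_by_length(alignments, min_bp=MIN_MATCH_BP):
    """Keep alignments whose match length is at least min_bp."""
    return [a for a in alignments if a[1] >= min_bp]


def summarize(alignments, total_features):
    """Return (# alignments, # unique features, % of total features)."""
    n_features = len({feature for feature, _ in alignments})
    percent = round(100.0 * n_features / total_features, 2)
    return len(alignments), n_features, percent


if __name__ == "__main__":
    toy = [("PUT-1", 120), ("PUT-1", 300), ("PUT-2", 149), ("PUT-3", 150)]
    kept = filter_by_length(toy)
    print(summarize(kept, total_features=4))
```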

Frequency distribution of the remaining features
# hits	# features
--------	--------
1	 8132
2	 843
3	 79
4	 30
5	 77
6	 67
8	 17
9	 0
10	 0
20	 0
30	 0
40	 0
50	 0
100	 0

Features with more than three hits are deleted.
# remaining Alignments : 10055
# unique Features these remaining alignments represent: 9054
% of total features these alignments represent : 4.58 %
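
The multi-hit filter can be sketched in the same way, and the counts above can be cross-checked against the frequency table: keeping only features with 1, 2 or 3 hits gives 8132 + 843 + 79 = 9054 features and 8132 + 2*843 + 3*79 = 10055 alignments. The record layout (feature id as the first element of each alignment) is a hypothetical simplification.

```python
# Sketch only: drop every feature that has more than MAX_HITS alignments.
from collections import Counter

MAX_HITS = 3


def drop_multihit_features(alignments, max_hits=MAX_HITS):
    """Remove all alignments of features that hit more than max_hits times."""
    hits = Counter(a[0] for a in alignments)
    return [a for a in alignments if hits[a[0]] <= max_hits]


def expected_after_filter(freq_table, max_hits=MAX_HITS):
    """From a {hits: features} table, return (features kept, alignments kept)."""
    kept = {h: n for h, n in freq_table.items() if h <= max_hits}
    return sum(kept.values()), sum(h * n for h, n in kept.items())


# Frequency table as reported above (rows with zero features omitted).
REPORTED = {1: 8132, 2: 843, 3: 79, 4: 30, 5: 77, 6: 67, 8: 17}

if __name__ == "__main__":
    print(expected_after_filter(REPORTED))
```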

% Identity distribution of the remaining alignments
% Identity	# alignments
--------	--------
10	 0
20	 0
30	 1
40	 6
50	 13
60	 46
70	 187
80	 1005
90	 5655
95	 2317
100	 825

Following is the distribution of Gaps
Gaps	# features
--------	--------
1000	 8796
2000	 684
3000	 274
4000	 100
5000	 40
6000	 37
7000	 18
8000	 13
9000	 16
10000	 11

Following is the final summary
# alignments : 10055
# unique Features these alignments represent: 9054
% of total features these alignments represent : 4.58 %
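
The percentages in the three summaries follow from the unique-feature counts and the 197761 input features. A minimal check, assuming the report rounds to two decimal places:

```python
# Sketch only: reproduce the "% of total features" lines of the report.
TOTAL_FEATURES = 197761  # number of Wheat_ESTcluster_PlantGDB features aligned

# Unique-feature counts taken from the summaries in this report.
STAGES = {
    "initial": 16917,
    "after length filter": 9245,
    "final": 9054,
}


def percent_of_total(n_features, total=TOTAL_FEATURES):
    """Percentage of all input features, rounded to two decimals."""
    return round(100.0 * n_features / total, 2)


if __name__ == "__main__":
    for stage, n_features in STAGES.items():
        print(f"{stage}: {percent_of_total(n_features):.2f} %")
```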