This report describes the protocol used to align the Rice_ESTcluster_PlantGDB features to Maize_BACs_20060126.
Mon Feb 13 13:08:13 2006


Source of Rice_ESTcluster_PlantGDB: a set of EST clusters and singletons downloaded from the PlantGDB website.
http://www.plantgdb.org/download/Download/Sequence/ESTcontig/Oryza_sativa/Oryza_sativa.PUT.fasta.bz2

Alignment procedure details 
--------------------------- 

116515 Rice_ESTcluster_PlantGDB features are aligned to Maize_BACs_20060126 using blat with the parameter -minIdentity=50, followed by PslReps with -singleHit. The resulting alignments are then filtered using the procedure described below, which is applied in general to 'CrossSpecies-Coding' data sets.
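
The two-step alignment described above could be scripted roughly as follows. This is a minimal sketch, not the pipeline actually used: every file name (maize_bacs.fa, rice_est.fa, raw.psl, best.psl, best.psr) is a hypothetical placeholder, and only the -minIdentity=50 and -singleHit options come from this report. The argument order follows the usual "blat database query output.psl" and "pslReps in.psl out.psl out.psr" usage, and the executable is normally spelled pslReps.

```python
import shlex


def build_alignment_commands(target_fa, query_fa, raw_psl, best_psl, psr):
    """Return the blat and pslReps command lines as argument lists.

    All file names are hypothetical; only the two options come from the report.
    """
    blat_cmd = ["blat", target_fa, query_fa, "-minIdentity=50", raw_psl]
    reps_cmd = ["pslReps", "-singleHit", raw_psl, best_psl, psr]
    return blat_cmd, reps_cmd


blat_cmd, reps_cmd = build_alignment_commands(
    "maize_bacs.fa", "rice_est.fa", "raw.psl", "best.psl", "best.psr"
)

# Print the commands instead of running them. On a machine with both tools
# installed, each list could be passed to subprocess.run(cmd, check=True).
print(shlex.join(blat_cmd))
print(shlex.join(reps_cmd))
```

Building the commands as lists rather than shell strings avoids quoting problems if a path contains spaces.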

Initial summary
# alignments : 14837
# unique Features these alignments represent: 10123
% of total features these alignments represent : 8.69 %
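
The percentage above is the number of unique features with at least one alignment divided by the total number of input features. A minimal sketch of that arithmetic, using the counts from this report (the function name is illustrative only):

```python
def percent_of_total(unique_features, total_features):
    """Percentage of input features represented, rounded to two decimals."""
    return round(100.0 * unique_features / total_features, 2)


TOTAL_FEATURES = 116515  # Rice_ESTcluster_PlantGDB features given to blat

# 10123 unique features had at least one alignment before filtering.
print(percent_of_total(10123, TOTAL_FEATURES))
```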

The lengths of the matches are distributed as follows
Hit_Length	# alignments
--------	--------
100	 5658
150	 1831
200	 1670
250	 1356
300	 1047
350	 799
400	 513
450	 310
500	 265
550	 157
600	 159
650	 122
700	 110
750	 84
800	 72
10000	 684
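
A table like the one above can be produced by counting each alignment in the first bin whose label is greater than or equal to its match length. The sketch below assumes that the Hit_Length values are inclusive upper bounds and that the last bin (10000) collects everything longer than 800 bp. The report does not state this explicitly, and the sample lengths are made up.

```python
from bisect import bisect_left
from collections import Counter

# Assumed bin labels: 100, 150, ..., 800, then a catch-all bin of 10000.
BIN_UPPER_BOUNDS = list(range(100, 801, 50)) + [10000]


def bin_lengths(lengths, bounds=BIN_UPPER_BOUNDS):
    """Count lengths per bin, treating each bound as an inclusive upper limit."""
    counts = Counter()
    for length in lengths:
        idx = bisect_left(bounds, length)  # first bound >= length
        if idx == len(bounds):
            raise ValueError(f"length {length} exceeds the largest bin")
        counts[bounds[idx]] += 1
    return [(bound, counts[bound]) for bound in bounds]


# Hypothetical match lengths, for illustration only.
for bound, n in bin_lengths([42, 100, 101, 150, 151, 801]):
    if n:
        print(f"{bound}\t{n}")
```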

Alignments with a match length of less than 150 bp are deleted.
# remaining Alignments : 7379
# unique Features these remaining alignments represent: 5627
% of total features these alignments represent : 4.83 %
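
The length filter can be sketched as below. It assumes that the alignments are stored in PSL format, that "matches" refers to the first PSL column, and that the feature name is the tenth column (qName). The helper make_row and its sample rows are invented for illustration.

```python
MIN_MATCH_BP = 150  # threshold stated in this report


def parse_psl_line(line):
    """Extract the fields needed for filtering from one 21-column PSL row."""
    fields = line.rstrip("\n").split("\t")
    return {"matches": int(fields[0]), "qName": fields[9], "tName": fields[13]}


def filter_by_match_length(psl_lines, min_match=MIN_MATCH_BP):
    """Keep alignments whose match count is at least min_match."""
    kept = []
    for line in psl_lines:
        if not line.strip() or not line[0].isdigit():
            continue  # skip blank lines and PSL header lines
        record = parse_psl_line(line)
        if record["matches"] >= min_match:
            kept.append(record)
    return kept


def make_row(matches, q_name):
    """Build a placeholder PSL row (hypothetical values) for the demo."""
    cols = [str(matches)] + ["0"] * 7 + [
        "+", q_name, "0", "0", "0", "BAC_x", "0", "0", "0", "1", "0,", "0,", "0,"
    ]
    return "\t".join(cols)


rows = ["psLayout version 3", "", make_row(149, "PUT-a"), make_row(150, "PUT-b")]
print([r["qName"] for r in filter_by_match_length(rows)])
```

The number of unique features left after filtering is then the number of distinct qName values among the kept records.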

Frequency distribution of the remaining features
# hits	# features
--------	--------
1	 4923
2	 360
3	 36
4	 66
5	 151
6	 50
8	 41
9	 0
10	 0
20	 0
30	 0
40	 0
50	 0
100	 0

Features with more than three hits are deleted.
# remaining Alignments : 5751
# unique Features these remaining alignments represent: 5319
% of total features these alignments represent : 4.57 %
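
The hit-count filter can be sketched as follows. The code counts alignments per feature and drops every feature with more than three hits. The (feature, target) pairs are made up. The last lines re-derive the reported totals from the frequency table above, treating the rows for 1, 2 and 3 hits as exact hit counts.

```python
from collections import Counter

MAX_HITS = 3  # features with more hits than this are removed


def drop_multi_hit_features(alignments, max_hits=MAX_HITS):
    """alignments: list of (feature_name, target_name) pairs."""
    hits = Counter(feature for feature, _ in alignments)
    return [a for a in alignments if hits[a[0]] <= max_hits]


# Hypothetical data: PUT-1 hits once, PUT-2 hits four times.
sample = [("PUT-1", "BAC_a")] + [("PUT-2", f"BAC_{i}") for i in range(4)]
kept = drop_multi_hit_features(sample)
print(kept)

# Consistency check against the frequency table in this report.
kept_bins = {1: 4923, 2: 360, 3: 36}  # hits per feature -> number of features
remaining_features = sum(kept_bins.values())
remaining_alignments = sum(h * n for h, n in kept_bins.items())
print(remaining_features, remaining_alignments)
```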

% Identity distribution of the remaining alignments
% Identity	# alignments
--------	--------
10	 0
20	 0
30	 4
40	 4
50	 23
60	 58
70	 177
80	 779
90	 3448
95	 995
100	 263

The gaps are distributed as follows
Gaps	# features
--------	--------
1000	 4564
2000	 501
3000	 263
4000	 125
5000	 52
6000	 45
7000	 26
8000	 33
9000	 16
10000	 14

Final summary
# alignments : 5751
# unique Features these alignments represent: 5319
% of total features these alignments represent : 4.57 %