This report describes the protocol used to align the Sorghum_ESTcluster_PlantGDB features to Maize_BACs_20060126.
Mon Feb 13 14:23:34 2006


Source of Sorghum_ESTcluster_PlantGDB: a set of EST clusters and singletons downloaded from the PlantGDB website.
http://www.plantgdb.org/download/Download/Sequence/ESTcontig/Sorghum_bicolor/Sorghum_bicolor.PUT.fasta.bz2

Alignment procedure details 
--------------------------- 

41845 Sorghum_ESTcluster_PlantGDB features were aligned to Maize_BACs_20060126 using blat with -minIdentity=50, followed by pslReps with -singleHit. The resulting alignments were then filtered as described below; the same filtering is applied in general to 'CrossSpecies-Coding' data sets.
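
The two alignment steps above could be assembled roughly as in the following sketch. The file names and the output prefix are hypothetical (the report does not give the actual paths), and treating the maize BACs as the blat database and the sorghum EST clusters as the query is an assumption. Any blat or pslReps options beyond -minIdentity=50 and -singleHit are not stated in the report and are left out.

```python
import shlex

# Hypothetical file names; the report does not list the real paths.
DATABASE = "Maize_BACs_20060126.fa"     # assumed blat database (target)
QUERY = "Sorghum_bicolor.PUT.fasta"     # assumed blat query (EST clusters)

def build_commands(database, query, prefix="sorghum_vs_maize"):
    """Return the blat and pslReps command lines as argument lists."""
    raw_psl = f"{prefix}.psl"
    # blat usage: blat database query [options] output.psl
    blat_cmd = ["blat", database, query, "-minIdentity=50", raw_psl]
    # pslReps usage: pslReps [options] in.psl out.psl out.psr
    reps_cmd = ["pslReps", "-singleHit", raw_psl,
                f"{prefix}.reps.psl", f"{prefix}.psr"]
    return blat_cmd, reps_cmd

blat_cmd, reps_cmd = build_commands(DATABASE, QUERY)
# Print the commands instead of running them; pass the lists to
# subprocess.run(..., check=True) to execute them for real.
print(shlex.join(blat_cmd))
print(shlex.join(reps_cmd))
```

The commands are only printed here, so the sketch runs without blat installed.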

Initial summary
# alignments : 7279
# unique Features these alignments represent: 5874
% of total features these alignments represent : 14.04 %
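
The percentage in each summary block is the number of unique features divided by the 41845 input features. A minimal sketch of that arithmetic, using the counts reported in this document (the function name is illustrative only):

```python
TOTAL_FEATURES = 41845  # input Sorghum_ESTcluster_PlantGDB features

def percent_of_total(n_features, total=TOTAL_FEATURES):
    """Share of the input features, in percent, rounded to two decimals."""
    return round(100.0 * n_features / total, 2)

# Unique-feature counts taken from the summary blocks of this report.
for label, count in [("initial", 5874),
                     ("after length filter", 3892),
                     ("after hit-count filter", 3807)]:
    print(f"{label}: {percent_of_total(count):.2f} %")
```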

The match lengths are distributed as follows
Hit_Length	# alignments
--------	--------
100	 1663
150	 943
200	 793
250	 604
300	 501
350	 363
400	 330
450	 359
500	 315
550	 290
600	 226
650	 166
700	 130
750	 101
800	 59
10000	 436

Alignments with matches shorter than 150 bp are deleted
# remaining Alignments : 4686
# unique Features these remaining alignments represent: 3892
% of total features these alignments represent : 9.30 %
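
A minimal sketch of the length filter above, applied to PSL records. It assumes that the match length is the `matches` column (first field) of the PSL output; the report does not say which length measure was actually used. The record in the example is made up for illustration.

```python
MIN_MATCH_BP = 150  # threshold stated in the report

def parse_psl_line(line):
    """Extract the fields needed here from one tab-separated PSL data line."""
    fields = line.rstrip("\n").split("\t")
    return {
        "matches": int(fields[0]),  # number of matching bases
        "qName": fields[9],         # query (feature) name
        "tName": fields[13],        # target (BAC) name
    }

def filter_by_match_length(records, min_bp=MIN_MATCH_BP):
    """Keep alignments whose match count is at least min_bp."""
    return [r for r in records if r["matches"] >= min_bp]

def make_psl_line(matches, q_name, t_name):
    """Build a synthetic 21-column PSL line (illustration only)."""
    fields = ["0"] * 21
    fields[0], fields[9], fields[13] = str(matches), q_name, t_name
    return "\t".join(fields)

# Made-up example records, not data from the report.
lines = [make_psl_line(120, "PUT-1", "BAC-A"),
         make_psl_line(150, "PUT-2", "BAC-A"),
         make_psl_line(480, "PUT-3", "BAC-B")]
kept = filter_by_match_length([parse_psl_line(l) for l in lines])
print([r["qName"] for r in kept])
```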

Frequency distribution of the remaining features
# hits	# features
--------	--------
1	 3439
2	 336
3	 32
4	 19
5	 26
6	 22
8	 18
9	 0
10	 0
20	 0
30	 0
40	 0
50	 0
100	 0

Features with more than three hits are deleted.
# remaining Alignments : 4207
# unique Features these remaining alignments represent: 3807
% of total features these alignments represent : 9.10 %
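
The hit-count filter can be sketched as below, together with a check that the counts in this summary follow from the frequency table: features with 1, 2, or 3 hits give 3439 + 336 + 32 = 3807 features and 3439*1 + 336*2 + 32*3 = 4207 alignments. The record layout (a dict with a `qName` key) is an assumption made for illustration.

```python
from collections import Counter

MAX_HITS = 3  # features with more hits than this are dropped

def drop_multi_hit_features(records, max_hits=MAX_HITS):
    """Remove every alignment of any feature that has more than max_hits hits."""
    hits = Counter(r["qName"] for r in records)
    return [r for r in records if hits[r["qName"]] <= max_hits]

# Made-up example: feature "A" has 4 hits, feature "B" has 2.
demo = [{"qName": "A"}] * 4 + [{"qName": "B"}] * 2
print(len(drop_multi_hit_features(demo)))

# Consistency check using the non-zero rows of the frequency table above.
hits_table = {1: 3439, 2: 336, 3: 32, 4: 19, 5: 26, 6: 22, 8: 18}
kept_features = sum(n for h, n in hits_table.items() if h <= MAX_HITS)
kept_alignments = sum(h * n for h, n in hits_table.items() if h <= MAX_HITS)
print(kept_features, kept_alignments)  # 3807 4207
```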

% Identity distribution of the remaining features
% Identity	# features
--------	--------
10	 0
20	 0
30	 0
40	 3
50	 5
60	 20
70	 70
80	 230
90	 1085
95	 2344
100	 450

Distribution of gaps in the remaining features
Gaps	# features
--------	--------
1000	 3350
2000	 398
3000	 181
4000	 80
5000	 47
6000	 32
7000	 11
8000	 16
9000	 14
10000	 15

Final summary
# alignments : 4207
# unique Features these alignments represent: 3807
% of total features these alignments represent : 9.10 %