This report describes the protocol used to align the Maize_ESTcluster_PlantGDB features to Maize_BACs_20060126.
Mon Feb 13 21:07:07 2006


Source of Maize_ESTcluster_PlantGDB: a set of EST clusters and singletons downloaded from the PlantGDB website.
http://www.plantgdb.org/download/Download/Sequence/ESTcontig/Zea_mays/Zea_mays.PUT.fasta.bz2

Alignment procedure details 
--------------------------- 

The 86727 Maize_ESTcluster_PlantGDB features were aligned to Maize_BACs_20060126 using blat with the parameter -minScore=120, followed by pslReps with -singleHit. The resulting alignments were then filtered as described below; this filtering procedure is applied in general to 'Coding-SameSpecies' data sets.
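
As a rough illustration of the two-step run described above, the sketch below only assembles the command lines and does not execute them. The file names (Maize_BACs_20060126.fa, Zea_mays.PUT.fasta, raw.psl, best.psl, best.psr) are hypothetical placeholders, not the names used in the actual run; only -minScore=120 and -singleHit come from this report.

```python
def build_commands(database, query, raw_psl, best_psl, best_psr, min_score=120):
    """Return argv lists for the blat run and the pslReps pass.

    All file names are supplied by the caller; the ones used below are
    hypothetical and are not taken from the original pipeline.
    """
    # blat usage: blat database query [options] output.psl
    blat_cmd = ["blat", database, query, f"-minScore={min_score}", raw_psl]
    # pslReps usage: pslReps [options] in.psl out.psl out.psr
    reps_cmd = ["pslReps", "-singleHit", raw_psl, best_psl, best_psr]
    return blat_cmd, reps_cmd


blat_cmd, reps_cmd = build_commands(
    "Maize_BACs_20060126.fa",  # hypothetical database FASTA name
    "Zea_mays.PUT.fasta",      # hypothetical name of the unpacked query file
    "raw.psl",
    "best.psl",
    "best.psr",
)
print(" ".join(blat_cmd))
print(" ".join(reps_cmd))
```

The argv lists could be passed to subprocess.run on a machine where both tools are installed.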

Initial summary
# alignments : 16750
# unique Features these alignments represent: 12658
% of total features these alignments represent : 14.60 %

Distribution of the feature coverage
% coverage	# alignments
--------	--------
9	 86
19	 677
29	 1286
39	 1536
49	 1411
59	 1093
69	 1045
79	 1297
89	 2140
90	 339
91	 367
92	 397
93	 398
94	 529
95	 501
96	 545
97	 635
98	 666
99	 738
100	 1064

Alignments with less than 95 % coverage are deleted
# remaining Alignments : 3653
# unique Features these remaining alignments represent: 2948
% of total features these alignments represent : 3.40 %
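
A minimal sketch of a coverage filter of this kind follows. The report does not state how coverage is computed, so the definition used here (aligned query span divided by query length) is an assumption, as are the record layout and the feature names PUT-1 and PUT-2.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    """Subset of PSL-style fields needed for the coverage check."""
    q_name: str   # feature (query) name
    q_size: int   # full length of the feature
    q_start: int  # start of the aligned region on the feature
    q_end: int    # end of the aligned region on the feature


def coverage_pct(hit):
    # Assumed definition: aligned query span as a percentage of query length.
    return 100.0 * (hit.q_end - hit.q_start) / hit.q_size


def filter_by_coverage(hits, min_pct=95.0):
    # Keep alignments whose coverage is at least min_pct.
    return [h for h in hits if coverage_pct(h) >= min_pct]


hits = [
    Hit("PUT-1", 1000, 0, 990),   # hypothetical feature, 99 % coverage
    Hit("PUT-2", 800, 100, 500),  # hypothetical feature, 50 % coverage
]
kept = filter_by_coverage(hits)
print([h.q_name for h in kept])
```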

Gap distribution of the remaining features
Gaps (bp)	# alignments
--------	--------
1000	 2773
2000	 393
3000	 199
4000	 90
5000	 57
6000	 18
7000	 20
8000	 20
9000	 11
10000	 18
20000	 33

Alignments with gaps > 4000 bp are deleted
# remaining Alignments : 3455
# unique Features these remaining alignments represent: 2781
% of total features these alignments represent : 3.21 %
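
The gap filter could be sketched as below. The report does not define "gap", so this example assumes it means the largest target-side distance between consecutive alignment blocks (for example, an intron on the BAC), computed from PSL-style block sizes and target starts. The sample coordinates are invented for illustration.

```python
def max_target_gap(block_sizes, t_starts):
    """Largest distance between consecutive blocks on the target sequence.

    Assumes blocks are listed in ascending target order, as in PSL output.
    Returns 0 for a single-block alignment.
    """
    gaps = [
        next_start - (start + size)
        for start, size, next_start in zip(t_starts, block_sizes, t_starts[1:])
    ]
    return max(gaps, default=0)


def passes_gap_filter(block_sizes, t_starts, max_gap=4000):
    # Alignments with a gap larger than max_gap bp are deleted.
    return max_target_gap(block_sizes, t_starts) <= max_gap


# Invented example: gaps of 500 bp and 5000 bp between three blocks.
sizes = [100, 200, 50]
starts = [0, 600, 5800]
print(max_target_gap(sizes, starts), passes_gap_filter(sizes, starts))
```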

% Identity distribution of the remaining features
% Identity	# alignments
--------	--------
90	 0
91	 0
92	 0
93	 0
94	 10
95	 86
96	 210
97	 302
98	 646
99	 1438
100	 763

Frequency distribution of the remaining features
# hits	# features
--------	--------
1	 2302
2	 371
3	 65
4	 22
5	 13
6	 1
8	 5
9	 0
10	 1
20	 1
30	 0
40	 0
50	 0
100	 0

Features with more than four hits are deleted.
# remaining Alignments : 3327
# unique Features these remaining alignments represent: 2760
% of total features these alignments represent : 3.18 %
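
The final step can be sketched as follows, together with a check that the totals above agree with the frequency table. The function and its input layout (one feature name per alignment) are illustrative assumptions; the counts in the check are copied from this report.

```python
from collections import Counter


def drop_multi_hit_features(feature_ids, max_hits=4):
    """Keep alignments only for features that have at most max_hits alignments.

    feature_ids holds one feature name per alignment (assumed layout).
    """
    counts = Counter(feature_ids)
    return [f for f in feature_ids if counts[f] <= max_hits]


# Rows of the frequency table for features with 1 to 4 hits (from the report).
hits_to_features = {1: 2302, 2: 371, 3: 65, 4: 22}
total_features = 86727

remaining_features = sum(hits_to_features.values())
remaining_alignments = sum(h * n for h, n in hits_to_features.items())
pct_of_total = 100.0 * remaining_features / total_features

print(remaining_alignments, remaining_features, f"{pct_of_total:.2f} %")
```

The three values reproduce the reported 3327 alignments, 2760 features, and 3.18 %.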