This report describes the protocol used to align the Maize_MethylFilter_CSHL features to Maize_BACs_20060126.
Mon Feb 13 23:45:35 2006


Source of Maize_MethylFilter_CSHL: methyl-filtered CSHL maize sequence, downloaded from GenBank with the query '(txid4577[ORGN] AND McCombie[AUTH] AND methyl[TITL] AND 2002[MDAT])'

Alignment procedure details 
--------------------------- 

66390 Maize_MethylFilter_CSHL features are aligned to Maize_BACs_20060126 using blat with -minScore=160, followed by pslReps with -minAli=0.90 -nearTop=0.01 -singleHit. The resulting alignments are then passed through the filtering procedure described below, which is applied in general to 'Same Species Genomic' data sets.
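
The two-step alignment above could be scripted roughly as follows. This is a minimal sketch, not the pipeline's actual code: all file names are hypothetical placeholders, and the positional argument order (database, query, output for blat; input PSL, output PSL, output PSR for pslReps) is assumed from the tools' standard usage rather than taken from this report. The function only builds the command lines and does not run them.

```python
def build_alignment_commands(database_fa, query_fa, raw_psl, best_psl, psr):
    """Return the blat and pslReps command lines as argument lists.

    Only the option values (-minScore, -minAli, -nearTop, -singleHit)
    come from the report; every path is supplied by the caller.
    """
    blat_cmd = ["blat", database_fa, query_fa, "-minScore=160", raw_psl]
    psl_reps_cmd = [
        "pslReps",
        "-minAli=0.90",
        "-nearTop=0.01",
        "-singleHit",
        raw_psl,
        best_psl,
        psr,
    ]
    return blat_cmd, psl_reps_cmd


# Hypothetical file names, for illustration only.
blat_cmd, psl_reps_cmd = build_alignment_commands(
    "Maize_BACs_20060126.fa",
    "Maize_MethylFilter_CSHL.fa",
    "raw.psl",
    "best.psl",
    "best.psr",
)
print(" ".join(blat_cmd))
print(" ".join(psl_reps_cmd))
```

The argument lists can be passed to `subprocess.run(cmd, check=True)` once the paths point at real files.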

Initial summary
# alignments : 49122
# unique Features these alignments represent: 22707
% of total features these alignments represent : 34.20 %
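
The percentage in the summary appears to be the number of unique aligned features divided by the 66390 input features. A minimal sketch of that arithmetic follows; the function name is illustrative and not part of the pipeline.

```python
TOTAL_FEATURES = 66390  # input features, from the procedure details


def percent_of_total(unique_features, total=TOTAL_FEATURES):
    """Return unique_features as a percentage of total, formatted to 2 decimals."""
    return f"{100.0 * unique_features / total:.2f}"


print(percent_of_total(22707))  # -> 34.20
```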

Following is the gap distribution (gap sizes in bp)
Gaps	# alignments
--------	--------
0	 19689
1	 6166
2	 2689
3	 1566
4	 1038
5	 836
6	 707
7	 559
8	 539
9	 530
10	 394
20	 2263
30	 1060
40	 663
50	 474
60	 269
70	 247
80	 165
90	 148
100	 137
200	 940
300	 381
400	 338
500	 239
600	 139
700	 50
800	 90
900	 88
10000	 2265

Features with gaps  > 40 bp are deleted 
# remaining Alignments : 38699
# unique Features these remaining alignments represent: 18742
% of total features these alignments represent : 28.23 %
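
The remaining-alignment count can be reproduced from the gap table, assuming that each label from 20 upward is the inclusive upper edge of its bin (so the row labelled 40 holds gaps of 31 to 40 bp). That reading is inferred from the numbers and is not stated by the report. The sketch below only sums the table rows; it does not recompute gaps from the alignments.

```python
# Gap table rows as (bin label, number of alignments), copied from the report.
GAP_HISTOGRAM = [
    (0, 19689), (1, 6166), (2, 2689), (3, 1566), (4, 1038), (5, 836),
    (6, 707), (7, 559), (8, 539), (9, 530), (10, 394), (20, 2263),
    (30, 1060), (40, 663), (50, 474), (60, 269), (70, 247), (80, 165),
    (90, 148), (100, 137), (200, 940), (300, 381), (400, 338), (500, 239),
    (600, 139), (700, 50), (800, 90), (900, 88), (10000, 2265),
]


def alignments_within_gap_limit(histogram, max_gap=40):
    """Sum the alignments in every bin whose label is at most max_gap."""
    return sum(count for label, count in histogram if label <= max_gap)


print(alignments_within_gap_limit(GAP_HISTOGRAM))  # -> 38699
```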

Following is the distribution of alignments by feature coverage
%coverage	# alignments
--------	--------
9	 0
19	 0
29	 230
39	 576
49	 569
59	 604
69	 668
79	 708
89	 1820
90	 637
91	 968
92	 1312
93	 1568
94	 2403
95	 2928
96	 3749
97	 4495
98	 5251
99	 4662
100	 5551

Features with less than 90 % coverage are deleted.
# remaining Alignments : 32914
# unique Features these remaining alignments represent: 14974
% of total features these alignments represent : 22.55 %

% Identity distribution of the remaining features
% Identity	# alignments
--------	--------
90	 344
91	 569
92	 1014
93	 1685
94	 2468
95	 3545
96	 4753
97	 5749
98	 5522
99	 4555
100	 2710

Features with less than 92 % identity are deleted.
# remaining Alignments : 32001
# unique Features these remaining alignments represent: 14491
% of total features these alignments represent : 21.83 %
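
The identity filter can be checked against the table in the same way, assuming each label is the whole-percent floor of the identity (so the row labelled 91 holds alignments from 91 % up to, but not including, 92 %). This binning is an inference from the counts, not something the report states.

```python
# Identity table rows as (percent identity label, number of alignments).
IDENTITY_HISTOGRAM = [
    (90, 344), (91, 569), (92, 1014), (93, 1685), (94, 2468), (95, 3545),
    (96, 4753), (97, 5749), (98, 5522), (99, 4555), (100, 2710),
]


def alignments_at_or_above_identity(histogram, min_identity=92):
    """Sum the alignments in every bin whose label is at least min_identity."""
    return sum(count for label, count in histogram if label >= min_identity)


# All bins together equal the count left after the coverage filter.
print(sum(count for _, count in IDENTITY_HISTOGRAM))        # -> 32914
print(alignments_at_or_above_identity(IDENTITY_HISTOGRAM))  # -> 32001
```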

Frequency distribution of the remaining features
# hits	# features
--------	--------
1	 8309
2	 2724
3	 1234
4	 733
5	 453
6	 321
8	 344
9	 97
10	 59
20	 184
30	 24
40	 5
50	 1
100	 3

Features with more than three hits are deleted.
# remaining Alignments : 17459
# unique Features these remaining alignments represent: 12267
% of total features these alignments represent : 18.48 %
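
The final counts follow from the first three rows of the frequency table: features with one, two, or three hits are kept, and each kept feature contributes as many alignments as it has hits. A minimal sketch of that bookkeeping, using only the table rows (the function name is illustrative):

```python
# Frequency table rows as (hits per feature, number of features).
# Only rows 1 to 3 matter for the result; the rest are listed for completeness.
HIT_HISTOGRAM = [
    (1, 8309), (2, 2724), (3, 1234), (4, 733), (5, 453), (6, 321),
    (8, 344), (9, 97), (10, 59), (20, 184), (30, 24), (40, 5),
    (50, 1), (100, 3),
]


def keep_low_copy_features(histogram, max_hits=3):
    """Return (features kept, alignments kept) for features with <= max_hits hits."""
    kept = [(hits, count) for hits, count in histogram if hits <= max_hits]
    features = sum(count for _, count in kept)
    alignments = sum(hits * count for hits, count in kept)
    return features, alignments


print(keep_low_copy_features(HIT_HISTOGRAM))  # -> (12267, 17459)
```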