This reports the protocol used to align the Maize_WGS_JGI features to Maize_BACs_20060126.
Sat Feb 18 14:16:59 2006


Source of Maize_WGS_JGI : These are Maize WGS reads from the Joint Genome Institute, obtained by Bonnie

Alignment procedure details 
--------------------------- 

1124441 Maize_WGS_JGI features are aligned to Maize_BACs_20060126 using blat with -minScore=160, followed by pslReps with -minAli=0.90 -nearTop=0.01 -singleHit. The resulting alignments are then filtered by the procedure described below, which is applied in general to 'Same Species Genomic' data sets.
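
As a rough illustration of the two alignment steps, the sketch below assembles the corresponding command lines. Only the flags (-minScore, -minAli, -nearTop, -singleHit) come from this report; every file name is a hypothetical placeholder, and the commands are printed rather than executed.

```python
import shlex

# Hypothetical file names (assumptions); the report does not name the files.
database = "Maize_BACs_20060126.fa"
query = "Maize_WGS_JGI.fa"

# blat usage: blat database query [options] output.psl
blat_cmd = ["blat", database, query, "-minScore=160", "raw.psl"]

# pslReps usage: pslReps [options] in.psl out.psl out.psr
pslreps_cmd = ["pslReps", "-minAli=0.90", "-nearTop=0.01", "-singleHit",
               "raw.psl", "best.psl", "best.psr"]

def render(cmd):
    """Join an argument list into a shell-safe command string."""
    return " ".join(shlex.quote(part) for part in cmd)

print(render(blat_cmd))
print(render(pslreps_cmd))
```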

Initial summary
# alignments : 1868880
# unique Features these alignments represent: 833189
% of total features these alignments represent : 74.10 %
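
The percentage above appears to be the number of unique aligned features divided by the 1124441 input features. A minimal check, assuming that definition:

```python
total_features = 1124441    # features submitted for alignment
aligned_features = 833189   # unique features with at least one alignment

pct = 100.0 * aligned_features / total_features
print(f"{pct:.2f} %")  # 74.10 %
```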

Following is the gap distribution (bin labels above 10 are upper bounds, in bp)
Gaps (bp)	# alignments
--------	--------
0		114537
1		130164
2		127578
3		107577
4		84617
5		68829
6		58463
7		51786
8		46786
9		42439
10		38325
20		240949
30		116422
40		62394
50		39150
60		28651
70		22046
80		15250
90		12306
100		9395
200		56766
300		22434
400		11616
500		7815
600		4359
700		3436
800		2762
900		2654
10000		87023

Features with gaps > 40 bp are deleted.
# remaining Alignments : 1290866
# unique Features these alignments represent: 562029
% of total features these alignments represent : 49.98 %
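
The remaining-alignment count can be reproduced from the gap table, assuming each bin label above 10 is an inclusive upper bound. Only the bins up to 40 are needed for this check:

```python
# Gap bins up to 40 bp, copied from the gap distribution table.
gap_bins = {
    0: 114537, 1: 130164, 2: 127578, 3: 107577, 4: 84617, 5: 68829,
    6: 58463, 7: 51786, 8: 46786, 9: 42439, 10: 38325,
    20: 240949, 30: 116422, 40: 62394,
}

# Alignments kept by the "gaps <= 40 bp" filter.
kept = sum(count for upper, count in gap_bins.items() if upper <= 40)
print(kept)  # 1290866

# Share of the 1124441 input features still represented.
print(f"{100.0 * 562029 / 1124441:.2f} %")  # 49.98 %
```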

Following is the distribution by feature coverage
%coverage	# alignments
--------	--------
9		0
19		8220
29		46428
39		31528
49		31290
59		32536
69		35340
79		47181
89		197298
90		63999
91		79345
92		97686
93		111683
94		121593
95		121534
96		108400
97		87847
98		51519
99		15285
100		2154

Features with less than 90 % coverage are deleted.
# remaining Alignments : 797783
# unique Features these alignments represent: 292734
% of total features these alignments represent : 26.03 %
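
The report does not say how coverage is computed. The sketch below assumes it is the aligned span of the feature divided by the feature length; the Alignment record and its field names are hypothetical, loosely modeled on PSL query columns.

```python
from typing import NamedTuple

class Alignment(NamedTuple):
    """Hypothetical record; field names are assumptions, not from the report."""
    feature: str
    q_size: int    # length of the feature (read)
    q_start: int   # start of the aligned span on the feature
    q_end: int     # end of the aligned span on the feature

def coverage_pct(aln):
    """Assumed definition: aligned span as a percentage of feature length."""
    return 100.0 * (aln.q_end - aln.q_start) / aln.q_size

def keep_by_coverage(alignments, min_pct=90.0):
    """Drop alignments whose feature coverage is below min_pct."""
    return [a for a in alignments if coverage_pct(a) >= min_pct]

# Toy data for illustration only.
toy = [Alignment("read1", 1000, 0, 950), Alignment("read2", 1000, 100, 900)]
print([a.feature for a in keep_by_coverage(toy)])  # ['read1']
```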

% Identity distribution of the remaining features
% Identity	# alignments
--------	--------
90		483
91		2949
92		11465
93		29329
94		64467
95		114114
96		159095
97		171835
98		155266
99		85780
100		3000

Features with less than 92 % identity are deleted.
# remaining Alignments : 794351
# unique Features these alignments represent: 290764
% of total features these alignments represent : 25.86 %
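
The identity filter can be checked against the table, assuming each bin label is the integer floor of the percent identity (so bins 90 and 91 fall below the cutoff):

```python
# Percent-identity bins copied from the table above.
identity_bins = {
    90: 483, 91: 2949, 92: 11465, 93: 29329, 94: 64467, 95: 114114,
    96: 159095, 97: 171835, 98: 155266, 99: 85780, 100: 3000,
}

total = sum(identity_bins.values())
kept = sum(count for pct, count in identity_bins.items() if pct >= 92)

print(total)  # 797783, the alignments entering this step
print(kept)   # 794351, the alignments remaining
```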

Frequency distribution of the remaining features
# hits	# features
--------	--------
1		129306
2		59998
3		32951
4		21096
5		15586
6		8813
8		10933
9		2968
10		2196
20		6212
30		614
40		80
50		10
100		1

Features with more than three hits are deleted.
# remaining Alignments : 348155
# unique Features these alignments represent: 222255
% of total features these alignments represent : 19.77 %
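
The final counts follow from the first three rows of the frequency table: features with one to three hits are kept, and each contributes as many alignments as it has hits. A minimal check:

```python
# Features with 1, 2, or 3 hits, copied from the frequency table.
hit_bins = {1: 129306, 2: 59998, 3: 32951}

features_kept = sum(hit_bins.values())
alignments_kept = sum(hits * count for hits, count in hit_bins.items())

print(features_kept)    # 222255
print(alignments_kept)  # 348155
print(f"{100.0 * features_kept / 1124441:.2f} %")  # 19.77 %
```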