This report describes the protocol used to align the Barley_EST features to Maize_BACs_20060126.
Mon Feb 13 11:57:50 2006


Source of Barley_EST: downloaded from GenBank with the query 'txid4512[orgn] AND gbdiv_est[PROP]'

Alignment procedure details 
--------------------------- 

The 399369 Barley_EST features are aligned to Maize_BACs_20060126 using blat with -minIdentity=50, followed by pslReps with -singleHit. The resulting alignments are then filtered as described below; this filtering is applied in general to 'CrossSpecies-Coding' data sets.
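The two alignment steps could be driven from a script along the following lines. This is only a sketch: the report gives the options (-minIdentity=50, -singleHit) but not the file names, so every path below is a hypothetical placeholder, and treating the maize BACs as the blat database and the barley ESTs as the query is an assumption based on the wording above. The commands are only printed here, not executed.

```python
from typing import List


def build_alignment_commands(
    database_fa: str,
    query_fa: str,
    raw_psl: str,
    best_psl: str,
    psr_out: str,
) -> List[List[str]]:
    """Return argv lists for the blat step and the pslReps step."""
    # blat usage: blat database query [options] output.psl
    blat_cmd = ["blat", database_fa, query_fa, "-minIdentity=50", raw_psl]
    # pslReps usage: pslReps [options] in.psl out.psl out.psr
    psl_reps_cmd = ["pslReps", "-singleHit", raw_psl, best_psl, psr_out]
    return [blat_cmd, psl_reps_cmd]


# Hypothetical file names; the report does not state the real ones.
commands = build_alignment_commands(
    database_fa="Maize_BACs_20060126.fa",
    query_fa="Barley_EST.fa",
    raw_psl="barley_vs_maize.psl",
    best_psl="barley_vs_maize.best.psl",
    psr_out="barley_vs_maize.psr",
)

for cmd in commands:
    print(" ".join(cmd))
```

Each argv list could be passed to subprocess.run(cmd, check=True) on a machine where blat and pslReps are installed.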

Initial summary
# alignments : 62423
# unique Features these alignments represent: 46813
% of total features these alignments represent : 11.72 %
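The "% of total features" figure appears to be the number of unique features divided by the 399369 input Barley_EST features. A minimal sketch of that arithmetic, using only counts quoted in this report (the function name is illustrative):

```python
TOTAL_FEATURES = 399369  # number of Barley_EST features submitted to blat


def percent_of_total(unique_features: int, total: int = TOTAL_FEATURES) -> float:
    """Percentage of all input features that still have an alignment."""
    return round(100.0 * unique_features / total, 2)


# Unique-feature counts quoted in the summaries of this report.
stages = [
    ("initial", 46813),
    ("after length filter", 30852),
    ("after hit-count filter", 29117),
]

for label, unique_features in stages:
    print(f"{label}: {percent_of_total(unique_features):.2f} %")
```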

The match lengths are distributed as follows
Hit_Length	# alignments
--------	--------
100	 11241
150	 7204
200	 8864
250	 6034
300	 5779
350	 5400
400	 4545
450	 3621
500	 3363
550	 2534
600	 2137
650	 1344
700	 272
750	 46
800	 22
10000	 17

Alignments with matches shorter than 150 bp are deleted.
# remaining Alignments : 44098
# unique Features these remaining alignments represent: 30852
% of total features these alignments represent : 7.73 %
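The length filter can be sketched as follows. The record type, the field names, and the toy records are invented for illustration; the report does not say how the match length is derived from the blat output, so it is taken here as a precomputed value in bp.

```python
from typing import List, NamedTuple, Tuple


class Alignment(NamedTuple):
    feature: str       # EST identifier (query); hypothetical names below
    target: str        # BAC identifier (target)
    match_length: int  # length of the match in bp (derivation is an assumption)


MIN_MATCH_LENGTH = 150  # threshold stated in the report


def drop_short_alignments(
    alignments: List[Alignment], min_length: int = MIN_MATCH_LENGTH
) -> List[Alignment]:
    """Keep alignments whose match is at least min_length bp long."""
    return [a for a in alignments if a.match_length >= min_length]


def summarize(alignments: List[Alignment]) -> Tuple[int, int]:
    """Return (number of alignments, number of unique features)."""
    return len(alignments), len({a.feature for a in alignments})


# Toy records, not real data.
toy_alignments = [
    Alignment("EST_A", "BAC_1", 149),
    Alignment("EST_A", "BAC_2", 150),
    Alignment("EST_B", "BAC_1", 420),
    Alignment("EST_C", "BAC_3", 98),
]

kept = drop_short_alignments(toy_alignments)
n_alignments, n_features = summarize(kept)
print(n_alignments, n_features)
```

In the reported data the filter removes 62423 - 44098 = 18325 alignments, which is more than the 11241 alignments in the 100 bin. This suggests that each Hit_Length label is the upper bound of its bin, so part of the 150 bin falls below the threshold.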

Frequency distribution of the remaining features
# hits	# features
--------	--------
1	 24620
2	 4049
3	 448
4	 342
5	 517
6	 381
8	 494
9	 0
10	 0
20	 1
30	 0
40	 0
50	 0
100	 0

Features with more than three hits are deleted.
# remaining Alignments : 34062
# unique Features these remaining alignments represent: 29117
% of total features these alignments represent : 7.29 %
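A possible implementation of the hit-count filter, followed by a check of the reported totals against the frequency table. The function name and the toy pairs are illustrative assumptions; the bin counts are taken from the table above.

```python
from collections import Counter
from typing import List, Tuple

MAX_HITS = 3  # features with more hits than this are removed


def drop_multi_hit_features(
    alignments: List[Tuple[str, str]], max_hits: int = MAX_HITS
) -> List[Tuple[str, str]]:
    """Drop every alignment of a feature that has more than max_hits hits."""
    hits_per_feature = Counter(feature for feature, _target in alignments)
    return [
        (feature, target)
        for feature, target in alignments
        if hits_per_feature[feature] <= max_hits
    ]


# Toy (feature, target) pairs, not real data: EST_C has four hits.
toy_alignments = [
    ("EST_A", "BAC_1"),
    ("EST_B", "BAC_1"),
    ("EST_B", "BAC_2"),
    ("EST_C", "BAC_1"),
    ("EST_C", "BAC_2"),
    ("EST_C", "BAC_3"),
    ("EST_C", "BAC_4"),
]
kept = drop_multi_hit_features(toy_alignments)

# Consistency check against the reported frequency table (# hits -> # features).
reported_bins = {1: 24620, 2: 4049, 3: 448}
features_kept = sum(reported_bins.values())
alignments_kept = sum(hits * count for hits, count in reported_bins.items())
print(features_kept, alignments_kept)
```

The two sums reproduce the 29117 unique features and 34062 alignments reported after this step.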

% Identity distribution of the remaining alignments
% Identity	# alignments
--------	--------
10	 0
20	 0
30	 3
40	 5
50	 14
60	 99
70	 522
80	 2516
90	 17166
95	 9856
100	 3881
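As a sanity check, the identity bins can be summed and compared with the counts reported after the hit-count filter. The sketch below uses only numbers from this report; the variable names are illustrative.

```python
# % identity bin label -> count, copied from the table above.
identity_bins = {
    10: 0, 20: 0, 30: 3, 40: 5, 50: 14, 60: 99,
    70: 522, 80: 2516, 90: 17166, 95: 9856, 100: 3881,
}

REMAINING_ALIGNMENTS = 34062  # reported after the hit-count filter
REMAINING_FEATURES = 29117    # reported after the hit-count filter

total = sum(identity_bins.values())
print(total, total == REMAINING_ALIGNMENTS, total == REMAINING_FEATURES)
```

The bins add up to the number of remaining alignments rather than the number of unique features, which suggests the table counts one entry per alignment.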

The gaps are distributed as follows
Gaps	# features
--------	--------
1000	 30669
2000	 2022
3000	 708
4000	 266
5000	 86
6000	 42
7000	 32
8000	 47
9000	 21
10000	 51

Final summary
# alignments : 34062
# unique Features these alignments represent: 29117
% of total features these alignments represent : 7.29 %