This reports the protocol used to align the Sorghum_cluster_Pratt features to Maize_BACs.
 Kiran Ratnapu 
Tue Mar 29 18:52:16 2005


Source of Sorghum_cluster_Pratt: Marie-Michele Cordonnier-Pratt and Lee Pratt

Alignment procedure details 
--------------------------- 

25772 Sorghum_cluster_Pratt features are aligned to Maize_BACs using blat with -minIdentity=50, followed by PslReps with -singleHit. The resulting alignments are then filtered by the procedure described below, which is applied in general to 'CrossSpecies-Coding' data sets.
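
The two commands in this step can be sketched as argument lists in Python. Only -minIdentity=50 and -singleHit come from this report; the file names are hypothetical placeholders and the actual paths used in the original run are not known.

```python
import shlex


def build_commands(database="Maize_BACs.fa", query="Sorghum_cluster_Pratt.fa"):
    """Return the blat and pslReps argument lists for this alignment step.

    All file names are hypothetical placeholders (assumption).
    """
    raw_psl = "sorghum_vs_maize.raw.psl"    # hypothetical blat output
    best_psl = "sorghum_vs_maize.best.psl"  # hypothetical pslReps output
    best_psr = "sorghum_vs_maize.best.psr"  # hypothetical pslReps repeat report

    # blat usage: blat database query [options] output.psl
    blat_cmd = ["blat", database, query, "-minIdentity=50", raw_psl]

    # pslReps usage: pslReps [options] in.psl out.psl out.psr
    reps_cmd = ["pslReps", "-singleHit", raw_psl, best_psl, best_psr]
    return blat_cmd, reps_cmd


def as_shell(cmd):
    """Render an argument list as a single shell-quoted string."""
    return shlex.join(cmd)
```

Each list can be passed to subprocess.run(cmd, check=True) on a machine where blat and pslReps are installed; as_shell(cmd) only prints the command for inspection.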

Initial summary
# alignments : 3991
# unique Features these alignments represent: 3183
% of total features these alignments represent : 12.35 %
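
The percentage in each summary block is the number of unique features divided by the 25772 input features. A minimal sketch of that arithmetic (the function name is illustrative only):

```python
TOTAL_FEATURES = 25772  # number of Sorghum_cluster_Pratt features aligned


def percent_of_total(unique_features, total=TOTAL_FEATURES):
    """Percentage of all input features represented, rounded to 2 decimals."""
    return round(100.0 * unique_features / total, 2)


initial_percent = percent_of_total(3183)  # unique features in the initial summary
```

The same function applied to the unique-feature counts reported after each filtering step (2159 and 2098) gives the percentages quoted in those summaries.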

The lengths of the matches are distributed as follows
Hit_Length	# alignments
--------	--------
100	 851
150	 426
200	 363
250	 296
300	 298
350	 300
400	 292
450	 263
500	 232
550	 198
600	 170
650	 123
700	 79
750	 44
800	 30
10000	 26
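
A table of this shape can be produced by counting values per bin. The sketch below assumes that each label is the inclusive upper edge of its bin (for example, the 150 row counts lengths from 101 to 150); the report does not state this convention, so it is an assumption.

```python
from bisect import bisect_left

LENGTH_BIN_EDGES = [100, 150, 200, 250, 300, 350, 400, 450,
                    500, 550, 600, 650, 700, 750, 800, 10000]


def bin_counts(values, edges):
    """Count values per bin, treating each edge as an inclusive upper bound.

    Values larger than the last edge are counted under the key "overflow".
    """
    counts = {edge: 0 for edge in edges}
    counts["overflow"] = 0
    for value in values:
        # Index of the first edge that is >= value.
        i = bisect_left(edges, value)
        if i < len(edges):
            counts[edges[i]] += 1
        else:
            counts["overflow"] += 1
    return counts
```

For example, bin_counts([50, 100, 101, 150, 151], [100, 150, 200]) places two values in the 100 bin, two in the 150 bin and one in the 200 bin.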

Alignments with matches shorter than 150 bp are deleted
# remaining Alignments : 2719
# unique Features these remaining alignments represent: 2159
% of total features these alignments represent : 8.38 %
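
A minimal sketch of the length filter, assuming the alignments are in PSL format and that the match length is the first PSL column (matches). The report does not say which length measure was used, so that choice is an assumption, as are the function names.

```python
MIN_MATCH_BP = 150  # alignments with fewer matching bases are dropped


def is_data_line(line):
    """PSL header lines do not start with an integer; data lines do."""
    return line.split("\t", 1)[0].isdigit()


def parse_psl_line(line):
    """Extract the few PSL columns needed here (0-based column indices)."""
    fields = line.rstrip("\n").split("\t")
    return {
        "matches": int(fields[0]),  # column 0: number of matching bases
        "q_name": fields[9],        # column 9: query (feature) name
        "t_name": fields[13],       # column 13: target (BAC) name
    }


def filter_by_length(lines, min_bp=MIN_MATCH_BP):
    """Keep alignments whose match count is at least min_bp."""
    kept = []
    for line in lines:
        if not is_data_line(line):
            continue  # skip header and blank lines
        record = parse_psl_line(line)
        if record["matches"] >= min_bp:
            kept.append(record)
    return kept
```

An alignment with exactly 150 matching bases is kept, since only matches shorter than 150 bp are removed.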

Frequency distribution of the remaining features
# hits	# features
--------	--------
1	 1878
2	 202
3	 18
4	 8
5	 8
6	 20
8	 25
9	 0
10	 0
20	 0
30	 0
40	 0
50	 0
100	 0

Features that hit more than three times are deleted.
# remaining Alignments : 2336
# unique Features these remaining alignments represent: 2098
% of total features these alignments represent : 8.14 %
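
The multi-hit filter can be sketched as follows, taking one feature name per alignment as input (the input representation is an assumption). The final lines check the reported totals against the frequency table: features with 1, 2 or 3 hits account for all remaining alignments and features.

```python
from collections import Counter

MAX_HITS = 3  # features with more hits than this are removed entirely


def drop_multi_hit_features(alignment_features, max_hits=MAX_HITS):
    """Given one feature name per alignment, keep only alignments whose
    feature occurs at most max_hits times in the whole list."""
    hits = Counter(alignment_features)
    return [name for name in alignment_features if hits[name] <= max_hits]


# Consistency check using the frequency table in this report.
kept_histogram = {1: 1878, 2: 202, 3: 18}  # number of hits -> number of features
remaining_alignments = sum(n_hits * n_features
                           for n_hits, n_features in kept_histogram.items())
remaining_features = sum(kept_histogram.values())
```

Here remaining_alignments evaluates to 2336 and remaining_features to 2098, matching the summary.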

% Identity distribution of the remaining alignments
% Identity	# alignments
--------	--------
10	 0
20	 0
30	 0
40	 1
50	 0
60	 12
70	 16
80	 97
90	 529
95	 1423
100	 258
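
The report does not state how % identity was computed. As an illustration only, one simple definition from PSL columns is matching bases over all aligned bases, ignoring gaps; this formula is an assumption and may differ from the one used to build the table.

```python
def percent_identity(matches, mismatches, rep_matches=0):
    """Assumed definition: matching bases (including matches in repeats)
    divided by all aligned bases, as a percentage. Gaps are ignored."""
    aligned = matches + mismatches + rep_matches
    if aligned == 0:
        return 0.0  # no aligned bases, so identity is undefined; report 0
    return 100.0 * (matches + rep_matches) / aligned
```

For example, an alignment with 190 matching and 10 mismatching bases has percent_identity(190, 10) equal to 95.0.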

Following is the distribution of Rice gaps
Rice_Gaps	# features
--------	--------
1000	 1994
2000	 180
3000	 74
4000	 21
5000	 15
6000	 11
7000	 10
8000	 5
9000	 4
10000	 4

Following is the final summary
# alignments : 2336
# unique Features these alignments represent: 2098
% of total features these alignments represent : 8.14 %