This report describes the protocol used to align the markersdb_est_notmapped features to tigrv4-genome.
Thu Jul 20 01:54:13 2006


Source of markersdb_est_notmapped : The markers db ESTs that have not been mapped

Alignment procedure details 
--------------------------- 

138423 markersdb_est_notmapped features were aligned to tigrv4-genome using blat with the parameter -minIdentity=50, followed by PslReps with -singleHit. The resulting alignments were then filtered as described below. This filtering procedure is applied in general to 'CrossSpecies-Coding' data sets.
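
As a rough illustration of the two steps above, the following Python sketch assembles the two command lines without running them. The file names and the helper build_commands are hypothetical. Only the options -minIdentity=50 and -singleHit come from this report, and the lower-case binary name pslReps is an assumption about the local installation.

```python
def build_commands(genome_fa, est_fa, raw_psl, best_psl, best_psr):
    """Return the blat and pslReps argument lists (nothing is executed)."""
    # blat takes the database first, then the query, then the output file.
    blat_cmd = ["blat", genome_fa, est_fa, "-minIdentity=50", raw_psl]
    # pslReps reads the raw PSL and writes a filtered PSL plus a .psr file.
    reps_cmd = ["pslReps", "-singleHit", raw_psl, best_psl, best_psr]
    return blat_cmd, reps_cmd


# Hypothetical file names, used only for illustration.
blat_cmd, reps_cmd = build_commands(
    "tigrv4_genome.fa",
    "markersdb_est_notmapped.fa",
    "raw.psl",
    "best.psl",
    "best.psr",
)
print(" ".join(blat_cmd))
print(" ".join(reps_cmd))
```

The lists can be passed to subprocess.run if the tools are installed and on the PATH.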

Initial summary
# alignments : 104233
# unique Features these alignments represent: 92080
% of total features these alignments represent : 66.52 %
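
The percentage above is the number of unique features divided by the 138423 input features. A minimal sketch of that arithmetic, with the hypothetical helper name percent_of_total:

```python
TOTAL_FEATURES = 138423  # number of input features stated in this report


def percent_of_total(unique_features, total=TOTAL_FEATURES):
    """Percentage of all input features, rounded to two decimals."""
    return round(100.0 * unique_features / total, 2)


print(percent_of_total(92080))  # 66.52
```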

The lengths of the matches are distributed as follows
Hit_Length	# alignments
--------	--------
100	 8594
150	 8731
200	 10013
250	 10937
300	 13812
350	 11927
400	 10018
450	 10092
500	 7923
550	 5202
600	 3056
650	 1905
700	 1075
750	 473
800	 258
10000	 217

Alignments with matches shorter than 150 bp are deleted
# remaining Alignments : 87066
# unique Features these remaining alignments represent: 75895
% of total features these alignments represent : 54.83 %
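
The length filter can be sketched as follows. This is an illustration only: the record layout (a dict with "feature" and "matches" keys) and the helper name drop_short_alignments are assumptions, as is reading "matches" as the number of matching bases per alignment.

```python
MIN_MATCH_BP = 150  # cutoff stated in this report


def drop_short_alignments(alignments, min_match=MIN_MATCH_BP):
    """Keep alignments whose match length is at least min_match bp."""
    return [a for a in alignments if a["matches"] >= min_match]


# Hypothetical records, not taken from the real data set.
example = [
    {"feature": "est_A", "matches": 149},
    {"feature": "est_B", "matches": 150},
    {"feature": "est_C", "matches": 420},
]
kept = drop_short_alignments(example)
print([a["feature"] for a in kept])  # ['est_B', 'est_C']
```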

Frequency distribution of hits per feature for the remaining features
# hits	# features
--------	--------
1	 71300
2	 2029
3	 529
4	 815
5	 880
6	 135
8	 177
9	 19
10	 7
20	 4
30	 0
40	 0
50	 0
100	 0

Features that hit more than three times are deleted.
# remaining Alignments : 76945
# unique Features these remaining alignments represent: 73858
% of total features these alignments represent : 53.36 %
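
A minimal sketch of the multi-hit filter, followed by a cross-check of the counts above against the frequency table. The record layout and the helper name drop_multi_hit_features are assumptions made for illustration. The table rows for 1, 2 and 3 hits are taken from this report.

```python
from collections import Counter

MAX_HITS = 3  # features with more hits than this are removed


def drop_multi_hit_features(alignments, max_hits=MAX_HITS):
    """Remove every alignment of a feature that has more than max_hits hits."""
    hits = Counter(a["feature"] for a in alignments)
    return [a for a in alignments if hits[a["feature"]] <= max_hits]


# Hypothetical records: est_A hits once, est_B hits four times.
example = [{"feature": "est_A"}] + [{"feature": "est_B"}] * 4
print(len(drop_multi_hit_features(example)))  # 1

# Cross-check using the rows for 1, 2 and 3 hits from the table above.
kept_rows = {1: 71300, 2: 2029, 3: 529}
features_kept = sum(kept_rows.values())
alignments_kept = sum(n_hits * n for n_hits, n in kept_rows.items())
print(features_kept, alignments_kept)  # 73858 76945
```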

% Identity distribution of the remaining alignments
% Identity	# alignments
--------	--------
10	 0
20	 0
30	 1
40	 17
50	 29
60	 81
70	 347
80	 4157
90	 51023
95	 20231
100	 1059

Following is the distribution of gaps
Gaps	# features
--------	--------
1000	 60932
2000	 10571
3000	 2621
4000	 748
5000	 343
6000	 173
7000	 152
8000	 96
9000	 63
10000	 43

Following is the final summary
# alignments : 76945
# unique Features these alignments represent: 73858
% of total features these alignments represent : 53.36 %