This report describes the protocol used to align the NonOryza_mRNA features to tigrv4-genome.
Fri Jul 28 17:24:21 2006


Source of NonOryza_mRNA : The markers db non-Oryza mRNAs 

Alignment procedure details 
--------------------------- 

23948 NonOryza_mRNA features were aligned to tigrv4-genome using blat with the parameter -minIdentity=50, followed by pslReps with -singleHit. The resulting alignments were then filtered using the procedure described below, which is applied in general to 'CrossSpecies-Coding' data sets.
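
The two alignment steps above can be sketched as a pair of command lines. This is a minimal illustration only: the file names are hypothetical placeholders, and the argument order (database, query, output for blat; input PSL, output PSL, output PSR for pslReps) is assumed from the tools' usual usage messages rather than taken from this report. The functions only build the commands; they do not run them.

```python
import shlex
from typing import List


def blat_command(genome_fa: str, query_fa: str, out_psl: str,
                 min_identity: int = 50) -> List[str]:
    """Build a blat command: database first, then query, options, output."""
    return ["blat", genome_fa, query_fa,
            f"-minIdentity={min_identity}", out_psl]


def psl_reps_command(in_psl: str, out_psl: str, out_psr: str) -> List[str]:
    """Build a pslReps command that keeps a single best hit per query."""
    return ["pslReps", "-singleHit", in_psl, out_psl, out_psr]


if __name__ == "__main__":
    # All file names below are hypothetical placeholders.
    steps = [
        blat_command("tigrv4-genome.fa", "NonOryza_mRNA.fa", "raw.psl"),
        psl_reps_command("raw.psl", "best.psl", "best.psr"),
    ]
    for cmd in steps:
        print(shlex.join(cmd))
```

To actually run the steps, each list could be passed to subprocess.run(cmd, check=True), assuming both tools are on the PATH.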

Initial summary
# alignments : 18770
# unique Features these alignments represent: 18145
% of total features these alignments represent : 75.77 %
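
The percentage in each summary block is the number of unique features divided by the 23948 input features. A small sketch of that arithmetic follows; the function name is illustrative and rounding to two decimals is an assumption based on how the figures are printed.

```python
TOTAL_FEATURES = 23948  # number of NonOryza_mRNA features submitted to blat


def percent_of_total(unique_features: int, total: int = TOTAL_FEATURES) -> float:
    """Share of all input features that are represented, in percent."""
    return round(100.0 * unique_features / total, 2)


if __name__ == "__main__":
    # 18145 unique features are represented in the initial summary.
    print(percent_of_total(18145))  # 75.77
```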

The lengths of the matches are distributed as follows
Hit_Length	# alignments
--------	--------
100	 1376
150	 1420
200	 1694
250	 1570
300	 1465
350	 1357
400	 1267
450	 1024
500	 923
550	 809
600	 640
650	 536
700	 527
750	 485
800	 395
10000	 3282

Alignments with matches shorter than 150 bp are deleted
# remaining Alignments : 16018
# unique Features these remaining alignments represent: 15550
% of total features these alignments represent : 64.93 %
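
The length filter and the summary that follows it can be sketched as below. This is an illustration under stated assumptions: each alignment is reduced to a (feature name, matched bases) pair, the threshold is read as "keep 150 bp and above", and the sample records are made up for the example and are not taken from the data set.

```python
from typing import Iterable, List, Tuple

Alignment = Tuple[str, int]  # (feature name, number of matched bases)


def drop_short(alignments: Iterable[Alignment],
               min_match: int = 150) -> List[Alignment]:
    """Keep only alignments whose match length is at least min_match."""
    return [a for a in alignments if a[1] >= min_match]


def summarize(alignments: List[Alignment],
              total_features: int) -> Tuple[int, int, float]:
    """Return (# alignments, # unique features, % of total features)."""
    unique = {name for name, _ in alignments}
    return (len(alignments), len(unique),
            round(100.0 * len(unique) / total_features, 2))


if __name__ == "__main__":
    # Made-up records for illustration only.
    toy = [("mRNA_A", 120), ("mRNA_A", 480), ("mRNA_B", 149), ("mRNA_C", 150)]
    kept = drop_short(toy)
    print(kept)                               # mRNA_A (480) and mRNA_C (150) remain
    print(summarize(kept, total_features=4))  # (2, 2, 50.0)
```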

Frequency distribution of the remaining features
# hits	# features
--------	--------
1	 15216
2	 276
3	 25
4	 15
5	 6
6	 7
8	 3
9	 0
10	 0
20	 2
30	 0
40	 0
50	 0
100	 0

Features with more than three hits are deleted.
# remaining Alignments : 15843
# unique Features these remaining alignments represent: 15517
% of total features these alignments represent : 64.79 %
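
The figures after this step can be reproduced from the frequency table above. The sketch below copies the non-zero rows of that table; the variable names are illustrative. Only features with one, two or three hits are kept, so each kept row contributes exactly (hits x features) alignments.

```python
# Non-zero rows of the hit-frequency table: {hits per feature: # features}.
hits_per_feature = {1: 15216, 2: 276, 3: 25, 4: 15, 5: 6, 6: 7, 8: 3, 20: 2}

MAX_HITS = 3  # features with more than three hits are deleted

features_before = sum(hits_per_feature.values())

kept = {hits: n for hits, n in hits_per_feature.items() if hits <= MAX_HITS}
kept_features = sum(kept.values())
kept_alignments = sum(hits * n for hits, n in kept.items())

if __name__ == "__main__":
    print(features_before)   # 15550
    print(kept_features)     # 15517
    print(kept_alignments)   # 15843
```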

% Identity distribution of the remaining alignments
% Identity	# alignments
--------	--------
10	 0
20	 0
30	 0
40	 3
50	 5
60	 20
70	 104
80	 1283
90	 11748
95	 2584
100	 96

The gaps are distributed as follows
Gaps	# features
--------	--------
1000	 9255
2000	 3116
3000	 1696
4000	 674
5000	 322
6000	 186
7000	 99
8000	 78
9000	 55
10000	 28

Final summary
# alignments : 15843
# unique Features these alignments represent: 15517
% of total features these alignments represent : 64.79 %