				    README


    To fetch rice genome data from GenBank, run the Perl script
GetRiceGenome.pl.

    GetRiceGenome.pl is the main Perl script. It calls other modules,
programs, and data files to search GenBank, parse the results, and write them
to the data files that correspond to the tables of the rice genome database.
It then calls SQL*Loader to load the corresponding data into the Oracle
database.
    
    When GetRiceGenome.pl starts, it first checks the file timeRecordFile.
If this file is empty, the program searches for all of the rice BAC/PAC
GenBank records. Otherwise, it retrieves the GenBank records from the period
between the last search time and the current time.
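
    The check described above can be sketched as follows. This is only an
illustration in Python, not the code of GetRiceGenome.pl. The format of
timeRecordFile is not documented here, so the sketch assumes one ISO 8601
timestamp per line with the most recent one last. The function name
plan_search is hypothetical, and treating a missing file like an empty one
is also an assumption.

```python
from datetime import datetime
from pathlib import Path


def plan_search(record_path, now):
    """Decide between a full and an incremental GenBank search.

    Returns ("full", None, now) when the record file is empty (or missing,
    which is an assumption of this sketch), otherwise
    ("incremental", last_search_time, now).
    """
    path = Path(record_path)
    text = path.read_text().strip() if path.exists() else ""
    if not text:
        # Empty record file: search for all rice BAC/PAC records.
        return ("full", None, now)
    # Assumed format: one ISO 8601 timestamp per line, newest last.
    last_line = text.splitlines()[-1].strip()
    last_time = datetime.fromisoformat(last_line)
    return ("incremental", last_time, now)
```

    For example, plan_search("timeRecordFile", datetime.now()) returns a
tuple whose first element says which kind of search would be run.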

    To run this program, the Perl package Boulder (see
http://stein.cshl.org/software/boulder/) must be installed on your
computer. The following files must also be in the same directory as
GetRiceGenome.pl:

    Recording files:  timeRecordFile and summary.previous.
                      (timeRecordFile stores the time records of each
                       GetRiceGenome.pl run. summary.previous contains a
                       summary of the GenBank search results. Before you
                       run this program for the first time, this file
                       should be empty. Do not modify any data in this file
                       after you have run the program.)

    Control files:    analysis.ctl, chromosome.ctl, clone.ctl, contig.ctl,
                      dna.ctl, exon.ctl, exon_transcript.ctl, externalDB.ctl,
                      gene.ctl, gene_description.ctl, genetype.ctl, meta.ctl,
                      objectXref.ctl, repeat_feature.ctl, sequencing_site.ctl,
                      species.ctl, static_golden_path.ctl, transcript.ctl,
                      translation.ctl, Xref.ctl.

    SQL*Loader files: load_data0 and load_data (if SQL*Loader fails to load
                                                the DNA sequences, you have
                                                to run the insertDNA.pl
                                                program to insert the DNA
                                                sequences first).
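
    The file requirements listed above can be verified before a run with
a small helper such as the one below. It is a sketch in Python and is not
part of this distribution. The names required_files and missing_files are
hypothetical. The file names themselves are taken from the lists above.

```python
from pathlib import Path

# File names copied from the lists in this README.
RECORD_FILES = ["timeRecordFile", "summary.previous"]
CONTROL_TABLES = [
    "analysis", "chromosome", "clone", "contig", "dna", "exon",
    "exon_transcript", "externalDB", "gene", "gene_description",
    "genetype", "meta", "objectXref", "repeat_feature", "sequencing_site",
    "species", "static_golden_path", "transcript", "translation", "Xref",
]
LOADER_FILES = ["load_data0", "load_data"]


def required_files():
    """Return the names of all files expected next to GetRiceGenome.pl."""
    control_files = [table + ".ctl" for table in CONTROL_TABLES]
    return RECORD_FILES + control_files + LOADER_FILES


def missing_files(directory):
    """Return the required file names that are absent from directory."""
    base = Path(directory)
    return [name for name in required_files() if not (base / name).is_file()]
```

    Calling missing_files(".") in the directory that holds
GetRiceGenome.pl returns an empty list when every required file is present.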

    
    The following data files are generated by GetRiceGenome.pl. You can
remove them after the program has finished running.

    Data files:    analysis.dat, chromosome.dat, clone.dat, contig.dat,
                   dna.dat, exon.dat, exon_transcript.dat, externalDB.dat,
                   gene.dat, gene_description.dat, genetype.dat, meta.dat,
                   objectXref.dat, repeat_feature.dat, sequencing_site.dat,
                   species.dat, static_golden_path.dat, transcript.dat,
                   translation.dat, Xref.dat, dnaseq.fasta.
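
    Removing the generated files can be scripted. The following Python
sketch is an illustration only and is not shipped with this program. The
function name remove_generated is hypothetical. It deletes only the files
named in the list above and leaves the recording, control, and SQL*Loader
files untouched.

```python
from pathlib import Path

# Generated file names copied from the list in this README.
GENERATED_TABLES = [
    "analysis", "chromosome", "clone", "contig", "dna", "exon",
    "exon_transcript", "externalDB", "gene", "gene_description",
    "genetype", "meta", "objectXref", "repeat_feature", "sequencing_site",
    "species", "static_golden_path", "transcript", "translation", "Xref",
]
GENERATED_FILES = [table + ".dat" for table in GENERATED_TABLES] + ["dnaseq.fasta"]


def remove_generated(directory):
    """Delete the generated data files found in directory.

    Returns the names of the files that were actually removed.
    """
    removed = []
    for name in GENERATED_FILES:
        path = Path(directory) / name
        if path.is_file():
            path.unlink()
            removed.append(name)
    return removed
```

    Run remove_generated(".") only after the load into the database has
completed, since the data files are the input to that load.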