SEQREP1 Documentation
Requirements
0. Installation and start
1. Encode sequences using SEQREP codes
2. Train a SOM
3. Map sequences into a trained SOM
4. Classify sequences after training a SOM
5. Reference
Requirements
To run SEQREP1 you need a Java interpreter. If none is installed,
you can download the Sun Java Runtime Environment (JRE) from http://java.sun.com/j2se/1.4/download.html.
0. Installation and start
Extract all the files from SEQREP1.zip into the same
directory. At the command line, go to the directory where you saved the
files and type 'java SEQREP1'.
1. Encode sequences using SEQREP codes
- Menu 'File' -> 'Open File...'. Select the file with the sequences
in FASTA format. The class of each sequence should be given in the
comment line of that sequence. Classes are labeled with the letters A-I after
the string '#&'. For example, to label a sequence with class D, insert
'#&D' anywhere in its comment line.
- In the main panel, under 'Code Scheme', choose the virtual potentials
you want to use for coding.
- Menu 'Tools' -> 'Encode'
- If you want to train a Kohonen SOM, go to point 2.
- If you want to save the SEQREP codes, choose 'Save with classes'
or 'Save with labels' under menu 'File'. (Labels here are the first characters
of the comment lines).
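The class markers can also be added to a FASTA file with a short script before opening it in SEQREP1. The following Python sketch is only an illustration of the '#&' convention described above and is not part of SEQREP1. The function names and the choice of the first word of the comment line as sequence identifier are assumptions made for this example.

```python
import re

# '#&' followed by one class letter A-I, as described in point 1.
CLASS_TAG = re.compile(r"#&([A-I])")


def add_class_tag(fasta_text, classes):
    """Append '#&<class>' to the comment line of each sequence.

    `classes` maps a sequence identifier to a class letter A-I. Using the
    first word of the comment line as identifier is an assumption of this
    sketch. Comment lines that already carry a tag are left unchanged.
    """
    out = []
    for line in fasta_text.splitlines():
        if line.startswith(">") and not CLASS_TAG.search(line):
            parts = line[1:].split()
            seq_id = parts[0] if parts else ""
            letter = classes.get(seq_id)
            if letter is not None:
                if len(letter) != 1 or letter not in "ABCDEFGHI":
                    raise ValueError(f"class must be a letter A-I: {letter!r}")
                line = f"{line} #&{letter}"
        out.append(line)
    return "\n".join(out) + "\n"


def read_class_tags(fasta_text):
    """Return (comment line, class letter or None) for each sequence."""
    result = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            match = CLASS_TAG.search(line)
            result.append((line[1:].strip(), match.group(1) if match else None))
    return result
```

The tag is appended at the end of the comment line so that the first characters, which are used as labels by 'Save with labels', stay unchanged.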
2. Train a SOM
- Follow the first three steps of point 1 (open the file, choose the code
scheme, encode) to encode the sequences.
- Choose, in the main panel, the settings you want to use for training
(network size, initial learning span, number of epochs).
- Click on 'Train Kohonen NN', in the main panel.
- After training, to inspect the weights at a given level,
fill in the field below 'Show weights at level:' and click on the button.
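For readers unfamiliar with Kohonen maps, the role of the training settings can be illustrated with a generic, minimal SOM in Python. This is not the SEQREP1 implementation. It assumes that the 'initial learning span' is the initial neighbourhood radius around the winning neuron, that span and learning rate decrease linearly over the epochs, and that a 'level' is one component of the weight vectors. All function and parameter names are hypothetical.

```python
import math
import random


def winning_neuron(weights, vector):
    """Return (row, col) of the neuron whose weights are closest to vector."""
    best, best_dist = None, math.inf
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            dist = sum((a - b) ** 2 for a, b in zip(vector, w))
            if dist < best_dist:
                best, best_dist = (r, c), dist
    return best


def train_som(vectors, rows, cols, epochs, initial_span,
              initial_rate=0.5, seed=0):
    """Train a rectangular Kohonen map; returns weights[row][col][level]."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    weights = [[[rng.random() for _ in range(dim)] for _ in range(cols)]
               for _ in range(rows)]
    for epoch in range(epochs):
        # Assumption: span and rate shrink linearly towards zero.
        frac = 1.0 - epoch / epochs
        span = initial_span * frac
        rate = initial_rate * frac
        for v in vectors:
            wr, wc = winning_neuron(weights, v)
            for r in range(rows):
                for c in range(cols):
                    dist = max(abs(r - wr), abs(c - wc))
                    if dist <= span:
                        # Neurons farther from the winner are corrected less.
                        pull = rate * (1.0 - dist / (span + 1.0))
                        w = weights[r][c]
                        for k in range(dim):
                            w[k] += pull * (v[k] - w[k])
    return weights


def weights_at_level(weights, level):
    """Return one rows x cols table with component `level` of every neuron."""
    return [[w[level] for w in row] for row in weights]
```

For example, `train_som(codes, rows=10, cols=10, epochs=50, initial_span=5)` would train a 10 x 10 map on a list of equal-length code vectors, and `weights_at_level(weights, 0)` would give the first component of every neuron as a table.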
3. Map sequences into a trained SOM
- Encode the new sequences you want to map (with the same virtual potentials
that were used to train the SOM) and save the SEQREP codes, as described
in point 1.
- Click on 'Map Objects', in the main panel. You will be asked to
specify the file with the SEQREP codes (the one you saved in the previous step).
- The SOM surface, with the new sequences mapped on it, will be displayed
in a second window. The sequences are labeled with the first characters of
the comment line (from the original FASTA file) or with the class specified
in the original FASTA file, depending on how the SEQREP codes were saved.
- If more than 3 sequences are mapped onto the same neuron, move the
mouse over that neuron to display all of them at the right side of
the window.
4. Classify sequences after training a SOM
- Encode the new sequences (with the same virtual potentials that were
used to train the SOM) and save the SEQREP codes, as described
in point 1.
- In the main panel, click on 'Predict'. You will be asked to specify
the file with the SEQREP codes (the one you saved in the previous step) and
then the file in which you want to save the results.
- The resulting file will contain a list of the sequences with their
respective classifications. Each line corresponds to one sequence, with the label
in the first column, the coordinates of the winning neuron in the second and
third, and the predicted classification in the last column. An undecided
classification is shown with the letter 'J'.
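The results file can be post-processed with a short script, for example to count the predictions per class and to list the undecided sequences. The Python sketch below follows the column layout described above. The column separator is not specified here, so the sketch assumes whitespace-separated columns and integer neuron coordinates. The function names are hypothetical.

```python
from collections import Counter, namedtuple

Prediction = namedtuple("Prediction", "label x y predicted")

UNDECIDED = "J"  # letter used for an undecided classification


def parse_predictions(text):
    """Parse result lines of the form: label, x, y, predicted class.

    Assumption: columns are separated by whitespace. The line is split
    from the right, so a label that itself contains spaces is kept whole.
    """
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        parts = line.rsplit(None, 3)
        if len(parts) != 4:
            raise ValueError(f"unexpected line: {line!r}")
        label, x, y, predicted = parts
        rows.append(Prediction(label, int(x), int(y), predicted))
    return rows


def summarize(rows):
    """Return (count per predicted class, labels of undecided sequences)."""
    counts = Counter(r.predicted for r in rows)
    undecided = [r.label for r in rows if r.predicted == UNDECIDED]
    return counts, undecided
```

To use it, read the saved results file into a string, pass it to `parse_predictions`, and pass the returned list to `summarize`.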
5. Reference
- J. Aires-de-Sousa, L. Aires-de-Sousa, “Representation of DNA sequences with
virtual potentials (SEQREP) and their processing by Kohonen self-organizing
maps”, Bioinformatics, 2003, 19(1), 30-36.
© 2002 João Aires de Sousa
Copyright and Disclaimer
SEQREP1 software is copyright © 2002 by Dr. Joao Aires de Sousa
(jas@mail.fct.unl.pt). All rights reserved. Dr. Joao Aires de Sousa provides
the accompanying software "as-is", without warranties of any kind, including
the implied warranty of fitness or merchantability for any particular purpose.
Dr. Joao Aires de Sousa herein expressly disclaims all warranties on this
software, either express or implied. Dr. Joao Aires de Sousa may not be held
liable for any damages, incidental or consequential, occurring from the use of
the accompanying software, even if he has been advised of the possibility of
such damage.