#1




  To change an option interactively, type its number.
  Short online help is available for each option.

  More detailed information can be found in the user guide. 




#2




  If you want to align protein sequences, please enter "p" or press Return.

  If you want to align nucleic acid sequences, please enter "n".





#6  




  As described in the paper, the program DIALIGN composes 
  alignments from gap-free pairs of segments of the sequences.
  Such segment pairs are referred to as `diagonals'.
  
  It is possible to use a threshold T for the quality of 
  diagonals. In this case, a diagonal is incorporated into the
  alignment only if its `weight' exceeds this threshold, and 
  regions of lower similarity are ignored.
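
  The effect of the threshold can be sketched in a few lines of Python.
  This is only an illustration: the `Diagonal' record, its fields and
  the function `filter_by_threshold' are hypothetical names, not part
  of DIALIGN, and the weights are made-up values.

```python
from typing import List, NamedTuple


class Diagonal(NamedTuple):
    # Hypothetical record for one gap-free segment pair (not DIALIGN's own type).
    start_a: int    # start position in the first sequence
    start_b: int    # start position in the second sequence
    length: int     # common length of both segments
    weight: float   # quality score of the segment pair


def filter_by_threshold(diagonals: List[Diagonal], threshold: float) -> List[Diagonal]:
    # Keep a diagonal only if its weight strictly exceeds the threshold T.
    return [d for d in diagonals if d.weight > threshold]


# Made-up example values, for illustration only.
candidates = [
    Diagonal(0, 3, 12, 8.5),
    Diagonal(20, 25, 6, 1.2),
    Diagonal(40, 41, 9, 4.0),
]
kept = filter_by_threshold(candidates, threshold=4.0)
```

  With T = 4.0, only the first diagonal is kept; a weight equal to T
  does not exceed it.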

   
  


#11 

  DIALIGN requires a single ASCII file containing the sequences to be aligned.
  Four different file formats are supported: IG, FASTA, EMBL and GCG-RSF 
  format. The following is an example of the FASTA sequence file format:

                >name1
                TACTTTACCCAGTAGTCATGTACAGAGT
                ACCGCCTCAATAAAAAGCCTAAGAGTCA
                >name2
                CCCATATGTGTAGAAGTTGCCTCGAGTG
                TTTACGCGGGGGCGGGCATTCTTTAAAC
                CACGCGGGGG
                >name3
                ACCTACTCTCCCCCCCCTTTTCCCAACT
                ATCTAATCTATTTYCAGGGCGTG

  The first line of every sequence entry starts with ">" and contains 
  the name of the sequence. DIALIGN lets the user select all sequences 
  or a subset of them from this input file. (See user guide for more 
  details.)
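
  As a rough illustration of the FASTA layout shown above, the 
  following Python sketch reads such a file into a name-to-sequence 
  mapping. The function `parse_fasta' is a hypothetical helper written 
  for this example; it is not the parser used by DIALIGN.

```python
from typing import Dict, List


def parse_fasta(text: str) -> Dict[str, str]:
    # Map each sequence name to its residues, joined across lines.
    parts: Dict[str, List[str]] = {}
    name = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue                      # skip blank lines
        if line.startswith(">"):
            name = line[1:].strip()       # header line: text after ">" is the name
            parts[name] = []
        elif name is not None:
            parts[name].append(line)      # sequence line of the current entry
    return {n: "".join(chunks) for n, chunks in parts.items()}


example = """>name1
TACTTTACCCAGTAGTCATGTACAGAGT
ACCGCCTCAATAAAAAGCCTAAGAGTCA
>name2
CCCATATGTGTAGAAGTTGCCTCGAGTG
TTTACGCGGGGGCGGGCATTCTTTAAAC
CACGCGGGGG
>name3
ACCTACTCTCCCCCCCCTTTTCCCAACT
ATCTAATCTATTTYCAGGGCGTG
"""
sequences = parse_fasta(example)
```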

  If you want to terminate the program, please type "exit".

#15






  Please select the sequences to be aligned. Enter "a" if you want 
  to align all sequences from the input file. Enter sequence numbers 
  if you want to align a subset of the sequences. 

  Please end the selection by pressing Return once more.



  
         
#16

  The DIALIGN alignment format is as follows: 

ASE-Fly      ---RRNARERNRVKQVNNGFALLREKIPEEvseafeaqgagrgaSKKL-SKVETL
TFE3-Human   KKDNHNLIERRRRFNINDRIKELGTLIPKSSD------------PEmrwNKGTIL 
MYC-Chicke   KRRTHNVLERQRRNELKLSFFALRDQIPEVAN------------NEKA-PKVVIL 
 
             ********************************            **** ******  
             *****************************               **** ****** 
             *****************************                    ****** 
                  ********         *******                       
                  ********                                         

  At any position p in the alignment, the number of "*" characters 
  indicates the relative degree of local similarity among all sequences 
  in the alignment. This number is calculated based on the sum of
  `weights' of diagonals connecting residues at position p. You may
  specify the number of "*" characters for the region of maximal local 
  similarity within the alignment.
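
  One way to picture this scaling is the Python sketch below. It
  assumes that the number of "*" characters grows linearly with the
  sum of weights and is rounded to the nearest integer; the text above
  does not specify the exact rule, so this is an assumption, and the
  function `column_stars' is a hypothetical name.

```python
import math
from typing import List


def column_stars(weight_sums: List[float], max_stars: int) -> List[int]:
    # weight_sums[p] is the sum of diagonal weights at alignment position p.
    # Assumption: linear scaling, so the position with the largest sum
    # receives exactly max_stars "*" characters.
    top = max(weight_sums, default=0.0)
    if top <= 0:
        return [0 for _ in weight_sums]
    # floor(x + 0.5) rounds halves upward, avoiding banker's rounding.
    return [math.floor(max_stars * w / top + 0.5) for w in weight_sums]


# Made-up weight sums for four alignment positions.
stars = column_stars([0.0, 2.5, 5.0, 10.0], max_stars=5)
```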

#24   



  If the nucleic acid sequences under consideration are expected
  to contain protein coding regions, it is advisable to let DIALIGN
  translate the compared `nucleic acid segments' to `peptide
  segments' according to the genetic code. In this case, the
  similarity of two segments belonging to a `diagonal' will be
  assessed on the `peptide level' rather than on the `nucleic acid
  level'. Here, DIALIGN automatically considers all possible 
  combinations of reading frames.

  This method substantially increases the sensitivity of the program.
  However, `translation' of diagonals is only possible if the
  sequences consist of "A", "C", "T", "G" and "U" exclusively.
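
  The restriction on the alphabet can be checked before choosing this 
  option, for example with the Python sketch below. The function 
  `can_translate' is a hypothetical helper, and accepting lower-case 
  letters is an assumption made for this example.

```python
TRANSLATABLE = frozenset("ACGTU")


def can_translate(sequence: str) -> bool:
    # True if the sequence is non-empty and contains only A, C, G, T and U.
    # Assumption: lower-case letters are treated like upper-case ones.
    return bool(sequence) and set(sequence.upper()) <= TRANSLATABLE


ok = can_translate("TACTTTACCCAGTAGTCATGTACAGAGT")
# The ambiguity code "Y" is outside the allowed alphabet.
not_ok = can_translate("ATCTAATCTATTTYCAGGGCGTG")
```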



  
#111




  The program creates a single output file containing 

    1) the resulting alignment in DIALIGN format
    2) the same alignment in FASTA format
    3) a sequence tree   
      
  The default name of the output file is `name.ali', where `name' is 
  the name of the input file without its extension. For example, if 
  the name of the input file is "file.seq", the output file will be 
  called "file.ali".

  However, it is possible to specify other names for output files.
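
  The default naming rule described above can be sketched in Python as 
  follows. The function `default_output_name' is a hypothetical helper, 
  and removing only the last extension of a name with several dots is 
  an assumption.

```python
import os


def default_output_name(input_name: str) -> str:
    # Drop the last extension (if any) and append ".ali".
    stem, _extension = os.path.splitext(input_name)
    return stem + ".ali"


with_extension = default_output_name("file.seq")
without_extension = default_output_name("file")
```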

 
        

 

#0



