-------------------------------------------------------------------
PROANALYST: QUANTITATIVE STRUCTURE-ACTIVITY RELATIONSHIPS IN PROTEINS,
PROTEIN ENGINEERING, PATTERN RECOGNITION IN COMBINATORIAL LIBRARIES,
PHYSICO-CHEMICAL AND ALPHABETICAL ANALYSIS IN MULTIPLE SEQUENCE
ALIGNMENTS AND 3D STRUCTURE (VER 1.02)
-------------------------------------------------------------------

COPYRIGHT (C) 1996 Vladimir Ivanisenko, Alexey Eroshkin, all rights reserved.

State Research Center of Virology and Biotechnology "Vector"
Koltsovo, Novosibirsk Region, 633159 Russia
E-mail: salex@vector.nsk.su, eroshkin@vector.nsk.su

All Trademarks and Registered Names are acknowledged in this document.

The files required to run PROANALYST are distributed in the form of a single compressed file. Create a directory "PANALYST" on your hard disk (for example, on drive C). Copy PANALYS$ to that directory and type PANALYS$ to unpack the program.

PROANALYST is an easy-to-use, state-of-the-art MS-DOS application for studying the relationships between structure and activity in protein/peptide families.

You are free to use, copy and distribute this version to anyone you think may be interested in it IF:

NO FEE IS CHARGED FOR USE, COPYING OR DISTRIBUTION.

In this demo version, print and save are disabled, and the number of analyzed proteins is limited to 15.

This program is provided without any warranty, expressed or implied, to you or any other person. The authors will not be liable for incidental, consequential or other damages arising through the use of this software.

You can copy the program and give copies to other people. However, please note the following points:

a. The copyright of the program remains with the authors.
b. Please acknowledge the program in any publications of research for which it is used.
c. If you transfer the program, you must do so in unmodified form.

PROANALYST:

- has a data converter for several protein sequence formats;
- finds sites (in a multiple alignment and in the 3D structure) that influence protein activity;
- finds relationships between the structural characteristics of protein sites and protein activities;
- investigates structural differences between proteins grouped by functional, evolutionary or other criteria (e.g. relates genotype to phenotype);
- investigates physico-chemical factors related to activity changes in a set of mutant proteins (including multiple physico-chemical factors);
- simulates protein-engineering experiments and predicts the activity of mutant proteins;
- searches for motifs and patterns in combinatorial libraries (peptide and phage-displayed libraries);
- makes protein stereo pictures with sites highlighted;
- provides linear correlation analysis, multiple linear regression analysis, discriminant analysis, ANOVA (analysis of variance), alphabetical analysis, and profiles;
- maps all obtained results onto the 3D structure;
- provides multiple alignment visualization and editing.

PROANALYST IS USEFUL:

- for chemists and biochemists investigating protein structure-function and structure-activity relationships;
- for protein engineers trying to improve protein properties;
- for molecular biologists who need to make sense of multiple protein alignments;
- for those who are interested in protein evolution and protein structure prediction;
- for those who need good color 3D pictures of proteins with different sites marked;
- for students in any field of PROTEIN SCIENCE who need to learn about protein primary, secondary and tertiary structure; hydrophilicity and flexibility of protein segments; multiple alignments and sequence variability; functional and activity-modulating sites; hydrophobic moments; profiles of averages and variations; regression and discriminant analysis; and many other important topics;
- for anyone interested in comparative protein sequence analysis.

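As an illustration of one analysis named in the feature list above (linear correlation between a physico-chemical property at an alignment position and protein activity), here is a minimal Python sketch. It is not PROANALYST's implementation: the property values, the function names, the choice to report 0.0 for invariant columns, and the assumption of a gap-free alignment are all made up for this example.

```python
from math import sqrt

# Hypothetical hydrophobicity-like values for a few residues. Illustrative
# only; NOT taken from PROPERTY.PPT or any other PROANALYST data file.
TOY_PROPERTY = {"A": 1.8, "L": 3.8, "K": -3.9, "D": -3.5, "G": -0.4}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences.

    Returns 0.0 when either sequence is constant (an assumption of this
    sketch, so that invariant alignment columns do not raise an error).
    """
    if len(set(xs)) < 2 or len(set(ys)) < 2:
        return 0.0
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def column_correlations(alignment, activities, prop):
    """Correlate the property value at each alignment column with activity.

    `alignment` is a list of equal-length, gap-free sequences and
    `activities` holds one measured value per sequence.
    """
    length = len(alignment[0])
    result = []
    for i in range(length):
        values = [prop[seq[i]] for seq in alignment]
        result.append(pearson(values, activities))
    return result
```

Columns whose coefficient is close to +1 or -1 would be candidates for sites where that property is related to activity; a real analysis would also need to handle gaps and judge statistical significance.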
PROANALYST has:

- a simple user interface to facilitate data analysis and design;
- simple graphics to make the data easier to understand;
- example data with sequences and structures of several protein families;
- data files of amino acid physico-chemical and other properties commonly used in protein analysis;
- a large number of service programs to help you in your research;
- a complete manual and help system.

REQUIREMENTS

PROANALYST runs on the IBM PC family of computers. It requires DOS 3.30 or higher and at least 550K of RAM. It will run on any 80-column monitor, but graphics require an EGA/VGA monitor. A hard disk is recommended for better performance.

THE PROGRAM'S MAIN FILES:

README   TXT  - this file
MANUAL   TXT  - manual to PROANALYST
HELP     HLP  - help file
HELP1    EXE  - program for on-line help
EXAMPLES TXT  - step-by-step instructions for PROANALYST
PANALYST EXE  - main program
CONVERT  EXE  - program for data conversion from CLUSTAL, SWISS-PROT,
                PIR and GCG files to PROANALYST format
CONVERT  DOC  - documentation for CONVERT
PROPERTY PPT  - 19 useful phys.-chem. properties of amino acid residues
KIDERA   PPT  - 10 normalized Kidera phys.-chem. properties of amino acids
PROGRAPH PPT  - phys.-chem. properties of amino acids
                (from PROFILEGRAPH, K.Hofmann)
M2_RS    ACT  - 3 files with example data for influenza A virus M(2) protein
M2_RS    ALI
M2_RS    SEQ
ANTIMICR ACT  - 3 files with example data for antimicrobial peptides
ANTIMICR ALI
ANTIMICR SEQ
IFN      ACT  - 3 files with example data for interferon-alpha
IFN      SEQ
IFN      ALI
IFN      EXP  - amino acid exposure and secondary structure (optional)
IFN      PDB  - interferon 3D structure
DE1      ACT  - 3 files with example data for disintegrins
DE1      SEQ
DE1      ALI
LIBRARY  ACT  - 3 files with example data for a phage-display library
LIBRARY  SEQ
LIBRARY  ALI
ALN (subdirectory) - 50 aligned protein families.

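Each example family in the file list above consists of three files with the extensions SEQ, ACT and ALI. The following minimal Python sketch, which is not part of the PROANALYST distribution, checks that all three are present for a given family before the program is started. The function name and the case-insensitive matching are assumptions made for this example.

```python
from pathlib import Path

# Extensions of the three files that make up one family data set,
# as listed in this README (names, activities or grouping, alignment).
REQUIRED_EXTENSIONS = ("SEQ", "ACT", "ALI")


def missing_family_files(directory, family):
    """Return the required file names for `family` that are absent from `directory`.

    Names are compared in upper case, on the assumption that DOS file
    names are not case-sensitive.
    """
    present = {p.name.upper() for p in Path(directory).iterdir() if p.is_file()}
    wanted = [f"{family.upper()}.{ext}" for ext in REQUIRED_EXTENSIONS]
    return [name for name in wanted if name not in present]
```

For example, `missing_family_files(".", "IFN")` returns an empty list when IFN.SEQ, IFN.ACT and IFN.ALI are all in the current directory.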
HOW TO START

To investigate a protein (or peptide) family of interest, prepare three files in the current directory: protein names (*.seq), protein activities or grouping (*.act), and aligned sequences (*.ali). The data can be prepared with the program CONVERT (see CONVERT.DOC) or with any text editor. See the formats in the MANUAL or in the example files. A simple editor and viewer are also included in PROANALYST and CONVERT.

To use the program, follow these steps:

- start the program;
- select a previously prepared protein family;
- select the sequences of the family that you are going to investigate;
- select a file with the required physico-chemical properties of amino acids (necessary for regression and discriminant analysis);
- select the amino acid physico-chemical properties of interest;
- select a protein 3D structure (if available);
- go to the menu item PREPARE DATA;
- define the fragment to investigate (or up to 8 fragments);

and so on.

All other information can be found in the MANUAL.TXT or EXAMPLES.TXT files, in the HELP system (press F1), or simply by trial and error.

If you have problems running PROANALYST, please consult the manual or examples carefully to see if they can help. If you still need advice, please contact the authors by e-mail.

Ask the authors for the complete updated version and for:

ADDITIONAL NEW SOFTWARE TOOLS:

ProMSED: Protein Multiple Sequence Editor (for MS Windows 3.x/95)

o automatic (Clustal V algorithm) and manual alignment of multiple protein sequences;
o sequence import and export in different formats;
o no limits on the number and length of sequences except those imposed by available computer memory;
o interactive alignment of a selected protein subset or of selected regions;
o flexible visualization and editing of alignments (Word for Windows style);
o amino acid coloring to facilitate understanding of protein physico-chemical, structural, evolutionary and other features.

ProAnWin: Windows version of ProAnalyst with multiple protein sequence alignment (for MS Windows 3.x/95)

o data input from the main protein sequence formats (SWISS-PROT, PIR, FASTA, GCG, CLUSTAL);
o databases of more than 50 amino acid physico-chemical properties;
o reads 3D protein structures in PDB format;
o flexible input of user-defined protein activities, properties or related phenotypes;
o multiple protein sequence alignment (Clustal V algorithm);
o threading of a multiple sequence alignment onto a known 3-dimensional structure;
o search for sites influencing protein activity and analysis of relationships between the structural characteristics of protein sites and protein activities (properties or related phenotypes);
o multiple linear regression analysis of structure-activity relationships;
o genotype-phenotype correlation analysis (e.g., in studying virus drug resistance);
o investigation of physico-chemical factors related to activity or property changes in mutant proteins;
o flexible visualization of protein 3D structures with sites highlighted;
o activity, property or phenotype prediction;
o output of all obtained results to a text file.

PROANAL3: Protein structure-activity analysis, analysis of variation and conservation of physico-chemical properties (for MS-DOS)

o main features of ProAnalyst, plus nonlinear analysis of protein structure-activity relationships;
o enhanced physico-chemical profiles for predicting protein features (functional and variable sites, secondary structures, amphipathic helices, antigenic sites, etc.).