
                      MEDELLER, iMembrane & PyFREAD

          copyright 2008-2012 Sebastian Kelm <kelm@stats.ox.ac.uk>

               This document was last updated on 28/09/2012
               


LICENCE

  This software is free for academic use only. For details, please see the
  licence documents accompanying this distribution, or contact the author.



INSTALLATION NOTES

  Before being able to run this application, several third-party applications
  must be installed. All of these are free for academic users.
  Below is a step-by-step guide that should help you through this process.

  This package may include several binary executables. These were compiled for
  a 64-bit Fedora Linux system (3.4.11-1.fc16.x86_64). You may need to
  recompile these binaries for your own system architecture. Third-party
  programs that may need compiling are mentioned below.


################################################################################



Required third party software
=============================

  Several binary executables need to be present in the shell environment's PATH.
  As an alternative to modifying your PATH, you can place most of the
  executables in the "bin" subdirectory within the installation directory
  (which we shall call <installdir>).
  
  
  (1) REQUIRED SCRIPTING LANGUAGES, MODULES AND LINUX COMMANDS
    
    Most of these (apart from numpy) should be pre-installed on common
    Linux distributions. If any of them are missing on your system, you can
    install them using your package manager ("apt-get" on Ubuntu, "yum" on
    Fedora and Red Hat).
    
    
    Python 2.7              http://www.python.org/
    Perl                    http://www.perl.org/
    BASH                    http://www.gnu.org/software/bash/
    
    Python module: numpy    http://sourceforge.net/projects/numpy/
    
    Standard Linux commands: wget, rsync, diff, tee, sed, cut, grep, tr, xargs, ...
  
  
  (2) STAND-ALONE BLAST
    
    ftp://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/LATEST/
    
    The executable "psiblast" from the NCBI BLAST+ package is required.
    Place it in <installdir>/bin, or somewhere in your PATH.
    
  
  (3) JOY
    
    http://tardis.nibio.go.jp/joy/
    
    The programme JOY is required. The executable needs to be called "joy", not
    "joy.Linux" or the like. The executables "joy", "hbond", "sstruc" and "psa"
    all need to be in <installdir>/bin, or somewhere in your PATH.


  (4) MUSCLE
  
    http://www.drive5.com/muscle/
    
    The MUSCLE executable is required. Download the binary distribution suitable
    for your system from the above web site and place it in <installdir>/bin, or
    somewhere in your PATH. The executable must be called "muscle", so rename it
    if it carries a versioned name such as "muscle3.8.31_i86linux32".


  (5) TM-align
  
    http://zhanglab.ccmb.med.umich.edu/TM-align/
    
    A version of TM-align is included in this distribution. You will need to
    compile TMalign from source. The source file is located in
    <installdir>/bin/TMalign.f. To compile it, use the command:
      cd <installdir>/bin
      gfortran -static -O3 -ffast-math -lm -o TMalign TMalign.f
    If compilation fails, try removing the '-static' option:
      gfortran -O3 -ffast-math -lm -o TMalign TMalign.f
    If this fails as well, try simply:
      gfortran -o TMalign TMalign.f
    
    You will need the free gfortran compiler:
    http://gcc.gnu.org/wiki/GFortran
    
    Alternatively, get the binary distribution (Linux users only) from the
    above website. This may result in crashes if the authors of TMalign
    have changed the input and output format of the program since the
    release of the current MEDELLER distribution.
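
    The three compile attempts listed above can be chained so that the first
    one that works is used. The following is a minimal bash sketch, not part
    of the distribution. It assumes the current directory is <installdir>/bin,
    and the FC variable is a hypothetical override for the compiler name
    (it defaults to "gfortran").

```shell
# Try the three gfortran invocations from the notes above, in order,
# and stop at the first one that succeeds.
# Assumption: run from <installdir>/bin, where TMalign.f is located.
# Assumption: FC is a hypothetical override for the compiler command.
compile_tmalign() {
    local flags
    for flags in "-static -O3 -ffast-math -lm" "-O3 -ffast-math -lm" ""; do
        # $flags is deliberately unquoted so it splits into separate options
        if ${FC:-gfortran} $flags -o TMalign TMalign.f; then
            echo "Compiled TMalign with flags: ${flags:-<none>}"
            return 0
        fi
    done
    echo "All three compile attempts failed" >&2
    return 1
}
```

    After defining the function, run "compile_tmalign" from within
    <installdir>/bin. The function's final exit status tells you whether any
    of the three attempts produced a TMalign executable.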


  (6) Modeller (Optional. Only required to build complete models.)
  
    http://salilab.org/modeller/
    
    If you have Modeller installed, MEDELLER can output a complete model. Simply
    follow the Modeller installation instructions.
    You need a licence for Modeller, which is free for academic users.
    
    Make sure you add Modeller's Python libraries to your PYTHONPATH environment
    variable.

    You can test your Modeller installation by starting the Python interpreter
    and importing Modeller's automodel module:
    
    python
    import modeller.automodel
    exit()
    
    If no errors appear, you are good to go.
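
    The same import check can be run non-interactively from the shell. This
    is a small sketch under assumptions: the function name is made up for
    illustration, and PYTHON is a hypothetical override for the interpreter
    name (it defaults to "python").

```shell
# Return 0 if the given Python module can be imported, non-zero otherwise.
# Any import error is printed by the interpreter itself.
# Assumption: PYTHON is a hypothetical override for the interpreter command.
python_can_import() {
    "${PYTHON:-python}" -c "import $1"
}

# Example call:
# python_can_import modeller.automodel && echo "Modeller found"
```

    A non-zero exit status together with an ImportError usually means that
    Modeller's Python libraries are not yet on your PYTHONPATH.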


  (7) MP-T (Optional. Required to run the 'medellerpipeline', a.k.a. MEMOIR.)
  
    http://www.stats.ox.ac.uk/research/proteins/resources
    
    MP-T also has a dependency on USEARCH, which can be obtained here:
    http://drive5.com/usearch/
    
    For further details on how to install MP-T, please refer to the MP-T
    installation instructions.
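
  Once the third-party programs are in place, a quick way to see whether any
  executable is still missing is to look each one up in the PATH. The
  function below is a sketch and not part of the distribution. Its name is
  made up for illustration. It only searches the PATH, so executables placed
  in <installdir>/bin are found only if that directory has been added to
  your PATH.

```shell
# Print the name of every given command that cannot be found in PATH.
# Returns 1 if at least one command is missing, 0 otherwise.
missing_commands() {
    local cmd status=0
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "$cmd"
            status=1
        fi
    done
    return $status
}

# Example call with executables named in the sections above:
# missing_commands python perl bash wget rsync psiblast joy hbond sstruc psa muscle TMalign
```

  An empty result means every listed command was found.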



Setting up the MEDELLER command-line applications
=================================================

  (1) Download and extract the main MEDELLER archive (medeller.tgz) into a
      directory of your choice. This will create a directory called "medeller",
      which we shall refer to as <installdir>.
  
  (2) You will want to add <installdir>/bin to your PATH environment variable.
      This will allow you to run MEDELLER and iMembrane easily from the command
      line.
  
  (3) If you wish to use MEDELLER, iMembrane, PyFREAD or any of their components
      from within your own Python programmes, you need to add
      <installdir>/lib-python to your PYTHONPATH environment variable.
  
  (4) OPTIONAL: Installing a local version of the Protein Data Bank.
      
      This will eliminate the need for a working internet connection when
      using the "automodel" script.
      
      If you do not install a local PDB database, you need to have the standard
      Linux programme "wget" installed. The software will use this command to
      download PDB files directly from the PDB website whenever needed.
      
      To install a local PDB database, run the following commands:
      
      mkdir <installdir>/pdb
      rsync -a --port=33444 rsync.wwpdb.org::ftp_data/structures/divided/pdb/ <installdir>/pdb
      
      Rerun the "rsync" command whenever you wish to update your local PDB.
  
  (5) OPTIONAL: Compiling the CCD implementation used by PyFREAD.

      If you plan to run PyFREAD (or MultiFREAD) with non-standard parameters,
      i.e. an anchor RMSD cut-off above 1.0 Angstrom, you will need to compile
      the C++ program CCD located in <installdir>/bin/pyfread_cpp. You can do
      this by running the following commands:

      cd <installdir>/bin/pyfread_cpp/src
      make
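
  Steps (2) and (3) above amount to two environment variable settings. As a
  sketch for bash users: INSTALLDIR is a placeholder variable introduced here
  for illustration, and its default of $HOME/medeller is only an assumption.
  Set it to the directory created in step (1).

```shell
# INSTALLDIR is a placeholder; point it at the directory created in step (1).
INSTALLDIR="${INSTALLDIR:-$HOME/medeller}"

# Step (2): make the command-line applications available.
export PATH="$INSTALLDIR/bin:$PATH"

# Step (3): make the Python libraries importable.
# The ${PYTHONPATH:+...} form avoids a trailing colon when PYTHONPATH is unset.
export PYTHONPATH="$INSTALLDIR/lib-python${PYTHONPATH:+:$PYTHONPATH}"
```

  To make these settings permanent, the lines can be added to your shell
  start-up file (for bash, typically ~/.bashrc).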
  
  
  If you have installed all the required third-party software, you should now
  be able to run the software using the following commands:

  medellerpipeline
    This is a complete pipeline that takes a target sequence and
    a template structure. It runs iMembrane, JOY, the alignment
    software MP-T, MEDELLER and imem_project. The output
    is a set of annotated models.
    
    MP-T is required for this to work. It is typically distributed together with
    this software. If not, you can obtain it here:
    http://www.stats.ox.ac.uk/research/proteins/resources
  
  automodel
    This is the command-line equivalent to the MEDELLER web application. The
    input is a target-template alignment and a template structure, the output
    is a set of annotated models.
    
    The automodel script calls several applications in sequence:
      imembrane, joy, medeller, imem_project
  
  medeller
    Creates a 3D model of the target protein. The input is the template's
    structure and an alignment between your target and the template, which
    must have been annotated with JOY and iMembrane.
  
  imembrane
    This is a shell script which calls "imem_search" with the option
    "--dbpath <installdir>/imemdb".
    It is just a shortcut that saves you from retyping the database directory.

  imem_search
    Performs an iMembrane search on the iMembrane database. For each
    hit, it outputs a directory containing membrane insertion annotation.
    There is an option to automatically run JOY and add its annotation
    to the output.
  
  imem_project
    Given an iMembrane database entry (the "template") that you already
    know is similar to your query, projects the template's annotation
    onto the query. There is an option to automatically run JOY and add
    its annotation to the output.



Testing the MEDELLER pipeline
=============================

  The best way of testing your installation is to run the medellerpipeline,
  which will use all the core programs in this distribution. It will also
  require a working MP-T installation, so be sure to finish that before
  attempting this test.
  
  After completing the installation, run the following commands:

  cd <installdir>
  bin/medellerpipeline example/1U19A.fasta example/2VT4.pdb A ../blastdb/uniref90.fasta myoutput 1 | tee mystdout.txt
  
  [The number 1 refers to the number of processors to use for the BLAST search.
   If you have a multi-core machine, feel free to increase this number.]
  
  You can compare the output to the expected output by running these commands:
  
  diff mystdout.txt example/stdout.txt
  diff -r --brief myoutput example/output
  
  This will print all differences between your output and the example output. If
  the diff commands produce no output, that is a good sign.
  
  The BLAST homologs produced during this procedure will only be identical to
  the example if you are using the database distributed with this software
  package (blastdb.tgz). If you are using your own database, expect the results
  to differ somewhat.
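
  The comparison can also be wrapped in a small function that reports a
  single verdict. This is a sketch rather than part of the distribution. The
  function name is made up for illustration, and it uses "diff -q", the short
  form of "--brief", so only the names of differing files are printed.

```shell
# Compare a captured stdout file and an output directory against the
# reference files in the example directory (default: "example").
# Returns 0 only if both comparisons find no differences.
compare_to_example() {
    local stdout_file=$1 output_dir=$2 example_dir=${3:-example}
    local status=0
    diff -q "$stdout_file" "$example_dir/stdout.txt" || status=1
    diff -rq "$output_dir" "$example_dir/output" || status=1
    if [ "$status" -eq 0 ]; then
        echo "No differences found"
    else
        echo "Output differs from the example"
    fi
    return "$status"
}

# Example call, run from within <installdir> after the test above:
# compare_to_example mystdout.txt myoutput
```

  If you ran the pipeline against your own BLAST database, a "differs"
  verdict is expected for the reason given above.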



################################################################################



CONTACTING THE AUTHOR

  If you are having trouble installing this application even though you
  have followed the above instructions, or if you have discovered a
  bug, you may contact the author at the following address:
  
      kelm@stats.ox.ac.uk
      
  Please include the words "MEDELLER installation" in the subject line.
