INSTALLATION INSTRUCTIONS FOR MOBYLE
************************************

APOLOGIES

We have not yet had time to provide a proper setup script for the
Mobyle system, which is rather complex to install (see below). We
have scheduled the release of a more automated setup procedure along
with version 0.96.

1- Requirements:

- Any machine running a Unix-like OS.
- An Apache server with mod_cgi loaded.
- Python, >= 2.5
- The following Python libraries:
	+ simpletal, >= 4.1
	+ 4suite, >= 1.0.2
	+ simplejson, >= 1.7.1
	+ python imaging library, >= 1.1.5
	+ PyCAPTCHA (http://releases.navi.cx/pycaptcha/pycaptcha-0.4.tar.bz)
- A biological sequence/alignment format converter, squizz or jreadseq
  (squizz strongly recommended).
- Optional:
	+ a batch system, such as SGE (http://gridengine.sunsource.net/) or
	  Torque (http://www.clusterresources.com/pages/products/torque-resource-manager.php).
	+ dnspython >= 1.5.0, useful for checking the validity of user
	  e-mail domains.
	+ golden (ftp://ftp.pasteur.fr/pub/gensoft/projects/golden/),
	  useful for loading biological sequences from databanks
	  directly into the web portal.
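
A quick way to check the Python requirements listed above is to try
importing each library. The sketch below assumes the usual top-level
module names (Ft for 4suite, PIL for the Python Imaging Library,
Captcha for PyCAPTCHA, dns for dnspython). These names are not stated
in this document, so adjust them if your installation differs.

```python
# ASSUMPTION: top-level module names, which this document does not give.
REQUIRED_MODULES = ["simpletal", "Ft", "simplejson", "PIL", "Captcha"]
OPTIONAL_MODULES = ["dns"]


def missing_modules(names):
    """Return the names from `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            __import__(name)
        except ImportError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    for label, names in (("required", REQUIRED_MODULES),
                         ("optional", OPTIONAL_MODULES)):
        for name in missing_modules(names):
            print("missing %s module: %s" % (label, name))
```

The script only reports whether a module can be imported; it does not
check the version numbers given in the list.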

2- Technical overview:

A "Mobyle" server does not run any specific daemon (apart from
Apache). When a user launches a job, they are actually running a CGI
that runs a bioinformatics program in a subprocess. If the subprocess
runs for more than a certain time, the CGI detaches itself and
continues monitoring the execution until it ends. The "apache" user is
therefore the one that runs every request, and permissions should be
set so that this user can access and run all the data, programs, and
parameters of the Mobyle configuration.
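
The idea of waiting a limited time for a subprocess and then carrying
on monitoring it can be sketched as follows. This is not Mobyle's
actual code: the function name run_with_grace_period and its
parameters are hypothetical, and the sketch only shows the waiting
part, not the detaching.

```python
import subprocess
import sys
import time


def run_with_grace_period(command, grace_seconds, poll_interval=0.1):
    """Start `command` and wait at most `grace_seconds` for it to finish.

    Returns (process, finished). When `finished` is False the program is
    still running and the caller is expected to keep monitoring `process`.
    """
    process = subprocess.Popen(command)
    deadline = time.time() + grace_seconds
    while time.time() < deadline:
        if process.poll() is not None:
            return process, True
        time.sleep(poll_interval)
    return process, process.poll() is not None


if __name__ == "__main__":
    # A trivial job that ends well within the grace period.
    job, finished = run_with_grace_period([sys.executable, "-c", "pass"], 5)
    print("finished=%s returncode=%s" % (finished, job.returncode))
```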

3- Important elements in the Mobyle archive

Local => Configuration and local parameters or code for the Mobyle
         system.

Programs => Mobyle XML wrappers (distributed separately)

Src => Mobyle source code:
	* the Mobyle folder contains the "core" code for the Mobyle
          Server,
	* the Portal folder contains the code for the web portal that
          provides an access to the system.

Tools => A few utilities and scripts, including the setsid binary,
         which is used only if jobs are executed using the "Sys" job
         manager.

Utils => Siginterrupt/setsid sources.

4- Installation steps

4.1 - Make sure every required dependency is present.

4.2 - Create the Mobyle tree structure

All the Mobyle code and configuration are located in a given
directory, whose path is stored in the MOBYLEHOME environment
variable.

=> Extract the contents of the archive into the chosen directory,
which has to be accessible by the apache user. The RunnerChild script
(located in ${MOBYLEHOME}/Src/Mobyle/RunnerChild.py) has to be
executable by apache.

The data (jobs, sessions, etc.) as well as the web portal code have to
be accessible by apache, but should also be published on the web.

=> Copy the portal in the www directory: the MobylePortal folder
located in Portal/htdocs to "DocumentRoot", and the MobylePortal
folder located in Portal/cgi-bin to the "cgi-bin" folder.

=> Create a folder in DocumentRoot to store Mobyle data and meta-data:
for instance, a "Mobyle" folder in DocumentRoot. In this new folder,
create the following sub-folders:

    - jobs and jobs/ADMINDIR: contain the job data.

    - sessions: will contain the user sessions (e-mail addresses, job
      lists, etc.).

=> Create a folder somewhere accessible by Apache that will store the
uploaded data before their format is checked.

The log folder will contain all the Mobyle logs.

=> Create a log folder (accessible by apache), for instance
/var/log/mobyle. It will hold the following files:

	- error_log, tracks server errors.
	- access_log, tracks launched jobs.
	- account_log (optional), tracks the execution time of each job.
	- session_log, tracks the session mechanism.
	- debug, if level 3 job debug is set in configuration, stores
          the child process standard output.
	- build, if level 1 job debug is set in configuration, tracks
          the construction of the job launch command line.

4.3 - Compile Siginterrupt (and setsid if required)

=> Compile Siginterrupt

	% cd ${MOBYLEHOME}/Utils
	% python setup.py build
	% python setup.py install

=> Compile setsid

setsid is mandatory only if Mobyle is set up to run jobs in "Sys"
mode, i.e., without any particular batch management system.

	% cd ${MOBYLEHOME}/Utils
	% cc setsid.c -o setsid
	% cp setsid ../Tools

4.4 - Configure Mobyle

=> Copy the ${MOBYLEHOME}/Exemple/Local/Config/Config.template.py file
to ${MOBYLEHOME}/Local/Config/Config.py and edit it. Here are the main
configuration variables:

	- ROOT_URL = the hostname + port of the server (e.g.,
          http://mymobylemachine.myplace.com:80).
	- RESULTS_PATH = the path to the "jobs" folder.
	- USER_SESSIONS_PATH = the path to the "sessions" folder.
	- FORMAT_DETECTOR_CACHE_PATH = the path to the format detector
          cache folder.
	- MOBYLEROOT_HTDOCS_URL = the path from the website root to the
          Mobyle htdocs folder.
	- MOBYLEROOT_CGI_URL = the path from the website root to the
          Mobyle cgi-bin folder.
	- LOGDIR = the log directory path.

	- DATABANKS_CONFIG = lists the bio-banks from which data can be
          loaded in the web portal. This list should remain empty
          unless the golden program is compiled and set up.

	- MAINTAINER = the server administrator's e-mail address, used
          to send critical error messages.
	- HELP = the e-mail address of the person who receives user
          help requests. This address is also used as the sender of
          job notification e-mails to users.
	- MAILHOST = mail server used to send the above-mentioned
          messages.

	- OPT_EMAIL = defines whether entering an e-mail address is
          mandatory before running a job.
	- PARTICULAR_OPT_EMAIL = overrides the above directive on a
          per-program basis.

	- ANONYMOUS_SESSION = "captcha"|"no"|"yes" authorizes or forbids
          job submission from anonymous sessions (or requires solving
          a captcha to stop bot submissions).
	- AUTHENTICATED_SESSION = "email"|"no"|"yes" authorizes the
          creation of authenticated sessions, for which an e-mail
          address confirmation system can be used.

	- BATCH = batch submission system. Set to Sys if you do not
          have such a system available.
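
Several of these variables hold directories that the apache user must
be able to use. As a sanity check after editing Config.py, something
like the following sketch can be run as the apache user. The function
name path_problems is hypothetical, and the values in the example call
are placeholders rather than recommended locations.

```python
import os


def path_problems(paths):
    """Return {name: reason} for entries that are not usable directories.

    `paths` maps a configuration variable name to the directory it holds.
    """
    problems = {}
    for name, path in paths.items():
        if not os.path.isdir(path):
            problems[name] = "not a directory: %s" % path
        elif not os.access(path, os.R_OK | os.W_OK | os.X_OK):
            problems[name] = "not readable/writable: %s" % path
    return problems


# Example call (hypothetical values):
# path_problems({"RESULTS_PATH": "/var/www/html/Mobyle/jobs",
#                "USER_SESSIONS_PATH": "/var/www/html/Mobyle/sessions",
#                "LOGDIR": "/var/log/mobyle"})
```

An empty dictionary means that every listed directory exists and is
readable and writable by the user running the check.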

4.5 - Configure Apache

=> Set up MOBYLEHOME

The MOBYLEHOME variable must be set in the environment of the
cgis. Ex:

        ScriptAlias /cgi-bin/ /var/www/cgi-bin/
        <Directory "/var/www/cgi-bin">
                AllowOverride None
                Options FollowSymLinks
                Order allow,deny
                Allow from all
                SetEnv MOBYLEHOME /home/hmenager/cvs/Mobyle
        </Directory>
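
To verify that the variable actually reaches the cgis, a throwaway
script such as the sketch below can be dropped in the cgi-bin folder
and requested from a browser. It is not part of Mobyle, and the
function name mobylehome_report is hypothetical. The script needs a
"#!" line pointing at your Python interpreter and must be executable.

```python
import os


def mobylehome_report(environ):
    """Return a one-line plain-text report about MOBYLEHOME in `environ`."""
    home = environ.get("MOBYLEHOME")
    if not home:
        return "MOBYLEHOME is not set"
    if not os.path.isdir(home):
        return "MOBYLEHOME is set to %s, which is not a directory" % home
    return "MOBYLEHOME is set to %s" % home


if __name__ == "__main__":
    # Minimal CGI response: header, blank line, body.
    print("Content-Type: text/plain\n")
    print(mobylehome_report(os.environ))
```

Remove the script once the check is done.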

=> Specific MIME types

In order to be able to upload/visualize some specific data types, such
as PDB files, you need to override their MIME type. The default in
/etc/mime.types is chemical/x-pdb, but this MIME type forces most
browsers to open such data with an external tool. Therefore, if you
use such data, you can, for instance, add this directive to the Apache
configuration file /etc/apache2/mods-available/mime.conf:
	AddType text/plain .pdb

=> Download button

The "save" button that is available in job results will automatically
open a "save as" prompt, provided that you add the following
directives to the Apache server configuration (they rely on
mod_rewrite and mod_headers):

# Replace /MobyleData/jobs with the URL path of your own Mobyle RESULTS_PATH
RewriteEngine on
RewriteCond %{REQUEST_URI} ^/MobyleData/jobs(\.*)
RewriteCond %{QUERY_STRING} ^save$
RewriteRule (.*)/([^/]+)$ $1/$2 [E=SAVEDFILENAME:$2]
Header set Content-Disposition "attachment; filename=\"%{SAVEDFILENAME}e\"" env=SAVEDFILENAME
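
To make the effect of these directives easier to follow, the sketch
below reproduces their logic in Python: for a request under the jobs
URL whose query string is exactly "save", the last path component
becomes the suggested file name. It is only an approximation of what
Apache does, the function name content_disposition is hypothetical,
and the example path in the comment is made up.

```python
import re

# Same patterns as in the Apache directives; adapt the first one to
# your own RESULTS_PATH.
URI_PATTERN = re.compile(r"^/MobyleData/jobs(\.*)")
QUERY_PATTERN = re.compile(r"^save$")
FILE_PATTERN = re.compile(r"(.*)/([^/]+)$")


def content_disposition(request_uri, query_string):
    """Return the Content-Disposition value the rules would set, or None."""
    if not URI_PATTERN.search(request_uri):
        return None
    if not QUERY_PATTERN.search(query_string):
        return None
    match = FILE_PATTERN.search(request_uri)
    if match is None:
        return None
    return 'attachment; filename="%s"' % match.group(2)


# Example (hypothetical job path):
# content_disposition("/MobyleData/jobs/blast2/A123/result.txt", "save")
```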

4.6 - Set up and configure Golden and Squizz

4.6.1 - Golden (optional)

Golden is a program that retrieves sequence entries from bio-banks. It
is used within the Mobyle portal to load data directly from these
banks before analyzing them.

=> Installation

	% tar -xzf golden-1.1a.tar.gz
	% cd golden-1.1a/
	% ./configure
	% make
	% make install

=> Configuration

In ${MOBYLEHOME}/Local/Config/Config.py, set the GOLDEN_PATH variable
to the path of the golden binary, and describe the available bio-banks
in DATABANKS_CONFIG, e.g.:

	DATABANKS_CONFIG = [
	   { 'id':'embl', 'dataType':'Sequence', 'bioTypes':['Nucleotide'],
	     'label': 'EMBL Nucleotide Sequence Database'},
	   {'id':'enzyme', 'dataType':'Sequence', 'bioTypes':['Protein'],
	     'label': 'Enzyme nomenclature database'},
	   {'id':'uniprot', 'dataType':'Sequence', 'bioTypes':['Protein'],
	     'label': 'Universal Protein Resource = SwissProt + TrEMBL + PIR'}
	]

`id' is a unique identifier of the bank, as listed by the `golden -l'
command.
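
A malformed entry in this list is easy to miss. The sketch below
checks that each entry carries the four keys used in the example and
that no `id' appears twice. The set of required keys is inferred from
the example alone, and the function name databank_problems is
hypothetical. The check does not compare the ids with the output of
`golden -l'.

```python
# ASSUMPTION: the keys shown in the example are the ones every entry needs.
EXPECTED_KEYS = ("id", "dataType", "bioTypes", "label")


def databank_problems(databanks):
    """Return a list of human-readable problems found in `databanks`."""
    problems = []
    seen = set()
    for position, bank in enumerate(databanks):
        for key in EXPECTED_KEYS:
            if key not in bank:
                problems.append("entry %d: missing key '%s'" % (position, key))
        bank_id = bank.get("id")
        if bank_id is not None:
            if bank_id in seen:
                problems.append("entry %d: duplicate id '%s'"
                                % (position, bank_id))
            seen.add(bank_id)
    return problems
```

An empty list means that no problem was found.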

For detailed instructions about the golden program setup, please refer
to the golden distribution.

4.6.2 - Squizz (Strongly recommended)

It is mandatory to set up a sequence/alignment format
detector/converter program. Squizz is strongly recommended, but you
can also use the java version of readseq
(http://iubio.bio.indiana.edu/soft/molbio/readseq/), although it is
far too permissive in our opinion.

=> Setup

	% tar -xzf squizz-0.99.tar.gz
	% cd squizz-0.99/
	% ./configure
	% make
	% make install

=> Configuration

	SEQCONVERTER = paths to the installed format detection programs
	(the only ones currently supported are SQUIZZ and READSEQ).

e.g.:

	SEQCONVERTER= {
	  'SQUIZZ': '/usr/local/bin/squizz',
	  'READSEQ': '/home/hmenager/cvs/Mobyle/Tools/jreadseq'
	}
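
Since Mobyle depends on these binaries, it is worth checking that each
configured path points to an executable file. A minimal sketch, with a
hypothetical function name, to be run as the apache user:

```python
import os


def unusable_converters(seqconverter):
    """Return the sorted names whose path is not an executable file."""
    return sorted(
        name for name, path in seqconverter.items()
        if not (os.path.isfile(path) and os.access(path, os.X_OK))
    )


# Example call, using the dictionary from Config.py:
# unusable_converters(SEQCONVERTER)
```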

4.7 - Deploy the programs

The programs you wish to make available using Mobyle are published
using XML program descriptions. We provide a set of program
descriptions, which is available at
ftp://ftp.pasteur.fr/pub/gensoft/projects/mobyle/Programs-xxx.tgz

Download it and expand the archive in the Programs subfolder. Then,
configure Mobyle according to the bioinformatics software installed on
your platform.

- ORDER
  The order in which the INCLUDE and EXCLUDE directives are evaluated.

- INCLUDE
  The list of program descriptions to install.

- EXCLUDE
  The list of program descriptions not to install.

Shell wildcards can be used in the INCLUDE and EXCLUDE directives. For
example, 'dna*' refers to all program descriptions whose name begins
with 'dna'.
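
The interplay of wildcards and evaluation order can be illustrated
with Python's fnmatch module. The sketch below is not the
programInstaller code. It ASSUMES that the directives are applied in
the given order and that a later directive overrides an earlier one
for the names it matches, which this document does not spell out. The
function name and the program names used in the example are
illustrative only.

```python
import fnmatch


def select_descriptions(names, include, exclude,
                        order=("include", "exclude")):
    """Return the names selected by shell-style patterns.

    ASSUMPTION: directives are applied in `order`, and a later directive
    overrides an earlier one for the names it matches.
    """
    patterns = {"include": include, "exclude": exclude}
    selected = {}
    for directive in order:
        for name in names:
            if any(fnmatch.fnmatchcase(name, p) for p in patterns[directive]):
                selected[name] = (directive == "include")
    # Names matched by no directive are left out.
    return [name for name in names if selected.get(name, False)]


# Example (illustrative names):
# select_descriptions(["dnadist", "dnapars", "blast2"],
#                     include=["dna*"], exclude=["dnapars"])
```

Under the stated assumption, the example call keeps "dnadist" only:
"dnapars" is first included by 'dna*' and then excluded, and "blast2"
is never included.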

Use the programInstaller.py script, located in the Tools subfolder, to
install the program descriptions (for more details, see the associated
README). Once the program descriptions are installed, make sure that
they are readable by the web server. The programInstaller does not
install the programs themselves, only their XML descriptions. For the
descriptions to be useful, you must separately install the
bioinformatics software they correspond to.