iPiG - Integrating Peptide spectrum matches Into Genome browser visualizations

Introduction
------------

iPiG integrates peptide spectrum matches (PSMs) from mass spectrometry (MS)
peptide identifications into the genomic visualizations provided by genome
browsers such as the UCSC Genome Browser (http://genome.ucsc.edu/).

iPiG takes PSMs in the MS standard format mzIdentML (*.mzid) and writes
results in genome track formats (BED and GFF3 files), which can be easily
imported into genome browsers.

For more details about iPiG and how it works, please see

"iPiG: Integrating Peptide Spectrum Matches Into Genome Browser Visualizations"
Mathias Kuhring and Bernhard Y. Renard
(submitted manuscript)

PLEASE NOTE: It is recommended to read the paper and this readme.txt file
first, followed by the wiki pages provided at the project webpage:
https://sourceforge.net/projects/ipig/

Instructions
------------

iPiG comes with two additional tools. Besides the mapping procedure itself
(iPiG), there is an optional gene quality control (GeneControl) and a download
tool that helps to get the necessary data (Downloader). An example of using
iPiG, GeneControl and Downloader is provided in the wiki.

GeneControl checks the integrity of the required gene annotations and the
consistency between the annotations and the corresponding amino acid
sequences. It is therefore recommended to run it once for each set of gene
annotations used.

System Requirements
-------------------

iPiG is developed in Java version 6. It is therefore platform independent, but
requires a Java Runtime Environment (JRE) in version 6. Version 7 is not fully
tested. The JRE can be downloaded from the Oracle webpage:
http://www.oracle.com/technetwork/java/javase/downloads/index.html

User Interfaces
---------------

iPiG provides a command line interface (CLI) as well as a graphical user
interface (GUI). Execution scripts are provided for Windows and Linux.
Type "ipig" without any parameters to get a short help on parameter usage.

On Linux platforms, it might be necessary to set execution rights for the
scripts. This can be done with "chmod +x ipig", "chmod +x ipiggui",
"chmod +x gcgui" and "chmod +x downloader", respectively.

You may add the iPiG directory to your PATH variable, for instance to use iPiG
directly in your data folders. Please consult your operating system's manual
on how to set the PATH variable.

In general, all interfaces (including GeneControl) can be accessed by
executing the iPiG script file, which is recommended over executing the
iPiG.jar file with Java directly. The GUIs of iPiG, GeneControl and Downloader
should be started with the additional scripts "ipiggui", "gcgui" and
"downloader", although the GUI scripts do not pass on any further parameters
yet.

Data Requirements (iPiG)
------------------------

iPiG requires information about gene locations and protein-gene connections.
Thus, several files have to be provided. See the wiki for recommended sources
and how to specify them.

1.) A file with peptide spectrum matches (PSMs), preferably in mzIdentML
    format. Alternatively, a tab-separated text file with particular columns
    (see ipig.pdf).
2.) The annotations of a reference genome in UCSC table format (*.txt).
3.) A file containing the corresponding amino acid translations in UCSC table
    format (*.txt).
4.) Optional but highly recommended: A UniProt ID-mapping file
    (tab-delimited, *.tab).
    Source: ftp://ftp.uniprot.org/pub/databases/uniprot/current_release/knowledgebase/idmapping/
5.) Optional: A proteome in FASTA format, e.g. containing the proteins used
    for peptide identification (*.fasta).

For more details about the formats and for examples, please have a look at the
wiki and the EXAMPLES folder.

Data Requirements (GeneControl)
-------------------------------

1.) The annotations of a reference genome in UCSC table format (*.txt).
2.) A file containing the corresponding amino acid translations in UCSC table
    format (*.txt).
3.) Reference sequences of the chromosomes in FASTA format, indicated by a
    path. Each chromosome must be in its own file, and files must be named
    like the chromosomes in the annotations (e.g. chr11.fa, chrY.fa, chrIV.fa
    etc.).

-----------------------------------------------------------------------------------------

Copyright (c) 2012,
Mathias Kuhring, KuhringM@rki.de, Robert Koch-Institute, Germany,
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:

    * Redistributions of source code must retain the above copyright
      notice, this list of conditions and the following disclaimer.
    * Redistributions in binary form must reproduce the above copyright
      notice, this list of conditions and the following disclaimer in the
      documentation and/or other materials provided with the distribution.
    * The name of the author may not be used to endorse or promote products
      derived from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL Mathias Kuhring BE LIABLE FOR ANY DIRECT,
INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY
OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE,
EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.