jChIP project

jChIP manual

Requirements

64-bit Java Runtime Environment (version 7 is preferred)

Minimum 4GB of memory (large datasets may require greater amounts of RAM)

Running

Since jChIP is a pure Java application, it can run on any operating system that provides a suitable Java Runtime Environment. It can be launched with the jChIP.bat script on Windows or the jChIP.sh script on Linux and macOS. The maximum amount of memory available to the application can be set using the -Xmx argument inside the scripts.
It is also possible to run jChIP from the command line by typing:
java -jar -Xmx4000M jChIP.jar
There are also binaries for each platform available on the Downloads page. They are configured to use 4GB of memory.

Creating a new working set

Before any analysis, a new working set has to be created. To create one, press Ctrl+N or select Add new working set in the File menu. Then choose a directory in which all files will be saved. Finally, enter a name, which will be visible in the main application window.

Creating a protein-DNA binding profile

To create a new protein-DNA binding profile, select one of the working sets and click on the settings icon. The settings dialog presents a set of profile parameters; choose the configuration appropriate for the current experiment. The following settings are available:

Locus length - size of the tolerance window in which tags are assigned to the specified loci in the genome.
Locus offset - window shift; 0 means that the locus is in the middle of the window. The window can be moved in both directions.
Assignment files - files containing raw sequencing data or peaks. The supported data formats are: Bowtie, SAM, BAM, BED and WIG.
Filter multiple assignments - if selected, tags that can be mapped to more than one locus are removed.
Assignment count normalization - normalizes the number of tags in every single locus to one.
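
As a rough illustration of how Locus length, Locus offset and Filter multiple assignments might interact, the following Python sketch counts tags per locus. It is not jChIP's actual implementation: the window definition (centered on the locus position plus the offset, extending half the locus length to each side) and all names such as assign_tags are assumptions made for this example.

```python
def assign_tags(tag_positions, loci, length, offset=0, filter_multiple=False):
    """Count tags falling into a window around each locus.

    Assumption: the window is centered on (locus position + offset) and
    spans length / 2 on each side; jChIP may define it differently.
    """
    half = length / 2
    counts = {name: 0 for name in loci}
    for tag in tag_positions:
        # All loci whose window contains this tag.
        hits = [name for name, pos in loci.items()
                if abs(tag - (pos + offset)) <= half]
        if filter_multiple and len(hits) > 1:
            continue  # tag maps to more than one locus: drop it
        for name in hits:
            counts[name] += 1
    return counts


# Hypothetical loci (name -> coordinate) and tag coordinates.
loci = {"geneA": 1000, "geneB": 1300}
tags = [900, 1150, 1400, 5000]

print(assign_tags(tags, loci, length=400))
print(assign_tags(tags, loci, length=400, filter_multiple=True))
```

In this example the tag at 1150 lies inside both windows, so it is counted for both loci by default and discarded when multiple assignments are filtered.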

Database - name of the database (only Ensembl is currently supported).
Species - identifier of the selected species (only Homo sapiens and Mus musculus are currently supported).
Main attribute - type of loci to which the tags will be assigned.
Check strand - if selected, the directionality of DNA strands will be taken into account.
Secondary attribute - if Check strand is selected, this attribute will be considered for the '-' strand.
Gene types - if selected, only the specified gene types will be taken into account.
GO terms file - if selected, the specified file with GO terms will be read. GO terms should be placed in a single column (max. 500 terms).

Load loci list - if selected, loci from the specified file will be used.
Save loci list - if selected, the loci list will be saved to the specified file after the profile is created.
Filter short loci - if selected and strand checking is active, loci shorter than the specified length will be removed. Locus length is calculated as the difference between the main and secondary attribute.
Filter near loci - if selected, loci that lie closer to each other than the specified distance are filtered. There are three filtration methods:
remove all near loci - all close loci will be removed
leave one random locus - one random locus will be kept
leave one locus - single locus in the middle will be kept
Pre-Filter - if selected and near loci filtering is enabled, loci with low tag counts are pre-filtered according to the following parameters:
Pre-Filter length - window length in which tags are assigned
Pre-Filter offset - window shift
Pre-Filter tags - minimal number of tags a locus must contain to pass the filter.
Filter low-tag loci - if selected, loci with fewer tags than specified are filtered.
Filter multi-tag loci - if selected, loci with more tags than specified are filtered.
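
The three Filter near loci methods can be sketched in Python as follows. This is only a hedged illustration: the way close loci are grouped (chaining sorted positions whose consecutive distance is below the threshold), the choice of the "middle" locus, and the names filter_near_loci, remove_all, one_random and one_middle are assumptions, not taken from jChIP.

```python
import random


def filter_near_loci(positions, min_distance, method="remove_all", rng=None):
    """Filter loci lying closer together than min_distance.

    Assumption: loci are chained into a cluster when consecutive sorted
    positions are less than min_distance apart.
    """
    rng = rng or random.Random()
    clusters, current = [], []
    for pos in sorted(positions):
        if current and pos - current[-1] >= min_distance:
            clusters.append(current)  # gap is large enough: close the cluster
            current = []
        current.append(pos)
    if current:
        clusters.append(current)

    kept = []
    for cluster in clusters:
        if len(cluster) == 1:
            kept.append(cluster[0])                    # isolated locus: always kept
        elif method == "one_random":
            kept.append(rng.choice(cluster))           # leave one random locus
        elif method == "one_middle":
            kept.append(cluster[len(cluster) // 2])    # leave the locus in the middle
        elif method != "remove_all":
            raise ValueError(f"unknown method: {method}")
        # method == "remove_all": nothing from a crowded cluster is kept
    return kept


# Hypothetical locus coordinates.
positions = [100, 150, 180, 1000, 5000, 5040]
print(filter_near_loci(positions, 100, "remove_all"))
print(filter_near_loci(positions, 100, "one_middle"))
```

Passing a seeded random.Random instance as rng makes the "one_random" variant reproducible, which is convenient when comparing runs.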

Filter low-tag positions - if selected, single genome positions with fewer tags than specified are filtered.
Filter multi-tag positions - if selected, single genome positions with more tags than specified are filtered.
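
The low-tag and multi-tag filters, for loci as well as for single positions, amount to keeping entries whose tag count lies within a range. A minimal Python sketch follows, assuming that both thresholds are inclusive (an entry with exactly the specified number of tags is kept); the function name filter_by_tag_count and its parameters are hypothetical.

```python
def filter_by_tag_count(counts, min_tags=None, max_tags=None):
    """Keep entries whose tag count lies within [min_tags, max_tags].

    counts maps a locus ID or a genome position to its number of tags.
    A threshold of None disables the corresponding filter.
    """
    return {key: n for key, n in counts.items()
            if (min_tags is None or n >= min_tags)
            and (max_tags is None or n <= max_tags)}


# Hypothetical tag counts per locus.
counts = {"geneA": 0, "geneB": 7, "geneC": 250}
print(filter_by_tag_count(counts, min_tags=1, max_tags=100))
```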

After adjusting all parameters, the settings may be exported to a file. The saved settings can be reused in further analyses by importing the file.
After the Run button is clicked, loci data are downloaded from the database or imported from the given file, and all calculations are performed.

Single profile results

After profiles are computed, some results are available on the working set tree. These results include:

profile - the protein binding profile
report - summary report containing statistics of filtration and tags assignment
loci table - table containing ID, coordinate, type, length and number of assigned tags for every processed locus. The rows can be sorted by clicking column headers or by selecting options from the context menu (right mouse click)
loci hist. - distribution of loci containing different numbers of tags
pos hist. - distribution of single positions containing different numbers of tags
chromosomes - distribution of assigned tags over chromosomes. Every chromosome is plotted separately
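
To make the meaning of the loci hist. and pos hist. results concrete, the following Python sketch computes such a distribution from a mapping of tag counts. It is an illustration only; the function name tag_count_histogram and the input layout are assumptions.

```python
from collections import Counter


def tag_count_histogram(counts):
    """Map each tag count to the number of loci (or positions) having it."""
    return dict(sorted(Counter(counts.values()).items()))


# Hypothetical tag counts per locus.
counts = {"geneA": 2, "geneB": 0, "geneC": 2, "geneD": 5}
print(tag_count_histogram(counts))
```

Here two loci contain exactly two tags, so the histogram bin for a tag count of 2 has height 2.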

Every created profile can be renamed or removed with a right mouse click. Most of the presented data can also be exported by right-clicking on the corresponding window.

Common profile results

jChIP also offers the possibility to compare different profiles. There are several methods of comparative analysis:

Common loci profile - the chosen profiles are plotted on a single plot. If the chi^2 test option is checked, the selected profile is treated as the control sample and chi^2 tests are computed to obtain a measure of profile similarity. The Sum option plots the sum of the selected profiles.
Common loci table - results are shown in one table, with the same gene from different profiles placed in a single row. The sum variant shows all genes from the selected profiles, filling missing values with zeros. The common variant shows only genes that appear in all profiles.
Common loci histogram - the selected loci histograms are plotted in a single figure. Every histogram can optionally be scaled to unit area, and the histogram range of interest can be set.
Common positions histogram - the selected single-position histograms are plotted in one plot. Every histogram can optionally be scaled to unit area, and the histogram range of interest can be set.
XY loci plot - the numbers of tags for the same genes in two different profiles are shown on a single XY plot. If more than one locus has the same numbers of tags in both profiles, the corresponding point is colored. The color scale runs from black to red.
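
The difference between the sum and common variants of the Common loci table can be sketched in Python as a union versus an intersection of gene IDs. The data layout (one dictionary of tag counts per profile) and the function name common_loci_table are assumptions for illustration, not jChIP internals.

```python
def common_loci_table(profiles, variant="sum"):
    """Merge per-profile tag counts into one row per gene.

    profiles maps a profile name to a {gene ID: tag count} dictionary.
    Each returned row lists the counts in the order of the profiles.
    """
    gene_sets = [set(p) for p in profiles.values()]
    if not gene_sets:
        return {}
    if variant == "sum":
        genes = set().union(*gene_sets)          # every gene from any profile
    elif variant == "common":
        genes = set.intersection(*gene_sets)     # only genes present in all profiles
    else:
        raise ValueError(f"unknown variant: {variant}")
    # Missing values are filled with zeros (only relevant for the sum variant).
    return {g: [p.get(g, 0) for p in profiles.values()] for g in sorted(genes)}


# Hypothetical profiles.
profiles = {
    "control": {"geneA": 3, "geneB": 1},
    "treated": {"geneB": 4, "geneC": 2},
}
print(common_loci_table(profiles, "sum"))
print(common_loci_table(profiles, "common"))
```

The rows produced for two profiles are also the coordinate pairs that an XY loci plot would display, one point per gene.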