DerTALEv2: Difference between revisions
Revision as of 22:29, 24 May 2024
DerTALEv2 filters predictions of TALE target boxes by the presence of differentially expressed regions in a defined vicinity around a predicted target box.
As input, DerTALEv2 requires a list of target box predictions as generated by the PrediTALE tool, which is included in the DerTALEv2 JAR file.
For determining differentially expressed regions, DerTALEv2 also needs mapped RNA-seq data from Xanthomonas-infected (treatment) and control samples in BAM format. BAM is the standard output format of most mappers and may also be generated from SAM files using samtools. For each BAM file, DerTALEv2 additionally needs an index file with the same base name as the BAM file plus the additional extension .bai (as generated by samtools).
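The input preparation described above can be sketched as follows. This is a minimal sketch, assuming samtools is installed and on the PATH; the function names prepare_bam and has_bam_index and all file names are hypothetical, and the index name is read here as <file>.bam.bai, which is what samtools index writes by default.

```shell
#!/bin/sh
# Convert a SAM file into a coordinate-sorted BAM file and index it.
# The file name passed in is a placeholder; samtools must be on the PATH.
prepare_bam() {
    sam="$1"                     # e.g. treatment.sam
    bam="${sam%.sam}.bam"        # e.g. treatment.bam
    # sort writes the BAM file, index then writes <bam>.bai next to it
    samtools sort -o "$bam" "$sam" && samtools index "$bam"
}

# Succeed only if both the BAM file and its .bai index are present.
# The naming <file>.bam.bai is an assumption based on samtools' default.
has_bam_index() {
    [ -f "$1" ] && [ -f "$1.bai" ]
}
```

A call such as `prepare_bam treatment.sam && has_bam_index treatment.bam` would then confirm that the treatment data is ready; the same applies to each control file.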
Further parameters that can be specified include the number of predictions in the list that are considered (counting from the top), the width of the region in which differential expression is considered, the width of the window that needs to be differentially expressed, and a threshold on the log (base 2) differential abundance (e.g., 1 for a two-fold induction).
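Because the threshold is given on a log (base 2) scale, a threshold t corresponds to a fold change of 2^t. The following is a small illustration of that arithmetic only; the helper name log2_to_fold is hypothetical and not part of DerTALEv2.

```shell
#!/bin/sh
# Convert a log2 threshold into the corresponding fold change (2^t).
log2_to_fold() {
    awk -v t="$1" 'BEGIN { printf "%.2f\n", 2 ^ t }'
}

log2_to_fold 1    # prints 2.00 (two-fold induction)
log2_to_fold 2    # prints 4.00 (four-fold induction)
```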
Command line tool
DerTALEv2 is distributed in the hope that it will be useful, but without any warranty; without even the implied warranty of merchantability or fitness for a particular purpose.
DerTALEv2 and PrediTALE are packaged in one runnable JAR that may be run from the command line with
java -jar DerTALEv2.jar
which lists the available tools and usage information:

    Available tools:

        dertalev2 - DerTALEv2
        preditale - PrediTALE

    Syntax: java -jar DerTALEv2.jar <toolname> [<parameter=value> ...]

    Further info about the tools is given with
        java -jar DerTALEv2.jar <toolname> info

    For tests of individual tools:
        java -jar DerTALEv2.jar <toolname> test [<verbose>]

    Tool parameters are listed with
        java -jar DerTALEv2.jar <toolname>
You get a list of the tool parameters by calling DerTALEv2.jar with the corresponding tool name, e.g.,
java -jar DerTALEv2.jar dertalev2
The meaning of the individual tool parameters is described below.
Tool parameters
DerTALEv2
DerTALEv2 filters predictions of TALE target boxes by the presence of differentially expressed regions in a defined vicinity around a predicted target box.
If you experience problems using DerTALEv2, please contact us.
DerTALEv2 may be called with
java -jar DerTALEv2.jar dertalev2
and has the following parameters
name | comment | type
p | Predictions (Predictions output file, type = tsv,tabular) | FILE
The following parameter(s) can be used multiple times:
 | |
The following parameter(s) can be used multiple times:
 | |
n | Number of predictions (Number of (top) predictions considered, default = 100) | INT
r | Region width (Number of bases around the predicted site, default = 500) | INT
Threshold | Threshold (Threshold on the log differential abundance, default = 1.0) | DOUBLE
s | Stranded (Defines whether the reads are stranded. In case of FR_FIRST_STRAND, the first read of a read pair, or the only read in case of single-end data, is assumed to be located on the forward strand of the cDNA, i.e., reverse to the mRNA orientation. If you are using Illumina TruSeq, you should use FR_FIRST_STRAND. range={FR_UNSTRANDED, FR_FIRST_STRAND, FR_SECOND_STRAND}, default = FR_UNSTRANDED) | STRING
cc | Coverage cutoff (Minimum number of reads as coverage cutoff, default = 10) | INT
rev | Region elongation value (Number of bases by which a region is elongated if the coverage at the start/end of the region is above half of the coverage cutoff, default = 100) | INT
m | Minimum length of candidate region (default = 100) | INT
mcotcr | Minimum coverage of the candidate region (default = 50) | INT
outdir | The output directory, defaults to the current working directory (.) | STRING
Example:
java -jar DerTALEv2.jar dertalev2 p=<Predictions> t=<Treatment_BAM> c=<Control_BAM>
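Optional parameters from the table can be appended to the call in the same name=value form. The sketch below only assembles and prints such a command line so that it can be inspected before running. The helper name dertale_cmd, the file names, the output directory dertale_out and the chosen values are all assumptions for illustration; whether the output directory must already exist is not stated here.

```shell
#!/bin/sh
# Print a DerTALEv2 call with some optional parameters set explicitly.
# Arguments: predictions file, treatment BAM, control BAM (all placeholders).
dertale_cmd() {
    echo java -jar DerTALEv2.jar dertalev2 \
        p="$1" t="$2" c="$3" \
        n=50 r=500 Threshold=1.0 s=FR_FIRST_STRAND \
        outdir=dertale_out
}

# Show the assembled command; remove the echo above to actually run it.
dertale_cmd predictions.tsv treatment.bam control.bam
```

Here n=50 restricts the filter to the top 50 predictions and s=FR_FIRST_STRAND follows the recommendation given for Illumina TruSeq data; the remaining values are the documented defaults.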
PrediTALE
PrediTALE predicts TALE target boxes using a novel model learned from quantitative data based on the RVD sequence of a TALE.
As input, PrediTALE requires a set of sequences (FastA format) that are scanned for putative TALE target boxes. These sequences could be promoters of genes but also complete genomic sequences. For computing p-values, PrediTALE additionally needs a background set of sequences, which is by default generated as a sub-sample of the original input data.
The prediction threshold may be defined either by means of a p-value or by an approximate number of expected sites. The latter is converted to a p-value internally, so the defined number of expected sites is, in general, not met exactly.
TALEs are specified by a FastA file containing their RVD sequences, where individual RVDs are separated by dashes (-). This is the same format also output by the TALE Analysis tool of AnnoTALE.
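To illustrate the expected layout, the following sketch writes a made-up TALE (not a real one) in the dash-separated RVD format and runs a rough plausibility check on it. The file name example_tales.fasta, the helper check_rvd_fasta and the pattern it uses (one or two upper-case letters or * per RVD) are assumptions for illustration and not the validation PrediTALE itself performs.

```shell
#!/bin/sh
# Write a made-up TALE in the dash-separated RVD format (illustrative only).
cat > example_tales.fasta <<'EOF'
>ExampleTALE1
NI-HD-NG-NN-HD-NI-NG-N*-NS-HD-NG
EOF

# Succeed if every non-header line is a dash-separated list of
# 1-2 character tokens made of upper-case letters or '*'.
check_rvd_fasta() {
    ! grep -v '^>' "$1" | grep -Evq '^[A-Z*]{1,2}(-[A-Z*]{1,2})*$'
}

check_rvd_fasta example_tales.fasta && echo "format looks plausible"
```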
Finally, it can be specified whether both strands or only one of the strands are scanned. In the former case, a penalty may be assigned to predictions on the reverse strand. While this penalty may be reasonable when scanning promoters, it should usually be set to 0 in case of genome-wide predictions.
If you experience problems using PrediTALE, please contact us.
PrediTALE may be called with
java -jar DerTALEv2.jar preditale
and has the following parameters
name | comment | type
s | Sequences (The sequences (e.g., a genome) to scan for binding sites, type = fa,fas,fasta) | FILE
b | Background sample (The sequences for determining the prediction threshold. Either a sub-sample of the input sequences or a dedicated background data set, range={sub-sample, background sequences}, default = sub-sample) | STRING
t | Threshold specification (The way of defining the prediction threshold. Either by explicitly defining a significance level or by specifying the number of expected sites, range={significance level, number of sites}, default = significance level) | STRING
TALEs | TALEs (The RVD sequences of the TALE, separated by dashes, in FastA format, type = fasta,fas,fa) | FILE
Strand | Strand (Prediction of target sites on both strands, or on the forward or reverse strand only, range={both strands, forward strand, reverse strand}, default = both strands) | STRING
outdir | The output directory, defaults to the current working directory (.) | STRING
Example:
java -jar DerTALEv2.jar preditale s=<Sequences> TALEs=<TALEs>