PrediTALE
PrediTALE predicts TALE target boxes from the RVD sequence of a TALE, using a novel model learned from quantitative data. A preprint describing the method behind PrediTALE and comparing its performance to other tools for TALE target prediction is available from bioRxiv (doi:). In addition to PrediTALE, we also provide DerTALE, a tool for filtering genome-wide target site predictions by mapped RNA-seq data after Xanthomonas infection. Both tools are described in more detail below.
PrediTALE and DerTALE are available as a command-line application, and have also been integrated into AnnoTALE, which provides a graphical user interface.
Download
PrediTALE is distributed in the hope that it will be useful, but without any warranty; without even the implied warranty of merchantability or fitness for a particular purpose.
PrediTALE and DerTALE are packaged in one runnable JAR that may be run from the command line with
java -jar PrediTALE.jar
which lists the available tools and usage information:

    Available tools:

        preditale - PrediTALE
        dertale - DerTALE

    Syntax: java -jar PrediTALE.jar <toolname> [<parameter=value> ...]

    Further info about the tools is given with
        java -jar PrediTALE.jar <toolname> info

    Tool parameters are listed with
        java -jar PrediTALE.jar <toolname>
Source code
PrediTALE
As input, PrediTALE requires a set of sequences in FastA format that are scanned for putative TALE target boxes. These sequences may be promoters of genes, but may also be complete genomic sequences. For computing p-values, PrediTALE additionally needs a background set of sequences, which is by default generated as a sub-sample of the original input data. The prediction threshold may be defined either by means of a p-value or by an approximate number of expected sites. The latter is converted to a p-value internally, so the specified number of expected sites is, in general, not met exactly. TALEs are specified by a FastA file containing their RVD sequences, where individual RVDs are separated by dashes (-). This is the same format that is output by the *TALE Analysis* tool of AnnoTALE. Finally, it can be specified whether both strands or only one strand is scanned; when both strands are scanned, a penalty may be assigned to predictions on the reverse strand. While this penalty may be reasonable when scanning promoters, it should usually be set to ``0`` for genome-wide predictions.
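The TALE input format described above (FastA records whose sequence lines hold dash-separated RVDs) can be checked before running a prediction. The following Python sketch is only an illustration and is not part of PrediTALE: the function name `parse_rvd_fasta`, the TALE names, and the RVD sequences are hypothetical, and the sketch assumes that a record may span several lines.

```python
from typing import Dict, List


def parse_rvd_fasta(text: str) -> Dict[str, List[str]]:
    """Parse FastA-formatted text into a mapping {TALE name: list of RVDs}.

    Hypothetical helper for illustration; not part of PrediTALE.
    """
    tales: Dict[str, List[str]] = {}
    name = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip empty lines
        if line.startswith(">"):
            # Header line: everything after '>' is taken as the TALE name
            name = line[1:].strip()
            tales[name] = []
        elif name is not None:
            # Sequence line: RVDs are separated by dashes; empty fields
            # (e.g. from a trailing dash) are dropped
            tales[name].extend(rvd for rvd in line.split("-") if rvd)
    return tales


# Made-up example input with two hypothetical TALEs
example = """>HypotheticalTALE1
NI-HD-NG-NN-HD-NI
>HypotheticalTALE2
HD-NG-NS-N*-NI
"""

for tale, rvds in parse_rvd_fasta(example).items():
    print(tale, len(rvds), "-".join(rvds))
```

Listing the number of RVDs per TALE in this way makes it easy to spot records that were accidentally written without dashes, since those would show up as a single, overly long RVD.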