Projects
This site contains projects that use Jstacs.
MotifAdjuster
by Jens Keilwagen, Jan Baumbach, Thomas Kohl and Ivo Grosse.
Description
Valuable binding site annotation data are stored in databases. However, several types of errors can, and do, occur in the process of manually incorporating annotation data from scientific literature into these databases. Here, we introduce MotifAdjuster, a software that helps to detect these errors, and we demonstrate its efficacy on public data sets.
Paper
The paper MotifAdjuster: A tool for computational reassessment of transcription factor binding site annotations has been submitted to Genome Biology.
Download
MotifAdjuster can be downloaded here.
Start instructions
Once you have unzipped the archive, you can start MotifAdjuster by invoking
java -cp ./:./jstacs-1.2.2.jar:./numericalMethods.jar MotifAdjuster <file> <ignoreChar> <length> <fgOrder> <fgEss> <bothStrands> <output> <sigma> <p(no motif)>
In Windows, you have to use ";" instead of ":" in the class path.
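The platform-dependent separator can also be handled programmatically. The following is a minimal sketch, not part of MotifAdjuster: the class name `MotifAdjusterLauncher` and its methods are hypothetical, and only the class path entries and the main class name are taken from the command above. It relies on the standard `File.pathSeparator`, which is ";" on Windows and ":" on Unix-like systems.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not shipped with MotifAdjuster) that assembles the
// start command with the class path separator of the current platform.
final class MotifAdjusterLauncher {
    // Class path entries copied from the start command above.
    static final String[] CLASS_PATH_ENTRIES = {
        "./", "./jstacs-1.2.2.jar", "./numericalMethods.jar"
    };

    // Joins the entries with the given separator (":" or ";").
    static String classPath(String separator) {
        return String.join(separator, CLASS_PATH_ENTRIES);
    }

    // Builds the full command; toolArgs are the nine MotifAdjuster arguments.
    static List<String> command(String separator, String... toolArgs) {
        List<String> cmd = new ArrayList<>(
            Arrays.asList("java", "-cp", classPath(separator), "MotifAdjuster"));
        cmd.addAll(Arrays.asList(toolArgs));
        return cmd;
    }

    public static void main(String[] args) {
        // File.pathSeparator is ";" on Windows and ":" elsewhere.
        System.out.println(String.join(" ", command(File.pathSeparator, args)));
    }
}
```

The resulting list could be passed to a `ProcessBuilder` if MotifAdjuster is to be started from another Java program.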
The arguments have the following meaning:
name | comment | type |
file | the location of the data set | String |
ignoreChar | the character that marks comment lines (e.g., '>' in a FASTA file) | char |
length | the motif length | int |
fgOrder | the order of the inhomogeneous Markov model that is used for the motif; 0 yields a PWM | byte |
fgEss | the equivalent sample size that is used for the mixture model | double >= 0 |
bothStrands | use both strands | boolean |
output | output of the EM | boolean |
sigma | the sigma of the truncated discrete Gaussian distribution | double > 0 |
p(no motif) | the probability for finding no motif | 0 <= double < 1 |
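The type column above implies a few checks that can be made before starting a run. The sketch below is an illustration only and is not how MotifAdjuster itself parses its arguments: the class `MotifAdjusterArgs` is hypothetical, it assumes the nine arguments arrive in the order of the start command, and it assumes booleans are written as `true` or `false`. Only the constraints stated in the table are enforced.

```java
// Hypothetical value class (not part of MotifAdjuster) that parses the nine
// positional arguments and enforces the constraints given in the table.
final class MotifAdjusterArgs {
    final String file;
    final char ignoreChar;
    final int length;
    final byte fgOrder;
    final double fgEss;
    final boolean bothStrands;
    final boolean output;
    final double sigma;
    final double pNoMotif;

    private MotifAdjusterArgs(String[] a) {
        file = a[0];
        if (a[1].length() != 1) {
            throw new IllegalArgumentException("ignoreChar must be a single character");
        }
        ignoreChar = a[1].charAt(0);
        // NumberFormatException is a subclass of IllegalArgumentException.
        length = Integer.parseInt(a[2]);
        fgOrder = Byte.parseByte(a[3]);
        fgEss = Double.parseDouble(a[4]);
        bothStrands = parseBoolean("bothStrands", a[5]);
        output = parseBoolean("output", a[6]);
        sigma = Double.parseDouble(a[7]);
        pNoMotif = Double.parseDouble(a[8]);
        // Written as negated conditions so that NaN is rejected as well.
        if (!(fgEss >= 0)) throw new IllegalArgumentException("fgEss must be >= 0");
        if (!(sigma > 0)) throw new IllegalArgumentException("sigma must be > 0");
        if (!(pNoMotif >= 0 && pNoMotif < 1)) {
            throw new IllegalArgumentException("p(no motif) must be in [0, 1)");
        }
    }

    // Assumption: booleans are spelled "true" or "false".
    private static boolean parseBoolean(String name, String s) {
        if (s.equalsIgnoreCase("true")) return true;
        if (s.equalsIgnoreCase("false")) return false;
        throw new IllegalArgumentException(name + " must be true or false, got " + s);
    }

    static MotifAdjusterArgs parse(String... a) {
        if (a.length != 9) {
            throw new IllegalArgumentException("expected 9 arguments, got " + a.length);
        }
        return new MotifAdjusterArgs(a);
    }
}
```

For example, `MotifAdjusterArgs.parse("data.fa", ">", "10", "0", "4.0", "true", "false", "1.0", "0.2")` is accepted (all values here are made up for illustration), whereas the same call with a sigma of `0` throws an `IllegalArgumentException`.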
DiPoMM
by Jens Keilwagen, Jan Grau, Stefan Posch, Marc Strickert and Ivo Grosse.
Description
Transcription factors are one main component of gene regulation, as they activate or repress gene expression by binding to their binding sites. The de-novo discovery of transcription factor binding sites in the promoters of target genes is a challenging problem in bioinformatics, which has not yet been solved satisfactorily. We present DiPoMM, a discriminative de-novo motif discovery tool that models existing positional preferences of binding sites and adjusts the length of the motif in the learning process.
Paper
The paper DiPoMM: Discriminative de-novo motif discovery utilizing positional preference has been submitted to ISMB 2009.
Download
DiPoMM can be downloaded here.
Start instructions
Once you have unzipped the archive, you can start DiPoMM, e.g., by invoking
java -cp .:jstacs-1.2.2.jar:lib/numericalMethods.jar:lib/bytecode.jar:lib/biojava-live.jar projects.DiPoMM home=path/to/data/directory/ fg=fgfile.txt bg=bgfile.txt init=best-random=100 p-val=1E-4
to search for motifs that are over-represented in path/to/data/directory/fgfile.txt but not in path/to/data/directory/bgfile.txt, initialize DiPoMM with the best of 100 randomly drawn starting values, and search for motif occurrences with a p-value less than 1E-4.
Under Windows, you must use ";" instead of ":" in the class path.
The arguments have the following meaning:
name | comment | type |
home | the path to the data directory, default = ./ | String |
ignore | the char that is used to mask comment lines in data files, e.g., '>' in a FASTA-file, default = > | Character |
fg | the file name of the foreground data file (the file containing sequences which are expected to contain binding sites of a common motif) | String |
bg | the file name of the background data file | String |
length | the motif length that is used at the beginning, valid range = [1, 50], default = 15 | Integer |
flankOrder | the Markov order of the model for the flanking sequence and the background sequence, valid range = [0, 5], default = 0 | Integer |
motifOrder | the Markov order of the motif model, valid range = [0, 3], default = 0 | Integer |
bothStrands | a switch whether to use both strands or not, default = true | Boolean |
init | the method that is used for initialization, one of 'best-random=<number>', 'enum=<length>', and 'specific=<sequence or file of sequence>' | String=[Integer | String] |
xml | the file name of the xml file the classifier is written to, default = ./classifier.xml | String |
adjust | a switch whether to adjust the motif length, i.e., either to shrink or expand, default = true | Boolean |
p-val | a p-value for predicting binding sites, valid range = [0.0, 1.0], OPTIONAL | Double |
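DiPoMM takes its arguments as name=value pairs, and a value may itself contain "=", as in `init=best-random=100`. The following sketch shows one way such pairs could be read, with the defaults and valid ranges from the table applied. It is an assumption-laden illustration rather than DiPoMM's own parser: the class `DiPoMMArgs` and its methods are hypothetical, values are kept as strings, and unknown names are accepted without complaint.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not part of DiPoMM) that reads name=value arguments,
// fills in the defaults from the table, and checks the stated ranges.
final class DiPoMMArgs {
    static Map<String, String> parse(String... args) {
        Map<String, String> values = new LinkedHashMap<>();
        // Defaults as listed in the table; fg, bg and init have none stated.
        values.put("home", "./");
        values.put("ignore", ">");
        values.put("length", "15");
        values.put("flankOrder", "0");
        values.put("motifOrder", "0");
        values.put("bothStrands", "true");
        values.put("xml", "./classifier.xml");
        values.put("adjust", "true");

        for (String arg : args) {
            // Split at the FIRST '=' only, so init=best-random=100 keeps
            // "best-random=100" as its value.
            int eq = arg.indexOf('=');
            if (eq <= 0) {
                throw new IllegalArgumentException("expected name=value, got: " + arg);
            }
            values.put(arg.substring(0, eq), arg.substring(eq + 1));
        }

        requireIntInRange(values, "length", 1, 50);
        requireIntInRange(values, "flankOrder", 0, 5);
        requireIntInRange(values, "motifOrder", 0, 3);
        if (values.containsKey("p-val")) { // p-val is optional
            double p = Double.parseDouble(values.get("p-val"));
            if (!(p >= 0.0 && p <= 1.0)) {
                throw new IllegalArgumentException("p-val must be in [0.0, 1.0], got " + p);
            }
        }
        return values;
    }

    private static void requireIntInRange(Map<String, String> values,
                                          String name, int min, int max) {
        int v = Integer.parseInt(values.get(name));
        if (v < min || v > max) {
            throw new IllegalArgumentException(
                name + " must be in [" + min + ", " + max + "], got " + v);
        }
    }

    public static void main(String[] args) {
        System.out.println(parse(args));
    }
}
```

Called with the arguments of the example command above, the sketch keeps `best-random=100` as the value of `init` and falls back to the default `length` of 15, while an argument such as `length=51` is rejected.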