About prediction models
Below are AUC scores for our Calpacchopper (Bayesian model) and GPS (*), evaluated on a set of 20-mer sequences comprising 210 curated cleaved sequences from the literature that were not used to train either Calpacchopper or GPS. Their reversed sequences were used as negative samples (420 sequences in total):
(*) GPS-CCD is an online predictor for calpain cleavage sites: http://ccd.biocuckoo.org/
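The evaluation described above (cleaved 20-mers as positives, their reversals as negatives, AUC as the metric) can be sketched roughly as follows. This is only an illustration under assumptions: `build_eval_set`, `auc`, and `toy_score` are hypothetical names, the two sequences are made up, and `toy_score` is a stand-in scorer, not Calpacchopper or GPS.

```python
from __future__ import annotations

from typing import Sequence


def build_eval_set(positives: Sequence[str]) -> tuple[list[str], list[int]]:
    """Pair each cleaved sequence with its reversal as a negative sample."""
    seqs = list(positives) + [s[::-1] for s in positives]
    labels = [1] * len(positives) + [0] * len(positives)
    return seqs, labels


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the fraction of (positive, negative) pairs in which the
    positive scores higher; a tie counts as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def toy_score(seq: str) -> float:
    """Hypothetical stand-in scorer (assumption): counts 'L' residues in
    the first half of the sequence, so a sequence and its reversal can
    receive different scores."""
    return float(seq[: len(seq) // 2].count("L"))


# Made-up 20-mers for illustration only; the real set has 210 positives.
positives = ["LLLLLAAAAAAAAAAAAAAA", "ALLLAAAAAAGGGGGGGGGG"]
seqs, labels = build_eval_set(positives)
scores = [toy_score(s) for s in seqs]
print(f"AUC = {auc(scores, labels):.3f}")
```

With 210 positives, `build_eval_set` would return the 420 sequences mentioned above. The pairwise AUC is quadratic in the number of samples, which is negligible at this size.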
Below are AUC scores for each model, evaluated using 10x10 cross-validation on a curated dataset of 90 substrate sequences (220 cleavage sites). Please note that these values are not comparable to the AUC values above, because the dataset and the evaluation protocol differ:
Although the MKL predictor tends to produce the best results, it takes considerably longer to run, because it must first predict the secondary structure of the input sequence.
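For readers unfamiliar with the 10x10 cross-validation mentioned above, here is a minimal sketch of how such splits could be generated. It assumes that "10x10" means 10 repetitions of 10-fold cross-validation and that the units being split are the 90 substrate sequences; the source does not state either point explicitly. `repeated_kfold` is a hypothetical helper, not part of any of the predictors.

```python
from __future__ import annotations

import random
from typing import Iterator


def repeated_kfold(
    n_items: int, k: int = 10, repeats: int = 10, seed: int = 0
) -> Iterator[tuple[int, list[int], list[int]]]:
    """Yield (repetition, train_indices, test_indices) for repeated k-fold CV."""
    rng = random.Random(seed)
    for rep in range(repeats):
        idx = list(range(n_items))
        rng.shuffle(idx)  # a fresh random partition for every repetition
        folds = [idx[i::k] for i in range(k)]  # k near-equal, disjoint folds
        for i, test in enumerate(folds):
            train = [j for f, fold in enumerate(folds) if f != i for j in fold]
            yield rep, train, test


# 90 substrates (assumed split unit), 10 repetitions of 10 folds.
splits = list(repeated_kfold(90))
print(len(splits), "train/test splits")
```

Each split would be used to train a model on `train` and compute an AUC on `test`, and the reported score would typically be the average over all splits. Splitting by substrate rather than by individual cleavage site keeps sites from the same protein out of both sides of a split, but that choice is an assumption here.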