The CASMAP package (Combinatorial ASsociation MAPping) combines three different methods and publicly available tools into one software tool.
Here you can find links to all our publications on the topic of significant pattern mining.
The package was tailored to process genomic data, but it also accepts custom input files for other, more generic analyses. A key feature of this package is its ability to correct for known confounders in the data, e.g. population stratification in GWAS.
Interval search: also known as “Region-based genome-wide association studies”, considers sets of features that are adjacent to each other. In the case of SNPs, it considers genomic regions delimited by a start SNP and an end SNP.
Combinatorial search: also known as “High-order Epistasis” (the FACS tool), considers combinations of features of any order, whether or not the features are adjacent.
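To make the difference between the two search modes concrete, here is a minimal sketch in plain Python. It does not use the CASMAP API; the function names `contiguous_intervals` and `feature_combinations` are hypothetical and only illustrate which candidate feature sets each mode would consider, assuming features are indexed 0 to n-1 in genomic order.

```python
from itertools import combinations


def contiguous_intervals(n_features):
    """Yield inclusive (start, end) index pairs for every run of adjacent features."""
    for start in range(n_features):
        for end in range(start, n_features):
            yield (start, end)


def feature_combinations(n_features, max_order=None):
    """Yield every non-empty set of feature indices, up to max_order features per set."""
    if max_order is None:
        max_order = n_features
    for order in range(1, max_order + 1):
        for combo in combinations(range(n_features), order):
            yield combo


n = 4  # toy example with four features (e.g. four SNPs)
intervals = list(contiguous_intervals(n))
combos = list(feature_combinations(n))

# Interval search: n * (n + 1) / 2 candidates.
print(len(intervals))
# Combinatorial search: 2**n - 1 candidates.
print(len(combos))
# (0, 2) as a combination skips feature 1, so it is not a contiguous region.
print((0, 2) in combos)
```

The candidate count grows quadratically for intervals but exponentially for arbitrary combinations, which is why the combinatorial search is the more demanding of the two.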
The CASMAP package is available for R (ver. 3.4+) and Python (ver. 2.7+ or 3).
The package is available on CRAN.
Source code and step-by-step installation instructions can be found in our GitHub repository.
In case of problems with the installation steps, we have created a virtual machine (VM) running Ubuntu 14.04. The VM has the packages we will use in the hands-on sessions pre-installed. See more details here.
Today we will discuss three different examples of how to use CASMAP.
Example 1: Combinatorial motif binding
Example 2: Region-based GWAS in Arabidopsis thaliana
Example 3: Higher-order epistasis in Arabidopsis thaliana