1. Package overview

The CASMAP package (Combinatorial ASsociation MAPping) combines three different methods and publicly available tools into one software tool.


1.1 Repository of methods and publications

Here you can find links to all our publications on the topic of significant pattern mining.


1.2 Functionality

The package was tailored to process genomic data, but it also accepts custom input files for other, more generic analyses. A key feature of this package is its ability to correct for known confounders in the data, e.g. population stratification in GWAS.

  • Interval search: also called “Region-based genome-wide association studies”, considers sets of features that are adjacent to each other. In the case of SNPs, it considers genomic regions delimited by a start SNP and an end SNP.

  • Combinatorial search: also called “High-order Epistasis” (the FACS tool), considers combinations of features of any order.
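
The difference between the two search spaces can be illustrated with a small sketch. It does not use the CASMAP API; the function names `contiguous_intervals` and `feature_combinations` are hypothetical helpers introduced here only to enumerate the candidate feature sets that each mode would, in principle, have to consider for a handful of features.

```python
from itertools import combinations


def contiguous_intervals(n_features):
    """Return every (start, end) index pair (inclusive) of adjacent features."""
    return [(start, end)
            for start in range(n_features)
            for end in range(start, n_features)]


def feature_combinations(n_features):
    """Return every non-empty subset of feature indices, of any order."""
    indices = range(n_features)
    return [subset
            for order in range(1, n_features + 1)
            for subset in combinations(indices, order)]


n = 4  # toy example with four features (e.g. four SNPs)
intervals = contiguous_intervals(n)
subsets = feature_combinations(n)

print(len(intervals))  # n * (n + 1) / 2 = 10 candidate regions
print(len(subsets))    # 2**n - 1 = 15 candidate combinations
```

The number of intervals grows quadratically with the number of features, while the number of combinations grows exponentially, which is why exhaustive enumeration as in this sketch is only feasible for toy inputs.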


Region-based GWAS


Note: Figure extracted from Felipe’s presentation (Module I)


High-order epistasis


Note: Figure extracted from Felipe’s presentation (Module I)


2. Package installation

The CASMAP package is available for R (ver. 3.4+) and Python (ver. 2.7+ or 3).


2.1 For R


2.2 For Python


3. Virtual machine

If you run into problems with the installation steps, you can use a virtual machine (VM) running Ubuntu 14.04 that we have prepared. The VM has the packages we will use in the hands-on sessions pre-installed. See more details here.


4. Program for the rest of the session

Today we will discuss three different examples of how to use CASMAP.

  1. Example 1: Combinatorial motif binding

  2. Example 2: Region-based GWAS in Arabidopsis thaliana

  3. Example 3: Higher-order epistasis in Arabidopsis thaliana