Chapter 3 Universal enrichment analysis

clusterProfiler supports both hypergeometric tests and gene set enrichment analysis for many ontologies and pathways, but this is still not enough: users may want to analyze data from unsupported organisms, slim versions of GO, novel functional annotations (e.g. GO via Blast2GO or KEGG via KAAS), unsupported ontologies/pathways, or customized annotations.

clusterProfiler provides the enricher function for the hypergeometric test and the GSEA function for gene set enrichment analysis; both are designed to accept user-defined annotation. They accept two additional parameters, TERM2GENE and TERM2NAME. As the parameter names indicate, TERM2GENE is a data.frame whose first column is the term ID and whose second column is the corresponding mapped gene, and TERM2NAME is a data.frame whose first column is the term ID and whose second column is the corresponding term name. TERM2NAME is optional.
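To make the expected shape of these two tables concrete, here is a minimal base R sketch. The term IDs, term names, and gene IDs are invented for illustration and do not come from any real annotation; the point is only the column order described above.

```r
# Hypothetical annotation: each term ID maps to a character vector of gene IDs.
term_genes <- list(
  TERM_A = c("g1", "g2", "g3"),
  TERM_B = c("g2", "g4")
)

# TERM2GENE: column 1 = term ID, column 2 = gene ID (one row per term-gene pair).
term2gene <- data.frame(
  term = rep(names(term_genes), lengths(term_genes)),
  gene = unlist(term_genes, use.names = FALSE),
  stringsAsFactors = FALSE
)

# TERM2NAME (optional): column 1 = term ID, column 2 = readable term name.
term2name <- data.frame(
  term = c("TERM_A", "TERM_B"),
  name = c("Made-up pathway A", "Made-up pathway B"),
  stringsAsFactors = FALSE
)

print(term2gene)
print(term2name)
```

Tables built this way would be passed as the TERM2GENE and TERM2NAME arguments of enricher or GSEA.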

3.1 Input data

For over-representation analysis, all we need is a gene vector, that is, a vector of gene IDs. These gene IDs can be obtained from differential expression analysis (e.g. with the DESeq2 package).

For gene set enrichment analysis, we need a ranked list of genes. DOSE provides an example dataset, geneList, derived from the R package breastCancerMAINZ, which contains 200 samples: 29 in grade I, 136 in grade II, and 35 in grade III. We computed the ratios of the geometric means of grade III samples to the geometric means of grade I samples. The base-2 logarithms of these ratios are stored in the geneList dataset.

The geneList has three features:

  1. numeric vector: fold change or another type of numeric variable
  2. named vector: every number is named by the corresponding gene ID
  3. sorted vector: the numbers should be sorted in decreasing order

Suppose you are importing your own data from a csv file that contains two columns, one for gene ID (no duplicates allowed) and another for fold change. You can prepare your own geneList with the following commands:

d <- read.csv(your_csv_file)
## assume that 1st column is ID
## 2nd column is fold change

## feature 1: numeric vector
geneList <- d[,2]

## feature 2: named vector
names(geneList) <- as.character(d[,1])

## feature 3: decreasing order
geneList <- sort(geneList, decreasing = TRUE)
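The three features can also be checked programmatically before running GSEA. The sketch below uses a hypothetical helper, check_gene_list, which is not part of clusterProfiler or DOSE, and a small made-up data frame standing in for the csv file. Rejecting missing values is an extra assumption of this sketch, not a documented requirement.

```r
# Hypothetical helper (not a clusterProfiler/DOSE function): stops with a
# message if one of the three features described above is violated.
check_gene_list <- function(x) {
  if (!is.numeric(x)) stop("geneList must be a numeric vector")
  if (is.null(names(x)) || anyNA(names(x))) {
    stop("every value must be named by a gene ID")
  }
  if (anyDuplicated(names(x)) > 0) stop("gene IDs must not be duplicated")
  if (anyNA(x)) stop("missing values are not allowed (assumption of this sketch)")
  # x is decreasing exactly when its reverse is non-decreasing
  if (is.unsorted(rev(x))) stop("values must be sorted in decreasing order")
  invisible(TRUE)
}

# Made-up stand-in for the two-column csv file.
toy_d <- data.frame(id = c("101", "102", "103", "104"),
                    log2fc = c(-1.2, 3.4, 0.5, -2.8),
                    stringsAsFactors = FALSE)

toy_list <- toy_d[, 2]                       # feature 1: numeric vector
names(toy_list) <- as.character(toy_d[, 1])  # feature 2: named vector
toy_list <- sort(toy_list, decreasing = TRUE)  # feature 3: decreasing order

check_gene_list(toy_list)
print(toy_list)
```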

We can load the sample data into R via:

data(geneList, package="DOSE")
head(geneList)
##     4312     8318    10874    55143    55388      991 
## 4.572613 4.514594 4.418218 4.144075 3.876258 3.677857

Suppose we define genes with an absolute log2 fold change greater than 2 as DEGs:

gene <- names(geneList)[abs(geneList) > 2]
head(gene)
## [1] "4312"  "8318"  "10874" "55143" "55388" "991"
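Because geneList stores base-2 logarithms of ratios, a cutoff of 2 on the absolute value corresponds to at least a 4-fold change in either direction. The sketch below uses made-up values and gene names to show the same selection, plus an optional split by direction that is not part of the original workflow.

```r
# Made-up log2 ratios, named by made-up gene IDs and sorted decreasingly.
toy <- c(g1 = 3.1, g2 = 2.2, g3 = 0.4, g4 = -1.7, g5 = -2.6)

cutoff <- 2  # applied on the log2 scale, i.e. a 2^2 = 4-fold change

deg      <- names(toy)[abs(toy) > cutoff]  # both directions, as in the text
deg_up   <- names(toy)[toy >  cutoff]      # positive log2 ratios only
deg_down <- names(toy)[toy < -cutoff]      # negative log2 ratios only

print(deg)
print(deg_up)
print(deg_down)
```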

3.2 WikiPathways analysis

WikiPathways is a continuously updated pathway database curated by a community of researchers and pathway enthusiasts. WikiPathways produces monthly releases of gmt files for supported organisms at data.wikipathways.org. Download the appropriate gmt file and then generate TERM2GENE and TERM2NAME for use with the enricher and GSEA functions.

library(magrittr)
library(clusterProfiler)

data(geneList, package="DOSE")
gene <- names(geneList)[abs(geneList) > 2]

wpgmtfile <- system.file("extdata/wikipathways-20180810-gmt-Homo_sapiens.gmt", package="clusterProfiler")
wp2gene <- read.gmt(wpgmtfile)
wp2gene <- wp2gene %>% tidyr::separate(ont, c("name","version","wpid","org"), "%")
wpid2gene <- wp2gene %>% dplyr::select(wpid, gene) #TERM2GENE
wpid2name <- wp2gene %>% dplyr::select(wpid, name) #TERM2NAME

ewp <- enricher(gene, TERM2GENE = wpid2gene, TERM2NAME = wpid2name)
head(ewp)
##            ID
## WP2446 WP2446
## WP2361 WP2361
## WP179   WP179
## WP3942 WP3942
## WP4240 WP4240
## WP2328 WP2328
##                                                                           Description
## WP2446                                                  Retinoblastoma (RB) in Cancer
## WP2361                                                       Gastric Cancer Network 1
## WP179                                                                      Cell Cycle
## WP3942                                                         PPAR signaling pathway
## WP4240 Regulation of sister chromatid separation at the metaphase-anaphase transition
## WP2328                                                            Allograft Rejection
##        GeneRatio  BgRatio       pvalue     p.adjust
## WP2446     11/95  88/6249 6.801697e-08 1.054263e-05
## WP2361      6/95  29/6249 3.772735e-06 2.923870e-04
## WP179      10/95 122/6249 1.384549e-05 7.153503e-04
## WP3942      7/95  67/6249 6.210513e-05 2.406574e-03
## WP4240      4/95  16/6249 7.931988e-05 2.458916e-03
## WP2328      7/95  90/6249 4.016758e-04 1.037663e-02
##              qvalue
## WP2446 9.450779e-06
## WP2361 2.621058e-04
## WP179  6.412648e-04
## WP3942 2.157336e-03
## WP4240 2.204258e-03
## WP2328 9.301967e-03
##                                                       geneID
## WP2446 8318/9133/7153/6241/890/983/81620/7272/1111/891/24137
## WP2361                       4605/7153/11065/22974/6286/6790
## WP179          8318/991/9133/890/983/7272/1111/891/4174/9232
## WP3942                    4312/9415/9370/5105/2167/3158/5346
## WP4240                                    991/1062/4085/9232
## WP2328                   10563/6373/4283/3002/10578/3117/730
##        Count
## WP2446    11
## WP2361     6
## WP179     10
## WP3942     7
## WP4240     4
## WP2328     7
ewp2 <- GSEA(geneList, TERM2GENE = wpid2gene, TERM2NAME = wpid2name, verbose=FALSE)
head(ewp2)
##            ID
## WP3932 WP3932
## WP306   WP306
## WP236   WP236
## WP474   WP474
## WP2911 WP2911
## WP3664 WP3664
##                                                              Description
## WP3932                    Focal Adhesion-PI3K-Akt-mTOR-signaling pathway
## WP306                                                     Focal Adhesion
## WP236                                                       Adipogenesis
## WP474                                          Endochondral Ossification
## WP2911                       miRNA targets in ECM and membrane receptors
## WP3664 Regulation of Wnt/B-catenin Signaling by Small Molecule Compounds
##        setSize enrichmentScore       NES      pvalue
## WP3932     281      -0.3903410 -1.682629 0.001335113
## WP306      188      -0.4230017 -1.740632 0.001440922
## WP236      125      -0.4387536 -1.721863 0.001536098
## WP474       59      -0.5111776 -1.770901 0.001675042
## WP2911      21      -0.6953475 -1.967391 0.001748252
## WP3664      16      -0.6768705 -1.762540 0.001862197
##         p.adjust    qvalues rank
## WP3932 0.0291184 0.02362325 1994
## WP306  0.0291184 0.02362325 2221
## WP236  0.0291184 0.02362325 2301
## WP474  0.0291184 0.02362325 2142
## WP2911 0.0291184 0.02362325 2629
## WP3664 0.0291184 0.02362325 2739
##                          leading_edge
## WP3932 tags=26%, list=16%, signal=22%
## WP306  tags=28%, list=18%, signal=23%
## WP236  tags=34%, list=18%, signal=28%
## WP474  tags=47%, list=17%, signal=40%
## WP2911 tags=76%, list=21%, signal=60%
## WP3664 tags=62%, list=22%, signal=49%
##                                                                                                                                                                                                                                                                                                                                                                         core_enrichment
## WP3932 7059/51719/8660/5563/5295/6794/1288/7010/3910/3371/3082/3791/1301/1027/90993/3643/1129/1975/7450/3685/2034/1942/2149/1280/4804/3675/2261/7248/2246/4803/2259/3912/1902/2308/1278/1277/81617/2846/2057/2247/1281/50509/1290/55970/5618/7058/10161/56034/3693/4254/6720/3480/5159/3991/1289/1292/3908/2690/3909/8817/3551/2791/63923/3913/3667/3679/7060/3479/80310/1311/1101/3169
## WP306                                                                                                              5595/5228/7424/1499/4636/83660/7059/5295/1288/23396/3910/3371/3082/394/3791/7450/596/3685/1280/3675/595/2318/3912/1793/1278/5753/1277/10398/55742/50509/1290/2317/7058/25759/56034/3693/3480/5159/857/1292/3908/3909/63923/3913/3679/7060/3479/10451/80310/1311/1101
## WP236                                                                                                                                                                       8321/5925/8609/1499/2908/4088/8660/3399/6778/8648/6258/2662/5914/6776/2034/196/8204/4692/4208/2308/5468/6695/4023/1592/7350/81029/3952/1675/5618/6720/3991/2487/3667/3572/3479/6424/9370/5105/652/5346/2625
## WP474                                                                                                                                                                                                                                          4921/51719/860/5745/1028/1280/2261/4256/7042/4208/4322/8100/7078/2247/11096/85477/3480/2690/8817/5167/2737/5744/2487/5327/1300/1473/3479
## WP2911                                                                                                                                                                                                                                                                                                      3672/7057/3915/3910/1291/1278/1293/1281/50509/1290/7058/3693/1289/1292/3913
## WP3664                                                                                                                                                                                                                                                                                                                                     8658/8325/8321/1499/324/27122/6925/4035/6424
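The GeneRatio and BgRatio columns of the enricher output can be connected back to the hypergeometric test with base R. The sketch below takes the counts printed for WP2446 above (11/95 and 88/6249) and computes an upper-tail hypergeometric probability. That enricher derives its pvalue column in exactly this way is an assumption here; the result is meant to be compared with the reported value, not taken as a definition.

```r
k <- 11    # GeneRatio numerator: input genes annotated to the term
n <- 95    # GeneRatio denominator
M <- 88    # BgRatio numerator: background genes annotated to the term
N <- 6249  # BgRatio denominator

# P(X >= k) when drawing n genes without replacement from N genes,
# M of which belong to the term.
p <- phyper(k - 1, M, N - M, n, lower.tail = FALSE)
print(p)

# The same tail probability, summed term by term from the point masses.
p_sum <- sum(dhyper(k:min(n, M), M, N - M, n))
print(p_sum)
```

Subtracting 1 from k is what turns the strict upper tail P(X > k - 1) returned by phyper into P(X >= k).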

You may want to convert the gene IDs to gene symbols, which can be done with the setReadable function.

library(org.Hs.eg.db)
ewp <- setReadable(ewp, org.Hs.eg.db, keyType = "ENTREZID")
ewp2 <- setReadable(ewp2, org.Hs.eg.db, keyType = "ENTREZID")
head(ewp)
##            ID
## WP2446 WP2446
## WP2361 WP2361
## WP179   WP179
## WP3942 WP3942
## WP4240 WP4240
## WP2328 WP2328
##                                                                           Description
## WP2446                                                  Retinoblastoma (RB) in Cancer
## WP2361                                                       Gastric Cancer Network 1
## WP179                                                                      Cell Cycle
## WP3942                                                         PPAR signaling pathway
## WP4240 Regulation of sister chromatid separation at the metaphase-anaphase transition
## WP2328                                                            Allograft Rejection
##        GeneRatio  BgRatio       pvalue     p.adjust
## WP2446     11/95  88/6249 6.801697e-08 1.054263e-05
## WP2361      6/95  29/6249 3.772735e-06 2.923870e-04
## WP179      10/95 122/6249 1.384549e-05 7.153503e-04
## WP3942      7/95  67/6249 6.210513e-05 2.406574e-03
## WP4240      4/95  16/6249 7.931988e-05 2.458916e-03
## WP2328      7/95  90/6249 4.016758e-04 1.037663e-02
##              qvalue
## WP2446 9.450779e-06
## WP2361 2.621058e-04
## WP179  6.412648e-04
## WP3942 2.157336e-03
## WP4240 2.204258e-03
## WP2328 9.301967e-03
##                                                              geneID
## WP2446 CDC45/CCNB2/TOP2A/RRM2/CCNA2/CDK1/CDT1/TTK/CHEK1/CCNB1/KIF4A
## WP2361                           MYBL2/TOP2A/UBE2C/TPX2/S100P/AURKA
## WP179       CDC45/CDC20/CCNB2/CCNA2/CDK1/TTK/CHEK1/CCNB1/MCM5/PTTG1
## WP3942                    MMP1/FADS2/ADIPOQ/PCK1/FABP4/HMGCS2/PLIN1
## WP4240                                     CDC20/CENPE/MAD2L1/PTTG1
## WP2328                    CXCL13/CXCL11/CXCL9/GZMB/GNLY/HLA-DQA1/C7
##        Count
## WP2446    11
## WP2361     6
## WP179     10
## WP3942     7
## WP4240     4
## WP2328     7
head(ewp2)
##            ID
## WP3932 WP3932
## WP306   WP306
## WP236   WP236
## WP474   WP474
## WP2911 WP2911
## WP3664 WP3664
##                                                              Description
## WP3932                    Focal Adhesion-PI3K-Akt-mTOR-signaling pathway
## WP306                                                     Focal Adhesion
## WP236                                                       Adipogenesis
## WP474                                          Endochondral Ossification
## WP2911                       miRNA targets in ECM and membrane receptors
## WP3664 Regulation of Wnt/B-catenin Signaling by Small Molecule Compounds
##        setSize enrichmentScore       NES      pvalue
## WP3932     281      -0.3903410 -1.682629 0.001335113
## WP306      188      -0.4230017 -1.740632 0.001440922
## WP236      125      -0.4387536 -1.721863 0.001536098
## WP474       59      -0.5111776 -1.770901 0.001675042
## WP2911      21      -0.6953475 -1.967391 0.001748252
## WP3664      16      -0.6768705 -1.762540 0.001862197
##         p.adjust    qvalues rank
## WP3932 0.0291184 0.02362325 1994
## WP306  0.0291184 0.02362325 2221
## WP236  0.0291184 0.02362325 2301
## WP474  0.0291184 0.02362325 2142
## WP2911 0.0291184 0.02362325 2629
## WP3664 0.0291184 0.02362325 2739
##                          leading_edge
## WP3932 tags=26%, list=16%, signal=22%
## WP306  tags=28%, list=18%, signal=23%
## WP236  tags=34%, list=18%, signal=28%
## WP474  tags=47%, list=17%, signal=40%
## WP2911 tags=76%, list=21%, signal=60%
## WP3664 tags=62%, list=22%, signal=49%
##                                                                                                                                                                                                                                                                                                                                                                                                                            core_enrichment
## WP3932 THBS3/CAB39/IRS2/PRKAA2/PIK3R1/STK11/COL4A6/TEK/LAMA4/TNC/HGF/KDR/COL11A1/CDKN1B/CREB3L1/INSR/CHRM2/EIF4B/VWF/ITGAV/EPAS1/EFNA1/F2R/COL2A1/NGFR/ITGA3/FGFR3/TSC1/FGF1/NGF/FGF14/LAMB1/LPAR1/FOXO1/COL1A2/COL1A1/CAB39L/LPAR4/EPOR/FGF2/COL3A1/COL5A3/COL5A2/GNG12/PRLR/THBS2/LPAR6/PDGFC/ITGB5/KITLG/SREBF1/IGF1R/PDGFRB/LIPE/COL5A1/COL6A2/LAMA2/GHR/LAMA3/FGF18/IKBKB/GNG11/TNN/LAMB2/IRS1/ITGA7/THBS4/IGF1/PDGFD/COMP/CHAD/FOXA1
## WP306                                                                                                                               MAPK3/PGF/VEGFC/CTNNB1/MYL5/TLN2/THBS3/PIK3R1/COL4A6/PIP5K1C/LAMA4/TNC/HGF/ARHGAP5/KDR/VWF/BCL2/ITGAV/COL2A1/ITGA3/CCND1/FLNC/LAMB1/DOCK1/COL1A2/PTK6/COL1A1/MYL9/PARVA/COL5A3/COL5A2/FLNB/THBS2/SHC2/PDGFC/ITGB5/IGF1R/PDGFRB/CAV1/COL6A2/LAMA2/LAMA3/TNN/LAMB2/ITGA7/THBS4/IGF1/VAV3/PDGFD/COMP/CHAD
## WP236                                                                                                                                                                                                    FZD1/RB1/KLF7/CTNNB1/NR3C1/SMAD3/IRS2/ID3/STAT6/NCOA1/RXRG/GDF10/RARA/STAT5A/EPAS1/AHR/NRIP1/NDN/MEF2C/FOXO1/PPARG/SPOCK1/LPL/CYP26A1/UCP1/WNT5B/LEP/CFD/PRLR/SREBF1/LIPE/FRZB/IRS1/IL6ST/IGF1/SFRP4/ADIPOQ/PCK1/BMP4/PLIN1/GATA3
## WP474                                                                                                                                                                                                                                                                          DDR2/CAB39/RUNX2/PTH1R/CDKN1C/COL2A1/FGFR3/MGP/TGFB2/MEF2C/MMP13/IFT88/TIMP3/FGF2/ADAMTS5/SCIN/IGF1R/GHR/FGF18/ENPP1/GLI3/PTHLH/FRZB/PLAT/COL10A1/CST5/IGF1
## WP2911                                                                                                                                                                                                                                                                                                                                   ITGA1/THBS1/LAMC1/LAMA4/COL6A1/COL1A2/COL6A3/COL3A1/COL5A3/COL5A2/THBS2/ITGB5/COL5A1/COL6A2/LAMB2
## WP3664                                                                                                                                                                                                                                                                                                                                                                                      TNKS/FZD8/FZD1/CTNNB1/APC/DKK3/TCF4/LRP1/SFRP4

As an alternative to manually downloading gmt files, install the rWikiPathways package to gain scripting access to the latest gmt files using the downloadPathwayArchive function.

3.3 Cell Marker

cell_markers <- vroom::vroom('http://bio-bigdata.hrbmu.edu.cn/CellMarker/download/Human_cell_markers.txt') %>%
   tidyr::unite("cellMarker", tissueType, cancerType, cellName, sep=", ") %>% 
   dplyr::select(cellMarker, geneID) %>%
   dplyr::mutate(geneID = strsplit(geneID, ', '))
cell_markers
## # A tibble: 2,868 x 2
##    cellMarker                                       geneID  
##    <chr>                                            <list>  
##  1 Kidney, Normal, Proximal tubular cell            <chr [1…
##  2 Liver, Normal, Ito cell (hepatic stellate cell)  <chr [1…
##  3 Endometrium, Normal, Trophoblast cell            <chr [1…
##  4 Germ, Normal, Primordial germ cell               <chr [1…
##  5 Corneal epithelium, Normal, Epithelial cell      <chr [1…
##  6 Placenta, Normal, Cytotrophoblast                <chr [1…
##  7 Periosteum, Normal, Periosteum-derived progenit… <chr [4…
##  8 Amniotic membrane, Normal, Amnion epithelial ce… <chr [2…
##  9 Primitive streak, Normal, Primitive streak cell  <chr [2…
## 10 Adipose tissue, Normal, Stromal vascular fracti… <chr [1…
## # … with 2,858 more rows
y <- enricher(gene, TERM2GENE=cell_markers, minGSSize=1)
DT::datatable(as.data.frame(y))
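The cell_markers table above stores the genes of each marker set in a list column. If a plain two-column table with one row per term-gene pair is preferred, the list column can be expanded with base R. This is a sketch on made-up data that reuses the two column names; it does not download the CellMarker file, and the cell type labels and gene IDs are invented.

```r
# Made-up stand-in for cell_markers: one row per cell type, genes in a list column.
markers <- data.frame(cellMarker = c("Tissue A, Normal, Cell X",
                                     "Tissue B, Normal, Cell Y"),
                      stringsAsFactors = FALSE)
markers$geneID <- list(c("1", "2", "3"), c("2", "4"))

# Expand to long format: repeat each term once per gene it contains.
markers_long <- data.frame(
  cellMarker = rep(markers$cellMarker, lengths(markers$geneID)),
  geneID     = unlist(markers$geneID),
  stringsAsFactors = FALSE
)
print(markers_long)
```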

3.4 MSigDB analysis

Molecular Signatures Database contains 8 major collections:

  • H: hallmark gene sets
  • C1: positional gene sets
  • C2: curated gene sets
  • C3: motif gene sets
  • C4: computational gene sets
  • C5: GO gene sets
  • C6: oncogenic signatures
  • C7: immunologic signatures

Users can download GMT files from the Broad Institute and use read.gmt to parse them for use in enricher() and GSEA().
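For orientation, each line of a GMT file holds one gene set: a set name, a description field, and then the member genes, all tab-separated. The base R sketch below turns two made-up lines into a long term-gene table. It only illustrates the shape of the data and is not the implementation of read.gmt; the set names and gene IDs are invented.

```r
# Two made-up GMT lines: set name, description, then genes (tab-separated).
gmt_lines <- c(
  "SET_ONE\tmade-up description\t101\t102\t103",
  "SET_TWO\tmade-up description\t102\t104"
)

fields    <- strsplit(gmt_lines, "\t", fixed = TRUE)
set_names <- vapply(fields, function(f) f[1], character(1))
genes     <- lapply(fields, function(f) f[-(1:2)])  # drop name and description

# Long format: column 1 = term (set name), column 2 = gene.
toy_t2g <- data.frame(
  term = rep(set_names, lengths(genes)),
  gene = unlist(genes),
  stringsAsFactors = FALSE
)
print(toy_t2g)
```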

There is an R package, msigdbr, that already packages the MSigDB gene sets in tidy data format, so they can be used directly with clusterProfiler.

It supports several species:

library(msigdbr)
msigdbr_show_species()
##  [1] "Bos taurus"               "Caenorhabditis elegans"  
##  [3] "Canis lupus familiaris"   "Danio rerio"             
##  [5] "Drosophila melanogaster"  "Gallus gallus"           
##  [7] "Homo sapiens"             "Mus musculus"            
##  [9] "Rattus norvegicus"        "Saccharomyces cerevisiae"
## [11] "Sus scrofa"

We can retrieve all human gene sets:

m_df <- msigdbr(species = "Homo sapiens")
head(m_df, 2) %>% as.data.frame
##          gs_name  gs_id gs_cat gs_subcat human_gene_symbol
## 1 AAACCAC_MIR140 M12609     C3       MIR             ABCC4
## 2 AAACCAC_MIR140 M12609     C3       MIR             ACTN4
##   species_name entrez_gene gene_symbol sources
## 1 Homo sapiens       10257       ABCC4    <NA>
## 2 Homo sapiens          81       ACTN4    <NA>

Or a specific collection. Here we use C6, the oncogenic signatures, as an example:

m_t2g <- msigdbr(species = "Homo sapiens", category = "C6") %>% 
  dplyr::select(gs_name, entrez_gene)
head(m_t2g)
## # A tibble: 6 x 2
##   gs_name              entrez_gene
##   <chr>                      <int>
## 1 AKT_UP_MTOR_DN.V1_DN       25864
## 2 AKT_UP_MTOR_DN.V1_DN          95
## 3 AKT_UP_MTOR_DN.V1_DN      137872
## 4 AKT_UP_MTOR_DN.V1_DN         134
## 5 AKT_UP_MTOR_DN.V1_DN       55326
## 6 AKT_UP_MTOR_DN.V1_DN         271
em <- enricher(gene, TERM2GENE=m_t2g)
em2 <- GSEA(geneList, TERM2GENE = m_t2g)
head(em)
##                                            ID
## RPS14_DN.V1_DN                 RPS14_DN.V1_DN
## GCNP_SHH_UP_LATE.V1_UP GCNP_SHH_UP_LATE.V1_UP
## VEGF_A_UP.V1_DN               VEGF_A_UP.V1_DN
## PRC2_EZH2_UP.V1_DN         PRC2_EZH2_UP.V1_DN
## CSR_LATE_UP.V1_UP           CSR_LATE_UP.V1_UP
## E2F1_UP.V1_UP                   E2F1_UP.V1_UP
##                                   Description GeneRatio
## RPS14_DN.V1_DN                 RPS14_DN.V1_DN    22/186
## GCNP_SHH_UP_LATE.V1_UP GCNP_SHH_UP_LATE.V1_UP    16/186
## VEGF_A_UP.V1_DN               VEGF_A_UP.V1_DN    15/186
## PRC2_EZH2_UP.V1_DN         PRC2_EZH2_UP.V1_DN    14/186
## CSR_LATE_UP.V1_UP           CSR_LATE_UP.V1_UP    12/186
## E2F1_UP.V1_UP                   E2F1_UP.V1_UP    12/186
##                          BgRatio       pvalue     p.adjust
## RPS14_DN.V1_DN         187/11250 4.072878e-13 6.923892e-11
## GCNP_SHH_UP_LATE.V1_UP 183/11250 5.657195e-08 4.808615e-06
## VEGF_A_UP.V1_DN        193/11250 6.903522e-07 3.911996e-05
## PRC2_EZH2_UP.V1_DN     195/11250 4.197805e-06 1.784067e-04
## CSR_LATE_UP.V1_UP      172/11250 2.766080e-05 9.404672e-04
## E2F1_UP.V1_UP          189/11250 6.963494e-05 1.972990e-03
##                              qvalue
## RPS14_DN.V1_DN         5.616284e-11
## GCNP_SHH_UP_LATE.V1_UP 3.900487e-06
## VEGF_A_UP.V1_DN        3.173198e-05
## PRC2_EZH2_UP.V1_DN     1.447138e-04
## CSR_LATE_UP.V1_UP      7.628557e-04
## E2F1_UP.V1_UP          1.600382e-03
##                                                                                                                                        geneID
## RPS14_DN.V1_DN         10874/55388/991/9493/1062/4605/9133/23397/79733/9787/55872/83461/54821/51659/9319/9055/10112/4174/5105/2532/7021/79901
## GCNP_SHH_UP_LATE.V1_UP                                      55388/7153/79733/6241/9787/51203/983/9212/1111/9319/9055/3833/6790/4174/3169/1580
## VEGF_A_UP.V1_DN                                                   8318/9493/1062/9133/10403/6241/9787/4085/332/3832/7272/891/23362/2167/10234
## PRC2_EZH2_UP.V1_DN                                               8318/55388/4605/23397/9787/55355/10460/81620/2146/7272/9212/11182/3887/24137
## CSR_LATE_UP.V1_UP                                                               55143/2305/4605/6241/11065/55872/983/332/2146/51659/9319/1580
## E2F1_UP.V1_UP                                                                  55388/7153/23397/79733/9787/2146/2842/9212/8208/1111/9055/3833
##                        Count
## RPS14_DN.V1_DN            22
## GCNP_SHH_UP_LATE.V1_UP    16
## VEGF_A_UP.V1_DN           15
## PRC2_EZH2_UP.V1_DN        14
## CSR_LATE_UP.V1_UP         12
## E2F1_UP.V1_UP             12
head(em2)
##                              ID     Description setSize
## RAF_UP.V1_DN       RAF_UP.V1_DN    RAF_UP.V1_DN     189
## LEF1_UP.V1_DN     LEF1_UP.V1_DN   LEF1_UP.V1_DN     188
## PIGF_UP.V1_UP     PIGF_UP.V1_UP   PIGF_UP.V1_UP     187
## LTE2_UP.V1_UP     LTE2_UP.V1_UP   LTE2_UP.V1_UP     184
## IL2_UP.V1_DN       IL2_UP.V1_DN    IL2_UP.V1_DN     183
## ATF2_S_UP.V1_DN ATF2_S_UP.V1_DN ATF2_S_UP.V1_DN     182
##                 enrichmentScore       NES      pvalue
## RAF_UP.V1_DN         -0.5521243 -2.232199 0.001398601
## LEF1_UP.V1_DN        -0.4938672 -1.994263 0.001406470
## PIGF_UP.V1_UP        -0.3815946 -1.538346 0.001408451
## LTE2_UP.V1_UP        -0.3938000 -1.584654 0.001412429
## IL2_UP.V1_DN         -0.4237496 -1.704120 0.001416431
## ATF2_S_UP.V1_DN      -0.4456885 -1.791187 0.001420455
##                  p.adjust    qvalues rank
## RAF_UP.V1_DN    0.0140625 0.00783208 2035
## LEF1_UP.V1_DN   0.0140625 0.00783208 1705
## PIGF_UP.V1_UP   0.0140625 0.00783208 2975
## LTE2_UP.V1_UP   0.0140625 0.00783208 2110
## IL2_UP.V1_DN    0.0140625 0.00783208 2774
## ATF2_S_UP.V1_DN 0.0140625 0.00783208 2531
##                                   leading_edge
## RAF_UP.V1_DN    tags=43%, list=16%, signal=37%
## LEF1_UP.V1_DN   tags=36%, list=14%, signal=32%
## PIGF_UP.V1_UP   tags=33%, list=24%, signal=25%
## LTE2_UP.V1_UP   tags=33%, list=17%, signal=28%
## IL2_UP.V1_DN    tags=39%, list=22%, signal=31%
## ATF2_S_UP.V1_DN tags=39%, list=20%, signal=32%
##                                                                                                                                                                                                                                                                                                                                                                                                                                            core_enrichment
## RAF_UP.V1_DN    221037/1960/51340/55105/10265/22996/9687/5357/1952/12/4602/9231/596/51454/57419/595/4256/23022/8777/7837/7042/3397/57613/323/1831/6451/54843/7358/2353/8773/6938/8991/64699/57007/23389/10560/7227/3485/79068/10769/4254/26018/3480/2674/23327/857/116039/6542/2690/1955/1363/26353/7033/5376/89927/51363/8821/2239/6947/4886/214/3487/3667/5157/54847/54898/7031/6505/57535/10451/18/771/80129/5174/5507/56521/8839/8614/5241/10551/57758
## LEF1_UP.V1_DN                                                            22998/10325/23221/79170/9187/4300/51479/25956/7552/8644/11162/7168/1490/54463/65084/25837/80303/1809/3397/4208/11223/79762/6451/481/54795/51626/8991/5950/5627/64699/6414/10103/221078/9961/5874/56898/90865/57235/83989/5002/6653/56034/55314/85458/956/54502/4832/5783/1363/53832/54985/80221/84786/214/50853/51149/57088/5157/2947/79932/7031/6505/9338/25924/80736/3169/10551
## PIGF_UP.V1_UP                                                                                                         1362/8803/9321/27067/9652/5756/23517/79882/9655/10124/4750/131544/178/81550/51351/221154/220988/9497/6416/9236/9988/22873/23469/23272/4925/8743/9976/3964/1429/26036/10142/114882/2701/140890/167227/10795/221981/862/1295/22915/273/1102/9823/4319/3295/29994/3075/1462/9674/10443/4675/1359/3572/7049/7503/2743/2167/10234/10351/9
## LTE2_UP.V1_UP                                                                                                       64759/1601/6558/7138/11138/10497/81031/665/2878/3554/4925/55005/9252/2034/5283/54869/5025/6019/65124/10979/51435/26999/4212/444/23588/55556/2954/93129/79762/54453/79789/316/79679/151011/51280/9630/57037/5627/9867/55198/6653/10455/5101/65055/8309/9429/23303/9590/253190/4982/4128/4680/1846/51760/80129/4857/5507/3158/5304/10974
## IL2_UP.V1_DN                                                  10858/5333/2977/79874/10014/7773/64798/26959/6588/579/777/5158/23361/9915/5087/64759/2535/6123/7010/51285/126353/80201/79841/1028/2322/4222/947/9187/225/9976/5592/54996/3202/55779/23148/5334/8605/57326/26115/54453/55766/64927/6591/55204/9940/57608/653483/9886/10279/55805/10628/901/9891/7102/27244/79921/4674/8292/5125/79843/6857/79940/23541/3357/50853/9863/79932/90362/55663/4250
## ATF2_S_UP.V1_DN                                                              53616/55667/51279/4642/7130/7139/2/710/80352/4608/11145/881/2252/3399/22846/10687/12/1397/1301/3554/2619/26011/78987/9284/1306/6586/9627/4054/54813/115207/55890/26249/4803/11075/3400/79789/8527/2353/10580/79987/80760/5493/90865/3485/3751/2202/2199/10610/1047/1294/23492/3908/5364/658/5350/2121/2331/4330/9737/4982/57088/8490/9358/6424/1307/3708/125/1311/56521/10351

We can test with other collections, for example, using C3 to test whether genes that share a specific motif are up- or down-regulated.

m_t2g <- msigdbr(species = "Homo sapiens", category = "C3") %>% 
  dplyr::select(gs_name, entrez_gene)
head(m_t2g)
## # A tibble: 6 x 2
##   gs_name        entrez_gene
##   <chr>                <int>
## 1 AAACCAC_MIR140       10257
## 2 AAACCAC_MIR140          81
## 3 AAACCAC_MIR140          90
## 4 AAACCAC_MIR140        8754
## 5 AAACCAC_MIR140       11096
## 6 AAACCAC_MIR140         177
em3 <- GSEA(geneList, TERM2GENE = m_t2g)
head(em3)
##                                                            ID
## TGACATY_UNKNOWN                               TGACATY_UNKNOWN
## WTGAAAT_UNKNOWN                               WTGAAAT_UNKNOWN
## GGATTA_PITX2_Q2                               GGATTA_PITX2_Q2
## ACCAAAG_MIR9                                     ACCAAAG_MIR9
## TTTGCAC_MIR19A_MIR19B                   TTTGCAC_MIR19A_MIR19B
## CAGTATT_MIR200B_MIR200C_MIR429 CAGTATT_MIR200B_MIR200C_MIR429
##                                                   Description
## TGACATY_UNKNOWN                               TGACATY_UNKNOWN
## WTGAAAT_UNKNOWN                               WTGAAAT_UNKNOWN
## GGATTA_PITX2_Q2                               GGATTA_PITX2_Q2
## ACCAAAG_MIR9                                     ACCAAAG_MIR9
## TTTGCAC_MIR19A_MIR19B                   TTTGCAC_MIR19A_MIR19B
## CAGTATT_MIR200B_MIR200C_MIR429 CAGTATT_MIR200B_MIR200C_MIR429
##                                setSize enrichmentScore
## TGACATY_UNKNOWN                    497      -0.3193965
## WTGAAAT_UNKNOWN                    458      -0.3214586
## GGATTA_PITX2_Q2                    449      -0.3345811
## ACCAAAG_MIR9                       393      -0.3338362
## TTTGCAC_MIR19A_MIR19B              397      -0.3659786
## CAGTATT_MIR200B_MIR200C_MIR429     383      -0.3564722
##                                      NES      pvalue
## TGACATY_UNKNOWN                -1.423590 0.001226994
## WTGAAAT_UNKNOWN                -1.424123 0.001226994
## GGATTA_PITX2_Q2                -1.477988 0.001237624
## ACCAAAG_MIR9                   -1.462621 0.001251564
## TTTGCAC_MIR19A_MIR19B          -1.602838 0.001251564
## CAGTATT_MIR200B_MIR200C_MIR429 -1.554166 0.001270648
##                                  p.adjust    qvalues rank
## TGACATY_UNKNOWN                0.02490912 0.01753554 2961
## WTGAAAT_UNKNOWN                0.02490912 0.01753554 1523
## GGATTA_PITX2_Q2                0.02490912 0.01753554 2464
## ACCAAAG_MIR9                   0.02490912 0.01753554 2771
## TTTGCAC_MIR19A_MIR19B          0.02490912 0.01753554 4078
## CAGTATT_MIR200B_MIR200C_MIR429 0.02490912 0.01753554 3186
##                                                  leading_edge
## TGACATY_UNKNOWN                tags=30%, list=24%, signal=24%
## WTGAAAT_UNKNOWN                tags=18%, list=12%, signal=17%
## GGATTA_PITX2_Q2                tags=26%, list=20%, signal=22%
## ACCAAAG_MIR9                   tags=27%, list=22%, signal=22%
## TTTGCAC_MIR19A_MIR19B          tags=44%, list=33%, signal=30%
## CAGTATT_MIR200B_MIR200C_MIR429 tags=34%, list=25%, signal=26%
##                                core_enrichment
## TGACATY_UNKNOWN                800/5168/55691/5789/11080/6400/55512/1112/9604/29760/4638/3218/27314/2065/4286/9459/4026/7026/4335/2263/53353/54329/7082/10446/5930/8567/51334/10174/8626/687/3321/83478/6588/5441/8929/7881/9846/23328/8924/220988/146057/55909/4661/29116/23405/94134/56171/63898/4131/4776/80000/627/9748/2908/11138/23767/10290/5295/1288/9445/10231/29/3373/2669/5191/4211/5813/25945/7716/596/4300/2078/3931/8633/9627/2872/3213/80021/2261/11194/3223/4212/8848/5837/6764/23037/57326/7106/590/7042/4208/55084/50650/80052/6450/323/481/8436/2353/10370/862/9120/727/79899/23452/1293/10580/60481/54838/55204/5166/2845/81603/6876/8522/1592/5577/3489/6310/2202/29951/27239/55821/9499/1501/57161/6571/4035/8821/5744/90627/4922/55800/4223/10472/23303/1264/54587/9590/9576/185/4675/576/1287/9358/3479/5348/776/771/79689
## WTGAAAT_UNKNOWN                2823/80762/50486/4628/9736/324/80070/51585/9627/8082/11211/4885/64112/7482/5334/79645/4734/80059/4208/11075/55084/2012/6709/6003/55629/8829/1581/54843/2353/862/727/10516/10580/1848/55204/5166/89795/2845/79776/9886/761/116496/4488/11096/10468/23677/9201/3075/6925/4121/443/55184/57332/5205/10129/2819/57161/51306/55812/3249/54361/54970/5744/26960/10186/10472/388677/4330/53829/26137/9079/8404/36/3572/3479/1602/23090/81563/2167/8614/57758/4969
## GGATTA_PITX2_Q2                311/3199/4604/57556/8019/3321/57835/8929/6671/1760/10401/4430/9497/5530/6532/54894/23528/5228/5087/2295/6453/56171/3232/4921/3005/221037/56912/10265/10521/10497/23767/8322/37/23429/6561/4303/57082/845/94/51447/9922/6258/23243/2571/10253/4602/29119/55068/126393/7402/56675/5624/7716/7779/596/4300/57804/79923/1280/100506658/9627/3675/26037/2261/2804/65084/11030/3216/57326/140890/23255/5396/9854/1907/54843/57037/862/23452/55152/1848/8835/89795/57007/64221/5136/7227/10324/55273/6310/2202/56034/956/54502/51474/4254/6925/6097/744/443/23446/4306/4915/4487/25803/4081/4223/10472/388677/55107/347902/3708/10742/51313/4857/730/4036/3169
## ACCAAAG_MIR9                   80235/26119/4026/79665/4884/4641/9706/10242/766/9874/54329/54916/10014/9655/10174/4750/22859/79646/8289/4130/83478/54778/9098/9846/9781/9341/10420/1997/6575/4131/57147/957/1960/55914/22891/10521/11138/10424/9149/26504/23767/860/8648/23506/9445/56995/6548/2571/23621/604/5813/549/558/5099/23111/27303/1306/6843/115207/26249/22863/4734/80059/6785/5396/3400/79589/23253/80267/7586/481/2494/22795/91775/5166/54861/900/10181/10560/9823/11167/9886/3778/51232/11096/56034/6444/85458/4254/23116/5159/25827/2045/55893/54816/10129/5376/51363/54361/90627/10186/214/55107/9828/25799/2200/10451
## TTTGCAC_MIR19A_MIR19B          1605/659/10318/4299/6249/81609/57194/9686/10425/9839/23013/22936/989/51742/4154/55625/51460/51592/11127/3976/54980/9751/56937/6045/4204/27252/4862/27236/2115/5869/152006/57698/57018/23032/56145/10150/55054/4090/25852/10618/2060/7107/10771/7071/9044/10904/9752/6431/8473/83452/392/10395/6777/51230/22889/23236/55652/158471/6595/905/56146/4026/79665/5932/115/8325/546/10154/9655/23499/54521/26959/23612/3321/56929/54778/26060/56848/9098/107/10420/23001/29922/9759/29116/23405/94134/2295/6453/83660/51719/8654/11138/2186/23341/8038/860/2004/9256/6548/23621/23503/29/394/220594/831/92689/55082/54014/1901/23111/7552/27303/1490/51454/57419/66008/2318/6252/7248/4929/23041/9900/55727/91694/22863/1389/54796/3400/9649/23253/54453/7048/22862/22905/23365/79899/5412/60481/54838/89795/5793/23389/7227/6310/3625/26018/23492/2099/23116/744/775/90355/388/23414/11069/6857/26960/55638/10472/2152/2697/3131/7786/55107/9863/23261/3479/3708/10742/2018/4036/2066
## CAGTATT_MIR200B_MIR200C_MIR429 31/23291/5179/57092/9969/9865/8473/9644/10395/6383/5569/1392/4008/9647/11041/8613/6405/4884/8491/9706/2263/6567/53353/64766/5310/546/5128/55145/54469/200576/8434/7003/687/3915/1456/3321/10499/23189/26060/55578/1998/51421/22864/10484/10140/23001/4661/253782/9807/83660/95681/1960/2908/51719/2186/22846/108/9710/80205/2104/9522/7320/23243/23258/4602/129642/2669/51439/7716/54014/23047/5099/2823/3340/1942/7552/1843/100506658/8204/10979/79618/23015/8848/4692/5334/57515/26249/6563/23122/6196/4077/2494/10370/3953/51339/79365/54838/89795/8076/5577/3751/6310/1009/29994/10769/1501/57509/3899/2273/23116/4908/27244/775/5783/65055/4035/23414/2737/5744/6857/55800/10186/3131/9828/6935/4982/219654/1602/3708/4857