Commit 16eec28e authored by liiskolb

v 0.2.1 sent to CRAN

parent 14030369
......@@ -4,7 +4,7 @@ Title: Interface to the 'g:Profiler' Toolset
Version: 0.2.1
Author: Liis Kolberg <liis.kolberg@ut.ee>, Uku Raudvere <uku.raudvere@ut.ee>
Maintainer: Liis Kolberg <liis.kolberg@ut.ee>
Description: A toolset for functional enrichment analysis and visualization, gene/protein/SNP identifier conversion and mapping orthologous genes across species via 'g:Profiler' (<https://biit.cs.ut.ee/gprofiler>).
Description: A toolset for functional enrichment analysis and visualization, gene/protein/SNP identifier conversion and mapping orthologous genes across species via 'g:Profiler' (<https://biit.cs.ut.ee/gprofiler/>).
The main tools are:
(1) 'g:GOSt' - functional enrichment analysis and visualization of gene lists;
(2) 'g:Convert' - gene/protein/transcript identifier conversion across various namespaces;
......@@ -14,7 +14,6 @@ Description: A toolset for functional enrichment analysis and visualization, gen
BugReports: https://biit.cs.ut.ee/gprofiler/page/contact
License: GPL (>= 2)
Encoding: UTF-8
LazyData: true
RoxygenNote: 7.1.1
Imports: jsonlite, RCurl, ggplot2, plotly, tidyr (>= 1.0.0), crosstalk, grDevices, gridExtra, grid, viridisLite, dplyr
Depends: R (>= 3.5)
......
......@@ -161,9 +161,20 @@ gost <- function(query,
url = file.path("http://biit.cs.ut.ee", "gplink", "l")
# get data_version
version_info = try(gprofiler2::get_version_info(), silent = TRUE)
if("try-error" %in% class(version_info)){
base_url = gprofiler2::get_base_url()
# get data version from archive
data_version = basename(base_url)
}else{
data_version = version_info$gprofiler_version
}
body <- jsonlite::toJSON((
list(
url = jsonlite::unbox(file.path(gprofiler2::get_base_url(), "gost")),
data_version = jsonlite::unbox(data_version),
payload = {
list(
organism = jsonlite::unbox(organism),
......@@ -215,7 +226,7 @@ gost <- function(query,
list("Accept" = "application/json",
"Content-Type" = "application/json",
"charset" = "UTF-8",
"Expect" = "")
"Expect" = "")
oldw <- getOption("warn")
options(warn = -1)
h1 = RCurl::basicTextGatherer(.mapUnicode = FALSE)
......@@ -302,13 +313,13 @@ gost <- function(query,
genes <- meta$genes_metadata$query[[query]]$ensgs[which(lengths(evcodes) > 0)]
genes2 <- lapply(genes, function(x) ifelse(x %in% genemap, names(which(genemap == x)) , x))
return(paste0(genes2, collapse = ","))
},
},
df$intersections,
df$query,
SIMPLIFY = TRUE
)
df$evidence_codes <- sapply(df$intersections, function(x)
paste0(sapply(x[which(lengths(x) > 0)], paste0, collapse = " "), collapse = ","), USE.NAMES = FALSE)
paste0(sapply(x[which(lengths(x) > 0)], paste0, collapse = " "), collapse = ","), USE.NAMES = FALSE)
}
# Order by query, source and p_value
......
......@@ -41,12 +41,12 @@ knitr::opts_chunk$set(
## Overview
[gprofiler2](https://CRAN.R-project.org/package=gprofiler2) provides an R interface to the widely used web toolset g:Profiler ([https://biit.cs.ut.ee/gprofiler](https://biit.cs.ut.ee/gprofiler)) @gp.
[gprofiler2](https://CRAN.R-project.org/package=gprofiler2) provides an R interface to the widely used web toolset g:Profiler ([https://biit.cs.ut.ee/gprofiler](https://biit.cs.ut.ee/gprofiler/)) @gp.
The toolset performs functional enrichment analysis and visualization of gene lists, converts gene/protein/SNP identifiers to numerous namespaces, and maps orthologous genes across species.
[g:Profiler](https://biit.cs.ut.ee/gprofiler) relies on [Ensembl databases](https://www.ensembl.org/index.html) as the primary data source and follows their release cycle for updates.
[g:Profiler](https://biit.cs.ut.ee/gprofiler/) relies on [Ensembl databases](https://www.ensembl.org/index.html) as the primary data source and follows their release cycle for updates.
The main tools in [g:Profiler](https://biit.cs.ut.ee/gprofiler) are:
The main tools in [g:Profiler](https://biit.cs.ut.ee/gprofiler/) are:
* [g:GOSt](https://biit.cs.ut.ee/gprofiler/gost) - functional enrichment analysis of gene lists
* [g:Convert](https://biit.cs.ut.ee/gprofiler/convert) - gene/protein/transcript identifier conversion across various namespaces
......@@ -65,7 +65,7 @@ Corresponding functions in the [gprofiler2](https://CRAN.R-project.org/package=g
[gprofiler2](https://CRAN.R-project.org/package=gprofiler2) uses the [publicly available APIs](https://biit.cs.ut.ee/gprofiler/page/apis) of the g:Profiler web tool which ensures that the results from all of the interfaces are consistent.
The package corresponds to the 2019 update of [g:Profiler](https://biit.cs.ut.ee/gprofiler) and provides access for versions *e94_eg41_p11* and higher. The older versions are available from the previous R package [gProfileR](https://CRAN.R-project.org/package=gProfileR).
The package corresponds to the 2019 update of [g:Profiler](https://biit.cs.ut.ee/gprofiler/) and provides access for versions *e94_eg41_p11* and higher. The older versions are available from the previous R package [gProfileR](https://CRAN.R-project.org/package=gProfileR).
----
......@@ -304,13 +304,18 @@ Available data sources and their abbreviations are:
* [Gene Ontology](http://geneontology.org/) (GO or by branch GO\:MF, GO\:BP, GO\:CC)
* [KEGG](https://www.genome.jp/kegg/) (KEGG)
* [Reactome](https://reactome.org/) (REAC)
* [WikiPathways](https://www.wikipathways.org) (WP)
* [WikiPathways](https://www.wikipathways.org/index.php/WikiPathways) (WP)
* [TRANSFAC](https://genexplain.com/transfac/) (TF)
* [miRTarBase](http://mirtarbase.mbc.nctu.edu.tw/php/index.php) (MIRNA)
* [miRTarBase](https://mirtarbase.cuhk.edu.cn/~miRTarBase/miRTarBase_2019/php/index.php) (MIRNA)
* [Human Protein Atlas](https://www.proteinatlas.org/) (HPA)
* [CORUM](https://mips.helmholtz-muenchen.de/corum/) (CORUM)
* [Human phenotype ontology](https://hpo.jax.org/) (HP)
* [Human phenotype ontology](https://hpo.jax.org/app/) (HP)
The function `get_version_info` returns the full metadata about the versions of the different data sources for a given `organism`.
```{r}
get_version_info(organism = "hsapiens")
```
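Because `get_version_info` queries the g:Profiler server, the call can fail. The `gost` change in this commit guards against that with `try()` and falls back to the last path component of the base URL. Below is a minimal sketch of that pattern. The helper name `resolve_data_version` and its two arguments are hypothetical and are not part of gprofiler2; the version getter is passed in as a function so that the fallback can be exercised without network access.

```r
# Hypothetical helper (not part of gprofiler2): return the data version,
# falling back to the last path component of the base URL on failure.
resolve_data_version <- function(get_version, base_url) {
  # try() captures an error instead of stopping execution
  version_info <- try(get_version(), silent = TRUE)
  if (inherits(version_info, "try-error")) {
    # e.g. an archive-style URL whose final component names the version
    basename(base_url)
  } else {
    version_info$gprofiler_version
  }
}

# Illustrative call with a getter that always fails; the URL is a placeholder
resolve_data_version(
  get_version = function() stop("server unreachable"),
  base_url = "https://example.org/archive/e94_eg41_p11"
)
```

With the real package, the assumed usage would be `resolve_data_version(gprofiler2::get_version_info, gprofiler2::get_base_url())`.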
### Custom data sources with `upload_GMT_file`
......@@ -329,7 +334,7 @@ download.file(url = "http://software.broadinstitute.org/gsea/resources/msigdb/7.
upload_GMT_file(gmtfile = "extdata/biocarta.gmt")
```
The result is a string that denotes the unique ID of the uploaded data source in the [g:Profiler](https://biit.cs.ut.ee/gprofiler) database. In this example, the ID is **gp\_ \_TEXF\_hZLM\_d18**.
The result is a string that denotes the unique ID of the uploaded data source in the [g:Profiler](https://biit.cs.ut.ee/gprofiler/) database. In this example, the ID is **gp\_ \_TEXF\_hZLM\_d18**.
After the upload, this ID can be used as a value for the parameter `organism` in the `gost` function. The input `query` should consist of identifiers that are available in the GMT file. Note that all the genes in the GMT file define the domain size and therefore it is not sufficient to include only the selection of interesting terms to the file.
......@@ -340,14 +345,14 @@ custom_gostres <- gost(query = c("MAPK3", "PIK3C2G", "HRAS", "PIK3R1", "MAP2K1",
head(custom_gostres$result, 3)
```
There is no need to upload the same GMT file(s) again before every enrichment analysis. A file needs to be uploaded only once, and the ID can then be used in any further enrichment analyses that are based on that custom source. The same ID can also be used in the [web tool](https://biit.cs.ut.ee/gprofiler) as a token under the Custom GMT options.
There is no need to upload the same GMT file(s) again before every enrichment analysis. A file needs to be uploaded only once, and the ID can then be used in any further enrichment analyses that are based on that custom source. The same ID can also be used in the [web tool](https://biit.cs.ut.ee/gprofiler/) as a token under the Custom GMT options.
For example, the same query in the web tool is available from [https://biit.cs.ut.ee/gplink/l/jh3HdbUWQZ](https://biit.cs.ut.ee/gplink/l/jh3HdbUWQZ).
----
## Creating a Generic Enrichment Map (GEM) file for EnrichmentMap
Generic Enrichment Map (GEM) is a file format that can be used as an input for [Cytoscape EnrichmentMap application](http://apps.cytoscape.org/apps/enrichmentmap). In EnrichmentMap you can set the Analysis Type parameter as **Generic/gProfiler** and upload the required files: GEM file with enrichment results (input field **Enrichments**) and GMT file that defines the annotations (input field **GMT**).
Generic Enrichment Map (GEM) is a file format that can be used as an input for [Cytoscape EnrichmentMap application](https://apps.cytoscape.org/apps/enrichmentmap). In EnrichmentMap you can set the Analysis Type parameter as **Generic/gProfiler** and upload the required files: GEM file with enrichment results (input field **Enrichments**) and GMT file that defines the annotations (input field **GMT**).
For a single query, the GEM file can be generated and saved using the following commands:
......@@ -376,7 +381,7 @@ Here the parameter `file` should be the character string naming the file togethe
In addition to the GEM file, EnrichmentMap also requires the data source description GMT file as an input. For example, if you are using g:Profiler default data sources and your input query consists of human ENSG identifiers, then the required GMT file is available from [https://biit.cs.ut.ee/gprofiler/static/gprofiler_full_hsapiens.ENSG.gmt](https://biit.cs.ut.ee/gprofiler/static/gprofiler_full_hsapiens.ENSG.gmt). Note that this file does not include annotations from KEGG and Transfac, because the licenses of these two data sources do not allow us to share them with our users. This means that the enrichment results in the GEM file cannot include results from these resources, otherwise you will get an error from the Cytoscape application. This can be ensured by setting appropriate values for the `sources` parameter in the `gost()` function.
For other organisms, the GMT files are downloadable from the [g:Profiler web page](https://biit.cs.ut.ee/gprofiler) under the *Data sources* section, after setting a suitable value for the organism. If you are using a custom GMT file for your analysis, then this should be uploaded to EnrichmentMap.
For other organisms, the GMT files are downloadable from the [g:Profiler web page](https://biit.cs.ut.ee/gprofiler/) under the *Data sources* section, after setting a suitable value for the organism. If you are using a custom GMT file for your analysis, then this should be uploaded to EnrichmentMap.
In case you want to compare **multiple queries** in EnrichmentMap you could generate individual GEM files for each of the queries and upload these as separate Data sets. This EnrichmentMap option enables you to browse, edit and compare multiple networks simultaneously by color-coding different uploaded Data sets.
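The restriction on KEGG and Transfac described above can be handled by building the `sources` vector explicitly. The following is a rough sketch, assuming the source abbreviations listed in the data-source section of this vignette. The variable names `all_sources`, `gem_sources` and `my_genes` are illustrative only, and the commented-out `gost()` call requires network access.

```r
# Source abbreviations as listed in the vignette's data-source section
all_sources <- c("GO:MF", "GO:BP", "GO:CC", "KEGG", "REAC", "WP",
                 "TF", "MIRNA", "HPA", "CORUM", "HP")

# KEGG and Transfac (TF) are absent from the public GMT file,
# so drop them before producing results meant for EnrichmentMap
gem_sources <- setdiff(all_sources, c("KEGG", "TF"))

# Assumed usage (my_genes is a hypothetical character vector of gene IDs):
# gostres <- gprofiler2::gost(query = my_genes, sources = gem_sources)
```

Keeping the exclusion in one place makes it harder to end up with a GEM file that contains terms missing from the GMT file.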
......