---
title: "Parallel processing"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{parallelization}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
library(flipr)
load("../R/sysdata.rda")
time_without_parallelization <- df_parallelization$time_without_par
time_with_parallelization <- df_parallelization$time_par
```

The [**flipr**](https://permaverse.github.io/flipr/) package uses functions
contained in the [**furrr**](https://future.futureverse.org/index.html) package
for parallel processing. The setting of parallelization has to be done on the
user side. We illustrate here how to achieve asynchronous evaluation. We use the
[**future**](https://future.futureverse.org/index.html) package to set the plan,
the **parallel** package to define a default cluster, and the
[**progressr**](https://progressr.futureverse.org/index.html) package to report
progress updates.

By setting the desired number of cores, we define the number of background R
sessions that will be used to evaluate expressions in parallel. This number is
used to set the multisession plan with the function `future::plan()` and to
define a default cluster with `parallel::setDefaultCluster()`. Then, to enable
the visualization of evaluation progress, we can put the code in the
`progressr::with_progress()` function, or more simply set it for all the
following code with the `progressr::handlers()` function. After these settings,
[**flipr**](https://permaverse.github.io/flipr/) functions can be used, as shown
in this example.

To show the benefit of parallel processing, we compare here the processing times
necessary to evaluate a grid with a plausibility function. First, here is the
computation without parallelization.

```{r, eval=FALSE}
set.seed(1234)
x <- rnorm(10, 1, 1)
y <- rnorm(10, 4, 1)

null_spec <- function(y, parameters) {
  purrr::map(y, ~ .x - parameters[1])
}
stat_functions <- list(stat_t)
stat_assignments <- list(delta = 1)

pf <- PlausibilityFunction$new(
  null_spec = null_spec,
  stat_functions = stat_functions,
  stat_assignments = stat_assignments,
  x, y
)

pf$set_point_estimate(mean(y) - mean(x), overwrite = TRUE)
pf$set_parameter_bounds(
  point_estimate = pf$point_estimate, 
  conf_level = pf$max_conf_level
)
pf$set_grid(
  parameters = pf$parameters, 
  npoints = 50L
)

tictoc::tic()
pf$evaluate_grid(grid = pf$grid)
time_without_parallelization <- tictoc::toc()
```

```{r}
time_without_parallelization 
```

## Computation with parallel processing

By setting the desired number of cores, we define the number of background R
sessions that will be used to evaluate expressions in parallel. This number is
used to set the multisession plan with the function `future::plan()` and to
define a default cluster with `parallel::setDefaultCluster()`. Then, to enable
the visualization of evaluation progress, we can put the code in the
`progressr::with_progress()` function, or more simply set it for all the
following code with the `progressr::handlers()` function. After these settings,
[**flipr**](https://permaverse.github.io/flipr/) functions can be used, as shown
in this example.

```{r, eval=FALSE}
ncores <- 4
future::plan(multisession, workers = ncores)
cl <- parallel::makeCluster(ncores)
parallel::setDefaultCluster(cl)
progressr::handlers(global = TRUE)

set.seed(1234)
x <- rnorm(10, 1, 1)
y <- rnorm(10, 4, 1)

null_spec <- function(y, parameters) {
  purrr::map(y, ~ .x - parameters[1])
}
stat_functions <- list(stat_t)
stat_assignments <- list(delta = 1)

pf <- PlausibilityFunction$new(
  null_spec = null_spec,
  stat_functions = stat_functions,
  stat_assignments = stat_assignments,
  x, y
)

pf$set_point_estimate(mean(y) - mean(x), overwrite = TRUE)
pf$set_parameter_bounds(
  point_estimate = pf$point_estimate, 
  conf_level = pf$max_conf_level
)
pf$set_grid(
  parameters = pf$parameters, 
  npoints = 50L
)

tictoc::tic()
pf$evaluate_grid(grid = pf$grid)
time_with_parallelization <- tictoc::toc()

parallel::stopCluster(cl)
```

It is good practice to shut down the workers with the `parallel::stopCluster()`
function at the end of the code.

```{r}
time_with_parallelization
```

This experiment proves that we can save a lot of computation time when using
parallel processing, as we gained approximately 33 seconds in this example to
evaluate the plausibility function.

Finally, to return to a sequential plan with no progress updates, the following
code can be used.

```{r, eval=FALSE}
future::plan(sequential)
parallel::setDefaultCluster(NULL)
progressr::handlers(global = FALSE)
```