Using groundhog with foreach loops

The foreach package runs loops in parallel, using a computer's multiple cores for faster processing. It does this by launching multiple instances (environments) of R in the background. This matters for reproducibility because those background environments load packages from the default R library, not from groundhog's. This can be fixed with three lines of code inside the loop.

Specifically, one includes a groundhog.library() call within the loop. To avoid reloading the package on every iteration, an if statement runs the call only if groundhog has not yet been loaded, so it runs once per parallel environment created.

Below is an example of a loop that runs a function from the package pwr. Simply replace pwr with the package(s) needed by the functions inside your loop.

#Use groundhog to load the packages used in the parallel loop
  library('groundhog')
  pkgs <- c('foreach','doParallel','parallel')
  groundhog.library(pkgs, '2022-03-01')

#Register a parallel backend so that %dopar% runs in parallel (2 workers here; adjust as needed)
  cl <- makeCluster(2)
  registerDoParallel(cl)

#Create the loop with 'foreach'
  sample.sizes <- foreach(simk=1:10, .combine='c') %dopar% {

    #KEY LINES: if 'groundhog' has not yet been loaded in the new environment, load it and 'pwr'
    if (!('groundhog' %in% .packages())) {     #if groundhog is not loaded in background environment
      library('groundhog')                     #load it
      groundhog.library('pwr', '2022-03-01')   #and use it to load 'pwr'; replace pwr with pkg(s) you need
    }

    #Now comes the code for the loop itself, with the function that needed 'pwr'
    powerk <- runif(1, min=.50, max=.90)
    pwr.t.test(d=.5, power=powerk)$n           #sample size (per group) needed for the drawn power
  }

#Shut down the background R instances
  stopCluster(cl)
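
To see why the if statement is needed, and why it runs only once per background environment, the following is a minimal sketch that uses only packages shipped with R. It does not use groundhog at all: the base package splines stands in for the package to be loaded, and the helper name guard is hypothetical. It assumes the background environments are separate R sessions started with makeCluster(), which begin without the packages attached in the main session.

```r
# Illustration only: base R's 'parallel' and 'splines' stand in for groundhog and 'pwr'
library(parallel)
library(splines)                     # attached in the main session only

cl <- makeCluster(2)                 # start two fresh background R sessions

# Ask each background session which packages it has attached
worker_pkgs <- clusterEvalQ(cl, .packages())

# 'splines' is attached in the main session, but the workers did not inherit it
main_has_splines     <- 'splines' %in% .packages()
workers_have_splines <- sapply(worker_pkgs, function(p) 'splines' %in% p)

# Same guard pattern as in the loop: attach the package only if it is not yet attached
guard <- function() {
  if (!('splines' %in% .packages())) {
    library('splines')
    return('attached now')
  }
  'already attached'
}

first_call  <- unlist(clusterCall(cl, guard))   # guard body runs in each worker
second_call <- unlist(clusterCall(cl, guard))   # guard is skipped the second time

stopCluster(cl)                      # shut down the background sessions
```

Comparing first_call with second_call shows the pattern at work: the package is attached on the first call in each background session and skipped afterwards, which is what the if statement around groundhog.library() achieves across iterations of the foreach loop.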