Brian Albert Monroe
This helper package makes code portable between an MPI cluster computer and ordinary machines, so the same parallel code can be used on Linux, Mac OS, and Windows.
Linux and Mac OS are POSIX compliant and can therefore use `fork()` to run parallel tasks, with significant speed benefits and ease of coding; this feature, however, does not exist on Windows. Instead, Windows machines can use a PSOCK cluster for parallel computing. This helper package unifies the syntax so that the same code runs suitably in parallel across all platforms.
#### Requires:
The entire functionality is provided by the parallel package, which is shipped with R by default. Take a look through that documentation; you may decide that ptools is unnecessary for your use case.
Packages are hosted at bamonroe.com/drat.
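As a hedged sketch, assuming the drat repository above is a standard CRAN-like package repository served over HTTPS, installation could look like:

```r
# Install ptools from the drat repository named above.
# The exact repos URL is an assumption based on "bamonroe.com/drat".
install.packages("ptools", repos = "https://bamonroe.com/drat")
```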
With the package installed, it should be loaded as any other package would:
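For example:

```r
library(ptools)
```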
A configuration command needs to be run before any functions are run in parallel:
```r
p_config(type = <type>, host_list = list(), cores = <cores>)
```
The `p_config()` function accepts a `type` argument (either `"FORK"` or `"PSOCK"`), a `host_list` argument (a list named by hostname, with each element giving the cores to use on that host), and/or a numeric value for `cores`.
If the number of cores specified is less than 1, 1 core will be used. If the number of cores specified is greater than the number available, the maximum number available will be used.
When utilizing FORK style parallelization, i.e. running code on Unix-like systems, the number of cores can be changed at any point without any issues.
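For illustration, a minimal sketch of both configurations; the core counts and hostnames here are made up, and the meaning of `host_list` elements is assumed from the description above:

```r
# FORK on a Unix-like machine: use 4 local cores.
p_config(type = "FORK", cores = 4)

# PSOCK across machines: each named element gives the cores
# to use on that host (hostnames are hypothetical).
p_config(type = "PSOCK", host_list = list(node1 = 8, node2 = 8))
```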
The package currently comes with 7 wrapper functions:
```r
p_library("libname1", "libname2", ...)
```
These functions are only useful when using cluster-style (PSOCK) parallelization. However, they do absolutely nothing under FORK, which allows a script to be completely portable across platforms. So code as if you need to export objects to workers.
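For example, to make a package available on every PSOCK worker (a no-op under FORK); the package name here is just an illustration:

```r
# Load MASS on all workers before parallel calls that need it.
p_library("MASS")
```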
p_export("obj1", "obj2", ...)
Export the named objects to all cluseter nodes by default.
```r
var <- 1:10
p_export("var")
```
```r
p_eval(vec <- 1:10)
```
creates a vector named `vec` containing 1 through 10 on every worker. This is a wrapper around `clusterCall`; it is not a wrapper for a serial call, which does not do the same thing.
```r
p_apply(X, MARGIN, FUN)
```
A parallel version of `apply`. Wraps `parApply` for PSOCK. An additional wrapper function was written around `mclapply` for forking, which results in a parallel equivalent of the `apply` function not included in the parallel package.

```r
p_applyLB(X, MARGIN, FUN)
```
A load-balanced parallel `apply`. Wraps `parApplyLB` for PSOCK. An additional wrapper function was written around `mclapply(preschedule = FALSE)` for forking, which results in a load-balanced parallel equivalent of `apply` not included in the parallel package.

```r
p_sapply(X, FUN)
```
A parallel version of `sapply`. Wraps `parSapply` with PSOCK. An additional wrapper was written around `mclapply` with FORK, which results in a parallel version of `sapply` not included in the built-in parallel package.

```r
p_sapplyLB(X, FUN)
```
A load-balanced parallel `sapply`. Wraps `parSapplyLB` with PSOCK. An additional wrapper was written around `mclapply(preschedule = FALSE)` with FORK, which results in a load-balanced parallel version of `sapply` not included in the built-in parallel package.
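As a usage sketch, assuming `p_sapply` takes the same data and function arguments as base `sapply`:

```r
# Configure once, then call the wrapper as you would sapply.
p_config(type = "FORK", cores = 2)
squares <- p_sapply(1:10, function(x) x^2)
```

On Windows the same two lines would run over a PSOCK cluster instead, with no change to the calling code.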