Random Forest via ranger. Predicts the response variables, or the brushed set of rows, from the predictor variables using random forest classification or regression.

randomForest(
  dataset = cs.in.dataset(),
  preds = cs.in.predictors(),
  resps = cs.in.responses(),
  brush = cs.in.brushed(),
  scriptvars = cs.in.scriptvars(),
  return.results = FALSE,
  ...
)

Arguments

dataset

[data.frame]
Dataset with named columns. The names correspond to predictors and responses.

preds

[character]
Character vector of predictor variables.

resps

[character]
Character vector of response variables.

brush

[logical]
Logical vector of length nrow(dataset). Flags brushed rows in Cornerstone.

scriptvars

[list]
Named list of script variables set via the Cornerstone "Script Variables" menu. For details see below.

return.results

[logical(1)]
If FALSE the function returns TRUE invisibly. If TRUE, it returns a list of results. Default is FALSE.

...

[ANY]
Additional arguments to be passed to ranger. Check them against the script variables (scriptvars) so that no setting is supplied twice.
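The duplicate check mentioned for ... can be sketched in base R. This is only an illustration: the lists scriptvars and extra.args are made-up inputs, and dropping the clashing entry is one possible policy, not necessarily what randomForest does internally. It also only catches identical names; a script variable such as importance.mode may map to a differently named ranger argument, which a name comparison would not detect.

```r
# Hypothetical inputs: script variables set in Cornerstone and extra
# arguments a caller intends to forward to ranger through `...`.
scriptvars <- list(num.trees = 500, importance.mode = "permutation")
extra.args <- list(mtry = 2, num.trees = 1000)

# Names that appear in both places and would therefore be supplied twice.
duplicated.args <- intersect(names(scriptvars), names(extra.args))

# One possible policy (an assumption): keep the script variable and
# drop the clashing entry from the extra arguments.
extra.args <- extra.args[setdiff(names(extra.args), duplicated.args)]

print(duplicated.args)
print(names(extra.args))
```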

Value

Logical [TRUE] invisibly, with the outputs sent to Cornerstone, or, if return.results = TRUE, a list of the resulting data.frame objects:

statistics

General statistics about the random forest.

importances

Variable importance of the predictor variables in descending order of importance (most important first).

predictions

Dataset to brush with predicted values for dataset. The original input and other columns can be added to this dataset through the menu Columns -> Add from Parent ....

confusion

For categorical response variables or brush state only. A table with counts of each distinct combination of predicted and actual values.

rgobjects

List of ranger.forest objects with fitted random forests.
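To make the confusion entry concrete, the following base R sketch builds a table of counts for each combination of actual and predicted values. The vectors actual and predicted are invented toy data, and the column names Actual, Predicted and Freq are assumptions; the layout of the table returned by randomForest may differ.

```r
# Toy data (hypothetical): actual classes and predicted classes.
actual    <- factor(c("a", "a", "b", "b", "b"), levels = c("a", "b"))
predicted <- factor(c("a", "b", "b", "b", "a"), levels = c("a", "b"))

# Count each distinct combination of actual and predicted value and
# return the counts in long format, one row per combination.
confusion <- as.data.frame(table(Actual = actual, Predicted = predicted))

print(confusion)
```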

Details

The following script variables are collected in the scriptvars list:

brush.pred

[logical(1)]
Use brush vector as additional predictor.
Default is FALSE.

use.rows

[character(1)]
Rows to use in model fit. Possible values are all, non-brushed, or brushed.
Default is all.

num.trees

[integer(1)]
Number of trees to fit in ranger.
Default is 500.

importance.mode

[character(1)]
Variable importance mode. For details see ranger.
Default is permutation.

respect.unordered.factors

[character(1)]
Handling of unordered factor covariates. For details see ranger.
Default is NULL.
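How use.rows and brush.pred could interact with the brush vector is sketched below in base R. The helper select.rows is hypothetical and not part of the package; it only illustrates one plausible reading of the three use.rows values. Likewise, adding the brush state as a column named brush is an assumption about what brush.pred = TRUE might do.

```r
# Hypothetical helper: turn a use.rows setting into a logical row filter.
select.rows <- function(brush, use.rows = c("all", "non-brushed", "brushed")) {
  use.rows <- match.arg(use.rows)
  switch(use.rows,
    "all"         = rep(TRUE, length(brush)),
    "non-brushed" = !brush,
    "brushed"     = brush
  )
}

# Toy brush vector: every other row of iris is flagged as brushed.
brush <- rep(c(TRUE, FALSE), length.out = nrow(iris))

# Rows that would enter the model fit with use.rows = "non-brushed".
fit.data <- iris[select.rows(brush, "non-brushed"), ]

# With brush.pred = TRUE, the brush state could be appended as a predictor.
all.data <- cbind(iris, brush = brush)

print(nrow(fit.data))
print(names(all.data))
```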

See also

Examples

# Fit random forest to iris data:
res = randomForest(
  iris,
  c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width"),
  "Species",
  scriptvars = list(
    brush.pred = FALSE,
    use.rows = "all",
    num.trees = 500,
    importance.mode = "permutation",
    respect.unordered.factors = "ignore"
  ),
  brush = rep(FALSE, nrow(iris)),
  return.results = TRUE
)
# Show general statistics:
res$statistics
#>                           Statistic          Value
#>  1:                            Type Classification
#>  2:                 Number of Trees            500
#>  3:                     Sample Size            150
#>  4: Number of Independent Variables              4
#>  5:                            Mtry              2
#>  6:               Minimal Node Size              1
#>  7:        Variable Importance Mode    permutation
#>  8:                       Splitrule           gini
#>  9:        OOB Prediction Error [%]              4
#> 10:            Runtime R Script [s]          1.022
# Prediction
randomForestPredict(
  iris[, 1:4],
  c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width"),
  robject = res$rgobjects,
  return.results = TRUE
)
#> $predictions
#>        Species
#>   1:    setosa
#>   2:    setosa
#>   3:    setosa
#>   4:    setosa
#>   5:    setosa
#>  ---
#> 146: virginica
#> 147: virginica
#> 148: virginica
#> 149: virginica
#> 150: virginica
#>