| Title: | Interactive 'shiny' GUI for the 'earth' Package |
| Version: | 0.8.0 |
| Description: | Provides a 'shiny'-based graphical user interface for the 'earth' package, enabling interactive building and exploration of Multivariate Adaptive Regression Splines (MARS) models. Features include data import from CSV and 'Excel' files, automatic detection of categorical variables, interactive control of interaction terms via an allowed matrix, comprehensive model diagnostics with variable importance and partial dependence plots, and publication-quality report generation via 'Quarto'. |
| License: | AGPL (≥ 3) |
| URL: | https://github.com/wcraytor/earthUI |
| BugReports: | https://github.com/wcraytor/earthUI/issues |
| Depends: | R (≥ 4.1.0) |
| Imports: | DBI, earth (≥ 5.3.0), ggplot2, jsonlite, openxlsx, plotly, readxl, RSQLite, shiny, stats, tools, utils |
| Suggests: | bslib, callr, DT, knitr, quarto, rmarkdown, shinyFiles, showtext, sysfonts, testthat (≥ 3.0.0), tinytex, withr, writexl |
| Config/testthat/edition: | 3 |
| Encoding: | UTF-8 |
| VignetteBuilder: | knitr |
| RoxygenNote: | 7.3.3 |
| NeedsCompilation: | no |
| Packaged: | 2026-06-02 07:10:09 UTC; claude-code |
| Author: | William Craytor [aut, cre] |
| Maintainer: | William Craytor <bcraytor@proton.me> |
| Repository: | CRAN |
| Date/Publication: | 2026-06-02 07:40:02 UTC |
earthUI: Interactive 'shiny' GUI for the 'earth' Package
Description
Provides a 'shiny'-based graphical user interface for the 'earth' package, enabling interactive building and exploration of Multivariate Adaptive Regression Splines (MARS) models. Features include data import from CSV and 'Excel' files, automatic detection of categorical variables, interactive control of interaction terms via an allowed matrix, comprehensive model diagnostics with variable importance and partial dependence plots, and publication-quality report generation via 'Quarto'.
Author(s)
Maintainer: William Craytor <bcraytor@proton.me>
See Also
Useful links:
- https://github.com/wcraytor/earthUI
- Report bugs at https://github.com/wcraytor/earthUI/issues
Export an earthUI result as RDS for consumption by mgcvUI
Description
Saves the result via saveRDS() to
<file_name>_earthUI_result_<timestamp>.rds and verifies the file is
readable and can produce a prediction. If verification fails, the file
is deleted. Skipped silently for models with degree > 2 (mgcvUI only
supports pairwise interactions).
Usage
auto_export_for_mgcv(result, output_folder, file_name)
Arguments
result |
A fit result list as returned by |
output_folder |
Character scalar. May be |
file_name |
Character scalar. Used to derive the output filename. |
Value
Invisibly, NULL.
Build an allowed function for earth()
Description
Converts an allowed interaction matrix into a function compatible with
the allowed parameter of earth::earth(). The function checks that
ALL pairwise combinations among the predictors in a proposed interaction
term are TRUE in the matrix.
Usage
build_allowed_function(allowed_matrix, block_degree1 = NULL)
Arguments
allowed_matrix |
A symmetric logical matrix as returned by
|
block_degree1 |
Optional character vector of predictor names to
block from entering the model as degree-1 (main effect) terms. These
variables can still participate in interactions (degree >= 2). This is
useful when a variable like |
Details
The returned function implements the standard earth() allowed function
contract. When earth proposes a new hinge function involving predictor
pred with existing parent predictors indicated by the parents logical
vector, the function checks that every pair of involved predictors is
allowed in the matrix.
For a 3-way interaction between X, Y, Z, the function verifies that (X,Y), (Y,Z), and (X,Z) are all TRUE in the matrix.
When block_degree1 is specified, any predictor in that list is blocked
from entering as a degree-1 term but is allowed in higher-degree
interactions (subject to the allowed matrix).
Value
A function with signature
function(degree, pred, parents, namesx, first) suitable for the
allowed parameter of earth::earth().
Examples
mat <- build_allowed_matrix(c("sqft", "bedrooms", "pool"))
mat["sqft", "pool"] <- FALSE
mat["pool", "sqft"] <- FALSE
func <- build_allowed_function(mat)
# Block sale_age from degree 1 (interaction only)
mat2 <- build_allowed_matrix(c("sale_age", "living_area", "lot_size"))
func2 <- build_allowed_function(mat2, block_degree1 = "sale_age")
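The pairwise rule described in Details can be illustrated without fitting a model. The following is a minimal sketch in base R, not the package's own code: sketch_allowed is a hypothetical name, the sketch assumes no factor expansion (so namesx matches the matrix dimnames), and it treats any nonzero entry of parents as "this predictor is in the parent term".

```r
# Hypothetical helper illustrating the pairwise rule; not earthUI's implementation.
sketch_allowed <- function(allowed_matrix, block_degree1 = NULL) {
  function(degree, pred, parents, namesx, first) {
    candidate <- namesx[pred]
    # Degree-1 terms: reject predictors listed in block_degree1.
    if (degree == 1L) return(!(candidate %in% block_degree1))
    # Predictors in the proposed term: the candidate plus its parents.
    involved <- unique(c(candidate, namesx[parents != 0]))
    if (length(involved) < 2L) return(TRUE)
    # Every unordered pair of involved predictors must be TRUE in the matrix.
    pairs <- utils::combn(involved, 2L)
    all(allowed_matrix[cbind(pairs[1L, ], pairs[2L, ])])
  }
}

vars <- c("sqft", "bedrooms", "pool")
mat <- matrix(TRUE, 3, 3, dimnames = list(vars, vars))
mat["sqft", "pool"] <- mat["pool", "sqft"] <- FALSE

f <- sketch_allowed(mat, block_degree1 = "pool")
main_pool <- f(1L, 3L, c(0, 0, 0), vars, TRUE)   # pool as a main effect
two_way   <- f(2L, 2L, c(1, 0, 0), vars, FALSE)  # sqft x bedrooms
three_way <- f(3L, 3L, c(1, 1, 0), vars, FALSE)  # sqft x bedrooms x pool
```

Here the three-way term is rejected because the (sqft, pool) pair is FALSE, even though the other two pairs are allowed.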
Build an allowed interaction matrix
Description
Creates a symmetric logical matrix indicating which pairs of predictors are allowed to interact. By default, all interactions are allowed.
Usage
build_allowed_matrix(variable_names, default = TRUE)
Arguments
variable_names |
Character vector of predictor variable names. |
default |
Logical. Default value for all entries. Default is |
Value
A symmetric logical matrix with variable_names as both row and
column names.
Examples
mat <- build_allowed_matrix(c("sqft", "bedrooms", "pool"))
mat["sqft", "pool"] <- FALSE
mat["pool", "sqft"] <- FALSE
mat
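Because the matrix is symmetric, each pair has to be changed in both directions, as the example does. A small wrapper can keep the two cells in step. This is a sketch in base R; set_pair is a hypothetical helper, not part of the package.

```r
# Hypothetical helper: set one pair in both directions so the matrix stays symmetric.
set_pair <- function(mat, a, b, value) {
  mat[a, b] <- value
  mat[b, a] <- value
  mat
}

vars <- c("sqft", "bedrooms", "pool")
mat <- matrix(TRUE, nrow = 3, ncol = 3, dimnames = list(vars, vars))
mat <- set_pair(mat, "sqft", "pool", FALSE)
still_symmetric <- identical(mat, t(mat))
```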
Build a Sales Comparison Grid Excel workbook
Description
Generates a multi-sheet xlsx workbook formatted as a Sales Comparison Grid from an RCA-adjusted data frame. Each sheet shows the subject and up to three comparable sales side by side, with factual values, value contributions, and adjustments per regression variable, plus rows for grouped variables (location, site, age), residual feature inputs, and an Adjusted Sale Price formula.
Usage
build_sales_grid(
rca_df,
comp_rows,
output_file,
specials = list(),
title_prefix = "Intermediate Sales Comparable Grid",
progress_fn = NULL
)
Arguments
rca_df |
A data frame produced by the RCA workflow. Row 1 is the
subject; rows 2+ are comps. Must contain columns produced by
|
comp_rows |
Integer vector of row numbers (>= 2) to include in the grid. Maximum 30 (10 sheets, 3 comps per sheet). |
output_file |
Character scalar. Destination xlsx path. |
specials |
Named list mapping a special type
(e.g. |
title_prefix |
Character scalar. Sheet-title prefix. Defaults to
|
progress_fn |
Optional function called after each sheet is written
with arguments |
Details
This is the non-Shiny computation kernel used by the earthUI Shiny app's
Sales Grid download button, and is also suitable for use from batch
scripts that already have rca_df in memory.
Value
Invisibly, the output_file path.
Generate a city path code from a full name
Description
Applies the rule: convert to lowercase, strip all non-alphanumeric
characters, and keep the first 6 characters. If the resulting code already
exists in existing_codes, a suffix _1, _2, ... is appended until it is unique.
Usage
city_abbreviation(full_name, existing_codes = character(0))
Arguments
full_name |
Character scalar (e.g. |
existing_codes |
Character vector of codes already in the parent
folder. Defaults to |
Value
Character scalar.
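The rule in the Description can be sketched in a few lines of base R. This is an illustration under assumptions, not the package's code: sketch_city_code is a hypothetical name, and the sketch treats only ASCII letters and digits as alphanumeric.

```r
# Hypothetical re-statement of the documented rule; not earthUI's implementation.
sketch_city_code <- function(full_name, existing_codes = character(0)) {
  # Lowercase, drop everything except a-z and 0-9, keep the first 6 characters.
  base <- substr(gsub("[^a-z0-9]", "", tolower(full_name)), 1L, 6L)
  code <- base
  i <- 0L
  # Append _1, _2, ... until the code is not already taken.
  while (code %in% existing_codes) {
    i <- i + 1L
    code <- paste0(base, "_", i)
  }
  code
}

first_code  <- sketch_city_code("Burlingame")
second_code <- sketch_city_code("San Mateo", existing_codes = c("sanmat", "sanmat_1"))
```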
Compute intermediate output data frame
Description
Returns an enhanced copy of data with per-target model columns appended:
est_<target>, residual, cqa, optional residual_sf / cqa_sf (when
a living-area column is supplied), per-g-function <var>_contribution
columns, basis, and calc_residual. For appraisal or market purposes,
rows are sorted by residual_sf (or residual) descending, with the
subject row (row 1) pinned on top when applicable. Ranking columns
(residual_sf, cqa_sf, residual, cqa) are moved to the leftmost
positions.
Usage
compute_intermediate_output(
data,
result = NULL,
purpose = c("general", "appraisal", "market"),
skip_subject_row = FALSE,
living_area_col = NULL
)
Arguments
data |
A data frame (the raw imported data). Must contain the target
column(s) named in |
result |
A fit result list as returned by |
purpose |
Character scalar: |
skip_subject_row |
Logical. In |
living_area_col |
Character scalar or |
Details
This is the non-Shiny computation kernel used by the earthUI Shiny app's Download Intermediate Output button, and is also suitable for use from batch scripts.
Value
A data frame.
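The ordering described above (rows sorted by residual descending, subject row pinned on top) can be sketched in base R. The data below are made up for illustration, and the NA residual for the subject is an assumption of this sketch.

```r
# Hypothetical data: row 1 is the subject, rows 2+ are comps.
comps <- data.frame(
  id = c("subject", "comp_a", "comp_b", "comp_c"),
  residual = c(NA, -5000, 12000, 3000)
)

subject <- comps[1, , drop = FALSE]
rest <- comps[-1, , drop = FALSE]

# Sort the comps by residual, largest first, then put the subject back on top.
ranked <- rbind(subject, rest[order(rest$residual, decreasing = TRUE), , drop = FALSE])
ranked_ids <- ranked$id
```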
Compute RCA (Residual Comparable Adjustment) output for an appraisal
Description
Starting from the raw data (subject in row 1, comps in rows 2+) and a
fitted earth model, produces the adjusted comparables data frame used by
the Sales Comparison Grid. The user-supplied subject CQA score is
converted into a subject residual via linear interpolation of the comp
CQA/residual curve; per-g-function contributions and adjustments are
then added for each comp, along with net/gross adjustments, percentages,
and the final adjusted_sale_price.
Usage
compute_rca_adjustments(
data,
result,
user_cqa,
cqa_type = c("cqa", "cqa_sf"),
living_area_col = NULL,
weight_col = NULL
)
Arguments
data |
A data frame (subject + comps) matching |
result |
A fit result list as returned by |
user_cqa |
Numeric scalar in |
cqa_type |
Character: |
living_area_col |
Character scalar or |
weight_col |
Character scalar or |
Details
For multi-target models, the primary target drives the subject residual calculation. Additional targets receive their own residual interpolation and adjustment columns, imputed only for zero-weight rows.
This is the non-Shiny computation kernel used by the earthUI Shiny app's RCA Raw Output button, and is also suitable for use from batch scripts.
Value
An enhanced data frame with model columns, contributions,
adjustments, subject_value, subject_cqa, and
adjusted_sale_price.
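The conversion of a subject CQA score into a residual by linear interpolation can be illustrated with stats::approx(). The numbers are invented, and this sketch does not claim to reproduce how the package handles ties or scores outside the comp range.

```r
# Hypothetical comp curve: CQA scores and the corresponding model residuals.
comp_cqa      <- c(1.0, 3.5, 6.0, 8.5, 10.0)
comp_residual <- c(-20000, -8000, 1000, 9000, 18000)

# User-supplied subject CQA score.
user_cqa <- 7.0

# Linear interpolation along the comp CQA/residual curve.
subject_residual <- stats::approx(x = comp_cqa, y = comp_residual, xout = user_cqa)$y
```

A score of 7.0 lies 40% of the way from 6.0 to 8.5, so the interpolated residual is 1000 + 0.4 * (9000 - 1000) = 4200.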
Compute sale age in integer days
Description
Given a vector of contract (sale) dates and a single effective date, returns
the difference in integer days (effective_date - contract_date). Contract
dates may be supplied as POSIXct, Date, character strings parseable by
as.POSIXct(), or numeric Excel serial date numbers (origin 1899-12-30).
Usage
compute_sale_age(contract_vals, effective_date)
Arguments
contract_vals |
A vector of contract/sale dates. Accepted types:
|
effective_date |
The effective (appraisal) date. Accepted types:
|
Details
This function is the non-Shiny computation kernel used by the
earthUI Shiny app when computing a sale_age column from a designated
contract_date column. It is also suitable for use from batch scripts.
Value
An integer vector the same length as contract_vals, giving the
number of whole days between effective_date and each contract date.
NA contract values propagate to NA results.
Examples
compute_sale_age(
contract_vals = as.Date(c("2024-01-15", "2024-06-01")),
effective_date = as.Date("2025-01-01")
)
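For the numeric Excel serial case, the origin 1899-12-30 mentioned in the Description maps serial numbers onto dates. The sketch below uses only base R to show the arithmetic; it is not the package's code, and the serial numbers were chosen to match the dates in the example above.

```r
# Excel serial numbers for 2024-01-15 and 2024-06-01.
serial <- c(45306, 45444)

# Convert using the origin stated in the Description.
contract <- as.Date(serial, origin = "1899-12-30")
effective <- as.Date("2025-01-01")

# Whole days between the effective date and each contract date.
sale_age <- as.integer(effective - contract)
```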
Convert a Quarto source file to one or more output formats
Description
Renders any .qmd file (not just earthUI-generated ones) to the
requested formats via quarto::quarto_render(). Useful for
converting hand-edited or manually-combined Quarto reports — e.g.,
a master document that uses {{< include >}} to pull in multiple
project reports.
Usage
convert_quarto_file(
qmd_path,
formats = c("html"),
output_dir = NULL,
paper_size = "letter"
)
Arguments
qmd_path |
Path to a Quarto source ( |
formats |
Character vector of output formats. Any subset of
|
output_dir |
Directory to write the rendered output(s). Defaults
to the same directory as |
paper_size |
Character: |
Value
Invisibly, a character vector of output file paths.
List all countries with shipped admin schemas
Description
Returns a named character vector that maps country display names to
ISO 3166-1 alpha-2 codes. Suitable for a selectInput choices list.
Usage
country_choices()
Value
Named character vector. Names are display names, values are lowercase 2-letter codes.
Get the admin-level schema for a country
Description
Returns the ordered character vector of admin level labels for the
given ISO 3166-1 alpha-2 country code. Used by the UI cascade and by
regproj_path() to validate path depth.
Usage
country_schema(cc)
Arguments
cc |
Character scalar. Lowercase ISO 3166-1 alpha-2 country code
(e.g. |
Details
Unknown country codes return a generic 2-level fallback
(c("region", "city")) so users in countries not yet covered by the
shipped table still get a sensible cascade.
Value
Character vector of admin level labels, top to bottom.
Default regProj root for the current OS
Description
Resolution order:
- REGPROJ_ROOT environment variable.
- regproj_root field in user prefs (earthui_prefs_path()).
- Per-OS default: C:/regProj on Windows; ~/regProj elsewhere.
Usage
default_regproj_root()
Value
Character scalar. Absolute path.
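The resolution order can be expressed as a short fallback chain. This is a sketch in base R: sketch_root is a hypothetical name, and prefs stands in for whatever list the preferences file holds.

```r
# Hypothetical re-statement of the documented resolution order.
sketch_root <- function(prefs = list(),
                        env = Sys.getenv("REGPROJ_ROOT", unset = "")) {
  # 1. Environment variable wins when set and non-empty.
  if (nzchar(env)) return(env)
  # 2. Otherwise use the regproj_root field from user prefs, if present.
  if (!is.null(prefs$regproj_root)) return(prefs$regproj_root)
  # 3. Otherwise fall back to the per-OS default.
  if (.Platform$OS.type == "windows") "C:/regProj" else path.expand("~/regProj")
}

from_env   <- sketch_root(prefs = list(regproj_root = "/data/rp"), env = "/env/rp")
from_prefs <- sketch_root(prefs = list(regproj_root = "/data/rp"), env = "")
```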
Detect likely categorical variables in a data frame
Description
Returns a named logical vector indicating which columns are likely
categorical. Character and factor columns are always flagged. Numeric
columns with max_unique or fewer unique values are also flagged.
Usage
detect_categoricals(df, max_unique = 10L)
Arguments
df |
A data frame. |
max_unique |
Integer. Numeric columns with this many or fewer unique values are flagged as likely categorical. Default is 10. |
Value
A named logical vector with one element per column. TRUE indicates
the column is likely categorical.
Examples
df <- data.frame(
price = c(100, 200, 300, 400),
pool = c("Y", "N", "Y", "N"),
bedrooms = c(2, 3, 2, 4),
sqft = c(1200, 1500, 1300, 1800)
)
detect_categoricals(df)
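The flagging rule can be approximated in base R as follows. This is a sketch, not the package's code: sketch_detect is a hypothetical name, the threshold follows the max_unique argument description ("this many or fewer"), and NA is counted as a distinct value here, which is an assumption.

```r
# Hypothetical re-statement of the flagging rule.
sketch_detect <- function(df, max_unique = 10L) {
  vapply(df, function(col) {
    # Character and factor columns are always flagged.
    if (is.character(col) || is.factor(col)) return(TRUE)
    # Numeric columns are flagged when they have few distinct values.
    is.numeric(col) && length(unique(col)) <= max_unique
  }, logical(1))
}

df <- data.frame(
  price = c(100, 200, 300, 400),
  pool = c("Y", "N", "Y", "N"),
  bedrooms = c(2, 3, 2, 4)
)
flags <- sketch_detect(df, max_unique = 3L)
```

With max_unique = 3, price (four distinct values) is not flagged, while pool and bedrooms are.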
Detect column types in a data frame
Description
Inspects each column and returns a best-guess R type string. Character columns are tested for common date patterns. Numeric columns containing only 0/1 values (with both present) are flagged as logical.
Usage
detect_types(df)
Arguments
df |
A data frame. |
Value
A named character vector with one element per column.
Possible values: "numeric", "integer", "character",
"logical", "factor", "Date", "POSIXct",
"unknown".
Examples
df <- data.frame(
price = c(100.5, 200.3, 300.1),
rooms = c(2L, 3L, 4L),
pool = c("Y", "N", "Y"),
sold = c(TRUE, FALSE, TRUE)
)
detect_types(df)
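The 0/1 rule from the Description (numeric columns containing only 0 and 1, with both present) can be checked like this. A minimal sketch in base R: is_binary is a hypothetical helper, and ignoring NA values is an assumption of the sketch.

```r
# Hypothetical check for "only 0/1 values, with both present".
is_binary <- function(x) {
  is.numeric(x) && setequal(unique(x[!is.na(x)]), c(0, 1))
}

both_present <- is_binary(c(0, 1, 1, 0))
only_zero    <- is_binary(c(0, 0, 0))
has_other    <- is_binary(c(0, 1, 2))
```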
Path to the per-user earthUI preferences file
Description
Returns the path to <R_user_dir("earthUI","config")>/prefs.json. The
file holds user-level configuration that lives outside the regProj
tree itself — most importantly, the location of the regProj root.
Usage
earthui_prefs_path()
Value
Character scalar.
Read user preferences (returns empty list if file missing)
Description
Read user preferences (returns empty list if file missing)
Usage
earthui_prefs_read()
Value
Named list.
Write user preferences (atomic; creates the config dir if needed)
Description
Write user preferences (atomic; creates the config dir if needed)
Usage
earthui_prefs_write(prefs)
Arguments
prefs |
Named list to save. |
Value
Invisibly, the prefs path.
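An atomic write is commonly done by writing to a temporary file in the target directory and then renaming it into place. The sketch below shows that pattern in base R. It is an assumption about the general technique, not a description of the package's internals, and sketch_atomic_write is a hypothetical name.

```r
# Hypothetical write-then-rename helper; not earthUI's implementation.
sketch_atomic_write <- function(lines, path) {
  # Create the config directory if needed.
  dir.create(dirname(path), recursive = TRUE, showWarnings = FALSE)
  # Write to a temporary file in the same directory, then rename into place.
  tmp <- tempfile(pattern = "prefs-", tmpdir = dirname(path))
  writeLines(lines, tmp)
  if (!file.rename(tmp, path)) {
    unlink(tmp)
    stop("could not move temporary file into place")
  }
  invisible(path)
}

target <- file.path(tempdir(), "earthUI-sketch", "prefs.json")
sketch_atomic_write('{"regproj_root": "~/regProj"}', target)
written <- readLines(target)
```

Keeping the temporary file in the same directory means the rename stays on one filesystem, so readers see either the old file or the complete new one.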
Export saved earthUI settings for one file+purpose to a JSON file
Description
The Shiny app persists per-file, per-purpose settings in a SQLite DB
keyed by "<filename>||<purpose>". export_settings() reads that row
and writes a single JSON file containing the full settings bundle
(target, earth parameters, variable selections, type/special overrides,
and interactions), plus an rca block for batch RCA inputs — the
subject CQA score and CQA score type.
Usage
export_settings(filename, purpose, output_json)
Arguments
filename |
Character scalar. The filename as stored in the DB
(e.g. |
purpose |
Character scalar: |
output_json |
Character scalar. Destination file path ( |
Details
If output_json already exists, the rca block of the existing file
is preserved — re-exporting from the UI does not clobber hand-edited
CQA inputs.
The emitted rca block has two fields:
- cqa_score
null or a number in [0.00, 10.00].
- cqa_score_type
"CQA/sf" (based on residual / living-area, default) or "CQA" (based on residual).
The emitted reports field is an array of formats to render in batch
mode: any subset of "html", "pdf", "docx". An empty array []
(default) means no reports are generated.
Value
Invisibly, the output_json path.
Examples
## Not run:
export_settings("Appraisal_1.csv", "appraisal",
"~/configs/Appraisal_1.json")
## End(Not run)
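The DB key format and the "preserve the existing rca block" behavior can be pictured with plain R lists. This is a sketch only: sketch_merge_export is a hypothetical name, and apart from rca, cqa_score, cqa_score_type, and reports, the field names are placeholders.

```r
# Key format described above: "<filename>||<purpose>".
settings_key <- paste0("Appraisal_1.csv", "||", "appraisal")

# Hypothetical merge: keep the rca block of an existing export, replace the rest.
sketch_merge_export <- function(fresh, existing = NULL) {
  if (!is.null(existing) && !is.null(existing$rca)) fresh$rca <- existing$rca
  fresh
}

fresh <- list(
  target = "sale_price",
  rca = list(cqa_score = NULL, cqa_score_type = "CQA/sf"),
  reports = list()
)
existing <- list(
  target = "old_target",
  rca = list(cqa_score = 7.25, cqa_score_type = "CQA"),
  reports = list()
)
merged <- sketch_merge_export(fresh, existing)
```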
Fit an earth model
Description
Wrapper around earth::earth() with parameter validation and automatic
cross-validation when interaction terms are enabled.
Usage
fit_earth(
df,
target,
predictors,
categoricals = NULL,
linpreds = NULL,
type_map = NULL,
degree = 1L,
allowed_func = NULL,
allowed_matrix = NULL,
nfold = NULL,
nprune = NULL,
thresh = NULL,
penalty = NULL,
minspan = NULL,
endspan = NULL,
fast.k = NULL,
pmethod = NULL,
glm = NULL,
trace = NULL,
nk = NULL,
newvar.penalty = NULL,
fast.beta = NULL,
ncross = NULL,
stratify = NULL,
varmod.method = NULL,
varmod.exponent = NULL,
varmod.conv = NULL,
varmod.clamp = NULL,
varmod.minspan = NULL,
keepxy = NULL,
Scale.y = NULL,
Adjust.endspan = NULL,
Auto.linpreds = NULL,
Force.weights = NULL,
Use.beta.cache = NULL,
Force.xtx.prune = NULL,
Get.leverages = NULL,
Exhaustive.tol = NULL,
wp = NULL,
weights = NULL,
...,
.capture_trace = TRUE
)
Arguments
df |
A data frame containing the modeling data. |
target |
Character string. Name of the response variable. |
predictors |
Character vector. Names of predictor variables. |
categoricals |
Character vector. Names of predictors to treat as
categorical (converted to factors before fitting). Default is |
linpreds |
Character vector. Names of predictors constrained to enter
the model linearly (no hinge functions). Default is |
type_map |
Named list or character vector. Maps column names to
declared types (e.g., |
degree |
Integer. Maximum degree of interaction. Default is 1 (no interactions). When >= 2, cross-validation is automatically enabled. |
allowed_func |
Function or |
allowed_matrix |
Logical matrix or |
nfold |
Integer. Number of cross-validation folds. Automatically set
to 10 when |
nprune |
Integer or |
thresh |
Numeric. Forward stepping threshold. Default is earth's default (0.001). |
penalty |
Numeric. Generalized cross-validation penalty per knot.
Default is earth's default (if |
minspan |
Integer or |
endspan |
Integer or |
fast.k |
Integer. Maximum number of parent terms considered at each step of the forward pass. Default is earth's default (20). |
pmethod |
Character. Pruning method. One of |
glm |
List or |
trace |
Numeric. Trace earth's execution. 0 (default) = none, 0.3 = variance model, 0.5 = cross validation, 1-5 = increasing detail. |
nk |
Integer or |
newvar.penalty |
Numeric or |
fast.beta |
Numeric or |
ncross |
Integer or |
stratify |
Logical or |
varmod.method |
Character or |
varmod.exponent |
Numeric or |
varmod.conv |
Numeric or |
varmod.clamp |
Numeric or |
varmod.minspan |
Integer or |
keepxy |
Logical or |
Scale.y |
Logical or |
Adjust.endspan |
Numeric or |
Auto.linpreds |
Logical or |
Force.weights |
Logical or |
Use.beta.cache |
Logical or |
Force.xtx.prune |
Logical or |
Get.leverages |
Logical or |
Exhaustive.tol |
Numeric or |
wp |
Numeric vector or |
weights |
Numeric vector or |
... |
Additional arguments passed to |
.capture_trace |
Logical. If |
Value
A list with class "earthUI_result" containing:
- model
The fitted earth model object.
- target
Name of the response variable.
- predictors
Names of predictor variables used.
- categoricals
Names of categorical predictors.
- degree
Degree of interaction used.
- cv_enabled
Logical; whether cross-validation was used.
- data
The data frame used for fitting.
Examples
# Using the included demo appraisal dataset
demo_file <- system.file("extdata", "Appraisal_1.csv", package = "earthUI")
df <- import_data(demo_file)
result <- fit_earth(df, target = "sale_price",
predictors = c("living_sqft", "lot_size", "age"))
format_summary(result)
Format ANOVA decomposition
Description
Extracts the ANOVA table from a fitted earth model.
Usage
format_anova(earth_result)
Arguments
earth_result |
An object of class |
Value
A data frame with the ANOVA decomposition showing which predictors contribute to each basis function and their importance.
Examples
result <- fit_earth(mtcars, "mpg", c("cyl", "disp", "hp", "wt"))
format_anova(result)
Format earth model as LaTeX equation
Description
Converts a fitted earth model into a LaTeX-formatted mathematical representation using g-function notation. Basis functions are grouped by degree (constant, first-degree, second-degree, third-degree) and labeled with indices that encode the group, position, and factor variable count.
Usage
format_model_equation(earth_result, digits = 7L, response_idx = NULL)
Arguments
earth_result |
An object of class |
digits |
Integer. Number of significant digits for coefficients and cut points. Default is 7. |
response_idx |
Integer or |
Value
A list containing:
- latex
Character string. LaTeX array environment for HTML/MathJax rendering.
- latex_inline
Character string. Wrapped in display math delimiters for MathJax/HTML rendering.
- latex_pdf
Character string. LaTeX for native PDF output with escaped special characters in text blocks.
- latex_word
Character string. LaTeX for Word/docx output.
- groups
List of group structures for programmatic access.
Examples
result <- fit_earth(mtcars, "mpg", c("cyl", "disp", "hp", "wt"))
eq <- format_model_equation(result)
cat(eq$latex)
Format earth model summary
Description
Extracts key statistics from a fitted earth model including coefficients, basis functions, R-squared, GCV, GRSq, and RSS.
Usage
format_summary(earth_result)
Arguments
earth_result |
An object of class |
Value
A list containing:
- coefficients
Data frame of model coefficients and basis functions.
- r_squared
Training R-squared.
- gcv
Generalized cross-validation value.
- grsq
Generalized R-squared (1 - GCV/variance).
- rss
Residual sum of squares.
- n_terms
Number of terms in the pruned model.
- n_predictors
Number of predictors used in the final model.
- n_obs
Number of observations.
- cv_rsq
Cross-validated R-squared (if CV was used, else NA).
Examples
result <- fit_earth(mtcars, "mpg", c("cyl", "disp", "hp", "wt"))
summary_info <- format_summary(result)
summary_info$r_squared
Format variable importance
Description
Extracts variable importance scores from a fitted earth model using
earth::evimp().
Usage
format_variable_importance(earth_result)
Arguments
earth_result |
An object of class |
Value
A data frame with columns variable, nsubsets, gcv, and rss,
sorted by overall importance (nsubsets).
Examples
result <- fit_earth(mtcars, "mpg", c("cyl", "disp", "hp", "wt"))
format_variable_importance(result)
Generate a Quarto report source bundle (without rendering)
Description
Writes a self-contained Quarto project under <dest_dir>/<base>_qmd/
containing the populated <base>.qmd source, all pre-generated plots
(PNG + PDF), the report data RDS, and reference.docx for Word
rendering. The resulting bundle can be edited or combined with other
Quarto sources, then rendered to HTML / Word / PDF via
convert_quarto_file().
Usage
generate_quarto_report(earth_result, dest_dir, base = "earth_report")
Arguments
earth_result |
An object of class |
dest_dir |
Directory to write the bundle into. The bundle itself
lives at |
base |
Bundle base name (no extension). The .qmd file inside is
named |
Details
Use this when you want the Quarto source as a first-class artifact — e.g., to combine reports from multiple projects into a master document before publishing.
Value
Invisibly, the absolute path to the generated .qmd file.
Get model settings for a project
Description
Reads model settings for a project from <REGPROJ_ROOT>/projects.sqlite.
Used by the Shiny UI to restore settings on file open, and by external
tools (ValEngr, batch scripts) to inspect project state. Settings are
scoped by project + purpose and are shared across all data files in the
project (a small test extract and the full dataset share one config).
Usage
get_project_settings(
project_path,
method = "earth",
purpose = "general",
root = default_regproj_root()
)
Arguments
project_path |
Absolute path to the project root. |
method |
One of |
purpose |
Settings scope: one of |
root |
regProj root. Defaults to |
Value
Named list with settings, variables, and (for earth)
interactions (each a JSON string), or NULL if no row exists.
Import data from CSV or Excel files
Description
Reads a CSV (.csv) or 'Excel' (.xlsx, .xls) file and returns a data frame. Column names are converted to snake_case and duplicates are made unique.
Usage
import_data(filepath, sheet = 1, sep = ",", dec = ".", ...)
Arguments
filepath |
Character string. Path to the data file. Supported formats:
|
sheet |
Character or integer. For Excel files, the sheet to read. Defaults to the first sheet. Ignored for CSV files. |
sep |
Character. Field separator for CSV files. Default |
dec |
Character. Decimal separator for CSV files. Default |
... |
Additional arguments passed to |
Value
A data frame with column names converted to snake_case. Duplicate column names are made unique by appending numeric suffixes.
Examples
# Load the included demo appraisal dataset
demo_file <- system.file("extdata", "Appraisal_1.csv", package = "earthUI")
df <- import_data(demo_file)
head(df)
Test whether a path is a regProj project leaf folder
Description
Returns TRUE if path exists, sits two directories deep under root
(<purpose>/<flat_segment>/), and the flat segment parses cleanly.
Usage
is_project_dir(path, root = default_regproj_root())
Arguments
path |
Absolute path to check. |
root |
regProj root. Defaults to |
Value
Logical scalar.
Launch the earthUI Shiny application
Description
Opens an interactive 'shiny' GUI for building and exploring 'earth' (MARS-style) models. The application provides data import, variable configuration, model fitting, result visualization, and report export.
Usage
launch(port = 7878L, ...)
Arguments
port |
Integer. Port number for the Shiny app. Defaults to 7878. A fixed port keeps browser-side UI preferences (theme, last-used purpose) consistent across sessions. (Model configuration is saved server-side in the project database, not in the browser.) |
... |
Additional arguments passed to |
Value
This function does not return a value; it launches the Shiny app.
Examples
if (interactive()) {
launch()
}
List g-function groups from a fitted earth model
Description
Returns a data frame describing each non-intercept g-function group from the
model equation, including degree, factor count, graph dimensionality, and
the number of terms. The g-function notation is
{}^{f}g^{j}_{k} where f = number of factor variables
(top-left), j = degree of interaction (top-right), k = position within the
degree group (bottom-right).
Usage
list_g_functions(earth_result)
Arguments
earth_result |
An object of class |
Value
A data frame with columns:
- index
Integer. Sequential index (1-based).
- label
Character. Variable names in the group.
- g_j
Integer. Degree of the g-function (top-right superscript).
- g_k
Integer. Position within the degree (bottom-right subscript).
- g_f
Integer. Number of factor variables (top-left superscript).
- d
Integer. Graph dimensionality (degree minus factor count).
- n_terms
Integer. Number of terms in the group.
Examples
result <- fit_earth(mtcars, "mpg", c("cyl", "disp", "hp", "wt"))
list_g_functions(result)
Detect the current operating system as a regProj segment
Description
Returns "mac" on Darwin, "ubuntu" on Linux, "win11" on Windows.
This is the value used as the <os> segment in regProj paths so that
multi-OS output can be merged cleanly on a developer's machine.
Usage
os_detect()
Value
Character scalar: "mac", "ubuntu", or "win11".
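The mapping can be sketched from Sys.info() in base R. sketch_os_segment is a hypothetical name, and returning NA for any other system is an assumption of this sketch rather than documented behavior.

```r
# Hypothetical mapping from the system name to the regProj <os> segment.
sketch_os_segment <- function(sysname = Sys.info()[["sysname"]]) {
  switch(sysname,
    Darwin = "mac",
    Linux = "ubuntu",
    Windows = "win11",
    NA_character_  # assumption: anything else is reported as NA
  )
}

segments <- vapply(c("Darwin", "Linux", "Windows"), sketch_os_segment, character(1))
```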
Plot actual vs predicted values
Description
Creates a scatter plot of actual vs predicted values with a 1:1 reference line.
Usage
plot_actual_vs_predicted(earth_result, response_idx = NULL)
Arguments
earth_result |
An object of class |
response_idx |
Integer or |
Value
A ggplot2::ggplot object.
Examples
result <- fit_earth(mtcars, "mpg", c("cyl", "disp", "hp", "wt"))
plot_actual_vs_predicted(result)
Plot variable contribution
Description
Creates a scatter plot showing each variable's actual contribution to the prediction. For each observation, the contribution is the sum of coefficient * basis function value across all terms involving that variable.
Usage
plot_contribution(earth_result, variable, response_idx = NULL)
Arguments
earth_result |
An object of class |
variable |
Character string. Name of the predictor variable to plot. |
response_idx |
Integer or |
Value
A ggplot2::ggplot object.
Examples
result <- fit_earth(mtcars, "mpg", c("cyl", "disp", "hp", "wt"))
plot_contribution(result, "wt")
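The quantity being plotted, the sum of coefficient * basis function value over the terms involving a variable, can be worked by hand for hinge functions. The coefficients and knot below are invented for illustration and do not come from a fitted model.

```r
# Hinge function used by MARS-style basis functions: max(0, x).
hinge <- function(x) pmax(0, x)

wt <- c(2.0, 3.0, 4.0)

# Hypothetical terms involving wt: -3.0 * h(wt - 3.2) and 5.0 * h(3.2 - wt).
contribution <- -3.0 * hinge(wt - 3.2) + 5.0 * hinge(3.2 - wt)
```

For wt = 2 only the second term is active (5.0 * 1.2 = 6.0); for wt = 4 only the first is (-3.0 * 0.8 = -2.4).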
Plot correlation matrix
Description
Creates a heatmap of pairwise correlations among the target variable and numeric predictors, with cells colored by degree of correlation and values printed in each cell.
Usage
plot_correlation_matrix(earth_result)
Arguments
earth_result |
An object of class |
Value
A ggplot2::ggplot object.
Examples
result <- fit_earth(mtcars, "mpg", c("cyl", "disp", "hp", "wt"))
plot_correlation_matrix(result)
Plot g-function as a static contour (for reports)
Description
Creates a ggplot2 visualization for any g-function group. For d <= 1,
produces a 2D scatter plot (same as plot_g_function()). For d >= 2,
produces a filled contour plot suitable for static formats like PDF and Word.
Usage
plot_g_contour(earth_result, group_index, response_idx = NULL)
Arguments
earth_result |
An object of class |
group_index |
Integer. Index of the g-function group (1-based, from
|
response_idx |
Integer or |
Value
A ggplot2::ggplot object.
Examples
result <- fit_earth(mtcars, "mpg", c("cyl", "disp", "hp", "wt"))
plot_g_contour(result, 1)
Plot g-function contribution
Description
Creates a contribution plot for a specific g-function group. For degree-1 groups (single variable), produces a 2D scatter + piecewise-linear plot with slope labels and knot markers. For degree-2 groups (two variables), produces a 3D surface plot using plotly if available, or a filled contour plot.
Usage
plot_g_function(earth_result, group_index, response_idx = NULL)
Arguments
earth_result |
An object of class |
group_index |
Integer. Index of the g-function group (1-based, from
|
response_idx |
Integer or |
Value
A ggplot2::ggplot object for d <= 1, or a plotly widget for d >= 2 (when plotly is installed). Falls back to ggplot2 contour if plotly is not available.
Examples
result <- fit_earth(mtcars, "mpg", c("cyl", "disp", "hp", "wt"))
plot_g_function(result, 1)
Plot g-function as a static 3D perspective (for reports)
Description
Creates a base R persp() 3D surface plot for g-function groups with d >= 2.
For d <= 1, produces a 2D scatter plot (same as plot_g_function()).
The surface is colored by contribution value using a blue-white-red scale.
Suitable for PDF and Word output where interactive plotly is not available.
Usage
plot_g_persp(
earth_result,
group_index,
theta = 30,
phi = 25,
response_idx = NULL
)
Arguments
earth_result |
An object of class |
group_index |
Integer. Index of the g-function group (1-based, from
|
theta |
Numeric. Azimuthal rotation angle in degrees. Default 30. |
phi |
Numeric. Elevation angle in degrees. Default 25. |
response_idx |
Integer or |
Value
Invisible NULL (base graphics). For d <= 1, returns a ggplot object.
Examples
result <- fit_earth(mtcars, "mpg", c("cyl", "disp", "hp", "wt"), degree = 2L)
plot_g_persp(result, 1)
Plot partial dependence
Description
Creates a partial dependence plot for a selected variable from a fitted earth model.
Usage
plot_partial_dependence(
earth_result,
variable,
n_grid = 50L,
response_idx = NULL
)
Arguments
earth_result |
An object of class |
variable |
Character string. Name of the predictor variable to plot. |
n_grid |
Integer. Number of grid points for the partial dependence calculation. Default is 50. |
response_idx |
Integer or |
Value
A ggplot2::ggplot object.
Examples
result <- fit_earth(mtcars, "mpg", c("cyl", "disp", "hp", "wt"))
plot_partial_dependence(result, "wt")
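As background, the standard partial dependence calculation sets the chosen variable to each grid value for every row and averages the predictions. The sketch below uses a plain lm() fit as a stand-in model so that it runs with base R only. It illustrates the general definition and is not a description of how the package computes its plot.

```r
# Stand-in model (assumption: lm instead of an earth fit, to stay in base R).
fit <- stats::lm(mpg ~ wt + hp, data = mtcars)

# Grid over the variable of interest, 50 points as in the n_grid default.
grid <- seq(min(mtcars$wt), max(mtcars$wt), length.out = 50L)

# For each grid value: overwrite wt in every row, predict, and average.
pd <- vapply(grid, function(v) {
  tmp <- mtcars
  tmp$wt <- v
  mean(stats::predict(fit, newdata = tmp))
}, numeric(1))
```

For a linear stand-in the curve is a straight line whose slope equals the wt coefficient; with an earth model it would be piecewise linear.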
Plot Q-Q plot of residuals
Description
Creates a normal Q-Q plot of the model residuals.
Usage
plot_qq(earth_result, response_idx = NULL)
Arguments
earth_result |
An object of class |
response_idx |
Integer or |
Value
A ggplot2::ggplot object.
Examples
result <- fit_earth(mtcars, "mpg", c("cyl", "disp", "hp", "wt"))
plot_qq(result)
Plot residual diagnostics
Description
Creates a diagnostic plot of residuals vs fitted values. For the companion Q-Q plot of residuals, use plot_qq().
Usage
plot_residuals(earth_result, response_idx = NULL)
Arguments
earth_result |
An object of class |
response_idx |
Integer or |
Value
A ggplot2::ggplot object showing residuals vs fitted values.
Use plot_qq() for the Q-Q plot separately.
Examples
result <- fit_earth(mtcars, "mpg", c("cyl", "disp", "hp", "wt"))
plot_residuals(result)
Plot variable importance
Description
Creates a horizontal bar chart of variable importance from a fitted earth model.
Usage
plot_variable_importance(earth_result, type = "nsubsets")
Arguments
earth_result |
An object of class |
type |
Character. Importance metric to plot: |
Value
A ggplot2::ggplot object.
Examples
result <- fit_earth(mtcars, "mpg", c("cyl", "disp", "hp", "wt"))
plot_variable_importance(result)
Prepare report assets
Description
Pre-generates all plots and data for the earth model report. Returns the
path to a directory containing all assets. This directory can be passed to
render_report() to avoid re-computing anything during rendering.
Usage
prepare_report_assets(earth_result, assets_dir = NULL)
Arguments
earth_result |
An object of class |
assets_dir |
Character. Path to write assets. If |
Value
The path to the assets directory (invisibly).
Examples
result <- fit_earth(mtcars, "mpg", c("cyl", "disp", "hp", "wt"))
assets <- prepare_report_assets(result)
Encode the flat regProj segment from components
Description
Joins country, admin levels, and project name with _ to produce the
single hierarchy-encoding folder name used at the project level under
<root>/<purpose>/. Validates each component: admin codes must match
^[a-z0-9-]+$ (no internal underscores), project name allows
^[A-Za-z0-9_-]+$.
Usage
regproj_flat_segment(country, levels, project_name)
Arguments
country |
Lowercase ISO 3166-1 alpha-2 country code (e.g. |
levels |
Character vector of admin codes, ordered top to bottom.
Length must match |
project_name |
Project leaf name. Must satisfy |
Value
Character scalar (e.g. "us_ca_081_burlin_20251231_j").
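The encoding rule can be sketched as follows. The helper name encode_flat_segment_sketch() is hypothetical, the split of the documented example into three admin levels plus a project name is an assumption, and the check that the number of levels matches the country's schema is omitted.

```r
# Hypothetical re-implementation of the documented validation and join.
encode_flat_segment_sketch <- function(country, levels, project_name) {
  if (!grepl("^[a-z]{2}$", country)) {
    stop("country must be a lowercase two-letter code")
  }
  if (!all(grepl("^[a-z0-9-]+$", levels))) {
    stop("admin codes must match ^[a-z0-9-]+$ (no underscores)")
  }
  if (!grepl("^[A-Za-z0-9_-]+$", project_name)) {
    stop("project name must match ^[A-Za-z0-9_-]+$")
  }
  paste(c(country, levels, project_name), collapse = "_")
}

segment <- encode_flat_segment_sketch("us", c("ca", "081", "burlin"), "20251231_j")
segment
```

Underscores are rejected in admin codes because `_` is the separator; only the trailing project name may contain them.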
Open (and initialize if needed) the regProj geo database
Description
Returns a DBI connection. Creates the schema on first call. The
caller is responsible for DBI::dbDisconnect().
Usage
regproj_geo_db_connect(root = default_regproj_root())
Arguments
root |
regProj root. Defaults to |
Details
On first creation, the tables are seeded from:
- the shipped reference data (regproj_reference()): 24 countries, 51 US states, 3,076 US counties.
- the shipped places data (pkg/inst/extdata/regproj_geo.rds): US incorporated places + GeoNames-derived city/admin data for GB, DE, IT, FR, SE, SG.
- any pre-existing <REGPROJ_ROOT>/.regproj-index.json (legacy migration).
Value
A DBIConnection to the SQLite database.
Path to the regProj geo SQLite database
Description
<REGPROJ_ROOT>/geo.sqlite — holds country codes and a flexible,
variable-depth admin_entries table for state / county / city /
deeper-level admin codes per country. Travels with the regProj tree.
Usage
regproj_geo_db_path(root = default_regproj_root())
Arguments
root |
regProj root. Defaults to |
Value
Character scalar.
List input files in an active project's in/ folder
Description
Returns the basenames of regular files in <project_path>/<os>_in/.
Hidden dotfiles and subdirectories are excluded. Sorted alphabetically.
Usage
regproj_in_files(project_path, os = os_detect())
Arguments
project_path |
Absolute path to the project root (the flat segment folder). |
os |
OS segment to use. Defaults to |
Value
Character vector of file basenames.
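The listing behavior described above (regular files only, no dotfiles, alphabetical order) can be reproduced with base R. The helper name list_regular_files_sketch() is hypothetical and takes the input directory directly instead of deriving it from a project path and OS segment.

```r
# Hypothetical helper: basenames of visible regular files in a directory.
list_regular_files_sketch <- function(dir) {
  if (!dir.exists(dir)) return(character(0))
  # list.files() skips dotfiles by default and returns names in alphabetical order.
  entries <- list.files(dir, all.files = FALSE)
  is_dir <- dir.exists(file.path(dir, entries))
  entries[!is_dir]
}

# Small demonstration in a temporary directory.
demo_dir <- file.path(tempdir(), "in_files_sketch")
dir.create(demo_dir, showWarnings = FALSE)
invisible(file.create(file.path(demo_dir, c("b.csv", "a.csv", ".hidden"))))
dir.create(file.path(demo_dir, "sub"), showWarnings = FALSE)
in_files <- list_regular_files_sketch(demo_dir)
in_files
```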
Get a code for a full name within a scope
Description
Looks up full_name under scope (e.g. "us/ca") in the geo
SQLite DB (which is seeded with shipped reference data on first
creation). Returns NULL if not found.
Usage
regproj_index_get(scope, full_name, root = default_regproj_root())
Arguments
scope |
Character scalar. Slash-separated path of parent codes
(e.g. |
full_name |
Character scalar. The display name to look up. |
root |
regProj root. Defaults to |
Value
Character scalar code, or NULL.
Path to the central regProj name-to-code index (legacy)
Description
The geo data is now stored in <REGPROJ_ROOT>/geo.sqlite (see
regproj_geo_db_path()). This function returns the location of the
legacy .regproj-index.json file, which is migrated into the
SQLite DB on first connect and is no longer written to.
Usage
regproj_index_path(root = default_regproj_root())
Arguments
root |
regProj root. Defaults to |
Value
Character scalar.
Set a name-to-code mapping in the geo SQLite DB
Description
Inserts (or replaces) the mapping under the given scope.
Usage
regproj_index_put(scope, full_name, code, root = default_regproj_root())
Arguments
scope |
Character scalar. Slash-separated path of parent codes
(e.g. |
full_name |
Character scalar. The display name to store the mapping for. |
code |
Character scalar. The path code to assign. |
root |
regProj root. Defaults to |
Value
Invisibly, the code.
Read the central regProj index from the geo SQLite DB
Description
Returns the entire geo DB as a nested list keyed by scope (slash-
separated parent-code path), for backward compatibility with callers
that iterate. New code should prefer regproj_index_get() or direct
DB queries via regproj_geo_db_connect().
Usage
regproj_index_read(root = default_regproj_root())
Arguments
root |
regProj root. Defaults to |
Value
Named list. Outer key is scope ("" for countries, then
slash-separated codes per admin level). Inner key is full name;
value is the code.
Deprecated. The geo data is now in SQLite; this is a no-op kept for backward compatibility.
Description
Deprecated. The geo data is now in SQLite; this is a no-op kept for backward compatibility.
Usage
regproj_index_write(idx, root = default_regproj_root())
Arguments
idx |
Ignored. |
root |
regProj root. Defaults to |
Value
Invisibly, the geo DB path.
Last-used input file marker (per project)
Description
Stores the basename of the most-recently-selected input file in
<project>/<os>/.regproj-last, so it can be auto-selected the next
time the project is opened.
Usage
regproj_last_file_path(project_path, os = os_detect())
regproj_last_file_get(project_path, os = os_detect())
regproj_last_file_set(project_path, basename, os = os_detect())
Arguments
project_path |
Absolute project path. |
os |
OS segment. Defaults to |
basename |
File basename to remember. |
Value
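regproj_last_file_path() returns the path to the marker file, regproj_last_file_get() returns the stored basename, and regproj_last_file_set() returns the marker path invisibly.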
List all projects under a regProj root
Description
Walks <root>/<purpose>/<country>/<state>/<county>/<city>/<project_name>/
and returns a data frame describing each project found, with its mtime
(most recent of the in/ and out/ trees, falling back to the project
folder itself).
Usage
regproj_list_projects(
root = default_regproj_root(),
sort_by = c("recent", "alpha")
)
Arguments
root |
regProj root. Defaults to |
sort_by |
|
Details
Recognized purposes: gen, appr, mktarea. Other top-level
subdirectories under <root> are ignored, including any leading-dot
files like .regproj-index.json.
Value
Data frame with columns: project_path (absolute), purpose,
country, state, county, city, project_name, mtime
(POSIXct). Empty data frame (correct columns, zero rows) if no
projects exist or root is missing.
Decode a flat regProj segment back into components
Description
Inverse of regproj_flat_segment(). Splits on _, takes the first
token as country, then the next length(country_schema(country))
tokens as admin levels; the remaining tokens (joined by _) are the
project name. Returns NULL on parse failure.
Usage
regproj_parse_flat(segment)
Arguments
segment |
Character scalar. |
Value
Named list with country, levels (character vector),
project_name — or NULL if parsing failed.
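The decoding steps above can be sketched as below. The helper name decode_flat_segment_sketch() and its n_levels argument are hypothetical; n_levels stands in for length(country_schema(country)), and the value 3 used for "us" is an assumption based on the documented example segment.

```r
# Hypothetical decoder: split on "_", then peel off country and admin levels.
decode_flat_segment_sketch <- function(segment, n_levels) {
  tokens <- strsplit(segment, "_", fixed = TRUE)[[1]]
  # Need a country, n_levels admin codes, and at least one project token.
  if (length(tokens) < n_levels + 2L) return(NULL)
  list(
    country = tokens[1],
    levels = tokens[1 + seq_len(n_levels)],
    # Everything left over belongs to the project name, underscores included.
    project_name = paste(tokens[-seq_len(n_levels + 1L)], collapse = "_")
  )
}

parsed <- decode_flat_segment_sketch("us_ca_081_burlin_20251231_j", n_levels = 3L)
str(parsed)
```

Because the project name is taken as the tail of the token list, it is the only component that can safely contain `_`.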
Build a regProj-canonical path
Description
Composes the path
<root>/<purpose>/<flat_segment>/<os>_<in|out>[_<method>] from its
components. The hierarchy (country / admin levels / project name) is
concatenated into a single folder under <purpose>/ to keep the tree
shallow. Pure path computation — does not create any directories
unless create = TRUE.
Usage
regproj_path(
purpose,
country,
levels,
project_name,
os = os_detect(),
in_or_out = c("out", "in"),
method = "earth",
root = default_regproj_root(),
create = FALSE
)
Arguments
purpose |
|
country |
Lowercase ISO 3166-1 alpha-2 country code (e.g. |
levels |
Character vector of admin codes, ordered top to bottom.
Length must match |
project_name |
Project leaf name. Must satisfy |
os |
One of |
in_or_out |
One of |
method |
Optional method subdir (e.g. |
root |
Optional explicit regProj root. Defaults to
|
create |
Logical. If |
Value
Character scalar. Absolute normalized path.
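The path layout can be illustrated with a small pure function. This is a sketch only: build_path_sketch() is a hypothetical name, the root "/data/regproj" and the OS segment "mac" are made-up values, and the rule for when the method suffix is appended is an assumption.

```r
# Hypothetical composition of <root>/<purpose>/<flat_segment>/<os>_<in|out>[_<method>].
build_path_sketch <- function(root, purpose, flat_segment, os,
                              in_or_out = c("out", "in"), method = NULL) {
  in_or_out <- match.arg(in_or_out)
  leaf <- paste(os, in_or_out, sep = "_")
  # Assumption: the method suffix is added only when a non-empty method is given.
  if (!is.null(method) && nzchar(method)) {
    leaf <- paste(leaf, method, sep = "_")
  }
  file.path(root, purpose, flat_segment, leaf)
}

out_path <- build_path_sketch("/data/regproj", "appr",
                              "us_ca_081_burlin_20251231_j", "mac",
                              in_or_out = "out", method = "earth")
in_path <- build_path_sketch("/data/regproj", "appr",
                             "us_ca_081_burlin_20251231_j", "mac",
                             in_or_out = "in")
out_path
in_path
```

Like the documented default (create = FALSE), the sketch only computes strings and never touches the file system.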
Open (and initialize if needed) the regProj projects database
Description
Returns a DBI connection. Creates the schema on first call. The
caller is responsible for DBI::dbDisconnect().
Usage
regproj_projects_db_connect(root = default_regproj_root())
Arguments
root |
regProj root. Defaults to |
Value
A DBIConnection to the SQLite database.
Path to the regProj projects SQLite database
Description
<REGPROJ_ROOT>/projects.sqlite — one row per project, keyed by the
flat segment (e.g. "us_ca_081_burlin_20251231_j"). Holds project
metadata and per-method settings as JSON blobs.
Usage
regproj_projects_db_path(root = default_regproj_root())
Arguments
root |
regProj root. Defaults to |
Value
Character scalar.
Load the shipped regProj reference data
Description
Reads pkg/inst/extdata/regproj_reference.json once per session and
caches the result. Contains country names, US states, and US counties
(with FIPS codes). Used by the UI cascades to populate dropdowns.
Usage
regproj_reference()
Value
A nested list with components version, countries, states,
counties.
Render an earth model report
Description
Renders a parameterized 'Quarto' report from the fitted 'earth' model results. Requires the 'quarto' R package and a 'Quarto' installation.
Usage
render_report(
earth_result,
output_format = "html",
output_file = NULL,
paper_size = "letter",
assets_dir = NULL
)
Arguments
earth_result |
An object of class |
output_format |
Character. Output format: |
output_file |
Character. Path for the output file. If |
paper_size |
Character. Paper size for PDF output: |
assets_dir |
Character or |
Value
The path to the rendered output file (invisibly).
Examples
result <- fit_earth(mtcars, "mpg", c("cyl", "disp", "hp", "wt"))
render_report(result, output_format = "html",
output_file = tempfile(fileext = ".html"))
Select comparable sales for the Sales Comparison Grid
Description
Given an RCA-adjusted data frame (subject in row 1, comps in rows 2+), builds comp-summary tables, filters by weight, sorts by gross adjustment percentage, splits into "recommended" (gross adjustment < threshold) and "others", and caps the recommended list.
Usage
select_sales_grid_comps(
rca_df,
sale_age_col = "sale_age",
min_weight = 0,
max_gross_adj_pct = 0.25,
max_recommended = 30L
)
Arguments
rca_df |
A data frame produced by the RCA workflow. Row 1 is the
subject; rows 2+ are comps. Expected columns (any may be missing; NA is
substituted): |
sale_age_col |
Character scalar giving the column name that holds
sale age in days. Defaults to |
min_weight |
Numeric. Comps with |
max_gross_adj_pct |
Numeric. Comps whose |
max_recommended |
Integer. Upper bound on the number of recommended
comps returned. Default |
Details
This is the non-Shiny computation kernel used by the earthUI Shiny app's Sales Grid modal, and is also suitable for use from batch scripts.
Value
A named list with:
- recommended
Data frame of recommended comps, sorted by sale_age ascending, capped at max_recommended rows.
- others
Data frame of eligible comps not in the recommended set, sorted by gross_adj_pct ascending.
Each data frame has columns: row, id, address, sale_price,
sale_age, weight, gross_adj, gross_adj_pct.
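The filter, split, and cap steps can be sketched on a toy table. The helper split_comps_sketch() is hypothetical and the data are synthetic. Three choices are assumptions rather than documented behavior: comps are kept only when weight is strictly greater than min_weight, the cap keeps the comps with the lowest gross_adj_pct, and comps cut by the cap move to the "others" set.

```r
# Hypothetical kernel illustrating the documented selection steps.
split_comps_sketch <- function(comps, min_weight = 0,
                               max_gross_adj_pct = 0.25,
                               max_recommended = 30L) {
  eligible <- comps[comps$weight > min_weight, , drop = FALSE]
  eligible <- eligible[order(eligible$gross_adj_pct), , drop = FALSE]

  is_rec <- eligible$gross_adj_pct < max_gross_adj_pct
  candidates <- eligible[is_rec, , drop = FALSE]
  pos <- seq_len(nrow(candidates))
  recommended <- candidates[pos <= max_recommended, , drop = FALSE]
  overflow <- candidates[pos > max_recommended, , drop = FALSE]

  others <- rbind(overflow, eligible[!is_rec, , drop = FALSE])
  list(
    recommended = recommended[order(recommended$sale_age), , drop = FALSE],
    others = others[order(others$gross_adj_pct), , drop = FALSE]
  )
}

# Synthetic comps for illustration only.
toy_comps <- data.frame(
  id = c("a", "b", "c", "d", "e"),
  weight = c(1, 0, 0.5, 0.8, 0.3),
  sale_age = c(120, 30, 45, 200, 10),
  gross_adj_pct = c(0.10, 0.05, 0.30, 0.20, 0.15)
)
grid_sets <- split_comps_sketch(toy_comps, max_recommended = 2L)
grid_sets$recommended$id
grid_sets$others$id
```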
Set model settings for a project
Description
Writes model settings for a project to <REGPROJ_ROOT>/projects.sqlite.
Used by the Shiny UI to persist settings, and by external tools to seed
projects programmatically. Settings are scoped by project + purpose and
shared across all data files in the project.
Usage
set_project_settings(
project_path,
settings = NULL,
variables = NULL,
interactions = NULL,
method = "earth",
purpose = "general",
root = default_regproj_root()
)
Arguments
project_path |
Absolute path to the project root. |
settings, variables, interactions |
JSON strings (or |
method |
One of |
purpose |
Settings scope: one of |
root |
regProj root. Defaults to |
Value
Invisibly, NULL.
Validate earthUI_result object
Description
Validate earthUI_result object
Usage
validate_earthUI_result(x)
Arguments
x |
Object to validate. |
Value
Invisible NULL. Raises error if invalid.
Validate declared column types against actual data
Description
Checks each selected predictor's actual data against the user-declared type. Returns a list of errors (blocking), warnings (non-blocking), and any Date/POSIXct columns that will be auto-converted to numeric.
Usage
validate_types(df, type_map, predictors)
Arguments
df |
A data frame. |
type_map |
Named list or character vector. Names are column names,
values are declared types (e.g., |
predictors |
Character vector of selected predictor column names. |
Value
A list with components:
- ok
Logical. TRUE if no blocking errors found.
- warnings
Character vector of non-blocking warnings.
- errors
Character vector of blocking errors.
- date_columns
Character vector of Date/POSIXct predictor columns that will be auto-converted to numeric.
Examples
df <- data.frame(price = c(100, 200, 300), city = c("A", "B", "C"))
types <- list(price = "numeric", city = "character")
validate_types(df, types, predictors = c("price", "city"))
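To make the shape of the return value concrete, here is a reduced sketch of such a check. It is not the package's logic: check_types_sketch() is a hypothetical helper that covers only two rules (a column declared "numeric" must be numeric, and Date/POSIXct columns are reported), and it never produces warnings.

```r
# Hypothetical, reduced type check returning the documented list structure.
check_types_sketch <- function(df, type_map, predictors) {
  errors <- character(0)
  date_columns <- character(0)
  for (col in predictors) {
    x <- df[[col]]
    declared <- type_map[[col]]
    if (inherits(x, c("Date", "POSIXct"))) {
      date_columns <- c(date_columns, col)
    } else if (identical(declared, "numeric") && !is.numeric(x)) {
      errors <- c(errors, sprintf("'%s' is declared numeric but holds %s data",
                                  col, class(x)[1]))
    }
  }
  list(ok = length(errors) == 0L, warnings = character(0),
       errors = errors, date_columns = date_columns)
}

# Synthetic data: 'city' is deliberately declared with the wrong type.
chk_df <- data.frame(
  price = c(100, 200, 300),
  city = c("A", "B", "C"),
  sold = as.Date(c("2024-01-01", "2024-02-01", "2024-03-01"))
)
chk <- check_types_sketch(chk_df,
                          list(price = "numeric", city = "numeric", sold = "Date"),
                          predictors = c("price", "city", "sold"))
chk$ok
chk$errors
chk$date_columns
```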
Write an earth model summary to a text file
Description
Writes the model print-out, the summary.earth() output, and the optional
variance model and trace log to
<file_name>_earth_output_<timestamp>.txt in output_folder.
Usage
write_earth_output(result, output_folder, file_name)
Arguments
result |
A fit result list as returned by |
output_folder |
Character scalar. May be |
file_name |
Character scalar. Used to derive the output filename. |
Value
Invisibly, NULL.
Write an earthUI fitting log to a text file
Description
Writes the timestamped contents of an earth fitting trace to
<file_name>_earth_log_<timestamp>.txt in output_folder.
If output_folder is NULL or empty, ~/Downloads is used. The folder
is created if it doesn't exist. Errors are caught and reported via
message() so batch pipelines don't fail on logging issues.
Usage
write_fit_log(output_folder, lines, file_name)
Arguments
output_folder |
Character scalar. Directory in which to write the
log. May be |
lines |
Character vector of log lines. |
file_name |
Character scalar. Used to derive the log filename (the extension is stripped). |
Value
Invisibly, NULL.
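The logging behavior described above (timestamped file name, fallback folder, errors reported via message()) can be sketched as follows. The helper write_log_sketch() is hypothetical, the timestamp format is an assumption, and unlike write_fit_log() it returns the written path so the result can be inspected.

```r
# Hypothetical logger mirroring the documented naming and error handling.
write_log_sketch <- function(output_folder, lines, file_name) {
  tryCatch({
    if (is.null(output_folder) || !nzchar(output_folder)) {
      output_folder <- path.expand("~/Downloads")
    }
    dir.create(output_folder, recursive = TRUE, showWarnings = FALSE)
    stem <- tools::file_path_sans_ext(basename(file_name))
    stamp <- format(Sys.time(), "%Y%m%d_%H%M%S")  # format is an assumption
    path <- file.path(output_folder,
                      paste0(stem, "_earth_log_", stamp, ".txt"))
    writeLines(lines, path)
    path
  }, error = function(e) {
    # Report instead of failing, so a batch pipeline keeps running.
    message("Could not write fit log: ", conditionMessage(e))
    NULL
  })
}

log_dir <- file.path(tempdir(), "fit_log_sketch")
log_path <- write_log_sketch(log_dir, c("pass 1", "pass 2"), "sales.csv")
basename(log_path)
```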