---
title: "vismeteor"
output:
    rmarkdown::html_vignette:
        toc: true
        fig_width: 6
        fig_height: 4
vignette: >
  %\VignetteIndexEntry{vismeteor}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
    collapse = TRUE,
    comment = "#>"
)
```

```{r setup, include = FALSE}
library(vismeteor)
knitr::opts_chunk$set(
    collapse = TRUE,
    comment = "#>"
)
```

## Introduction

`vismeteor` was developed to provide a comprehensive tool for analyzing visually observed
meteors. It takes into account human perception of different meteor magnitudes through
perception probabilities. Therefore, this package provides methods that incorporate
these perception probabilities in the evaluation of meteor magnitudes.

Perception probabilities are crucial for analyzing observed magnitude distributions,
as they give rise to specific distributions unique to visual meteor observations.
One such model is the Geometric Model of Visual Meteor Magnitudes.
It is based on the standard geometric distribution, adjusted by multiplying its probabilities with
the perception probabilities. The resulting distribution has a single parameter:
the population index r.
This package provides functions to work with this adjusted distribution.

Additionally, this package offers tools to facilitate the evaluation, recognizing that
visual meteor observations typically report only total counts over a specified period,
rather than individual meteor events.

For visual observations of meteor magnitudes, the observed counts can be specified in
fractional values (half meteors). This package includes a function for unbiased rounding
of these fractional values to use standard tools in R that only allow integer values.

Beyond magnitude distributions, the package also helps in analyzing how meteor activity
varies over time or solar longitude. It provides a routine for fitting smooth, parsimonious
profiles to noisy meteor count series, automatically choosing a small set of spline knot
positions from a user-supplied candidate grid under an information criterion or
cross-validation score.

## Knot Selection for Spline Models

The function `select_knots()` performs greedy stepwise selection of interior spline knots from
a user-supplied candidate grid. In each step it adds (forward) or removes (backward) the single
knot that most improves a user-defined score, such as BIC, AIC, or a cross-validation criterion.
The routine is model-agnostic: the scoring function is responsible for fitting the model — a
Poisson GLM, a natural cubic spline via `glm()`, or any other form — so `select_knots()` can
be reused across very different analyses.

A typical application is fitting the activity profile of a meteor shower over solar longitude
through noisy counts. A fine grid of candidate positions captures where the curve *could* bend;
`select_knots()` reduces it to a small subset where the data actually demand a bend, yielding
a smooth, parsimonious fit.

```{r, echo=TRUE}
set.seed(1)
x <- seq(0, 10, length.out = 200)
y <- rpois(length(x), lambda = 50 + 30 * sin(x))
dat <- data.frame(x = x, y = y)

fit <- \(d, knots) {
    f <- if (length(knots) == 0L) {
        y ~ splines::ns(x)
    } else {
        y ~ splines::ns(x, knots = knots)
    }
    glm(f, family = poisson(), data = d)
}
score_bic <- \(d, knots) fit(d, knots) |> BIC()

result <- select_knots(
    dat,
    knot_candidates = seq(1, 9, by = 0.5),
    score_fun = score_bic
)
final_knots <- sort(c(result$knots, result$fixed_knots))
final_fit <- fit(dat, final_knots)

op <- par(mar = c(7, 4, 4, 2) + 0.1)
plot(
    x, y,
    pch = 20, col = "gray",
    main = "Poisson spline fit with selected knots",
    xlab = "x", ylab = "count"
)
lines(x, 50 + 30 * sin(x), col = "darkgreen", lwd = 2, lty = 2)
lines(x, final_fit |> predict(type = "response"), col = "blue", lwd = 2)
abline(v = final_knots, col = "red", lty = 3)
legend(
    "bottom",
    inset = c(0, -0.35),
    legend = c("true mean", "spline fit", "selected knots"),
    col = c("darkgreen", "blue", "red"),
    lty = c(2, 1, 3),
    lwd = c(2, 2, 1),
    horiz = TRUE,
    bty = "n",
    xpd = TRUE
)
par(op)
```

See `?select_knots` and `vignette("select_knots")` for more information.

## Perception probabilities

Meteors appear randomly distributed across the sky. The perception of an observer can
vary significantly for each meteor, influenced largely by the region of the sky where
the meteor appears. Depending on its position, a meteor might be perceived or missed entirely,
leading to the concept of perception probabilities.

Perception probabilities quantify the likelihood of an observer detecting a meteor.
They provide a statistical measure reflecting the chance of seeing or not seeing a meteor
under various conditions. In statistical modeling of visual observations, these probabilities
are commonly expressed in relation to the difference between the meteor's magnitude and the
limiting magnitude of the observation.

This package includes the function `vmperception()` to return perception probabilities for
visual meteor magnitudes.

Custom perception probability functions can be passed to most functions of this package.
This allows users to tailor the behavior of the functions to their specific needs and
conduct their own studies on the appearance of observable meteor magnitudes.
This flexibility is particularly valuable for researchers wishing to incorporate
their own empirical data or models into the analysis.

```{r, echo=TRUE}
plot(
    vmperception,
    -0.5, 8,
    main = "Perception probability of visual meteor magnitudes",
    col = "blue",
    xlab = "limiting magnitude - meteor magnitude",
    ylab = "p"
)
```

See `?vmperception` for more information.

## Geometric Model of Visual Meteor Magnitudes

In visual meteor observations, magnitudes are typically recorded as integer values.  
Hence, the geometric model is discrete and its probability mass function is
$$
P[M = m] \sim
\begin{cases}
  f(m_{\mathrm{lim}} - m)\, r^m, & \text{if } m_{\mathrm{lim}} - m > -0.5,\\[5pt]
  0 & \text{otherwise,}
\end{cases}
$$

where $m_{\mathrm{lim}}$ denotes the limiting (non-integer) magnitude of the observation,
and $m$ the integer meteor magnitude.
The function $f(\cdot)$ denotes the perception probability function.

This distribution is the product of the perception probabilities and the
underlying geometric distribution of meteor magnitudes.
Therefore, the parameter $p$ of the geometric distribution is $p = 1 - 1/r$.

The main advantage of this model is its simplicity.
When the number of observed meteors is small, it can often be shown that the
distribution of meteor magnitudes is well described by this model.

```{r, echo=TRUE, results='hide'}
m <- seq(6, -4, -1)
limmag <- 6.5
r <- 2.0
p <- vismeteor::dvmgeom(m, limmag, r)
barplot(
    p,
    names.arg = m,
    main = paste0("Density (r = ", r, ", limmag = ", limmag, ")"),
    col = "blue",
    xlab = "m",
    ylab = "p",
    border = "blue",
    space = 0.5
)
axis(side = 2, at = pretty(p))
```

See `?vmgeom` for more information.

## Ideal Distribution of Visual Meteor Magnitudes

The ideal model of visual meteor magnitudes is an alternative to the geometric model.
It more accurately accounts for the finiteness of meteor magnitudes across the entire
magnitude spectrum, using the ideal gas from theoretical physics as a conceptual analogy.

This model is particularly useful when, compared to the geometric model,
fewer faint meteor magnitudes are observed.

The density of the ideal distribution is

$$
{\displaystyle \frac{\mathrm{d}p}{\mathrm{d}m} = \frac{3}{2} \, \log(r) \sqrt{\frac{r^{3 \, \psi + 2 \, m}}{(r^\psi + r^m)^5}}}
$$
where $m$ is the continuous (real-valued) meteor magnitude,
$r = 10^{0.4} \approx 2.51189 \dots$ is a constant and
$\psi$ is the only parameter of this magnitude distribution.

```{r, echo=TRUE, results='hide'}
m <- seq(6, -4, -1)
psi <- 5.0
limmag <- 6.5
p <- vismeteor::dvmideal(m, limmag, psi)
barplot(
    p,
    names.arg = m,
    main = paste0("Density (psi = ", psi, ", limmag = ", limmag, ")"),
    col = "blue",
    xlab = "m",
    ylab = "p",
    border = "blue",
    space = 0.5
)
axis(side = 2, at = pretty(p))
```

See `?mideal` and `?vmideal` for more information.

## Fractional Counting

In the statistical analysis of visual meteor observations, a method known as "count data"
in categorical time series is used. Observers record the number of meteors in each category
over specific time intervals.

In visual meteor observation, observers record numerical meteor magnitudes and sum them,
uniquely allowing for "half" counts when indecisive between two values. This fractional
counting reflects measurement uncertainty, splitting counts between adjacent magnitude categories.

While fractional counting accommodates measurement uncertainty with "half" counts, some tools
cannot process these fractional values and require integer rounding.
The function `vmtable()` addresses this by rounding the magnitudes in a contingency table to whole
numbers. It ensures that the rounding process only alters the marginal totals when necessary,
preserving the overall count integrity. This means both the grand total and the intermediate
sums of meteors observed remain consistent, ensuring accurate and usable data for
subsequent analysis.

Example:
```{r, echo=TRUE}
mt <- as.table(matrix(
    c(
        0.0, 0.0, 2.5, 0.5, 0.0, 1.0,
        0.0, 1.5, 2.0, 0.5, 0.0, 0.0,
        1.0, 0.0, 0.0, 3.0, 2.5, 0.5
    ),
    nrow = 3, ncol = 6, byrow = TRUE
))
colnames(mt) <- seq(6)
rownames(mt) <- c("A", "B", "C")
margin.table(mt, 1)
margin.table(mt, 2)

# contingency table with integer values
(mt_int <- vmtable(mt))
margin.table(mt_int, 1)
margin.table(mt_int, 2)
```

See `?vmtable` for more information.

## Quantile Analysis with Minimum Meteor Count

The function `freq_quantile()` is tailored for analyzing time series data in visual
meteor observation, where the count of meteors is recorded over specific time intervals.

Unlike traditional methods that sort quantiles based on time and percent, which often result
in some quantiles having fewer meteors than desired, `freq_quantile()` constructs quantiles
with a focus on ensuring a minimum number of meteors in each quantile.

This method addresses the challenge of varying meteor counts in each interval, including those
with zero meteors. By utilizing `freq_quantile()`, users can effectively divide the time series
into quantiles that are both time-based and density-based, enhancing the understanding of meteor
occurrence and distribution over the observed period while ensuring each quantile meets the
minimum count criterion.

This approach provides a more nuanced and reliable analysis, especially vital when dealing with
the inherent variability in meteor observations.

The following example represents a time-ordered list of observed meteor counts.
The objective is to group these counts into quantiles while maintaining their chronological order,
ensuring that each quantile contains at least 10 meteors.

```{r, echo=TRUE}
freq <- c(1, 8, 3, 3, 4, 9, 5, 0, 0, 2, 7, 8, 2, 6, 4)
f <- freq_quantile(freq, 10)
print(f)
print(tapply(freq, f, sum))
```

See `?freq_quantile` for more information.

## Interfacing VMDB Data with VMDB Loaders

The functions `load_vmdb_rates()` and `load_vmdb_magnitudes()` query an
[imo-vmdb](https://pypi.org/project/imo-vmdb/) web server through its REST API to retrieve
visual meteor observations from the Visual Meteor Database (VMDB). The imo-vmdb server ingests
and validates the original VMDB exports and serves them in a structured, relational form, so
callers receive enriched and consistent observation records rather than raw exports.

Both functions take a `base_url` pointing at the API endpoint and accept filters such as
`shower`, `period`, `sl` (solar longitude) and `lim_magn` (limiting magnitude). For rate
observations, `sun_alt_max` and `moon_alt_max` further constrain sky conditions. Optional
companion data — observing session metadata and per-observation magnitude distributions —
can be requested via `with_sessions` and `with_magnitudes`.

Within this package, `PER_2015_rates` and `PER_2015_magn` are provided as example data sets.
They are included for testing and training purposes, allowing users to
understand and utilize the full functionality of the entire package.

See `?load_vmdb`, `?PER_2015_rates` and `?PER_2015_magn` for more information.
