Skip to contents

Universal network estimation function that supports both transition networks (relative, frequency, co-occurrence) and association networks (correlation, partial correlation, graphical lasso). Uses the global estimator registry, so custom estimators can also be used.

Usage

build_network(
  data,
  method,
  actor = NULL,
  action = NULL,
  time = NULL,
  session = NULL,
  order = NULL,
  codes = NULL,
  group = NULL,
  format = "auto",
  window_size = 3L,
  mode = c("non-overlapping", "overlapping"),
  scaling = NULL,
  threshold = 0,
  level = NULL,
  time_threshold = 900,
  params = list(),
  ...
)

Arguments

data

Data frame (sequences or per-observation frequencies) or a square symmetric matrix (correlation or covariance).

method

Character. Required. Name of a registered estimator. Built-in methods: "relative", "frequency", "co_occurrence", "cor", "pcor", "glasso". Aliases: "tna" and "transition" map to "relative"; "ftna" and "counts" map to "frequency"; "cna" maps to "co_occurrence"; "corr" and "correlation" map to "cor"; "partial" maps to "pcor"; "ebicglasso" and "regularized" map to "glasso".

actor

Character. Name of the actor/person ID column for sequence grouping. Default: NULL.

action

Character. Name of the action/state column (long format). Default: NULL.

time

Character. Name of the time column (long format). Default: NULL.

session

Character. Name of the session column. Default: NULL.

order

Character. Name of the ordering column. Default: NULL.

codes

Character vector. Column names of one-hot encoded states (for onehot format). Default: NULL.

group

Character. Name of a grouping column for per-group networks. Returns a netobject_group (named list of netobjects). Default: NULL.

format

Character. Input format: "auto", "wide", "long", or "onehot". Default: "auto".

window_size

Integer. Window size for one-hot windowing. Default: 3L.

mode

Character. Windowing mode: "non-overlapping" or "overlapping". Default: "non-overlapping".

scaling

Character vector or NULL. Post-estimation scaling to apply (in order). Options: "minmax", "max", "rank", "normalize". Can combine: c("rank", "minmax"). Default: NULL (no scaling).

threshold

Numeric. Absolute values below this are set to zero in the result matrix. Default: 0 (no thresholding).

level

Character or NULL. Multilevel decomposition for association methods. One of NULL, "between", "within", "both". Requires id_col. Default: NULL.

time_threshold

Numeric. Maximum time gap (seconds) for long format session splitting. Default: 900.

params

Named list. Method-specific parameters passed to the estimator function (e.g. list(gamma = 0.5) for glasso, or list(format = "wide") for transition methods). This is the key composability feature: downstream functions like bootstrap or grid search can store and replay the full params list without knowing method internals.

...

Additional arguments passed to the estimator function.

Value

An object of class c("netobject", "cograph_network") containing:

data

The input data used for estimation, as a data frame.

weights

The estimated network weight matrix.

nodes

Data frame with columns id, label, name, x, y. Node labels are in $nodes$label.

edges

Data frame of non-zero edges with integer from/to (node IDs) and numeric weight.

directed

Logical. Whether the network is directed.

method

The resolved method name.

params

The params list used (for reproducibility).

scaling

The scaling applied (or NULL).

threshold

The threshold applied.

n_nodes

Number of nodes.

n_edges

Number of non-zero edges.

level

Decomposition level used (or NULL).

meta

List with source, layout, and tna metadata (cograph-compatible).

node_groups

Node groupings data frame, or NULL.

Method-specific extras (e.g. precision_matrix, cor_matrix, frequency_matrix, lambda_selected, etc.) are preserved from the estimator output.

When level = "both", returns an object of class "netobject_ml" with $between and $within sub-networks and a $method field.

Details

The function works as follows:

  1. Resolves method aliases to canonical names.

  2. Retrieves the estimator function from the global registry.

  3. For association methods with level specified, decomposes the data (between-person means or within-person centering).

  4. Calls the estimator: do.call(fn, c(list(data = data), params)).

  5. Applies scaling and thresholding to the result matrix.

  6. Extracts edges and constructs the netobject.

Examples

# \donttest{
# Transition network (relative probabilities)
seqs <- data.frame(
  V1 = sample(LETTERS[1:4], 30, TRUE), V2 = sample(LETTERS[1:4], 30, TRUE),
  V3 = sample(LETTERS[1:4], 30, TRUE), V4 = sample(LETTERS[1:4], 30, TRUE)
)
net <- build_network(seqs, method = "relative")
print(net)
#> Transition Network (relative probabilities) [directed]
#>   Weights: [0.111, 0.556]  |  mean: 0.238
#> 
#>   Weight matrix:
#>         A     B     C     D
#>   A 0.423 0.192 0.192 0.192
#>   B 0.304 0.304 0.217 0.174
#>   C 0.217 0.304 0.304 0.174
#>   D 0.222 0.111 0.556 0.111 
#> 
#>   Initial probabilities:
#>   A             0.333  ████████████████████████████████████████
#>   B             0.300  ████████████████████████████████████
#>   D             0.200  ████████████████████████
#>   C             0.167  ████████████████████

# Association network (glasso)
freq_data <- convert_sequence_format(seqs, format = "frequency")
net_glasso <- build_network(freq_data, method = "glasso",
                             params = list(gamma = 0.5, nlambda = 50))
#> Dropping non-syntactic columns: 0, 1, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 2, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 3, 30, 4, 5, 6, 7, 8, 9

# With scaling
net_scaled <- build_network(seqs, method = "relative",
                             scaling = c("rank", "minmax"))
# }