
Scaling

Per-layer feature scaling for multi-layer probes.


lmprobe.scaling.PerLayerScaler

Standardize features on a per-layer basis.

When features from multiple layers are concatenated, the activations of each layer may differ in magnitude. This scaler standardizes each layer's features to zero mean and unit variance.

Two strategies are available:

- "per_neuron": Each neuron gets its own mean/std (more parameters, higher variance)
- "per_layer": All neurons in a layer share one mean/std (fewer parameters, lower variance)

The "per_layer" strategy may be preferable when:

- Sample size is small relative to hidden dimension
- Neurons within a layer have similar activation distributions (symmetry assumption)
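The difference between the two strategies can be sketched in plain NumPy. This is a minimal illustration, not lmprobe's implementation. It assumes the concatenated features are laid out layer by layer, so that reshaping to (n_samples, n_layers, hidden_dim) recovers the individual layers; the variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_layers, hidden_dim = 8, 3, 4

# Hypothetical concatenated activations; assumes layer-major column order.
X = rng.normal(size=(n_samples, n_layers * hidden_dim))
X3 = X.reshape(n_samples, n_layers, hidden_dim)

# "per_neuron": one mean/std for every (layer, neuron) pair.
neuron_means = X3.mean(axis=0)      # shape (n_layers, hidden_dim)
neuron_stds = X3.std(axis=0)        # shape (n_layers, hidden_dim)

# "per_layer": one mean/std shared by all neurons in a layer,
# estimated from n_samples * hidden_dim values instead of n_samples.
layer_means = X3.mean(axis=(0, 2))  # shape (n_layers,)
layer_stds = X3.std(axis=(0, 2))    # shape (n_layers,)

print(neuron_means.shape, layer_means.shape)
```

With "per_layer", each statistic pools every neuron in the layer, which is why it tends to be the more stable choice when there are few samples.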

Parameters:

Name Type Description Default
n_layers int

Number of layers in the concatenated features.

required
hidden_dim int

Hidden dimension per layer (features per layer).

required
strategy str

Scaling strategy:

- "per_neuron": Each neuron has its own mean/std
- "per_layer": All neurons in a layer share one mean/std

"per_neuron"

Attributes:

Name Type Description
means_ ndarray | None

Feature means. Shape depends on strategy:

- "per_neuron": (n_layers, hidden_dim)
- "per_layer": (n_layers,)

stds_ ndarray | None

Feature standard deviations. Shape matches means_.

Examples:

>>> from lmprobe.scaling import PerLayerScaler
>>> # X_train and X_test are arrays of shape (n_samples, 3 * 128)
>>> scaler = PerLayerScaler(n_layers=3, hidden_dim=128, strategy="per_neuron")
>>> X_train_scaled = scaler.fit_transform(X_train)
>>> X_test_scaled = scaler.transform(X_test)
>>> # Use per_layer for small sample sizes
>>> scaler = PerLayerScaler(n_layers=3, hidden_dim=128, strategy="per_layer")

fit

fit(X: ndarray) -> PerLayerScaler

Compute per-layer means and standard deviations.

Parameters:

Name Type Description Default
X ndarray

Feature matrix, shape (n_samples, n_layers * hidden_dim).

required

Returns:

Type Description
PerLayerScaler

Self, for method chaining.

Raises:

Type Description
ValueError

If X has the wrong number of features.
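The feature-count requirement amounts to checking that X has exactly n_layers * hidden_dim columns. The sketch below shows one way such a check could look; `check_n_features` is a hypothetical helper written for illustration, and the error message is not taken from lmprobe.

```python
import numpy as np


def check_n_features(X: np.ndarray, n_layers: int, hidden_dim: int) -> None:
    """Hypothetical helper: raise ValueError on a feature-count mismatch."""
    expected = n_layers * hidden_dim
    if X.ndim != 2 or X.shape[1] != expected:
        raise ValueError(
            f"Expected X with shape (n_samples, {expected}), got {X.shape}"
        )


# 3 layers * 128 neurons = 384 features, so this passes silently.
check_n_features(np.zeros((5, 384)), n_layers=3, hidden_dim=128)

# Only two layers' worth of features: the check raises.
try:
    check_n_features(np.zeros((5, 256)), n_layers=3, hidden_dim=128)
except ValueError as err:
    message = str(err)
    print(message)
```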

transform

transform(X: ndarray) -> np.ndarray

Apply per-layer standardization.

Parameters:

Name Type Description Default
X ndarray

Feature matrix, shape (n_samples, n_layers * hidden_dim).

required

Returns:

Type Description
ndarray

Standardized features, same shape as input.

Raises:

Type Description
RuntimeError

If the scaler has not been fitted.

ValueError

If X has the wrong number of features.
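As a rough picture of what per-layer standardization does, the NumPy sketch below computes statistics on training data only and applies them to both splits. It illustrates the "per_layer" idea under the assumption of layer-major column order; `apply_scaling` is a hypothetical name, and the code is not lmprobe's implementation.

```python
import numpy as np

rng = np.random.default_rng(1)
n_layers, hidden_dim = 2, 3

# Give the two layers deliberately different activation scales.
scales = np.repeat([1.0, 50.0], hidden_dim)
X_train = rng.normal(size=(200, n_layers * hidden_dim)) * scales
X_test = rng.normal(size=(20, n_layers * hidden_dim)) * scales

# "Fit": one mean/std per layer, estimated from the training data only.
train3 = X_train.reshape(-1, n_layers, hidden_dim)
means = train3.mean(axis=(0, 2))  # shape (n_layers,)
stds = train3.std(axis=(0, 2))    # shape (n_layers,)


def apply_scaling(X: np.ndarray) -> np.ndarray:
    """Standardize each layer's block of columns with the fitted stats."""
    X3 = X.reshape(-1, n_layers, hidden_dim)
    scaled = (X3 - means[None, :, None]) / stds[None, :, None]
    return scaled.reshape(X.shape)


X_train_scaled = apply_scaling(X_train)
X_test_scaled = apply_scaling(X_test)
```

Reusing the training statistics for the test split mirrors the fit/transform separation in the class example and keeps test data from influencing the scaling.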

fit_transform

fit_transform(X: ndarray) -> np.ndarray

Fit the scaler and transform the data in one step.

Parameters:

Name Type Description Default
X ndarray

Feature matrix, shape (n_samples, n_layers * hidden_dim).

required

Returns:

Type Description
ndarray

Standardized features, same shape as input.

inverse_transform

inverse_transform(X: ndarray) -> np.ndarray

Reverse the standardization.

Parameters:

Name Type Description Default
X ndarray

Standardized feature matrix, shape (n_samples, n_layers * hidden_dim).

required

Returns:

Type Description
ndarray

Original-scale features, same shape as input.

Raises:

Type Description
RuntimeError

If the scaler has not been fitted.
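Reversing the standardization means multiplying by the standard deviation and adding the mean back. The round trip can be sketched in NumPy as follows, here with per-neuron statistics; this is an illustration under the same layer-major layout assumption, not lmprobe's code.

```python
import numpy as np

rng = np.random.default_rng(2)
n_layers, hidden_dim = 3, 4
X = rng.normal(loc=5.0, scale=2.0, size=(50, n_layers * hidden_dim))

# Per-neuron statistics, shape (n_layers, hidden_dim).
X3 = X.reshape(-1, n_layers, hidden_dim)
means = X3.mean(axis=0)
stds = X3.std(axis=0)

# Forward: standardize, then flatten back to the original 2-D shape.
X_scaled = ((X3 - means) / stds).reshape(X.shape)

# Inverse: undo the division and the centering.
scaled3 = X_scaled.reshape(-1, n_layers, hidden_dim)
X_restored = (scaled3 * stds + means).reshape(X.shape)

round_trip_ok = np.allclose(X_restored, X)
print(round_trip_ok)
```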

get_layer_stats

get_layer_stats() -> dict[str, np.ndarray]

Get per-layer statistics for analysis.

Returns:

Type Description
dict

Dictionary with layer-level statistics. Contents depend on strategy:

- "per_neuron": includes 'mean_norms', 'std_norms', 'mean_per_layer', 'std_per_layer'
- "per_layer": includes 'means', 'stds' (the raw per-layer values)

Raises:

Type Description
RuntimeError

If the scaler has not been fitted.
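For the "per_neuron" case, layer-level summaries have to be derived from (n_layers, hidden_dim) arrays. The sketch below shows one plausible way to do that in NumPy. The meaning of each key is an assumption made for illustration (L2 norm across neurons for the '*_norms' keys, plain average across neurons for the '*_per_layer' keys); the source does not define them, so check lmprobe's code before relying on this reading.

```python
import numpy as np

rng = np.random.default_rng(3)
n_layers, hidden_dim = 3, 4

# Stand-ins for fitted per-neuron statistics (hypothetical values).
means = rng.normal(size=(n_layers, hidden_dim))
stds = rng.uniform(0.5, 2.0, size=(n_layers, hidden_dim))

# ASSUMED interpretation of the key names; not confirmed by the docs.
summary = {
    "mean_norms": np.linalg.norm(means, axis=1),  # L2 norm per layer
    "std_norms": np.linalg.norm(stds, axis=1),
    "mean_per_layer": means.mean(axis=1),         # average over neurons
    "std_per_layer": stds.mean(axis=1),
}

for key, value in summary.items():
    print(key, value.shape)
```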