Layer Selection & Sweep

Different layers of a language model encode different information. Finding the right layer (or combination of layers) can be an impactful tuning decision.


Layer specification syntax

Spec            Description
16              Single layer (negative indexing: -1 = last)
[14, 15, 16]    Multiple layers (concatenated)
"middle"        Middle third of layers
"last"          Last layer only
"all"           All layers concatenated
"auto"          Automatic selection via Group Lasso (pip install lmprobe[auto])
"fast_auto"     Fast selection via coefficient importance
"sweep"         Train an independent probe per layer
"sweep:10"      Sweep every 10th layer
"sweep:55-65"   Sweep layers 55 through 65

Layer sweep

Train an independent probe at every layer to identify which are most informative:

result = Probe.sweep_layers(
    model="meta-llama/Llama-3.1-8B-Instruct",
    positive_prompts=positive_prompts,
    negative_prompts=negative_prompts,
    layers="all",
    classifier="ridge",
)

# Score all layers
scores = result.score(test_prompts, test_labels)
# {0: 0.52, 1: 0.55, ..., 31: 0.78}

# Best layer
best = result.best_layer(test_prompts, test_labels)
print(f"Best layer: {best}")

# Predict with a specific layer's probe
preds = result.probes[best].predict(test_prompts)

You can also use "sweep" as a layer spec string, which is useful when you want to sweep as part of a normal Probe workflow:

probe = Probe(model=model, layers="sweep")         # sweep all layers
probe = Probe(model=model, layers="sweep:10")      # every 10th layer
probe = Probe(model=model, layers="sweep:55-65")   # layers 55 through 65

Layer importance analysis

When using multiple layers (e.g., layers="all"), compute per-layer importance from the fitted classifier's coefficients:

probe = Probe(
    model="meta-llama/Llama-3.1-8B-Instruct",
    layers="all",
    classifier="ridge",
)

probe.fit(positive_prompts, negative_prompts)

importances = probe.compute_layer_importance(metric="l2")
# array([0.03, 0.05, ..., 0.42]) — shape (n_layers,), sums to 1.0
# Use probe.candidate_layers_ to map index → layer number
best_idx = importances.argmax()
print(f"Most important layer: {probe.candidate_layers_[best_idx]}")

Fast auto layer selection

Automatically select the most important layers using importance analysis, then refit on just those layers:

probe = Probe(
    model="meta-llama/Llama-3.1-8B-Instruct",
    layers="fast_auto",
    fast_auto_top_k=3,       # keep top 3 layers
    normalize_layers=True,
)

probe.fit(positive_prompts, negative_prompts)
print(f"Selected layers: {probe.selected_layers_}")

This is a two-stage process: first fit on all layers, then refit on the top-k layers by importance.
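The two-stage logic can be sketched end to end on synthetic data: fit once on everything, rank layers by coefficient norm, then refit on the top-k blocks. This is a conceptual illustration, not lmprobe's implementation; here two layers (2 and 5) carry the signal:

```python
import numpy as np
from sklearn.linear_model import RidgeClassifier

rng = np.random.default_rng(0)
n, n_layers, d = 400, 6, 10
y = rng.integers(0, 2, size=n)

# Synthetic per-layer features; layers 2 and 5 are informative
X_layers = rng.normal(size=(n, n_layers, d))
X_layers[:, 2, 0] += 1.5 * (2 * y - 1)
X_layers[:, 5, 1] += 1.5 * (2 * y - 1)

# Stage 1: fit on all layers concatenated, rank layers by coefficient L2 norm
X_all = X_layers.reshape(n, -1)
stage1 = RidgeClassifier().fit(X_all, y)
l2 = np.linalg.norm(stage1.coef_.ravel().reshape(n_layers, d), axis=1)

# Stage 2: keep the top-k layers and refit on just those
top_k = 2
selected = sorted(np.argsort(l2)[-top_k:].tolist())
X_sel = X_layers[:, selected, :].reshape(n, -1)
stage2 = RidgeClassifier().fit(X_sel, y)
```

The stage-2 probe sees only the selected layers' features, so its input dimensionality drops from `n_layers * d` to `top_k * d`.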


Automatic layer selection via Group Lasso

Use structured sparsity (Group Lasso) to let the optimizer choose which layers to keep. More principled than fast_auto but slower:

# Requires: pip install lmprobe[auto]
probe = Probe(
    model="meta-llama/Llama-3.1-8B-Instruct",
    layers="auto",
    auto_candidates=[0.25, 0.5, 0.75],  # fractional positions or explicit indices
    auto_alpha=0.01,                     # regularization strength
)

probe.fit(positive_prompts, negative_prompts)
print(f"Selected layers: {probe.selected_layers_}")

Practical guidance

Where to start:

  • Middle layers (12–20 in a 32-layer model) are often best for semantic properties
  • Last layer is usually best for surface/output-level properties
  • First few layers are mostly syntactic/positional

When to sweep:

  • When you have no prior knowledge about the task
  • When you want to verify your layer choice is principled

When to use "all" with concatenation:

  • When signal is distributed across many layers
  • Combined with PCA or Group Lasso to manage dimensionality

When to use fast_auto:

  • When you want a data-driven choice without the overhead of Group Lasso
  • Good default when you don't want to hand-tune layers