Remote Execution

lmprobe supports remote execution via nnsight and NDIF (National Deep Inference Fabric), allowing you to probe large models without local GPU resources.

US-only access

NDIF currently restricts access to US-based users, and lmprobe's remote functionality has not yet been fully integration-tested. See Known considerations below.


Setup

Install the nnsight extra:

pip install "lmprobe[nnsight]"

Set your API key:

export NDIF_API_KEY="your-api-key-here"
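Since remote calls fail without a key, it can help to verify the environment before constructing a probe. A minimal sketch, assuming the key is read from the NDIF_API_KEY variable shown above (ndif_key_present is a hypothetical helper, not part of lmprobe):

```python
import os


def ndif_key_present(env=None):
    """Return True if an NDIF API key is configured (hypothetical check)."""
    env = os.environ if env is None else env
    return bool(env.get("NDIF_API_KEY"))
```

An empty string counts as missing, so an accidentally blank export is caught too.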

Basic usage

from lmprobe import Probe

probe = Probe(
    model="meta-llama/Llama-3.1-70B-Instruct",
    layers="middle",
    backend="nnsight",
    remote=True,
)

probe.fit(positive_prompts, negative_prompts)
predictions = probe.predict(test_prompts)

Per-call remote override

You can set remote at the probe level and override it per call. A common pattern is to extract activations remotely during training (when the model is large), then run inference locally against a smaller model or cached activations:

probe = Probe(
    model="meta-llama/Llama-3.1-70B-Instruct",
    backend="nnsight",
    remote=True,
    layers=20,
)

probe.fit(positive_prompts, negative_prompts)  # extracts remotely

# Predict locally (uses cached activations if already computed)
predictions = probe.predict(new_prompts, remote=False)
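
The override can be thought of as a simple resolution rule: an explicit per-call value wins, otherwise the probe-level default applies. A sketch of that rule (resolve_remote is an illustrative name, not lmprobe's actual internal):

```python
def resolve_remote(call_remote, probe_remote):
    # Hypothetical resolution: an explicit per-call value takes precedence;
    # None falls back to the probe-level default.
    return probe_remote if call_remote is None else call_remote
```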

Caching with remote execution

Activations extracted remotely are cached locally using the same cache system as local execution. This means:

  • Re-running fit() or predict() with the same prompts hits the local cache
  • Remote calls happen only for new/uncached prompts
  • You can warm up the cache before a deadline:
probe.warmup(test_prompts, batch_size=4)  # cache everything remotely upfront
predictions = probe.predict(test_prompts)  # now runs fully from cache
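
For remote and local runs to share one cache, activations must be addressed by their inputs rather than by where they were computed. A minimal sketch of one way such a key could be derived; the actual scheme in lmprobe may differ:

```python
import hashlib
import json


def cache_key(model: str, layer: int, prompt: str) -> str:
    # Hypothetical scheme: keying by model, layer, and prompt lets
    # remotely and locally extracted activations share one cache namespace.
    payload = json.dumps(
        {"model": model, "layer": layer, "prompt": prompt}, sort_keys=True
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```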

Known considerations

  • Remote execution returns nnsight proxy tensors rather than direct tensors. lmprobe handles this via hasattr(act, "value") checks in extraction.py.
  • Network latency affects batch processing; use a smaller batch_size for remote calls.
  • Error handling for missing/invalid API keys raises an informative exception before any network calls are made.
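
The proxy check mentioned in the first bullet can be sketched as follows; unwrap is an illustrative name, and the real logic lives in extraction.py:

```python
def unwrap(act):
    # nnsight proxy tensors expose the materialized tensor on `.value`;
    # plain tensors pass through unchanged (illustrative sketch).
    return act.value if hasattr(act, "value") else act
```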

Backend parameter

The backend parameter controls the extraction engine:

Value      Description
"local"    HuggingFace Transformers (default)
"nnsight"  nnsight / NDIF remote execution

remote=True requires backend="nnsight". Setting remote=True with backend="local" will raise an error.
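
This constraint amounts to a small validation step at construction time. A sketch, with validate_backend as a hypothetical name for the check:

```python
def validate_backend(backend: str, remote: bool) -> None:
    # Mirrors the rule above: remote execution is only meaningful
    # with the nnsight backend.
    if remote and backend != "nnsight":
        raise ValueError('remote=True requires backend="nnsight"')
```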