# Remote Execution
lmprobe supports remote execution via nnsight and NDIF (National Deep Inference Fabric), allowing you to probe large models without local GPU resources.
> **US-only access.** NDIF currently restricts access to US-based users, and remote functionality has not been fully integration-tested. See Known considerations below.
## Setup
Install the nnsight extra:
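Assuming the extra is published under the name `nnsight` (as the text suggests):

```shell
pip install "lmprobe[nnsight]"
```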
Set your API key:
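One way to register an NDIF key is through nnsight's configuration helper; the placeholder key below must be replaced with your own:

```python
from nnsight import CONFIG

# Persist the key so subsequent remote calls authenticate automatically
CONFIG.set_default_api_key("YOUR_NDIF_API_KEY")
```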
## Basic usage
```python
from lmprobe import Probe

probe = Probe(
    model="meta-llama/Llama-3.1-70B-Instruct",
    layers="middle",
    backend="nnsight",
    remote=True,
)

probe.fit(positive_prompts, negative_prompts)
predictions = probe.predict(test_prompts)
```
## Per-call remote override
You can set `remote` at the probe level and override it per call. A common pattern: extract activations remotely during training (large model), then run inference locally on a smaller model or cached activations:
```python
probe = Probe(
    model="meta-llama/Llama-3.1-70B-Instruct",
    backend="nnsight",
    remote=True,
    layers=20,
)

probe.fit(positive_prompts, negative_prompts)  # extracts remotely

# Predict locally (uses cached activations if already computed)
predictions = probe.predict(new_prompts, remote=False)
```
## Caching with remote execution
Activations extracted remotely are cached locally using the same cache system as local execution. This means:
- Re-running `fit()` or `predict()` with the same prompts hits the local cache
- Remote calls happen only for new/uncached prompts
- You can warm up the cache before a deadline:
```python
probe.warmup(test_prompts, batch_size=4)  # cache everything remotely upfront
predictions = probe.predict(test_prompts)  # now runs fully from cache
```
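The cache behavior above amounts to a content-addressed store keyed by (model, layer, prompt). The sketch below illustrates the idea; the names are hypothetical, not lmprobe's internal API:

```python
import hashlib

def cache_key(model: str, layer: int, prompt: str) -> str:
    """Stable key for one (model, layer, prompt) activation."""
    payload = f"{model}|{layer}|{prompt}".encode()
    return hashlib.sha256(payload).hexdigest()

class ActivationCache:
    """In-memory stand-in for lmprobe's on-disk activation cache."""

    def __init__(self):
        self._store = {}

    def get_or_extract(self, key, extract):
        # Only call the (possibly remote) extractor on a cache miss
        if key not in self._store:
            self._store[key] = extract()
        return self._store[key]
```

With this structure, a repeated `predict()` on the same prompts never triggers a second remote call.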
## Known considerations
- Remote execution returns nnsight proxy tensors rather than direct tensors. `lmprobe` handles this via `hasattr(act, "value")` checks in `extraction.py`.
- Network latency affects batch processing. Use a smaller `batch_size` for remote calls.
- Missing or invalid API keys raise an informative exception before any network calls are made.
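The proxy check in the first bullet reduces to a small unwrapping pattern; this is a sketch of the idea, not `extraction.py`'s exact code:

```python
def unwrap(act):
    """Return the concrete tensor behind an nnsight proxy, else act unchanged."""
    return act.value if hasattr(act, "value") else act

class FakeProxy:
    """Stand-in for an nnsight proxy that exposes its result via .value."""
    def __init__(self, value):
        self.value = value
```

`unwrap(FakeProxy(t))` yields `t`, while a plain tensor passes through untouched, so downstream code never needs to know which backend produced the activation.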
## Backend parameter
The `backend` parameter controls the extraction engine:
| Value | Description |
|---|---|
| `"local"` | HuggingFace Transformers (default) |
| `"nnsight"` | nnsight / NDIF remote execution |
`remote=True` requires `backend="nnsight"`. Setting `remote=True` with `backend="local"` will raise an error.
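That compatibility rule could be enforced with a check like the following (a hypothetical helper, not lmprobe's actual implementation):

```python
def validate_backend(backend: str, remote: bool) -> None:
    """Reject remote=True unless the nnsight backend is selected."""
    if remote and backend != "nnsight":
        raise ValueError(
            f'remote=True requires backend="nnsight"; got backend="{backend}"'
        )
```

Failing fast here, before any model loading or network traffic, matches the early API-key validation described under Known considerations.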