# Activation Analysis
Capture and inspect hidden states at any layer. Understand what each layer contributes through activation statistics, entropy measurement, and sparsity analysis.
## Hook-Based Capture
The low-level API uses hooks to intercept the forward pass:
```python
from model_garage.core.hooks import HookManager
from model_garage.core.tensor import TensorUtils

with HookManager(model) as hooks:
    hooks.register_capture_hook("model.layers.15", hook_name="layer_15")
    model(input_ids)
    data = hooks.get_captured("layer_15")

    stats = TensorUtils.stats(data["output"])
    print(f"Mean: {stats['mean']:.4f}")
    print(f"Std: {stats['std']:.4f}")
    print(f"Sparsity: {stats['sparsity']:.2%}")
```
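Under the hood, this capture pattern is the standard PyTorch forward hook. A minimal self-contained sketch of the same mechanism (the toy model, layer index, and hook name here are illustrative assumptions, not model_garage internals):

```python
import torch
import torch.nn as nn

# Toy stand-in for a layer stack; a real transformer differs.
model = nn.Sequential(nn.Linear(8, 8), nn.ReLU(), nn.Linear(8, 8))

captured = {}

def make_hook(name):
    # Forward hooks receive (module, input, output); stash a detached copy.
    def hook(module, inputs, output):
        captured[name] = output.detach()
    return hook

# Register on the ReLU, run one forward pass, then remove the hook.
handle = model[1].register_forward_hook(make_hook("relu_out"))
_ = model(torch.randn(4, 8))
handle.remove()

out = captured["relu_out"]
print(tuple(out.shape))  # (4, 8)
print(f"sparsity: {(out == 0).float().mean().item():.2%}")
```

Removing the handle after the pass matters: a forgotten hook keeps firing on every forward call and leaks captured tensors.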
## Snapshot Capture
The high-level API captures multiple layers at once with computed statistics:
```python
from model_garage.snapshot.capture import SnapshotCapture

capture = SnapshotCapture(model)
snapshots = capture.run(
    input_ids,
    layers=["transformer.h.0", "transformer.h.6", "transformer.h.11"],
)

for name, snap in snapshots.items():
    print(f"{name}:")
    print(f"  Mean activation: {snap.mean_activation:.4f}")
    print(f"  Sparsity: {snap.sparsity:.2%}")
```
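The per-snapshot statistics can be reproduced by hand from raw activation values. A plain-Python sketch (here sparsity counts exact zeros; whether model_garage applies a near-zero threshold instead is not specified):

```python
def snapshot_stats(values):
    """Mean activation and sparsity for a flat list of activations."""
    n = len(values)
    mean = sum(values) / n
    sparsity = sum(1 for v in values if v == 0.0) / n  # fraction of zeros
    return {"mean": mean, "sparsity": sparsity}

stats = snapshot_stats([0.0, 2.0, 0.0, 4.0])
print(f"Mean activation: {stats['mean']:.4f}")  # 1.5000
print(f"Sparsity: {stats['sparsity']:.2%}")     # 50.00%
```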
## Tensor Utilities
TensorUtils provides common analysis operations:
```python
from model_garage.core.tensor import TensorUtils

# Full statistics
stats = TensorUtils.stats(tensor)
# Returns: mean, std, min, max, sparsity, norm, shape

# Cosine similarity between two hidden states
similarity = TensorUtils.cosine_sim(hidden_a, hidden_b)
```
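Cosine similarity measures directional agreement between two hidden states: dot(a, b) / (|a| |b|), which is 1.0 for parallel vectors and 0.0 for orthogonal ones. A reference implementation of that formula for flat vectors (a sketch, not the library's code):

```python
import math

def cosine_sim(a, b):
    """Cosine similarity between two flat vectors: dot(a, b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

print(cosine_sim([1.0, 0.0], [1.0, 0.0]))  # 1.0
print(cosine_sim([1.0, 0.0], [0.0, 1.0]))  # 0.0
```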
## Layer-by-Layer Analysis
Capture every layer to understand the transformation pipeline:
```python
all_layers = [f"transformer.h.{i}" for i in range(model.config.n_layer)]
capture = SnapshotCapture(model)
snapshots = capture.run(input_ids, layers=all_layers)

# Plot sparsity progression as a text bar chart
for name, snap in snapshots.items():
    layer_num = int(name.split('.')[-1])
    bar = '#' * int(snap.sparsity * 50)
    print(f"  Layer {layer_num:2d}: {bar} {snap.sparsity:.1%}")
```
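Once per-layer sparsity values are in hand, anomalous layers stand out as large jumps between neighbors. A sketch over hypothetical values (the numbers are invented for illustration):

```python
# Hypothetical per-layer sparsity values (illustration only).
sparsity_by_layer = [0.10, 0.12, 0.15, 0.45, 0.48, 0.50]

# Pair each layer with the sparsity change from its predecessor.
jumps = [
    (i + 1, later - earlier)
    for i, (earlier, later) in enumerate(
        zip(sparsity_by_layer, sparsity_by_layer[1:])
    )
]
layer, delta = max(jumps, key=lambda t: t[1])
print(f"Largest sparsity jump: layer {layer} (+{delta:.0%})")  # layer 3 (+30%)
```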
## CLI

```shell
# Analyze activations for a prompt
garage analyze gpt2 --prompt "The meaning of life is"

# Analyze specific layers
garage analyze gpt2 --prompt "Hello world" --layers 0,6,11
```
## Use Cases
- Interpretability — Understand what each layer learns
- Debugging — Find layers producing anomalous activations
- Pruning — Identify high-sparsity layers that can be compressed
- Research — Validate hypotheses about model behavior (see Sparse Pathways)
## Next Steps
- Extract components from interesting layers
- Inject modifications and measure their effect
- Read the Sparse Pathways paper for domain-specific neuron analysis