claudeindex
Plugin

agent-evaluation

Comprehensive evaluation framework for LLM agent systems. Multi-dimensional rubrics, LLM-as-judge with bias mitigation, pairwise comparison, direct scoring, confidence calibration, and continuous monitoring.

Installation

1

Add the marketplace

/plugin marketplace add viktorbezdek/skillstack
2

Install plugins

/plugin

Run these commands in Claude Code to add this plugin to your environment. The marketplace must be added before you can install its plugins.