Skill

bigcode-evaluation-harness

Covers LLM benchmarking and evaluation with lm-evaluation-harness, BigCode Evaluation Harness, and NeMo Evaluator. Use it when benchmarking models or measuring their performance.

Installation

1. Add the marketplace

/plugin marketplace add zechenzhangAGI/AI-research-SKILLs

2. Install plugins

/plugin

Run these commands in Claude Code to add this plugin to your environment. The marketplace must be added before you can install its plugins.