AI evaluation and benchmarking tools
skill-plugins
Adds a /dataset-generator skill that generates high-quality evaluation datasets from PDF documents, with adjustable difficulty levels, for testing and benchmarking RAG systems.