Skill

simpo

RLHF and preference alignment, covering TRL, GRPO, OpenRLHF, SimPO, verl, slime, miles, and torchforge. Use when aligning models with human preferences, training reward models, or running large-scale RL training.

Installation

Add the marketplace

/plugin marketplace add zechenzhangAGI/AI-research-SKILLs

Install plugins

/plugin

Run these commands in Claude Code to add this plugin to your environment. The marketplace must be added before you can install its plugins.

Details & Metadata

From Plugin

post-training

From Marketplace

ai-research-skills

Author

@zechenzhangAGI
