RLHF and preference alignment, including TRL, GRPO, OpenRLHF, SimPO, verl, slime, miles, and torchforge. Use when aligning models with human preferences, training reward models, or running large-scale RL training.
Add the marketplace
/plugin marketplace add zechenzhangAGI/AI-research-SKILLs
Install plugins
/plugin
Run these commands in Claude Code to add this plugin to your environment. The marketplace must be added before you can install its plugins.
From Plugin
post-training
From Marketplace
ai-research-skills
Author
@zechenzhangAGI