Marketplace

ai-research-skills

Comprehensive library of 89 AI research engineering skills enabling autonomous AI research from hypothesis to experimental verification

Stars

5,779

Forks

447

Plugins

Installation

Add the marketplace

/plugin marketplace add Orchestra-Research/AI-Research-SKILLs

Install plugins

/plugin

Run these commands in Claude Code to install plugins from this marketplace. Add the marketplace first; its plugins cannot be installed until it has been added.

Repository & Links

GitHub Repository

Orchestra-Research/AI-Research-SKILLs

Details & Metadata

Plugins

Skills

Agents

Owner

@Orchestra-Research

Last Crawled

March 29, 2026

Plugins

Plugin

model-architecture

LLM architectures and implementations including LitGPT, Mamba, NanoGPT, RWKV, and TorchTitan. Use when implementing, training, or understanding transformer and alternative architectures.

Plugin

tokenization

Text tokenization for LLMs including HuggingFace Tokenizers and SentencePiece. Use when training custom tokenizers or handling multilingual text.

Plugin

fine-tuning

LLM fine-tuning frameworks including Axolotl, LLaMA-Factory, PEFT, and Unsloth. Use when fine-tuning models with LoRA, QLoRA, or full fine-tuning.

Plugin

mechanistic-interpretability

Neural network interpretability tools including TransformerLens, SAELens, NNSight, and pyvene. Use when analyzing model internals, finding circuits, or understanding how models compute.

Plugin

data-processing

Data curation and processing at scale including NeMo Curator and Ray Data. Use when preparing training datasets or processing large-scale data.

Plugin

post-training

RLHF and preference alignment including TRL, GRPO, OpenRLHF, SimPO, verl, slime, miles, and torchforge. Use when aligning models with human preferences, training reward models, or large-scale RL training.

Plugin

safety-alignment

AI safety and content moderation including Constitutional AI, LlamaGuard, NeMo Guardrails, and Prompt Guard. Use when implementing safety filters, content moderation, or prompt injection detection.

Plugin

distributed-training

Multi-GPU and multi-node training including DeepSpeed, PyTorch FSDP, Accelerate, Megatron-Core, PyTorch Lightning, and Ray Train. Use when training large models across GPUs.

Plugin

infrastructure

GPU cloud and compute orchestration including Modal, Lambda Labs, and SkyPilot. Use when deploying training jobs or managing GPU resources.

Plugin

optimization

Model optimization and quantization including Flash Attention, bitsandbytes, GPTQ, AWQ, GGUF, and HQQ. Use when reducing memory, accelerating inference, or quantizing models.

Plugin

evaluation

LLM benchmarking and evaluation including lm-evaluation-harness, BigCode Evaluation Harness, and NeMo Evaluator. Use when benchmarking models or measuring performance.

Plugin

inference-serving

Production LLM inference including vLLM, TensorRT-LLM, llama.cpp, and SGLang. Use when deploying models for production inference.

Plugin

mlops

ML experiment tracking and lifecycle including Weights & Biases, MLflow, and TensorBoard. Use when tracking experiments or managing models.

Plugin

agents

LLM agent frameworks including LangChain, LlamaIndex, CrewAI, and AutoGPT. Use when building chatbots, autonomous agents, or tool-using systems.

Plugin

rag

Retrieval-Augmented Generation including Chroma, FAISS, Pinecone, Qdrant, and Sentence Transformers. Use when building semantic search or document retrieval systems.

Plugin

prompt-engineering

Structured LLM outputs including DSPy, Instructor, Guidance, and Outlines. Use when extracting structured data or constraining LLM outputs.

Plugin

observability

LLM application monitoring including LangSmith and Phoenix. Use when debugging LLM apps or monitoring production systems.

Plugin

multimodal

Vision, audio, and multimodal models including CLIP, Whisper, LLaVA, BLIP-2, Segment Anything, Stable Diffusion, AudioCraft, Cosmos Policy, OpenPI, and OpenVLA-OFT. Use when working with images, audio, multimodal tasks, or vision-language-action robot policies.

Plugin

emerging-techniques

Advanced ML techniques including MoE Training, Model Merging, Long Context, Speculative Decoding, Knowledge Distillation, and Model Pruning. Use when implementing cutting-edge optimization or architecture techniques.

Plugin

autoresearch

Autonomous research orchestration using a two-loop architecture. Manages the full research lifecycle from literature survey to paper writing, routing to domain-specific skills for execution. Use when starting a research project, running autonomous experiments, or managing multi-hypothesis research.

Plugin

ml-paper-writing

Write publication-ready ML/AI/Systems papers for NeurIPS, ICML, ICLR, ACL, AAAI, COLM, OSDI, NSDI, ASPLOS, and SOSP. Includes LaTeX templates, citation verification, reviewer guidelines, publication-quality figure generation, and writing best practices.

Plugin

ideation

Research ideation frameworks including structured brainstorming and creative thinking. Use when exploring new research directions, generating novel ideas, or seeking fresh angles on existing work.

Skills

Skill

litgpt

LLM architectures and implementations including LitGPT, Mamba, NanoGPT, RWKV, and TorchTitan. Use when implementing, training, or understanding transformer and alternative architectures.

From ai-research-skills/

Skill

mamba

LLM architectures and implementations including LitGPT, Mamba, NanoGPT, RWKV, and TorchTitan. Use when implementing, training, or understanding transformer and alternative architectures.

From ai-research-skills/

Skill

nanogpt

LLM architectures and implementations including LitGPT, Mamba, NanoGPT, RWKV, and TorchTitan. Use when implementing, training, or understanding transformer and alternative architectures.

From ai-research-skills/

Skill

rwkv

LLM architectures and implementations including LitGPT, Mamba, NanoGPT, RWKV, and TorchTitan. Use when implementing, training, or understanding transformer and alternative architectures.

From ai-research-skills/

Skill

torchtitan

LLM architectures and implementations including LitGPT, Mamba, NanoGPT, RWKV, and TorchTitan. Use when implementing, training, or understanding transformer and alternative architectures.

From ai-research-skills/

Skill

huggingface-tokenizers

Text tokenization for LLMs including HuggingFace Tokenizers and SentencePiece. Use when training custom tokenizers or handling multilingual text.

From ai-research-skills/

Skill

sentencepiece

Text tokenization for LLMs including HuggingFace Tokenizers and SentencePiece. Use when training custom tokenizers or handling multilingual text.

From ai-research-skills/

Skill

axolotl

LLM fine-tuning frameworks including Axolotl, LLaMA-Factory, PEFT, and Unsloth. Use when fine-tuning models with LoRA, QLoRA, or full fine-tuning.

From ai-research-skills/

Skill

llama-factory

LLM fine-tuning frameworks including Axolotl, LLaMA-Factory, PEFT, and Unsloth. Use when fine-tuning models with LoRA, QLoRA, or full fine-tuning.

From ai-research-skills/

Skill

peft

LLM fine-tuning frameworks including Axolotl, LLaMA-Factory, PEFT, and Unsloth. Use when fine-tuning models with LoRA, QLoRA, or full fine-tuning.

From ai-research-skills/

Skill

unsloth

LLM fine-tuning frameworks including Axolotl, LLaMA-Factory, PEFT, and Unsloth. Use when fine-tuning models with LoRA, QLoRA, or full fine-tuning.

From ai-research-skills/

Skill

nnsight

Neural network interpretability tools including TransformerLens, SAELens, NNSight, and pyvene. Use when analyzing model internals, finding circuits, or understanding how models compute.

From ai-research-skills/

Skill

pyvene

Neural network interpretability tools including TransformerLens, SAELens, NNSight, and pyvene. Use when analyzing model internals, finding circuits, or understanding how models compute.

From ai-research-skills/

Skill

saelens

Neural network interpretability tools including TransformerLens, SAELens, NNSight, and pyvene. Use when analyzing model internals, finding circuits, or understanding how models compute.

From ai-research-skills/

Skill

transformer-lens

Neural network interpretability tools including TransformerLens, SAELens, NNSight, and pyvene. Use when analyzing model internals, finding circuits, or understanding how models compute.

From ai-research-skills/

Skill

nemo-curator

Data curation and processing at scale including NeMo Curator and Ray Data. Use when preparing training datasets or processing large-scale data.

From ai-research-skills/

Skill

ray-data

Data curation and processing at scale including NeMo Curator and Ray Data. Use when preparing training datasets or processing large-scale data.

From ai-research-skills/

Skill

grpo-rl-training

From ai-research-skills/

Skill

miles

From ai-research-skills/

Skill

openrlhf

From ai-research-skills/

Skill

simpo

From ai-research-skills/

Skill

slime

From ai-research-skills/

Skill

torchforge

From ai-research-skills/

Skill

trl-fine-tuning

From ai-research-skills/

Skill

verl

From ai-research-skills/

Skill

constitutional-ai

AI safety and content moderation including Constitutional AI, LlamaGuard, NeMo Guardrails, and Prompt Guard. Use when implementing safety filters, content moderation, or prompt injection detection.

From ai-research-skills/

Skill

llamaguard

AI safety and content moderation including Constitutional AI, LlamaGuard, NeMo Guardrails, and Prompt Guard. Use when implementing safety filters, content moderation, or prompt injection detection.

From ai-research-skills/

Skill

nemo-guardrails

AI safety and content moderation including Constitutional AI, LlamaGuard, NeMo Guardrails, and Prompt Guard. Use when implementing safety filters, content moderation, or prompt injection detection.

From ai-research-skills/

Skill

prompt-guard

AI safety and content moderation including Constitutional AI, LlamaGuard, NeMo Guardrails, and Prompt Guard. Use when implementing safety filters, content moderation, or prompt injection detection.

From ai-research-skills/

Skill

accelerate

Multi-GPU and multi-node training including DeepSpeed, PyTorch FSDP, Accelerate, Megatron-Core, PyTorch Lightning, and Ray Train. Use when training large models across GPUs.

From ai-research-skills/

Skill

deepspeed

Multi-GPU and multi-node training including DeepSpeed, PyTorch FSDP, Accelerate, Megatron-Core, PyTorch Lightning, and Ray Train. Use when training large models across GPUs.

From ai-research-skills/

Skill

megatron-core

Multi-GPU and multi-node training including DeepSpeed, PyTorch FSDP, Accelerate, Megatron-Core, PyTorch Lightning, and Ray Train. Use when training large models across GPUs.

From ai-research-skills/

Skill

pytorch-fsdp2

Multi-GPU and multi-node training including DeepSpeed, PyTorch FSDP, Accelerate, Megatron-Core, PyTorch Lightning, and Ray Train. Use when training large models across GPUs.

From ai-research-skills/

Skill

pytorch-lightning

Multi-GPU and multi-node training including DeepSpeed, PyTorch FSDP, Accelerate, Megatron-Core, PyTorch Lightning, and Ray Train. Use when training large models across GPUs.

From ai-research-skills/

Skill

ray-train

Multi-GPU and multi-node training including DeepSpeed, PyTorch FSDP, Accelerate, Megatron-Core, PyTorch Lightning, and Ray Train. Use when training large models across GPUs.

From ai-research-skills/

Skill

lambda-labs

GPU cloud and compute orchestration including Modal, Lambda Labs, and SkyPilot. Use when deploying training jobs or managing GPU resources.

From ai-research-skills/

Skill

modal

GPU cloud and compute orchestration including Modal, Lambda Labs, and SkyPilot. Use when deploying training jobs or managing GPU resources.

From ai-research-skills/

Skill

skypilot

GPU cloud and compute orchestration including Modal, Lambda Labs, and SkyPilot. Use when deploying training jobs or managing GPU resources.

From ai-research-skills/

Skill

awq

Model optimization and quantization including Flash Attention, bitsandbytes, GPTQ, AWQ, GGUF, and HQQ. Use when reducing memory, accelerating inference, or quantizing models.

From ai-research-skills/

Skill

bitsandbytes

Model optimization and quantization including Flash Attention, bitsandbytes, GPTQ, AWQ, GGUF, and HQQ. Use when reducing memory, accelerating inference, or quantizing models.

From ai-research-skills/

Skill

flash-attention

Model optimization and quantization including Flash Attention, bitsandbytes, GPTQ, AWQ, GGUF, and HQQ. Use when reducing memory, accelerating inference, or quantizing models.

From ai-research-skills/

Skill

gguf

Model optimization and quantization including Flash Attention, bitsandbytes, GPTQ, AWQ, GGUF, and HQQ. Use when reducing memory, accelerating inference, or quantizing models.

From ai-research-skills/

Skill

gptq

Model optimization and quantization including Flash Attention, bitsandbytes, GPTQ, AWQ, GGUF, and HQQ. Use when reducing memory, accelerating inference, or quantizing models.

From ai-research-skills/

Skill

hqq

Model optimization and quantization including Flash Attention, bitsandbytes, GPTQ, AWQ, GGUF, and HQQ. Use when reducing memory, accelerating inference, or quantizing models.

From ai-research-skills/

Skill

bigcode-evaluation-harness

LLM benchmarking and evaluation including lm-evaluation-harness, BigCode Evaluation Harness, and NeMo Evaluator. Use when benchmarking models or measuring performance.

From ai-research-skills/

Skill

lm-evaluation-harness

LLM benchmarking and evaluation including lm-evaluation-harness, BigCode Evaluation Harness, and NeMo Evaluator. Use when benchmarking models or measuring performance.

From ai-research-skills/

Skill

nemo-evaluator

LLM benchmarking and evaluation including lm-evaluation-harness, BigCode Evaluation Harness, and NeMo Evaluator. Use when benchmarking models or measuring performance.

From ai-research-skills/

Skill

llama-cpp

Production LLM inference including vLLM, TensorRT-LLM, llama.cpp, and SGLang. Use when deploying models for production inference.

From ai-research-skills/

Skill

sglang

Production LLM inference including vLLM, TensorRT-LLM, llama.cpp, and SGLang. Use when deploying models for production inference.

From ai-research-skills/

Skill

tensorrt-llm

Production LLM inference including vLLM, TensorRT-LLM, llama.cpp, and SGLang. Use when deploying models for production inference.

From ai-research-skills/

Skill

vllm

Production LLM inference including vLLM, TensorRT-LLM, llama.cpp, and SGLang. Use when deploying models for production inference.

From ai-research-skills/

Skill

mlflow

ML experiment tracking and lifecycle including Weights & Biases, MLflow, and TensorBoard. Use when tracking experiments or managing models.

From ai-research-skills/

Skill

tensorboard

ML experiment tracking and lifecycle including Weights & Biases, MLflow, and TensorBoard. Use when tracking experiments or managing models.

From ai-research-skills/

Skill

weights-and-biases

ML experiment tracking and lifecycle including Weights & Biases, MLflow, and TensorBoard. Use when tracking experiments or managing models.

From ai-research-skills/

Skill

autogpt

LLM agent frameworks including LangChain, LlamaIndex, CrewAI, and AutoGPT. Use when building chatbots, autonomous agents, or tool-using systems.

From ai-research-skills/

Skill

crewai

LLM agent frameworks including LangChain, LlamaIndex, CrewAI, and AutoGPT. Use when building chatbots, autonomous agents, or tool-using systems.

From ai-research-skills/

Skill

langchain

LLM agent frameworks including LangChain, LlamaIndex, CrewAI, and AutoGPT. Use when building chatbots, autonomous agents, or tool-using systems.

From ai-research-skills/

Skill

llamaindex

LLM agent frameworks including LangChain, LlamaIndex, CrewAI, and AutoGPT. Use when building chatbots, autonomous agents, or tool-using systems.

From ai-research-skills/

Skill

chroma

Retrieval-Augmented Generation including Chroma, FAISS, Pinecone, Qdrant, and Sentence Transformers. Use when building semantic search or document retrieval systems.

From ai-research-skills/

Skill

faiss

Retrieval-Augmented Generation including Chroma, FAISS, Pinecone, Qdrant, and Sentence Transformers. Use when building semantic search or document retrieval systems.

From ai-research-skills/

Skill

pinecone

Retrieval-Augmented Generation including Chroma, FAISS, Pinecone, Qdrant, and Sentence Transformers. Use when building semantic search or document retrieval systems.

From ai-research-skills/

Skill

qdrant

Retrieval-Augmented Generation including Chroma, FAISS, Pinecone, Qdrant, and Sentence Transformers. Use when building semantic search or document retrieval systems.

From ai-research-skills/

Skill

sentence-transformers

Retrieval-Augmented Generation including Chroma, FAISS, Pinecone, Qdrant, and Sentence Transformers. Use when building semantic search or document retrieval systems.

From ai-research-skills/

Skill

dspy

Structured LLM outputs including DSPy, Instructor, Guidance, and Outlines. Use when extracting structured data or constraining LLM outputs.

From ai-research-skills/

Skill

guidance

Structured LLM outputs including DSPy, Instructor, Guidance, and Outlines. Use when extracting structured data or constraining LLM outputs.

From ai-research-skills/

Skill

instructor

Structured LLM outputs including DSPy, Instructor, Guidance, and Outlines. Use when extracting structured data or constraining LLM outputs.

From ai-research-skills/

Skill

outlines

Structured LLM outputs including DSPy, Instructor, Guidance, and Outlines. Use when extracting structured data or constraining LLM outputs.

From ai-research-skills/

Skill

langsmith

LLM application monitoring including LangSmith and Phoenix. Use when debugging LLM apps or monitoring production systems.

From ai-research-skills/

Skill

phoenix

LLM application monitoring including LangSmith and Phoenix. Use when debugging LLM apps or monitoring production systems.

From ai-research-skills/

Skill

audiocraft

From ai-research-skills/

Skill

blip-2

From ai-research-skills/

Skill

clip

From ai-research-skills/

Skill

cosmos-policy

From ai-research-skills/

Skill

llava

From ai-research-skills/

Skill

openpi

From ai-research-skills/

Skill

openvla-oft

From ai-research-skills/

Skill

segment-anything

From ai-research-skills/

Skill

stable-diffusion

From ai-research-skills/

Skill

whisper

From ai-research-skills/

Skill

knowledge-distillation

From ai-research-skills/

Skill

long-context

From ai-research-skills/

Skill

model-merging

From ai-research-skills/

Skill

model-pruning

From ai-research-skills/

Skill

moe-training

From ai-research-skills/

Skill

speculative-decoding

From ai-research-skills/

Skill

0-autoresearch-skill

From ai-research-skills/

Skill

ml-paper-writing

From ai-research-skills/

Skill

academic-plotting

From ai-research-skills/

Skill

brainstorming-research-ideas

Research ideation frameworks including structured brainstorming and creative thinking. Use when exploring new research directions, generating novel ideas, or seeking fresh angles on existing work.

From ai-research-skills/

Skill

creative-thinking-for-research

Research ideation frameworks including structured brainstorming and creative thinking. Use when exploring new research directions, generating novel ideas, or seeking fresh angles on existing work.

From ai-research-skills/
