Add the marketplace

/plugin marketplace add ascend-ai-coding/awesome-ascend-skills

Install plugins

/plugin

Run these commands in Claude Code to add this plugin to your environment. The marketplace must be added before you can install its plugins.
Sets up the op-plugin (torch_npu operator plugin) environment and guides custom NPU operator integration with PyTorch via two generic patterns (no-workspace vs. workspace + tiling), from kernel implementation through host registration, build, and test.
Diffusers skill set for Ascend NPU, covering environment setup, weight preparation, and pipeline inference. Use when the user needs to run Diffusers inference on Ascend NPU.
Environment checking, deployment, and capability guidance for Huawei's Ascend Extension for PyTorch (torch_npu). Applied automatically when the user mentions @torch_npu, Ascend NPU, or CANN, or needs to migrate PyTorch code to NPU; when the user invokes @torch_npu_doc, provides descriptions of the project's in-repo Chinese documentation based on the skill's reference docs.
HuggingFace Diffusers environment setup guide for Huawei Ascend NPU. Covers CANN version detection, PyTorch + torch_npu installation, Diffusers library installation, and environment verification. Use when the user needs to set up a Diffusers environment on Ascend NPU.
Diffusers model weight preparation tool for Huawei Ascend NPU. Supports downloading model weights from HuggingFace and ModelScope, and generating fake weights from a config.json for workflow validation. Use when the user needs to download Diffusers model weights or generate test weights.
Huawei Ascend NPU npu-smi command reference. Use for device queries (health, temperature, power, memory, processes, ECC), configuration (thresholds, modes, fan), firmware upgrades (MCU, bootloader, VRD), virtualization (vNPU), and certificate management.
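As a quick orientation, two of the most common query forms look like this. This is a hedged sketch: `npu-smi info` is the standard board overview, while the `-t temp -i 0` attribute query is an assumption to verify against your driver's `npu-smi` help output.

```shell
# Guarded so the script degrades gracefully on hosts without the Ascend driver.
if command -v npu-smi >/dev/null 2>&1; then
  npu-smi info                # board overview: health, power, memory, processes
  npu-smi info -t temp -i 0   # single-attribute query: temperature of NPU 0
else
  echo "npu-smi not installed"
fi
```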
HCCL (Huawei Collective Communication Library) performance testing for Ascend NPU clusters. Use for testing distributed communication bandwidth, verifying HCCL functionality, and benchmarking collective operations like AllReduce, AllGather, AlltoAll.
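When reading AllReduce results, throughput is conventionally reported as "bus bandwidth" rather than raw buffer bytes per second. The formula below is the nccl-tests convention for ring AllReduce; that HCCL's test tools report the same metric is an assumption to check against their output.

```python
def allreduce_bus_bandwidth(bytes_per_rank: float, n_ranks: int, seconds: float) -> float:
    """Bus bandwidth for ring AllReduce: each rank moves 2*(n-1)/n
    times its buffer size over the interconnect."""
    algo_bw = bytes_per_rank / seconds              # naive algorithm bandwidth, B/s
    return algo_bw * 2 * (n_ranks - 1) / n_ranks

# Example: 64 MiB per rank across 8 NPUs completing in 5 ms.
bw = allreduce_bus_bandwidth(64 * 2**20, 8, 0.005)
print(f"bus bandwidth: {bw / 1e9:.2f} GB/s")
```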
Complete toolkit for Huawei Ascend NPU model conversion and inference. Convert ONNX models to .om format using ATC tool, run Python inference on OM models using ais_bench, compare precision between CPU ONNX and NPU OM outputs, and end-to-end YOLO inference with Ultralytics.
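The precision-comparison step can be illustrated hardware-free: a common check is cosine similarity between the CPU ONNX output and the NPU OM output on the same input. The sketch below uses stand-in arrays; in practice the two tensors would come from onnxruntime and ais_bench respectively (the exact metric the toolkit uses is an assumption).

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Flatten both outputs and compare direction; values near 1.0 mean the
    OM model closely matches the ONNX reference."""
    a, b = a.ravel().astype(np.float64), b.ravel().astype(np.float64)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Stand-in outputs; real ones come from onnxruntime (CPU) and ais_bench (NPU).
cpu_out = np.array([0.10, 0.90, 0.05], dtype=np.float32)
npu_out = np.array([0.11, 0.89, 0.05], dtype=np.float32)  # small fp16-style drift
print(f"cosine similarity: {cosine_similarity(cpu_out, npu_out):.6f}")
```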
Create Docker containers for Huawei Ascend NPU development with proper device mappings (davinci_manager, devmm_svm, hisi_hdc) and volume mounts (driver, sbin, home). Use when setting up Ascend development environments in Docker, running CANN applications in containers, or creating isolated NPU development workspaces.
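The steps above can be sketched as a single `docker run` invocation. The device nodes and mounts are the ones the plugin description names (davinci_manager, devmm_svm, hisi_hdc; driver/sbin/home); `/dev/davinci0` stands in for one NPU card (add one `--device` flag per card) and the image name is a placeholder.

```shell
# Assemble the launch command; only execute it on a host that has NPU device nodes.
docker_cmd=(docker run -it --rm
  --device=/dev/davinci0
  --device=/dev/davinci_manager
  --device=/dev/devmm_svm
  --device=/dev/hisi_hdc
  -v /usr/local/Ascend/driver:/usr/local/Ascend/driver:ro
  -v /usr/local/sbin:/usr/local/sbin:ro
  -v "$HOME":/workspace
  ubuntu:22.04 bash)
printf '%s ' "${docker_cmd[@]}"; echo          # show the assembled command
[ -e /dev/davinci_manager ] && "${docker_cmd[@]}" || true
```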
Huawei Ascend NPU model compression tool for LLM, MoE, and multimodal models. Supports W4A8, W8A8, W8A8S, W8A16, W8A8C8 quantization and sparse quantization. Compatible with 20+ model families (Qwen, DeepSeek, LLaMA, GLM, Kimi, Baichuan, Yi, InternLM, Mistral, etc.). Includes precision auto-tuning, custom model integration guide, and vLLM-Ascend/MindIE deployment.
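To unpack the naming: "W8A8" means both weights and activations are quantized to 8 bits, "W8A16" keeps activations in 16 bits, and so on. The sketch below shows the simplest symmetric per-tensor int8 scheme purely as an illustration of the idea; the tool's actual algorithms (and the sparse/W4A8 variants) are considerably more sophisticated.

```python
import numpy as np

def quantize_int8(x: np.ndarray):
    """Symmetric per-tensor int8 quantization: map [-max|x|, max|x|] to [-127, 127]."""
    scale = np.abs(x).max() / 127.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

np.random.seed(0)
w = np.random.randn(4, 4).astype(np.float32)   # stand-in weight tensor
q, s = quantize_int8(w)
err = np.abs(q.astype(np.float32) * s - w).max()
print(f"scale={s:.6f}, max reconstruction error={err:.6f}")
```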
AI model evaluation tool for Ascend NPU. Supports accuracy evaluation (text, multimodal datasets), performance evaluation (latency, throughput, stress testing), vLLM/Triton inference, 15+ benchmarks (MMLU, GSM8K, MMMU, BFCL), multi-turn dialogue, Function Call, and custom datasets.
AscendC development helper, provides AscendC development environment, compiler, and debugger. Supports C/C++ and Python programming on Ascend NPU. Includes sample code, tutorials, and best practices for AscendC development.
vLLM inference engine for Huawei Ascend NPU. Deploy LLMs with OpenAI-compatible API, offline batch inference, quantized model serving (W4A8, W8A8), tensor/pipeline parallelism for distributed inference, and performance optimization. Supports Qwen, DeepSeek, GLM, LLaMA models with Ascend-optimized kernels.
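A minimal client-side sketch of the OpenAI-compatible API: the endpoint path and payload shape follow the OpenAI chat-completions convention that vLLM implements, while the model name, host, and port are placeholders for whatever the server was started with.

```python
import json
from urllib import request

# Placeholder model and address; assumes a vLLM server is already running
# with its OpenAI-compatible API on localhost:8000.
payload = {
    "model": "Qwen/Qwen2.5-7B-Instruct",
    "messages": [{"role": "user", "content": "Say hello."}],
    "max_tokens": 32,
}
req = request.Request(
    "http://localhost:8000/v1/chat/completions",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
# resp = request.urlopen(req)  # uncomment on a host where the server is up
```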