vllm-ascend

vLLM inference engine for Huawei Ascend NPUs. Deploy LLMs with an OpenAI-compatible API, run offline batch inference, serve quantized models (W4A8, W8A8), scale out with tensor/pipeline parallelism for distributed inference, and tune performance. Supports Qwen, DeepSeek, GLM, and LLaMA models with Ascend-optimized kernels.
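Once a model is served, clients talk to it through the standard OpenAI-compatible endpoint. A minimal sketch of building such a request body, assuming the server listens at a hypothetical localhost URL and serves a hypothetical Qwen model (substitute your own host, port, and model name):

```python
import json

# Assumed endpoint and model name for illustration only; replace with the
# host/port and model you actually serve.
BASE_URL = "http://localhost:8000/v1"
MODEL = "Qwen/Qwen2.5-7B-Instruct"

def build_chat_request(prompt: str, model: str = MODEL) -> dict:
    """Build an OpenAI-compatible /chat/completions request body."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 64,
    }

payload = build_chat_request("What is an NPU?")
print(json.dumps(payload, indent=2))
```

POST this JSON to `BASE_URL` + `/chat/completions` with any HTTP client; because the API is OpenAI-compatible, existing OpenAI SDKs also work by pointing their base URL at the server.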

Installation

1. Add the marketplace:

   /plugin marketplace add ascend-ai-coding/awesome-ascend-skills

2. Install plugins:

   /plugin

Run these commands in Claude Code to add this plugin to your environment. The marketplace must be added before you can install its plugins.