
MLC-LLM: Deploying Large Language Models Efficiently Across Devices

schema · 2026-02-12 19:03:56
# MLC LLM [![Documentation](https://img.shields.io/badge/文档-最新-green)](https://llm.mlc.ai/docs/) [![License](https://img.shields.io/badge/许可证-apache_2-blue)](https://github.com/mlc-ai/mlc-llm/blob/main/LICENSE) [![Join Discord](https://img.shields.io/badge/加入-Discord-7289DA?logo=discord&logoColor=white)](https://discord.gg/9Xpy2HGBuD) [![Related Repository: WebLLM](https://img.shields.io/badge/相关仓库-WebLLM-fafbfc?logo=github)](https://github.com/mlc-ai/web-llm/) **Universal LLM deployment engine with machine learning compilation** [Get Started](https://llm.mlc.ai/docs/get_started/quick_start) | [Documentation](https://llm.mlc.ai/docs) | [Blog](https://blog.mlc.ai/)

## About

MLC LLM is a machine learning compiler and high-performance deployment engine for large language models. The mission of this project is to enable everyone to develop, optimize, and deploy AI models natively on their own platforms.

|              | AMD GPU                  | NVIDIA GPU              | Apple GPU | Intel GPU         |
|--------------|--------------------------|-------------------------|-----------|-------------------|
| Linux / Win  | ✅ Vulkan, ROCm          | ✅ Vulkan, CUDA         | N/A       | ✅ Vulkan         |
| macOS        | ✅ Metal (dGPU)          | N/A                     | ✅ Metal  | ✅ Metal (iGPU)   |
| Web Browser  | ✅ WebGPU and WASM (all) |                         |           |                   |
| iOS / iPadOS |                          |                         | ✅ Metal on Apple A-series GPU | |
| Android      | ✅ OpenCL on Adreno GPU  | ✅ OpenCL on Mali GPU   |           |                   |

MLC LLM compiles and runs code on MLCEngine, a unified high-performance LLM inference engine that spans all of the platforms above. MLCEngine provides an OpenAI-compatible API, accessible through a REST server, Python, JavaScript, iOS, Android, and more, all backed by the same engine and compiler that we continuously improve together with the community.
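Because the API is OpenAI-compatible, a client only needs to construct a standard chat-completions request body. The sketch below builds such a body in plain Python; the model id and the `/v1/chat/completions` route are illustrative examples of what an MLC LLM REST server would accept, not values pinned down by this page.

```python
import json

# Sketch of an OpenAI-compatible chat request body, as a server started
# with `mlc_llm serve` would accept on its /v1/chat/completions route.
# The model id below is an illustrative assumption, not a fixed value.
def build_chat_request(model: str, prompt: str, stream: bool = False) -> dict:
    """Build a request body for an OpenAI-compatible chat endpoint."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": stream,
    }

body = build_chat_request(
    "HF://mlc-ai/Llama-3-8B-Instruct-q4f16_1-MLC",  # hypothetical model id
    "What is MLC LLM?",
)
print(json.dumps(body, indent=2))
```

The same body can then be POSTed with any HTTP client, or the equivalent call can be made through the Python, JavaScript, iOS, or Android bindings, since they all sit on top of the same MLCEngine.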

## Get Started

Please visit our [documentation](https://llm.mlc.ai/docs) to get started with MLC LLM.
- Installation Guide
- Quick Start
- Introduction to MLC LLM

## Citation

If you find this project useful, please consider citing:

```bibtex
@software{mlc-llm,
    author = {{MLC team}},
    title = {{MLC-LLM}},
    url = {https://github.com/mlc-ai/mlc-llm},
    year = {2023-2025}
}
```

The underlying techniques that MLC LLM builds on include:

```bibtex
@inproceedings{tensorir,
    author = {Feng, Siyuan and Hou, Bohan and Jin, Hongyi and Lin, Wuwei and Shao, Junru and Lai, Ruihang and Ye, Zihao and Zheng, Lianmin and Yu, Cody Hao and Yu, Yong and Chen, Tianqi},
    title = {TensorIR: An Abstraction for Automatic Tensorized Program Optimization},
    year = {2023},
    isbn = {9781450399166},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/3575693.3576933},
    doi = {10.1145/3575693.3576933},
    booktitle = {Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 2},
    pages = {804--817},
    numpages = {14},
    keywords = {Tensor Computation, Machine Learning Compiler, Deep Neural Network},
    location = {Vancouver, BC, Canada},
    series = {ASPLOS 2023}
}

@inproceedings{metaschedule,
    author = {Shao, Junru and Zhou, Xiyou and Feng, Siyuan and Hou, Bohan and Lai, Ruihang and Jin, Hongyi and Lin, Wuwei and Masuda, Masahiro and Yu, Cody Hao and Chen, Tianqi},
    booktitle = {Advances in Neural Information Processing Systems},
    editor = {S. Koyejo and S. Mohamed and A. Agarwal and D. Belgrave and K. Cho and A. Oh},
    pages = {35783--35796},
    publisher = {Curran Associates, Inc.},
    title = {Tensor Program Optimization with Probabilistic Programs},
    url = {https://proceedings.neurips.cc/paper_files/paper/2022/file/e894eafae43e68b4c8dfdacf742bcbf3-Paper-Conference.pdf},
    volume = {35},
    year = {2022}
}

@inproceedings{tvm,
    author = {Tianqi Chen and Thierry Moreau and Ziheng Jiang and Lianmin Zheng and Eddie Yan and Haichen Shen and Meghan Cowan and Leyuan Wang and Yuwei Hu and Luis Ceze and Carlos Guestrin and Arvind Krishnamurthy},
    title = {{TVM}: An Automated {End-to-End} Optimizing Compiler for Deep Learning},
    booktitle = {13th USENIX Symposium on Operating Systems Design and Implementation (OSDI 18)},
    year = {2018},
    isbn = {978-1-939133-08-3},
    address = {Carlsbad, CA},
    pages = {578--594},
    url = {https://www.usenix.org/conference/osdi18/presentation/chen},
    publisher = {USENIX Association},
    month = oct
}
```