LightGBM: An Efficient Distributed Gradient Boosting Tree Algorithm and Large-Scale Training Framework

air · 2026-03-05 14:06:36

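[!NOTE]
This project was migrated from Microsoft/LightGBM to lightgbm-org/LightGBM in March 2026.
This repository is still the official LightGBM source code, managed by the same maintainers (including the creator of LightGBM).
For details, see: https://github.com/lightgbm-org/LightGBM/issues/7187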

LightGBM

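LightGBM is a gradient boosting framework that uses tree-based learning algorithms. It is designed to be distributed and efficient, with the following advantages:

Faster training speed and higher efficiency.
Lower memory usage.
Better accuracy.
Support for parallel, distributed, and GPU learning.
Capable of handling large-scale data.

For further details, see Features.

Thanks to these advantages, LightGBM is widely used in winning solutions of many machine learning competitions.

Comparison experiments on public datasets show that LightGBM can outperform existing boosting frameworks in both efficiency and accuracy, with significantly lower memory consumption. In addition, distributed learning experiments show that, in specific settings, LightGBM can achieve a linear speed-up by training on multiple machines.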

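Get Started and Documentation

Our primary documentation is at https://lightgbm.readthedocs.io/ and is generated from this repository. If you are new to LightGBM, follow the installation instructions on that site.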

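Next you may want to read:

Examples showing command-line usage for common tasks.
Features and algorithms supported by LightGBM.
Parameters, an exhaustive list of the customizations you can make.
Distributed Learning and GPU Learning, which can speed up computation.
FLAML provides automated tuning for LightGBM (code examples).
Optuna Hyperparameter Tuner provides automated tuning for LightGBM hyperparameters (code examples).
Understanding LightGBM Parameters (and How to Tune Them using Neptune).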

Documentation for contributors:

How we update readthedocs.io.
See the Development Guide.

News

See the changelogs on the GitHub releases page.

External (Unofficial) Repositories

The projects listed here offer alternative ways to use LightGBM. They are not maintained or officially endorsed by the LightGBM development team.

JPMML (Java PMML converter): https://github.com/jpmml/jpmml-lightgbm

Nyoka (Python PMML converter): https://github.com/SoftwareAG/nyoka

Treelite (model compiler for efficient deployment): https://github.com/dmlc/treelite

lleaves (LLVM-based model compiler for efficient inference): https://github.com/siboehm/lleaves

Hummingbird (compiles models into tensor computations): https://github.com/microsoft/hummingbird

GBNet (use LightGBM as a PyTorch module): https://github.com/mthorrell/gbnet

cuML Forest Inference Library (GPU-accelerated inference): https://github.com/rapidsai/cuml

daal4py (Intel CPU-accelerated inference): https://github.com/intel/scikit-learn-intelex/tree/master/daal4py

m2cgen (model appliers for various languages): https://github.com/BayesWitnesses/m2cgen

leaves (Go model applier): https://github.com/dmitryikh/leaves

ONNXMLTools (ONNX converter): https://github.com/onnx/onnxmltools

SHAP (model output explainer): https://github.com/slundberg/shap

Shapash (model visualization and interpretation): https://github.com/MAIF/shapash

dtreeviz (decision tree visualization and model interpretation): https://github.com/parrt/dtreeviz

supertree (interactive visualization of decision trees): https://github.com/mljar/supertree

SynapseML (LightGBM on Spark): https://github.com/microsoft/SynapseML

Kubeflow Fairing (LightGBM on Kubernetes): https://github.com/kubeflow/fairing

Kubeflow Operator (LightGBM on Kubernetes): https://github.com/kubeflow/xgboost-operator

lightgbm_ray (LightGBM on Ray): https://github.com/ray-project/lightgbm_ray

Ray (distributed computing framework): https://github.com/ray-project/ray

Mars (LightGBM on Mars): https://github.com/mars-project/mars

ML.NET (.NET/C# package): https://github.com/dotnet/machinelearning

LightGBM.NET (.NET/C# package): https://github.com/rca22/LightGBM.Net

LightGBM Ruby (Ruby gem): https://github.com/ankane/lightgbm-ruby

LightGBM4j (high-level Java binding): https://github.com/metarank/lightgbm4j

LightGBM4J (JVM interface for LightGBM written in Scala): https://github.com/seek-oss/lightgbm4j

Julia package: https://github.com/IQVIA-ML/LightGBM.jl

lightgbm3 (Rust binding): https://github.com/Mottl/lightgbm3-rs

MLServer (inference server for LightGBM): https://github.com/SeldonIO/MLServer

MLflow (experiment tracking and model monitoring framework): https://github.com/mlflow/mlflow

FLAML (AutoML library for hyperparameter optimization): https://github.com/microsoft/FLAML

MLJAR AutoML (AutoML for tabular data): https://github.com/mljar/mljar-supervised

Optuna (hyperparameter optimization framework): https://github.com/optuna/optuna

LightGBMLSS (probabilistic modeling with LightGBM): https://github.com/StatMixedML/LightGBMLSS

mlforecast (time series forecasting with LightGBM): https://github.com/Nixtla/mlforecast

skforecast (time series forecasting with LightGBM): https://github.com/JoaquinAmatRodrigo/skforecast

{bonsai} (R {parsnip}-compliant interface): https://github.com/tidymodels/bonsai

{mlr3extralearners} (R {mlr3}-compliant interface): https://github.com/mlr-org/mlr3extralearners

lightgbm-transform (feature transformation binding): https://github.com/lightgbm-org/LightGBM-transform

postgresml (LightGBM training and prediction in SQL, via a Postgres extension): https://github.com/postgresml/postgresml

pyodide (run the lightgbm Python package in a web browser): https://github.com/pyodide/pyodide

vaex-ml (Python DataFrame library with its own interface to LightGBM): https://github.com/vaexio/vaex

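Support

Ask questions on Stack Overflow with the lightgbm tag; we monitor it for new questions.
Open bug reports and feature requests on GitHub issues.

How to Contribute

See the CONTRIBUTING page.

Microsoft Open Source Code of Conduct

This project has adopted the Microsoft Open Source Code of Conduct. For more information, see the Code of Conduct FAQ, or contact opencode@microsoft.com with any additional questions or comments.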

Reference Papers

Yu Shi, Guolin Ke, Zhuoming Chen, Shuxin Zheng, Tie-Yan Liu. "Quantized Training of Gradient Boosting Decision Trees". Advances in Neural Information Processing Systems 35 (NeurIPS 2022), pp. 18822-18833.

Guolin Ke, Qi Meng, Thomas Finley, Taifeng Wang, Wei Chen, Weidong Ma, Qiwei Ye, Tie-Yan Liu. "LightGBM: A Highly Efficient Gradient Boosting Decision Tree". Advances in Neural Information Processing Systems 30 (NIPS 2017), pp. 3149-3157.

Qi Meng, Guolin Ke, Taifeng Wang, Wei Chen, Qiwei Ye, Zhi-Ming Ma, Tie-Yan Liu. "A Communication-Efficient Parallel Algorithm for Decision Tree". Advances in Neural Information Processing Systems 29 (NIPS 2016), pp. 1279-1287.

Huan Zhang, Si Si and Cho-Jui Hsieh. "GPU Acceleration for Large-scale Tree Boosting". SysML Conference, 2018.

License

This project is licensed under the terms of the MIT license. See LICENSE for additional details.

Project URL: https://github.com/lightgbm-org/LightGBM
