llmware




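🧰🛠️ Unified framework for building knowledge-based, local, private, secure LLM applications
llmware is optimized for AI PC and local laptop, edge and self-hosted deployment across a wide range of Windows, Mac and Linux platforms. It supports GGUF, OpenVINO, ONNXRuntime, ONNXRuntime-QNN (Qualcomm), WindowsLocalFoundry and PyTorch, and provides a high-level interface that makes it easy to use the right inference technology for the target platform.
llmware has two main components: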
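- Model Catalog with 300+ models - models are prepackaged in quantized, optimized formats to take advantage of on-device GPU and NPU capabilities. The catalog covers the major open-source model families as well as 50+ llmware fine-tuned SLIM, Bling, Dragon and Industry-Bert models specialized for key tasks in enterprise process automation. Leading cloud models from OpenAI, Anthropic and Google are also supported.
- RAG Pipeline - integrated components for the full lifecycle of connecting knowledge sources to generative AI models, with extensive document parsing and ingestion capabilities and the ability to create scalable knowledge bases.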
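By bringing these two components together, llmware offers a comprehensive set of tools to rapidly build knowledge-based enterprise LLM applications.
Our vision is that AI should be sustainable, accurate and cost-effective, using the smallest compute footprint that gets the job done.
In fact, all of our examples and models can run on-device - get started right away on your laptop.
Join us on Discord | Watch YouTube Tutorials | Explore our Model Families on Hugging Face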
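🎯 Key features
Writing code with llmware is based on a few main concepts:
Model Catalog: access all models the same way with easy lookup, regardless of the underlying implementation.
# 300+ models in the catalog, including 50+ RAG-optimized BLING, DRAGON and Industry BERT models
# Full support for GGUF, OpenVINO, Onnxruntime, HuggingFace, Sentence Transformers and major API-based models
# Easy to extend to add custom models - see examples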
from llmware.models import ModelCatalog
from llmware.prompts import Prompt
# all models are accessed through the ModelCatalog
models = ModelCatalog().list_all_models()
# to use any model in the ModelCatalog, call the "load_model" method and pass the model_name parameter
my_model = ModelCatalog().load_model("llmware/bling-phi-3-gguf")
# call the model for inference
output = my_model.inference("what is the future of AI?", add_context="Here is the article to read")
# call the model with streaming output
for token in my_model.stream("What is the future of AI?"):
    print(token, end="")
# integrate a model into a Prompt
prompter = Prompt().load_model("llmware/bling-tiny-llama-v0")
response = prompter.prompt_main("what is the future of AI?", context="Insert Sources of information")
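The idea of reaching every model through one lookup, whatever backend sits behind it, can be sketched without llmware installed. The following is a toy registry for illustration only, not llmware's implementation: `ToyCatalog`, `EchoModel`, `UpperModel` and the `toy/...` model names are all hypothetical.

```python
from typing import Callable, Dict, List


class EchoModel:
    """Hypothetical stand-in for one backend (for example, a GGUF runtime)."""

    def inference(self, prompt: str, add_context: str = "") -> dict:
        return {"llm_response": f"echo: {prompt}", "context_chars": len(add_context)}


class UpperModel:
    """Hypothetical stand-in for a different backend with the same interface."""

    def inference(self, prompt: str, add_context: str = "") -> dict:
        return {"llm_response": prompt.upper(), "context_chars": len(add_context)}


class ToyCatalog:
    """Maps model names to loader callables so callers never touch backend classes."""

    def __init__(self) -> None:
        self._loaders: Dict[str, Callable[[], object]] = {}

    def register(self, model_name: str, loader: Callable[[], object]) -> None:
        self._loaders[model_name] = loader

    def list_all_models(self) -> List[str]:
        return sorted(self._loaders)

    def load_model(self, model_name: str):
        if model_name not in self._loaders:
            raise KeyError(f"unknown model: {model_name}")
        return self._loaders[model_name]()


catalog = ToyCatalog()
catalog.register("toy/echo", EchoModel)
catalog.register("toy/upper", UpperModel)

# The calling code is identical for both backends.
for name in catalog.list_all_models():
    model = catalog.load_model(name)
    print(name, "->", model.inference("hello", add_context="some context"))
```

Because both stand-in classes expose the same `inference` signature, swapping one model for another only changes the name passed to `load_model`.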
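Library: ingest, organize and index a collection of knowledge at scale - parse, text chunk and embed.
from llmware.library import Library
# parse and text chunk a set of documents (pdf, pptx, docx, xlsx, txt, csv, md, json/jsonl, wav, png, jpg, html)
# step 1 - create a library, which is the "knowledge-base container" construct
#        - libraries have both text collection (DB) resources and file resources (e.g., llmware_data/accounts/{library_name})
#        - embeddings and queries are run against a library
lib = Library().create_new_library("my_library")
# step 2 - add_files is the universal ingestion function - point it at a local folder with mixed file types
#        - files are routed by file extension to the correct parser, then parsed, text chunked and indexed in the text collection DB
lib.add_files("/folder/path/to/my/files")
# install an embedding on the library - pick an embedding model and a vector DB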
lib.install_new_embedding(embedding_model_name="mini-lm-sbert", vector_db="milvus", batch_size=500)
# add a second embedding to the same library (mix and match models + vector DBs)
lib.install_new_embedding(embedding_model_name="industry-bert-sec", vector_db="chromadb", batch_size=100)
# easy to create multiple libraries for different projects and groups
finance_lib = Library().create_new_library("finance_q4_2023")
finance_lib.add_files("/finance_folder/")
hr_lib = Library().create_new_library("hr_policies")
hr_lib.add_files("/hr_folder/")
# get the library card with key metadata - documents, text chunks, images, tables, embedding record
lib_card = Library().get_library_card("my_library")
# see all libraries
all_my_libs = Library().get_all_library_cards()
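The "text chunk" step mentioned above can be pictured with a small stand-alone function. This is a minimal sketch under stated assumptions, not the chunking logic used by llmware's parsers: `chunk_text`, its word-based sizing and the `overlap` parameter are hypothetical choices made for illustration.

```python
from typing import List


def chunk_text(text: str, max_words: int = 50, overlap: int = 10) -> List[str]:
    """Split text into word-based chunks; consecutive chunks share `overlap` words."""
    if max_words <= 0 or not 0 <= overlap < max_words:
        raise ValueError("need max_words > 0 and 0 <= overlap < max_words")
    words = text.split()
    chunks: List[str] = []
    step = max_words - overlap
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        # Stop once the current chunk has reached the end of the text.
        if start + max_words >= len(words):
            break
    return chunks


sample = " ".join(str(i) for i in range(12))
for chunk in chunk_text(sample, max_words=5, overlap=2):
    print(chunk)
```

Overlapping chunks are one common way to avoid cutting a fact in half at a chunk boundary; each chunk would then be indexed as text and, optionally, embedded.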
Query: query libraries with a mix of text, semantic, hybrid, metadata and custom filters.
from llmware.retrieval import Query
from llmware.library import Library
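# step 1 - load the previously created library
lib = Library().load_library("my_library")
# step 2 - create a query object and pass the library
q = Query(lib)
# step 3 - run lots of different queries (many other options are in the examples)
# basic text query
results1 = q.text_query("text query", result_count=20, exact_mode=False)
# semantic query
results2 = q.semantic_query("semantic query", result_count=10)
# combine a text query restricted to certain documents in the library with an "exact" match to the query
results3 = q.text_query_with_document_filter("new query", {"file_name": "selected file name"}, exact_mode=True)
# to apply a specific embedding (if the library has several), pass the names when creating the query object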
q2 = Query(lib, embedding_model_name="mini-lm-sbert", vector_db="milvus")
results4 = q2.semantic_query("new semantic query")
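To make the difference between the query types concrete, here is a self-contained toy comparison. It is a sketch, not how llmware executes queries: a real semantic query uses an embedding model and a vector database, whereas this example substitutes plain word-count vectors and cosine similarity. `CHUNKS`, `text_query`, `similarity_query` and `filter_by_document` are hypothetical names.

```python
import math
from collections import Counter
from typing import Dict, List

# Hypothetical pre-chunked library content.
CHUNKS: List[Dict[str, str]] = [
    {"file_name": "invoice.txt", "text": "Total amount due on the invoice is $22,500.00"},
    {"file_name": "news.txt", "text": "Japan recorded a trade surplus for September"},
    {"file_name": "news.txt", "text": "Exports to Asia fell for the ninth straight month"},
]


def text_query(query: str, chunks: List[Dict[str, str]], result_count: int = 20) -> List[Dict[str, str]]:
    """Keyword search: keep chunks that contain every query term (case-insensitive)."""
    terms = query.lower().split()
    hits = [c for c in chunks if all(t in c["text"].lower() for t in terms)]
    return hits[:result_count]


def _vector(text: str) -> Counter:
    """Word-count vector; a stand-in for a learned embedding."""
    return Counter(text.lower().split())


def _cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def similarity_query(query: str, chunks: List[Dict[str, str]], result_count: int = 10) -> List[Dict[str, str]]:
    """Rank all chunks by cosine similarity to the query and return the top results."""
    q = _vector(query)
    ranked = sorted(chunks, key=lambda c: _cosine(q, _vector(c["text"])), reverse=True)
    return ranked[:result_count]


def filter_by_document(chunks: List[Dict[str, str]], file_name: str) -> List[Dict[str, str]]:
    """Metadata filter: restrict the search space to one source file."""
    return [c for c in chunks if c["file_name"] == file_name]


print(text_query("trade surplus", CHUNKS))
print(similarity_query("trade surplus in Japan", CHUNKS, result_count=1))
print(text_query("month", filter_by_document(CHUNKS, "news.txt")))
```

The keyword search returns only chunks containing all terms, while the similarity search always returns a ranked list, which is why the two are often combined in a hybrid query.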
Prompt with Sources: the easiest way to combine knowledge retrieval with LLM inference.
from llmware.prompts import Prompt
from llmware.retrieval import Query
from llmware.library import Library
# build a prompt
prompter = Prompt().load_model("llmware/bling-tiny-llama-v0")
# add a file -> the file is parsed, text chunked, filtered by query, and then packaged as model-ready context,
# including in batches, if needed, to fit the model context window
source = prompter.add_source_document("/folder/to/one/doc/", "filename", query="fast query")
# attach query results (from a Query) to the prompt
my_lib = Library().load_library("my_library")
results = Query(my_lib).query("my query")
source2 = prompter.add_source_query_results(results)
# run a new query against a library and load the results directly into the prompt
source3 = prompter.add_source_new_query(my_lib, query="my new query", query_type="semantic", result_count=15)
# run inference with "prompt with sources"
responses = prompter.prompt_with_source("my query")
# run fact checks - post inference
fact_check = prompter.evidence_check_sources(responses)
# view the source materials (batched "model-ready" and attached to the prompt)
source_materials = prompter.review_sources_summary()
# view the full prompt history
prompt_history = prompter.get_current_history()
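The comment above about packaging sources "in batches, if needed, to fit the model context window" can be illustrated with a greedy packing routine. This is a hedged sketch of the general idea, not the batching code inside llmware's `Prompt` class: `batch_sources` is a hypothetical helper, and it counts words as a rough proxy for tokens.

```python
from typing import List


def batch_sources(chunks: List[str], max_words_per_batch: int) -> List[str]:
    """Greedily pack chunks, in order, into batches that stay within a word budget.

    A chunk that is larger than the budget on its own still gets a batch to itself.
    """
    batches: List[str] = []
    current: List[str] = []
    used = 0
    for chunk in chunks:
        size = len(chunk.split())
        # Close the current batch if adding this chunk would exceed the budget.
        if current and used + size > max_words_per_batch:
            batches.append("\n".join(current))
            current, used = [], 0
        current.append(chunk)
        used += size
    if current:
        batches.append("\n".join(current))
    return batches


source_chunks = ["a b c", "d e", "f g h i", "j"]
for i, batch in enumerate(batch_sources(source_chunks, max_words_per_batch=5)):
    print(f"batch {i}: {batch!r}")
```

Each batch would then be sent to the model as a separate context, which is consistent with `prompt_with_source` returning a list of responses rather than a single one.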
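RAG-Optimized Models - 1-7B parameter models designed for RAG workflow integration and running locally.
```
""" This 'Hello World' example demonstrates how to get started using local BLING models with provided context, using both Pytorch and GGUF versions. """
import time
from llmware.prompts import Prompt


def hello_world_questions():
    test_list = [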
{"query": "What is the total amount of the invoice?",
"answer": "$22,500.00",
"context": "Services Vendor Inc. \n100 Elm Street Pleasantville, NY \nTO Alpha Inc. 5900 1st Street "
"Los Angeles, CA \nDescription Front End Engineering Service $5000.00 \n Back End Engineering"
" Service $7500.00 \n Quality Assurance Manager $10,000.00 \n Total Amount $22,500.00 \n"
"Make all checks payable to Services Vendor Inc. Payment is due within 30 days. "
"If you have any questions concerning this invoice, contact Bia Hermes. "
"THANK YOU FOR YOUR BUSINESS! INVOICE INVOICE # 0001 DATE 01/01/2022 FOR Alpha Project P.O. # 1000"},
{"query": "What was the amount of the trade surplus?",
"answer": "62.4 billion yen ($416.6 million)",
"context": "Japan’s September trade balance swings into surplus, surprising expectations. "
"Japan recorded a trade surplus of 62.4 billion yen ($416.6 million) for September, "
"beating expectations from economists polled by Reuters for a trade deficit of 42.5 "
"billion yen. Data from Japan’s customs agency revealed that exports in September "
"increased 4.3% year on year, while imports slid 16.3% compared to the same period "
"last year. According to FactSet, exports to Asia fell for the ninth straight month, "
"which reflected ongoing China weakness. Exports were supported by shipments to "
"Western markets, FactSet added. — Lim Hui Jie"},
{"query": "When did the LISP machine market collapse?",
"answer": "1987.",
"context": "The attendees became the leaders of AI research in the 1960s."
" They and their students produced programs that the press described as 'astonishing': "
"computers were learning checkers strategies, solving word problems in algebra, "
"proving logical theorems and speaking English. By the middle of the 1960s, research in "
"the U.S. was heavily funded by the Department of Defense and laboratories had been "
"established around the world. Herbert Simon predicted, 'machines will be capable, "
"within twenty years, of doing any work a man can do'. Marvin Minsky agreed, writing, "
"'within a generation ... the problem of creating 'artificial intelligence' will "
"substantially be solved'. They had, however, underestimated the difficulty of the problem. "
"Both the U.S. and British governments cut off exploratory research in response "
"to the criticism of Sir James Lighthill and ongoing pressure from the US Congress "
"to fund more productive projects. Minsky's and Papert's book Perceptrons was understood "
"as proving that artificial neural networks approach would never be useful for solving "
"real-world tasks, thus discrediting the approach altogether. The 'AI winter', a period "
"when obtaining funding for AI projects was difficult, followed. In the early 1980s, "
"AI research was revived by the commercial success of expert systems, a form of AI "
"program that simulated the knowledge and analytical skills of human experts. By 1985, "
"the market for AI had reached over a billion dollars. At the same time, Japan's fifth "
"generation computer project inspired the U.S. and British governments to restore funding "
"for academic research. However, beginning with the collapse of the Lisp Machine market "
"in 1987, AI once again fell into disrepute, and a second, longer-lasting winter began."},
{"query": "What is the current rate on 10-year treasuries?",
"answer": "4.58%",
"context": "Stocks rallied Friday even after the release of stronger-than-expected U.S. jobs data "
"and a major increase in Treasury yields. The Dow Jones Industrial Average gained 195.12 points, "
"or 0.76%, to close at 31,419.58. The S&P 500 added 1.59% at 4,008.50. The tech-heavy "
"Nasdaq Composite rose 1.35%, closing at 12,299.68. The U.S. economy added 438,000 jobs in "
"August, the Labor Department said. Economists polled by Dow Jones expected 273,000 "
"jobs. However, wages rose less than expected last month. Stocks posted a stunning "
"turnaround on Friday, after initially falling on the stronger-than-expected jobs report. "
"At its session low, the Dow had fallen as much as 198 points; it surged by more than "
"500 points at the height of the rally. The Nasdaq and the S&P 500 slid by 0.8% during "
"their lowest points in the day. Traders were unclear of the reason for the intraday "
"reversal. Some noted it could be the softer wage number in the jobs report that made "
"investors rethink their earlier bearish stance. Others noted the pullback in yields from "
"the day’s highs. Part of the rally may just be to do a market that had gotten extremely "
"oversold with the S&P 500 at one point this week down more than 9% from its high earlier "
"this year. Yields initially surged after the report, with the 10-year Treasury rate trading "
"near its highest level in 14 years. The benchmark rate later eased from those levels, but "
"was still up around 6 basis points at 4.58%. 'We’re seeing a little bit of a give back "
"in yields from where we were around 4.8%. [With] them pulling back a bit, I think that’s "
"helping the stock market,' said Margaret Jones, chief investment officer at Vibrant Industries "
"Capital Advisors. 'We’ve had a lot of weakness in the market in recent weeks, and potentially "
"some oversold conditions.'"},
{"query": "Is the expected gross margin greater than 70%?",
"answer": "Yes, between 71.5% and 72.5%",
"context": "Outlook NVIDIA’s outlook for the third quarter of fiscal 2024 is as follows: "
"Revenue is expected to be $16.00 billion, plus or minus 2%. GAAP and non-GAAP "
"gross margins are expected to be 71.5% and 72.5%, respectively, plus or minus "
"50 basis points. GAAP and non-GAAP operating expenses are expected to be "
"approximately $2.95 billion and $2.00 billion, respectively. GAAP and non-GAAP "
"other income and expense are expected to be an income of approximately $100 "
"million, excluding gains and losses from non-affiliated investments. GAAP and "
"non-GAAP tax rates are expected to be 14.5%, plus or minus 1%, excluding any discrete items. "
"Highlights NVIDIA achieved progress since its previous earnings announcement "
"in these areas: Data Center Second-quarter revenue was a record $10.32 billion, "
"up 141% from the previous quarter and up 171% from a year ago. Announced that the "
"NVIDIA® GH200 Grace™ Hopper™ Superchip for complex AI and HPC workloads is shipping "
"this quarter, with a second-generation version with HBM3e memory expected to ship "
"in Q2 of calendar 2024. "},
{"query": "What is Bank of America's rating on Target?",
"answer": "Buy",
"context": "Here are some of the tickers on my radar for Thursday, Oct. 12, taken directly from "
"my reporter’s notebook: It’s the one-year anniversary of the S&P 500′s bear market bottom "
"of 3,577. Since then, as of Wednesday’s close of 4,376, the broad market index "
"soared more than 22%. Hotter than expected September consumer price index, consumer "
"inflation. The Social Security Administration issues announced a 3.2% cost-of-living "
"adjustment for 2024. Chipotle Mexican Grill (CMG) plans price increases. Pricing power. "
"Cites consumer price index showing sticky retail inflation for the fourth time "
"in two years. Bank of America upgrades Target (TGT) to buy from neutral. Cites "
"risk/reward from depressed levels. Traffic could improve. Gross margin upside. "
"Merchandising better. Freight and transportation better. Target to report quarter "
"next month. In retail, the CNBC Investing Club portfolio owns TJX Companies (TJX), "
"the off-price jug