guidance: a generative programming framework for controllable, composable LLM output

cause · 2026-02-26 07:04:53

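Guidance is an efficient programming paradigm for steering language models. With Guidance, you can control how output is structured and get high-quality output for your use case, while reducing latency and cost compared with conventional prompting or fine-tuning. It lets users constrain generation (for example with regular expressions and context-free grammars, CFGs) and seamlessly interleave control logic (conditionals, loops, tool use) with generation.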

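Installation

Guidance is available through PyPI and supports a variety of backends (Transformers, llama.cpp, OpenAI, etc.).
If you already have the backend required for your model, you can simply run: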

pip install guidance

Features

A Pythonic interface for language models

With Guidance, you can use familiar Python idioms to work with large language models:

from guidance import system, user, assistant, gen
from guidance.models import Transformers

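# Could also use LlamaCpp or many other models
phi_lm = Transformers("microsoft/Phi-4-mini-instruct")

# Model objects are immutable: each `+=` below returns a new object,
# so `phi_lm` itself is never modified
lm = phi_lm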

with system():
    lm += "You are a helpful assistant"

with user():
    lm += "Hello. What is your name?"

with assistant():
    lm += gen(max_tokens=20)

print(lm)

When run at the command line, this produces output similar to:

<|system|>You are a helpful assistant<|end|><|user|>Hello. What is your name?<|end|><|assistant|>I am Phi, an AI developed by Microsoft. How can I help you today?

In a Jupyter notebook, however, Guidance provides a widget for a richer user experience:

[Figure: Guidance widget showing HTML generation]

With Guidance, capturing generated text is straightforward:

# Start again from the base model (phi_lm was not modified by the code above)
lm = phi_lm

with system():
    lm += "You are a helpful assistant"

with user():
    lm += "Hello. What is your name?"

with assistant():
    lm += gen(name="lm_response", max_tokens=20)

print(f"{lm['lm_response']=}")

lm['lm_response']='I am Phi, an AI developed by Microsoft. How can I help you today?'

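Guaranteeing output syntax with constrained generation

Guidance provides an easy-to-use yet powerful syntax for constraining language model output.
For example, a gen() call can be constrained to match a regular expression: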

lm = phi_lm

with system():
    lm += "You are a teenager"

with user():
    lm += "How old are you?"

with assistant():
    lm += gen("lm_age", regex=r"\d+", temperature=0.8)

print(f"The language model is {lm['lm_age']} years old")

The language model is 13 years old
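
Constrained output can still be re-checked downstream with nothing but the standard library. The sketch below is only an illustration: `parse_age` is a hypothetical helper, not part of Guidance, and it simply applies the same `\d+` pattern with `re.fullmatch` before converting the captured text to an integer.

```python
import re

# Same pattern that was passed to gen(regex=...) above.
AGE_PATTERN = re.compile(r"\d+")


def parse_age(captured: str) -> int:
    """Return the captured text as an int, or raise if it is not all digits.

    Hypothetical helper for illustration; not part of the Guidance API.
    """
    if AGE_PATTERN.fullmatch(captured) is None:
        raise ValueError(f"not a non-negative integer: {captured!r}")
    return int(captured)


print(parse_age("13"))
```

A check like this is redundant when the backend enforces the regex, but it may be useful when the same parsing code also has to handle unconstrained output.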

Often, we know that the output must be one item from a list known in advance.
Guidance provides the select() function for this scenario:

from guidance import select

lm = phi_lm

with system():
    lm += "You are a geography expert"

with user():
    lm += """What is the capital of Sweden? Answer with the correct letter.

    A) Helsinki
    B) Reykjavík 
    C) Stockholm
    D) Oslo
    """

with assistant():
    lm += select(["A", "B", "C", "D"], name="model_selection")

print(f"The model selected {lm['model_selection']}")

The model selected C
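
Conceptually, `select()` restricts the model to a fixed set of continuations. The toy sketch below is not how Guidance implements it; `constrained_choice` and the scores are invented for illustration. It only shows the idea that, whatever the unconstrained favourite would be, the result is always one of the allowed options.

```python
def constrained_choice(scores: dict[str, float], options: list[str]) -> str:
    """Pick the highest-scoring candidate among `options` only.

    `scores` stands in for model preferences and is invented for this example.
    Options missing from `scores` are treated as least preferred.
    """
    if not options:
        raise ValueError("options must not be empty")
    return max(options, key=lambda opt: scores.get(opt, float("-inf")))


# Made-up scores: the unconstrained favourite ("Stockholm") is not an allowed option.
toy_scores = {"Stockholm": 0.6, "C": 0.3, "A": 0.05, "B": 0.03, "D": 0.02}
print(constrained_choice(toy_scores, ["A", "B", "C", "D"]))
```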

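The constraint system offered by Guidance is extremely powerful.
Provided the backend LLM fully supports Guidance, it can ensure that the output conforms to any context-free grammar.
More on this below.

Debugging grammars offline (no model API calls)

While iterating on constraints, you can validate candidate strings locally and test a full run with the Mock model.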

from guidance import gen
from guidance.models import Mock

grammar = "expr=" + gen(regex=r"\d+([+*]\d+)*", name="expr")

# 1) Validate strings directly against the grammar
assert grammar.match("expr=12+7*3") is not None
assert grammar.match("expr=12+*3") is None

# 2) Run the same grammar with a local mock model
lm = Mock(b"<s>expr=12+7*3")
lm += grammar
print(lm["expr"])  # 12+7*3

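Creating your own Guidance functions

With Guidance, you can write your own functions that interact with language models.
These are marked with the @guidance decorator.
Suppose we want to answer a large number of multiple-choice questions.
We could do something like the following: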

import guidance
from guidance import assistant, select, user
from guidance.models import Model

ASCII_OFFSET = ord("a")

@guidance
def zero_shot_multiple_choice(
    language_model: Model,
    question: str,
    choices: list[str],
):
    with user():
        language_model += question + "\n"
        for i, choice in enumerate(choices):
            language_model += f"{chr(i+ASCII_OFFSET)} : {choice}\n"

    with assistant():
        language_model += select(
            [chr(i + ASCII_OFFSET) for i in range(len(choices))], name="string_choice"
        )

    return language_model

Now, define some questions:

questions = [
    {
        "question" : "Which state has the northernmost capital?",
        "choices" : [
            "New South Wales",
            "Northern Territory",
            "Queensland",
            "South Australia",
            "Tasmania",
            "Victoria",
            "Western Australia",
        ],
        "answer" : 1,
    },
    {
        "question" : "Which of the following is venomous?",
        "choices" : [
            "Kangaroo",
            "Koala Bear",
            "Platypus",
        ],
        "answer" : 2,
    }
]

We can use our decorated function just like gen() or select().
The language_model argument is filled in for us automatically:

lm = phi_lm

with system():
    lm += "You are a student taking a multiple choice test."

for mcq in questions:
    lm_temp = lm + zero_shot_multiple_choice(question=mcq["question"], choices=mcq["choices"])
    converted_answer = ord(lm_temp["string_choice"]) - ASCII_OFFSET
    print(lm_temp)
    print(f"LM Answer: {converted_answer},  Correct Answer: {mcq['answer']}")

<|system|>You are a student taking a multiple choice test.<|end|><|user|>Which state has the northernmost capital?
a : New South Wales
b : Northern Territory
c : Queensland
d : South Australia
e : Tasmania
f : Victoria
g : Western Australia
<|end|><|assistant|>b
LM Answer: 1,  Correct Answer: 1
<|system|>You are a student taking a multiple choice test.<|end|><|user|>Which of the following is venomous?
a : Kangaroo
b : Koala Bear
c : Platypus
<|end|><|assistant|>c
LM Answer: 2,  Correct Answer: 2
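
With more than a handful of questions, an aggregate score is easier to read than per-question lines. The following is a minimal sketch, assuming the selected letters have already been collected into a list; `letter_to_index` and `accuracy` are hypothetical helpers, not part of Guidance.

```python
ASCII_OFFSET = ord("a")


def letter_to_index(letter: str) -> int:
    """Map 'a' -> 0, 'b' -> 1, ... mirroring the labels used in the prompt."""
    return ord(letter) - ASCII_OFFSET


def accuracy(predicted_letters: list[str], correct_indices: list[int]) -> float:
    """Fraction of questions where the selected letter matches the answer index."""
    if len(predicted_letters) != len(correct_indices):
        raise ValueError("need exactly one prediction per question")
    if not correct_indices:
        return 0.0
    hits = sum(
        letter_to_index(letter) == answer
        for letter, answer in zip(predicted_letters, correct_indices)
    )
    return hits / len(correct_indices)


# Letters taken from the sample output above; answers from the `questions` list.
print(accuracy(["b", "c"], [1, 2]))
```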

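Guidance functions can be composed to build a full context-free grammar.
For example, we can write Guidance functions that build a simple HTML page (note that this is not a complete implementation of HTML).
We start with a simple function that generates text containing no HTML tags.
The function is marked stateless to indicate that we intend to use it for composing a grammar: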

@guidance(stateless=True)
def _gen_text(lm: Model):
    return lm + gen(regex="[^<>]+")

We can then use this function to generate text inside an arbitrary HTML tag:

@guidance(stateless=True)
def _gen_text_in_tag(lm: Model, tag: str):
    lm += f"<{tag}>"
    lm += _gen_text()
    lm += f"</{tag}>"
    return lm

Now, let us create the page header. As part of this, we need to generate a page title:

@guidance(stateless=True)
def _gen_header(lm: Model):
    lm += "<head>\n"
    lm += _gen_text_in_tag("title") + "\n"
    lm += "</head>\n"
    return lm

The body of the HTML page will be filled with headings and paragraphs.
We can define a function for each:

from guidance.library import one_or_more

@guidance(stateless=True)
def _gen_heading(lm: Model):
    lm += select(
        options=[_gen_text_in_tag("h1"), _gen_text_in_tag("h2"), _gen_text_in_tag("h3")]
    )
    lm += "\n"
    return lm

@guidance(stateless=True)
def _gen_para(lm: Model):
    lm += "<p>"
    lm += one_or_more(
        select(
            options=[
                _gen_text(),
                _gen_text_in_tag("em"),
                _gen_text_in_tag("strong"),
                "<br />",
            ],
        )
    )
    lm += "</p>\n"
    return lm

Now, the function that defines the HTML body itself:

@guidance(stateless=True)
def _gen_body(lm: Model):
    lm += "<body>\n"
    lm += one_or_more(select(options=[_gen_heading(), one_or_more(_gen_para())]))
    lm += "</body>\n"
    return lm

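Next, we come to the function that generates a complete HTML page.
We add the opening HTML tag, generate the header, then the body, and finally append the closing HTML tag: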

@guidance(stateless=True)
def _gen_html(lm: Model):
    lm += "<html>\n"
    lm += _gen_header()
    lm += _gen_body()
    lm += "</html>\n"
    return lm
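
To get a feel for which strings this grammar is meant to admit, a rough standard-library check can help. The sketch below is an approximation under stated assumptions: `ALLOWED_TAGS` is read off the functions defined so far, `SubsetChecker` and `check_page` are hypothetical names, and the check covers only the tag set and nesting, not the ordering that the grammar also enforces.

```python
from html.parser import HTMLParser

# Tags the grammar above can emit (assumed from the functions defined so far).
ALLOWED_TAGS = {"html", "head", "title", "body", "h1", "h2", "h3", "p", "em", "strong", "br"}


class SubsetChecker(HTMLParser):
    """Track open tags and record anything outside the allowed subset."""

    def __init__(self) -> None:
        super().__init__()
        self.stack: list[str] = []
        self.errors: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag not in ALLOWED_TAGS:
            self.errors.append(f"unexpected tag <{tag}>")
        self.stack.append(tag)

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br /> never go on the stack.
        if tag not in ALLOWED_TAGS:
            self.errors.append(f"unexpected tag <{tag} />")

    def handle_endtag(self, tag):
        if not self.stack or self.stack.pop() != tag:
            self.errors.append(f"mismatched </{tag}>")


def check_page(page: str) -> list[str]:
    """Return a list of problems; an empty list means the page passed."""
    checker = SubsetChecker()
    checker.feed(page)
    checker.close()
    return checker.errors + [f"unclosed <{tag}>" for tag in checker.stack]


# Hand-written sample in the shape the grammar produces.
sample = (
    "<html>\n<head>\n<title>My life</title>\n</head>\n"
    "<body>\n<h1>Early years</h1>\n"
    "<p>I was <em>trained</em><br />on text.</p>\n</body>\n</html>\n"
)
print(check_page(sample))
```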

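Finally, we provide a user-friendly wrapper, which allows us to:
- Set the temperature of the generation
- Capture the generated page from the Model object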

from guidance.library import capture, with_temperature

@guidance(stateless=True)
def make_html(
    lm,
    name: str | None = None,
    *,
    temperature: float = 0.0,
):
    return lm + capture(
        with_temperature(_gen_html(), temperature=temperature),
        name=name,
    )

Now, use it to generate a simple web page:

lm = phi_lm

with system():
    lm += "You are an expert in HTML"

with user():
    lm += "Create a simple and short web page about your life story."

with assistant():
    lm += make_html(name="html_text", temperature=0.7)

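When this runs in a Jupyter notebook with the widget active, we get the following output:

[Figure: Guidance widget showing HTML generation with token fast-forwarding]

Note the varying highlighting of the generated text.
This illustrates another capability of Guidance: token fast-forwarding.
The constraints imposed by a grammar often mean that some tokens are known in advance.
Guidance does not need the model to generate these; it can insert them into the generation directly.
This saves forward passes through the model and hence reduces GPU usage.
For example, in the HTML generation above, Guidance always knows the last opening tag.
If the last opened tag was <h1>, say, then as soon as the model generates </, Guidance can fill in h1> without the model performing a forward pass.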
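
The idea behind fast-forwarding can be illustrated at the string level. This is a simplified sketch, not Guidance's implementation (which works on tokens and a real grammar engine); `forced_prefix` is a hypothetical helper, and the lists of valid continuations are written out by hand.

```python
import os


def forced_prefix(valid_continuations: list[str]) -> str:
    """Return the text shared by every valid continuation.

    Whatever all continuations have in common is already determined by the
    grammar, so it can be appended without consulting the model.
    """
    return os.path.commonprefix(valid_continuations)


# After "</" with <h1> as the last open tag, only one continuation is valid.
print(forced_prefix(["h1>"]))

# At the start of a heading the grammar allows three tags; only "<h" is forced.
print(forced_prefix(["<h1>", "<h2>", "<h3>"]))
```

In the first case the whole continuation is forced, so no model call is needed; in the second, the model only has to decide the heading level.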

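Generating JSON

A JSON schema is in effect a context-free grammar, so it can be used to constrain an LLM with Guidance.
This use case is common enough that Guidance provides dedicated support for it.
A quick example based on a Pydantic model: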

import json
from pydantic import BaseModel, Field

from guidance import json as gen_json

class BloodPressure(BaseModel):
    systolic: int = Field(gt=300, le=400)
    diastolic: int = Field(gt=0, le=20)
    location: str = Field(max_length=50)
    model_config = dict(extra="forbid")

lm = phi_lm

with system():
    lm += "You are a doctor taking a patient's blood pressure taken from their arm"

with user():
    lm += "Report the blood pressure"

with assistant():
    lm += gen_json(name="bp", schema=BloodPressure)

print(f"{lm['bp']=}")

# Use Python's json library
loaded_json = json.loads(lm["bp"])
print(json.dumps(loaded_json, indent=4))

# Use Pydantic
result = BloodPressure.model_validate_json(lm["bp"])
print(result.model_dump_json(indent=8))

lm['bp']='{"systolic": 301, "diastolic": 15, "location": "arm"}'
{
    "systolic": 301,
    "diastolic": 15,
    "location": "arm"
}
{
        "systolic": 301,
        "diastolic": 15,
        "location": "arm"
}

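Note that the generated blood pressure is not one the model would have seen for a human: the schema bounds (systolic above 300, diastolic at most 20) are deliberately unrealistic, so the values come from the constraints and not from what the model considers typical.
When generating JSON, a substantial number of tokens can often be fast-forwarded because of the structural constraints imposed by the schema.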
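
For environments where Pydantic is not available, the same bounds can be re-checked with the standard library alone. The following is a rough sketch under that assumption; `check_bp` and the constants are hypothetical names, with the limits copied by hand from the `BloodPressure` model above.

```python
import json

# Bounds copied from the BloodPressure model above: (exclusive lower, inclusive upper).
INT_BOUNDS = {"systolic": (300, 400), "diastolic": (0, 20)}
MAX_LOCATION_LENGTH = 50


def check_bp(raw: str) -> list[str]:
    """Return a list of constraint violations (empty when the JSON conforms)."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        return ["top-level value is not a JSON object"]
    problems = []
    expected_keys = set(INT_BOUNDS) | {"location"}
    if set(data) != expected_keys:
        problems.append(f"unexpected key set: {sorted(data)}")
    for key, (low, high) in INT_BOUNDS.items():
        value = data.get(key)
        # bool is a subclass of int in Python, so exclude it explicitly.
        is_int = isinstance(value, int) and not isinstance(value, bool)
        if not is_int or not (low < value <= high):
            problems.append(f"{key}={value!r} is outside ({low}, {high}]")
    location = data.get("location")
    if not isinstance(location, str) or len(location) > MAX_LOCATION_LENGTH:
        problems.append(f"location={location!r} is not a string of at most {MAX_LOCATION_LENGTH} characters")
    return problems


print(check_bp('{"systolic": 301, "diastolic": 15, "location": "arm"}'))
```

A realistic reading such as 120/80 fails both numeric checks, which is consistent with the note above about the deliberately unusual bounds.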

Project: https://github.com/guidance-ai/guidance
