Instructor-XL? Take a look at Guidance: precise control over generation structure and flow

ascend · 2026-05-04 11:00:25

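Guidance is an efficient programming paradigm for steering large language models. With Guidance, you can control how output is structured and get high-quality output for your use case, while reducing latency and cost compared with conventional prompting or fine-tuning. It lets users constrain generation (for example with regular expressions and context-free grammars) and seamlessly interleave control (conditionals, loops, tool use) with generation.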

Installation

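Guidance is available through PyPI and supports a variety of backends (Transformers, llama.cpp, OpenAI, and others). If you already have the backend your model requires, you can simply run: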

pip install guidance
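
To confirm that the package is importable before trying the examples, a small standard-library check can help. This is only a sketch and not part of Guidance; the helper name `is_installed` is hypothetical.

```python
import importlib.util


def is_installed(package: str) -> bool:
    """Return True if the import system can locate `package`."""
    return importlib.util.find_spec(package) is not None


if __name__ == "__main__":
    # Reports whether `pip install guidance` took effect in this environment.
    print("guidance installed:", is_installed("guidance"))
```

The check only looks the package up; it does not import it, so it stays cheap even when a heavy backend is installed.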

Features

A Pythonic interface for language models

With Guidance, you interact with a language model through familiar Python idioms:

from guidance import system, user, assistant, gen
from guidance.models import Transformers

# LlamaCpp or many other models could be used instead
phi_lm = Transformers("microsoft/Phi-4-mini-instruct")

# Model objects are immutable, so this is a copy
lm = phi_lm

with system():
    lm += "You are a helpful assistant"

with user():
    lm += "Hello. What is your name?"

with assistant():
    lm += gen(max_tokens=20)

print(lm)

Running this from the command line produces output like the following:

<|system|>You are a helpful assistant<|end|><|user|>Hello. What is your name?<|end|><|assistant|>I am Phi, an AI developed by Microsoft. How can I help you today?

When run in a Jupyter Notebook, Guidance provides a richer user interface widget:

[Figure: Guidance widget showing HTML generation]

Capturing generated text with Guidance is straightforward:

# Get a copy of the model
lm = phi_lm

with system():
    lm += "You are a helpful assistant"

with user():
    lm += "Hello. What is your name?"

with assistant():
    lm += gen(name="lm_response", max_tokens=20)

print(f"{lm['lm_response']=}")

lm['lm_response']='I am Phi, an AI developed by Microsoft. How can I help you today?'
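
Each example above starts again from `lm = phi_lm`, which is safe only because model objects are immutable. The sketch below illustrates that behaviour with a toy class; `ToyModel` is a hypothetical stand-in written for this illustration and is not a Guidance class.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyModel:
    """Hypothetical immutable state holder (not part of Guidance)."""

    text: str = ""

    def __add__(self, more: str) -> "ToyModel":
        # Return a new object; the original is never modified.
        return ToyModel(self.text + more)


base = ToyModel()
lm = base        # same object, but safe to share because it cannot change
lm += "Hello"    # no __iadd__, so this rebinds `lm` to a new ToyModel

print(repr(base.text))  # ''
print(repr(lm.text))    # 'Hello'
```

Because `+=` falls back to `__add__` here, the name `lm` moves to a new object while `base` keeps its original state, which mirrors how `phi_lm` can be reused across conversations.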

Guaranteeing output syntax with constrained generation

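Guidance provides an easy-to-use yet very powerful syntax for constraining the output of a language model. For example, a gen() call can be constrained to match a regular expression: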

lm = phi_lm

with system():
    lm += "You are a teenager"

with user():
    lm += "How old are you?"

with assistant():
    lm += gen("lm_age", regex=r"\d+", temperature=0.8)

print(f"The language model is {lm['lm_age']} years old")

The language model is 13 years old
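
To make the idea of constraining during generation (rather than validating afterwards) more concrete, here is a toy decoder in plain Python. It is a simplified sketch under stated assumptions, not how Guidance is implemented: `constrained_decode`, `digits_only`, and the per-step candidate lists are all hypothetical.

```python
import re
from typing import Callable, List


def constrained_decode(
    proposals: List[List[str]],
    is_allowed: Callable[[str], bool],
) -> str:
    """At each step, keep the highest-ranked candidate that leaves the output valid."""
    out = ""
    for ranked in proposals:
        for candidate in ranked:
            if is_allowed(out + candidate):
                out += candidate
                break
        else:
            # No candidate keeps the output valid, so stop generating.
            break
    return out


def digits_only(text: str) -> bool:
    # For r"\d+", every non-empty prefix of a match is itself a match,
    # so a full-match test doubles as a prefix check in this toy.
    return re.fullmatch(r"\d+", text) is not None


# Hypothetical per-step candidates, best first (not real model output).
steps = [["I", "1"], ["'", "3"], ["m", " "]]
print(constrained_decode(steps, digits_only))  # 13
```

The unconstrained first choices would have spelled "I'm", but filtering each step against the pattern leaves only the digits.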

Often we know that the output must be one item from a list known in advance. Guidance provides the select() function for this scenario:

from guidance import select

lm = phi_lm

with system():
    lm += "You are a geography expert"

with user():
    lm += """What is the capital of Sweden? Answer with the correct letter.

    A) Helsinki
    B) Reykjavík 
    C) Stockholm
    D) Oslo
    """

with assistant():
    lm += select(["A", "B", "C", "D"], name="model_selection")

print(f"The model selected {lm['model_selection']}")

The model selected C
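
Conceptually, restricting output to a fixed list means ignoring every continuation outside that list, however likely the model finds it. The following sketch shows that masking step with made-up scores; `pick_allowed` and the score values are hypothetical and say nothing about Guidance's internals.

```python
from typing import Dict, Sequence


def pick_allowed(scores: Dict[str, float], options: Sequence[str]) -> str:
    """Return the highest-scoring entry among `options`, ignoring everything else."""
    allowed = {opt: scores.get(opt, float("-inf")) for opt in options}
    return max(allowed, key=allowed.get)


# Hypothetical scores for the next piece of text (not real model output).
scores = {"Stockholm": 0.55, "C": 0.30, "D": 0.10, "A": 0.03, "B": 0.02}
print(pick_allowed(scores, ["A", "B", "C", "D"]))  # C
```

Even though "Stockholm" has the highest raw score in this toy, it is not among the options, so the answer stays in the requested letter format.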

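The constraint system offered by Guidance is very powerful. It can ensure that the output conforms to any context-free grammar (so long as the backend LLM has full support for Guidance). More on this below.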

Debugging grammars offline (no model API calls required)

While iterating on a constraint, you can validate candidate strings locally and test a full run with the Mock model.

from guidance import gen
from guidance.models import Mock

grammar = "expr=" + gen(regex=r"\d+([+*]\d+)*", name="expr")

# 1) Validate strings directly against the grammar
assert grammar.match("expr=12+7*3") is not None
assert grammar.match("expr=12+*3") is None

# 2) Run the same grammar with a local Mock model
lm = Mock(b"<s>expr=12+7*3")
lm += grammar
print(lm["expr"])  # 12+7*3
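
When a pattern changes often, it can help to keep a small table of strings that should and should not match and re-run it after every edit. The sketch below does this with Python's standard `re` module as an independent cross-check; it assumes that this simple pattern behaves the same in `re` as in Guidance, and `failing_cases` is a hypothetical helper.

```python
import re

PATTERN = re.compile(r"\d+([+*]\d+)*")

# (candidate, should_match) pairs to re-run whenever the pattern changes.
CASES = [
    ("12+7*3", True),
    ("12+*3", False),
    ("42", True),
    ("+1", False),
    ("", False),
]


def failing_cases(pattern, cases):
    """Return the cases whose actual result differs from the expectation."""
    return [
        (text, expected)
        for text, expected in cases
        if (pattern.fullmatch(text) is not None) != expected
    ]


print(failing_cases(PATTERN, CASES))  # []
```

An empty list means every expectation holds; any entry that appears points at the exact string to investigate.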

Creating your own Guidance functions

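With Guidance, you can create your own Guidance functions that interact with a language model. These functions are marked with the @guidance decorator. Suppose we want to answer a large number of multiple-choice questions. We could do it like this: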

import guidance

from guidance.models import Model

ASCII_OFFSET = ord("a")

@guidance
def zero_shot_multiple_choice(
    language_model: Model,
    question: str,
    choices: list[str],
):
    with user():
        language_model += question + "\n"
        for i, choice in enumerate(choices):
            language_model += f"{chr(i+ASCII_OFFSET)} : {choice}\n"

    with assistant():
        language_model += select(
            [chr(i + ASCII_OFFSET) for i in range(len(choices))], name="string_choice"
        )

    return language_model
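
To see exactly what text the user turn will contain, the same loop can be reproduced in plain Python without a model. This is a sketch for inspection only; `render_question` is a hypothetical helper that mirrors the string formatting in `zero_shot_multiple_choice`.

```python
ASCII_OFFSET = ord("a")


def render_question(question: str, choices: list[str]) -> str:
    """Build the same user-turn text as the loop in zero_shot_multiple_choice."""
    lines = [question]
    for i, choice in enumerate(choices):
        # Label choices a, b, c, ... in order.
        lines.append(f"{chr(i + ASCII_OFFSET)} : {choice}")
    return "\n".join(lines) + "\n"


print(render_question(
    "Which of the following is venomous?",
    ["Kangaroo", "Koala Bear", "Platypus"],
))
```

Labelling with `chr(i + ASCII_OFFSET)` yields lowercase letters only for the first 26 choices, so longer lists would need a different labelling scheme.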

Now, define some questions:

questions = [
    {
        "question" : "Which state has the northernmost capital?",
        "choices" : [
            "New South Wales",
            "Northern Territory",
            "Queensland",
            "South Australia",
            "Tasmania",
            "Victoria",
            "Western Australia",
        ],
        "answer" : 1,
    },
    {
        "question" : "Which of the following is venomous?",
        "choices" : [
            "Kangaroo",
            "Koala Bear",
            "Platypus",
        ],
        "answer" : 2,
    }
]

We can use the decorated function just like gen() or select(). The language_model argument is passed in automatically:

lm = phi_lm

with system():
    lm += "You are a student taking a multiple choice test."

for mcq in questions:
    lm_temp = lm + zero_shot_multiple_choice(question=mcq["question"], choices=mcq["choices"])
    converted_answer = ord(lm_temp["string_choice"]) - ASCII_OFFSET
    print(lm_temp)
    print(f"LM Answer: {converted_answer},  Correct Answer: {mcq['answer']}")

<|system|>You are a student taking a multiple choice test.<|end|><|user|>Which state has the northernmost capital?
a : New South Wales
b : Northern Territory
c : Queensland
d : South Australia
e : Tasmania
f : Victoria
g : Western Australia
<|end|><|assistant|>b
LM Answer: 1,  Correct Answer: 1
<|system|>You are a student taking a multiple choice test.<|end|><|user|>Which of the following is venomous?
a : Kangaroo
b : Koala Bear
c : Platypus
<|end|><|assistant|>c
LM Answer: 2,  Correct Answer: 2
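
With many questions, a single score is easier to read than one line per question. A minimal sketch of such a tally follows; `accuracy` is a hypothetical helper, and the letters passed to it are the two answers shown in the transcripts above.

```python
ASCII_OFFSET = ord("a")


def accuracy(letters: list[str], answers: list[int]) -> float:
    """Fraction of questions where the chosen letter maps to the correct index."""
    if len(letters) != len(answers):
        raise ValueError("letters and answers must have the same length")
    if not answers:
        return 0.0
    correct = sum(
        (ord(letter) - ASCII_OFFSET) == answer
        for letter, answer in zip(letters, answers)
    )
    return correct / len(answers)


# Letters taken from the two transcripts above.
print(accuracy(["b", "c"], [1, 2]))  # 1.0
```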

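Guidance functions can be composed to build a full context-free grammar. For example, we can create Guidance functions that build a simple HTML web page (note that this is not a complete implementation of HTML). We start with a simple function that generates text containing no HTML tags. The function is marked as stateless to indicate that we intend to use it for composing a grammar: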

@guidance(stateless=True)
def _gen_text(lm: Model):
    return lm + gen(regex="[^<>]+")

We can then use this function to generate text inside an arbitrary HTML tag:

@guidance(stateless=True)
def _gen_text_in_tag(lm: Model, tag: str):
    lm += f"<{tag}>"
    lm += _gen_text()
    lm += f"</{tag}>"
    return lm

Now, create the page header. As part of it, we need to generate a page title:

@guidance(stateless=True)
def _gen_header(lm: Model):
    lm += "<head>\n"
    lm += _gen_text_in_tag("title") + "\n"
    lm += "</head>\n"
    return lm

The body of the HTML page will be filled with headings and paragraphs. We can define a function for each:

from guidance.library import one_or_more

@guidance(stateless=True)
def _gen_heading(lm: Model):
    lm += select(
        options=[_gen_text_in_tag("h1"), _gen_text_in_tag("h2"), _gen_text_in_tag("h3")]
    )
    lm += "\n"
    return lm

@guidance(stateless=True)
def _gen_para(lm: Model):
    lm += "<p>"
    lm += one_or_more(
        select(
            options=[
                _gen_text(),
                _gen_text_in_tag("em"),
                _gen_text_in_tag("strong"),
                "<br />",
            ],
        )
    )
    lm += "</p>\n"
    return lm

Now, define the function for the HTML body itself:

@guidance(stateless=True)
def _gen_body(lm: Model):
    lm += "<body>\n"
    lm += one_or_more(select(options=[_gen_heading(), one_or_more(_gen_para())]))
    lm += "</body>\n"
    return lm

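Next comes the function that generates the complete HTML page. We add the opening HTML tag, then generate the header, then the body, and finally append the closing HTML tag: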

@guidance(stateless=True)
def _gen_html(lm: Model):
    lm += "<html>\n"
    lm += _gen_header()
    lm += _gen_body()
    lm += "</html>\n"
    return lm

Finally, we provide a user-friendly wrapper that lets us:

Set the temperature of the generation
Capture the generated page from the Model object

from guidance.library import capture, with_temperature

@guidance(stateless=True)
def make_html(
    lm,
    name: str | None = None,
    *,
    temperature: float = 0.0,
):
    return lm + capture(
        with_temperature(_gen_html(), temperature=temperature),
        name=name,
    )

Now, use it to generate a simple web page:

lm = phi_lm

with system():
    lm += "You are an expert in HTML"

with user():
    lm += "Create a simple and short web page about your life story."

with assistant():
    lm += make_html(name="html_text", temperature=0.7)

When this runs in a Jupyter Notebook with the widget active, we get the following output:

[Figure: Guidance widget showing HTML generation with token fast-forwarding]

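Note the different highlighting in the generated text. This shows another Guidance capability: token fast-forwarding. The constraints imposed by a grammar often mean that some tokens are known in advance. Guidance does not need the model to generate these tokens; it can insert them into the generation directly. This saves forward passes through the model and so reduces GPU usage. For example, in the HTML generation above, Guidance always knows the last opened tag. If the last opened tag is <h1>, then once the model generates </, Guidance can fill in h1> without the model performing a forward pass.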
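
One way to picture fast-forwarding is as the text shared by every continuation the grammar still allows: whatever all of them begin with can be appended without asking the model. The sketch below works at character level with hand-written continuation lists; `forced_prefix` is a hypothetical helper, and real fast-forwarding operates on tokens inside Guidance.

```python
import os
from typing import Iterable


def forced_prefix(continuations: Iterable[str]) -> str:
    """Longest text shared by the start of every allowed continuation."""
    return os.path.commonprefix(list(continuations))


# After "</" with <h1> as the last opened tag, only one continuation is valid.
print(repr(forced_prefix(["h1>"])))  # 'h1>'

# At the start of a body element, a heading or a paragraph may follow,
# so only the opening "<" is certain.
print(repr(forced_prefix(["<h1>", "<h2>", "<h3>", "<p>"])))  # '<'
```

The first call shows a fully determined span, the second a case where only one character is certain and the model has to choose the rest.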

Generating JSON

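A JSON schema is effectively a context-free grammar, so Guidance can use it to constrain an LLM. This is a common enough scenario that Guidance provides dedicated support for it. A quick example based on a Pydantic model: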

import json
from pydantic import BaseModel, Field

from guidance import json as gen_json

class BloodPressure(BaseModel):
    systolic: int = Field(gt=300, le=400)
    diastolic: int = Field(gt=0, le=20)
    location: str = Field(max_length=50)
    model_config = dict(extra="forbid")

lm = phi_lm

with system():
    lm += "You are a doctor taking a patient's blood pressure taken from their arm"

with user():
    lm += "Report the blood pressure"

with assistant():
    lm += gen_json(name="bp", schema=BloodPressure)

print(f"{lm['bp']=}")

# Use Python's JSON library
loaded_json = json.loads(lm["bp"])
print(json.dumps(loaded_json, indent=4))

# Use Pydantic
result = BloodPressure.model_validate_json(lm["bp"])
print(result.model_dump_json(indent=8))

lm['bp']='{"systolic": 301, "diastolic": 15, "location": "arm"}'
{
    "systolic": 301,
    "diastolic": 15,
    "location": "arm"
}
{
        "systolic": 301,
        "diastolic": 15,
        "location": "arm"
}

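Note that the generated blood pressure values are not ones the model will have seen for a human. When generating JSON, the structural constraints imposed by the schema usually allow a large number of tokens to be fast-forwarded.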
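
For a rough sense of how much of the output is structure, one can count the characters that lie outside the values. The sketch below does this for the output string shown above. It is a character-level proxy only, under the assumption that keys appear in a fixed order and only the values are left to the model; `schema_fixed_chars` is a hypothetical helper and works on flat objects with simple values.

```python
import json


def schema_fixed_chars(raw: str) -> tuple[int, int]:
    """Return (characters outside the values, total characters) for a flat JSON object."""
    values = json.loads(raw).values()
    # Count only the characters of each value, without surrounding quotes.
    value_chars = sum(len(str(v)) for v in values)
    return len(raw) - value_chars, len(raw)


raw = '{"systolic": 301, "diastolic": 15, "location": "arm"}'
fixed, total = schema_fixed_chars(raw)
print(f"{fixed} of {total} characters come from the schema")
# 45 of 53 characters come from the schema
```

Token counts will differ from character counts, so the figure should be read as an illustration of the proportion rather than a measurement of saved forward passes.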

Project repository: https://github.com/microsoft/guidance
