提示工程

目的与范围

本文档详细介绍了Screenshot-to-Code应用程序中使用的提示工程系统。它涵盖了提示的构造、组织和定制，以适应不同的技术栈和输入模式。提示工程系统负责创建有效的提示，以指导AI模型（如Claude Sonnet、GPT-4o等）根据截图、导入的代码或视频生成准确的代码。

有关AI模型如何处理这些提示的信息，请参阅LLM集成。

提示系统架构

提示工程系统接收输入（截图、现有代码或视频），并创建专门的提示，指导AI模型如何生成代码。该系统支持多种技术栈，并维护上下文以进行迭代优化。

来源：backend/prompts/__init__.py21-74

提示类型和结构

系统根据输入源和技术栈使用不同类型的提示。每种提示类型都有特定的结构，旨在指导AI模型生成代码。

基本提示组件

来源：backend/prompts/__init__.py97-134

截图提示

截图提示用于将截图转换为代码。它们包括：

一个系统消息，其中包含关于如何为特定技术栈生成代码的详细说明
一个用户消息，包含：
- 截图图像
- 文本指令（“生成一个看起来与此完全相同的网页的代码”)
- 可选地，一个结果图像，用于更新场景

系统提示内容的示例片段

You are an expert Tailwind developer
You take screenshots of a reference web page from the user, and then build single page apps 
using Tailwind, HTML and JS.
...

来源：backend/prompts/screenshot_system_prompts.py4-199

导入代码提示

导入代码提示用于在从现有代码开始时。它们包含一个系统消息，该消息提供：

一个角色描述（例如，“你是一位精通Tailwind的开发者”)
代码生成的准则
支持的库
要修改或扩展的现有代码

示例片段

You are an expert Tailwind developer.

- Do not add comments in the code...
- Repeat elements as needed...
- For images, use placeholder images...

In terms of libraries,
...

Here is the code of the app: [IMPORTED CODE]

来源：backend/prompts/imported_code_prompts.py4-143 backend/prompts/__init__.py77-93

视频提示

视频提示专门用于从用户与Web应用程序的交互视频记录中创建代码。它们指示AI：

分析视频中的截图
理解用户流程
创建一个功能性的应用程序，以重现相同的行为

这些提示强调使用JavaScript使应用程序具有功能性，以匹配视频中看到的交互。

来源：backend/prompts/claude_prompts.py7-51

技术栈支持

提示工程系统支持多种技术栈，每种技术栈都有专门的提示，提供特定于技术栈的说明和库引用。

来源：backend/prompts/screenshot_system_prompts.py202-210 backend/prompts/imported_code_prompts.py145-153 backend/prompts/types.py14-22 frontend/src/lib/stacks.ts3-11

每个技术栈特定的提示包括：

专家角色定义（例如，“你是一位精通React/Tailwind的开发者”)
样式和内容准确性说明
所需的库CDN链接
返回代码的格式要求

系统维护着截图生成和导入代码场景的并行提示集。

提示组装流程

提示组装流程根据输入模式、技术栈以及是首次生成还是更新而有所不同。

来源：backend/prompts/__init__.py21-74

提示组装的关键步骤

检查输入源:
- 如果输入是导入的代码，则使用assemble_imported_code_prompt()
- 如果输入是截图，则使用assemble_prompt()
- 如果输入是视频，则使用assemble_claude_prompt_video()
处理更新场景:
- 对于更新，将历史记录转换为消息序列
- 包含结果图像以显示已生成的内容
创建图像缓存:
- 对于更新，创建从先前生成的内容映射的图像缓存
返回最终提示:
- 返回组装好的提示消息和图像缓存

系统提示集

系统维护了几个提示模板集合，这些模板用于不同的场景。

提示集	目的	源文件
`SYSTEM_PROMPTS`	用于截图到代码生成的模板	`screenshot_system_prompts.py`
`IMPORTED_CODE_SYSTEM_PROMPTS`	用于处理导入代码的模板	`imported_code_prompts.py`
`VIDEO_PROMPT`	用于从视频录制生成代码的模板	`claude_prompts.py`
`HTML_TAILWIND_CLAUDE_SYSTEM_PROMPT`	Claude特定的HTML/Tailwind模板	`claude_prompts.py`
`REACT_TAILWIND_CLAUDE_SYSTEM_PROMPT`	Claude特定的React/Tailwind模板	`claude_prompts.py`

来源：backend/prompts/screenshot_system_prompts.py202-210 backend/prompts/imported_code_prompts.py145-153 backend/prompts/claude_prompts.py7-114

通用提示元素

尽管技术栈和输入类型各不相同，但所有提示都共享几个通用元素：

专家角色定义:
```
You are an expert [Stack] developer
```

外观说明:

- Make sure the app looks exactly like the screenshot.
- Pay close attention to background color, text color, font size, font family, 
padding, margin, border, etc. Match the colors and sizes exactly.

内容指南:

- Use the exact text from the screenshot.
- Do not add comments in the code such as "<!-- Add other navigation links as needed -->" and "<!-- ... other news items ... -->" in place of writing the full code. WRITE THE FULL CODE.
- Repeat elements as needed to match the screenshot.

图像处理:

- For images, use placeholder images from https://placehold.co and include a detailed description of the image in the alt text so that an image generation AI can generate the image later.

库引用:

In terms of libraries,
- Use this script to include [Library]: <script src="..."></script>

输出格式说明:

Return only the full code in <html></html> tags.
Do not include markdown "```" or "```html" at the start or end.

来源：backend/prompts/screenshot_system_prompts.py4-199 backend/prompts/imported_code_prompts.py4-143

测试框架

提示工程系统包含测试，以验证为不同场景和技术栈正确组装了提示。

来源：backend/prompts/test_prompts.py351-471

测试验证了：

为每种技术栈使用了正确的系统提示
用户提示包含预期的文本
导入的代码提示具有正确的结构和内容

与其他系统集成

提示工程系统与Screenshot-to-Code应用程序的多个其他组件集成。

来源：backend/prompts/__init__.py1-135

提示工程系统：

接收输入，来自前端（截图、代码、视频）
创建适当的提示，基于输入类型和技术栈
维护对话历史，以进行迭代优化
将提示传递给LLM集成系统
支持图像生成，通过在提示中包含图像说明

结论

提示工程系统是Screenshot-to-Code应用程序的关键组成部分，负责创建有效的提示来指导AI模型生成准确的代码。它支持多种技术栈、输入类型和迭代优化场景，为每种上下文提供专门的说明。

该系统设计为可扩展的，在不同类型的提示之间具有清晰的划分，并支持各种技术栈。它与应用程序其他组件的集成确保了从输入到代码生成的顺畅工作流程。