Structured Extraction

Extract type-safe structured data using JSON Schema validation.

Overview

RelayPlane supports structured output extraction using JSON Schema. This ensures AI model outputs conform to your expected data structure, with automatic validation and type safety.

Basic Usage

Define a JSON Schema and pass it to the step configuration:

import { relay } from "@relayplane/sdk";

const InvoiceSchema = {
  type: "object",
  properties: {
    vendor: { type: "string" },
    amount: { type: "number" },
    date: { type: "string" },
    lineItems: {
      type: "array",
      items: {
        type: "object",
        properties: {
          description: { type: "string" },
          quantity: { type: "number" },
          unitPrice: { type: "number" }
        },
        required: ["description", "quantity", "unitPrice"]
      }
    }
  },
  required: ["vendor", "amount", "date"]
};

const result = await relay
  .workflow("invoice-extractor")
  .step("extract", { schema: InvoiceSchema })
  .with("openai:gpt-4o")
  .prompt("Extract invoice data from: {{input.text}}")
  .run({ text: invoiceText });

// result.output is typed according to the schema
console.log(result.output.vendor); // string
console.log(result.output.amount); // number

Provider Behavior

Different providers handle structured extraction differently:

  • OpenAI - Uses native response_format with strict mode for guaranteed schema compliance
  • Anthropic - Uses prompt engineering with schema description, then validates output
  • Google - Uses prompt engineering with JSON output instruction
  • xAI / Local - Uses prompt engineering with schema description

For maximum reliability, use OpenAI models with strict mode when schema compliance is critical.
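
The prompt-engineering approach listed above for non-OpenAI providers can be pictured roughly as follows. This is a minimal sketch of the general technique, not RelayPlane's actual implementation, which this page does not show. The helper names `buildSchemaPrompt` and `parseJsonReply` are hypothetical and are not part of `@relayplane/sdk`.

```typescript
// Hypothetical helpers for illustration; not part of @relayplane/sdk.
type JsonSchema = Record<string, unknown>;

// Append the schema to the prompt so the model is told what shape to return.
function buildSchemaPrompt(prompt: string, schema: JsonSchema): string {
  return [
    prompt,
    "",
    "Respond with a single JSON object that conforms to this JSON Schema:",
    JSON.stringify(schema, null, 2),
    "Return only the JSON, with no commentary.",
  ].join("\n");
}

// Models sometimes surround the JSON with extra text, so take the
// outermost {...} span of the reply and parse only that.
function parseJsonReply(reply: string): unknown {
  const start = reply.indexOf("{");
  const end = reply.lastIndexOf("}");
  if (start === -1 || end <= start) {
    throw new Error("No JSON object found in model reply");
  }
  return JSON.parse(reply.slice(start, end + 1));
}
```

Because nothing in this approach forces the model to follow the schema, the parsed value still has to be validated afterwards, which is why the list above mentions a separate validation step.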

Using Zod Schemas

You can use Zod for runtime validation with automatic JSON Schema generation:

import { relay } from "@relayplane/sdk";
import { z } from "zod";
import { zodToJsonSchema } from "zod-to-json-schema";

const InvoiceSchema = z.object({
  vendor: z.string(),
  amount: z.number(),
  date: z.string(),
  lineItems: z.array(z.object({
    description: z.string(),
    quantity: z.number(),
    unitPrice: z.number()
  }))
});

// Derive the static TypeScript type from the Zod schema
type Invoice = z.infer<typeof InvoiceSchema>;

const result = await relay
  .workflow("invoice-extractor")
  .step("extract", {
    schema: zodToJsonSchema(InvoiceSchema)
  })
  .with("openai:gpt-4o")
  .prompt("Extract invoice data from: {{input.text}}")
  .run({ text: invoiceText });

// Parse and validate with Zod
const invoice: Invoice = InvoiceSchema.parse(result.output);

Complex Nested Schemas

const ContractSchema = {
  type: "object",
  properties: {
    parties: {
      type: "array",
      items: {
        type: "object",
        properties: {
          name: { type: "string" },
          role: { enum: ["buyer", "seller", "guarantor"] },
          address: {
            type: "object",
            properties: {
              street: { type: "string" },
              city: { type: "string" },
              country: { type: "string" }
            }
          }
        },
        required: ["name", "role"]
      }
    },
    terms: {
      type: "object",
      properties: {
        startDate: { type: "string" },
        endDate: { type: "string" },
        value: { type: "number" },
        currency: { type: "string" }
      },
      required: ["startDate", "value", "currency"]
    }
  },
  required: ["parties", "terms"]
};
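
To see what the nested `required` arrays in a schema like this demand of the data, the sketch below walks a trimmed copy of the contract schema alongside a sample value and reports which required properties are missing. It is an illustration only: `missingRequired`, `SchemaNode`, and the sample data are hypothetical, the helper checks only `required`, `properties`, and `items`, and it does not stand in for full JSON Schema validation.

```typescript
// Hypothetical helper for illustration; not part of @relayplane/sdk.
interface SchemaNode {
  type?: string;
  enum?: string[];
  properties?: Record<string, SchemaNode>;
  items?: SchemaNode;
  required?: string[];
}

// Walk a schema and a value together, collecting the paths of required
// properties that are absent. Types and enum values are not checked.
function missingRequired(schema: SchemaNode, value: unknown, path = "$"): string[] {
  const missing: string[] = [];
  if (Array.isArray(value)) {
    const items = schema.items;
    if (items) {
      value.forEach((item, i) => {
        missing.push(...missingRequired(items, item, `${path}[${i}]`));
      });
    }
    return missing;
  }
  if (typeof value !== "object" || value === null) return missing;
  const obj = value as Record<string, unknown>;
  for (const key of schema.required ?? []) {
    if (!(key in obj)) missing.push(`${path}.${key}`);
  }
  for (const [key, child] of Object.entries(schema.properties ?? {})) {
    if (key in obj) missing.push(...missingRequired(child, obj[key], `${path}.${key}`));
  }
  return missing;
}

// Trimmed copy of the contract schema (address and endDate omitted).
const contractSchema: SchemaNode = {
  type: "object",
  properties: {
    parties: {
      type: "array",
      items: {
        type: "object",
        properties: {
          name: { type: "string" },
          role: { enum: ["buyer", "seller", "guarantor"] },
        },
        required: ["name", "role"],
      },
    },
    terms: {
      type: "object",
      properties: {
        startDate: { type: "string" },
        value: { type: "number" },
        currency: { type: "string" },
      },
      required: ["startDate", "value", "currency"],
    },
  },
  required: ["parties", "terms"],
};

// Made-up sample: the second party has no role, and terms has no currency.
const draft = {
  parties: [{ name: "Acme Ltd", role: "buyer" }, { name: "Bolt GmbH" }],
  terms: { startDate: "2024-01-01", value: 50000 },
};

console.log(missingRequired(contractSchema, draft));
// → [ '$.parties[1].role', '$.terms.currency' ]
```

Optional properties such as `address` and `endDate` in the full schema never appear in this report, since they are not listed in any `required` array.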

Handling Validation Errors

When the model output doesn't match the schema, a validation error is thrown:

try {
  const result = await relay
    .workflow("extractor")
    .step("extract", { schema: MySchema })
    .with("openai:gpt-4o")
    .run(input);
} catch (error) {
  if (error.type === 'ValidationError') {
    console.error('Schema validation failed:', error.details);
    // Consider retrying with a different prompt or model
  }
}
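
The comment in the handler above suggests retrying with a different model. One way to structure that is sketched below, on the assumption that validation failures can be recognized by `error.type === 'ValidationError'` as in the example. `runWithFallback`, `isValidationError`, and the `runStep` callback are hypothetical names. The callback stands in for whatever relay call you make, so the helper itself does not depend on the SDK.

```typescript
// Hypothetical helper; `runStep` stands in for the relay call shown above.
type RunStep<T> = (model: string) => Promise<T>;

// Assumes validation failures carry `type === "ValidationError"`,
// as in the error-handling example.
function isValidationError(error: unknown): boolean {
  return (
    typeof error === "object" &&
    error !== null &&
    (error as { type?: unknown }).type === "ValidationError"
  );
}

// Try each model in order, moving to the next one only when schema
// validation fails. Any other error is rethrown immediately.
async function runWithFallback<T>(models: string[], runStep: RunStep<T>): Promise<T> {
  let lastError: unknown = new Error("No models provided");
  for (const model of models) {
    try {
      return await runStep(model);
    } catch (error) {
      if (!isValidationError(error)) throw error; // unrelated failure: surface it
      lastError = error;
    }
  }
  throw lastError; // every model failed validation
}
```

In use, `runStep` would wrap the workflow chain and pass its `model` argument to `.with(...)`. Keeping the list of models short bounds the cost of repeated failures.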