Block · Principal Content Designer · 2026

Building a risk classification engine

Driving data governance and semantic analysis to improve AI model outputs

Designed and implemented an AI-driven classification system that analyzes internal content standards documentation and assigns priority levels (critical, high, medium, low) to individual sections. This system enables downstream AI writing tools to distinguish between rigid requirements and flexible guidance, improving both compliance and output quality.

At a glance

Goal

Enable AI-generated content to adhere to content standards with appropriate levels of strictness, balancing compliance with creative flexibility.

Challenges

  • Ambiguity in content standards language and structure
  • Lack of labeled training data
  • Pressure to move faster
  • Need to balance AI system needs and human user needs

Approach

Create a way to take a disparate set of standards and build in risk assessment that is readable by both AI and humans.

Impact

  • Reinforced regulatory requirements for non-writers
  • Increased generation flexibility by allowing non-critical guidance to be applied more adaptively
  • Established a foundation for AI governance by operationalizing policy into machine-readable logic
Process
1

Defined the priority framework

Established clear criteria for critical, high, medium, and low classifications

2

Developed a semantic classification approach

Built a system based on natural language processing to interpret meaning and intent in standards
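The production system used NLP models to interpret meaning and intent; as a minimal sketch of the underlying idea, the example below scores a guideline against cue phrases for each tier. All cue lists and the `classify` function are invented for illustration, not the actual implementation.

```python
# Illustrative sketch only: cue-phrase matching stands in for the
# semantic (NLP-based) classification used in the real system.
TIER_CUES = {
    "critical": ["must", "required by law", "never", "regulatory"],
    "high": ["should", "brand", "tone", "voice"],
    "medium": ["prefer", "recommended", "by default"],
    "low": ["optional", "when practical", "suggestion"],
}

def classify(guideline: str) -> str:
    """Return the highest-priority tier whose cues appear in the guideline."""
    text = guideline.lower()
    for tier in ["critical", "high", "medium", "low"]:
        if any(cue in text for cue in TIER_CUES[tier]):
            return tier
    return "low"  # default when no cue matches

print(classify("All pages must include the FDIC disclaimer."))  # critical
```

A real system would replace the cue lists with embeddings or a fine-tuned classifier, but the interface (guideline text in, tier out) is the same.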

3

Created a training and evaluation set

Generated labeled examples to bootstrap model performance, focusing on regulatory required language
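A bootstrapping set like the one described above pairs guideline text with a tier label and supports a simple evaluation loop. The examples and the `accuracy` helper below are hypothetical, invented to show the shape of the data:

```python
# Hypothetical labeled examples; regulatory-required language is
# deliberately over-represented, as in the real training set.
labeled_examples = [
    ("All investment pages must carry the FDIC disclaimer.", "critical"),
    ("Never promise guaranteed returns.", "critical"),
    ("Avoid speculative language in market outlooks.", "high"),
    ("Use active voice in user-facing responses.", "medium"),
    ("The Oxford comma is preferred but not required.", "low"),
]

def accuracy(predict, examples):
    """Fraction of examples where the classifier matches the label."""
    hits = sum(1 for text, label in examples if predict(text) == label)
    return hits / len(examples)
```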

4

Iterated on model calibration

Tuned classification thresholds to strike the right balance between strict compliance and generative flexibility

5

Designed structured output for integration

Standardized outputs into a format consumable by AI writing systems, enabling dynamic enforcement of content rules during generation
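A structured output for a classified section might look like the JSON below. The field names and values are illustrative assumptions, not the actual production schema:

```python
import json

# Hypothetical record for one classified section of a content standard;
# field names are invented for illustration.
classified_section = {
    "section_id": "disclaimers.fdic",
    "tier": "critical",
    "rule": "Append the FDIC disclaimer to all financial content.",
    "enforcement": "hard_guardrail",  # vs. "default_instruction"
    "rationale": "Required by regulation; zero tolerance.",
}

print(json.dumps(classified_section, indent=2))
```

A machine-readable record like this is what lets a downstream writing system treat "critical" rules as hard constraints while applying lower tiers as softer defaults.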

What shipped

Delivered a scalable classification system that transforms static content standards into actionable inputs for AI systems, enabling more reliable and context-aware content generation.


LLM Content Risk Assessment Framework

Use this framework to evaluate guidelines from content standards and style guides, and assign each a risk tier based on its potential impact if violated by an LLM.

How to apply this framework
1
Identify the guideline
Extract each rule or requirement from the style guide or content standard being assessed.
2
Test against criteria
Match the guideline to the criteria in each tier, starting at critical and moving down.
3
Assign the tier
Assign the highest tier where the guideline meets at least two criteria. Document your reasoning.
4
Set enforcement
Use tier assignments to prioritize model instructions, guardrails, and QA test coverage.
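The assignment rule in steps 2 and 3 (take the highest tier where the guideline meets at least two criteria) can be sketched as follows; the criteria counts are assumed to come from a manual or automated check against each tier's list:

```python
TIERS = ["critical", "high", "medium", "low"]  # highest to lowest

def assign_tier(criteria_met: dict) -> str:
    """Return the highest tier where the guideline meets at least two
    of that tier's criteria; default to 'low' if none qualifies."""
    for tier in TIERS:
        if criteria_met.get(tier, 0) >= 2:
            return tier
    return "low"

# A guideline matching 1 critical criterion and 3 high criteria:
print(assign_tier({"critical": 1, "high": 3}))  # high
```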
Critical: Must never violate
Regulatory, legal, and safety requirements — zero tolerance
  • Violation creates legal liability or regulatory penalty
  • Noncompliance could cause direct harm to users or third parties
  • Required by law, regulation, or binding contract (GDPR, HIPAA, FTC, etc.)
  • Breach could trigger a product recall, ban, or enforcement action
  • No acceptable workaround or contextual exception exists
These guidelines represent absolute floors — requirements that the model must follow in every output, in every context, with no exceptions. They are typically codified in law, enforced by regulators, or tied to binding agreements. Failure here is not a quality issue; it is a compliance and safety failure. These should be embedded as hard guardrails in system prompts, model instructions, and output validation.
A financial content standard requires that all pages include a disclaimer footer that states exactly "Spring Bank is FDIC insured."
Guideline in practice
When the model responds to any financial-related query, it must append a financial disclaimer — regardless of how the user frames the request, and regardless of output length or format. Omitting this in a user-facing app creates regulatory exposure and possible monetary liability.
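A hard guardrail like this is often enforced as output validation rather than relying on the prompt alone. The sketch below checks for the exact disclaimer from the example above and appends it when missing; the function name and the `is_financial` flag are hypothetical:

```python
DISCLAIMER = "Spring Bank is FDIC insured."  # exact string from the standard

def enforce_disclaimer(response: str, is_financial: bool) -> str:
    """Hard guardrail: append the required disclaimer to any financial
    response that does not already end with it."""
    if is_financial and not response.rstrip().endswith(DISCLAIMER):
        return response.rstrip() + "\n\n" + DISCLAIMER
    return response

out = enforce_disclaimer("Our savings rates start at 4.1% APY.", True)
print(out.endswith(DISCLAIMER))  # True
```

Validating the final output this way makes the requirement hold regardless of how the user frames the request, which is the defining property of a critical-tier rule.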
High: Should consistently follow
Brand trust, credibility, and audience expectations
  • Violation would meaningfully damage brand reputation or user trust
  • Inconsistency here undermines the product's core value proposition
  • Required by an internal style guide with organizational authority
  • Closely tied to audience expectations for tone, voice, or format
  • Exceptions exist but must be deliberately approved, not accidental
These guidelines don't carry legal weight, but violating them consistently erodes trust, creates a disjointed user experience, or signals that the product is low-quality or unreliable. They reflect how an organization has chosen to present itself and what its audience has come to expect. The model should follow these in the vast majority of outputs, with rare, intentional exceptions.
A financial services brand guide specifies that the model must never use speculative or emotionally charged language (e.g., "guaranteed returns," "sure thing") in any investment-related content.
Guideline in practice
When describing fund performance or market outlooks, the model uses measured, factual language — "historically performed well" rather than "consistently beats the market." Violations here wouldn't necessarily trigger a regulator, but they would erode trust with a sophisticated audience and create brand risk.
Medium: Follow by default
Quality, clarity, and content consistency
  • Violation reduces output quality but does not cause user harm
  • Inconsistency creates friction but not a fundamental trust breakdown
  • Guideline reflects best practice or house style preference
  • Reasonable exceptions exist based on context or user request
  • Deviation is noticeable to an informed reviewer, not a casual user
Medium-tier guidelines shape the quality and consistency of outputs without being non-negotiable. These are often editorial preferences, formatting standards, or structural conventions. They matter for professional-grade output and for maintaining a coherent voice across a product — but occasional, context-appropriate deviations are acceptable. These should appear as model instructions and default behaviors, not hard constraints.
A content style guide specifies that responses should use active voice and avoid nominalization (e.g., "we analyzed" not "an analysis was conducted").
Guideline in practice
For standard user-facing responses, the model writes in active voice. In formal report generation or when a user requests a specific passive structure, the model may deviate. The quality difference is real but not brand-threatening.
Low: Apply when practical
Stylistic preferences and contextual suggestions
  • Violation has no measurable impact on trust or user outcome
  • Guideline reflects personal, team, or regional preference
  • Inconsistency is unlikely to be noticed by most users
  • Context-dependent — correct behavior varies by use case
  • Would only be flagged in a detailed editorial review, not production QA
Low-tier guidelines are preferences that improve polish but do not meaningfully affect user trust, product quality, or brand integrity. They're worth knowing and applying when there is no competing consideration — but they should not drive heavy enforcement, prompt real estate, or QA cycles. These are useful for training data curation or fine-tuning feedback but should not block deployment.
A writing guide notes that the Oxford comma is preferred but not required, and that em dashes should use a space on each side in web copy.
Guideline in practice
The model uses Oxford commas by default and formats em dashes with spaces when generating blog-style content. In structured outputs, forms, or tables, punctuation conventions are deprioritized. A reviewer may flag these in an audit, but no user-facing impact occurs either way.