---
title: "Attributes for Conceptual Data Objects"
url: https://ott.earth/cdo/minimal-attributes/
lead: "Every CDO should have a clear, business-focused set of attributes — ensuring each data object is well understood, easy to manage, and fit for purpose across the enterprise."
author: "Pavel Ott"
source: https://ott.earth/
license: CC BY 4.0
---

# Attributes for Conceptual Data Objects

> Every CDO should have a clear, business-focused set of attributes — ensuring each data object is well understood, easy to manage, and fit for purpose across the enterprise.


> **Tip:** Not every CDO needs every attribute — focus on what's essential for business clarity, governance, and compliance. Keep definitions short, practical, and easy to update.

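Each attribute below carries a second purpose: it shapes how AI agents, Model Context Protocol (MCP) servers, and retrieval-augmented generation (RAG) pipelines can use the CDO directly. In the Core and Supporting tables, the *italic* commentary in each row explains that second purpose.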

## Core (Functional)

<div class="table-wrap">

<table class="attrs">
  <thead>
    <tr><th style="width:18%">Attribute</th><th style="width:57%">Guideline</th><th style="width:25%">Example</th></tr>
  </thead>
  <tbody>
    <tr><td><strong>Name</strong></td><td>Short, business-friendly label for the CDO. <em>Used as the canonical token in MCP tool names, RAG indexes, and agent prompts — eliminates synonym drift ("Client" vs "Customer" vs "Account") that derails AI retrieval.</em></td><td><code>Customer</code></td></tr>
    <tr><td><strong>Business Meaning &amp; Definition</strong></td><td>One-sentence summary in plain English. <em>Becomes the <code>description</code> field exposed to LLMs via MCP servers and tool schemas — models match user intent to the right concept by meaning, not by guessing from column names.</em></td><td>"A person or organisation that has an active relationship with the enterprise and may transact with it."</td></tr>
    <tr><td><strong>Type</strong></td><td>Category (External Standard, Canonical, Bespoke). <em>Signals to AI agents how much trust to place in the concept: autonomous actions can be permitted on External Standard and Canonical CDOs and gated for human review on Bespoke ones.</em></td><td>Canonical</td></tr>
    <tr><td><strong>Relationships</strong></td><td>List inbound and outbound links to other CDOs. <em>Lets AI agents traverse the enterprise data graph deterministically (Customer → Order → Invoice) instead of hallucinating joins or inventing foreign keys.</em></td><td>Customer → places → Order; Customer → owns → Account</td></tr>
    <tr><td><strong>Mandatory Attributes</strong></td><td>Key business fields that must be present. <em>Defines the required argument shape for MCP tool calls — the LLM knows what it must collect from the user before invoking an action.</em></td><td>CustomerID, LegalName, ContactEmail, Country</td></tr>
    <tr><td><strong>Validation Rules</strong></td><td>Constraints (uniqueness, mandatory fields, valid ranges). <em>Machine-readable guardrails that automatically reject malformed AI-generated payloads before they hit downstream systems — turns "hope" into enforcement.</em></td><td>CustomerID unique; ContactEmail per RFC 5322; Country in ISO 3166-1</td></tr>
    <tr><td><strong>Interoperability</strong></td><td>Ability to map to external standards or internal canonical models. <em>Enables AI agents to translate between vendor schemas and the enterprise concept on the fly, so the same prompt works whether the data lives in Workday, SAP, or a CSV.</em></td><td>Maps to ISO 20022 "Party"; SAP Business Partner; Salesforce Account</td></tr>
    <tr><td><strong>Usage Scenarios</strong></td><td>Where and how the CDO is used (reporting, analytics, integration, compliance). <em>Used by AI tool-routers to decide which agent or skill should handle a given query — improves tool selection accuracy in multi-agent workflows.</em></td><td>CRM segmentation, billing, KYC reporting, churn analytics</td></tr>
    <tr><td><strong>Effective Date &amp; Version</strong></td><td>When the current definition applies; track changes. <em>AI agents and embeddings must pin to a version — without this, a model can quote a definition that was retired last quarter and confidently mislead the user.</em></td><td>v2.3, effective 2025-07-01</td></tr>
    <tr><td><strong>Authoritative Source</strong></td><td>Who owns or governs the CDO (internal or external). <em>Gives AI agents a citation target and a human escalation path: every AI-generated answer about this concept can be attributed to a known steward.</em></td><td>Customer Data Office (Steward: J. Doe)</td></tr>
    <tr><td><strong>Usage Notes</strong></td><td>Consuming domains, reports, APIs, integrations. <em>Tells AI orchestrators which downstream systems will be touched by a tool call — essential for impact assessment, blast-radius control, and "are you sure?" confirmations.</em></td><td>Consumed by Billing, Analytics Warehouse, Marketing Automation</td></tr>
  </tbody>
</table>

</div>
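
As a minimal sketch of how the Name, Business Meaning &amp; Definition, Type, and Mandatory Attributes rows could feed a tool definition, the snippet below derives a schema from a CDO record. The `ConceptualDataObject` class, the `to_tool_schema` helper, and the `requiresHumanReview` and `cdoVersion` keys are hypothetical names introduced for illustration. The `name`, `description`, and `inputSchema` keys follow the general shape of MCP tool definitions, but treat the exact output as an assumption rather than a prescribed format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConceptualDataObject:
    """Hypothetical in-memory form of a CDO; fields mirror the Core table."""
    name: str
    definition: str
    cdo_type: str  # "External Standard", "Canonical", or "Bespoke"
    mandatory_attributes: tuple
    version: str


def to_tool_schema(cdo: ConceptualDataObject, action: str = "create") -> dict:
    """Derive a JSON-Schema-style tool definition from a CDO."""
    return {
        # The CDO name is the canonical token, so no synonyms leak into tool names.
        "name": f"{action}_{cdo.name.lower()}",
        # The business definition is what the model reads to match intent.
        "description": cdo.definition,
        "inputSchema": {
            "type": "object",
            # Every field is typed as a string here purely to keep the sketch short.
            "properties": {attr: {"type": "string"} for attr in cdo.mandatory_attributes},
            "required": list(cdo.mandatory_attributes),
        },
        # Assumed extension keys: gate Bespoke concepts and pin the version.
        "requiresHumanReview": cdo.cdo_type == "Bespoke",
        "cdoVersion": cdo.version,
    }


customer = ConceptualDataObject(
    name="Customer",
    definition=(
        "A person or organisation that has an active relationship "
        "with the enterprise and may transact with it."
    ),
    cdo_type="Canonical",
    mandatory_attributes=("CustomerID", "LegalName", "ContactEmail", "Country"),
    version="2.3",
)

schema = to_tool_schema(customer)
print(schema["name"], schema["inputSchema"]["required"])
```

Deriving the schema from the CDO, rather than writing it by hand, means a change to the mandatory attributes only has to be made in one place.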

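The Validation Rules row can be read as a guardrail that runs before any AI-generated payload reaches a downstream system. The sketch below checks the three example rules from the table. The `validate_customer` function is hypothetical, the email pattern is a deliberately simplified stand-in for RFC 5322, and the country set is a tiny illustrative subset of ISO 3166-1 alpha-2 codes.

```python
import re

# Simplified stand-in for RFC 5322; a real deployment would use a dedicated validator.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
# Illustrative subset of ISO 3166-1 alpha-2 codes, not the full list.
ISO_COUNTRIES = {"AU", "DE", "GB", "US"}
MANDATORY = ("CustomerID", "LegalName", "ContactEmail", "Country")


def validate_customer(payload: dict, existing_ids: set) -> list:
    """Return a list of rule violations; an empty list means the payload passes."""
    errors = [
        f"missing mandatory field: {name}"
        for name in MANDATORY
        if not payload.get(name)
    ]
    customer_id = payload.get("CustomerID")
    if customer_id and customer_id in existing_ids:
        errors.append(f"CustomerID not unique: {customer_id}")
    email = payload.get("ContactEmail")
    if email and not EMAIL_PATTERN.match(email):
        errors.append(f"ContactEmail malformed: {email}")
    country = payload.get("Country")
    if country and country not in ISO_COUNTRIES:
        errors.append(f"Country not in ISO 3166-1 subset: {country}")
    return errors


good = {
    "CustomerID": "C-1001",
    "LegalName": "Acme Pty Ltd",
    "ContactEmail": "ops@acme.example",
    "Country": "AU",
}
bad = {
    "CustomerID": "C-1000",
    "LegalName": "",
    "ContactEmail": "not-an-email",
    "Country": "Atlantis",
}

print(validate_customer(good, existing_ids={"C-1000"}))
for problem in validate_customer(bad, existing_ids={"C-1000"}):
    print(problem)
```

Returning every violation at once, instead of failing on the first, gives an agent enough information to correct the payload in a single retry.
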
## Supporting (Non-Functional)

<div class="table-wrap">

<table class="attrs">
  <thead>
    <tr><th style="width:18%">Attribute</th><th style="width:57%">Guideline</th><th style="width:25%">Example</th></tr>
  </thead>
  <tbody>
    <tr><td><strong>Performance</strong></td><td>Response time targets for queries and API calls. <em>Determines whether the CDO can be called inline during a chat turn or must be cached/pre-fetched — directly shapes AI agent latency budgets.</em></td><td>Lookup ≤ 200 ms p95; search ≤ 500 ms p95</td></tr>
    <tr><td><strong>Scalability</strong></td><td>Ability to handle growth in data volume. <em>AI workloads are bursty (one prompt fans out into many tool calls) — capacity must flex without throttling agents mid-task.</em></td><td>10M records today; 30% YoY growth; burst 500 req/s</td></tr>
    <tr><td><strong>Availability</strong></td><td>Uptime targets (e.g., 99.9% for critical CDOs). <em>When AI sits in the critical path, a CDO outage cascades into a visible AI failure — availability is now a user-experience metric, not just an infrastructure one.</em></td><td>99.95% uptime, 24×7</td></tr>
    <tr><td><strong>Resiliency</strong></td><td>Disaster recovery and failover capabilities (RPO/RTO). <em>AI agents that fail silently are dangerous; failover behaviour must be explicit so the agent either retries cleanly or refuses, never fabricates.</em></td><td>RPO 15 min; RTO 1 hour; multi-region active-passive</td></tr>
    <tr><td><strong>Quality Rules</strong></td><td>Checks for accuracy, completeness, uniqueness, and referential integrity. <em>Poor data quality is amplified by AI: bad inputs become confident wrong outputs at scale. Quality rules are an upstream defence against that failure mode.</em></td><td>&lt;0.1% duplicate rate; 100% mandatory fields populated; FK integrity to Country</td></tr>
    <tr><td><strong>Lifecycle</strong></td><td>Creation, update, archival, and review cadence. <em>RAG corpora and fine-tuning datasets must respect the CDO lifecycle — otherwise AI keeps surfacing deprecated definitions long after the business has moved on.</em></td><td>Created on first transaction; reviewed annually; archived 7 years after last activity</td></tr>
    <tr><td><strong>Security</strong></td><td>Encryption, access controls (RBAC, MFA) for sensitive data. <em>AI agents should act under the caller's identity; RBAC enforced at the CDO then limits oversharing through prompt injection, summarisation, or careless context-window leakage.</em></td><td>RBAC; PII fields AES-256 at rest; TLS 1.3 in transit; MFA for write</td></tr>
    <tr><td><strong>Compliance</strong></td><td>Privacy, retention, and regulatory constraints. <em>A model can re-expose restricted data through paraphrase or summary, so compliance must be enforced at the CDO and sensitive content kept out of the model's context window in the first place.</em></td><td>GDPR Art. 17 (right to erasure); 7-year tax retention; Australian Privacy Principles (APP)</td></tr>
    <tr><td><strong>Auditability</strong></td><td>Full change history and version control. <em>Every AI-driven decision needs a reproducible trail: which definition, which version, which rules were in force at the moment of inference.</em></td><td>Every field change logged with user, timestamp, before/after; immutable audit store</td></tr>
    <tr><td><strong>Maintainability</strong></td><td>Ease of updating definitions and propagating changes. <em>AI tooling subscribed to the CDO (vector stores, MCP schemas, prompt templates) needs predictable change events to refresh embeddings and tool definitions automatically.</em></td><td>Schema changes published via change-event topic; semver versioning</td></tr>
    <tr><td><strong>Portability</strong></td><td>Technology-independent representation (e.g., JSON, XML, UML). <em>Directly consumable by MCP servers, OpenAPI tool schemas, and embedding pipelines: a portable CDO is an AI-ready CDO that needs little or no glue code.</em></td><td>JSON Schema + OpenAPI 3.1 definition; UML class diagram</td></tr>
  </tbody>
</table>

</div>
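
One way to read the Security and Compliance rows together is as a filter applied before any record is placed in a model's context. The sketch below assumes a simple role-to-field grant table. The role names, `ROLE_GRANTS`, `PII_FIELDS`, and `build_context` are all hypothetical and stand in for whatever access-control system actually enforces RBAC.

```python
# Hypothetical field classification and role grants; real values would come
# from the CDO's Security and Compliance attributes.
PII_FIELDS = {"LegalName", "ContactEmail"}
ROLE_GRANTS = {
    "support_agent": {"CustomerID", "LegalName", "ContactEmail", "Country"},
    "analytics_agent": {"CustomerID", "Country"},
}


def build_context(record: dict, caller_role: str) -> dict:
    """Return only the fields the caller's role may see.

    Unknown roles get nothing (deny by default). Fields are dropped rather
    than masked, so restricted values never reach the prompt.
    """
    allowed = ROLE_GRANTS.get(caller_role, set())
    return {key: value for key, value in record.items() if key in allowed}


record = {
    "CustomerID": "C-1001",
    "LegalName": "Acme Pty Ltd",
    "ContactEmail": "ops@acme.example",
    "Country": "AU",
}

analytics_view = build_context(record, "analytics_agent")
unknown_view = build_context(record, "intern_bot")

# Sanity check: no PII field survives in the analytics view.
leaked = PII_FIELDS.intersection(analytics_view)
print(analytics_view, unknown_view, leaked)
```

Filtering at the point of retrieval, keyed on the caller's role, means the agent cannot summarise or paraphrase a value it was never given.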

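The Auditability row asks which definition version was in force at the moment of inference. A minimal sketch of that lookup, assuming a version history sorted by effective date, is shown below. Only `v2.3` and its effective date come from the Core table; the earlier history entries, `version_in_force`, and `audit_entry` are invented for illustration.

```python
from bisect import bisect_right
from datetime import date
from typing import Optional

# Hypothetical version history, sorted by effective date.
# Only the last entry (v2.3, 2025-07-01) appears in the tables above.
HISTORY = [
    (date(2023, 1, 1), "v2.1"),
    (date(2024, 3, 1), "v2.2"),
    (date(2025, 7, 1), "v2.3"),
]


def version_in_force(on: date) -> Optional[str]:
    """Return the definition version effective on the given date, if any."""
    effective_dates = [effective for effective, _ in HISTORY]
    # bisect_right counts how many versions became effective on or before `on`.
    index = bisect_right(effective_dates, on)
    return HISTORY[index - 1][1] if index else None


def audit_entry(question: str, inferred_on: date) -> dict:
    """Build a reproducible record of which definition an answer relied on."""
    return {
        "cdo": "Customer",
        "question": question,
        "inferred_on": inferred_on.isoformat(),
        "definition_version": version_in_force(inferred_on),
    }


entry = audit_entry("Who counts as a customer?", date(2025, 6, 30))
print(entry["definition_version"])
```

Storing the resolved version with each answer lets a reviewer replay the decision against the definition that applied at the time, not the current one.
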
## AI-Native Attributes

<p class="lead-sm">Recommended additions for AI / agentic deployments. Treating CDOs as first-class AI resources requires a small set of additional attributes. These make a CDO directly usable by MCP servers, agent frameworks, and retrieval pipelines — without bolt-on glue code or one-off integration projects.</p>

<div class="table-wrap">

<table class="attrs attrs-ai">
  <thead>
    <tr><th style="width:18%">AI Attribute</th><th style="width:57%">Guideline</th><th style="width:25%">Example</th></tr>
  </thead>
  <tbody>
    <tr><td><strong>MCP / Tool Exposure</strong></td><td>Whether and how the CDO is exposed through MCP servers, tool schemas, or agent APIs. Defines the AI-callable contract: read, write, search, subscribe — and the auth model that applies.</td><td><code>mcp://crm/customer</code> — read, search; write gated by OAuth scope <code>customer.write</code></td></tr>
    <tr><td><strong>Semantic Representation</strong></td><td>Recommended embedding model, chunking strategy, and synonyms/aliases for retrieval. Ensures consistent semantic search behaviour across every AI application referencing the same concept.</td><td><code>text-embedding-3-large</code>; 512-token chunks; aliases: "Client", "Account", "Party"</td></tr>
    <tr><td><strong>AI Usage Policy</strong></td><td>Permitted AI uses: retrieval, summarisation, inference, training, fine-tuning, third-party LLM exposure. Prevents sensitive or contractually-restricted CDOs from leaking into model training or external services.</td><td>Retrieval &amp; summarisation: allowed; training &amp; 3rd-party LLM: disallowed</td></tr>
    <tr><td><strong>Grounding Status</strong></td><td>Whether this CDO is approved as a source-of-truth anchor for RAG and agentic workflows. Distinguishes ground-truth concepts from informational ones — a critical control on hallucination.</td><td>Approved RAG anchor for sales-assist and support agents</td></tr>
    <tr><td><strong>Confidence &amp; Provenance Signals</strong></td><td>Required signals returned with the data (source system, freshness, confidence score, last-validated timestamp). Lets AI agents weight evidence and tell the user when an answer is provisional.</td><td><code>source=CRM</code>; freshness ≤ 5 min; confidence 0.95; validated 2025-08-15T10:00Z</td></tr>
    <tr><td><strong>Human-in-the-Loop Triggers</strong></td><td>Conditions under which AI actions on this CDO require human approval (e.g., create, delete, value above threshold). Defines blast-radius limits for autonomous agents.</td><td>Approval required for create, delete, merge, or any change to LegalName / Country</td></tr>
  </tbody>
</table>

</div>
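
The AI Usage Policy and Human-in-the-Loop Triggers rows can be expressed as two small checks that an orchestrator runs before acting. The sketch below encodes the example values from the table. The dictionary keys, the `ProposedAction` class, and both function names are hypothetical, and uses not listed in the policy are denied by default as a design assumption.

```python
from dataclasses import dataclass

# Values taken from the example column above; the structure is an assumption.
AI_USAGE_POLICY = {
    "retrieval": True,
    "summarisation": True,
    "training": False,
    "third_party_llm": False,
}
APPROVAL_ACTIONS = {"create", "delete", "merge"}
APPROVAL_FIELDS = {"LegalName", "Country"}


@dataclass(frozen=True)
class ProposedAction:
    """A change an agent wants to make to a Customer record (hypothetical shape)."""
    action: str  # e.g. "create", "update", "delete", "merge"
    changed_fields: frozenset = frozenset()


def is_use_permitted(use: str) -> bool:
    """Uses that the policy does not mention are denied by default."""
    return AI_USAGE_POLICY.get(use, False)


def requires_human_approval(proposal: ProposedAction) -> bool:
    """True when the action type or any touched field is on the approval list."""
    if proposal.action in APPROVAL_ACTIONS:
        return True
    return bool(proposal.changed_fields & APPROVAL_FIELDS)


email_fix = ProposedAction("update", frozenset({"ContactEmail"}))
relocation = ProposedAction("update", frozenset({"Country"}))
removal = ProposedAction("delete")

print(requires_human_approval(email_fix))
print(requires_human_approval(relocation))
print(requires_human_approval(removal))
```

Keeping both checks as data-driven lookups means the steward can tighten the policy by editing the CDO, without changing agent code.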

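For the Confidence &amp; Provenance Signals row, a sketch of how an agent might decide that an answer is provisional is shown below. The 5-minute freshness limit comes from the example column; the `Provenance` class, the `is_provisional` function, and the 0.9 confidence threshold are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Provenance:
    """Signals returned alongside the data (hypothetical envelope)."""
    source: str
    retrieved_at: datetime
    confidence: float
    last_validated: datetime


MAX_AGE = timedelta(minutes=5)  # freshness target from the example column
MIN_CONFIDENCE = 0.9            # assumed threshold, not stated in the table


def is_provisional(signals: Provenance, now: datetime) -> bool:
    """True when the data is stale or its confidence is below the threshold."""
    stale = now - signals.retrieved_at > MAX_AGE
    return stale or signals.confidence < MIN_CONFIDENCE


signals = Provenance(
    source="CRM",
    retrieved_at=datetime(2025, 8, 15, 10, 1, tzinfo=timezone.utc),
    confidence=0.95,
    last_validated=datetime(2025, 8, 15, 10, 0, tzinfo=timezone.utc),
)

fresh_check = is_provisional(signals, now=datetime(2025, 8, 15, 10, 3, tzinfo=timezone.utc))
stale_check = is_provisional(signals, now=datetime(2025, 8, 15, 10, 10, tzinfo=timezone.utc))
print(fresh_check, stale_check)
```

Passing `now` in explicitly, rather than reading the clock inside the function, keeps the check deterministic and easy to replay during an audit.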
