A Business Analyst's Guide to Designing the OAC AI Agent Components That Make It Work
By Shriram Gupta | Oracle Solution Architect | BizInsight Consulting
Oracle has made it genuinely easy to create an AI Agent in Oracle Analytics Cloud. Navigate to Create, select AI Agent, choose a dataset, paste some instructions, upload a document. You can have an agent running in under twenty minutes. I know because I built one.
But here is what I learned quickly: the ease of setup is a trap. The agent I built in twenty minutes could not answer half the questions I asked it in plain English. It answered the same question differently depending on how I phrased it. It gave me numbers I could not trust. And it had no idea what a late payment meant in the context of my organisation's AP policy — even though that definition was fundamental to everything the business needed.
The configuration of an OAC AI Agent is not the hard work. The design of the three components that feed it — the Dataset, the Knowledge Document, and the Supplemental Instructions — is where most implementations succeed or fail. And that design work is fundamentally a Business Analyst's responsibility, not an IT task.
This blog is for Business Analysts, functional consultants, and anyone who will sit across the table from a business stakeholder and design an OAC AI Agent for their domain. The IT implementation that follows depends entirely on the quality of what you define here first.
1. Why the Agent Is the Easy Part
When Oracle launched the AI Agent feature in OAC, the product team did something smart — they made the configuration interface deceptively simple. Three fields: a dataset, a text box for instructions, and a file upload for documents. That simplicity is intentional. It removes the technical barrier to entry.
The problem is that simplicity creates a false impression. It makes the agent look like the product. It is not. The agent is the container. The three components you fill it with are the product. And those components require deep business knowledge, careful design thinking, and structured conversations with your subject matter experts that no Oracle wizard or configuration screen can replace.
Think of it this way. A blank spreadsheet takes three seconds to create. A financial model that a CFO can trust takes weeks of requirements gathering, formula design, and validation. The spreadsheet is not the work. The content is the work. The same principle applies to an OAC AI Agent.
The critical insight: An OAC AI Agent built on a poorly designed dataset, a vague knowledge document, and incomplete supplemental instructions will confidently give wrong answers. And a confidently wrong AI answer is more dangerous than no answer at all — because users trust it.
2. Discovery First — Ask Why Before You Ask What
Before any BA touches a dataset schema or drafts a single supplemental instruction, there is a discovery phase that most implementations skip — and pay for later.
The natural instinct is to start with what: what data do we have, what tables exist, what reports are already built. This is an IT-flavoured starting point and it leads to an agent designed around available data rather than around business need. The result is an agent that can answer questions nobody is actually asking.
A BA starts with why, and with which problem needs solving. The discovery conversation with a business stakeholder sounds like this:
The Five Discovery Questions Every BA Must Ask
- What question do you ask every Monday morning that takes you hours to answer today?
- When was the last time you made a wrong decision because you had the wrong number — and what happened as a result?
- If you could ask your data anything right now and get an instant, trustworthy answer, what would you ask?
- Do you and your colleagues define key terms the same way — for example, does everyone agree on what counts as a late payment or an open order?
- What does a wrong answer from this agent cost the business?
These questions do three things. They surface the real use cases — the Monday morning problems that drive actual business value. They expose conflicting metric definitions early, before you have built anything expensive. And they establish the stakes — helping the BA prioritise which design decisions matter most.
The fifth question is particularly important. An agent answering questions about office supply spend can afford to be occasionally imprecise. An agent answering questions about invoice approval compliance or fraud indicators cannot. Understanding the cost of a wrong answer shapes every subsequent design decision.
Discovery output: Before leaving the business stakeholder meeting, a BA should have a written list of at least 15 specific questions the agent must answer correctly. Not categories — actual questions, written in the exact language the business user would type. This list becomes the design specification for all three components and the test plan for validating the agent before launch.
3. Dataset Design — Start With Questions, Work Backwards to Fields
The dataset is the foundation of everything. Every answer the agent gives comes from the data it can query. A well-designed dataset makes the agent look intelligent. A poorly designed dataset makes the agent look broken — even when the agent itself is configured perfectly.
There is one architectural constraint that makes dataset design critical: an OAC AI Agent is scoped to exactly one dataset. The agent cannot join across datasets, cannot pull from multiple sources mid-query, and cannot compensate for data that simply is not there. This is not a temporary limitation — it is a design constraint that forces discipline upfront.
3.1 Design Backwards From the Questions
Most people design datasets by looking at available data and asking what questions it can answer. A BA should do the opposite — start with the questions the business needs answered and work backwards to the fields required to answer them.
Take the discovery question list from Section 2. For each question, ask: what data fields are needed to answer this? Write them down. When you have worked through every question, the union of those fields is your dataset schema. You will find that most of the fields are obvious — but a handful will require conversations with IT about derived calculations, pre-joined tables, or fields that do not yet exist in any source system.
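The work-backwards step can be sketched in a few lines of Python. This is a minimal illustration only: the questions, the field names, and the list of fields available in the source system are all hypothetical, and the real inputs come from your own discovery sessions.

```python
# Hypothetical discovery questions mapped to the fields needed to answer them.
# Field names are illustrative assumptions, not a real schema.
question_fields = {
    "Which invoices were paid late last quarter?": {
        "Invoice_Number", "Due_Date", "Payment_Date", "Payment_Status",
    },
    "Which vendors have the most late payments?": {
        "Vendor_Name", "Payment_Status",
    },
    "How many days does it take us to pay on average?": {
        "Invoice_Date", "Payment_Date", "Days_To_Pay",
    },
}

# The dataset schema is the union of every field any question needs.
schema = sorted(set().union(*question_fields.values()))

# Hypothetical list of fields that already exist in the source system.
available_in_source = {
    "Invoice_Number", "Invoice_Date", "Due_Date",
    "Payment_Date", "Payment_Status", "Vendor_Name",
}

# Anything left over needs a conversation with IT (derived or missing fields).
missing = sorted(set(schema) - available_in_source)

print("Schema:", schema)
print("Discuss with IT:", missing)
```

The leftover list is the short agenda for the IT conversation about derived calculations and fields that do not yet exist.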
3.2 Make the Data Self-Describing
This is the single most impactful design decision a BA can make for dataset quality — and it is almost universally overlooked on first-generation agent builds.
OAC AI Agent interprets column values as well as column names when formulating answers. If a column called Payment_Status contains values of Y and N, the agent must infer what those mean. It will sometimes infer correctly and sometimes not — especially when a user asks in natural language without using the field name.
If the same column contains values like Paid On Time and Paid Late, the agent does not need to infer anything. The value is the answer. This principle — self-describing values — improves agent answer quality without changing a single line of supplemental instructions.
Apply this selectively. Use descriptive values where they carry real meaning.
Keep Yes / No for simple binary flags that also feed reports and dashboards — replacing Yes with a 40-character string breaks conditional formatting, filter dropdowns, and calculated measures. The supplemental instructions carry the translation burden for those fields instead.
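As a rough sketch of the idea, the recoding below replaces coded values with self-describing ones for a chosen column and leaves simple binary flags alone. The assumption that Y means paid on time and N means paid late is mine, as are the On_Hold column and the sample row. In a real build this translation would normally live in the dataset view rather than in Python.

```python
# Hypothetical mapping from coded values to self-describing values.
# Assumption: Y = paid on time, N = paid late. Confirm with the business.
RECODE = {
    "Payment_Status": {"Y": "Paid On Time", "N": "Paid Late"},
}

def make_self_describing(row, recode=RECODE):
    """Return a copy of row with coded values replaced where a mapping exists."""
    out = dict(row)
    for column, mapping in recode.items():
        if column in out:
            # Unknown codes pass through unchanged so nothing is silently lost.
            out[column] = mapping.get(out[column], out[column])
    return out

# On_Hold is a simple Yes / No flag, so it is deliberately not recoded.
row = {"Invoice_Number": "INV-1001", "Payment_Status": "N", "On_Hold": "Yes"}
described = make_self_describing(row)
print(described)
```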
Real-world lesson: During testing of our AP Intelligence Agent demo, the agent failed to answer several plain English questions correctly. When we rephrased those same questions using the exact field name and value, it answered correctly every time. The root cause was not the agent — it was that the data used codes the agent had to interpret. After applying self-describing values to the right fields, the plain English success rate improved significantly without touching the supplemental instructions.
3.3 Define Derived Fields — Do Not Leave Calculations to the Agent
Business KPIs rarely exist as raw fields in a source system. Days Payable Outstanding, late payment rate, days overdue, three-way match rate — these are calculated metrics. A BA must identify every derived field the agent will need and define the business formula explicitly.
Do not leave these calculations to the agent. The agent can perform basic arithmetic, but it does not know your fiscal year definition, your payment terms policy, or whether your organisation counts overdue from invoice date or due date. Undefined calculations produce inconsistent answers to the same underlying question.
The BA's job is to write the business definition of every derived field in plain English. IT's job is to implement that definition in SQL. This handoff document — written business formula from BA to IT — is the most important artefact the BA produces in the dataset design phase.
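To show what an unambiguous definition looks like once it leaves plain English, here is a small sketch. It assumes lateness is counted in calendar days from the due date and that DPO is a simple average of Days_To_Pay, which matches the example definitions used elsewhere in this post. Your organisation's rules may differ, and IT would implement the real version in SQL.

```python
from datetime import date

def days_late(due_date, payment_date):
    """Calendar days by which payment exceeded the due date (0 if on time)."""
    return max((payment_date - due_date).days, 0)

def is_late(due_date, payment_date):
    """Late means the payment date exceeds the due date by one or more days."""
    return days_late(due_date, payment_date) >= 1

def dpo(days_to_pay):
    """Average of Days_To_Pay across the invoices in the selected period."""
    if not days_to_pay:
        raise ValueError("no invoices in the selected period")
    return sum(days_to_pay) / len(days_to_pay)

print(days_late(date(2024, 3, 1), date(2024, 3, 4)))
print(is_late(date(2024, 3, 1), date(2024, 3, 1)))
print(dpo([30, 40, 35]))
```

Every choice in this sketch (calendar days rather than business days, due date rather than invoice date) is a business decision the BA must write down before IT builds anything.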
3.4 The One-Dataset Constraint Requires Pre-Join Design
Because the agent works with a single dataset, any multi-table question must be answered from a single flat view. In an Oracle EBS environment this means the joining logic — headers to lines, lines to holds, transactions to approvals — must happen in ADW before the dataset is created. The BA does not implement this join, but the BA must specify it.
For every cross-table question in the discovery list, the BA should document which source tables are needed, which fields from each table, and how they relate. A BA who hands IT a list of questions and says "figure out the tables" will get a dataset that may or may not answer those questions correctly. A BA who hands IT a field-level specification will get exactly what the agent needs.
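The shape of the flat view can be illustrated with a toy example. In practice the join is built in ADW by IT, so the Python below is only a sketch of the result the BA is specifying. The table contents, the field names, and the "No Hold" default are hypothetical.

```python
# Hypothetical source tables, reduced to a few fields each.
headers = [
    {"invoice_id": 1, "vendor": "Acme"},
    {"invoice_id": 2, "vendor": "Globex"},
]
lines = [
    {"invoice_id": 1, "line_no": 1, "amount": 500.0},
    {"invoice_id": 1, "line_no": 2, "amount": 250.0},
    {"invoice_id": 2, "line_no": 1, "amount": 900.0},
]
holds = [{"invoice_id": 2, "hold_reason": "Missing contract"}]

def flatten(headers, lines, holds):
    """One output row per invoice line, with header and hold fields attached."""
    header_by_id = {h["invoice_id"]: h for h in headers}
    hold_by_id = {h["invoice_id"]: h["hold_reason"] for h in holds}
    flat = []
    for line in lines:
        header = header_by_id[line["invoice_id"]]
        flat.append({
            "invoice_id": line["invoice_id"],
            "vendor": header["vendor"],
            "line_no": line["line_no"],
            "amount": line["amount"],
            # Invoices without a hold get an explicit, self-describing value.
            "hold_reason": hold_by_id.get(line["invoice_id"], "No Hold"),
        })
    return flat

flat_rows = flatten(headers, lines, holds)
for row in flat_rows:
    print(row)
```

The BA's specification answers the questions this sketch quietly decides: what one row represents, which tables contribute which fields, and what value appears when a related record does not exist.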
4. Knowledge Document Design — Write for Retrieval, Not for Reading
The knowledge document is the component most people get wrong — because they think of it as a document. It is not. It is a retrieval index.
When a user asks the agent a question that involves a policy rule or business definition, the agent does not read the entire knowledge document. It retrieves the most relevant chunks — short passages that match the semantic context of the question — and uses those chunks to ground its answer. This process is called RAG: Retrieval-Augmented Generation. How well it works depends almost entirely on how the document is structured and written.
A document written for human reading — with narrative paragraphs, context-setting introductions, and legal hedging language — retrieves poorly. A document written for RAG retrieves accurately and produces precise, policy-cited answers.
4.1 The Three Types of Knowledge Document Content
- Policy rules with thresholds. Specific, numbered rules that define required behaviour at specific conditions. Example: "Invoices above $500,000 require CFO approval before payment release. Finance Director approval alone is not sufficient at this threshold." State each rule completely — never reference another section, because a chunk retrieval may return only one section in isolation.
- Business definitions. Exact definitions of key metrics and terms as your organisation uses them — not textbook definitions. Example: "A late payment is any invoice where the Payment Date exceeds the Due Date by one or more calendar days." This prevents the agent from applying generic interpretations to your organisation-specific terminology.
- Escalation and exception rules. What happens when the standard process cannot be followed — who gets notified, what hold is placed, what timeline applies. This content lets the agent not just identify a problem but tell the user what to do about it.
4.2 Five Structural Rules for Writing for RAG
- Use clear, numbered section headers. "Section 1.1: Invoice Approval Authority Matrix" retrieves reliably. "Overview" does not.
- State each rule completely and independently. Never write "as noted above" or "see Section 3 for exceptions." Every rule must be fully stated where it appears.
- Lead with the specific threshold or condition. "Invoices above $500,000 require CFO approval" retrieves better than "The CFO approval process applies to a subset of invoices based on amount thresholds."
- Keep paragraphs short and focused. One rule per paragraph. Two or three sentences maximum. Long paragraphs dilute the retrieval signal.
- Avoid legal and bureaucratic language. "Notwithstanding any other provision of this policy" is not retrievable in any meaningful way. Use plain English.
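Two of these rules are mechanical enough to check before upload. The sketch below is a hypothetical lint helper, not an OAC feature: the phrase list and the three-sentence limit are assumptions taken from the rules above, and the sentence splitting is deliberately rough.

```python
import re

# Phrases that point to another part of the document (illustrative list).
CROSS_REFERENCES = re.compile(
    r"as noted above|see section|see above|as described below",
    re.IGNORECASE,
)
MAX_SENTENCES = 3

def lint_paragraph(paragraph):
    """Return a list of warnings for one knowledge-document paragraph."""
    warnings = []
    if CROSS_REFERENCES.search(paragraph):
        warnings.append("cross-reference: state the rule in full here")
    # Rough sentence count: split after ., ! or ? when followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", paragraph.strip()) if s]
    if len(sentences) > MAX_SENTENCES:
        warnings.append(f"too long: {len(sentences)} sentences")
    return warnings

good = ("Invoices above $500,000 require CFO approval before payment release. "
        "Finance Director approval alone is not sufficient at this threshold.")
bad = "High-value invoices need extra approval. See Section 3 for exceptions."

print(lint_paragraph(good))
print(lint_paragraph(bad))
```

A check like this will not judge whether a rule is correct, but it catches the structural habits that most often hurt retrieval.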
4.3 Surface the Undocumented Knowledge
The most valuable knowledge is often the knowledge nobody wrote down. Every organisation has it — the AP Manager who manually checks every invoice from a vendor because there are duplicate vendor IDs in the system. The Controller who knows that Q3 always runs high because of the annual maintenance cycle.
A BA's job in the knowledge document phase is not just to transcribe existing policies. It is to conduct stakeholder interviews specifically designed to surface this tribal knowledge. Ask: "What do you know about this data that a new employee starting tomorrow would not know?" The answers are often the most valuable content in the entire knowledge document.
Governance note: Knowledge documents are not one-time uploads. They are living assets. When a policy changes, a threshold is updated, or a new exception type is introduced — the document must be updated and re-uploaded to all affected agents. Establish a named business stakeholder as the document owner for each domain, with a defined review cadence.
5. Supplemental Instructions — Build the Vocabulary Map First
Supplemental instructions are typed directly into the OAC AI Agent configuration screen — up to 6,000 characters of plain English rules that define how the agent thinks, what it knows, and how it behaves. They are the bridge between the dataset's field names and the business language your users will actually type.
The 6,000 character limit is not just a technical constraint — it is a design constraint that forces a BA to be ruthlessly clear about what the agent absolutely must know. Every word counts.
5.1 The Vocabulary Map — Your Most Important Pre-Work
Before writing a single supplemental instruction, build a vocabulary map. The left column is what business users say. The right column is what the data contains. Build it from the discovery questions, the dataset field list, and direct conversations with stakeholders about the language they actually use.
Every row in this vocabulary map is a candidate supplemental instruction. Select the rows that represent the highest-frequency, highest-risk business questions and encode those mappings explicitly. Lower-priority mappings can be left to the agent's inference capability.
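One way to keep the map and the instructions in step is to hold the map as data and generate candidate instruction lines from it. The rows, the priority labels, and the sentence template below are hypothetical. The wording that works best for the agent should be confirmed through testing.

```python
# Hypothetical vocabulary map rows:
# (what users say, dataset field, value in the data or None, priority)
vocabulary_map = [
    ("late payment", "Payment_Status", "Paid Late", "high"),
    ("paid on time", "Payment_Status", "Paid On Time", "high"),
    ("supplier", "Vendor_Name", None, "low"),
]

def to_instructions(rows, include=("high",)):
    """Turn selected vocabulary rows into plain English instruction lines."""
    lines = []
    for phrase, field_name, value, priority in rows:
        if priority not in include:
            continue  # lower-priority rows are left to the agent's inference
        if value is None:
            lines.append(f'When the user says "{phrase}", use the {field_name} field.')
        else:
            lines.append(f'When the user says "{phrase}", filter {field_name} = "{value}".')
    return lines

for line in to_instructions(vocabulary_map):
    print(line)
```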
5.2 Four Categories of Supplemental Instruction Content
- KPI definitions. The exact formula for every calculated metric. Not the concept — the formula. "DPO = Average of Days_To_Pay across all invoices in the selected period. Target is 35 days overall."
- Threshold and compliance rules. Specific dollar amounts, percentages, and conditions that trigger compliance checks — stated in data terms the agent can apply directly.
- Behavioural rules. How the agent should present answers — always include vendor tier, always compare Q3 to Q1 baseline, always cite the policy section when flagging a compliance issue.
- Scope boundaries. What the agent will not answer. Scope boundaries prevent the agent from speculating outside its domain, which is a common source of hallucination.
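Because every character counts, it helps to assemble the four categories in one place and track the budget as you draft. This is a minimal sketch: the section titles and the scope boundary sentence are illustrative assumptions, and the 6,000 figure is the limit stated in this post.

```python
MAX_CHARS = 6000  # limit stated in this post

def assemble(sections):
    """Join named instruction sections and report the character budget used."""
    text = "\n\n".join(f"{title}\n{body}" for title, body in sections.items())
    return text, len(text), MAX_CHARS - len(text)

# Illustrative content, one entry per category.
sections = {
    "KPI DEFINITIONS": (
        "DPO = Average of Days_To_Pay across all invoices in the selected "
        "period. Target is 35 days overall."
    ),
    "THRESHOLD RULES": (
        "Invoices above $500,000 require CFO approval before payment release."
    ),
    "BEHAVIOURAL RULES": (
        "Always cite the policy section when flagging a compliance issue."
    ),
    "SCOPE BOUNDARIES": (
        "Only answer questions about accounts payable data in this dataset."
    ),
}

text, used, remaining = assemble(sections)
print(f"{used} characters used, {remaining} remaining")
```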
5.3 The Priority Hierarchy Rule
One of the most powerful things to encode in supplemental instructions is a priority hierarchy for risk questions. When a user asks "what should I focus on this week?", the agent needs to know which issues are most urgent. Define this explicitly:
- Priority 1 — Fraud indicators (bank account change, new vendor without contract)
- Priority 2 — Missing required approvals (CFO approval absent on high-value invoices)
- Priority 3 — Contract compliance issues (expired contracts, missing contracts)
- Priority 4 — Duplicate invoices
- Priority 5 — Budget overruns
Without this hierarchy, the agent presents issues in an arbitrary order. With it, the prioritisation answer becomes immediately actionable.
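The ordering the hierarchy should produce can be expressed as a simple sort, which is also a handy way to check the agent's answer during testing. The category labels and sample issues below are hypothetical shorthand for the five priorities listed above.

```python
# Priority numbers follow the hierarchy above; labels are shorthand.
PRIORITY = {
    "Fraud indicator": 1,
    "Missing required approval": 2,
    "Contract compliance issue": 3,
    "Duplicate invoice": 4,
    "Budget overrun": 5,
}

def rank_issues(issues):
    """Sort issues so the most urgent category comes first (stable sort)."""
    return sorted(issues, key=lambda issue: PRIORITY[issue["category"]])

# Hypothetical open issues in arbitrary order.
issues = [
    {"invoice": "INV-2003", "category": "Budget overrun"},
    {"invoice": "INV-2001", "category": "Fraud indicator"},
    {"invoice": "INV-2002", "category": "Duplicate invoice"},
]

ranked = rank_issues(issues)
for issue in ranked:
    print(PRIORITY[issue["category"]], issue["invoice"], issue["category"])
```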
6. Validation — If Users Cannot Ask in Plain English, the Design Is Not Done
The BA's design is not complete when the three components are built. It is complete when the agent can answer the discovery question list correctly in plain English — without the user needing to know field names, filter syntax, or data structure.
Take the 15 questions from the discovery phase and type them into the agent exactly as a business user would. Not with field names. Not with structured filters. In the same casual, natural language a business user would actually type on a Monday morning.
Every question that fails is a design gap, not an agent failure. Diagnose each failure: was the answer wrong because the data was missing? Because the knowledge document did not contain the relevant policy? Because the supplemental instructions did not map the vocabulary? The answer determines where the fix belongs.
The Rephrase Test — A Diagnostic Tool
When an agent fails a plain English question, rephrase it using the exact field name and value from the dataset. If it answers correctly when rephrased but not in plain English, the problem is a vocabulary gap in the supplemental instructions. Add the missing vocabulary mapping and retest.
If the agent fails even with the exact field name and value, the problem is in the dataset itself — a missing field, a differently encoded value, or an indexing issue. This requires IT involvement to resolve.
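The rephrase test is a two-step decision, sketched below. The function name and the wording of the results are my own. The logic simply restates the two paragraphs above.

```python
def diagnose(plain_english_ok, rephrased_ok):
    """Classify a test question using the rephrase test."""
    if plain_english_ok:
        return "pass"
    if rephrased_ok:
        # Works with exact field names, fails in plain English.
        return "vocabulary gap: add a mapping to the supplemental instructions"
    # Fails even with the exact field name and value.
    return "dataset issue: involve IT"

print(diagnose(plain_english_ok=False, rephrased_ok=True))
print(diagnose(plain_english_ok=False, rephrased_ok=False))
```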
Target before launch: Aim for at least 80% of plain English discovery questions passing before declaring the agent ready for pilot. Launching with less than 60% will erode user trust before the agent has a chance to demonstrate value.
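Tracking the pass rate is simple arithmetic, shown here as a sketch. The 80% and 60% figures come from the target above. Treating the range between them as "keep iterating" is my interpretation rather than a stated rule.

```python
def readiness(results, pilot_threshold=0.80, floor=0.60):
    """Return (pass rate, verdict) for a list of True/False test outcomes."""
    if not results:
        raise ValueError("no test results recorded")
    rate = sum(results) / len(results)
    if rate >= pilot_threshold:
        verdict = "ready for pilot"
    elif rate < floor:
        verdict = "do not launch"
    else:
        verdict = "keep iterating"  # assumption: the band between the two figures
    return rate, verdict

# Example: 12 of 15 discovery questions answered correctly in plain English.
results = [True] * 12 + [False] * 3
rate, verdict = readiness(results)
print(f"{rate:.0%} passing: {verdict}")
```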
7. The Handoff to IT — What a BA Delivers
A BA who completes the design work described in this blog produces a set of artefacts that make the IT implementation straightforward. The BA does not implement the ADW views, the ODI pipelines, or the OAC dataset configuration — but the BA produces the specification that makes those implementations possible.
The complete BA handoff package contains:
- Discovery question list. The 15+ specific questions the agent must answer correctly, written in plain English. This becomes the acceptance test for IT.
- Field specification. Every field the dataset must contain — field name, business definition, data type, possible values, and source table or calculation logic.
- Derived field definitions. The business formula for every calculated metric, written in plain English. IT translates these into SQL.
- Knowledge document draft. Domain policy, business glossary, and escalation rules — written in RAG-friendly structure and approved by the relevant business stakeholder.
- Supplemental instructions draft. The agent configuration text, kept within the 6,000 character limit and ready to paste into the OAC AI Agent configuration screen.
- Vocabulary map. The two-column business language to data language mapping — IT uses this to validate that dataset field values match the vocabulary mappings in the instructions.
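The field specification and the vocabulary map can be cross-checked mechanically before handoff. The sketch below assumes a simple record per field with the attributes listed above. The class name, the sample field, and the sample vocabulary rows are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class FieldSpec:
    """One row of the field specification handed to IT."""
    name: str
    business_definition: str
    data_type: str
    source: str
    possible_values: list = field(default_factory=list)

def unmatched_vocabulary(specs, vocabulary):
    """Return vocabulary rows whose value is not an allowed value of the field."""
    allowed = {spec.name: set(spec.possible_values) for spec in specs}
    return [
        (phrase, field_name, value)
        for phrase, field_name, value in vocabulary
        if value not in allowed.get(field_name, set())
    ]

specs = [
    FieldSpec(
        name="Payment_Status",
        business_definition="Whether the invoice was paid by its due date.",
        data_type="text",
        source="derived from Due_Date and Payment_Date",
        possible_values=["Paid On Time", "Paid Late"],
    ),
]
vocabulary = [
    ("late payment", "Payment_Status", "Paid Late"),
    ("overdue", "Payment_Status", "Overdue"),  # value not in the spec
]

problems = unmatched_vocabulary(specs, vocabulary)
print(problems)
```

Any row this check returns is a mismatch between what the instructions promise and what the dataset contains, which is far cheaper to find at handoff than during user testing.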
With this package, IT's role shifts from design to implementation — and the implementation can proceed with confidence that the agent being built is the one the business actually needs.
8. Closing Thought — The Agent Is Not the Intelligence
Oracle's OAC AI Agent is genuinely impressive technology. The ability to ask a plain English question about enterprise data and receive an answer grounded in both real data and real business policy — in seconds, without SQL, without a dashboard — is a meaningful step forward for business intelligence.
But the technology is not the intelligence. The intelligence is what a skilled BA encodes into the three components before the agent is ever launched. The dataset design that ensures the data is clean, complete, and appropriately structured. The knowledge document that captures policy rules in a form the agent can retrieve accurately. The supplemental instructions that bridge the gap between how business users speak and how the data is structured.
Oracle has made the agent easy to configure. The BA's job is to make it worth configuring. That work of discovery, design, and validation is not glamorous. It does not show up in a product demo. But it is the difference between an AI Agent that becomes a trusted part of how your organisation makes decisions, and one that gets quietly abandoned after three weeks because nobody trusts what it says.
The agent is the container. The three components are the product. And a Business Analyst who understands that distinction will build AI Agents that actually change how their organisation works.
About the Author
Shriram Gupta is an Oracle Solution Architect at BizInsight Consulting, specialising in Oracle Analytics Cloud, Autonomous Data Warehouse, and Oracle E-Business Suite implementations. This blog is part of a series on practical OAC AI Agent design drawn from hands-on implementation experience.
Tags: Oracle Analytics Cloud | OAC AI Agent | Business Intelligence | Oracle EBS | OCI GenAI | Business Analyst | Dataset Design | Knowledge Management | RAG | Supplemental Instructions