
Jan 27, 2026

Execution Pipeline: SQL Query Generation (Natural Language to Secure SQL) with Select AI

This blog covers the inner workings of the Oracle Select AI execution pipeline: the machinery that translates human intent into secure, high-performance SQL.

Oracle Select AI turns natural language into runnable SQL queries through a process called Prompt Augmentation, which bridges the gap between general-purpose Large Language Models (LLMs) and your specific database schema. Acting as a developer assistant, it lets both SQL beginners and experts interact with data in plain language.

The SQL query generation pipeline in Select AI is a multi-step orchestration: it takes a human-language question about your private enterprise data and turns it into a precise, runnable database query.

The Step-by-Step Pipeline

The process follows a specific workflow to ensure the generated SQL is accurate and secure:

1. Natural Language Input: The process begins when a user submits a prompt, such as "Who are the top three baby boomer Big Spenders?".

2. Prompt Augmentation: This is the "key enabler" of the pipeline. Select AI automatically combines the user's prompt with database schema metadata. This metadata includes:

    ◦ Table and column names.
    ◦ Data types.
    ◦ Optional comments that explain the purpose of generically named columns.

When a user asks a question, Select AI doesn't just send that sentence to an LLM. It first performs metadata harvesting: it pulls table names, column definitions, data types, and, crucially, comments and annotations from the Oracle Data Dictionary. This augmented context ensures the LLM understands that "Q3 Revenue" refers to the REVENUE column filtered by the FISCAL_QUARTER column.

3. LLM Processing: This "augmented prompt" is sent to an external AI provider (like OCI Generative AI, OpenAI, or Azure). The LLM uses its general knowledge (e.g., knowing the age range for a "Baby Boomer") alongside the provided metadata to understand the database structure.

4. SQL Generation: The LLM generates a syntactically correct SQL query specifically tailored to your schema.

5. Execution and Security: Depending on the chosen action, the query is executed in the database. It is important to note that Oracle security policies and data access rules are enforced just as they would be for any other SQL query.
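The five steps above collapse into a single statement from the user's point of view. As a sketch, assuming an AI profile is already active for the session (profile setup is shown later) and using a SQL client that supports the Select AI syntax, such as SQLcl or Database Actions:

```sql
-- Assumes an AI profile is already set for this session.
-- Select AI augments the prompt with schema metadata, sends it to the
-- configured LLM, and (by default) runs the generated query.
SELECT AI who are the top three baby boomer big spenders;

-- To review the generated SQL before anything is executed,
-- request the SQL only:
SELECT AI showsql who are the top three baby boomer big spenders;
```

The prompt text here is the example question from step 1; the result depends entirely on your own schema and data.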

Key Components Enabling the Pipeline

To manage this pipeline effectively, Select AI utilizes several specific configurations:

AI Profiles: These are predefined configurations created with the DBMS_CLOUD_AI PL/SQL package. They specify which AI provider to use, the specific model (e.g., GPT-4o), and the Object List (a subset of tables the AI is allowed to consider).
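A profile might be created as follows. This is a sketch: the credential name, schema, and table names are placeholders for your own environment, and the exact set of supported attributes is defined in the DBMS_CLOUD_AI documentation:

```sql
BEGIN
  -- Create an AI profile: the provider, the model, and the tables
  -- (object_list) whose metadata the LLM is allowed to see.
  DBMS_CLOUD_AI.CREATE_PROFILE(
    profile_name => 'GENAI_PROFILE',
    attributes   => '{"provider"        : "openai",
                      "credential_name" : "OPENAI_CRED",
                      "model"           : "gpt-4o",
                      "object_list"     : [{"owner": "SH", "name": "CUSTOMERS"},
                                           {"owner": "SH", "name": "SALES"}]}');

  -- Make the profile active for the current session so that
  -- SELECT AI statements know which provider and tables to use.
  DBMS_CLOUD_AI.SET_PROFILE(profile_name => 'GENAI_PROFILE');
END;
/
```

Restricting the object list keeps the augmented prompt small and stops the LLM from considering tables that are irrelevant to the question.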

Metadata Tuning: The quality of the pipeline's output depends on the quality of the metadata. Adding comments to your tables and columns helps the LLM resolve generically named objects, such as a table named TABLE1 or a column named COL2.
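Standard Oracle comments are all that is needed; they live in the data dictionary and can be picked up during metadata harvesting. The table and column names below are illustrative:

```sql
-- Plain Oracle comments become part of the augmented prompt,
-- so cryptic names can be explained where they live.
COMMENT ON TABLE  sales     IS 'One row per line item sold';
COMMENT ON COLUMN sales.amt IS 'Sale amount in USD, tax included';
COMMENT ON COLUMN sales.fq  IS 'Fiscal quarter label, e.g. 2025-Q3';
```

With comments like these in place, a prompt mentioning "Q3 revenue" has a much better chance of being mapped to the right columns.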

Conversation History: If enabled, the pipeline remembers up to the last 10 prompts. This allows the user to ask follow-up questions to refine results without respecifying previous details.
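With conversation history enabled on the profile, a follow-up can lean on the previous prompt instead of repeating it. A sketch, assuming the same example question as before:

```sql
-- First question establishes the context.
SELECT AI who are the top three baby boomer big spenders;

-- Follow-up: "they" is resolved from the conversation history,
-- so the demographic filter does not need to be restated.
SELECT AI what did they buy most often;
```

Each follow-up is still translated to a fresh SQL query; only the prompt context carries over, not the previous results.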

Comparison of Generation Actions

 Once the pipeline generates the SQL, you can choose how to interact with it:

Action      | Result of the Pipeline                                          | Data Privacy
------------|-----------------------------------------------------------------|------------------------------------
showsql     | Returns only the generated SQL code for review.                 | No actual data sent to LLM.
runsql      | Executes the query and returns a rowset (default action).       | No actual data sent to LLM.
explainsql  | Returns the SQL plus a step-by-step natural language breakdown. | No actual data sent to LLM.
narrate     | Runs the query and then summarizes the results conversationally. | Sends actual table data to the LLM.
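The same question can be issued with each action keyword; only narrate sends result rows back to the LLM. The prompt below is illustrative:

```sql
SELECT AI showsql    which products sold best last quarter;  -- SQL text only
SELECT AI runsql     which products sold best last quarter;  -- rows (the default)
SELECT AI explainsql which products sold best last quarter;  -- SQL plus explanation
SELECT AI narrate    which products sold best last quarter;  -- summary; sends data to LLM
```

If data residency or privacy is a concern, prefer the first three actions, which expose only schema metadata to the provider.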

 


Preventing Hallucinations

A standard LLM (like ChatGPT) might "hallucinate" and invent table names that don't exist in your database when asked a question. By using Prompt Augmentation to feed the LLM your actual schema metadata, Select AI ensures the generated query is runnable on your specific instance.

In a follow-up post, we'll look at how the explainsql action can help your team learn SQL, and at how the pipeline handles large schemas with hundreds of tables.