
Agentic Retrieval From Any SQL Database

nibnab
7 min read · May 12, 2025

by Petar Ivanov, Junior Software Engineer @ nibnab, Inc.

As a junior developer at an AI startup, I’m exposed to a lot of catchy terms, AI systems, and unfamiliar problems that most junior devs switching between frontend and backend work don’t usually run into. One of those unfamiliar challenges came as a task: implement text-to-SQL in our RAG system.

Honestly, when that task landed on my plate, my first reaction was a mix of excitement and a healthy dose of “uh oh, how do I even start? This ain’t a React component.” Turning a user’s natural-language question into an accurate SQL query for our specific database felt like a major leap from the frontend/backend tasks I was used to.

But through a lot of trial and error (and a few moments of generating completely nonsensical or even scary-looking SQL), I started to discover some practical approaches and principles that actually worked to bridge the gap between natural language and our database for our use case.

That’s why I wanted to write this post — not as an expert guide to text-to-SQL, but to share what I learned as a junior developer experiencing this for the first time.

What the Task Actually Involved

The core goal was:

Integrate an AI agent that takes a user message and returns a valid, executable, and safe (read-only) SQL query.

More specifically:

  • The query had to use only SELECT statements: nothing that modifies data.
  • It had to work against our internal, custom database schema.
  • It had to be abstracted enough to allow easy swapping of the database schema.
  • It had to be integrated into our existing RAG pipeline (a rough sketch of the interface follows below).
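
To make that contract concrete, here’s a minimal sketch of the shape the agent call takes. The names (generate_sql, SqlAgentResult) are hypothetical rather than our actual code; the point is that the schema description is an input, which is what keeps the agent easy to point at a different database.

from dataclasses import dataclass

@dataclass
class SqlAgentResult:
    """Hypothetical shape of the agent's output: one read-only query plus metadata."""
    sql: str          # a single SELECT statement with named placeholders like :user_id
    params: dict      # parameters the application binds before execution
    explanation: str  # short, human-readable summary of what the query returns


def generate_sql(user_message: str, schema_description: str) -> SqlAgentResult:
    """Turn a natural-language question into a safe, executable SELECT query.

    Passing the schema description in as an argument (instead of baking it into
    the code) is what makes swapping the underlying database schema easy.
    """
    raise NotImplementedError  # the sections below cover how this gets prompted and guarded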

1. Building the Agent: From Chaos to a Contract

When I first approached the problem, I assumed a general-purpose AI model would just “know” how to write SQL from natural language. Turns out — not so much. My early attempts led to outputs that looked convincing but were often wrong, incomplete, or even unsafe to run. Things broke in subtle ways, and debugging was hard because the model didn’t understand the rules of our system.

🛠️ Problem: The model generated SQL that looked right — but wasn’t.

  • It referenced things that didn’t exist.
  • It included data we shouldn’t expose.
  • It ignored the context of the current request.
  • It sometimes returned everything instead of something specific.

✅ My Solution: Write the Rules Down

Instead of expecting the model to guess what’s allowed, I wrote a detailed instruction set — a system prompt — that spelled out exactly how it should behave. The goal was to eliminate ambiguity, constrain its behavior, and make every expectation explicit.

This prompt included (a condensed sketch follows the list):

  • Clear formatting rules (only return JSON, no extra text).
  • Strict safety constraints (read-only queries, no personal data).
  • An explanation of our data model and relationships in plain language.
  • Examples of good behavior (how to paginate, how to filter, etc.).
  • Scope boundaries so the model didn’t overreach.
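
To give a sense of what that looks like in practice, here’s a heavily condensed, illustrative version of such a system prompt. It is not our production prompt: the exact JSON fields and wording are just for illustration, and the simplified schema matches the example queries later in this post.

# A condensed, illustrative system prompt (not the production one).
SYSTEM_PROMPT = """
You are a SQL generation assistant for a video-comments database.

Output format:
- Respond with a single JSON object: {"sql": "...", "params": {...}, "explanation": "..."}.
- Do not include any text outside the JSON object.

Safety rules:
- Generate SELECT statements only. Never generate INSERT, UPDATE, DELETE, DROP, or any DDL.
- Never select columns that contain personal data.
- Always scope queries to the current user via the :user_id parameter.

Schema (simplified):
- videos(id, owner_id)
- comments(comment_id, video_id, username, content, sentiment_score, date)
- replies(reply_id, comment_id, username, content, date)

Conventions:
- Paginate with LIMIT/OFFSET and keep result sets small unless the user asks otherwise.
- If the question cannot be answered from this schema, say so instead of guessing.
"""

The examples themselves are covered in the next section.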

Result: Predictable, Trustworthy Outputs

Once the rules were in place, the model started behaving more like a reliable teammate than a chaotic intern. The queries it generated made sense, respected boundaries, and were easy to debug when something went wrong.

Instead of constantly tweaking the model, I just updated the rules — and the behavior followed. That felt like a breakthrough.

2. Adding Multi-Shot Examples to Improve Accuracy

Even with a clear system prompt and structured rules, I noticed the agent still sometimes struggled with subtle query details — especially things like filtering by date, ordering, or choosing the right columns. So I looked for another way to help it “see” what good SQL looked like in our context.

🛠️ Problem: The Agent Knew the Rules — But Struggled With Nuance

  • It would miss important filters (like WHERE user_id = ...)
  • It sometimes selected too many or too few columns
  • It didn’t always infer the right table joins or constraints
  • Output was “technically correct” but not quite aligned with the user’s intent

The system prompt gave it structure — but it still lacked examples to pattern-match against.

✅ My Solution: Add Multi-Shot Examples in the Prompt

I added multiple input/output examples directly into the prompt, showing exactly how user queries should map to SQL queries and results. This wasn’t just about correctness — it was about giving the model reference points to anchor its generations.

For example:

User question: “Can you give me the top three positive comments under my video?”

SQL:

SELECT content, username, date
FROM comments
JOIN videos ON comments.video_id = videos.id
WHERE videos.owner_id = :user_id
AND sentiment_score > 0.8
ORDER BY sentiment_score DESC
LIMIT 3;

Or:

User question:

“Show me the most replied comment under my video.”

SQL:

SELECT 
c1.content, c1.comment_id, c1.username, COUNT(c2.reply_id) AS reply_count
FROM comments c1sq
JOIN videos ON c1.video_id = videos.id
LEFT JOIN replies c2 ON c1.comment_id = c2.comment_id
WHERE videos.owner_id = :user_id
GROUP BY c1.content, c1.comment_id, c1.username
ORDER BY reply_count DESC
LIMIT 1;

Each example was short, realistic, and aligned with how our internal team actually asked questions — so the model had real, concrete anchors to learn from.
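
For illustration, here’s one way such examples can be wired in as few-shot messages, assuming an OpenAI-style chat message list. The prompt-assembly code below is a sketch, not our production code.

import json

# Hypothetical few-shot example pair, prepended to the real user question.
FEW_SHOT_MESSAGES = [
    {"role": "user", "content": "Can you give me the top three positive comments under my video?"},
    {"role": "assistant", "content": json.dumps({
        "sql": (
            "SELECT content, username, date "
            "FROM comments "
            "JOIN videos ON comments.video_id = videos.id "
            "WHERE videos.owner_id = :user_id "
            "AND sentiment_score > 0.8 "
            "ORDER BY sentiment_score DESC "
            "LIMIT 3;"
        ),
        "params": {},  # :user_id is bound by the application, not by the model
        "explanation": "Top three comments with the highest sentiment scores on the user's videos.",
    })},
]

def build_messages(system_prompt: str, user_question: str) -> list:
    """Prepend the system prompt and few-shot examples to the real question."""
    return [
        {"role": "system", "content": system_prompt},
        *FEW_SHOT_MESSAGES,
        {"role": "user", "content": user_question},
    ]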

🧪 Result: Better Alignment and More Accurate Queries

Adding these multi-shot examples made a noticeable difference:

  • The model picked better filters and joins
  • It inferred correct time ranges (like “last week” or “this month”)
  • It chose more useful column sets
  • It started following our formatting rules more consistently

It still wasn’t perfect — and sometimes I had to tweak or rotate examples — but overall, adding examples gave the model something solid to mimic, especially when user queries got vague or ambiguous.

3. When the Agent Gets Too Smart (and a Little Dangerous)

By this point, I had a pretty well-structured system prompt, solid multi-shot examples (including how to handle invalid requests), and better query results overall. But I wanted to really test the limits — after all, this tool would be exposed to user input. So, I put on my “hacky user” hat and started crafting prompts designed to see if I could make the agent produce something unsafe or unexpected.

🛠️ The Problem: The Agent Still Occasionally Generated Unsafe SQL

Despite all the careful instructions and examples, I found that with a bit of creative or tricky prompting, I could still occasionally trick the model into producing queries containing data-mutating operations like DELETE or UPDATE.

DELETE FROM posts WHERE ...;

This was a pretty big “Okay, wait a minute” moment. Even though I had told the model very clearly not to generate anything but SELECT statements, it could still "slip up" in certain edge cases or with specific phrasing.

My “Smart” Solution… and the Even Smarter Real Solution

My first thought to mitigate this risk was, honestly, a bit naive: “What if I just… search the generated SQL string for dangerous keywords like DROP, DELETE, UPDATE, INSERT, etc., and reject any query containing them?”

Yep. Very “junior dev tries to solve everything with regex,” I know.

It worked… sort of. But it didn’t feel like a reliable long-term solution.
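
For the curious, that kind of naive keyword check looks something like this (a sketch, not the exact code I had):

import re

# Naive deny-list check: reject any generated SQL that contains a mutating keyword.
# This is the brittle first attempt described above, not what we ended up shipping.
FORBIDDEN_KEYWORDS = ("INSERT", "UPDATE", "DELETE", "DROP", "ALTER", "TRUNCATE", "GRANT")

def looks_read_only(sql: str) -> bool:
    return not any(re.search(rf"\b{kw}\b", sql, re.IGNORECASE) for kw in FORBIDDEN_KEYWORDS)

The obvious weakness is that it checks keywords, not semantics: it rejects harmless queries that merely mention a word like “update” inside a string literal, and it cannot anticipate every way a statement might end up mutating data.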

After some further research, I found a much more robust approach available to us. Since we’re using Hasura as our GraphQL engine over the database, we can execute SQL queries with a read-only setting enforced at the execution layer.

As you can probably guess, this became the solution 😄. No need for brittle regex. No need to rely on the model to always behave. Hasura’s read-only enforcement took the burden off our shoulders.
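
For reference, executing the generated query through Hasura’s schema API with read_only enabled looks roughly like this. The /v2/query endpoint, the run_sql type, and the read_only argument are standard Hasura, but the URL, source name, and admin secret below are placeholders you would adapt to your own setup.

import requests

HASURA_URL = "https://your-hasura-instance.example.com/v2/query"  # placeholder
HEADERS = {"x-hasura-admin-secret": "<admin-secret>"}  # placeholder

def run_read_only_sql(sql: str) -> dict:
    """Execute SQL through Hasura's run_sql with read-only enforced at the execution layer."""
    payload = {
        "type": "run_sql",
        "args": {
            "source": "default",  # name of the configured data source
            "sql": sql,
            "read_only": True,  # the statement runs in a read-only transaction, so mutations fail
        },
    }
    response = requests.post(HASURA_URL, json=payload, headers=HEADERS, timeout=30)
    response.raise_for_status()
    return response.json()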

What’s the Big Deal for Our Users?

You might ask: why was this feature needed in the first place? (Or maybe you didn’t make it to this section 😄.) The real purpose behind introducing the Text-to-SQL feature was to supercharge the performance of our chat functionality within VIP.

So, what is VIP? Successful content creators often receive thousands of comments each week, making it impossible to sift through them all manually. VIP helps those with large followings ensure they never miss an opportunity hidden within their comments. Part of the product is an AI assistant helping you talk with your data.

Before we introduced the Text-to-SQL feature, our chat system relied on vector search to answer your questions, especially those needing precise data analysis. Unfortunately, that approach was more like navigating a maze, resulting in a slower user experience and occasional misses in accuracy.

Now, with Text-to-SQL, VIP can turn your natural language questions into precise database queries, fetching the information you need quickly and accurately. This means you get clearer and more actionable insights without the wait.

And there’s more on the horizon! We’ve integrated a new data visualization agent that leverages the power of our SQL capabilities, letting us present data in a beautifully visualized format for users. Imagine seeing your data come to life with colorful charts and graphs, making insights not just accessible, but also engaging and easy to understand.

Sneak a Peek!
Take a look at how VIP works its data magic. Watch as it transforms your questions into valuable insights using our text-to-SQL capabilities and clear data visuals. Check out the video below to see it all come together! (Stay tuned for our next blog post, where we’ll explore this exciting new feature in detail!)

Wrapping Up: What I Actually Learned

This journey started with “turn a user question into a SQL query” and turned into a crash course in prompt engineering, query safety, and thinking beyond the model. Overall, I’m still learning, and this task definitely pushed me into unfamiliar territory. But that’s kind of the fun part of working with AI right now. You try, break things, fix them, and slowly build something useful.

Building this Text-to-SQL capability for our RAG agent felt like a significant step in making our database data more accessible and useful within our AI workflows. It was a challenging but incredibly rewarding problem to solve as a junior developer.

Thanks for reading.

About the author

Petar Ivanov is a part of the Engineering Team @ nibnab. He is a powerlifting champion of Bulgaria who dropped out of university to pursue his dream of becoming a Software Engineer. With a sharp eye for spotting issues in PRs and a maverick spirit when tackling R&D tasks, Petio does it all!
