I recently worked on an AI project that was quite unusual. It was about analyzing the past, present, and future of a person’s life. Yes, you guessed it right. Astrology.
More specifically, this project focused on Vedic Astrology using the Kerala system. As someone who builds AI systems for a living and loves solving challenging problems, I was excited. Modern AI tools make it incredibly easy to spin up apps quickly. But here is the reality check: the whole concept of “vibe coding” is still evolving, especially when it comes to complex data analysis, probabilistic workflows, and systems where accuracy truly matters.
I explored several LLMs and open-source codebases. Most of them were fine-tuned for region-specific astrology systems. A few generated outright wrong results. Many relied on private or inaccessible datasets, often embedded into RAG pipelines that were not publicly available. That immediately raised a red flag for me.
On top of that, I was new to this “domain” and I had trust issues. I did not want to invest weeks learning something that might not even be technically sound (imho, please excuse my ignorance). So I tried a shortcut – asking the AI to draft its own requirements document.
I asked a public chat assistant to generate the analysis for me. It gave me something. But I was skeptical.
So I repeated the same query another ten times in fresh sessions, without any chat history. The result was interesting. Roughly 40% of the output was consistent. However, the rest was mostly assumptions, probabilistic filler, or what I can only describe as polite hallucinations designed to keep me satisfied.
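If you want to make that “roughly 40%” estimate less hand-wavy, one simple option is to average pairwise text similarity across the repeated runs. A minimal sketch (the function and approach are mine, not from the original project):

```python
from difflib import SequenceMatcher

def consistency_score(responses):
    """Average pairwise similarity across N runs of the same prompt.
    1.0 means every run produced identical text; lower values mean
    the model is drifting (or hallucinating) between sessions."""
    pairs = [(a, b) for i, a in enumerate(responses) for b in responses[i + 1:]]
    if not pairs:
        return 1.0
    return sum(SequenceMatcher(None, a, b).ratio() for a, b in pairs) / len(pairs)
```

Character-level similarity is crude; extracting key claims and comparing those would be stricter, but even this rough score makes run-to-run drift visible.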
Then I tried something fun.
I took the response from one AI assistant and pasted it into another. I asked the second one to validate it. Then I pasted that response back into the first. That is when things got interesting. They started debating.
They corrected each other. They agreed. They disagreed. They blamed. They apologized. It felt like a mini academic conference happening inside my browser.
Who said AI cannot show signs of ego or empathy? I saw both in action.
From these debates, I noticed a pattern. Both assistants were weak at doing accurate astronomical calculations, especially for planetary positions and the specific rules of the Kerala system. Even when I explicitly mentioned the system and provided clear constraints, the models would either forget, skip steps, or silently substitute missing data with fabricated values.
That is when I decided to formalize this process.
I wrote two simple AI agents using two different LLMs and made them peer-review each other. You can call this a Multi-Agent Collaboration pattern. Some people also refer to it as Maker-Checker loops, Debate-Based Consensus, or Peer-Review workflows.
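A stripped-down sketch of that peer-review loop, with both agents passed in as plain callables (in a real system they would wrap calls to two different LLM providers; all names here are hypothetical):

```python
def peer_review_loop(question, maker, checker, max_rounds=3):
    """Maker drafts an answer; checker critiques it; maker revises.
    Stops when the checker approves or the round budget runs out.
    Returns the final draft and whether it was approved."""
    draft = maker(question)
    for _ in range(max_rounds):
        verdict = checker(f"Validate this answer to '{question}':\n{draft}")
        if verdict.startswith("APPROVED"):
            return draft, True
        # Feed the critique back to the maker for another attempt.
        draft = maker(f"Revise using this critique:\n{verdict}\n\nPrevious draft:\n{draft}")
    return draft, False
```

Using two different models is what makes the disagreements useful: the two tend to fail in different ways, so errors that survive one model rarely survive both.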
This saved me a lot of time. More importantly, it helped me identify what really mattered for this application. Input validation, prompt constraints, guardrails, preprocessing, structured reasoning, deterministic computation, and final response validation.
One major lesson I learned was this: You should never let an LLM ‘guess’ when a deterministic solution is possible.
So I redesigned the system. I invested time learning how astrologers actually do these calculations. Based on that, I decided to move all planetary position calculations outside the LLM. I used astronomical datasets such as NASA JPL’s de442.bsp planetary ephemeris and the Hipparcos star catalog (hip_main.dat), along with Python libraries like Pandas and NumPy, to generate deterministic results.
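To make “deterministic computation” concrete: once an ephemeris file has been read (with a library such as jplephem or Skyfield) and a planet’s sidereal ecliptic longitude is known, mapping it onto the Vedic chart is pure arithmetic: 12 rashis of 30° each and 27 nakshatras of 13°20′ each. A sketch of that step (the function is my illustration, not the author’s code; the ayanamsa correction is assumed to happen upstream):

```python
RASHIS = ["Mesha", "Vrishabha", "Mithuna", "Karka", "Simha", "Kanya",
          "Tula", "Vrischika", "Dhanu", "Makara", "Kumbha", "Meena"]

def rashi_and_nakshatra(sidereal_lon_deg):
    """Map a sidereal ecliptic longitude (degrees) to its rashi and
    0-based nakshatra index. Assumes the tropical-to-sidereal
    (ayanamsa) correction has already been applied."""
    lon = sidereal_lon_deg % 360.0
    rashi = RASHIS[int(lon // 30)]         # 12 signs, 30 degrees each
    nakshatra = int(lon // (360.0 / 27))   # 27 lunar mansions, 13 deg 20 min each
    return rashi, nakshatra
```

There is nothing for an LLM to guess here, which is exactly the point: the model only ever sees the already-computed positions.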
At a high level, the initial input is now generated using a rules engine. This not only improves accuracy but also saves a lot of tokens (cost saving). That structured input is then passed to the LLM for interpretation and explanation. A second LLM instance validates the response before it reaches the user.
Sorry that I cannot disclose more technical, architectural, and strategic details at this point.
But then another question came up.
How do I know my results are correct?
I bought a few horoscope reports from well-known astrology websites that do not use LLMs and treated them as my Golden Dataset. I manually validated my outputs against it. Then I used a second AI assistant in an LLM-as-a-Judge pattern to cross-check the documents and responses. Only after passing both checks did I trust the pipeline.
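The deterministic half of that validation can be as simple as a field-by-field diff against the golden report; a sketch (the structure and field names are invented for illustration):

```python
def diff_against_golden(generated, golden, fields):
    """Return the fields where the generated chart disagrees with the
    purchased golden report. An empty list means the deterministic
    layer matches; only then is the LLM-as-a-Judge pass worth running
    on the interpretive text, which has no exact ground truth."""
    return [f for f in fields if generated.get(f) != golden.get(f)]
```

Exact fields (positions, rashis, lagna) should match bit-for-bit; the judge model is reserved for the free-text interpretation, where “correct” is fuzzier.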
Final thoughts…
Of course, astrology itself is a pseudoscience. Some future predictions are inherently speculative. No matter how deterministic the calculations are, the interpretations will always remain probabilistic.
So yes, this app will always live in a probabilistic world. (Which is technically not my worry for now, but it might be my customer’s bread and butter before long.)
But anyway, now it does so intentionally, transparently, and with strong architectural guardrails. And that, for me, is the real win.

