Anatomy of an AI Research Agent That Doesn't Hallucinate
We built an autonomous AI agent that generates security research daily. It produced 36 papers in 42 days — all smoke. We audited every one, dissected five systemic failure modes, and rebuilt the pipeline with seven validation gates. This post documents the architecture, the failures, and the engineering that turned a hallucination factory into a functional research tool.

Capturing OWASP FinBot CTF: A Source-Aware Methodology for Agentic AI Red Teaming
Two-day engagement against OWASP's FinBot CTF using Ai-EGIS v3.0: 19/19 challenges captured, 7,315 points and 37/31 badges. The case study introduces Tool Output Mimicry — a novel primitive for defeating multi-agent guardrails — and a reusable source-aware methodology for testing agentic systems.