EviBound Framework Eliminates False Claims in Autonomous AI Research

Published on October 28, 2025 at 05:47 PM

Researchers at Cornell University have developed EviBound, an evidence-bound execution framework designed to eliminate false claims in LLM-based autonomous research agents. The framework uses dual governance gates to ensure that all claims made by AI agents are supported by verifiable evidence, enhancing the integrity of AI research.

Dual Governance Gates for Research Integrity

EviBound’s architecture includes two governance gates: the Approval Gate and the Verification Gate. The Approval Gate validates acceptance criteria schemas before code execution, while the Verification Gate confirms the validity of artifacts post-execution through MLflow API queries. This dual-gate system ensures that tasks are only marked as complete when backed by queryable run_ids, required artifacts, and a FINISHED status.
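The published interfaces are not reproduced here, but the dual-gate logic can be sketched roughly as follows. The function names, the criteria keys, and the "required_artifacts" field are illustrative assumptions rather than the authors' actual API; only the MLflow client calls (get_run, list_artifacts, the FINISHED run status) come from MLflow's public tracking API.

```python
# Minimal sketch of the dual-gate idea, under the assumptions stated above.
from mlflow.tracking import MlflowClient

# Hypothetical schema keys an acceptance-criteria dict must declare up front.
REQUIRED_CRITERIA_KEYS = {"metric_name", "threshold", "required_artifacts"}

def approval_gate(acceptance_criteria: dict) -> bool:
    """Pre-execution check: block tasks whose acceptance-criteria schema is malformed."""
    return REQUIRED_CRITERIA_KEYS.issubset(acceptance_criteria)

def verification_gate(run_id: str, acceptance_criteria: dict) -> bool:
    """Post-execution check: a task only counts as complete if the MLflow run
    finished and every required artifact was actually logged."""
    client = MlflowClient()
    run = client.get_run(run_id)
    if run.info.status != "FINISHED":
        return False
    logged_paths = {artifact.path for artifact in client.list_artifacts(run_id)}
    return set(acceptance_criteria["required_artifacts"]).issubset(logged_paths)
```

In this framing, a claim of completion is never taken from the agent's own report; it must be re-derivable from the tracking server's records.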

Confidence-Gated Retries and Performance

The framework incorporates confidence-gated retries to handle transient failures without falling into unbounded retry loops. In evaluations on benchmark tasks, EviBound achieved a 0% hallucination rate, compared with 100% for a prompt-level-guidance-only baseline and 25% for a verification-only baseline. EviBound verified 7 of 8 tasks, with the approval gate correctly blocking the remaining task because its acceptance criteria were malformed.
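A confidence-gated retry can be pictured as below. This is a hedged sketch, not the published implementation: the attempt budget, threshold, and helper names (execute, estimate_confidence, result.verified) are assumptions introduced for illustration.

```python
# Illustrative retry loop: retries are allowed only while confidence stays high,
# and a hard attempt cap prevents unbounded loops.
def run_with_confidence_gated_retries(task, execute, estimate_confidence,
                                      max_attempts=3, confidence_threshold=0.7):
    result = None
    for attempt in range(1, max_attempts + 1):
        result = execute(task)
        if result.verified:          # passed the verification gate (see earlier sketch)
            return result
        if estimate_confidence(task, result) < confidence_threshold:
            break                    # low confidence that a retry helps: stop early
    return result                    # unverified outcomes are surfaced, never claimed as done
```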

Key Features of EviBound

The EviBound release package includes execution trajectories, MLflow run_ids, and a verification protocol. The framework adds roughly 8.3% execution overhead, which the developers consider a minimal cost for the gain in research integrity. They argue that research integrity should be an architectural property, enforced through governance gates, rather than a quality expected to emerge from model scale alone.

Future Directions

Future work aims to scale EviBound to complex multi-step research, implement domain-specific evidence schemas, and develop adaptive verification thresholds. The team also plans to add cross-cycle learning from verification patterns and tighter integration with planning and reflection systems. The code and artifacts are planned for release under CC BY 4.0 upon publication.