Primary regulatory and governance sources
The core regulations, standards, and governance references that shape enterprise AI deployment across major markets.
EU AI Act Implementation Timeline
The phased application schedule, as tracked by the Future of Life Institute. Prohibitions on unacceptable-risk practices took effect in February 2025; GPAI and governance rules followed in August 2025; most remaining obligations, including those for Annex III high-risk systems, apply from August 2026. Obligations for high-risk AI embedded in regulated products, and for GPAI models placed on the market before August 2025, apply from August 2027. This timeline sets the procurement clock for every enterprise deploying AI in or with the EU.
NIST AI Risk Management Framework (AI RMF)
Voluntary framework released January 2023. Addresses design, development, use, and evaluation of AI to incorporate trustworthiness and manage risks across individuals, organizations, and society. The de facto US federal reference for enterprise AI governance programs.
Singapore's Digital & AI Governance: A Pro-Innovation Model
Practitioner analysis of Singapore's voluntary, principles-based AI governance stack — IMDA's Model AI Governance Frameworks for general AI, generative AI, and the January 2026 Agentic AI framework. Covers PDPC data obligations, MAS FEAT principles, and the AI Verify testing toolkit.
Unpacking the December 11, 2025 US Executive Order on AI
Attorney analysis of the federal executive order actively pushing preemption of conflicting state AI bills and conditioning broadband and discretionary grant funding on state alignment. Critical reading for US-market AI deployments operating across multiple states.
AI Watch: Global Regulatory Tracker — China
Live regulatory tracker for China's layered AI legal framework, including the Interim Measures for the Management of Generative AI Services (effective August 2023), algorithm recommendation rules, and deep-synthesis provisions. Essential for teams operating any AI service that touches Chinese users or infrastructure.
Deploying AI in the Philippines
Philippine legal guidance covering data privacy obligations under the Data Privacy Act, NPC advisories, sector-specific rules, and the governance expectations for organizations deploying AI systems affecting Filipino residents. Directly relevant to Bayani.ai's home market.
ISO/IEC 42001 — AI Management System Standard
The international standard for AI management systems, analogous to ISO/IEC 27001 for information security. Establishes requirements for responsible development, provision, and use of AI-based products and services. Increasingly cited in enterprise procurement and vendor qualification.
Security frameworks, threat models, and identity guidance
Security architecture, adversarial threat models, and identity patterns for protecting AI systems in production.
Zero Trust for AI (ZT4AI): New Tools and Reference Architecture
Microsoft's March 2026 announcement extends Zero Trust principles across the full AI lifecycle — from data ingestion and model training through deployment and agent behavior. Introduces a new AI pillar in the Zero Trust Workshop, updated reference architecture, and concrete patterns for prompt-injection defense, agentic system security, and AI observability. The most current practitioner-level Zero Trust guidance for AI-native stacks.
OWASP Top 10 for LLM Applications 2025
The community standard for LLM security risk: Prompt Injection, Sensitive Information Disclosure, Supply Chain, Data and Model Poisoning, Improper Output Handling, Excessive Agency, System Prompt Leakage, Vector and Embedding Weaknesses, Misinformation, and Unbounded Consumption. Required reading for any team building production LLM systems.
NIST SP 800-207: Zero Trust Architecture
The foundational US federal publication defining Zero Trust Architecture tenets, components, and deployment models. Establishes the terminology that underpins enterprise Zero Trust programs, commonly summarized as verify explicitly, use least privilege, and assume breach, and now being extended to AI runtimes.
NIST AI 100-2: Adversarial Machine Learning
Taxonomy and terminology of AI adversarial attacks and mitigations. Covers evasion, poisoning, privacy, and abuse attacks across both predictive and generative AI. The risk taxonomy that procurement teams use to evaluate security posture in AI vendor assessments.
Agentic AI Identity Management Approach
Cloud Security Alliance guidance on identity challenges unique to autonomous AI agents — non-human identities, least-privilege tool grants, session scoping, and delegation chains. Directly applicable to MCP-connected agents operating across enterprise systems.
Disrupting AI Espionage: Anthropic Threat Intelligence
Anthropic's real-world analysis of how nation-state and commercial actors attempt to misuse large language models for espionage, influence operations, and capability development — and the technical and policy countermeasures that have proven effective.
State of AI Security Report — H1 2025
Trend Micro's threat landscape data covering observed AI-specific attacks in the first half of 2025. Quantifies attack volumes, vectors, and industry verticals most targeted, providing statistical grounding for AI security investment decisions.
Retrieval architecture, evaluation, and production quality
The research, benchmarks, and evaluation frameworks that matter when retrieval systems move from prototype to enterprise workload.
Contextual Retrieval: Reducing RAG Retrieval Failures by 67%
Anthropic's technique prepends chunk-specific explanatory context to each chunk before embedding and indexing, and it significantly outperforms standard RAG in Anthropic's tests. Contextual Embeddings alone cut the top-20 retrieval failure rate by 35% (5.7% → 3.7%). Combined with Contextual BM25, the reduction reaches 49% (to 2.9%). Adding a reranking step brings the failure rate to 1.9%, a 67% reduction. The reported one-time cost of generating the contextualized chunks, with prompt caching, is $1.02 per million document tokens. One of the most cited RAG improvements of 2024.
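The percentages above are relative reductions against the 5.7% baseline, not percentage-point drops. A minimal sketch that recomputes them from the failure rates quoted in the entry (the function and variable names are illustrative, not from Anthropic's write-up):

```python
def relative_reduction(baseline: float, improved: float) -> float:
    """Return the relative drop from baseline to improved, as a percentage."""
    return (baseline - improved) / baseline * 100


# Top-20 retrieval failure rate (%) for standard RAG, as quoted in the entry.
BASELINE_FAILURE_RATE = 5.7

# Failure rates (%) quoted in the entry for each cumulative configuration.
reported_failure_rates = {
    "contextual embeddings": 3.7,
    "contextual embeddings + contextual BM25": 2.9,
    "contextual embeddings + contextual BM25 + reranking": 1.9,
}

# Round to whole percentages, matching how the entry reports them.
reductions = {
    name: round(relative_reduction(BASELINE_FAILURE_RATE, rate))
    for name, rate in reported_failure_rates.items()
}

for name, pct in reductions.items():
    print(f"{name}: {pct}% fewer retrieval failures")
```

Keeping the baseline explicit makes it easier to compare these figures with results from other sources that report absolute rather than relative changes.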
Lost in the Middle: Long-Context RAG Position Bias
Foundational research demonstrating that LLMs perform significantly better when relevant information appears at the beginning or end of the retrieved context, not in the middle. Documents the "lost in the middle" phenomenon that affects retrieval ranking and reranking strategy decisions.
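One common response to this position bias is to reorder retrieved chunks so the strongest matches sit at the edges of the prompt and the weakest in the middle. The sketch below assumes the chunks arrive sorted from most to least relevant; the function name is hypothetical, and the reordering is a widely used mitigation rather than something the paper itself prescribes.

```python
def reorder_for_long_context(chunks_by_relevance: list[str]) -> list[str]:
    """Place the most relevant chunks at the start and end of the context.

    Input must be sorted from most to least relevant. Chunks are dealt
    alternately to the front and the back, so the least relevant ones
    end up in the middle of the returned list.
    """
    front: list[str] = []
    back: list[str] = []
    for index, chunk in enumerate(chunks_by_relevance):
        if index % 2 == 0:
            front.append(chunk)
        else:
            back.append(chunk)
    # Reverse the back half so the second-best chunk is the very last item.
    return front + back[::-1]


ranked = ["rank1", "rank2", "rank3", "rank4", "rank5"]
print(reorder_for_long_context(ranked))
# ['rank1', 'rank3', 'rank5', 'rank4', 'rank2']
```

Whether the reordering helps depends on the model and context length, so it is worth measuring against a plain relevance-ordered prompt before adopting it.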
Comprehensive Chunking Evaluation for Production RAG
2025 systematic evaluation of chunking strategies across domains, document types, and retrieval configurations. Identifies which chunking approaches preserve semantic coherence, minimize context fragmentation, and produce measurably higher retrieval precision in enterprise knowledge bases.
RAGAS — RAG Assessment Framework
The most widely adopted open-source framework for evaluating RAG pipeline quality. Measures faithfulness, answer relevancy, context precision, context recall, and context entity recall; several of these metrics need no human-labeled ground truth. Used by teams at Databricks, Cohere, and across enterprise AI programs.
Reranker Benchmark for RAG Pipelines
NVIDIA's systematic comparison of cross-encoder rerankers, bi-encoder models, and LLM-based reranking approaches for RAG pipeline optimization. Provides precision@k comparisons across domain types, guiding model selection for production reranking stages.
Long-Context RAG Performance Across LLMs
Databricks' empirical testing of long-context models in RAG scenarios, examining whether extended context windows reduce the need for chunking and retrieval or whether retrieval-augmented patterns still outperform pure long-context approaches at enterprise scale.
TruLens — RAG Evaluation and Observability
Open-source evaluation and observability platform for LLM applications. Provides the TruLens RAG Triad (context relevance, groundedness, and answer relevance) as real-time quality gauges, enabling continuous monitoring of RAG quality in production alongside batch evaluation.
Text-to-SQL, semantic layers, and safe data access
References for turning natural-language analytics into governed production capability rather than a risky demo surface.
Survey of Text-to-SQL: Benchmarks, Techniques, and Challenges
Comprehensive survey of Text-to-SQL research covering Spider, BIRD, and domain-specific benchmarks; schema-linking approaches; multi-turn dialogue; and the gap between academic accuracy scores and production-environment reliability. The canonical literature review for the field.
Semantic Layer vs. Text-to-SQL: The 2026 Decision Guide
dbt Labs' practitioner analysis of when a semantic layer (MetricFlow, LookML) outperforms raw Text-to-SQL, and when direct SQL generation is preferable. Compares governance posture, schema complexity, multi-tenant isolation, and maintenance overhead — the architectural fork most enterprise analytics teams face.
Prompt Injection Is Not SQL Injection — And Why That Matters
UK National Cyber Security Centre analysis distinguishing prompt injection attacks from traditional SQL injection, explaining why LLM-facing systems require different mitigations and why parameterized queries alone are insufficient. Essential reading for teams exposing Text-to-SQL to untrusted input.
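Because model-generated SQL cannot be made safe by parameterization alone, one complementary layer is to limit what the database connection is allowed to do. The sketch below is an illustration under stated assumptions, not the NCSC's recommendation: it uses Python's standard sqlite3 authorizer hook to permit only read operations, and the table, data, and function names are hypothetical. It limits the damage of a malicious query; it does not stop prompt injection itself.

```python
import sqlite3

# Only plain SELECT statements and column reads are allowed. SQL functions
# (for example SUM) are also denied unless sqlite3.SQLITE_FUNCTION is added.
READ_ONLY_ACTIONS = {sqlite3.SQLITE_SELECT, sqlite3.SQLITE_READ}


def read_only_authorizer(action, arg1, arg2, db_name, source):
    """Authorizer callback: allow read actions, deny everything else."""
    if action in READ_ONLY_ACTIONS:
        return sqlite3.SQLITE_OK
    return sqlite3.SQLITE_DENY


def run_generated_sql(conn: sqlite3.Connection, sql: str) -> list:
    """Execute one model-generated statement on a guarded connection."""
    return conn.execute(sql).fetchall()


# Hypothetical demo data, created before the guard is installed.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, total REAL)")
conn.execute("INSERT INTO orders VALUES (1, 9.5), (2, 20.0)")
conn.commit()
conn.set_authorizer(read_only_authorizer)

print(run_generated_sql(conn, "SELECT id, total FROM orders ORDER BY id"))

try:
    run_generated_sql(conn, "DROP TABLE orders")
except sqlite3.Error as exc:
    print(f"blocked: {exc}")
```

In a production database the equivalent control is usually a dedicated read-only role with access restricted to approved views, applied in addition to input and output checks.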
MAC-SQL: Multi-Agent Collaborative Text-to-SQL
COLING 2025 paper introducing MAC-SQL, a multi-agent framework that decomposes complex Text-to-SQL tasks across specialized sub-agents for schema decomposition, SQL generation, and self-correction. Reports state-of-the-art results on the BIRD benchmark at the time of publication, positioning agentic approaches as a strong candidate for production use.
Agentic Retrieval Patterns for Text-to-SQL
Recent research on agentic retrieval patterns that combine schema-aware retrieval, iterative self-correction, and execution-feedback loops. Documents the patterns that close the gap between single-shot Text-to-SQL accuracy and reliable, revision-capable systems suitable for production analytics.
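The execution-feedback loop mentioned above can be sketched in a few lines: run the generated SQL, and if the database rejects it, hand the error back to the generator for another attempt. This is a minimal illustration using sqlite3; `generate_sql` is a hypothetical stand-in for an LLM call, and the stub below simply simulates a first wrong answer followed by a corrected one.

```python
import sqlite3
from typing import Callable, Optional


def answer_with_feedback(
    conn: sqlite3.Connection,
    question: str,
    generate_sql: Callable[[str, Optional[str]], str],
    max_attempts: int = 3,
) -> tuple[str, list]:
    """Generate SQL, execute it, and retry with the error message on failure."""
    last_error: Optional[str] = None
    for _ in range(max_attempts):
        sql = generate_sql(question, last_error)
        try:
            return sql, conn.execute(sql).fetchall()
        except sqlite3.Error as exc:
            # Feed the failing statement and the database error back in.
            last_error = f"{sql!r} failed: {exc}"
    raise RuntimeError(f"no valid SQL after {max_attempts} attempts: {last_error}")


def stub_generator(question: str, error: Optional[str]) -> str:
    """Hypothetical generator: misspells a column first, then corrects it."""
    if error is None:
        return "SELECT nme FROM users"
    return "SELECT name FROM users ORDER BY name"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.execute("INSERT INTO users VALUES ('Ana'), ('Ben')")

final_sql, rows = answer_with_feedback(conn, "List all user names", stub_generator)
print(final_sql, rows)
```

A bounded attempt count matters here: without it, a generator that keeps producing invalid SQL would loop indefinitely and keep consuming model calls.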
VLDB: Text-to-SQL in Production Databases
VLDB paper examining Text-to-SQL performance across production databases with complex schemas, multi-table joins, and domain-specific terminology. Identifies schema documentation quality and example query caching as the two highest-leverage factors for real-world accuracy improvement.
Measurement frameworks for executives and CFOs
Decision support for leadership teams trying to distinguish AI activity from measurable business value.
From Promise to Impact: The Five-Layer AI Measurement Framework
McKinsey's structured approach to AI impact assessment across five layers: technical performance, user adoption, operational KPIs, strategic outcomes, and financial impact. Key finding: 60% of respondents have not seen enterprise-wide EBIT impact from AI — because horizontal tools like chatbots improve experience but rarely move the P&L. Introduces governance cadences, decision gates, and project-phase measurement milestones for moving AI from pilots to scaled value.
56% of CEOs See Zero AI ROI — Here's What the 12% Do Differently
PwC 2026 CEO Survey data showing 56% of CEOs report no revenue increase or cost decrease from AI. CEOs who do see returns are 2–3× more likely to have integrated AI deeply into decision-making and demand generation. Shifts the framing from deployment count to economic primitives and capability depth.
Enterprise AI ROI Framework
Atlassian's practitioner framework for measuring AI value across engineering, IT, and business operations teams. Distinguishes hard cost savings, soft productivity gains, and strategic capability uplift — providing a structured approach to building the business case for continued AI investment.
How to Measure and Prove the Value of Your AI Investments
ISACA's IT governance perspective on demonstrating AI investment value to boards and audit committees. Bridges the gap between operational AI metrics (latency, throughput) and the governance reporting that risk committees and auditors expect to see from enterprise AI programs.
Agentic AI vs. Traditional Copilots: 2026 ROI Decision Guide
Comparative ROI data on copilot-style assistants versus autonomous agentic AI across enterprise use cases in 2026. Documents the scenarios where agentic systems justify their higher implementation complexity, and where simpler copilot deployments deliver better risk-adjusted returns.
State of Enterprise Data & AI 2025
Atlan's annual enterprise survey on AI adoption patterns, data quality challenges, and the governance gaps slowing AI ROI. Identifies data catalog maturity, metadata quality, and data team AI literacy as the top predictors of successful enterprise AI programs.