Whitepapers & Research

02

Regulations & Compliance

Primary regulatory and governance sources

The core regulations, standards, and governance references that shape enterprise AI deployment across major markets.

EU AI Act · Official Timeline · 2024-2031

EU AI Act Implementation Timeline

The phased application schedule for the EU AI Act, maintained by the Future of Life Institute. Prohibitions took effect in February 2025; GPAI and governance rules in August 2025; most remaining obligations, including those for Annex III high-risk systems, apply from August 2026. Providers of GPAI models placed on the market before August 2025 must comply by August 2027. The timeline that sets the procurement clock for every enterprise deploying AI in or with the EU.

NIST Framework

NIST AI Risk Management Framework (AI RMF)

Voluntary framework released January 2023. Addresses design, development, use, and evaluation of AI to incorporate trustworthiness and manage risks to individuals, organizations, and society. The de facto US federal reference for enterprise AI governance programs.

Duane Morris Analysis

Singapore's Digital & AI Governance: A Pro-Innovation Model

Practitioner analysis of Singapore's voluntary, principles-based AI governance stack — IMDA's Model AI Governance Frameworks for general AI, generative AI, and the January 2026 Agentic AI framework. Covers PDPC data obligations, MAS FEAT principles, and the AI Verify testing toolkit.

Sidley Austin Legal Analysis

Unpacking the December 11, 2025 US Executive Order on AI

Attorney analysis of the federal executive order actively pushing preemption of conflicting state AI bills and conditioning broadband and discretionary grant funding on state alignment. Critical reading for US-market AI deployments operating across multiple states.

White & Case Regulatory Tracker

AI Watch: Global Regulatory Tracker — China

Live regulatory tracker for China's layered AI legal framework, including the Generative AI Regulations (August 2023), algorithm recommendation rules, and deep-synthesis provisions. Essential for teams operating any AI service that touches Chinese users or infrastructure.

Quisumbing Torres Legal Guidance

Deploying AI in the Philippines

Philippine legal guidance covering data privacy obligations under the Data Privacy Act, NPC advisories, sector-specific rules, and the governance expectations for organizations deploying AI systems affecting Filipino residents. Directly relevant to Bayani.ai's home market.

ISO Standard

ISO/IEC 42001 — AI Management System Standard

The international standard for AI management systems, analogous to ISO 27001 for information security. Establishes requirements for responsible development, provision, and use of AI-based products and services. Increasingly cited in enterprise procurement and vendor qualification.

03

Security & Zero Trust

Security frameworks, threat models, and identity guidance

Security architecture, adversarial threat models, and identity patterns for protecting AI systems in production.

Microsoft Security · March 2026

Zero Trust for AI (ZT4AI): New Tools and Reference Architecture

Microsoft's March 2026 announcement extends Zero Trust principles across the full AI lifecycle — from data ingestion and model training through deployment and agent behavior. Introduces a new AI pillar in the Zero Trust Workshop, updated reference architecture, and concrete patterns for prompt-injection defense, agentic system security, and AI observability. The most current practitioner-level Zero Trust guidance for AI-native stacks.

OWASP v2025

OWASP Top 10 for LLM Applications 2025

The community standard for LLM security risk: Prompt Injection, Sensitive Information Disclosure, Supply Chain, Data and Model Poisoning, Improper Output Handling, Excessive Agency, System Prompt Leakage, Vector and Embedding Weaknesses, Misinformation, and Unbounded Consumption. Required reading for any team building production LLM systems.

NIST SP 800-207

NIST SP 800-207: Zero Trust Architecture

The foundational US federal publication (August 2020) defining Zero Trust Architecture tenets, logical components, and deployment models. Establishes the terminology behind the principles commonly summarized as verify explicitly, use least privilege, and assume breach, which underpin enterprise Zero Trust programs and are now being extended to AI runtimes.

NIST AI 100-2 · 2025

NIST AI 100-2: Adversarial Machine Learning

Taxonomy and terminology of AI adversarial attacks and mitigations. Covers evasion, poisoning, privacy, and abuse attacks across both predictive and generative AI. The risk taxonomy that procurement teams use to evaluate security posture in AI vendor assessments.

CSA Guidance

Agentic AI Identity Management Approach

Cloud Security Alliance guidance on identity challenges unique to autonomous AI agents — non-human identities, least-privilege tool grants, session scoping, and delegation chains. Directly applicable to MCP-connected agents operating across enterprise systems.
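The ideas named above (least-privilege tool grants, session scoping, delegation chains) can be illustrated with a small sketch. This is not the CSA's prescribed design; `ToolGrant`, `is_allowed`, `delegate`, and the tool names are hypothetical, chosen only to show how a delegated grant can be kept narrower and shorter-lived than its parent.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import FrozenSet, Tuple


@dataclass(frozen=True)
class ToolGrant:
    """Hypothetical record of what one non-human identity may do, and until when."""
    agent_id: str
    tools: FrozenSet[str]
    expires_at: datetime                 # session scoping: every grant expires
    delegation_chain: Tuple[str, ...]    # who handed authority down, oldest first


def is_allowed(grant: ToolGrant, tool: str, now: datetime) -> bool:
    """A call is allowed only if the tool is granted and the session is still live."""
    return tool in grant.tools and now < grant.expires_at


def delegate(parent: ToolGrant, child_agent: str, tools: FrozenSet[str],
             ttl: timedelta, now: datetime) -> ToolGrant:
    """Issue a child grant that is a subset of the parent and cannot outlive it."""
    if not tools <= parent.tools:
        raise PermissionError("child grant exceeds parent scope")
    return ToolGrant(
        agent_id=child_agent,
        tools=tools,
        expires_at=min(parent.expires_at, now + ttl),
        delegation_chain=parent.delegation_chain + (parent.agent_id,),
    )
```

In this sketch the delegation chain is append-only, so an audit log can reconstruct which human or agent originally authorized a given tool call.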

Anthropic Threat Intelligence

Disrupting AI Espionage: Anthropic Threat Intelligence

Anthropic's real-world analysis of how nation-state and commercial actors attempt to misuse large language models for espionage, influence operations, and capability development — and the technical and policy countermeasures that have proven effective.

Trend Micro Industry Report · H1 2025

State of AI Security Report — H1 2025

Trend Micro's threat landscape data covering observed AI-specific attacks in the first half of 2025. Quantifies attack volumes, vectors, and industry verticals most targeted, providing statistical grounding for AI security investment decisions.

04

RAG & Knowledge Systems

Retrieval architecture, evaluation, and production quality

The research, benchmarks, and evaluation frameworks that matter when retrieval systems move from prototype to enterprise workload.

Anthropic · Published Sep 2024

Contextual Retrieval: Reducing RAG Retrieval Failures by 67%

Anthropic's technique prepends chunk-specific explanatory context to each chunk before embedding and indexing, and it significantly outperforms standard RAG in Anthropic's tests. Contextual Embeddings alone cut the top-20 retrieval failure rate by 35% (5.7% → 3.7%). Combined with Contextual BM25, the reduction reaches 49% (5.7% → 2.9%). Adding a reranking step brings the failure rate to 1.9%, a 67% reduction. Anthropic estimates the one-time cost of generating the contextualized chunks at $1.02 per million document tokens, assuming prompt caching. One of the most cited RAG improvements of 2024.
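The percentages quoted above are relative reductions against the 5.7% baseline, which is easy to misread as percentage-point drops. A short sketch of the arithmetic follows; the function names and the 50-million-token corpus size are illustrative assumptions, while the rates and the per-million-token cost come from the entry itself.

```python
BASELINE_FAILURE_RATE = 5.7   # top-20 retrieval failure rate (%) for standard RAG
COST_PER_MILLION_TOKENS = 1.02  # USD, one-time contextualization cost quoted above


def relative_reduction(baseline: float, improved: float) -> float:
    """Percentage reduction in failure rate relative to the baseline."""
    return (baseline - improved) / baseline * 100


def contextualization_cost(document_tokens: int) -> float:
    """Estimated one-time cost in USD for a corpus of the given size."""
    return document_tokens / 1_000_000 * COST_PER_MILLION_TOKENS


# Failure rates (%) reported for each stage of the pipeline.
stages = {
    "contextual embeddings": 3.7,
    "+ contextual BM25": 2.9,
    "+ reranking": 1.9,
}
reductions = {
    name: round(relative_reduction(BASELINE_FAILURE_RATE, rate))
    for name, rate in stages.items()
}
# reductions maps the three stages to 35, 49, and 67 respectively.
```

Under these figures, a hypothetical 50-million-token corpus would cost roughly $51 to contextualize once.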

arXiv Academic Paper

Lost in the Middle: Long-Context RAG Position Bias

Foundational research demonstrating that LLMs perform significantly better when relevant information appears at the beginning or end of the retrieved context, not in the middle. Documents the "lost in the middle" phenomenon that affects retrieval ranking and reranking strategy decisions.
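One common response to this finding is to reorder retrieved chunks so the strongest ones sit at the edges of the prompt and the weakest in the middle. The sketch below is an illustration of that idea, not a method taken from the paper; `reorder_for_edges` is a hypothetical helper and it assumes the input is already sorted most-relevant first.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def reorder_for_edges(ranked: Sequence[T]) -> List[T]:
    """Place the best items at the start and end, the weakest in the middle.

    `ranked` must be sorted most-relevant first. Even-ranked items fill the
    front in order; odd-ranked items fill the back in reverse order.
    """
    front: List[T] = []
    back: List[T] = []
    for index, item in enumerate(ranked):
        if index % 2 == 0:
            front.append(item)
        else:
            back.append(item)
    return front + back[::-1]
```

For five chunks ranked a through e, this yields the order a, c, e, d, b: the top result opens the context, the runner-up closes it, and the weakest chunk lands in the center.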

arXiv Academic Paper · 2025

Comprehensive Chunking Evaluation for Production RAG

2025 systematic evaluation of chunking strategies across domains, document types, and retrieval configurations. Identifies which chunking approaches preserve semantic coherence, minimize context fragmentation, and produce measurably higher retrieval precision in enterprise knowledge bases.
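For readers new to the topic, the simplest strategy such evaluations typically use as a baseline is fixed-size chunking with overlap. The sketch below is a generic illustration, not a configuration from the paper; `chunk_words` and its default sizes are assumptions, and it counts whitespace-separated words rather than model tokens.

```python
from typing import List


def chunk_words(text: str, size: int = 200, overlap: int = 40) -> List[str]:
    """Split text into chunks of `size` words, each sharing `overlap` words
    with the previous chunk so sentences cut at a boundary appear in both."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("require size > 0 and 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    chunks: List[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reached the end of the text
    return chunks
```

Larger overlap reduces context fragmentation at the cost of a bigger index, which is the kind of trade-off a chunking evaluation quantifies.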

Open Source Eval Framework

RAGAS — RAG Assessment Framework

One of the most widely adopted open-source frameworks for evaluating RAG pipeline quality. Measures faithfulness, answer relevancy, context precision, context recall, and context entity recall. Several metrics need no human-labeled ground truth, while others, such as context recall, compare against a reference answer. Used by teams at Databricks, Cohere, and across enterprise AI programs.

NVIDIA · arXiv Benchmark Paper

Reranker Benchmark for RAG Pipelines

NVIDIA's systematic comparison of cross-encoder rerankers, bi-encoder models, and LLM-based reranking approaches for RAG pipeline optimization. Provides precision@k comparisons across domain types, guiding model selection for production reranking stages.
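As a reminder of what a precision@k comparison measures, here is a minimal sketch. The document IDs and relevance labels are invented for illustration and are not drawn from the benchmark; `precision_at_k` is a hypothetical helper using the standard definition.

```python
from typing import Sequence, Set


def precision_at_k(ranked_ids: Sequence[str], relevant_ids: Set[str], k: int) -> float:
    """Fraction of the top-k ranked documents that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant_ids)
    return hits / k


# Hypothetical example: the same candidates before and after a reranking stage.
relevant = {"d1", "d2"}
before_rerank = ["d3", "d1", "d7", "d2"]
after_rerank = ["d1", "d2", "d3", "d7"]
```

Here precision@2 rises from 0.5 to 1.0 after reranking, even though the candidate set is unchanged; only the order differs.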

Databricks Industry Research

Long-Context RAG Performance Across LLMs

Databricks' empirical testing of long-context models in RAG scenarios, examining whether extended context windows reduce the need for chunking and retrieval or whether retrieval-augmented patterns still outperform pure long-context approaches at enterprise scale.

TruEra Eval Tool

TruLens — RAG Evaluation and Observability

Open-source evaluation and observability platform for LLM applications. Provides the TruLens RAG Triad (context relevance, groundedness, and answer relevance) as real-time quality gauges, enabling continuous monitoring of RAG quality in production alongside batch evaluation.

05

Governed Analytics & Text-to-SQL

Text-to-SQL, semantic layers, and safe data access

References for turning natural-language analytics into governed production capability rather than a risky demo surface.

arXiv Survey · 2024

Survey of Text-to-SQL: Benchmarks, Techniques, and Challenges

Comprehensive survey of Text-to-SQL research covering Spider, BIRD, and domain-specific benchmarks; schema-linking approaches; multi-turn dialogue; and the gap between academic accuracy scores and production-environment reliability. The canonical literature review for the field.

dbt Labs Architecture · 2026

Semantic Layer vs. Text-to-SQL: The 2026 Decision Guide

dbt Labs' practitioner analysis of when a semantic layer (MetricFlow, LookML) outperforms raw Text-to-SQL, and when direct SQL generation is preferable. Compares governance posture, schema complexity, multi-tenant isolation, and maintenance overhead — the architectural fork most enterprise analytics teams face.

UK NCSC Security Guidance

Prompt Injection Is Not SQL Injection — And Why That Matters

UK National Cyber Security Centre analysis distinguishing prompt injection attacks from traditional SQL injection, explaining why LLM-facing systems require different mitigations and why parameterized queries alone are insufficient. Essential reading for teams exposing Text-to-SQL to untrusted input.
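Because prompt injection cannot be filtered out the way SQL injection can be parameterized away, a practical response is to limit what model-generated SQL is able to do at the database layer. The sketch below uses SQLite's authorizer hook from Python's standard `sqlite3` module to allow only reads on an allowlisted table. It is an illustration under stated assumptions, not NCSC's recommendation: the table names, the function allowlist, and the helper names are hypothetical, and a production system would more likely rely on database roles and read-only replicas.

```python
import sqlite3

ALLOWED_TABLES = {"orders"}                       # hypothetical allowlist
ALLOWED_FUNCTIONS = {"count", "sum", "avg", "min", "max"}


def _authorizer(action, arg1, arg2, db_name, source):
    """Permit SELECT statements that read allowlisted tables; deny everything else."""
    if action == sqlite3.SQLITE_SELECT:
        return sqlite3.SQLITE_OK
    if action == sqlite3.SQLITE_READ and arg1 in ALLOWED_TABLES:
        return sqlite3.SQLITE_OK                  # arg1 is the table being read
    if action == sqlite3.SQLITE_FUNCTION and (arg2 or "").lower() in ALLOWED_FUNCTIONS:
        return sqlite3.SQLITE_OK                  # arg2 is the SQL function name
    return sqlite3.SQLITE_DENY


def make_demo_connection() -> sqlite3.Connection:
    """Build an in-memory database, then lock it down before untrusted SQL runs."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL);
        CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT);
        INSERT INTO orders (total) VALUES (10.0), (20.0);
        INSERT INTO users (email) VALUES ('a@example.com');
        """
    )
    conn.set_authorizer(_authorizer)
    return conn


def run_readonly(conn: sqlite3.Connection, generated_sql: str):
    """Execute one model-generated statement; denied actions raise sqlite3.Error."""
    return conn.execute(generated_sql).fetchall()
```

With this in place, an aggregate over `orders` succeeds, while a `DROP TABLE` or a read from `users` is rejected by the database itself regardless of what the prompt persuaded the model to generate.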

ACL / COLING Conference Paper · 2025

MAC-SQL: Multi-Agent Collaborative Text-to-SQL

COLING 2025 paper introducing MAC-SQL, a multi-agent framework that splits complex Text-to-SQL tasks across three specialized agents: a Selector that prunes the schema, a Decomposer that breaks the question into sub-questions and generates SQL, and a Refiner that corrects SQL using execution feedback. Reported state-of-the-art execution accuracy on the BIRD benchmark at the time of publication, making the case for agentic approaches in production.

arXiv Research · 2025

Agentic Retrieval Patterns for Text-to-SQL

Recent research on agentic retrieval patterns that combine schema-aware retrieval, iterative self-correction, and execution-feedback loops. Documents the patterns that close the gap between single-shot Text-to-SQL accuracy and reliable, revision-capable systems suitable for production analytics.
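An execution-feedback loop of the kind described above can be sketched in a few lines. This is a generic illustration, not the method of any specific paper: `answer_with_feedback` and `fake_generator` are hypothetical, and the generator stands in for an LLM call that receives the previous database error as extra context.

```python
import sqlite3
from typing import Callable, List, Optional, Tuple

# (question, feedback from the previous failed attempt or None) -> SQL string
SqlGenerator = Callable[[str, Optional[str]], str]


def answer_with_feedback(
    conn: sqlite3.Connection,
    question: str,
    generate_sql: SqlGenerator,
    max_attempts: int = 3,
) -> Tuple[str, List[tuple], int]:
    """Generate SQL, execute it, and feed any database error into the next attempt.

    Returns the SQL that ran, its rows, and the number of attempts used.
    """
    feedback: Optional[str] = None
    for attempt in range(1, max_attempts + 1):
        sql = generate_sql(question, feedback)
        try:
            rows = conn.execute(sql).fetchall()
            return sql, rows, attempt
        except sqlite3.Error as exc:
            feedback = f"Previous SQL failed: {sql!r} with error: {exc}"
    raise RuntimeError(f"no executable SQL after {max_attempts} attempts")


def fake_generator(question: str, feedback: Optional[str]) -> str:
    """Stand-in for a model: guesses a wrong column first, then corrects itself."""
    if feedback is None:
        return "SELECT sum(amount) FROM orders"
    return "SELECT sum(total) FROM orders"
```

The loop only catches statements that fail to execute; a query that runs but answers the wrong question still needs separate validation, such as result checks or human review.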

VLDB DB Conference Paper

VLDB: Text-to-SQL in Production Databases

VLDB paper examining Text-to-SQL performance across production databases with complex schemas, multi-table joins, and domain-specific terminology. Identifies schema documentation quality and example query caching as the two highest-leverage factors for real-world accuracy improvement.

06

AI ROI & Executive Guidance

Measurement frameworks for executives and CFOs

Decision support for leadership teams trying to distinguish AI activity from measurable business value.

McKinsey QuantumBlack · April 2026

From Promise to Impact: The Five-Layer AI Measurement Framework

McKinsey's structured approach to AI impact assessment across five layers: technical performance, user adoption, operational KPIs, strategic outcomes, and financial impact. Key finding: 60% of respondents have not seen enterprise-wide EBIT impact from AI — because horizontal tools like chatbots improve experience but rarely move the P&L. Introduces governance cadences, decision gates, and project-phase measurement milestones for moving AI from pilots to scaled value.

Forbes Executive Briefing · Jan 2026

56% of CEOs See Zero AI ROI — Here's What the 12% Do Differently

PwC 2026 CEO Survey data showing 56% of CEOs report no revenue increase or cost decrease from AI. CEOs who do see returns are 2–3× more likely to have integrated AI deeply into decision-making and demand generation. Shifts the framing from deployment count to economic primitives and capability depth.

Atlassian ROI Framework

Enterprise AI ROI Framework

Atlassian's practitioner framework for measuring AI value across engineering, IT, and business operations teams. Distinguishes hard cost savings, soft productivity gains, and strategic capability uplift — providing a structured approach to building the business case for continued AI investment.

ISACA Governance Guidance

How to Measure and Prove the Value of Your AI Investments

ISACA's IT governance perspective on demonstrating AI investment value to boards and audit committees. Bridges the gap between operational AI metrics (latency, throughput) and the governance reporting that risk committees and auditors expect to see from enterprise AI programs.

TechStoriess Decision Guide · 2026

Agentic AI vs. Traditional Copilots: 2026 ROI Decision Guide

Comparative ROI data on copilot-style assistants versus autonomous agentic AI across enterprise use cases in 2026. Documents the scenarios where agentic systems justify their higher implementation complexity, and where simpler copilot deployments deliver better risk-adjusted returns.

Atlan Industry Report · 2025

State of Enterprise Data & AI 2025

Atlan's annual enterprise survey on AI adoption patterns, data quality challenges, and the governance gaps slowing AI ROI. Identifies data catalog maturity, metadata quality, and data team AI literacy as the top predictors of successful enterprise AI programs.

Enterprise AI research,
governance, and implementation references

Put the research to work in your organization.
