
New in Confluent Intelligence: Real-Time Context Engine Upgrade, New Model Support, ML Functions, and More

As AI models become increasingly interchangeable, what matters isn’t which large language model (LLM) you choose – it’s whether your agents can see and act on the live state of your business. Context is the real competitive advantage. And when that context is stale, fragmented, or locked behind brittle point‑to‑point integrations, even the best models fail to deliver reliable decisions in production.

Confluent Intelligence was built to solve this problem. It’s a fully managed service on Confluent for building real-time, context‑rich, and trustworthy AI systems on a unified data streaming platform that brings together Apache Kafka® and Apache Flink®. It allows you to stream operational events, continuously enrich them with external data, apply built‑in AI/ML functions, and power Streaming Agents and other AI applications with fresh, governed context—without stitching together infrastructure.

Today, we’re moving Confluent Intelligence into its next phase: the Real-Time Context Engine is now generally available (GA), alongside expanded agent operations, more model options, and new ML functions for real-time AI.

Q2’26: What’s New in Confluent Intelligence

We’re excited to announce the following new capabilities:

  • Real-Time Context Engine (GA): Upgrade includes low‑latency enhanced querying capabilities such as filters, ranges, compound queries, projections, and ordering, so agents can access rich context without relying on external databases.

  • Streaming Agents (GA): Production‑ready, event‑driven agents running natively on Flink and Kafka, now with GA support and enterprise‑grade operations.

  • Agent Management Console (GA): Centralized, UI-driven control plane for creating and operating Streaming Agents at scale.

  • Additional model support:

    • TimesFM (Early Access [EA]): Time-series forecasting models embedded into streaming pipelines.

    • Anthropic (GA): Native support for Anthropic models.

    • Fireworks AI (GA): Access to a broad catalog of optimized foundation models.

  • Built-in ML functions:

    • Multivariate anomaly detection (Open Preview [OP]): Moved from EA to OP for detecting complex anomalies across multiple correlated metrics.

    • PII detection (EA): Automatically detect and redact sensitive fields in real time.

    • Sentiment analysis (EA): Score sentiment on streams of events and text for real-time customer service and operations.

Confluent Intelligence demo highlighting A2A Integration and Multivariate Anomaly Detection.

Let’s dive into key areas in more detail.

Real-Time Context Engine (GA): Rich Queries Directly on Live Tables

Real-Time Context Engine continuously serves fresh context to any AI agent or application at low latency via the Model Context Protocol (MCP). Until now, it focused on primary key lookups: blazing fast, row‑by‑row retrieval for agents. With GA, it expands its role as a low‑latency context engine to let you query tables far more flexibly—without standing up and managing separate databases.

At GA, Real-Time Context Engine adds the following:

  • Enhanced query support: Filters, ranges, compound queries, projections, ordering, and more, all at low latency on data in motion.

  • Scale to any size: Designed to scale with the volume and cardinality of your streaming data, so growing traffic doesn’t force you into separate operational databases.

  • Full schema support:

    • All schema types: AVRO, JSON, and Protobuf, with schema evolution support.

    • Nested values: First‑class support for nested and complex structures.

    • All data types: Coverage across character, string, numeric, and other data types.

  • Terraform and UI support for control plane operations:

    • Enable or disable topics feeding Real-Time Context Engine.

    • Update tool descriptions used by agents and other MCP clients.

    • Manage table exposure and life cycle programmatically as part of your platform automation.

For most teams, the hardest part of production AI is context engineering—assembling the right slice of fresh, governed data at decision time. Learn more from the ebook The Complete Guide to Context Engineering for AI and get started with docs here.

Use Case Example: Real-Time Credit Decisioning

A bank may have tables for customer profiles and risk segments, recent transaction windows, device fingerprints, and geo patterns. When a new credit application arrives, an agent can do the following:

  1. Use Real-Time Context Engine for advanced querying on the fly

  2. Use the built-in ML function for anomaly detection to find application inconsistencies

  3. Call an LLM directly using Flink SQL for explanation generation

  4. Make an approve/decline/needs‑review decision in real time, not hours or days later
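Steps 2 and 3 can be sketched in a single Flink SQL statement (step 1 happens over MCP via Real-Time Context Engine). The table, model, and column names below are hypothetical, and the built-in function signatures are illustrative, so check the docs for the exact surface:

```sql
-- Sketch only: tables, models, and columns are hypothetical, and the
-- built-in function shapes may differ from the documented signatures.
SELECT
  app.application_id,
  app.customer_id,
  anomaly.is_anomaly,    -- step 2: built-in anomaly detection on the stream
  llm.explanation        -- step 3: LLM-generated explanation via model inference
FROM credit_applications AS app,
  LATERAL TABLE(
    ML_DETECT_ANOMALIES(app.txn_amount, app.event_time)
  ) AS anomaly(is_anomaly),
  LATERAL TABLE(
    ML_PREDICT('credit_explainer_llm', app.application_summary)
  ) AS llm(explanation)
WHERE app.status = 'NEW';
```

The approve/decline/needs‑review decision (step 4) can then be emitted to a downstream topic with a standard INSERT INTO.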

Streaming Agents (GA) and Agent Management Console (GA): Production-Ready, Event-Driven AI

Streaming Agents bring agentic AI directly into your data streams. Instead of polling data warehouses or relying on batch snapshots, event-driven Streaming Agents continuously monitor live business signals and take autonomous action in real time.

With GA for Streaming Agents, you now get enterprise‑grade reliability, with a four nines SLA, production support, and consistent runtime behavior for long‑lived agents consuming high‑throughput streams. There’s also an agent reflection pattern that lets agents iteratively critique and refine their own outputs before emitting a single, trusted event into the stream. Start building your own event-driven agents in minutes with the Quickstart and docs.
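The reflection pattern described above, where an agent critiques its own draft before emitting a final event, can be sketched as two chained model calls in Flink SQL. All model and table names here are illustrative, not part of any documented API:

```sql
-- Illustrative sketch of agent reflection: draft an answer, critique and
-- refine it, then emit a single trusted event. Names are hypothetical.
SELECT
  e.event_id,
  refined.final_answer
FROM input_events AS e,
  LATERAL TABLE(
    ML_PREDICT('drafting_llm', e.payload)
  ) AS draft(answer),
  LATERAL TABLE(
    ML_PREDICT('critic_llm', CONCAT('Task: ', e.payload, ' Draft: ', draft.answer))
  ) AS refined(final_answer);
```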

Now with the Agent Management Console, you gain more operational control over Streaming Agents directly in the Confluent Cloud UI. The console pulls agent creation and operations into a single, visual experience: Streaming Agents show up as first-class resources, so developers and platform teams can see how an agent is wired (its inputs and outputs, prompts, models, tools, and tables) without digging through code.

The new Agent Management Console enables teams to do the following:

  • Make agents visible and manageable. See all agents, their statuses, and their core configurations in one place instead of going through SQL and jobs.

  • Accelerate iteration. Create and refine prompts, models, tools, and data wiring through a guided console, not just code.

  • Operate with confidence. Test agents with sample inputs, monitor live runs, and inspect logs to improve accuracy, latency, and reliability.

In practice, the Agent Management Console becomes a place where AI, data, and platform teams can collaborate on how agents are configured and how they behave in production, using the same rigor they already apply to other core services on Confluent Cloud.

Because everything is event‑driven and replayable, you can iterate on pipeline and agent logic with full observability and auditability.

Multivariate Anomaly Detection (OP), PII Detection, and Sentiment Analysis: ML Made Easy

In Q1, we introduced multivariate anomaly detection as an Early Access built‑in ML function to find anomalies across multiple correlated metrics at once. This quarter, it moves to Open Preview, and we’re adding two new ML functions in EA that are focused on trust and customer experience.
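To give a flavor of how built-in ML functions sit inline in a pipeline, here is a sketch of sentiment scoring on a support stream. The function, table, and column names are hypothetical (the EA surface may differ); the point is that scoring happens directly in Flink SQL, with no separate ML service:

```sql
-- Hypothetical names; illustrates inline ML scoring on a live stream.
SELECT
  ticket_id,
  message,
  sentiment.label,                     -- e.g., POSITIVE / NEGATIVE / NEUTRAL
  sentiment.score
FROM support_messages,
  LATERAL TABLE(ML_SENTIMENT(message)) AS sentiment(label, score)
WHERE sentiment.label = 'NEGATIVE';    -- route unhappy customers in real time
```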

Flink Vector Search: Now Supporting Azure Cosmos DB and Amazon S3 Vectors

Many teams already store embeddings in Azure Cosmos DB or Amazon S3 Vectors but can’t query them directly from their streaming pipelines. Flink vector search on Azure Cosmos DB and Amazon S3 Vectors closes this gap by making both systems first-class vector providers in Confluent’s external tables and search fabric—alongside MongoDB, Elastic, Couchbase, and Pinecone.

With this release, you can:

  • Query Cosmos DB or S3 Vectors directly from Flink SQL to retrieve k‑nearest‑neighbor (kNN) results in real time, before prompting your LLM.

  • Run a single streaming pipeline that handles ingestion, transformation, embedding creation, vector search, and model inference instead of chaining multiple microservices.

  • Ground Streaming Agents and RAG applications in relevant context, mitigating hallucinations and improving answer quality.

  • Avoid duplicating data into extra operational stores by querying vectors in place via external tables and search.

  • Stay future-proof by using the same Flink and Streaming Agents patterns across all supported vector databases.
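A kNN lookup from Flink SQL might look like the following sketch. The external table, embedding column, and search-function shape are all illustrative; consult the vector search docs for the exact syntax:

```sql
-- Sketch: 'kb_chunks' is an external table backed by Cosmos DB or
-- S3 Vectors. The function shape is illustrative, not the exact API.
SELECT
  q.ticket_id,
  hit.chunk_text,
  hit.score
FROM ticket_embeddings AS q,
  LATERAL TABLE(
    VECTOR_SEARCH(kb_chunks, 5, DESCRIPTOR(embedding), q.query_embedding)
  ) AS hit(chunk_text, score);
```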

For Amazon S3 Vectors, you also get low-cost, durable vector storage that easily integrates with your S3 data lake and AWS services such as Amazon Bedrock—enabling Streaming Agents to combine semantic context retrieved from S3 with Bedrock models in the same Flink job. 

Learn more from docs here.

Vector Search Use Case: Real-Time Customer Support With Fresh Context

Suppose you’re building a customer support agent that needs the most relevant context at prompt time from knowledge base articles and recent call transcripts in Cosmos DB or S3 Vectors.

With Confluent Intelligence:

  • A Streaming Agent monitors Kafka topics for incoming support tickets and user events.

  • For each ticket, Flink SQL performs vector search against Cosmos DB or S3 Vectors to retrieve the most semantically relevant content.

  • The Streaming Agent combines these vector search results with real-time streams (e.g., loyalty status, recent browsing and cart activity, order/refund history) and calls the LLM to create a hyperpersonalized recommendation.

Because all of this is running in the same Flink stream processing pipeline, you can unify AI and data processing workflows while avoiding the additional cost and sprawl of separate ingestion jobs, bespoke RAG services, and ad hoc synchronization between vector databases and data sources.
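Assuming the tickets have already been joined with their kNN hits and live customer signals upstream, the final model call is one more lateral join in the same pipeline. Every name below is illustrative:

```sql
-- Sketch: 'enriched_tickets' already carries retrieved knowledge-base
-- chunks plus loyalty, cart, and order context from upstream joins.
INSERT INTO support_replies
SELECT
  t.ticket_id,
  llm.reply
FROM enriched_tickets AS t,
  LATERAL TABLE(
    ML_PREDICT(
      'support_llm',
      CONCAT('Context: ', t.context_chunks, ' Question: ', t.question)
    )
  ) AS llm(reply);
```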

AWS and Azure Private Link for Model Inference, External Tables, and Vector Search: Secure Networking for AI

For many teams, the last mile of connecting AI workflows to sensitive systems of record isn’t about APIs or schemas. It’s networking. 

AWS and Azure Private Link for model inference, external tables, and vector search enables private, VPC‑to‑VPC connectivity from Flink to your external models and systems—so Confluent Intelligence (including Streaming Agents) can call LLMs and enrich real-time streams with sensitive, proprietary enterprise data from external databases, vector stores, and REST endpoints without going over the public internet. For many organizations, this helps unblock production deployments and ensures information security compliance.

Private Link enables you to:

  • Keep AI traffic private and compliant: Use Private Link so that lookups against CRM, enterprise resource planning (ERP), vector stores, and REST APIs happen over private network paths.

  • Safely join Kafka streams with systems of record: Give Streaming Agents fresh, complete context (e.g., customer records, orders, policies) while meeting strict security requirements.

  • Reduce networking complexity: Replace ad hoc proxies and tunnels with standardized, cloud-native private connectivity backed by Confluent.

Learn more from docs here.

Support for Open Source MCP Server for Confluent Cloud: Production Backing for Any MCP Client

Teams adopting Anthropic’s MCP want a simple, standardized way for AI agents to tap into Confluent Cloud—Kafka, Flink, connectors, Tableflow, Schema Registry, and more—using the open source Confluent MCP server. Until now, they’ve largely been on their own: deploying and operating the MCP server themselves, wiring it to each MCP client (e.g., Claude Desktop, Goose, Gemini CLI), and handling issues without formal vendor backing.

Confluent now provides official support for the open source MCP server for Confluent Cloud, turning it into a production‑ready, vendor‑backed bridge between your MCP-based agents and real-time, governed data on Confluent.

This support means you can:

  • Give MCP agents direct access to fresh Confluent Cloud data and operations far beyond just managing Kafka topics—covering connectors, Flink, Tableflow, the billing and metrics APIs, and more. You can also use it to debug issues like backpressure.

  • Manage Confluent via natural language—configuring topics, running Flink SQL, and interacting with data infrastructure from MCP clients without code.

  • Raise issues via GitHub or your account team, with Confluent engineering working to resolve them under defined SLAs.

Start Building Real-Time, Context-Rich AI With Confluent Intelligence

To capitalize on the full potential of AI, teams need AI agents and applications that can see, understand, and act on the live state of the business. Confluent Intelligence brings together streaming and stream processing for context engineering, ML, vector search for RAG, Streaming Agents, and open protocols such as MCP and A2A to make that possible on a fully managed, governed platform.

Get started with Confluent Intelligence to turn your AI initiatives into real-time production systems today!


Apache®, Apache Kafka®, Apache Flink®, Flink®, and the respective logos are trademarks of the Apache Software Foundation in the United States and/or other countries. No endorsement by the Apache Software Foundation is implied by using these marks. All other trademarks are the property of their respective owners.

  • This blog was a collaborative effort between multiple Confluent employees.
