Stream Lineage
Visualize producers, topics, processors, and consumers as a graph
Overview
Stream Lineage renders an interactive, left-to-right graph of every producer, topic, processor
(Managed Connect, Flink SQL, Tableflow), lake table, and consumer group running in your
(business_unit, stage) context. It answers the question
"who produces to and consumes from this topic, and what runs in between?" without leaving
the self-service portal.
The graph is reconstructed on demand by querying Confluent Cloud directly — Managed Connect, Flink SQL v1, Tableflow, the Telemetry (Metrics) API, the Schema Registry, and the Kafka REST endpoint — and mapping the results to the open OpenLineage object model. Nothing is stored: every page load rebuilds the graph from authoritative sources (with a short cache window — see Caching).
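As a rough illustration of the mapping described above, one connector binding could be expressed as an OpenLineage-style run event along these lines. This is a minimal sketch, not the portal's actual code: the `connector_event` helper, the `bu-a.dev` namespace, and the `s3-sink` connector name are hypothetical, and a fully spec-compliant event would also carry `producer` and `schemaURL` fields.

```python
import uuid
from datetime import datetime, timezone


def connector_event(namespace: str, connector: str,
                    inputs: list[str], outputs: list[str]) -> dict:
    """Build a minimal OpenLineage-style run event for one connector (hypothetical helper)."""
    return {
        "eventType": "RUNNING",
        "eventTime": datetime.now(timezone.utc).isoformat(),
        "run": {"runId": str(uuid.uuid4())},
        # The job is the thing that moves data (here, a connector).
        "job": {"namespace": namespace, "name": connector},
        # Datasets are the topics the job reads from and writes to.
        "inputs": [{"namespace": namespace, "name": t} for t in inputs],
        "outputs": [{"namespace": namespace, "name": t} for t in outputs],
    }


# A sink connector reads from a topic and has no Kafka output.
event = connector_event("bu-a.dev", "s3-sink", inputs=["orders"], outputs=[])
```

Jobs and datasets in such events correspond to the job and dataset nodes drawn in the graph.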
Reference
Access
Stream Lineage is admin-only. Both the page (/ui/lineage)
and the backing endpoint (GET /{business_unit}/{stage}/lineage)
require one of the following roles: BU admin, super-admin, or platform admin.
How It Works
On every uncached page load the backend issues parallel async calls to six Confluent APIs and merges the results into a single OpenLineage event stream. The table below lists seven logical sources because the Telemetry API backs both dataflow and dataflow_ksql. The frontend then bootstraps a Cytoscape.js graph with a dagre layout.
| Source | Confluent API | Contributes |
|---|---|---|
| connect | Managed Connect REST | Source & sink connector topic bindings |
| flink | Flink SQL v1 + Compute Pools (fcpm/v2) | RUNNING Flink statements → topic-to-topic edges (parsed from the SQL) |
| tableflow | Tableflow v1 | Topic → Delta / Iceberg table sync edges |
| dataflow | Telemetry API /v2/metrics/dataflow/query | Producer & consumer-group edges + per-topic throughput facet. Surfaces non-committing consumers (e.g. Spark Structured Streaming) that the documented consumer_lag_offsets metric never returns. |
| dataflow_ksql | Telemetry API (ksqlDB query metrics, gated on cluster presence) | ksqlDB query → topic edges (skipped silently when no ksqlDB cluster exists) |
| schema_registry | Schema Registry | Per-topic value & key schemas (facet enrichment) |
| kafka_rest | Kafka REST v3 | Topic partition counts & replication factor (facet enrichment) |
Each source runs independently. If one fails (auth, transport, 5xx), the others still contribute — see Partial Responses.
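The independent fan-out described above can be sketched with `asyncio.gather` and `return_exceptions=True`, so that one failing source does not cancel the others. The `fetch_connect` and `fetch_flink` coroutines and the `build_lineage` function are hypothetical stand-ins, not the backend's real names.

```python
import asyncio


async def fetch_connect() -> list[dict]:
    # Hypothetical stand-in for the Managed Connect call.
    return [{"job": "s3-sink", "inputs": ["orders"]}]


async def fetch_flink() -> list[dict]:
    # Hypothetical stand-in that simulates a failing source.
    raise RuntimeError("compute pool listing returned 503")


async def build_lineage(sources: dict) -> tuple[list[dict], list[str]]:
    """Run every source concurrently; return merged events and failed source names."""
    names = list(sources)
    # return_exceptions=True turns a raised exception into a result value,
    # so the remaining sources still complete and contribute.
    results = await asyncio.gather(
        *(sources[name]() for name in names), return_exceptions=True
    )
    events: list[dict] = []
    failed: list[str] = []
    for name, result in zip(names, results):
        if isinstance(result, BaseException):
            failed.append(name)
        else:
            events.extend(result)
    return events, failed


events, failed = asyncio.run(
    build_lineage({"connect": fetch_connect, "flink": fetch_flink})
)
print(failed)  # ['flink']
```

The list of failed names is the kind of information a partial response would need to report back to the caller.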
Reading the Graph
Data flows left to right. Two node families render distinctly, and crowded topics additionally get an aggregated badge node:
Dataset nodes (pink chips)
Kafka topics, Delta tables, and Iceberg tables. Topic nodes carry partition / replication / schema facets when the enrichments succeed.
Job nodes (teal chips)
Anything that moves data: Connect connectors, Flink statements, Tableflow syncs, producer client IDs, and consumer groups. Job nodes carry source-specific facets (connector class, Flink SQL excerpt, table format, throughput, etc.).
Badge nodes (aggregated)
When several producers or consumer groups attach to a single topic, they collapse into one PRODUCERS · N or CONSUMER GROUPS · N badge. Click the badge to expand the individual nodes inline.
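The badge aggregation can be pictured as a group-by on (topic, kind). The sketch below is illustrative only: the `collapse_into_badges` function, the edge tuple shape, and the threshold of two clients are assumptions, since the text only says "several".

```python
from collections import defaultdict


def collapse_into_badges(edges: list[tuple[str, str, str]],
                         threshold: int = 2) -> dict:
    """Group clients per (topic, kind) and attach a badge label when crowded.

    Each edge is (kind, client, topic), where kind is 'producer' or
    'consumer_group'. The threshold is an assumed value.
    """
    grouped: dict[tuple[str, str], list[str]] = defaultdict(list)
    for kind, client, topic in edges:
        grouped[(topic, kind)].append(client)

    nodes = {}
    for (topic, kind), clients in grouped.items():
        badge = None
        if len(clients) >= threshold:
            label = "PRODUCERS" if kind == "producer" else "CONSUMER GROUPS"
            badge = f"{label} · {len(clients)}"
        # Members are kept so that clicking the badge can expand them inline.
        nodes[(topic, kind)] = {"badge": badge, "members": sorted(clients)}
    return nodes


nodes = collapse_into_badges([
    ("producer", "svc-a", "orders"),
    ("producer", "svc-b", "orders"),
    ("producer", "svc-c", "orders"),
    ("consumer_group", "billing", "orders"),
])
```

Here the three producers collapse into one badge, while the single consumer group stays an ordinary node.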
Toolbar & Interaction
| Action | Behaviour |
|---|---|
| Click a node | Opens a right side-panel listing every OpenLineage facet attached to that node. |
| Click a badge | Expands the aggregated producers / consumer groups into individual nodes. |
| Refresh | Cache-busts the browser fetch. The server-side 60 s cache still applies; see Caching. |
| Export PNG | Downloads the full graph (not just the viewport) at 2× scale. |
Caching
Lineage responses are cached server-side for 60 seconds, keyed by
(business_unit, stage). Repeated loads within the window
share a single build — important because a full graph build can fan out a dozen or more API calls
against Confluent.
The browser also gets Cache-Control: private, max-age=60.
Newly created resources (topics, connectors, consumer groups) show up on the next build after the
TTL expires.
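A time-to-live cache keyed by (business_unit, stage) might look like the sketch below. The `TTLCache` class is hypothetical, and it leaves out how concurrent requests would share a build that is still in flight; the injectable clock exists only to make the expiry behaviour easy to demonstrate.

```python
import time
from typing import Callable


class TTLCache:
    """Cache one value per (business_unit, stage) for a fixed number of seconds."""

    def __init__(self, ttl_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: dict[tuple[str, str], tuple[float, object]] = {}

    def get_or_build(self, business_unit: str, stage: str,
                     build: Callable[[], object]) -> object:
        key = (business_unit, stage)
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]  # still fresh: skip the expensive build
        value = build()
        self._entries[key] = (now, value)
        return value


# Demonstration with a fake clock: two loads inside the window share one build.
fake_now = [0.0]
build_calls = []


def build_graph() -> dict:
    build_calls.append(1)
    return {"build_number": len(build_calls)}


cache = TTLCache(clock=lambda: fake_now[0])
first = cache.get_or_build("bu-a", "dev", build_graph)
fake_now[0] = 30.0
second = cache.get_or_build("bu-a", "dev", build_graph)  # served from cache
fake_now[0] = 61.0
third = cache.get_or_build("bu-a", "dev", build_graph)   # TTL expired: rebuilt
```

A resource created at second 10 in this timeline would first appear in the third result, which matches the behaviour described above.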
Partial Responses
Source failures never return a 5xx. The endpoint always answers 200 OK
with the events it could build, and adds two response headers when one or more sources failed:
X-Lineage-Failed-Sources: dataflow,flink
A yellow banner above the graph enumerates the failed sources so you can tell at a glance whether you are looking at the full picture or a degraded view. The most common causes:
| Failed source | Likely cause | Fix |
|---|---|---|
| dataflow / dataflow_ksql | Admin Cloud API key is missing the MetricsViewer role. | Grant MetricsViewer at org or env scope on the admin service account. |
| flink | Compute pool listing returned 5xx or Flink SQL v1 rejected the credential. | Verify the per-landing-zone flink_<region>_client_id / _secret entries in App Configuration. |
| connect | At least one connector's config could not be parsed. | Check application logs for the offending connector name; the entire source fails on a single parse error by design. |
| schema_registry / kafka_rest | Per-context credentials missing or expired. | Same credentials used by Topics & Schemas pages; verify those work first. |
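Because the endpoint answers 200 OK even for a degraded graph, a client has to inspect the X-Lineage-Failed-Sources header to detect a partial response. A minimal sketch, assuming the header is a comma-separated list as in the example above (the `failed_sources` helper is hypothetical):

```python
def failed_sources(headers: dict[str, str]) -> list[str]:
    """Return the source names listed in X-Lineage-Failed-Sources, or [] if absent."""
    # HTTP header names are case-insensitive, so normalise before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    raw = lowered.get("x-lineage-failed-sources", "")
    return [part.strip() for part in raw.split(",") if part.strip()]


degraded = failed_sources({"X-Lineage-Failed-Sources": "dataflow,flink"})
complete = failed_sources({"Content-Type": "application/json"})
```

An empty list means every source contributed; anything else names the rows of the table above worth checking first.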
Flink not configured ≠ failure
Limitations
No ksqlDB or self-managed Connect
EON does not run ksqlDB or self-managed Connect clusters today. Statements / connectors running outside Confluent Managed Connect & Cloud Flink will not appear.
Spark / non-committing consumers
Consumers that use assign() instead of
subscribe() (e.g. Spark Structured Streaming) don't
commit offsets and are invisible to the consumer_lag_offsets
metric. They render as consumer-group nodes only when the dataflow source
succeeds; if dataflow fails, they are missing from the graph.
SQL parsing is regex-based
Complex Flink SQL patterns (CTEs, lateral joins, deeply nested subqueries) may not yield edges. The statement still renders as a job node — only its input / output detection is affected.
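To see why pattern matching struggles with these constructs, consider a deliberately naive extractor. The two patterns and the `topic_edges` function below are assumptions made for illustration; the portal's actual patterns are not shown in this guide.

```python
import re

# Hypothetical patterns: the table after INSERT INTO is an output,
# and any table after FROM or JOIN is an input.
INSERT_RE = re.compile(r"\bINSERT\s+INTO\s+([`\w.]+)", re.IGNORECASE)
SOURCE_RE = re.compile(r"\b(?:FROM|JOIN)\s+([`\w.]+)", re.IGNORECASE)


def _unquote(name: str) -> str:
    return name.replace("`", "")


def topic_edges(sql: str) -> tuple[set[str], set[str]]:
    """Return (inputs, outputs) table names found by naive pattern matching."""
    outputs = {_unquote(m) for m in INSERT_RE.findall(sql)}
    inputs = {_unquote(m) for m in SOURCE_RE.findall(sql)} - outputs
    return inputs, outputs


# A simple statement parses as expected.
simple = topic_edges(
    "INSERT INTO enriched SELECT * FROM orders JOIN customers ON orders.cid = customers.id"
)

# A CTE confuses the matcher: the alias 'recent' is reported as if it were a topic.
with_cte = topic_edges(
    "WITH recent AS (SELECT * FROM orders) INSERT INTO enriched SELECT * FROM recent"
)
```

In the second call the CTE alias leaks into the inputs, which is the kind of false or missing edge this limitation refers to.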
10-minute lookback window
Producer and consumer edges come from the Telemetry API with a default 10-minute lookback.
Clients that haven't produced or committed in that window won't show up. Tunable via
CONFLUENT_METRICS_LOOKBACK_MINUTES.
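The lookback setting can be thought of as a start and end timestamp for the metrics query. The sketch below assumes the variable holds a whole number of minutes and formats the window as an ISO-8601 start/end pair; the `lookback_interval` helper and that exact interval format are assumptions, not confirmed details of the backend.

```python
import os
from collections.abc import Mapping
from datetime import datetime, timedelta, timezone


def lookback_interval(now: datetime, env: Mapping[str, str] = os.environ) -> str:
    """Return 'start/end' covering the configured lookback, defaulting to 10 minutes."""
    minutes = int(env.get("CONFLUENT_METRICS_LOOKBACK_MINUTES", "10"))
    start = now - timedelta(minutes=minutes)
    return f"{start.isoformat()}/{now.isoformat()}"


noon = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
default_window = lookback_interval(noon, env={})
wider_window = lookback_interval(noon, env={"CONFLUENT_METRICS_LOOKBACK_MINUTES": "30"})
```

A client that last produced at 11:45 falls outside the default window but inside the 30-minute one.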
Topic-level only
Column / field-level lineage is not extracted. Schemas are attached as facets on topic nodes for inspection, but edges are always topic-to-job-to-topic.
Desktop-only
Below 768 px the page shows an "open on desktop" placeholder. Large DAGs with badges are not usable on a phone screen.