
Apache Flink

Enterprise Beta

Apache Flink support in Ilum is currently available as a Beta feature for Enterprise deployments. Production deployments should validate Flink workloads against their specific use cases before relying on Flink for critical pipelines. Contact Ilum for enablement details.

Apache Flink is a distributed stream-processing engine designed for low-latency, event-driven workloads. In Ilum, Flink is exposed through the Apache Kyuubi SQL gateway as a peer engine alongside Spark, Trino, and DuckDB.

Flink is the right engine for:

  • Continuous data pipelines with sub-second latency requirements.
  • Event-time analytics with windowing and watermarks.
  • Real-time enrichment of streaming data against reference datasets.
  • Long-running streaming jobs with exactly-once semantics.
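The event-time windowing and watermark model referenced above can be sketched in plain Python. This is a conceptual illustration only, not Flink's API; real jobs would express the same logic in Flink SQL or the DataStream API. The function name and event shape here are hypothetical:

```python
from collections import defaultdict

def tumbling_windows(events, window_ms, max_out_of_orderness_ms):
    """Assign (timestamp_ms, value) events to tumbling event-time windows.

    A watermark trails the highest timestamp seen so far by
    `max_out_of_orderness_ms`. A window fires (here: sums its values) once
    the watermark passes its end; events arriving for an already-fired
    window are dropped as late, mirroring Flink's default behavior.
    """
    windows = defaultdict(list)   # window start -> buffered values
    fired = {}                    # window start -> aggregated result
    watermark = float("-inf")
    for ts, value in events:
        watermark = max(watermark, ts - max_out_of_orderness_ms)
        start = ts - (ts % window_ms)
        if start + window_ms <= watermark:
            continue  # late event: its window has already fired
        windows[start].append(value)
        # Fire every buffered window whose end is behind the watermark.
        for w_start in sorted(windows):
            if w_start + window_ms <= watermark:
                fired[w_start] = sum(windows.pop(w_start))
    # End of stream: flush the remaining windows.
    for w_start, values in windows.items():
        fired[w_start] = sum(values)
    return fired
```

Note how the out-of-order event at `ts=2500` still lands in its window because the watermark has not yet passed that window's end; this tolerance for disorder is the core reason streaming engines track watermarks at all.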

For batch ETL and large transformations, prefer Apache Spark. For interactive analytics, prefer Trino. For lightweight queries, prefer DuckDB.

For micro-batch streaming use cases (where streaming jobs reuse the same code as batch jobs), Spark Structured Streaming remains a strong default.
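The difference between micro-batch processing and Flink's record-at-a-time model can be sketched in plain Python (function names are illustrative, not part of any engine's API):

```python
def micro_batch(stream, batch_size, process):
    """Micro-batch style (Spark Structured Streaming): records are grouped
    into fixed-size batches, so latency is bounded below by the batch
    interval, but each batch runs the same code path as a batch job."""
    results = []
    for i in range(0, len(stream), batch_size):
        results.append(process(stream[i:i + batch_size]))
    return results

def continuous(stream, process):
    """Continuous style (Flink): each record is processed as it arrives,
    giving per-record latency at the cost of a streaming-native runtime."""
    return [process([record]) for record in stream]
```

For example, with `process=sum`, `micro_batch([1, 2, 3, 4, 5], 2, sum)` emits one result per batch while `continuous` emits one per record; that trade-off between latency and batch-code reuse is what the paragraph above refers to.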

Execution model

Flink runs as a JobManager and a configurable number of TaskManagers:

  • JobManager: Coordinates execution, manages checkpoints and savepoints, and tracks job state.
  • TaskManagers: Execute parallel stream operators, hold operator state, and emit watermarks.

Flink jobs are typically long-running, with state persisted to durable checkpoint storage on object storage.
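Conceptually, a checkpoint pairs operator state with a source offset, so recovery restores both and replays the source from that offset. The following is a toy model of that idea in plain Python (class and field names are hypothetical, and a list stands in for durable object storage):

```python
import copy

class CheckpointingCounter:
    """Toy model of Flink-style checkpointing: operator state is snapshotted
    together with the source offset, so a restart restores both and resumes
    from the last completed checkpoint rather than from scratch."""

    def __init__(self, checkpoint_every):
        self.checkpoint_every = checkpoint_every
        self.state = {"count": 0}
        self.offset = 0
        self.checkpoints = []  # stand-in for durable checkpoint storage

    def run(self, source, fail_at=None):
        """Consume `source` from the current offset, summing values into
        state and checkpointing every `checkpoint_every` records."""
        while self.offset < len(source):
            if fail_at is not None and self.offset == fail_at:
                raise RuntimeError("simulated TaskManager failure")
            self.state["count"] += source[self.offset]
            self.offset += 1
            if self.offset % self.checkpoint_every == 0:
                self.checkpoints.append((self.offset, copy.deepcopy(self.state)))

    def recover(self):
        """Restore the latest completed checkpoint (or the initial state)."""
        self.offset, state = (self.checkpoints[-1] if self.checkpoints
                              else (0, {"count": 0}))
        self.state = copy.deepcopy(state)
```

Because the offset and the state are snapshotted atomically, records processed after the last checkpoint are replayed exactly once after recovery, which is the essence of the exactly-once semantics mentioned earlier.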

Supported catalogs

When enabled, Flink in Ilum reads from and writes to the catalogs configured for the platform.

Catalog configuration is shared with the rest of the platform; Flink jobs see the same tables that Spark and Trino do.

When Flink is enabled in your Ilum deployment, it appears in the Engine Selector dropdown of the SQL Editor. The engine status indicator shows JobManager and TaskManager health.

When the automatic engine router is enabled, Flink is selected automatically for queries identified as streaming workloads.
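The routing decision itself is internal to Ilum and not documented here. Purely as an illustration of classifying a query as a streaming workload, a minimal keyword-based rule might look like this (all names and markers are hypothetical, not Ilum's actual logic):

```python
def pick_engine(sql: str) -> str:
    """Hypothetical sketch: route SQL containing streaming constructs
    (windowing table functions, watermark definitions, pattern matching)
    to Flink, and everything else to a batch-oriented default."""
    streaming_markers = ("TUMBLE(", "HOP(", "SESSION(",
                         "WATERMARK", "MATCH_RECOGNIZE")
    text = sql.upper()
    if any(marker in text for marker in streaming_markers):
        return "flink"
    return "spark"  # batch-oriented fallback
```

A real router would also weigh table properties, source connectors, and session hints, but the shape of the decision, inspecting the query for streaming constructs, is the same.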

Roadmap

Flink is on track to graduate from Enterprise Beta to general availability in an upcoming release. The roadmap includes:

  • Self-service enablement through the Modules registry.
  • Expanded catalog connector coverage.
  • Tighter integration with the automatic engine router for hybrid batch and streaming workloads.