Overview
A comprehensive edge-to-cloud real-time streaming data platform.
Cloudera DataFlow (CDF) is a scalable, real-time streaming data platform that ingests, curates, and analyzes data for key insights and immediate actionable intelligence. DataFlow addresses the following challenges:
- Processing real-time data streaming at high volume and high scale
- Tracking data provenance and lineage of streaming data
- Managing and monitoring edge applications and streaming sources
- Gaining real-time insights and actionable intelligence from streaming data
Extend DataFlow to Cloudera Data Platform
DataFlow capabilities are available within CDP Public Cloud in two deployment options, DataFlow for Data Hub and DataFlow for the Public Cloud, both detailed below. Take advantage of CDP Public Cloud’s key benefits, such as quick cluster provisioning, management, and monitoring, as well as SDX’s unified security and governance across the data lifecycle.
Deployment options
Edge-to-cloud streaming data platform across on-premises, public cloud, and hybrid cloud environments.
DataFlow for Data Hub
- Spin up Apache NiFi clusters quickly for high-scale data ingestion with Flow Management for Data Hub
- Extend on-premises Apache Kafka clusters to the public cloud with Streams Messaging for Data Hub
- Accelerate real-time stream processing with Apache Flink on hybrid cloud with Streaming Analytics for Data Hub
DataFlow for the Public Cloud
- Reduce cloud infrastructure costs by enabling auto-scaling on cloud-native flows
- Manage and monitor all NiFi flows across multiple cloud clusters from a centralized dashboard
- Accelerate development by leveraging pre-built NiFi flows from a gallery of ReadyFlows
The Cloudera DataFlow Platform
Edge & Flow Management
Manage, control, and monitor the edge for streaming and IoT initiatives and deliver real-time streaming data with no-code ingestion and management.
Streams Messaging
Buffer and scale massive volumes of ingested data to serve the real-time data needs of other enterprise and cloud applications.
Stream Processing & Analytics
Deliver real-time insights that improve detection of and response to critical events and drive valuable business outcomes.
Use cases
Logging Modernization
Unlock the value of machine-generated data with CDF’s Logging Modernization.
Logging Modernization is a holistic approach toward unlocking the value of machine-generated data by lowering processing costs and enabling a range of new analytics use cases. This is achieved by ingesting log data in real time, processing it at the edge, transforming it, and routing it through to descriptive, prescriptive, and predictive analytics.
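To make the ingest, transform, and route steps more concrete, here is a minimal sketch in plain Python. It does not use any CDF or NiFi API. The log format ("LEVEL message"), the destination names, and the choice to drop debug lines at the edge are all assumptions made for illustration only.

```python
from collections import defaultdict


def parse_line(line):
    """Transform step: turn a raw 'LEVEL message' line into a structured record."""
    level, _, message = line.strip().partition(" ")
    return {"level": level.upper(), "message": message}


def route(record):
    """Routing step: choose a (hypothetical) destination based on severity."""
    if record["level"] in ("ERROR", "FATAL"):
        return "alerts"
    if record["level"] == "DEBUG":
        # Assumption: low-value lines are filtered early to reduce downstream cost.
        return "dropped"
    return "analytics"


def process(lines):
    """Ingest step: read raw lines, transform each one, and group by destination."""
    destinations = defaultdict(list)
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        record = parse_line(line)
        destinations[route(record)].append(record)
    return dict(destinations)


sample = ["ERROR disk full on node-3", "INFO job finished", "debug cache miss"]
routed = process(sample)
```

In a real deployment each destination would be a messaging topic or analytics store rather than an in-memory list; the sketch only shows how the stages relate to one another.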
Customer 360
Get the complete view of your customer by gathering all their data from multiple sources.
One of the primary digital transformation initiatives across organizations is to understand the full picture of their customers. But customer data exists across multiple data sources such as traditional enterprise databases, data lakes, cloud stores, and social feeds. CDF’s data ingestion and messaging capabilities let you ingest, combine, enrich, and process data from all these sources seamlessly and deliver a full 360-degree view of your customer.
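As a rough illustration of what combining customer data from several sources can look like, the sketch below merges records keyed by a shared customer identifier. It is plain Python, not CDF functionality. The field names (such as customer_id), the source names, and the rule that later sources overwrite earlier ones are assumptions chosen for the example.

```python
def build_customer_view(*sources):
    """Merge records from several (source_name, records) pairs into one profile per customer."""
    view = {}
    for source_name, records in sources:
        for record in records:
            customer_id = record["customer_id"]
            profile = view.setdefault(
                customer_id, {"customer_id": customer_id, "sources": []}
            )
            # Enrich the profile; fields from later sources overwrite earlier ones.
            for key, value in record.items():
                if key != "customer_id":
                    profile[key] = value
            profile["sources"].append(source_name)
    return view


# Hypothetical records from two of the source types mentioned above.
database = [{"customer_id": 1, "name": "Ada", "email": "ada@example.com"}]
social = [
    {"customer_id": 1, "handle": "@ada"},
    {"customer_id": 2, "handle": "@bob"},
]

view = build_customer_view(("database", database), ("social", social))
```

Customer 1 ends up with fields from both sources, while customer 2 has only the social feed data, which shows where a profile is still incomplete.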
Real-time insights
Predict failures and take corrective actions in real time.
Your IoT or streaming analytics implementations are only as good as your ability to harness the value of the data you ingest in real time. IoT use cases like predictive maintenance or patient monitoring require the data to be instantly consumed and processed to generate predictive and prescriptive analytics in real time. These can be truly life-saving insights in some use cases.
Update your stream processing workloads to CDP
Cloudera Data Platform (CDP) is the new enterprise data cloud. With Cloudera DataFlow as part of CDP, businesses can operate streaming workloads from edge data collection to streaming analytics in the cloud.