Apache Flink is a framework and distributed processing engine for stateful computations over unbounded and bounded data streams. Flink has been designed to run in all common cluster environments, perform computations at in-memory speed and at any scale.
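Features
unbounded data: has a start but no end
bounded data: has a start and an end
Flink can process both; they correspond to stream processing and batch processing, respectively.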
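To make the bounded/unbounded distinction concrete, here is a minimal sketch in plain Python. It does not use the Flink API; `running_sum` is a hypothetical stand-in for a stateful operator, and the generator from `itertools.count` merely simulates a source that never ends.

```python
from itertools import count, islice


def running_sum(stream):
    """Stateful operator: emit the running total after each element."""
    total = 0  # operator state, kept across elements
    for value in stream:
        total += value
        yield total


# Bounded input: has a start and an end, so the full result can be materialized.
bounded = [1, 2, 3, 4]
bounded_result = list(running_sum(bounded))

# Unbounded input: has a start but no end, so results are consumed incrementally.
unbounded = count(1)  # 1, 2, 3, ... never terminates
first_four = list(islice(running_sum(unbounded), 4))

print(bounded_result)  # [1, 3, 6, 10]
print(first_four)      # [1, 3, 6, 10]
```

The same operator code handles both inputs. The only difference is whether the input ever ends, which is the sense in which batch processing can be seen as stream processing over a bounded stream.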
Layered APIs
3. Comparison
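Spark: Streaming, Structured Streaming; batch-first, with stream processing treated as a special case of batch processing (mini-batch)
Flink: stream-first, with batch processing treated as a special case of stream processing
Storm: stream processing, with the Tuple as its data unit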
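The difference between the mini-batch model and the record-at-a-time model can be sketched in plain Python. This is a conceptual illustration only, not Spark or Flink code. The function names and the count-based `batch_size` parameter are assumptions made for the example; real mini-batch engines typically cut batches by time interval rather than by element count.

```python
def per_record(events):
    """Stream-first style: emit one result per event as soon as it arrives."""
    for event in events:
        yield event * 2


def micro_batch(events, batch_size):
    """Mini-batch style: buffer events, then process each small batch as a unit."""
    batch = []
    for event in events:
        batch.append(event)
        if len(batch) == batch_size:
            yield [x * 2 for x in batch]
            batch = []
    if batch:  # flush the trailing partial batch once the input ends
        yield [x * 2 for x in batch]


events = [1, 2, 3, 4, 5]
per_record_out = list(per_record(events))
micro_batch_out = list(micro_batch(events, batch_size=2))

print(per_record_out)   # [2, 4, 6, 8, 10]
print(micro_batch_out)  # [[2, 4], [6, 8], [10]]
```

In the mini-batch version, no result for an event is visible until its batch is complete, which is where the extra latency of that model comes from.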
Use cases
Event-driven Applications
  Fraud detection
  Anomaly detection
  Rule-based alerting
  Business process monitoring
  Web application (social network)
Data Analytics Applications
  Quality monitoring of Telco networks
  Analysis of product updates & experiment evaluation in mobile applications
  Ad-hoc analysis of live data in consumer technology
  Large-scale graph analysis
Data Pipeline Applications
  Real-time search index building in e-commerce
  Continuous ETL in e-commerce
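As a rough illustration of the event-driven category (fraud detection, rule-based alerting), the following plain-Python sketch keeps per-account state and raises an alert when a large transaction directly follows a small one on the same account. It is not Flink code. The rule, the `SMALL` and `LARGE` thresholds, and the `detect_fraud` name are assumptions chosen for the example.

```python
from collections import defaultdict

SMALL = 1.00    # hypothetical threshold for a "small" probe transaction
LARGE = 500.00  # hypothetical threshold for a "large" transaction


def detect_fraud(transactions):
    """Yield (account, amount) when a large amount directly follows a small one."""
    saw_small = defaultdict(bool)  # keyed state: one flag per account
    for account, amount in transactions:
        if saw_small[account] and amount > LARGE:
            yield (account, amount)
        # Remember only whether the most recent transaction was small.
        saw_small[account] = amount < SMALL


transactions = [
    ("a", 0.50), ("b", 700.00), ("a", 900.00),
    ("b", 0.10), ("b", 20.00), ("b", 600.00),
]
alerts = list(detect_fraud(transactions))
print(alerts)  # [('a', 900.0)]
```

Account "b" raises no alert because a normal transaction (20.00) sits between its small and large ones, which resets the flag. Each event is handled as it arrives using only the state stored for its key, which is the general shape of an event-driven application.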