IT Shared: Data Streaming

27 September 2016

A Beginner's Guide to Apache Flink – 12 Key Terms, Explained

Overview

In this post, I go through 12 core Apache Flink concepts to explain what Flink does and how it works. The article can serve as a beginner's overview of Flink and of streaming-engine terminology.

1. What is Apache Flink?

The origins of Apache Flink can be traced back to June 2008, when it began as a research project of the Database Systems and Information Management (DIMA) Group at the Technische Universität (TU) Berlin in Germany.

Apache Flink is an open source platform for distributed stream and batch data processing. It was initially designed as an alternative to MapReduce and the Hadoop Distributed File System (HDFS) of the original Hadoop stack.

According to the Apache Flink project, it is “an open source platform for distributed stream and batch data processing. Flink’s core is a streaming dataflow engine that provides data distribution, communication, and fault tolerance for distributed computations over data streams. Flink also builds batch processing on top of the streaming engine, overlaying native iteration support, managed memory, and program optimization.”

29 April 2015

How to count distinct?

Counting the number of distinct elements in a data set is a very common query. It can give you an idea of how many duplicates you are dealing with. Say, for example, that you have a set of transactions and you want to find out whether they come from a small set of frequent buyers or from many different customers. The answer can help you understand your clients and decide which marketing strategies to adopt.
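
To make the transaction example concrete, here is a minimal Python sketch of an exact distinct count. The sample data, the customer names, and the helper `count_distinct` are hypothetical and chosen only for illustration. The sketch assumes the whole data set fits in memory.

```python
from collections import Counter

# Hypothetical sample data: (transaction_id, customer_id) pairs.
transactions = [
    (1, "alice"),
    (2, "bob"),
    (3, "alice"),
    (4, "carol"),
    (5, "alice"),
    (6, "bob"),
]


def count_distinct(values):
    """Return the exact number of distinct elements in an iterable."""
    # A set keeps one copy of each value, so its size is the distinct count.
    return len(set(values))


customer_ids = [customer for _, customer in transactions]

# How many different customers made these transactions?
distinct_customers = count_distinct(customer_ids)

# How many transactions does each customer account for?
purchases_per_customer = Counter(customer_ids)

# Share of transactions that come from a "new" customer:
# close to 1.0 means mostly different customers,
# close to 0.0 means a few frequent buyers.
distinct_ratio = distinct_customers / len(transactions)

print(distinct_customers)
print(purchases_per_customer.most_common(1))
print(distinct_ratio)
```

With this sample, three customers account for six transactions, and one of them made half of the purchases, which points to a small group of repeat buyers.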

	Ahmet Anıl Pala
	Alexey Grigorev
	Andrés Vivanco Villamar
	Andres Felipe Zamora Montaño
	Elena Samota
	Guven Toprakkiran
	Hicham Akaoka Badssi
	José Luis Pino López
	Madalina Burghelea
	Maximiliano Ariel López
	Mia Johnson Vioulès
	Navid Mahlouji
	Nyami Ronald Mitterand
	Steffi Melinda
	Stephany García Martínez
	Tamara Mendt