Get Flink up and running in a few simple steps.
Flink runs on Linux, Mac OS X, and Windows. The only requirement is a working Java 7.x (or higher) installation. Windows users should take a look at the Flink on Windows guide, which describes how to run Flink on Windows for local setups.
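A quick way to confirm the Java requirement is to read the version string that `java -version` prints and compare its major number against 7. The following is a minimal sketch, not part of the Flink distribution; the function names `java_major` and `installed_java_version` are made up for illustration, and the parsing assumes the usual `version "..."` line on stderr.

```shell
# Extract the major Java version from a version string such as
# "1.7.0_80" (legacy "1.x" scheme) or "11.0.2" (newer scheme).
java_major() {
    v="$1"
    case "$v" in
        1.*) v="${v#1.}" ;;    # legacy scheme: drop the leading "1."
    esac
    echo "${v%%[._-]*}"        # keep everything before the first ".", "_" or "-"
}

# Read the quoted version string from `java -version` (written to stderr).
# Assumption: the output contains a line of the form: ... version "X" ...
installed_java_version() {
    java -version 2>&1 | sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1
}
```

With these two helpers, `[ "$(java_major "$(installed_java_version)")" -ge 7 ] && echo "Java is recent enough"` would report whether the installed Java meets the requirement.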
Download the ready-to-run binary package. Choose the Flink distribution that matches your Hadoop version. If you are unsure which version to choose, or you just want to run locally, pick the package for Hadoop 1.2.
$ cd ~/Downloads        # Go to download directory
$ tar xzf flink-*.tgz   # Unpack the downloaded archive
$ cd flink-0.10.2
$ bin/start-local.sh    # Start Flink
Check the JobManager’s web frontend at http://localhost:8081 and make sure everything is up and running.
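The web frontend may take a few seconds to come up after the start script returns, so a scripted setup might poll for it instead of checking once. The sketch below is a generic retry helper, not a Flink tool; the name `wait_until` and its arguments are assumptions made for illustration.

```shell
# Retry a command until it succeeds or the attempts run out.
# Usage: wait_until <attempts> <delay-seconds> <command> [args...]
wait_until() {
    attempts="$1"; delay="$2"; shift 2
    i=0
    while [ "$i" -lt "$attempts" ]; do
        if "$@" >/dev/null 2>&1; then
            return 0           # the command succeeded
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1                   # every attempt failed
}
```

For example, `wait_until 30 1 curl -sf http://localhost:8081 && echo "Flink is up"` would probe the frontend once per second for up to 30 attempts, assuming `curl` is installed.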
Instead of starting Flink with bin/start-local.sh, you can also start Flink in a streaming-optimized mode.
Run the Word Count example to see Flink at work.
Download test data:
$ wget -O hamlet.txt http://www.gutenberg.org/cache/epub/1787/pg1787.txt
Start the example program:
$ bin/flink run ./examples/WordCount.jar file://`pwd`/hamlet.txt file://`pwd`/wordcount-result.txt
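Once the job has finished, the result can be inspected with standard tools. The sketch below assumes that the output is a single text file with one `word count` pair per line, separated by a space; if the job ran with a parallelism greater than one, the output path may instead be a directory of part files. The function name `top_words` is made up for illustration.

```shell
# Print the N most frequent words from a "word count" result file.
# $1: result file, $2: number of entries to show (default 10)
top_words() {
    # Sort by the count column (numeric, descending), then by word.
    sort -k2,2nr -k1,1 "$1" | head -n "${2:-10}"
}
```

Running `top_words wordcount-result.txt 5` from the Flink directory would then list the five most frequent words.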
Running Flink on a cluster is as easy as running it locally. Having passwordless SSH and the same directory structure on all your cluster nodes lets you use our scripts to control everything.
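Both prerequisites can be checked before starting anything. The following sketch logs in to each node non-interactively and tests that the Flink directory exists there. It is not one of the Flink scripts; the function name `check_nodes`, the `SSH` override variable, and the path in the usage note are assumptions made for illustration.

```shell
# Check that every host accepts a passwordless login and has the Flink
# directory at the given path.
# Usage: check_nodes <flink-dir> <host> [host...]
# The SSH variable can point at a different command, e.g. for a dry run.
check_nodes() {
    flink_dir="$1"; shift
    failed=0
    for host in "$@"; do
        # BatchMode=yes makes ssh fail instead of asking for a password.
        if ${SSH:-ssh} -o BatchMode=yes "$host" test -d "$flink_dir" </dev/null; then
            echo "ok $host"
        else
            echo "FAILED $host"
            failed=1
        fi
    done
    return "$failed"
}
```

A call such as `check_nodes /opt/flink-0.10.2 master worker1 worker2` (with a hypothetical installation path) prints one line per node and returns a non-zero status if any node fails.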
conf/flink-conf.yaml to its IP or hostname. Make sure that all nodes in your cluster have the same
You can now start the cluster on your master node with bin/start-cluster.sh. If you are planning to run only streaming jobs with Flink, you can also use an optimized streaming mode.
The following example illustrates the setup with three nodes (with IP addresses from 10.0.0.1 to 10.0.0.3 and hostnames master, worker1, worker2) and shows the contents of the configuration files, which need to be accessible at the same path on all machines:
Have a look at the Configuration section of the documentation to see other available configuration options. For Flink to run efficiently, a few configuration values need to be set.
are very important configuration values.
You can easily deploy Flink on your existing YARN cluster.
./bin/yarn-session.sh. You can run the client with options -n 10 -tm 8192 to allocate 10 TaskManagers with 8 GB of memory each.
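To see how much memory such a session asks of the cluster, multiply the two option values: the `-tm` value is given in megabytes, so 8192 corresponds to 8 GB per TaskManager. The sketch below does that arithmetic; the function name `yarn_session_total_mb` is made up for illustration, and the figure covers only the TaskManagers, not any memory needed for the JobManager.

```shell
# Total TaskManager memory requested from YARN, in MB.
# $1: number of TaskManagers (-n), $2: memory per TaskManager in MB (-tm)
yarn_session_total_mb() {
    echo $(( $1 * $2 ))
}

yarn_session_total_mb 10 8192   # prints 81920, i.e. 80 GB in total
```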
For more detailed instructions, check out the programming guides and examples.