Azure Blob Storage

Azure Blob Storage is a Microsoft-managed service providing cloud storage for a variety of use cases. You can use Azure Blob Storage with Flink for reading and writing data, as well as in conjunction with the streaming state backends.

You can use Azure Blob Storage objects like regular files by specifying paths in the following format:

wasb://<your-container>@<your-azure-account>.blob.core.windows.net/<object-path>

// SSL encrypted access
wasbs://<your-container>@<your-azure-account>.blob.core.windows.net/<object-path>
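
As a minimal sketch of how such a path is assembled from its parts, the following helper builds the URI with the standard java.net.URI class. WasbPaths, its build method, and the sample container and account names are hypothetical and not part of Flink. The sketch assumes the account name follows the @ sign directly.

```java
import java.net.URI;

/** Hypothetical helper, not part of Flink: assembles a WASB path from its parts. */
final class WasbPaths {

    private WasbPaths() {}

    /**
     * Builds wasb://<container>@<account>.blob.core.windows.net/<objectPath>,
     * or the wasbs:// variant when ssl is true.
     */
    static URI build(String container, String account, String objectPath, boolean ssl) {
        String scheme = ssl ? "wasbs" : "wasb";
        // Strip a leading slash so the path is not separated from the host by "//".
        String path = objectPath.startsWith("/") ? objectPath.substring(1) : objectPath;
        return URI.create(
                scheme + "://" + container + "@" + account + ".blob.core.windows.net/" + path);
    }

    public static void main(String[] args) {
        // Sample values only; replace them with your own container, account, and object path.
        System.out.println(build("my-container", "myaccount", "data/input.txt", false));
        System.out.println(build("my-container", "myaccount", "data/input.txt", true));
    }
}
```

The resulting string can be passed wherever Flink accepts a file path, provided the Azure file system described below is installed and configured.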

The following example shows how to use Azure Blob Storage with Flink:

// Read from Azure Blob Storage
env.readTextFile("wasb://<your-container>@<your-azure-account>.blob.core.windows.net/<object-path>");

// Write to Azure Blob Storage
stream.writeAsText("wasb://<your-container>@<your-azure-account>.blob.core.windows.net/<object-path>");

// Use Azure Blob Storage as FsStateBackend
env.setStateBackend(new FsStateBackend("wasb://<your-container>@<your-azure-account>.blob.core.windows.net/<object-path>"));

Shaded Hadoop Azure Blob Storage file system

To use flink-azure-fs-hadoop, copy the respective JAR file from the opt directory to the lib directory of your Flink distribution before starting Flink, e.g.

cp ./opt/flink-azure-fs-hadoop-1.10-SNAPSHOT.jar ./lib/

flink-azure-fs-hadoop registers default FileSystem wrappers for URIs with the wasb:// and wasbs:// (SSL encrypted access) schemes.

Configuration setup

After setting up the Azure Blob Storage FileSystem wrapper, you need to configure credentials to make sure that Flink is allowed to access Azure Blob Storage.

To allow for easy adoption, you can use the same configuration keys in flink-conf.yaml as in Hadoop’s core-site.xml.

You can see the configuration keys in the Hadoop Azure Blob Storage documentation.

The following configuration is required and must be added to flink-conf.yaml:

fs.azure.account.key.youraccount.blob.core.windows.net: Azure Blob Storage access key
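
The key name embeds the storage account name. The sketch below shows how the entry might be derived for a given account and printed in the "key: value" form used by flink-conf.yaml. AzureKeyConfig and the AZURE_STORAGE_KEY environment variable are hypothetical names chosen for illustration, not part of Flink or Hadoop.

```java
/** Hypothetical helper, not part of Flink: formats the account-key entry for flink-conf.yaml. */
final class AzureKeyConfig {

    private AzureKeyConfig() {}

    /** Returns the configuration key for the given storage account name. */
    static String configKey(String account) {
        return "fs.azure.account.key." + account + ".blob.core.windows.net";
    }

    /** Returns one "key: value" line for the given account and access key. */
    static String configLine(String account, String accessKey) {
        return configKey(account) + ": " + accessKey;
    }

    public static void main(String[] args) {
        // AZURE_STORAGE_KEY is an assumed variable name; a placeholder is printed if it is unset.
        String accessKey =
                System.getenv().getOrDefault("AZURE_STORAGE_KEY", "<your-access-key>");
        System.out.println(configLine("youraccount", accessKey));
    }
}
```

Reading the access key from the environment rather than hard-coding it keeps the secret out of source control. The printed line still has to be placed in flink-conf.yaml by hand or by a deployment script.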
