How to install Apache Kafka on Ubuntu 20.04?

Overview

Apache Kafka is an open-source distributed event streaming platform that is widely used for real-time data processing and analysis. Kafka is designed to handle large volumes of real-time data from multiple sources and distribute them to various applications and systems. It is a distributed system, meaning that it can run on a cluster of servers, allowing for increased scalability, fault tolerance, high throughput and low latency. In this tutorial, we'll walk you through the process of how to install Apache Kafka on Ubuntu 20.04.

Prerequisites

There are certain prerequisites that need to be met before you begin:

  • Server running Ubuntu 20.04

  • SSH access to the server and a text editor (this tutorial uses nano)

  • User account with root or sudo access

  • Internet connection

Get Started

Step 1: Update your System

· Before installing Kafka, it is recommended to update your system to ensure that all the necessary packages are up to date. You can do this by running the following command:

sudo apt update && sudo apt upgrade -y

Step 2: Install Stable Java Version

· To run Kafka, you'll need to install Java on your system. Fortunately, you can easily install the open-source implementation of Java, called OpenJDK, by running the following command:

sudo apt install openjdk-11-jre-headless -y

· Verify the installed Java version with the following command:

java -version

Note: If the above command returns an error, try the following command instead:

java --version
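The version check can also be scripted, which helps when the Kafka service file later in this tutorial needs a JAVA_HOME path that matches the installed Java. The following is a minimal sketch, not part of the original steps: the function names java_major_version and java_home_from_binary are hypothetical helpers, and the second one assumes the java binary sits in a bin directory directly under the Java home, as it does for the OpenJDK 11 package on Ubuntu.

```shell
# Extract the Java major version from the first line of `java -version` output.
# Handles both the modern scheme ("11.0.18") and the legacy one ("1.8.0_362").
java_major_version() {
    local line="$1" version major
    # Pull out the quoted version string, e.g. 11.0.18
    version=$(printf '%s\n' "$line" | sed -n 's/.*version "\([^"]*\)".*/\1/p')
    [ -n "$version" ] || return 1
    major=${version%%.*}
    if [ "$major" = "1" ]; then
        # Legacy scheme: 1.8.0_362 means Java 8
        version=${version#1.}
        major=${version%%.*}
    fi
    printf '%s\n' "$major"
}

# Derive a JAVA_HOME candidate from the resolved path of the java binary
# by stripping the trailing /bin/java (assumed layout, see the text above).
java_home_from_binary() {
    local binary="$1"
    case "$binary" in
        */bin/java) printf '%s\n' "${binary%/bin/java}" ;;
        *) return 1 ;;
    esac
}
```

A possible way to call these on the server: java_major_version "$(java -version 2>&1 | head -n 1)" prints the major version (java -version writes to standard error, hence the 2>&1), and java_home_from_binary "$(readlink -f "$(command -v java)")" prints a JAVA_HOME candidate.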

Step 3: Download latest Apache Kafka

Next, download Kafka. Visit the Kafka downloads page on the Apache website and select the latest stable release. At the time of writing, the latest version is 3.4.0. Copy the download link for the binary tarball file and pass it to wget. If this link no longer works, older releases are typically available from the Apache archive at https://archive.apache.org/dist/kafka/.

wget https://downloads.apache.org/kafka/3.4.0/kafka_2.13-3.4.0.tgz
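Before extracting the archive, you may want to confirm that the download is intact. The following is a hedged sketch rather than an official procedure: normalize_sha512 and verify_sha512 are hypothetical helper names, and the sketch assumes the published checksum file uses a "filename: HEX HEX ..." layout with uppercase, space-separated groups. If the file you download has a different layout, compare the hashes by eye instead.

```shell
# Normalize a checksum read from standard input: remove all whitespace,
# drop an optional "filename:" prefix, and lowercase the hex digits.
# Assumes the "filename: HEX HEX ..." layout, not the "hash  filename" one.
normalize_sha512() {
    tr -d ' \t\r\n' | sed 's/^[^:]*://' | tr 'A-F' 'a-f'
}

# Compare the SHA-512 of ARCHIVE with the value stored in CHECKSUM_FILE.
# Returns 0 when they match and non-zero otherwise.
verify_sha512() {
    local archive="$1" checksum_file="$2" expected actual
    expected=$(normalize_sha512 < "$checksum_file")
    actual=$(sha512sum "$archive" | cut -d ' ' -f 1)
    [ "$expected" = "$actual" ]
}
```

Assuming a checksum file is published next to the tarball under the same URL with a .sha512 suffix, you could download it with wget and then run verify_sha512 kafka_2.13-3.4.0.tgz kafka_2.13-3.4.0.tgz.sha512 && echo "checksum OK".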

Step 4: Extract Kafka

After downloading the archive file, create a new directory for Kafka and extract the archive's contents with the following commands:

sudo mkdir /usr/local/kafka-server
sudo tar xzf kafka_2.13-3.4.0.tgz

Now, move the extracted files to the /usr/local/kafka-server directory by executing the following command:

sudo mv kafka_2.13-3.4.0/* /usr/local/kafka-server
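The service files created in the following steps point at specific scripts and configuration files under /usr/local/kafka-server, so it can be worth confirming that the move put them where expected. The following is a small sketch; check_kafka_layout is a hypothetical helper name, and the four paths it checks are the ones referenced by the service files in this tutorial.

```shell
# Check that the files referenced by the systemd units exist under DIR.
# Prints each missing path to standard error and returns non-zero if any is absent.
check_kafka_layout() {
    local dir="$1" missing=0 path
    for path in bin/kafka-server-start.sh bin/zookeeper-server-start.sh \
                config/server.properties config/zookeeper.properties; do
        if [ ! -f "$dir/$path" ]; then
            echo "missing: $dir/$path" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

For example, check_kafka_layout /usr/local/kafka-server && echo "layout OK" prints the confirmation only when all four files are present.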

Step 5: Configure Zookeeper

Zookeeper is a distributed coordination service that Kafka uses to keep track of cluster metadata, such as which brokers are alive and which broker acts as the controller.

Zookeeper and Kafka each need their own systemd file, which controls how the service is started, stopped and restarted. Create the Zookeeper file first, with the following command:

sudo nano /etc/systemd/system/zookeeper.service

Add the following lines to the Zookeeper systemd file:

[Unit]
Description=Apache Zookeeper Server
Requires=network.target remote-fs.target
After=network.target remote-fs.target

[Service]
Type=simple
ExecStart=/usr/local/kafka-server/bin/zookeeper-server-start.sh /usr/local/kafka-server/config/zookeeper.properties
ExecStop=/usr/local/kafka-server/bin/zookeeper-server-stop.sh
Restart=on-abnormal

[Install]
WantedBy=multi-user.target

Save and exit by pressing Ctrl + O, then Enter, and then Ctrl + X.
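If you would rather not use an interactive editor, for example when scripting the installation, the same file can be written from the shell. The following is a minimal sketch: zookeeper_unit is a hypothetical helper name, and the content it prints is the Zookeeper service definition from this step.

```shell
# Print the Zookeeper systemd unit from this step to standard output.
# The quoted 'EOF' delimiter stops the shell from expanding anything inside.
zookeeper_unit() {
    cat <<'EOF'
[Unit]
Description=Apache Zookeeper Server
Requires=network.target remote-fs.target
After=network.target remote-fs.target

[Service]
Type=simple
ExecStart=/usr/local/kafka-server/bin/zookeeper-server-start.sh /usr/local/kafka-server/config/zookeeper.properties
ExecStop=/usr/local/kafka-server/bin/zookeeper-server-stop.sh
Restart=on-abnormal

[Install]
WantedBy=multi-user.target
EOF
}

# Usage (writes the file as root and discards tee's echo of the content):
#   zookeeper_unit | sudo tee /etc/systemd/system/zookeeper.service > /dev/null
```

Piping through sudo tee is used here because a plain shell redirection would run with your own permissions and could not write to /etc/systemd/system.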

Step 6: Configure Kafka

Now create a systemd file for the Kafka service, using the following command:

sudo nano /etc/systemd/system/kafka.service

Add the following lines to the Kafka systemd file, making sure that the JAVA_HOME path matches the version of Java installed on your system:

[Unit]
Description=Apache Kafka Server
Documentation=http://kafka.apache.org/documentation.html
Requires=zookeeper.service
After=zookeeper.service

[Service]
Type=simple
Environment="JAVA_HOME=/usr/lib/jvm/java-11-openjdk-amd64"
ExecStart=/usr/local/kafka-server/bin/kafka-server-start.sh /usr/local/kafka-server/config/server.properties
ExecStop=/usr/local/kafka-server/bin/kafka-server-stop.sh
Restart=on-abnormal

[Install]
WantedBy=multi-user.target

Save and exit by pressing Ctrl + O, then Enter, and then Ctrl + X.

To apply the new changes, reload the systemd daemon with the following command:

sudo systemctl daemon-reload

This makes systemd re-read its unit files, so that it picks up the two new service definitions.

Step 7: Start Zookeeper and Kafka service

Kafka uses ZooKeeper to manage and coordinate its brokers.

You can start and enable the ZooKeeper service using the following command:

sudo systemctl enable --now zookeeper.service

Then, start and enable the Kafka service by running the following command:

sudo systemctl enable --now kafka.service

Verify that both the Zookeeper and Kafka services are running with the following command:

sudo systemctl status kafka zookeeper
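A service can be reported as active while it is still starting up, so an extra check that the network ports accept connections can be reassuring. The following is a hedged sketch: port_open and check_kafka_ports are hypothetical helper names, and the sketch assumes the default ports from the configuration files shipped with Kafka, 2181 for Zookeeper and 9092 for Kafka. Adjust the numbers if you have changed them.

```shell
# Return success if a TCP connection to HOST:PORT can be opened within 2 seconds.
# Relies on bash's /dev/tcp pseudo-device, so it needs bash rather than plain sh.
port_open() {
    local host="$1" port="$2"
    timeout 2 bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null
}

# Report on the ports this sketch assumes: 2181 (Zookeeper) and 9092 (Kafka).
# Returns non-zero if either port does not accept connections.
check_kafka_ports() {
    local status=0 port
    for port in 2181 9092; do
        if port_open localhost "$port"; then
            echo "port $port: open"
        else
            echo "port $port: closed"
            status=1
        fi
    done
    return "$status"
}
```

Running check_kafka_ports on the server prints one line per port. Once both are open, a command such as /usr/local/kafka-server/bin/kafka-topics.sh --bootstrap-server localhost:9092 --list should return without a connection error.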

Conclusion

You now have Kafka installed and running on your Ubuntu 20.04 system.
