In this tutorial, we will show you how to install and set up Apache Kafka on a VPS running Ubuntu 18.04.
Apache Kafka is a distributed messaging system built on the publish-subscribe (pub-sub) model. It lets us publish and subscribe to streams of records, which are organized into categories called topics. It is fast, highly scalable, and fault-tolerant, and it is designed to process large amounts of data in real time. Kafka can replace a traditional message broker, and it can also process and transform streams of records as they arrive, at a scale that conventional messaging systems are not designed for. Overall, Apache Kafka is a very powerful tool when used correctly.
Prerequisites
- A server running Ubuntu 18.04 with at least 4 GB of memory. For the purposes of this tutorial, we’ll be using one of our Managed Ubuntu 18.04 VPSes.
- SSH access with root privileges, or access to the “root” user itself
Step 1: Log in via SSH and Update the System
Log in to your Ubuntu 18.04 VPS with SSH as the root user:
ssh root@IP_Address -p Port_number
Replace “root” with a user that has sudo privileges if necessary. Additionally, replace “IP_Address” and “Port_number” with your server’s IP address and SSH port.
Once that is done, you can check whether you have the proper Ubuntu version installed on your server with the following command:
# lsb_release -a
You should get this output:
Distributor ID: Ubuntu
Description: Ubuntu 18.04.2 LTS
Release: 18.04
Codename: bionic
Then, run the following command to make sure that all installed packages on the server are updated to their latest available versions:
# apt update && apt upgrade
Step 2: Add a System User
Let’s create a new user called ‘kafka’, and then add this user to the sudo group.
# adduser kafka
# usermod -aG sudo kafka
Step 3: Install Java
Kafka is written in Scala and Java, so a JVM is required to run it. In this tutorial, we will use OpenJDK 11, as it has been the default version of Java on Ubuntu 18.04 since September 2018.
# apt install default-jre
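After the installation, it can be useful to confirm which Java version was actually installed. The following is a minimal sketch, not part of the original setup: the helper name java_major_version is hypothetical, and it assumes the usual version "..." format on the first line of java -version.

```shell
# Extract the major Java version from the first line of `java -version` output.
# Handles both the old "1.8.0_212" scheme and the newer "11.0.3" scheme.
java_major_version() (
    # $1 = first line of `java -version`, e.g. openjdk version "11.0.3" 2019-04-16
    version=$(printf '%s\n' "$1" | sed -n 's/.*version "\([^"]*\)".*/\1/p')
    case "$version" in
        1.*) printf '%s\n' "$version" | cut -d. -f2 ;;   # 1.8.0_212 -> 8
        *)   printf '%s\n' "$version" | cut -d. -f1 ;;   # 11.0.3 -> 11
    esac
)

# Usage (java prints its version to stderr, hence 2>&1):
# java_major_version "$(java -version 2>&1 | head -n 1)"
```

With the default-jre package described above, the usage line should print 11.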
Step 4: Download Apache Kafka
Now let’s download Kafka. The latest download link at the time of writing has already been entered in the example for you.
# su - kafka
$ wget https://www-us.apache.org/dist/kafka/2.2.0/kafka_2.12-2.2.0.tgz -O kafka.tgz
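Before extracting the archive, it may be worth checking that the download is intact. Apache publishes a SHA-512 checksum alongside each release archive. The sketch below is an optional extra rather than a step from this tutorial: the helper name verify_sha512 and the EXPECTED_DIGEST placeholder are assumptions, and you would need to copy the real digest from the Apache download page yourself.

```shell
# Compare a file's SHA-512 digest with an expected hex string.
# Returns 0 on a match and non-zero otherwise. The comparison ignores case.
verify_sha512() (
    # $1 = file, $2 = expected digest (hex)
    actual=$(sha512sum "$1" 2>/dev/null | cut -d' ' -f1)
    expected=$(printf '%s' "$2" | tr 'A-F' 'a-f' | tr -d ' \n')
    # An empty digest means sha512sum failed, which must not count as a match.
    [ -n "$actual" ] && [ "$actual" = "$expected" ]
)

# Usage (EXPECTED_DIGEST is a placeholder for the value published by Apache):
# verify_sha512 kafka.tgz "$EXPECTED_DIGEST" && echo "checksum OK"
```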
Now that the Apache Kafka archive has been downloaded, we need to extract it into the kafka user's home directory:
$ tar -xzvf kafka.tgz --strip-components=1
Step 5: Configure Apache Kafka
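It is time to configure Apache Kafka. Whether topics (the categories or feeds to which messages are published) can be deleted is controlled by the delete.topic.enable setting. Kafka releases before 1.0 disabled topic deletion by default, so we will set it explicitly in the configuration file to make sure deletion is allowed.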
$ nano ~/config/server.properties
Append the following line to the end of the configuration file.
delete.topic.enable = true
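If you prefer to script this change instead of editing the file by hand, the following sketch sets a key in a properties file without creating duplicate entries. The helper name set_property is hypothetical, the sketch relies on GNU sed (as shipped with Ubuntu), and it assumes the value contains no | or & characters.

```shell
# Set key=value in a Java-style properties file without creating duplicates.
set_property() (
    # $1 = file, $2 = key, $3 = value
    file=$1 key=$2 value=$3
    # Escape dots so the key is matched literally in the patterns below.
    pattern=$(printf '%s' "$key" | sed 's/\./\\./g')
    if grep -q "^[[:space:]]*${pattern}[[:space:]]*=" "$file"; then
        # Key already present (and not commented out): rewrite that line in place.
        sed -i "s|^[[:space:]]*${pattern}[[:space:]]*=.*|${key}=${value}|" "$file"
    else
        # Key absent: append it as a new line.
        printf '%s=%s\n' "$key" "$value" >> "$file"
    fi
)

# Usage:
# set_property ~/config/server.properties delete.topic.enable true
```

Running the usage line twice leaves a single delete.topic.enable entry in the file.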
Step 6: Create Systemd Unit Files for ZooKeeper and Apache Kafka
Kafka depends on ZooKeeper, so we need to start a ZooKeeper server before starting the Apache Kafka service. In this tutorial, we will use the convenience script packaged with Kafka to get a quick-and-dirty single-node ZooKeeper instance.
Create a new file at /etc/systemd/system/zookeeper.service and open it in your preferred text editor. We’ll be using nano for this tutorial.
$ sudo nano /etc/systemd/system/zookeeper.service
Paste the following lines into it:
[Unit]
Requires=network.target remote-fs.target
After=network.target remote-fs.target

[Service]
Type=simple
User=kafka
ExecStart=/home/kafka/bin/zookeeper-server-start.sh /home/kafka/config/zookeeper.properties
ExecStop=/home/kafka/bin/zookeeper-server-stop.sh
Restart=on-abnormal

[Install]
WantedBy=multi-user.target
Now, let’s create a systemd unit file for Kafka at /etc/systemd/system/kafka.service:
$ sudo nano /etc/systemd/system/kafka.service
Paste the following lines into the file:
[Unit]
Requires=zookeeper.service
After=zookeeper.service

[Service]
Type=simple
User=kafka
ExecStart=/bin/sh -c '/home/kafka/bin/kafka-server-start.sh /home/kafka/config/server.properties > /home/kafka/kafka.log 2>&1'
ExecStop=/home/kafka/bin/kafka-server-stop.sh
Restart=on-abnormal

[Install]
WantedBy=multi-user.target
Both unit files are now in place, so let’s enable Apache Kafka to start automatically on boot, and then start the service. Because kafka.service declares Requires=zookeeper.service, starting Kafka also starts ZooKeeper.
$ sudo systemctl enable kafka
$ sudo systemctl start kafka
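The broker can take a few seconds to come up, so it helps to wait until systemd reports the service as active before moving on. The following is a small sketch with a hypothetical helper named wait_until, which retries any command until it succeeds or a timeout expires. The 30 second timeout in the usage line is an arbitrary choice.

```shell
# Retry a command once per second until it succeeds or the timeout expires.
wait_until() (
    # $1 = timeout in seconds, remaining arguments = command to run
    remaining=$1
    shift
    while ! "$@" >/dev/null 2>&1; do
        remaining=$((remaining - 1))
        [ "$remaining" -le 0 ] && return 1
        sleep 1
    done
    return 0
)

# Usage:
# wait_until 30 systemctl is-active --quiet kafka && echo "kafka is running"
```

If the service does not become active in time, the log file at /home/kafka/kafka.log configured in the unit file above is the first place to look.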
Step 7: Create a Topic
In this step, we will create a topic named “FirstTopic”, with a single partition and only one replica:
$ bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic FirstTopic
Created topic "FirstTopic".
The replication-factor value sets how many copies of each partition's data are kept. It cannot exceed the number of brokers, and we are running a single broker, so the value is 1.
The partitions value sets how many partitions the topic's log is split into. Partitions can be spread across brokers and read in parallel; with a single broker and a test topic, 1 is enough.
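To make the relationship between these values concrete, here is a small sketch that checks a planned topic layout against the number of brokers. The helper name check_topic_settings is hypothetical and the check is purely local; it does not talk to Kafka. It encodes two rules: both values must be at least 1, and the replication factor cannot exceed the broker count, while the partition count may.

```shell
# Check a planned topic layout against the number of brokers in the cluster.
check_topic_settings() (
    # $1 = broker count, $2 = partitions, $3 = replication factor
    brokers=$1 partitions=$2 replication=$3
    if [ "$partitions" -lt 1 ] || [ "$replication" -lt 1 ]; then
        echo "partitions and replication factor must be at least 1"
        return 1
    fi
    if [ "$replication" -gt "$brokers" ]; then
        echo "replication factor $replication exceeds broker count $brokers"
        return 1
    fi
    echo "ok"
)

# Usage for the single-broker setup in this tutorial:
# check_topic_settings 1 1 1
```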
Now you can see the created topic by listing all topics:
$ bin/kafka-topics.sh --list --zookeeper localhost:2181
FirstTopic
Step 8: Send Messages using Apache Kafka
Apache Kafka comes with a command line client that will take input from a file or standard input and send it out as messages to the Kafka cluster. The “producer” is the process that has responsibility for putting data into our Kafka service. By default, Kafka sends each line as a separate message.
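Since the console producer reads standard input, a text file can be redirected into it so that each line becomes one message. The sketch below wraps that in a hypothetical helper named send_file. The KAFKA_HOME variable is an assumption that matches the /home/kafka install location used in this tutorial, and messages.txt is an example file name.

```shell
# KAFKA_HOME is an assumption matching the install location used in this tutorial.
KAFKA_HOME="${KAFKA_HOME:-/home/kafka}"

# Send each line of a text file to a topic as a separate message.
send_file() (
    # $1 = topic, $2 = file with one message per line
    "$KAFKA_HOME/bin/kafka-console-producer.sh" \
        --broker-list localhost:9092 --topic "$1" < "$2"
)

# Usage:
# printf 'first message\nsecond message\n' > messages.txt
# send_file FirstTopic messages.txt
```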
Let’s run the producer and then type a few messages into the console to send to the server.
$ bin/kafka-console-producer.sh --broker-list localhost:9092 --topic FirstTopic
>Welcome to kafka
>This is the content of our first topic
>
Keep the terminal open, and let’s proceed to the next step.
Step 9: Use Apache Kafka as a Consumer
Apache Kafka also ships with a command line consumer that reads messages from a topic and prints them to standard output.
Run the following command in a new SSH session.
$ bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic FirstTopic --from-beginning
Welcome to kafka
This is the content of our first topic
That’s it! Apache Kafka has been successfully installed and set up. Now we can type some messages on the producer terminal as stated in the previous step. The messages will be immediately visible on our consumer terminal.
Of course, you don’t have to know how to install Apache Kafka on Ubuntu 18.04 if you have an Ubuntu 18.04 VPS hosted with us. If you do, you can simply ask our support team to install Apache Kafka on Ubuntu 18.04 for you. They are available 24/7 and will be able to help you with the installation of Apache Kafka, as well as any additional requirements that you may have.