Monday, 12 May 2025

Docker vs Kubernetes, and Step-by-Step: Deploy Spring Boot on Kubernetes

1. Docker vs Kubernetes

🛠️ What Is Docker?

  • Docker is a tool to build, ship, and run applications inside containers.

  • Each container is an isolated environment (lighter-weight than a virtual machine, since containers share the host OS kernel).

  • Packages everything (code, libraries, OS dependencies) into one image.

bash
docker build -t myapp .
docker run -p 8080:8080 myapp

⚙️ What Is Kubernetes?

  • An open-source container orchestration platform.

  • Manages deployment, scaling, load balancing, and failover of containers.

  • Works with Docker (or other container runtimes).

yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-deployment
spec:
  replicas: 3
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
      - name: myapp
        image: myapp:latest
        ports:
        - containerPort: 8080
bash
kubectl apply -f deployment.yaml

Feature         | Docker                                                               | Kubernetes
Definition      | A platform for developing, shipping, and running containers.        | A system for orchestrating and managing multiple containers.
Purpose         | Package applications with dependencies into lightweight containers. | Manage containerized applications across a cluster (scaling, failover, etc.).
Role            | Creates and runs containers.                                        | Deploys, scales, and manages containers across multiple nodes.
Scope           | Single container or application.                                     | Cluster of containers across many machines.
Core Tools      | docker, Dockerfile, docker-compose                                   | kubectl, kubelet, YAML configs, Deployments, Services, etc.
Scaling         | Manual (via docker-compose scale or scripts).                       | Built-in auto-scaling and load balancing.
Networking      | Container-to-container on one host.                                  | Cluster-wide networking, service discovery, and DNS.
Fault Tolerance | Manual restart or recovery.                                          | Automatically restarts and reschedules failed containers.
Storage         | Volumes and bind mounts.                                             | Persistent Volumes (PV) and Persistent Volume Claims (PVC).
Deployment      | docker run, docker-compose up                                        | Declarative YAML (Deployment, Service, Pod, etc.).
Best Use Case   | Developing, testing, and running a single container or small system. | Running and managing large-scale, production-grade containerized systems.

🤝 Docker + Kubernetes

  • Docker creates containers.

  • Kubernetes runs and manages them in a cluster.

You typically use:

  • Docker to build your app container.

  • Kubernetes to deploy, scale, and manage the app in production.

🧠 Analogy

Concept            | Analogy
Docker             | Building and packaging a car
Kubernetes         | Managing a fleet of cars
Docker Compose     | Managing a few cars (manually)
Kubernetes Cluster | Automated traffic management

✅ Summary

             | Docker                      | Kubernetes
Use          | Build and run containers    | Deploy and manage containers
Ideal For    | Local development           | Production-grade deployments
Works Alone? | Yes                         | No – needs a container runtime (like Docker)
Replacement? | No – complements Kubernetes | No – works alongside Docker

Figure 1. Kubernetes cluster components.

A Kubernetes cluster consists of a control plane plus a set of worker machines, called nodes, that run containerized applications. Every cluster needs at least one worker node in order to run Pods.

The worker node(s) host the Pods that are the components of the application workload. The control plane manages the worker nodes and the Pods in the cluster. In production environments, the control plane usually runs across multiple computers and a cluster usually runs multiple nodes, providing fault-tolerance and high availability.

This document outlines the various components you need to have for a complete and working Kubernetes cluster.



2. Step-by-Step: Deploy Spring Boot on Kubernetes 

1. Build the Spring Boot App

a) Make sure your app works locally, then build it using Maven or Gradle:

./mvnw clean package

b) To skip tests while building:

./mvnw clean package -DskipTests

This creates a target/my-spring-boot-app-0.0.1-SNAPSHOT.jar file.

(Replace my-spring-boot-app with your actual artifact ID)


c) Run the JAR locally:

java -jar target/my-spring-boot-app-0.0.1-SNAPSHOT.jar

d) 📁 Example Directory After Build:
my-spring-boot-app/
├── src/
├── target/
│   ├── classes/
│   ├── my-spring-boot-app-0.0.1-SNAPSHOT.jar
│   └── ...
├── pom.xml
├── mvnw
├── mvnw.cmd
└── ...



2. Create a Dockerfile
In your project root, create a file named Dockerfile:

# Java 17 base image
FROM openjdk:17
# Location of the built Spring Boot JAR
ARG JAR_FILE=target/*.jar
# Copy the JAR into the image as app.jar
COPY ${JAR_FILE} app.jar
# Run the application
ENTRYPOINT ["java", "-jar", "/app.jar"]

3. Build and Push the Docker Image
docker build -t your-dockerhub-username/springboot-app:latest .

Push it to Docker Hub (or any container registry):

docker push your-dockerhub-username/springboot-app:latest

Note: Replace your-dockerhub-username accordingly.

4. Create Kubernetes Deployment YAML
Create a file named deployment.yaml:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: springboot-deployment
spec:
  replicas: 2
  selector:
    matchLabels:
      app: springboot-app
  template:
    metadata:
      labels:
        app: springboot-app
    spec:
      containers:
      - name: springboot-container
        image: your-dockerhub-username/springboot-app:latest
        ports:
        - containerPort: 8080

5. Create Kubernetes Service YAML
Create a file named service.yaml:

apiVersion: v1
kind: Service
metadata:
  name: springboot-service
spec:
  selector:
    app: springboot-app
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8080
  type: LoadBalancer  # Use NodePort if not using a cloud provider

6. Deploy to Kubernetes
Apply the configuration:

kubectl apply -f deployment.yaml
kubectl apply -f service.yaml

7. Verify
kubectl get pods
kubectl get deployments
kubectl get services

Look for the EXTERNAL-IP of your service (it may take a few minutes to appear when using LoadBalancer on a cloud provider).

8. Access the App
Open the EXTERNAL-IP in your browser. If using NodePort, access it with:

http://<Node-IP>:<NodePort>

Tips:
Use ConfigMaps or Secrets for environment variables.
For production, consider Helm, Ingress, TLS, and the Horizontal Pod Autoscaler.
For local Kubernetes, use Minikube, Docker Desktop, or kind.
--------------------------------------


Apache Kafka and Event-Driven Architecture

1. What is Kafka?
 
Apache Kafka is a distributed messaging system and event streaming platform designed for high-throughput, fault-tolerant, real-time data pipelines and event-driven applications. It works on a publish-subscribe model, where producers send messages to topics, and consumers receive them. Kafka is widely used for real-time data streaming, supporting scalable and durable communication between systems.

2. Key Components in Kafka (Spring Boot)

Component      | Role
Producer       | Sends messages to Kafka topics
Consumer       | Listens to Kafka topics
Topic          | Logical name for a message stream
KafkaTemplate  | Helper class to send messages
@KafkaListener | Annotation to consume messages
KafkaAdmin     | Creates topics programmatically

3. Dependencies (pom.xml)

<dependency>
    <groupId>org.springframework.kafka</groupId>
    <artifactId>spring-kafka</artifactId>
</dependency>

4. Configuration (application.yml)

spring:
  kafka:
    bootstrap-servers: localhost:9092

    consumer:
      group-id: my-group
      auto-offset-reset: earliest
      key-deserializer: org.apache.kafka.common.serialization.StringDeserializer
      value-deserializer: org.apache.kafka.common.serialization.StringDeserializer

    producer:
      key-serializer: org.apache.kafka.common.serialization.StringSerializer
      value-serializer: org.apache.kafka.common.serialization.StringSerializer
      transaction-id-prefix: tx-  # 🔑 Required for Kafka transactions

5. Kafka Topic Creation with KafkaAdmin

import org.apache.kafka.clients.admin.NewTopic;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.kafka.config.TopicBuilder;

@Configuration
public class KafkaTopicConfig {

    @Bean
    public NewTopic topic() {
        return TopicBuilder.name("my-topic")
                .partitions(3)
                .replicas(1)
                .build();
    }
}

6. Kafka Producer

import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.stereotype.Service;

@Service
public class KafkaProducerService {

    private final KafkaTemplate<String, String> kafkaTemplate;

    public KafkaProducerService(KafkaTemplate<String, String> kafkaTemplate) {
        this.kafkaTemplate = kafkaTemplate;
    }

    public void sendMessage(String topic, String message) {
        kafkaTemplate.send(topic, message);
    }
}

7. Kafka Consumer

import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.stereotype.Service;

@Service
public class KafkaConsumerService {

    @KafkaListener(topics = "my-topic", groupId = "my-group")
    public void consume(String message) {
        System.out.println("Consumed message: " + message);
    }
}

8. REST Controller to Test Producer

import org.springframework.web.bind.annotation.*;

@RestController
@RequestMapping("/api/kafka")
public class KafkaController {

    private final KafkaProducerService producerService;

    public KafkaController(KafkaProducerService producerService) {
        this.producerService = producerService;
    }

    @GetMapping("/send")
    public String sendMessage(@RequestParam String message) {
        producerService.sendMessage("my-topic", message);
        return "Message sent!";
    }
}

9. Kafka Annotations in Detail

Annotation                   | Purpose
@KafkaListener               | Listens to a Kafka topic; binds a method to that topic
@EnableKafka                 | Enables detection of @KafkaListener annotations
@KafkaHandler                | Used with a class-level @KafkaListener for multi-method listeners
@KafkaListeners              | Container annotation for repeating @KafkaListener on one method
@SendTo                      | Sends the listener's return value as a reply to another topic
containerFactory (attribute) | Points @KafkaListener at a custom ConcurrentKafkaListenerContainerFactory bean for advanced settings
@Header                      | Accesses Kafka record headers (optional metadata)

10. Kafka Listener with Reply

@KafkaListener(topics = "request-topic", groupId = "my-group")
@SendTo("response-topic")
public String handleRequest(String message) {
    return "Response to: " + message;
}

Tips
  • Use @EnableKafka in a config class if @KafkaListener doesn’t seem to work.

  • You can send JSON objects by customizing the serializers/deserializers.

  • Use a DLQ (Dead Letter Queue) for failed message handling (see the sketch after these tips).

  • For transactions, set spring.kafka.producer.transaction-id-prefix=tx-
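
A minimal sketch of the DLQ tip above, assuming Spring Kafka 2.8+ (where DefaultErrorHandler and DeadLetterPublishingRecoverer are available). Records that still fail after the retries are published to a dead-letter topic named <original-topic>.DLT by default:

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.kafka.config.ConcurrentKafkaListenerContainerFactory;
import org.springframework.kafka.core.ConsumerFactory;
import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.kafka.listener.DeadLetterPublishingRecoverer;
import org.springframework.kafka.listener.DefaultErrorHandler;
import org.springframework.util.backoff.FixedBackOff;

@Configuration
public class KafkaErrorHandlingConfig {

    // Retry a failed record twice (1 s apart), then publish it to <topic>.DLT.
    @Bean
    public DefaultErrorHandler errorHandler(KafkaTemplate<String, String> template) {
        DeadLetterPublishingRecoverer recoverer = new DeadLetterPublishingRecoverer(template);
        return new DefaultErrorHandler(recoverer, new FixedBackOff(1000L, 2));
    }

    @Bean
    public ConcurrentKafkaListenerContainerFactory<String, String> kafkaListenerContainerFactory(
            ConsumerFactory<String, String> consumerFactory, DefaultErrorHandler errorHandler) {
        ConcurrentKafkaListenerContainerFactory<String, String> factory =
                new ConcurrentKafkaListenerContainerFactory<>();
        factory.setConsumerFactory(consumerFactory);
        factory.setCommonErrorHandler(errorHandler); // wire the DLQ behavior into all listeners
        return factory;
    }
}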

-----------------------------

11. Enable Kafka Transactions in Configuration

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.kafka.core.*;
import org.springframework.transaction.annotation.EnableTransactionManagement;

@Configuration
@EnableTransactionManagement
public class KafkaConfig {

    @Bean
    public KafkaTemplate<String, String> kafkaTemplate(ProducerFactory<String, String> producerFactory) {
        KafkaTemplate<String, String> kafkaTemplate = new KafkaTemplate<>(producerFactory);
        kafkaTemplate.setTransactionalIdPrefix("tx-");
        return kafkaTemplate;
    }
}

12. Using Kafka Transactions in Code

import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;

@Service
public class TransactionalKafkaService {

    private final KafkaTemplate<String, String> kafkaTemplate;

    public TransactionalKafkaService(KafkaTemplate<String, String> kafkaTemplate) {
        this.kafkaTemplate = kafkaTemplate;
    }

    @Transactional
    public void sendMessages() {
        // Inside a @Transactional method, both sends join the same Kafka
        // transaction; if either fails, both are rolled back.
        kafkaTemplate.send("topic1", "message 1");
        kafkaTemplate.send("topic2", "message 2");
    }

    // Alternative: executeInTransaction starts its own local Kafka transaction,
    // independent of any surrounding @Transactional context.
    public void sendMessagesLocalTx() {
        kafkaTemplate.executeInTransaction(t -> {
            t.send("topic1", "message 1");
            t.send("topic2", "message 2");
            return true;
        });
    }
}

Notes:
a) If any send in the transaction fails, the whole transaction is rolled back.
b) The transaction-id-prefix must be unique per producer instance in a cluster.
c) Transactions require an idempotent producer (configured automatically by Spring Boot when transaction-id-prefix is set).
-----------------------------------------------------------

2. Event-Driven Architecture with Kafka in Java (Spring Boot)

🔍 What is Event-Driven Architecture (EDA)?

Event-Driven Architecture (EDA) is a design paradigm where services communicate by producing and consuming events, rather than making direct API calls. It's highly asynchronous, decoupled, and scalable.

Instead of saying "do this", services say "this happened" — and any interested component can respond.


🧩 Core Concepts in Kafka-Based EDA

Concept   | Description
Event     | A message representing a state change (e.g., "UserRegistered")
Producer  | A service that sends (publishes) events to a Kafka topic
Consumer  | A service that listens (subscribes) to events from a Kafka topic
Topic     | A category/feed name to which records are sent
Broker    | A Kafka server that stores and manages messages
Partition | A way to split topic data for scaling
Offset    | Position marker a consumer uses to track how far it has read in a partition

🎯 Benefits of EDA with Kafka

Benefit        | Description
Loose coupling | Services don't depend on each other directly
Scalability    | Kafka handles millions of messages per second
Resilience     | Failures in one service don't crash the others
Asynchronous   | Non-blocking operations
Auditability   | Events are durable and replayable

🔧 Example: User-Service ➡ Kafka ➡ Notification-Service

Scenario:

  • user-service emits a UserRegistered event

  • notification-service consumes this and sends a welcome email


✅ Step-by-Step with Spring Boot + Kafka

1. Add Dependencies

In pom.xml:

<dependency>
    <groupId>org.springframework.kafka</groupId>
    <artifactId>spring-kafka</artifactId>
</dependency>

2. Kafka Configuration (application.yml)

spring:
  kafka:
    bootstrap-servers: localhost:9092
    consumer:
      group-id: notification-group
      auto-offset-reset: earliest
      key-deserializer: org.apache.kafka.common.serialization.StringDeserializer
      value-deserializer: org.apache.kafka.common.serialization.StringDeserializer
    producer:
      key-serializer: org.apache.kafka.common.serialization.StringSerializer
      value-serializer: org.apache.kafka.common.serialization.StringSerializer

3. Producer Code (UserService)

@Service
public class KafkaProducer {

    @Autowired
    private KafkaTemplate<String, String> kafkaTemplate;

    public void sendUserRegisteredEvent(String message) {
        kafkaTemplate.send("user-registered-topic", message);
    }
}

Usage:

java

producer.sendUserRegisteredEvent("{\"email\":\"test@example.com\"}");

4. Consumer Code (NotificationService)

@Service
public class KafkaConsumer {

    @KafkaListener(topics = "user-registered-topic", groupId = "notification-group")
    public void consume(String message) {
        System.out.println("Received UserRegistered event: " + message);
        // send email or SMS here
    }
}

5. Kafka Topic Configuration (Optional)

@Configuration
public class KafkaTopicConfig {

    @Bean
    public NewTopic topic() {
        return TopicBuilder.name("user-registered-topic")
                .partitions(3)
                .replicas(1)
                .build();
    }
}

🐳 Docker Compose for Kafka

yaml

version: '3'
services:
  zookeeper:
    image: confluentinc/cp-zookeeper
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
  kafka:
    image: confluentinc/cp-kafka
    ports:
      - "9092:9092"
    environment:
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://localhost:9092
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1

✅ Summary

Component            | Role
user-service         | Publishes the UserRegistered event to Kafka
Kafka Broker         | Queues and distributes the event
notification-service | Consumes the event and acts (e.g., sends email)

------------------------------

3) Methods in Apache Kafka

Apache Kafka provides several core APIs with different sets of methods for producing, consuming, administering, and streaming data. Here's a detailed breakdown of important methods across the main Kafka classes:


🔑 1. KafkaProducer API (Producer Methods)

Used to send records to Kafka topics.

Common Methods:

Method                              | Description
send(ProducerRecord<K,V>)           | Sends a record to a topic asynchronously.
send(ProducerRecord<K,V>, Callback) | Sends a record with a callback for acknowledgment.
flush()                             | Forces all buffered records to be sent.
close()                             | Closes the producer gracefully.

Example:
ProducerRecord<String, String> record = new ProducerRecord<>("topic", "key", "value");
producer.send(record);

📥 2. KafkaConsumer API (Consumer Methods)

Used to poll and read messages from Kafka topics.

Common Methods:

Method                                      | Description
subscribe(Collection<String>)               | Subscribes to a list of topics.
poll(Duration)                              | Polls the broker for new records.
commitSync()                                | Commits the current offsets synchronously.
commitAsync()                               | Commits the current offsets asynchronously.
seek(TopicPartition, long)                  | Seeks to a specific offset in a partition.
close()                                     | Closes the consumer.
assign(Collection<TopicPartition>)          | Manually assigns specific partitions.
position(TopicPartition)                    | Returns the current offset position.
seekToBeginning(Collection<TopicPartition>) | Seeks to the beginning of the partitions.
seekToEnd(Collection<TopicPartition>)       | Seeks to the end of the partitions.

Example:
consumer.subscribe(Arrays.asList("my-topic"));
ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(100));
for (ConsumerRecord<String, String> record : records) {
    System.out.println(record.value());
}

🛠️ 3. AdminClient API (Admin Operations)

Used to manage Kafka topics, brokers, etc.

Common Methods:

Method                             | Description
createTopics(Collection<NewTopic>) | Creates new topics.
listTopics()                       | Lists all topics.
describeTopics(Collection<String>) | Describes one or more topics.
deleteTopics(Collection<String>)   | Deletes topics.
describeCluster()                  | Describes the Kafka cluster.

Example:
java
AdminClient admin = AdminClient.create(props);
admin.createTopics(Arrays.asList(new NewTopic("new-topic", 1, (short) 1)));

🔄 4. Streams API (KafkaStreams)

Used for real-time stream processing.

Common Methods:

Method    | Description
start()   | Starts stream processing.
close()   | Stops stream processing.
cleanUp() | Cleans up local state stores.
state()   | Returns the current state of the KafkaStreams instance.
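
Example (a minimal sketch, assuming the kafka-streams dependency is on the classpath; the application id and topic names are placeholders):

import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;

public class StreamsExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "uppercase-app");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        // Read from input-topic, uppercase each value, write to output-topic.
        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> input = builder.stream("input-topic");
        input.mapValues(v -> v.toUpperCase()).to("output-topic");

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start(); // begins processing
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}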

📦 5. Other Utility/Support Methods

ProducerRecord Methods:

Method      | Description
topic()     | Gets the topic name.
key()       | Gets the message key.
value()     | Gets the message value.
partition() | Gets the assigned partition (can be null).

ConsumerRecord Methods:

Method      | Description
topic()     | Gets the topic name.
key()       | Gets the key.
value()     | Gets the value.
offset()    | Gets the offset.
partition() | Gets the partition number.
timestamp() | Gets the timestamp of the record.

✅ Summary by Component

Kafka Component | Key Classes                   | Important Methods
Producer        | KafkaProducer, ProducerRecord | send(), flush(), close()
Consumer        | KafkaConsumer, ConsumerRecord | poll(), subscribe(), commitSync()
Admin           | AdminClient, NewTopic         | createTopics(), listTopics()
Streams         | KafkaStreams                  | start(), close(), state()

4) poll() in Kafka

In Apache Kafka, the poll() method is a key part of the Kafka Consumer API. It is used to fetch data from Kafka topics.

🔍 What is poll() in Kafka?

In Kafka, the consumer retrieves records from the broker using the poll() method. This is how the consumer gets new messages from the partitions it subscribes to.


📘 Syntax


ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(100));
  • consumer: An instance of KafkaConsumer.

  • poll(Duration): Tries to fetch data from Kafka for a given timeout (here, 100 milliseconds).

  • Returns a ConsumerRecords object, which is an iterable collection of ConsumerRecord.


🔁 Internals of poll()

  • Heartbeat: It also acts as a heartbeat to the Kafka broker to indicate the consumer is alive.

  • Rebalance Participation: It ensures the consumer participates in group rebalancing.

  • Offset Management: Kafka tracks the offset of consumed messages, and poll() plays a key role in this.

  • Fetching messages: Pulls batches of messages from the topic partitions assigned to the consumer.


🛑 Important Notes

  1. Mandatory loop:
    You must call poll() regularly in a loop; if the gap between polls exceeds max.poll.interval.ms, Kafka considers the consumer dead and rebalances its partitions.

    java
    while (true) {
        ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(100));
        for (ConsumerRecord<String, String> record : records) {
            System.out.printf("offset = %d, key = %s, value = %s%n",
                    record.offset(), record.key(), record.value());
        }
    }
  2. Timeout value:

    • Shorter timeouts mean less waiting time but more CPU usage.

    • Longer timeouts might cause delays in rebalancing or consumer group heartbeats.

  3. Thread-safety:

    • poll() is not thread-safe.

    • Call it from one thread only per consumer instance.

  4. Auto Offset Commit:
    If enable.auto.commit=true, offsets are committed automatically during subsequent polls; with auto commit disabled, commit offsets manually (see the sketch below).
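
For comparison, a minimal sketch of a manual-commit poll loop with enable.auto.commit=false (the broker address and topic name are placeholders):

import java.time.Duration;
import java.util.Arrays;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class ManualCommitExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "my-group");
        props.put("enable.auto.commit", "false"); // commit manually instead
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Arrays.asList("my-topic"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(100));
                for (ConsumerRecord<String, String> record : records) {
                    System.out.printf("offset = %d, value = %s%n", record.offset(), record.value());
                }
                consumer.commitSync(); // commit only after the batch is processed
            }
        }
    }
}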


🧠 Summary

Aspect             | Description
Purpose            | Fetch records from Kafka
Return Type        | ConsumerRecords<K, V>
Needs Loop         | Yes
Heartbeat to Kafka | Yes
Timeout Param      | How long to wait for new data
Auto Offset Commit | Happens after poll() if enabled

5) Kafka Consumer Group

A Consumer Group is a collection of one or more consumers that work together to consume messages from one or more Kafka topics.

🔍 Key Points:

  • Each consumer in a group reads from a unique partition of a topic (no duplication).

  • Kafka distributes partitions among consumers in the same group for parallel processing.

  • If the number of consumers > partitions, some consumers will be idle.

  • If the number of consumers < partitions, some consumers will read multiple partitions.

📌 Use Case:

  • For load balancing: spread the load of processing across multiple consumers.

  • For scalability: increase consumers to scale processing horizontally.

🔄 How it Works

  • If Topic T1 has 4 partitions: P0, P1, P2, P3

  • Consumer Group G1 has 2 consumers: C1, C2

Then:

  • Kafka will assign partitions like:

    • C1 → P0, P1

    • C2 → P2, P3


🎯 Benefits of Consumer Groups

Feature            | Description
Scalability        | Consumers read partitions in parallel
Fault Tolerance    | If a consumer crashes, another takes over its partitions
Offset Tracking    | Kafka stores the offset per partition, per group
Exclusive Delivery | Each message is delivered to only one consumer within the group
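
A minimal Spring Kafka sketch of scaling within one consumer group, assuming a topic named "orders" with at least three partitions (both names are placeholders). The concurrency attribute starts three listener threads that split the partitions among themselves:

import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.stereotype.Service;

@Service
public class OrderEventListener {

    // Three listener threads in the same group; Kafka assigns each a share of the partitions.
    @KafkaListener(topics = "orders", groupId = "order-processors", concurrency = "3")
    public void process(String message) {
        System.out.println("Processing: " + message);
    }
}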

6) Kafka Cluster

What: A Kafka Cluster is a distributed system consisting of multiple Kafka brokers. It is used for real-time stream processing, event sourcing, and asynchronous communication between microservices.

Key Components:

  • Broker: A Kafka server that stores and serves data

  • Zookeeper (legacy): Coordinates brokers (Kafka ≥ 2.8 can work without it)

  • Producer: Publishes messages

  • Consumer: Subscribes and reads messages

  • Topic: Logical channel to group messages

  • Partition: Splits a topic for parallelism and scalability

Cluster Characteristics:

  • High availability: Replication of partitions

  • Scalability: Add brokers to handle more load

  • Fault tolerance: If one broker fails, others take over

Example Cluster Setup:

  • 3 Kafka brokers (running on different machines)

  • Topic "orders" with 3 partitions and replication factor 2

  • ZooKeeper (optional when running in KRaft mode, available from Kafka 2.8+)
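
A minimal sketch of declaring the example "orders" topic from Spring Boot (assuming spring-kafka is on the classpath; the auto-configured KafkaAdmin creates the topic on startup):

import org.apache.kafka.clients.admin.NewTopic;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.kafka.config.TopicBuilder;

@Configuration
public class OrdersTopicConfig {

    // Matches the example cluster above: 3 partitions, replication factor 2
    // (replication factor 2 requires at least 2 brokers).
    @Bean
    public NewTopic ordersTopic() {
        return TopicBuilder.name("orders")
                .partitions(3)
                .replicas(2)
                .build();
    }
}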

Best Practices:

  • Monitor lag using Kafka UI or Burrow

  • Set retention policies to manage disk space

  • Use schema registry with Avro/Protobuf


7) group.id in Kafka

group.id is a mandatory configuration property used to assign a Kafka consumer to a specific consumer group.

🔍 Key Points:

  • Kafka uses group.id to manage offsets and partition assignment.

  • Consumers with the same group.id form one consumer group and share the topic partitions.

  • Kafka tracks offsets per group, so each group consumes independently (see the two-group sketch below).
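
A minimal sketch illustrating independent consumption, assuming a "payments" topic (a placeholder). Because the two listeners use different group.id values, each receives every message on the topic:

import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.stereotype.Service;

@Service
public class MultiGroupListeners {

    // Both listeners get every message because they belong to different groups.
    @KafkaListener(topics = "payments", groupId = "audit-group")
    public void audit(String message) {
        System.out.println("Audit copy: " + message);
    }

    @KafkaListener(topics = "payments", groupId = "billing-group")
    public void bill(String message) {
        System.out.println("Billing copy: " + message);
    }
}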

8) Saga Pattern

What: Saga is a microservice design pattern to maintain data consistency in distributed transactions without two-phase commit (2PC). Each service performs a local transaction and publishes an event or makes a compensating transaction if something fails.

Types of Sagas:

  1. Choreography (Event-based):

    • Services listen to events and act accordingly

    • No centralized coordinator

    • Pros: Loose coupling

    • Cons: Hard to track, complex logic in services

  2. Orchestration (Command-based):

    • A central orchestrator tells each service what to do

    • Services report back success/failure

    • Pros: Centralized logic

    • Cons: Single point of coordination

Example (Order Processing):

  • Step 1: Order Service → creates order

  • Step 2: Payment Service → reserves funds

  • Step 3: Inventory Service → reduces stock

  • If step 3 fails → Compensation → Refund payment → Cancel order

Tools/Libraries:

  • Axon Framework

  • Eventuate Tram / Eventuate Saga

  • Camunda / Temporal.io (Workflow engines)

Use Case: Distributed e-commerce or travel booking systems where all-or-nothing is not possible.
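
A minimal choreography-style sketch with Kafka and Spring Boot (topic and class names are placeholders, not a complete saga implementation). The payment service reacts to an order event and publishes either the next event in the saga or a compensating one:

import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.stereotype.Service;

@Service
public class PaymentSagaHandler {

    private final KafkaTemplate<String, String> kafkaTemplate;

    public PaymentSagaHandler(KafkaTemplate<String, String> kafkaTemplate) {
        this.kafkaTemplate = kafkaTemplate;
    }

    // Choreography: react to the order event, then publish success or a compensating event.
    @KafkaListener(topics = "order-created", groupId = "payment-service")
    public void onOrderCreated(String orderJson) {
        try {
            reserveFunds(orderJson);                            // local transaction
            kafkaTemplate.send("payment-reserved", orderJson);  // next saga step listens to this
        } catch (Exception e) {
            kafkaTemplate.send("order-cancelled", orderJson);   // compensating event
        }
    }

    private void reserveFunds(String orderJson) {
        // payment logic here
    }
}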

9) Apache Kafka - Explanation of Each Component

1. Apache Kafka Architecture

Kafka's architecture, introduced in Section 1 above, follows a publish-subscribe model: producers send messages to topics, brokers store them, and consumers read them. The components below make that work.

Kafka ensures:

  • High throughput

  • Fault-tolerance

  • Horizontal scalability


2. Producers

  • Producers are clients or services that send data to Kafka topics.

  • Example: A Spring Boot service that sends an event like UserRegistered to a Kafka topic.

  • They serialize data and push it to Kafka asynchronously.

📌 Spring Boot Integration: You use KafkaTemplate<String, Object> to send messages.


3. Consumers

  • Consumers are services or apps that read data from Kafka topics.

  • They subscribe to a topic and process messages as they arrive.

  • Can be grouped into consumer groups for parallel processing and load balancing.

📌 Spring Boot Integration: You use @KafkaListener to consume messages from topics.


4. Topics

  • A topic is a logical channel or stream to which messages are published.

  • Producers write to a topic; consumers subscribe to and read from it.

  • Topics are append-only logs — messages are not deleted immediately after consumption.

Example:

  • Topic: order-events

  • Producers send new orders; consumers process them.


5. Partitions

  • Each Kafka topic is split into partitions, enabling:

    • Parallel processing (across threads or instances)

    • Scalability (across machines/brokers)

    • Ordering guarantees within a partition (see the keyed-send sketch below)

More partitions = better throughput, but requires careful planning.
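
A minimal sketch of exploiting per-partition ordering, assuming an "order-events" topic (a placeholder). Sending with the order ID as the key routes all events for one order to the same partition, so they are consumed in the order they were produced:

import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.stereotype.Service;

@Service
public class OrderEventProducer {

    private final KafkaTemplate<String, String> kafkaTemplate;

    public OrderEventProducer(KafkaTemplate<String, String> kafkaTemplate) {
        this.kafkaTemplate = kafkaTemplate;
    }

    // Records with the same key hash to the same partition, preserving per-order ordering.
    public void publish(String orderId, String event) {
        kafkaTemplate.send("order-events", orderId, event);
    }
}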


6. Brokers

  • A Kafka broker is a Kafka server.

  • It stores data, handles read/write requests, and maintains partitions.

  • Brokers coordinate with ZooKeeper (or KRaft in newer versions) for leader election and metadata.

📌 Example: In a Kafka cluster of 3 brokers, partitions are distributed across brokers.


7. Clusters

  • A Kafka cluster consists of multiple Kafka brokers working together.

  • Clusters offer:

    • High availability

    • Data replication

    • Load balancing

📌 Example:
If broker 1 fails, broker 2 can take over as leader for some partitions.


8. Integration with Spring Boot

  • Spring provides Spring Kafka, a high-level abstraction to simplify Kafka integration.

🔹 Producer Side:

java
@Autowired
private KafkaTemplate<String, String> kafkaTemplate;

kafkaTemplate.send("my-topic", "message");

🔹 Consumer Side:

java
@KafkaListener(topics = "my-topic")
public void consume(String message) {
    System.out.println("Received: " + message);
}

✅ Summary Table:

Component               | Purpose / Description
Producer                | Sends messages to Kafka topics.
Consumer                | Reads messages from Kafka topics, individually or as part of a consumer group.
Consumer Group          | A group of consumers that cooperatively consume messages from topics in parallel.
Topic                   | Logical channel or category where messages are published and consumed.
Partition               | Subdivision of a topic that enables scalability and parallel processing.
Broker                  | Kafka server that stores message data and handles client communication.
Kafka Cluster           | A group of brokers that together form the distributed messaging system.
Spring Boot Integration | Lets applications act as producers or consumers via Spring Kafka annotations and templates.
Spring Boot App         | The application that integrates with Kafka via Spring Boot; can send or receive messages.

Kafka Architecture

The Spring Boot Kafka consumer reads messages from Kafka and stores them in an MS SQL database via the service and repository layers (a code sketch follows the diagram).

Kafka Consumer (Spring Boot)
        |
        v
[ Service Layer / Repository Layer ]
        |
        v
     MS SQL DB  

Spring Boot → Repository Layer → MS SQL DB
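
A minimal sketch of this flow, assuming Spring Boot 3 with spring-boot-starter-data-jpa and the MS SQL JDBC driver configured; the entity, repository, and topic name are placeholders:

import jakarta.persistence.Entity;
import jakarta.persistence.GeneratedValue;
import jakarta.persistence.Id;
import org.springframework.data.jpa.repository.JpaRepository;
import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.stereotype.Service;

@Entity
class OrderEvent {
    @Id
    @GeneratedValue
    Long id;
    String payload;

    protected OrderEvent() {} // required by JPA
    OrderEvent(String payload) { this.payload = payload; }
}

interface OrderEventRepository extends JpaRepository<OrderEvent, Long> {}

@Service
public class OrderEventConsumer {

    private final OrderEventRepository repository;

    public OrderEventConsumer(OrderEventRepository repository) {
        this.repository = repository;
    }

    // Reads the message from Kafka and stores it in MS SQL through the repository layer.
    @KafkaListener(topics = "order-events", groupId = "persistence-group")
    public void consume(String message) {
        repository.save(new OrderEvent(message));
    }
}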