Rate Limit Usage

High-Level

Rate limiting is used to control the rate at which requests are processed by a service. It ensures fair resource allocation, prevents system abuse, and provides stability under high load. The mechanism revolves around defining a limit on the number of requests allowed within a time window.

Core Concepts

Key: A unique identifier for the entity being limited (e.g., IP, client ID, service name).
Limit: The maximum number of requests allowed per window.
Window: A time duration (e.g., 1 minute or 5 seconds) during which requests are counted.

Use Cases

API Rate Limiting: Prevent excessive API requests from a single client.
Kafka Listener Throttling: Pause Kafka consumers when limits are exceeded.
Resource Protection: Protect services from being overwhelmed.

Configuration

To define rate-limiting behavior, the RateLimitRequest is used, specifying the key, limit per window, and window duration. Additionally, rate limiting can be seamlessly integrated into Kafka consumers using KafkaRateLimitedListener.

Example Configuration

1. Basic Rate Limit Request

import cz.datalite.tsm.commons.rate.RateLimitRequest
import kotlin.time.Duration.Companion.minutes

val rateLimitRequest = RateLimitRequest(
    key = "exampleResource", // Identifier for the resource
    limitPerWindow = 100,    // Allow up to 100 requests
    window = 1.minutes       // In a 1-minute time window
)

2. Kafka Rate-Limiting Configuration

tsm:
  kafka:
    rate-limit:
      key: "dynamicKeyFromConfig"    # Unique key for the Kafka consumer (can differ per service/environment)
      limit: 100                     # Allow up to 100 messages
      window: 2m                     # 2-minute time window

import cz.datalite.tsm.commons.rate.KafkaRateLimitedListener
import org.springframework.kafka.annotation.KafkaListener

@KafkaListener(topics = ["rate-limited-topic"])
@KafkaRateLimitedListener(
   key = "\${tsm.kafka.rate-limit.key:defaultKey}",
   limitPerWindow = "\${tsm.kafka.rate-limit.limit:50}",
   window = "\${tsm.kafka.rate-limit.window:1m}"
)

fun consume(message: String) {
    println("Processing: $message")
}

Technical Details

Architecture

The rate-limiting mechanism uses the following components:

RateLimitRequest:
Represents the rules for rate limiting, including the unique client key, the limit per sliding time window, and the time window duration.
RateLimitService:
Responsible for enforcing rate limits. It provides methods to evaluate whether a specific request is allowed and when the client should retry after reaching the limit. Internally, it uses a backend store (Redis) to track requests.
Kafka Integration:
The KafkaRateLimitedListener annotation and the KafkaConsumerRateLimiter class allow for rate limiting Kafka listeners. When limits are exceeded, consumers are paused until the retry interval elapses.

Key Features

Allows limiting requests on per-client or per-resource basis.
Supports both synchronous limits (e.g., API requests) and asynchronous consumption (Kafka consumers).
Uses Redis for efficient tracking of request counts and time windows.

Code Examples

1. Using RateLimitService to enforce limits

import cz.datalite.tsm.commons.rate.RateLimitService
import cz.datalite.tsm.commons.rate.RateLimitRequest
import org.springframework.beans.factory.annotation.Autowired
import kotlin.time.Duration.Companion.seconds

@Autowired
lateinit var rateLimitService: RateLimitService

fun processRequest() {
    val request = RateLimitRequest("exampleClient", 5, 30.seconds)

    val result = rateLimitService.rate(request)
    if (result.allowedNext) {
        println("Request allowed. Remaining: ${result.remainingRequests}")
    } else {
        println("Rate limit exceeded. Retry after: ${result.retryAfter}")
    }
}

2. Kafka Listener with Rate-Limiting Runtime Behavior

import cz.datalite.tsm.commons.rate.KafkaConsumerRateLimiter
import cz.datalite.tsm.commons.rate.RateLimitRequest
import cz.datalite.tsm.commons.rate.RateLimitService
import org.apache.kafka.clients.consumer.ConsumerRecord
import org.springframework.beans.factory.annotation.Autowired
import org.springframework.kafka.annotation.KafkaListener
import kotlin.time.Duration.Companion.minutes

@Autowired
lateinit var rateLimitService: RateLimitService

@Autowired
lateinit var kafkaConsumerRateLimiter: KafkaConsumerRateLimiter

@KafkaListener(id = "kafkaConsumer", topics = ["rate-limited-topic"])
fun consumeMessage(record: ConsumerRecord<String, String>) {
    val request = RateLimitRequest("rateLimitedInterface", 10, 1.minutes)
    val result = rateLimitService.rate(request)

    kafkaConsumerRateLimiter.rateLimit("kafkaConsumer", result) // "kafkaConsumer" is container id

    if (!result.allowedNext) {
        throw IllegalStateException("Rate limit exceeded. Retry after ${result.retryAfter}")
    }

    println("Processed: ${record.value()}")
}

Conclusion

This architecture provides a robust way to manage rate limiting for both standard API usage and Kafka consumers. It ensures system reliability by enforcing limits and distributing resource usage fairly. Depending on the scenario, the provided components (RateLimitService, KafkaRateLimitedListener, etc.) can be used independently or in combination to ensure smooth throttling and exception handling behavior.