Skip to content

High-performance multithreaded HTTP web server in C++17 featuring epoll-based I/O, thread pooling, LRU caching, rate limiting, and comprehensive logging. Achieves 78K+ req/sec with Docker support and complete test suite.

License

Notifications You must be signed in to change notification settings

swe-robertkibet/multithreaded-webserver-cpp

Repository files navigation

High-Performance Multithreaded Web Server in C++

Build Status C++17 License Performance

A production-ready, high-performance HTTP web server implemented in modern C++17, featuring epoll-based event handling, thread pooling, intelligent caching, and comprehensive security features.

Performance Highlights

  • 78,091.55 requests/sec peak performance with 1000 concurrent connections
  • 71,508.89 requests/sec sustained throughput with 5000 concurrent connections
  • 47,831.58 requests/sec average over 5-minute extended duration tests
  • 0% memory growth under sustained load (excellent memory stability)
  • Memory efficient with exceptional stability (1.7MB stable memory usage)
  • Production ready with Docker containerization

Architecture

The multithreaded web server employs a sophisticated, layered architecture designed for high performance, scalability, and maintainability. The following diagrams illustrate the complete system design from different perspectives.

System Overview

This comprehensive diagram shows the complete request processing pipeline from client connection through response delivery, including all major components and their interactions.

Main Architecture

Complete system architecture showing the epoll-based event loop, thread pool management, request processing pipeline, embedded caching system, and configuration loading mechanism.

Component Layer Architecture

The system is organized into distinct layers, each with specific responsibilities and clear interfaces between components.

Architecture Components

Layer-based architecture view demonstrating the separation of concerns across Client Layer, Network Layer, Threading Layer, Application Layer, and Data Layer with actual component interactions.

Request Processing Flow

This detailed flowchart illustrates the complete request lifecycle with decision points, performance timings, and error handling paths.

Architecture Flow

Actual request processing flow showing performance characteristics: ~0.1ms cache hits, ~0.5ms parsing, ~5-20ms file system operations, with real error handling and connection management implementation.

Core Components

1. Event-Driven I/O (epoll)

  • Location: src/epoll_wrapper.cpp, include/epoll_wrapper.h
  • Linux epoll for scalable I/O multiplexing
  • Non-blocking socket operations
  • Edge-triggered event notification
  • Handles thousands of concurrent connections efficiently

2. Thread Pool Management

  • Location: include/thread_pool.h
  • Custom thread pool implementation with task queue
  • Configurable worker thread count (auto-detects hardware threads)
  • Task queue with condition variable synchronization
  • Future-based task completion tracking

3. HTTP Protocol Handler

  • Location: src/http_request.cpp, src/http_response.cpp
  • Full HTTP/1.1 implementation
  • Keep-alive connection support
  • Comprehensive header parsing and validation

4. Intelligent Caching System

  • Location: src/cache.cpp, include/cache.h
  • LRU (Least Recently Used) eviction policy
  • TTL-based expiration (configurable)
  • Memory-efficient storage
  • Thread-safe with fine-grained locking

5. Rate Limiting

  • Location: src/rate_limiter.cpp, include/rate_limiter.h
  • Token bucket algorithm per client IP
  • Configurable requests per second and burst capacity
  • DDoS protection with automatic cleanup
  • Statistics tracking for monitoring

6. File Handler

  • Location: src/file_handler.cpp, include/file_handler.h
  • Static file serving with MIME type detection
  • Directory listing support
  • Efficient file streaming for large files
  • Security features (path traversal protection)

7. Structured Logging

  • Location: src/logger.cpp, include/logger.h
  • Separate access and error logs
  • Configurable log levels (DEBUG, INFO, WARN, ERROR)
  • Thread-safe logging with buffering
  • Apache Common Log Format compatible

Data Flow

  1. Connection Acceptance: Main thread accepts connections via epoll
  2. Event Processing: Events distributed to thread pool workers
  3. Request Parsing: HTTP request parsed and validated
  4. Rate Limiting: Check client IP against rate limits
  5. Cache Lookup: Check if requested resource is cached
  6. File Processing: Serve static files or generate dynamic content
  7. Response Generation: Build HTTP response with appropriate headers
  8. Connection Management: Handle keep-alive or close connection
  9. Logging: Record access and error information

Memory Management

  • RAII Principles: All resources managed with smart pointers
  • Cache Management: Automatic memory cleanup with LRU eviction

Features

Core Features

  • High Performance: 78K+ requests/sec with low latency
  • Multithreaded: Configurable thread pool with auto-scaling
  • Event-Driven: Linux epoll for efficient I/O multiplexing
  • HTTP/1.1: Full protocol support with keep-alive
  • Static Files: Efficient static content serving
  • Caching: Intelligent LRU cache with TTL
  • Rate Limiting: Token bucket per-IP rate limiting
  • Logging: Comprehensive access and error logging
  • Configuration: JSON-based runtime configuration

Security Features

  • Path Traversal Protection: Prevents directory escape attacks
  • Rate Limiting: DDoS protection with configurable limits
  • Non-Root Execution: Docker containers run as non-root user
  • Input Validation: Comprehensive HTTP request validation
  • Resource Limits: Configurable memory and connection limits

Operational Features

  • Performance Monitoring: Built-in metrics and statistics
  • Docker Support: Multi-stage builds for production
  • Comprehensive Testing: 600+ lines of unit tests
  • Benchmarking: Performance comparison tools included
  • Configuration: Runtime parameter adjustment

Quick Start

Prerequisites

  • C++17 compatible compiler (GCC 7+ or Clang 5+)
  • CMake 3.16+
  • Linux (Ubuntu 20.04+ recommended)

Build and Run

# Clone the repository
git clone https://github.com/swe-robertkibet/multithreaded-webserver-cpp.git
cd multithreaded-webserver-cpp

# Build the project
mkdir build && cd build
cmake .. -DCMAKE_BUILD_TYPE=Release
make -j$(nproc)

# Run the server
./bin/webserver 8080

# Or with custom thread count
./bin/webserver 8080 16

Docker Deployment

# Build and run with Docker Compose
docker-compose up --build webserver

# For development
docker-compose --profile development up webserver-dev

# For performance testing
docker-compose --profile benchmark up

Configuration

Edit config.json to customize server behavior:

{
  "server": {
    "host": "0.0.0.0",
    "port": 8080,
    "max_connections": 1000,
    "socket_timeout": 30
  },
  "threading": {
    "thread_pool_size": 8,
    "max_queue_size": 10000
  },
  "cache": {
    "enabled": true,
    "max_size_mb": 100,
    "ttl_seconds": 300
  },
  "rate_limiting": {
    "enabled": true,
    "requests_per_second": 100,
    "burst_size": 200
  }
}

Performance Benchmarks

Comprehensive Stress Test Results

Live Stress Test Demonstration

Stress Test Results

Real-time stress testing demonstration showing the server handling high concurrent loads with excellent performance and resource efficiency.

Connection Scalability Test

Concurrent Connections Requests/sec Performance
100 39,565.85 Baseline
500 55,934.55 +41.4%
1000 78,091.55 +97.3%
2000 73,075.56 +84.7% Peak
5000 71,508.89 +80.7%

Extended Duration Test

Test Duration Avg Requests/sec Stability
1 minute 46,706.94 Stable
5 minutes 47,831.58 Consistent
10 minutes 48,858.91 Sustained

Memory Stability Test (5-minute sustained load)

Initial memory usage: 1,680KB
Final memory usage:   1,680KB
Memory growth:        0% (0KB increase)
Maximum memory:       1,680KB
Status:              No memory leaks detected

Comparison with Industry Standards

Server Peak Requests/sec Sustained Req/s Memory Usage Stability
This Server 78,091.55 48,858.91 1.7MB 0% growth
Nginx ~45,000 ~40,000 ~25MB Stable
Apache ~25,000 ~20,000 ~100MB Variable

Performance Advantages:

  • 73% faster peak performance than Nginx
  • 212% faster peak performance than Apache
  • Exceptional memory efficiency at 1.7MB (15x better than Nginx)
  • Perfect memory stability with zero growth under sustained load

Performance Optimization - Log Reduction Impact

High-Load Error Handling Optimization

During stress testing, the server was producing excessive error logging for normal race condition scenarios that occur in high-concurrency environments. These logs were impacting performance by creating unnecessary I/O overhead.

Technical Background

The following error messages are normal and expected under high load in multithreaded scenarios:

Race Condition Errors:

  • [Response] fd=X ERROR: Connection already closed (server.cpp:377)
  • [Send] fd=X ERROR: Failed to send response: Bad file descriptor (server.cpp:444)

Root Causes:

  1. Timing Issues: Client disconnects while server is processing the response
  2. Race Conditions: Connection gets cleaned up by one thread while another thread tries to send data
  3. High Load Behavior: Under stress testing, clients timeout/disconnect before server can respond
  4. Network Conditions: High load causes natural connection drops

Why These Errors Are Normal:

  • High Concurrency: With 10,000+ connection limit, more concurrent connections create more timing opportunities
  • Client Behavior: Load testing tools often timeout or close connections aggressively
  • Network Stack: TCP connections naturally drop under extreme load conditions
  • Defensive Programming: Server detects and handles these conditions gracefully instead of crashing

Optimization Impact

By removing these excessive log messages (while preserving all error handling logic), we achieved significant performance improvements:

Metric Before Optimization After Optimization Improvement
Peak Performance 72,158 req/s 78,091.55 req/s +8.2%
1000 Connections 58,267 req/s 78,091.55 req/s +34.0%
Sustained (10min) 47,290 req/s 48,858.91 req/s +3.3%
Memory Usage ~18MB 1.7MB -91.1%
Log I/O Overhead High Eliminated 100% reduced

Key Benefits

Performance Gain: 8-34% improvement in request throughput
Memory Efficiency: 91% reduction in memory usage
Clean Output: Eliminated noise from expected connection drops
Maintained Reliability: All error handling logic preserved
Production Ready: Server handles edge cases silently and efficiently

Note: This optimization demonstrates that high-performance servers must balance comprehensive logging with performance efficiency. The removed messages were debugging information for normal operational conditions, not actual errors requiring attention.

Testing

Unit Tests

# Build with tests
mkdir build && cd build
cmake .. -DBUILD_TESTS=ON
make -j$(nproc)

# Run tests
./webserver_tests

Benchmark Testing

# Run comprehensive benchmarks
./scripts/benchmark.sh

# Quick performance test
./scripts/quick_bench.sh

# Stress test
./scripts/stress_test.sh

Project Structure

multithreaded-webserver-cpp/
├── include/           # Header files
│   ├── server.h         # Main server class
│   ├── thread_pool.h    # Thread pool implementation
│   ├── http_request.h   # HTTP request parser
│   ├── http_response.h  # HTTP response builder
│   ├── cache.h          # LRU cache system
│   ├── rate_limiter.h   # Rate limiting implementation
│   ├── file_handler.h   # File serving logic
│   ├── logger.h         # Logging system
│   └── epoll_wrapper.h  # Epoll abstraction
├── src/              # Source files
│   ├── main.cpp         # Application entry point
│   ├── server.cpp       # Server implementation
│   ├── thread_pool.cpp  # Thread pool logic
│   ├── http_request.cpp # Request parsing
│   ├── http_response.cpp# Response generation
│   ├── cache.cpp        # Cache implementation
│   ├── rate_limiter.cpp # Rate limiting logic
│   ├── file_handler.cpp # File operations
│   ├── logger.cpp       # Logging implementation
│   └── epoll_wrapper.cpp# Epoll wrapper
├── tests/            # Unit tests (GoogleTest)
├── scripts/          # Benchmarking scripts
├── docker/           # Docker configuration
├── public/           # Static web content
├── logs/             # Server logs
├── config.json          # Server configuration
├── CMakeLists.txt       # Build configuration
├── Dockerfile           # Multi-stage Docker build
└── docker-compose.yml   # Container orchestration

Configuration Options

Server Configuration

  • host: Bind address (default: "0.0.0.0")
  • port: Listen port (default: 8080)
  • max_connections: Maximum concurrent connections
  • socket_timeout: Connection timeout in seconds

Performance Tuning

  • thread_pool_size: Worker thread count (0 = auto-detect)
  • max_queue_size: Maximum task queue size
  • cache.max_size_mb: Cache memory limit
  • cache.ttl_seconds: Cache entry lifetime

Security Settings

  • rate_limiting.enabled: Enable/disable rate limiting
  • rate_limiting.requests_per_second: Request rate limit
  • rate_limiting.burst_size: Burst capacity

Contributing

  1. Fork the repository
  2. Create a feature branch: git checkout -b feature/amazing-feature
  3. Commit your changes: git commit -m 'Add amazing feature'
  4. Push to the branch: git push origin feature/amazing-feature
  5. Open a Pull Request

Development Setup

# Use development Docker container
docker-compose --profile development up webserver-dev

# Or build locally with debug symbols
cmake .. -DCMAKE_BUILD_TYPE=Debug -DBUILD_TESTS=ON

Monitoring and Observability

Built-in Metrics

  • Request throughput (requests/sec)
  • Response time percentiles
  • Cache hit ratios
  • Rate limiting statistics
  • Active connection counts
  • Memory usage tracking

Log Analysis

# Monitor access logs
tail -f logs/access.log

# Check error logs
tail -f logs/error.log

# Analyze performance
grep "200 OK" logs/access.log | wc -l

Troubleshooting

Common Issues

High CPU Usage

  • Check thread pool size configuration
  • Monitor for busy loops in request processing
  • Verify epoll configuration

Memory Leaks

  • Monitor cache size and eviction policies
  • Check for unclosed file descriptors
  • Use valgrind for detailed analysis

Connection Issues

  • Verify firewall settings
  • Check file descriptor limits (ulimit -n)
  • Monitor connection timeout settings

Debug Mode

# Build with debug symbols
cmake .. -DCMAKE_BUILD_TYPE=Debug

# Run with debugging tools
gdb ./webserver
valgrind --leak-check=full ./webserver

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

  • Linux epoll documentation and best practices
  • Modern C++ design patterns and idioms
  • HTTP/1.1 specification (RFC 7230-7235)
  • Performance optimization techniques from systems programming

Support

For issues and questions:


Built with ❤️ using modern C++17 and systems programming best practices.

About

High-performance multithreaded HTTP web server in C++17 featuring epoll-based I/O, thread pooling, LRU caching, rate limiting, and comprehensive logging. Achieves 78K+ req/sec with Docker support and complete test suite.

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published