
Build a Video Platform: Deployment & Production (Docker + VPS)

Tags: java, spring-boot, react, nextjs, video-streaming

We've built 14 phases of a video streaming platform — authentication, streaming, transcoding, payments, analytics, security, and testing. Everything works perfectly on localhost. But a platform that only runs on your laptop isn't a platform — it's a demo.

This final post takes our project from development to production. We'll containerize everything with Docker, configure Nginx with SSL, run database migrations safely, automate backups, write a deployment script, and add monitoring with Spring Boot Actuator. By the end, you'll have a production-ready video platform running on a VPS.

Time commitment: 4–5 hours
Prerequisites: Phase 13: Testing Strategy

What we'll build in this post:
✅ Multi-stage Docker builds for Spring Boot and Next.js
✅ Docker Compose production configuration with all services
✅ Nginx reverse proxy with SSL termination and HLS serving
✅ Flyway production migrations with safety checks
✅ Automated backup strategy (database + video files)
✅ One-command deploy script with health checks
✅ Spring Boot Actuator monitoring and alerting


Production Architecture

Here's the full production deployment architecture:

  Internet ──► Nginx (80/443, SSL) ──┬──► Next.js (web:3000)
                                     ├──► Spring Boot API (api:8080) ──► PostgreSQL (db:5432)
                                     │                              └──► Redis (redis:6379)
                                     └──► HLS files (shared video volume, read-only)
  Certbot runs alongside, renewing certificates.

Every component runs in its own Docker container, orchestrated by Docker Compose. Let's build it layer by layer.


Multi-Stage Docker Build: Spring Boot API

Multi-stage builds keep production images small. We separate the build environment (with Maven and JDK) from the runtime environment (just JRE).

API Dockerfile

# Dockerfile.api
 
# ── Stage 1: Build ──────────────────────────────
FROM eclipse-temurin:17-jdk-alpine AS builder
 
WORKDIR /app
 
# Copy Maven wrapper and POM first (layer caching)
COPY mvnw pom.xml ./
COPY .mvn .mvn
RUN chmod +x mvnw && ./mvnw dependency:go-offline -B
 
# Copy source and build
COPY src ./src
RUN ./mvnw package -DskipTests -B
 
# ── Stage 2: Runtime ────────────────────────────
FROM eclipse-temurin:17-jre-alpine
 
# Install FFmpeg for video transcoding
RUN apk add --no-cache ffmpeg
 
WORKDIR /app
 
# Create non-root user
RUN addgroup -S appgroup && adduser -S appuser -G appgroup
 
# Copy the built JAR
COPY --from=builder /app/target/*.jar app.jar
 
# Create directories for video storage
RUN mkdir -p /app/videos/uploads /app/videos/hls /app/videos/keys \
    && chown -R appuser:appgroup /app
 
USER appuser
 
# Health check
HEALTHCHECK --interval=30s --timeout=10s --retries=3 \
    CMD wget -qO- http://localhost:8080/actuator/health || exit 1
 
EXPOSE 8080
 
ENTRYPOINT ["java", \
    "-XX:+UseContainerSupport", \
    "-XX:MaxRAMPercentage=75.0", \
    "-Djava.security.egd=file:/dev/./urandom", \
    "-jar", "app.jar"]

Key decisions:

  • eclipse-temurin — production-grade OpenJDK distribution
  • -XX:+UseContainerSupport — respects Docker memory limits
  • -XX:MaxRAMPercentage=75.0 — uses 75% of container memory for JVM heap
  • Non-root user — security best practice
  • FFmpeg installed — video transcoding runs in the same container
  • Health check — Docker monitors the container via Actuator
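The heap flag deserves a quick sanity check, because FFmpeg runs in the same container and needs memory outside the JVM. A back-of-envelope sketch against the 2G limit we set in docker-compose.prod.yml:

```shell
# What -XX:MaxRAMPercentage=75.0 means for our 2G container limit
container_mem_mb=2048                 # compose limit: memory: 2G
max_ram_pct=75
heap_mb=$((container_mem_mb * max_ram_pct / 100))
echo "JVM max heap: ${heap_mb} MB"
echo "Headroom for metaspace, threads, and FFmpeg: $((container_mem_mb - heap_mb)) MB"
```

If transcoding jobs get OOM-killed, lowering the percentage (not raising the container limit) is usually the first lever to try.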

Multi-Stage Docker Build: Next.js Frontend

Frontend Dockerfile

# Dockerfile.web
 
# ── Stage 1: Dependencies ───────────────────────
FROM node:20-alpine AS deps
 
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci --omit=dev  # --only=production is deprecated in modern npm
 
# ── Stage 2: Build ──────────────────────────────
FROM node:20-alpine AS builder
 
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci
 
COPY . .
 
# Build-time environment variables
ARG NEXT_PUBLIC_API_URL
ENV NEXT_PUBLIC_API_URL=$NEXT_PUBLIC_API_URL
 
RUN npm run build
 
# ── Stage 3: Runtime ────────────────────────────
FROM node:20-alpine
 
WORKDIR /app
 
# Create non-root user
RUN addgroup -S appgroup && adduser -S appuser -G appgroup
 
# Copy only what's needed for production
COPY --from=deps /app/node_modules ./node_modules
COPY --from=builder /app/.next ./.next
COPY --from=builder /app/public ./public
COPY --from=builder /app/package.json ./
COPY --from=builder /app/next.config.mjs ./
 
USER appuser
 
HEALTHCHECK --interval=30s --timeout=10s --retries=3 \
    CMD wget -qO- http://localhost:3000/api/health || exit 1
 
EXPOSE 3000
 
CMD ["npm", "start"]

Three-stage build keeps the final image minimal:

  1. deps — install production dependencies only
  2. builder — compile Next.js with full dev dependencies
  3. runtime — copy only the built output and production node_modules

Docker Compose: Production Configuration

This is the heart of our deployment — a single file that defines every service, volume, and network.

docker-compose.prod.yml

version: "3.8"
 
services:
  # ── PostgreSQL ────────────────────────────────
  db:
    image: postgres:16-alpine
    container_name: vidplatform-db
    restart: unless-stopped
    environment:
      POSTGRES_DB: ${DB_NAME}
      POSTGRES_USER: ${DB_USER}
      POSTGRES_PASSWORD: ${DB_PASSWORD}
    volumes:
      - postgres_data:/var/lib/postgresql/data
    networks:
      - backend
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U ${DB_USER} -d ${DB_NAME}"]
      interval: 10s
      timeout: 5s
      retries: 5
 
  # ── Redis ─────────────────────────────────────
  redis:
    image: redis:7-alpine
    container_name: vidplatform-redis
    restart: unless-stopped
    command: redis-server --requirepass ${REDIS_PASSWORD} --maxmemory 256mb --maxmemory-policy allkeys-lru
    volumes:
      - redis_data:/data
    networks:
      - backend
    healthcheck:
      test: ["CMD", "redis-cli", "-a", "${REDIS_PASSWORD}", "ping"]
      interval: 10s
      timeout: 5s
      retries: 5
 
  # ── Spring Boot API ───────────────────────────
  api:
    build:
      context: ./backend
      dockerfile: Dockerfile.api
    container_name: vidplatform-api
    restart: unless-stopped
    environment:
      SPRING_PROFILES_ACTIVE: production
      SPRING_DATASOURCE_URL: jdbc:postgresql://db:5432/${DB_NAME}
      SPRING_DATASOURCE_USERNAME: ${DB_USER}
      SPRING_DATASOURCE_PASSWORD: ${DB_PASSWORD}
      SPRING_DATA_REDIS_HOST: redis
      SPRING_DATA_REDIS_PASSWORD: ${REDIS_PASSWORD}
      JWT_SECRET: ${JWT_SECRET}
      STRIPE_SECRET_KEY: ${STRIPE_SECRET_KEY}
      STRIPE_WEBHOOK_SECRET: ${STRIPE_WEBHOOK_SECRET}
      VIDEO_STORAGE_PATH: /app/videos
      HLS_BASE_URL: ${HLS_BASE_URL}
    volumes:
      - video_data:/app/videos
    depends_on:
      db:
        condition: service_healthy
      redis:
        condition: service_healthy
    networks:
      - backend
    deploy:
      resources:
        limits:
          memory: 2G
          cpus: "2.0"
 
  # ── Next.js Frontend ──────────────────────────
  web:
    build:
      context: ./frontend
      dockerfile: Dockerfile.web
      args:
        NEXT_PUBLIC_API_URL: ${NEXT_PUBLIC_API_URL}
    container_name: vidplatform-web
    restart: unless-stopped
    environment:
      NODE_ENV: production
    depends_on:
      - api
    networks:
      - backend
 
  # ── Nginx Reverse Proxy ───────────────────────
  nginx:
    image: nginx:alpine
    container_name: vidplatform-nginx
    restart: unless-stopped
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./nginx/nginx.conf:/etc/nginx/nginx.conf:ro
      - ./nginx/conf.d:/etc/nginx/conf.d:ro
      - ./certbot/conf:/etc/letsencrypt:ro
      - ./certbot/www:/var/www/certbot:ro
      - video_data:/var/www/videos:ro
    depends_on:
      - api
      - web
    networks:
      - backend
 
  # ── Certbot (SSL) ────────────────────────────
  certbot:
    image: certbot/certbot
    container_name: vidplatform-certbot
    volumes:
      - ./certbot/conf:/etc/letsencrypt
      - ./certbot/www:/var/www/certbot
    entrypoint: "/bin/sh -c 'trap exit TERM; while :; do certbot renew; sleep 12h; done'"
 
volumes:
  postgres_data:
  redis_data:
  video_data:
 
networks:
  backend:
    driver: bridge

Service Communication Flow

Nginx is the only container that publishes ports to the host. It proxies / to web:3000 and /api/ to api:8080, and serves HLS segments directly from the shared video volume. The API reaches db:5432 and redis:6379 by service name over the internal backend network — none of those ports are exposed publicly.

Environment Variables

Create a .env file on your production server — never commit this to git.

# .env (production)
 
# ── Database ─────────────────────────────────────
DB_NAME=vidplatform
DB_USER=vidplatform
DB_PASSWORD=generate-a-strong-password-here
 
# ── Redis ────────────────────────────────────────
REDIS_PASSWORD=generate-another-strong-password
 
# ── JWT ──────────────────────────────────────────
JWT_SECRET=at-least-64-chars-random-string-here
 
# ── Stripe ───────────────────────────────────────
STRIPE_SECRET_KEY=sk_live_your_key
STRIPE_WEBHOOK_SECRET=whsec_your_secret
 
# ── URLs ─────────────────────────────────────────
DOMAIN=yourdomain.com
NEXT_PUBLIC_API_URL=https://yourdomain.com/api
HLS_BASE_URL=https://yourdomain.com/videos/hls
 
# ── OAuth (optional) ────────────────────────────
GOOGLE_CLIENT_ID=your-google-client-id
GOOGLE_CLIENT_SECRET=your-google-client-secret
GITHUB_CLIENT_ID=your-github-client-id
GITHUB_CLIENT_SECRET=your-github-client-secret

Generate secure passwords:

# Generate random passwords
openssl rand -base64 32  # For DB_PASSWORD
openssl rand -base64 32  # For REDIS_PASSWORD
openssl rand -base64 48  # For JWT_SECRET
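It's worth verifying the generated values actually meet the length requirements before they land in .env — a quick check (the 64-character floor for JWT secrets is the one stated above):

```shell
# Sanity-check a freshly generated JWT secret before writing it to .env
JWT_SECRET=$(openssl rand -base64 48)   # 48 random bytes -> 64 base64 chars
if [ "${#JWT_SECRET}" -ge 64 ]; then
    echo "JWT_SECRET OK (${#JWT_SECRET} chars)"
else
    echo "JWT_SECRET too short (${#JWT_SECRET} chars)" >&2
    exit 1
fi
```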

Nginx Configuration

Nginx handles SSL termination, reverse proxying, HLS video serving, and security headers — all in one place.

Main Configuration

# nginx/nginx.conf
 
user nginx;
worker_processes auto;
error_log /var/log/nginx/error.log warn;
pid /var/run/nginx.pid;
 
events {
    worker_connections 1024;
    use epoll;
}
 
http {
    include /etc/nginx/mime.types;
    default_type application/octet-stream;
 
    # Logging format
    log_format main '$remote_addr - $remote_user [$time_local] "$request" '
                    '$status $body_bytes_sent "$http_referer" '
                    '"$http_user_agent" "$http_x_forwarded_for" '
                    '$request_time';
 
    access_log /var/log/nginx/access.log main;
 
    # Performance
    sendfile on;
    tcp_nopush on;
    tcp_nodelay on;
    keepalive_timeout 65;
    types_hash_max_size 2048;
 
    # Gzip compression
    gzip on;
    gzip_vary on;
    gzip_proxied any;
    gzip_comp_level 6;
    gzip_types text/plain text/css application/json application/javascript
               text/xml application/xml text/javascript image/svg+xml;
 
    # Rate limiting zones
    limit_req_zone $binary_remote_addr zone=api:10m rate=30r/s;
    limit_req_zone $binary_remote_addr zone=auth:10m rate=5r/m;
 
    # File upload size (for video uploads)
    client_max_body_size 2G;
 
    include /etc/nginx/conf.d/*.conf;
}

Site Configuration with SSL

# nginx/conf.d/vidplatform.conf
 
# Redirect HTTP to HTTPS
server {
    listen 80;
    server_name yourdomain.com;
 
    # Certbot challenge
    location /.well-known/acme-challenge/ {
        root /var/www/certbot;
    }
 
    location / {
        return 301 https://$host$request_uri;
    }
}
 
# Main HTTPS server
server {
    listen 443 ssl http2;
    server_name yourdomain.com;
 
    # ── SSL Configuration ─────────────────────────
    ssl_certificate /etc/letsencrypt/live/yourdomain.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/yourdomain.com/privkey.pem;
 
    ssl_protocols TLSv1.2 TLSv1.3;
    ssl_ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384;
    ssl_prefer_server_ciphers off;
    ssl_session_cache shared:SSL:10m;
    ssl_session_timeout 1d;
 
    # ── Security Headers ──────────────────────────
    add_header X-Frame-Options "SAMEORIGIN" always;
    add_header X-Content-Type-Options "nosniff" always;
    add_header X-XSS-Protection "1; mode=block" always;
    add_header Referrer-Policy "strict-origin-when-cross-origin" always;
    add_header Strict-Transport-Security "max-age=63072000; includeSubDomains" always;
    # Nginx does NOT concatenate adjacent strings — the value must be one string
    add_header Content-Security-Policy "default-src 'self'; script-src 'self' 'unsafe-inline' 'unsafe-eval' https://js.stripe.com; style-src 'self' 'unsafe-inline'; img-src 'self' data: https:; connect-src 'self' https://api.stripe.com; frame-src https://js.stripe.com" always;
 
    # ── API Proxy ─────────────────────────────────
    location /api/ {
        limit_req zone=api burst=50 nodelay;
 
        proxy_pass http://api:8080;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
 
        # Timeout for long-running requests (video upload)
        proxy_read_timeout 600s;
        proxy_send_timeout 600s;
    }
 
    # ── Auth endpoints (stricter rate limit) ──────
    location /api/auth/ {
        limit_req zone=auth burst=3 nodelay;
 
        proxy_pass http://api:8080;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
 
    # ── Stripe Webhook (no rate limit) ────────────
    location /api/webhooks/stripe {
        proxy_pass http://api:8080;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
 
        # Stripe needs raw body for signature verification
        proxy_set_header Content-Type $content_type;
        proxy_pass_request_body on;
    }
 
    # ── HLS Video Streaming (secure_link) ─────────
    location /videos/hls/ {
        alias /var/www/videos/hls/;
 
        # Secure link validation
        secure_link $arg_token,$arg_expires;
        secure_link_md5 "$secure_link_expires$uri$remote_addr your-secure-link-secret";
 
        # Deny if token is invalid or expired
        if ($secure_link = "") { return 403; }
        if ($secure_link = "0") { return 410; }
 
        # HLS-specific headers
        add_header Cache-Control "private, max-age=3600";
        add_header Access-Control-Allow-Origin "https://yourdomain.com";
 
        # MIME types for HLS
        types {
            application/vnd.apple.mpegurl m3u8;
            video/mp2t ts;
            application/octet-stream key;
        }
    }
 
    # ── Actuator (internal only) ──────────────────
    location /actuator/ {
        # Only allow from internal networks
        allow 172.16.0.0/12;  # Docker network
        allow 10.0.0.0/8;
        deny all;
 
        proxy_pass http://api:8080;
        proxy_set_header Host $host;
    }
 
    # ── Next.js Frontend ──────────────────────────
    location / {
        proxy_pass http://web:3000;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
 
        # WebSocket upgrade support (used by Next.js; also covers HMR in dev)
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
    }
}
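The secure_link block validates tokens, but something has to mint them. Here's a sketch of the matching recipe in shell — the secret is the placeholder from the config, and the URI, client IP, and expiry are made-up example values; in the real platform the Spring Boot API computes this per request:

```shell
# The MD5 input must match secure_link_md5 byte-for-byte:
#   "$secure_link_expires$uri$remote_addr your-secure-link-secret"
# (note the literal space before the secret)
SECRET="your-secure-link-secret"            # placeholder from the nginx config
URI="/videos/hls/lesson-42/index.m3u8"      # example HLS playlist path
CLIENT_IP="203.0.113.7"                     # example viewer IP
EXPIRES=1767225600                          # example Unix-time expiry

# nginx expects base64url with padding stripped
TOKEN=$(printf '%s' "${EXPIRES}${URI}${CLIENT_IP} ${SECRET}" \
    | openssl md5 -binary | openssl base64 \
    | tr '+/' '-_' | tr -d '=')

echo "https://yourdomain.com${URI}?token=${TOKEN}&expires=${EXPIRES}"
```

Because $remote_addr is baked into the hash, a leaked URL stops working from any other IP — and after the expiry it returns 410 regardless.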

SSL Certificate Setup

# First-time SSL certificate with Let's Encrypt
# Step 1: Start Nginx with HTTP only (comment out SSL block first)
docker compose -f docker-compose.prod.yml up -d nginx
 
# Step 2: Obtain certificate
docker compose -f docker-compose.prod.yml run --rm certbot \
    certonly --webroot --webroot-path=/var/www/certbot \
    -d yourdomain.com --email your@email.com --agree-tos
 
# Step 3: Uncomment SSL block in nginx.conf, restart
docker compose -f docker-compose.prod.yml restart nginx
 
# Verify SSL
curl -I https://yourdomain.com

The certbot container attempts renewal every 12 hours; Let's Encrypt only reissues certificates within 30 days of expiry. One caveat: Nginx runs in its own container and keeps serving the old certificate until it reloads, so schedule a periodic nginx -s reload (or wire up a certbot deploy hook) alongside the renewal loop.
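One simple way to make sure renewed certificates actually go live, sketched as a cron entry (the file path is a suggestion):

```shell
# /etc/cron.d/vidplatform-nginx-reload (hypothetical file)
# Reload Nginx daily at 05:00 so freshly renewed certificates take effect;
# reload is graceful and drops no connections
0 5 * * * root docker compose -f /opt/vidplatform/docker-compose.prod.yml exec -T nginx nginx -s reload
```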


Spring Boot Production Profile

application-production.yml

# src/main/resources/application-production.yml
 
spring:
  # ── Database ──────────────────────────────────
  datasource:
    url: ${SPRING_DATASOURCE_URL}
    username: ${SPRING_DATASOURCE_USERNAME}
    password: ${SPRING_DATASOURCE_PASSWORD}
    hikari:
      maximum-pool-size: 20
      minimum-idle: 5
      idle-timeout: 300000
      connection-timeout: 20000
      max-lifetime: 1200000
 
  # ── JPA ───────────────────────────────────────
  jpa:
    hibernate:
      ddl-auto: validate  # NEVER use update/create in production
    show-sql: false
    properties:
      hibernate:
        format_sql: false
        generate_statistics: false
 
  # ── Flyway ────────────────────────────────────
  flyway:
    enabled: true
    baseline-on-migrate: false
    validate-on-migrate: true
    locations: classpath:db/migration
 
  # ── Redis ─────────────────────────────────────
  data:
    redis:
      host: ${SPRING_DATA_REDIS_HOST}
      port: 6379
      password: ${SPRING_DATA_REDIS_PASSWORD}
      timeout: 5000
 
  # ── Cache ─────────────────────────────────────
  cache:
    type: redis
    redis:
      time-to-live: 600000  # 10 minutes default
 
# ── Server ────────────────────────────────────────
server:
  port: 8080
  shutdown: graceful
  tomcat:
    threads:
      max: 200  # Spring Boot 3 property; the old max-threads key is gone
    accept-count: 100
    connection-timeout: 10000
 
# ── Actuator ──────────────────────────────────────
management:
  endpoints:
    web:
      exposure:
        include: health,info,metrics,prometheus
  endpoint:
    health:
      show-details: when-authorized
      probes:
        enabled: true
  health:
    db:
      enabled: true
    redis:
      enabled: true
    diskspace:
      enabled: true
 
# ── Logging ───────────────────────────────────────
logging:
  level:
    root: WARN
    com.vidplatform: INFO
    org.springframework.web: WARN
    org.hibernate: WARN
  pattern:
    console: '{"time":"%d","level":"%p","logger":"%logger","msg":"%m"}%n'
 
# ── Video Storage ─────────────────────────────────
video:
  storage:
    path: ${VIDEO_STORAGE_PATH:/app/videos}
    max-upload-size: 2GB
  hls:
    base-url: ${HLS_BASE_URL}

Key production settings:

  • ddl-auto: validate — Hibernate only validates schema, never modifies it. Flyway handles migrations.
  • shutdown: graceful — Spring Boot waits for in-flight requests before shutting down
  • HikariCP tuning — connection pool sized for production traffic
  • Structured JSON logging — easy to parse with log aggregation tools
  • Actuator — health checks, metrics, and Prometheus endpoint
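The maximum-pool-size of 20 is a deliberate over-provision. HikariCP's own sizing guidance suggests far fewer connections than most people expect — roughly cores × 2 plus effective disk spindles. A sketch, assuming a 4-core VPS with a single SSD:

```shell
# HikariCP's published rule of thumb: connections ≈ core_count * 2 + spindles
CORES=4
SPINDLES=1
POOL=$((CORES * 2 + SPINDLES))
echo "suggested maximum-pool-size: ${POOL}"
```

If you see connection-wait timeouts under load, profile query latency before reaching for a bigger pool — an oversized pool usually just moves contention into PostgreSQL.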

Flyway Production Migrations

Flyway manages database schema changes as versioned migration files. Every schema change is a new SQL file — no manual ALTER TABLE on production.

Migration File Structure

src/main/resources/db/migration/
├── V1__create_users_table.sql
├── V2__create_oauth_accounts_table.sql
├── V3__create_courses_tables.sql
├── V4__create_lessons_table.sql
├── V5__create_subscriptions_tables.sql
├── V6__create_lesson_progress_table.sql
├── V7__create_course_enrollments_table.sql
├── V8__add_payment_history_table.sql
└── V9__add_analytics_indexes.sql

Example Migration: Initial Schema

-- V1__create_users_table.sql
 
CREATE TABLE users (
    id          BIGSERIAL PRIMARY KEY,
    email       VARCHAR(255) NOT NULL UNIQUE,
    password_hash VARCHAR(255),
    name        VARCHAR(255) NOT NULL,
    avatar_url  VARCHAR(500),
    role        VARCHAR(20) NOT NULL DEFAULT 'USER',
    email_verified BOOLEAN NOT NULL DEFAULT FALSE,
    created_at  TIMESTAMP NOT NULL DEFAULT NOW(),
    updated_at  TIMESTAMP NOT NULL DEFAULT NOW()
);
 
CREATE INDEX idx_users_email ON users(email);
CREATE INDEX idx_users_role ON users(role);

Example Migration: Adding Indexes for Analytics

-- V9__add_analytics_indexes.sql
 
-- Optimize analytics queries (added in Phase 11)
CREATE INDEX idx_subscriptions_status_period
    ON subscriptions(status, current_period_end);
 
CREATE INDEX idx_lesson_progress_completed
    ON lesson_progress(completed) WHERE completed = TRUE;
 
CREATE INDEX idx_course_enrollments_progress
    ON course_enrollments(course_id, progress_pct);
 
-- Partial index: only active subscriptions
CREATE INDEX idx_subscriptions_active
    ON subscriptions(user_id)
    WHERE status = 'ACTIVE';

Migration Safety Rules

Golden rules for production migrations:

  1. Never rename columns directly — add new column, migrate data, then drop old column
  2. Never drop a column in the same release — mark deprecated, remove in the next release
  3. Always add indexes concurrently when possible (CREATE INDEX CONCURRENTLY)
  4. Test migrations on a copy of production data before deploying
  5. Never modify a migration that's already been applied — create a new one instead
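Rule 1 in practice is the expand/contract pattern. A sketch for a hypothetical rename of users.name to users.display_name, spread across two releases (column and version numbers are made up for illustration):

```shell
# Release N ("expand"): add the new column and backfill it.
# Application code in this release writes BOTH columns.
cat > V10__add_display_name.sql <<'SQL'
ALTER TABLE users ADD COLUMN display_name VARCHAR(255);
UPDATE users SET display_name = name;
SQL

# Release N+1 ("contract"): every reader/writer now uses display_name,
# so the old column can finally go.
cat > V11__drop_name.sql <<'SQL'
ALTER TABLE users DROP COLUMN name;
SQL

echo "two migrations written: V10 (expand), V11 (contract)"
```

The point is that at no moment does a running app version reference a column that doesn't exist — which is what a direct RENAME would cause during a rolling deploy.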

Running Migrations

Flyway runs automatically when Spring Boot starts (configured in application-production.yml), so the API applies any pending migrations on every deploy. Don't exec a second java -jar inside the running container — it would collide with the live instance on port 8080. To apply new migrations, restart the API; to audit what has run, query Flyway's history table:

# Apply pending migrations by restarting the API
docker compose -f docker-compose.prod.yml restart api
 
# Check migration status
docker compose -f docker-compose.prod.yml exec db \
    psql -U vidplatform -d vidplatform \
    -c "SELECT version, description, installed_on, success FROM flyway_schema_history ORDER BY installed_rank;"

Backup Strategy

Data loss is the one failure you can't recover from. We need backups for two things: the PostgreSQL database and video files.

Database Backup Script

#!/bin/bash
# scripts/backup-db.sh
 
set -euo pipefail
 
# ── Configuration ─────────────────────────────────
BACKUP_DIR="/opt/backups/db"
RETENTION_DAYS=30
TIMESTAMP=$(date +%Y%m%d_%H%M%S)
BACKUP_FILE="vidplatform_${TIMESTAMP}.sql.gz"
 
# Load environment
source /opt/vidplatform/.env
 
# ── Create backup ─────────────────────────────────
echo "[$(date)] Starting database backup..."
 
mkdir -p "$BACKUP_DIR"
 
docker compose -f /opt/vidplatform/docker-compose.prod.yml exec -T db \
    pg_dump -U "$DB_USER" -d "$DB_NAME" \
    --no-owner --no-privileges --clean --if-exists --format=plain \
    | gzip > "$BACKUP_DIR/$BACKUP_FILE"
 
BACKUP_SIZE=$(du -sh "$BACKUP_DIR/$BACKUP_FILE" | cut -f1)
echo "[$(date)] Backup complete: $BACKUP_FILE ($BACKUP_SIZE)"
 
# ── Remove old backups ────────────────────────────
echo "[$(date)] Cleaning backups older than $RETENTION_DAYS days..."
find "$BACKUP_DIR" -name "vidplatform_*.sql.gz" -mtime +$RETENTION_DAYS -delete
 
# ── Verify backup ────────────────────────────────
if gzip -t "$BACKUP_DIR/$BACKUP_FILE" 2>/dev/null; then
    echo "[$(date)] Backup verified successfully"
else
    echo "[$(date)] ERROR: Backup file is corrupted!"
    exit 1
fi
 
echo "[$(date)] Database backup complete"

Video Files Backup Script

#!/bin/bash
# scripts/backup-videos.sh
 
set -euo pipefail
 
# ── Configuration ─────────────────────────────────
VIDEO_SOURCE="/opt/vidplatform/videos"
BACKUP_DEST="/opt/backups/videos"
TIMESTAMP=$(date +%Y%m%d_%H%M%S)
LOG_FILE="/var/log/vidplatform/video-backup.log"
 
# ── Incremental sync with rsync ──────────────────
echo "[$(date)] Starting video backup..." | tee -a "$LOG_FILE"
 
mkdir -p "$BACKUP_DEST"
 
rsync -avz --progress \
    --exclude="*.tmp" \
    --exclude="*.processing" \
    "$VIDEO_SOURCE/" "$BACKUP_DEST/" \
    2>&1 | tee -a "$LOG_FILE"
 
TOTAL_SIZE=$(du -sh "$BACKUP_DEST" | cut -f1)
echo "[$(date)] Video backup complete. Total size: $TOTAL_SIZE" | tee -a "$LOG_FILE"

Cron Schedule

# /etc/cron.d/vidplatform-backups
 
# Database backup: daily at 2 AM
0 2 * * * root /opt/vidplatform/scripts/backup-db.sh >> /var/log/vidplatform/db-backup.log 2>&1
 
# Video sync: daily at 3 AM
0 3 * * * root /opt/vidplatform/scripts/backup-videos.sh >> /var/log/vidplatform/video-backup.log 2>&1
 
# Cleanup old logs: weekly on Sunday
0 4 * * 0 root find /var/log/vidplatform -name "*.log" -mtime +90 -delete

Restore Procedure

# ── Restore database ─────────────────────────────
# Step 1: Stop the API (prevent writes)
docker compose -f docker-compose.prod.yml stop api
 
# Step 2: Restore from backup
gunzip -c /opt/backups/db/vidplatform_20260323_020000.sql.gz | \
    docker compose -f docker-compose.prod.yml exec -T db \
    psql -U vidplatform -d vidplatform
 
# Step 3: Restart the API
docker compose -f docker-compose.prod.yml start api
 
# ── Restore videos ───────────────────────────────
rsync -avz /opt/backups/videos/ /opt/vidplatform/videos/

Deploy Script

A single script that handles the entire deployment: pull code, build images, run migrations, and restart services with health checks.

deploy.sh

#!/bin/bash
# scripts/deploy.sh
 
set -euo pipefail
 
# ── Configuration ─────────────────────────────────
APP_DIR="/opt/vidplatform"
COMPOSE_FILE="$APP_DIR/docker-compose.prod.yml"
LOG_FILE="/var/log/vidplatform/deploy.log"
MAX_WAIT=120  # seconds to wait for health checks
 
# Colors for output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
NC='\033[0m'
 
log() { echo -e "[$(date '+%Y-%m-%d %H:%M:%S')] $1" | tee -a "$LOG_FILE"; }
 
# ── Pre-flight checks ────────────────────────────
log "${YELLOW}Starting deployment...${NC}"
 
if [ ! -f "$APP_DIR/.env" ]; then
    log "${RED}ERROR: .env file not found${NC}"
    exit 1
fi
 
cd "$APP_DIR"
 
# ── Pull latest code ─────────────────────────────
log "Pulling latest code..."
git fetch origin main
git reset --hard origin/main
 
# ── Backup database before deploy ─────────────────
log "Creating pre-deploy backup..."
bash scripts/backup-db.sh
 
# ── Build new images ─────────────────────────────
log "Building Docker images..."
# Keep the current API image around for rollback. Compose names built images
# <project>-<service>; the project here is the directory name, "vidplatform".
docker tag vidplatform-api:latest vidplatform-api:previous 2>/dev/null || true
docker compose -f "$COMPOSE_FILE" build --no-cache api web
 
# ── Rolling restart ──────────────────────────────
log "Restarting services..."
 
# Restart API first (runs Flyway migrations on startup)
docker compose -f "$COMPOSE_FILE" up -d --no-deps api
log "Waiting for API to be healthy..."
 
WAIT=0
until docker compose -f "$COMPOSE_FILE" exec -T api wget -qO- http://localhost:8080/actuator/health 2>/dev/null | grep -q '"status":"UP"'; do
    WAIT=$((WAIT + 5))
    if [ $WAIT -ge $MAX_WAIT ]; then
        log "${RED}ERROR: API health check failed after ${MAX_WAIT}s${NC}"
        log "Rolling back to the previous image..."
        docker compose -f "$COMPOSE_FILE" stop api
        docker tag vidplatform-api:previous vidplatform-api:latest
        docker compose -f "$COMPOSE_FILE" up -d --no-deps api
        exit 1
    fi
    sleep 5
    log "  Waiting... (${WAIT}s)"
done
log "${GREEN}API is healthy${NC}"
 
# Restart frontend
docker compose -f "$COMPOSE_FILE" up -d --no-deps web
sleep 5
 
# Reload Nginx to pick up any config changes
docker compose -f "$COMPOSE_FILE" exec nginx nginx -s reload
log "${GREEN}Nginx reloaded${NC}"
 
# ── Final health check ───────────────────────────
log "Running final health checks..."
 
check_service() {
    local name=$1
    local url=$2
    # -k: the certificate is issued for the domain, not "localhost"
    if curl -skf "$url" > /dev/null 2>&1; then
        log "${GREEN}  ✓ $name${NC}"
        return 0
    else
        log "${RED}  ✗ $name${NC}"
        return 1
    fi
}
 
FAILED=0
check_service "API Health" "http://localhost:8080/actuator/health" || FAILED=1
check_service "Frontend" "http://localhost:3000" || FAILED=1
check_service "Nginx HTTPS" "https://localhost" || FAILED=1
 
if [ $FAILED -eq 1 ]; then
    log "${RED}Some health checks failed! Check logs.${NC}"
    docker compose -f "$COMPOSE_FILE" ps
    exit 1
fi
 
# ── Cleanup ──────────────────────────────────────
log "Cleaning up unused Docker images..."
docker image prune -f
 
log "${GREEN}Deployment complete!${NC}"
docker compose -f "$COMPOSE_FILE" ps

Deployment Flow

In order: pull code → pre-deploy database backup → rebuild images (tagging the old API image as a rollback target) → restart API (Flyway migrates on startup) → wait for the health endpoint → restart frontend → reload Nginx → final health checks → prune old images. A failed API health check triggers a rollback to the previous image and a non-zero exit.

Spring Boot Actuator Monitoring

Actuator provides production-ready monitoring endpoints out of the box. We configured it in application-production.yml — here's how to use it.

Health Check Endpoint

// src/main/java/com/vidplatform/monitoring/VideoStorageHealthIndicator.java
 
import java.io.File;
 
import org.springframework.beans.factory.annotation.Value;
import org.springframework.boot.actuate.health.Health;
import org.springframework.boot.actuate.health.HealthIndicator;
import org.springframework.stereotype.Component;
 
@Component
public class VideoStorageHealthIndicator implements HealthIndicator {
 
    @Value("${video.storage.path}")
    private String videoStoragePath;
 
    @Override
    public Health health() {
        File storageDir = new File(videoStoragePath);
 
        if (!storageDir.exists() || !storageDir.canWrite()) {
            return Health.down()
                    .withDetail("path", videoStoragePath)
                    .withDetail("error", "Storage directory not writable")
                    .build();
        }
 
        long usableSpace = storageDir.getUsableSpace();
        long totalSpace = storageDir.getTotalSpace();
        double usedPercentage = ((double) (totalSpace - usableSpace) / totalSpace) * 100;
 
        Health.Builder builder = usedPercentage > 90
                ? Health.down().withDetail("warning", "Disk usage above 90%")
                : Health.up();
 
        return builder
                .withDetail("path", videoStoragePath)
                .withDetail("usableSpaceGB", usableSpace / (1024 * 1024 * 1024))
                .withDetail("totalSpaceGB", totalSpace / (1024 * 1024 * 1024))
                .withDetail("usedPercentage", String.format("%.1f%%", usedPercentage))
                .build();
    }
}

Custom Metrics

// src/main/java/com/vidplatform/monitoring/VideoMetrics.java
 
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Stream;
 
import io.micrometer.core.instrument.Counter;
import io.micrometer.core.instrument.Gauge;
import io.micrometer.core.instrument.MeterRegistry;
import io.micrometer.core.instrument.Timer;
 
import org.springframework.stereotype.Component;
 
@Component
public class VideoMetrics {
 
    private final Counter transcodingStarted;
    private final Counter transcodingCompleted;
    private final Counter transcodingFailed;
    private final Timer transcodingDuration;
    private final AtomicInteger activeTranscodings;
 
    public VideoMetrics(MeterRegistry registry) {
        this.transcodingStarted = Counter.builder("video.transcoding.started")
                .description("Number of transcoding jobs started")
                .register(registry);
 
        this.transcodingCompleted = Counter.builder("video.transcoding.completed")
                .description("Number of transcoding jobs completed")
                .register(registry);
 
        this.transcodingFailed = Counter.builder("video.transcoding.failed")
                .description("Number of transcoding jobs failed")
                .register(registry);
 
        this.transcodingDuration = Timer.builder("video.transcoding.duration")
                .description("Time taken to transcode videos")
                .register(registry);
 
        this.activeTranscodings = registry.gauge(
                "video.transcoding.active",
                new AtomicInteger(0)
        );
 
        // Gauge for total video count
        Gauge.builder("video.storage.count", this, VideoMetrics::countVideoFiles)
                .description("Total number of video files in storage")
                .register(registry);
    }
 
    public void recordTranscodingStarted() {
        transcodingStarted.increment();
        activeTranscodings.incrementAndGet();
    }
 
    public void recordTranscodingCompleted(long durationMs) {
        transcodingCompleted.increment();
        activeTranscodings.decrementAndGet();
        transcodingDuration.record(durationMs, TimeUnit.MILLISECONDS);
    }
 
    public void recordTranscodingFailed() {
        transcodingFailed.increment();
        activeTranscodings.decrementAndGet();
    }
 
    private double countVideoFiles() {
        // Count .m3u8 files in HLS directory
        try (Stream<Path> files = Files.walk(Path.of("/app/videos/hls"))) {
            return files.filter(p -> p.toString().endsWith(".m3u8")).count();
        } catch (IOException e) {
            return -1;
        }
    }
}

Prometheus Integration

Add the Prometheus dependency to expose metrics in Prometheus format:

<!-- pom.xml -->
<dependency>
    <groupId>io.micrometer</groupId>
    <artifactId>micrometer-registry-prometheus</artifactId>
</dependency>

Metrics are now available at /actuator/prometheus:

# Sample output
curl http://localhost:8080/actuator/prometheus
 
# HELP video_transcoding_started_total Number of transcoding jobs started
# TYPE video_transcoding_started_total counter
video_transcoding_started_total 42.0
 
# HELP video_transcoding_duration_seconds Time taken to transcode videos
# TYPE video_transcoding_duration_seconds summary
video_transcoding_duration_seconds_count 38.0
video_transcoding_duration_seconds_sum 4567.8
 
# HELP video_transcoding_active Number of active transcoding jobs
# TYPE video_transcoding_active gauge
video_transcoding_active 2.0

Monitoring with Actuator Endpoints

# Health check (used by Docker and deploy script)
curl http://localhost:8080/actuator/health
# {
#   "status": "UP",
#   "components": {
#     "db": { "status": "UP" },
#     "redis": { "status": "UP" },
#     "videoStorage": { "status": "UP", "details": { "usedPercentage": "45.2%" } },
#     "diskSpace": { "status": "UP" }
#   }
# }
 
# Application info
curl http://localhost:8080/actuator/info
 
# JVM metrics
curl http://localhost:8080/actuator/metrics/jvm.memory.used
 
# Custom metrics
curl http://localhost:8080/actuator/metrics/video.transcoding.active
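
The health endpoint is what lets a deploy script decide whether a rollout succeeded. A minimal poll loop might look like this — a sketch, where `wait_for_health` is a hypothetical helper and the pattern match relies on Actuator's compact JSON output:

```shell
# Sketch: poll a health URL until the app reports UP, or give up after a timeout.
# wait_for_health is a hypothetical deploy-script helper, not part of Actuator.
wait_for_health() {
  local url="$1" timeout="${2:-120}" elapsed=0
  until curl -fsS "$url" 2>/dev/null | grep -Eq '"status"[[:space:]]*:[[:space:]]*"UP"'; do
    sleep 3
    elapsed=$((elapsed + 3))
    if [ "$elapsed" -ge "$timeout" ]; then
      echo "Health check timed out after ${timeout}s" >&2
      return 1
    fi
  done
  echo "Healthy after ${elapsed}s"
}

# Usage in deploy.sh:
#   wait_for_health http://localhost:8080/actuator/health 120 || rollback
```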

Server Setup Guide

Here's a step-by-step guide to set up a fresh VPS for our video platform.

VPS Requirements

| Resource  | Minimum      | Recommended  |
|-----------|--------------|--------------|
| CPU       | 2 cores      | 4 cores      |
| RAM       | 4 GB         | 8 GB         |
| Storage   | 50 GB SSD    | 200 GB+ SSD  |
| OS        | Ubuntu 22.04 | Ubuntu 22.04 |
| Bandwidth | 1 TB/month   | Unmetered    |

FFmpeg transcoding is CPU-intensive. For platforms with many concurrent uploads, consider 4+ cores.
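
You can sanity-check a candidate box against this table with a few stock Linux commands:

```shell
# Quick spec check for a candidate VPS (standard Linux tooling)
echo "CPU cores : $(nproc)"
free -h | awk '/^Mem:/ {print "RAM       : " $2}'
df -h /  | awk 'NR==2 {print "Root disk : " $2}'
```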

Initial Server Setup

# ── 1. Update system ─────────────────────────────
sudo apt update && sudo apt upgrade -y
 
# ── 2. Create app user ──────────────────────────
sudo adduser vidplatform
sudo usermod -aG sudo vidplatform
 
# ── 3. Install Docker ───────────────────────────
curl -fsSL https://get.docker.com | sh
sudo systemctl enable docker
sudo systemctl start docker
 
# Add the app user to the docker group (the group only exists after install)
sudo usermod -aG docker vidplatform
 
# ── 4. Install Docker Compose ───────────────────
sudo apt install docker-compose-plugin -y
 
# ── 5. Configure firewall ───────────────────────
sudo ufw allow OpenSSH
sudo ufw allow 80/tcp
sudo ufw allow 443/tcp
sudo ufw enable
 
# ── 6. Set up project directory ─────────────────
sudo mkdir -p /opt/vidplatform
sudo chown vidplatform:vidplatform /opt/vidplatform
 
# ── 7. Clone repository ────────────────────────
su - vidplatform
cd /opt/vidplatform
git clone https://github.com/yourusername/vidplatform.git .
 
# ── 8. Create directories ──────────────────────
mkdir -p nginx/conf.d certbot/conf certbot/www
sudo mkdir -p /opt/backups/{db,videos}
sudo mkdir -p /var/log/vidplatform
sudo chown -R vidplatform:vidplatform /opt/backups /var/log/vidplatform
 
# ── 9. Configure environment ───────────────────
cp .env.example .env
nano .env  # Edit with your production values
 
# ── 10. Deploy! ─────────────────────────────────
bash scripts/deploy.sh

Swap Space (Important for FFmpeg)

FFmpeg can consume a lot of memory during transcoding. Add swap space:

# Create 4GB swap
sudo fallocate -l 4G /swapfile
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile
 
# Make permanent
echo '/swapfile none swap sw 0 0' | sudo tee -a /etc/fstab
 
# Verify
free -h

Production Checklist

Before going live, verify every item on this checklist:

Security

  • .env file is not in git (check .gitignore)
  • Database password is at least 32 characters
  • JWT secret is at least 64 characters
  • Stripe uses live keys (not test keys)
  • SSL certificate is valid and auto-renewing
  • Actuator endpoints are restricted to internal networks
  • ddl-auto is set to validate (not update or create)
  • Rate limiting is configured for auth and API endpoints
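
The two length requirements above are easy to verify mechanically. Here's a sketch of a pre-flight helper — `check_len` is a hypothetical function, and the `DB_PASSWORD`/`JWT_SECRET` key names are the ones this series has used; adjust them to your own `.env`:

```shell
# Sketch: flag secrets in .env that are shorter than the checklist requires.
# check_len is a hypothetical helper: check_len NAME VALUE MIN_LENGTH
check_len() {
  local name="$1" value="$2" min="$3"
  if [ "${#value}" -ge "$min" ]; then
    echo "OK: $name (${#value} chars)"
  else
    echo "WARN: $name too short (${#value} < $min)"
  fi
}

# Usage against your real .env (key names assumed from this series):
#   check_len DB_PASSWORD "$(grep '^DB_PASSWORD=' .env | cut -d= -f2-)" 32
#   check_len JWT_SECRET  "$(grep '^JWT_SECRET='  .env | cut -d= -f2-)" 64
```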

Reliability

  • Database backups run daily
  • Backup restore has been tested
  • Health checks are configured for all containers
  • Deploy script includes rollback on failure
  • Graceful shutdown is enabled (server.shutdown: graceful)
  • Container restart policy is unless-stopped

Performance

  • Gzip compression enabled in Nginx
  • Redis caching is configured for hot data
  • HikariCP connection pool is sized correctly
  • Next.js is running in production mode
  • Static assets have cache headers
  • JVM memory is sized to 75% of container limit

Monitoring

  • Actuator health endpoint responds with all components
  • Custom video storage health indicator is active
  • Prometheus metrics are being collected
  • Log files are rotating (not filling disk)
  • Deploy logs are being captured
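
For the log-rotation item, a logrotate rule covering the `/var/log/vidplatform` directory created during server setup is usually enough. A sketch (retention values are suggestions, not requirements):

```shell
# Sketch: a logrotate rule for /var/log/vidplatform. Generated locally;
# install it with: sudo cp vidplatform.logrotate /etc/logrotate.d/vidplatform
cat > vidplatform.logrotate <<'EOF'
/var/log/vidplatform/*.log {
    daily
    rotate 14
    compress
    delaycompress
    missingok
    notifempty
}
EOF
```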

Common Production Issues

Issue: API Won't Start

# Check logs
docker compose -f docker-compose.prod.yml logs api --tail=100
 
# Common causes:
# 1. Database not ready yet → check depends_on health checks
# 2. Flyway migration failed → check migration SQL
# 3. Missing environment variable → check .env file
# 4. Port already in use → check with: ss -tlnp | grep 8080
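
Cause 3 is the most common and the quickest to rule out. A sketch of a check for required `.env` keys — the key names below are examples from this series; match yours against `.env.example`:

```shell
# Sketch: confirm required keys exist in .env before restarting the stack.
# Key names are assumptions from this series; adjust to your .env.example.
required_keys="DB_PASSWORD JWT_SECRET STRIPE_SECRET_KEY"
missing=0
for key in $required_keys; do
  if ! grep -q "^${key}=" .env 2>/dev/null; then
    echo "MISSING: $key"
    missing=1
  fi
done
if [ "$missing" -eq 0 ]; then echo "All required keys present"; fi
```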

Issue: Video Playback Fails

# Check secure_link token
# Look at Nginx error logs
docker compose -f docker-compose.prod.yml logs nginx --tail=50
 
# Common causes:
# 1. Token expired → check client clock sync
# 2. IP mismatch → user's IP changed (mobile networks)
# 3. HLS files not generated → check video_data volume mount
# 4. CORS error → check Access-Control-Allow-Origin header

Issue: Out of Disk Space

# Check disk usage
df -h
 
# Find large files
du -sh /opt/vidplatform/videos/*
du -sh /var/lib/docker/*
 
# Clean up Docker
docker system prune -a  # Remove unused images/containers
 
# Clean old backups
find /opt/backups -type f -mtime +30 -delete  # files only; keeps the db/ and videos/ dirs

Issue: High Memory Usage

# Check container resource usage
docker stats --no-stream
 
# If API container is using too much:
# 1. Reduce MaxRAMPercentage in Dockerfile
# 2. Reduce HikariCP pool size
# 3. Check for memory leaks in transcoding

Complete Series Architecture

Congratulations — you've built a complete video streaming platform from scratch! Here's everything we built across 15 posts:


Where to Go From Here

You've built a production-ready video streaming platform. Here are directions to explore next:

Feature Extensions

  • Multi-course bundles — package courses together with discount pricing
  • Discussion forums — per-lesson comment threads for student Q&A
  • Certificates — generate PDF completion certificates with unique verification codes
  • Live streaming — add RTMP ingest with FFmpeg for live lessons
  • Mobile app — React Native app consuming the same API

Infrastructure Improvements

  • CDN integration — CloudFlare or BunnyCDN for global HLS delivery
  • Object storage — migrate video files from disk to S3/MinIO for scalability
  • Kubernetes — orchestrate with Helm charts for auto-scaling
  • Message queue — RabbitMQ for async transcoding instead of @Async
  • Database replicas — read replicas for analytics queries

Monitoring & Observability

  • Grafana dashboards — visualize Prometheus metrics with custom panels
  • Distributed tracing — OpenTelemetry + Jaeger for request tracing
  • Error tracking — Sentry for real-time error monitoring
  • Uptime monitoring — external health checks with alerts
  • Log aggregation — ELK stack or Loki for centralized logging

Learning Paths

  • System design interviews — you can now discuss video streaming with real implementation experience
  • Microservices — split into separate services (auth, catalog, transcoding, billing)
  • Multi-tenancy — let creators host their own course platforms on your infrastructure
  • Open source — publish your platform and build a community

The skills you've learned across this series — API design, database modeling, video transcoding, caching, authentication, payment processing, testing, containerization, and deployment — are the same skills used to build platforms at every scale. From a solo project to a production service serving thousands of students, the fundamentals remain the same.


Series: Build a Video Streaming Platform
Previous: Phase 13: Testing Strategy
