# Rolling Update Deployment Guide

This guide explains how to perform zero-downtime deployments using the rolling update strategy.

## Overview

The rolling update approach allows you to deploy new backend code without any downtime for users. Here's how it works:

1. **Build** new backend image while old container is still running
2. **Start** new container on port 8082 (old one stays on 8080)
3. **Health check** new container to ensure it's ready
4. **Switch** Nginx to point to new container (zero downtime)
5. **Stop** old container after grace period

## Architecture

```
┌─────────────┐
│    Nginx    │ (Port 80/443)
│   (Host)    │
└──────┬──────┘
       │
       ├───> Backend (Port 8080) - Primary
       └───> Backend-New (Port 8082) - Standby (during deployment)
```

## Prerequisites

1. **Nginx running on host** (not in Docker)
2. **Backend containers** managed by Docker Compose
3. **Health check endpoint** available at `/actuator/health/readiness`
4. **Sufficient memory** for two backend containers during deployment (~24 GB)

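
The tool-related prerequisites above can be checked before a deployment starts. The following is a minimal sketch of such a pre-flight check, not the contents of `scripts/rolling-update.sh`; the function names `require_cmd` and `check_prerequisites` are hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical pre-flight check; function names are illustrative only.

# Succeeds if the named command is on PATH; otherwise prints an error and returns 1.
require_cmd() {
  if ! command -v "$1" >/dev/null 2>&1; then
    echo "missing required command: $1" >&2
    return 1
  fi
}

# Checks every command passed as an argument and reports all missing ones,
# returning non-zero if at least one is missing.
check_prerequisites() {
  local missing=0 cmd
  for cmd in "$@"; do
    require_cmd "$cmd" || missing=1
  done
  return "$missing"
}
```

A deployment script might call it as `check_prerequisites docker docker-compose nginx curl || exit 1`, so that a missing tool is reported before any container is built.
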
## Quick Start

### 1. Make Script Executable

```bash
cd /opt/app/backend/lottery-be
chmod +x scripts/rolling-update.sh
```

### 2. Run Deployment

```bash
# Load database password (if not already set)
source scripts/load-db-password.sh

# Run rolling update
sudo ./scripts/rolling-update.sh
```

That's it! The script handles everything automatically.

## What the Script Does

1. **Checks prerequisites**:
   - Verifies Docker and Nginx are available
   - Ensures primary backend is running
   - Loads database password

2. **Builds new image**:
   - Builds the `backend-new` service
   - Uses Docker Compose build cache for speed

3. **Starts new container**:
   - Starts `lottery-backend-new` on port 8082
   - Waits for container initialization

4. **Health checks**:
   - Checks `/actuator/health/readiness` endpoint
   - Retries up to 30 times at 2-second intervals (60 seconds total)
   - Fails deployment if health check doesn't pass

5. **Updates Nginx**:
   - Backs up current Nginx config
   - Updates upstream to point to port 8082
   - Sets old backend (8080) as backup
   - Tests Nginx configuration

6. **Reloads Nginx**:
   - Uses `systemctl reload nginx` (zero downtime)
   - New connections go to the new backend, while in-flight requests finish on the old Nginx worker processes

7. **Stops old container**:
   - Waits 10 seconds grace period
   - Stops old backend container
   - Old container can be removed or kept for rollback

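
The health-check step (retry until ready, otherwise fail the deployment) can be sketched as a small retry loop. This is an illustration under assumptions, not the script's actual code: the function names `probe_readiness` and `wait_until_ready` are hypothetical, and the default probe assumes the readiness endpoint answers with a non-2xx status while the application is not ready.

```bash
#!/usr/bin/env bash
# Sketch of the retry loop; defaults mirror the values quoted in this guide.
HEALTH_CHECK_RETRIES="${HEALTH_CHECK_RETRIES:-30}"
HEALTH_CHECK_INTERVAL="${HEALTH_CHECK_INTERVAL:-2}"

# Default probe: curl -f turns HTTP error statuses into a non-zero exit code.
probe_readiness() {
  curl -fsS --max-time 2 "http://127.0.0.1:8082/actuator/health/readiness" >/dev/null 2>&1
}

# Runs the probe command until it succeeds or the retries are exhausted.
# Usage: wait_until_ready [probe_command [args...]]
wait_until_ready() {
  local attempt=1
  [ "$#" -gt 0 ] || set -- probe_readiness
  while [ "$attempt" -le "$HEALTH_CHECK_RETRIES" ]; do
    if "$@"; then
      echo "ready after $attempt attempt(s)"
      return 0
    fi
    attempt=$((attempt + 1))
    sleep "$HEALTH_CHECK_INTERVAL"
  done
  echo "not ready after $HEALTH_CHECK_RETRIES attempts" >&2
  return 1
}
```

Passing the probe as an argument keeps the loop independent of curl, so the same function can wrap any other readiness command.
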
## Manual Steps (If Needed)

If you prefer to do it manually or need to troubleshoot:

### Step 1: Build New Image

```bash
cd /opt/app/backend/lottery-be
source scripts/load-db-password.sh
docker-compose -f docker-compose.prod.yml --profile rolling-update build backend-new
```

### Step 2: Start New Container

```bash
docker-compose -f docker-compose.prod.yml --profile rolling-update up -d backend-new
```

### Step 3: Health Check

```bash
# Wait for container to be ready
sleep 10

# Check health
curl http://127.0.0.1:8082/actuator/health/readiness

# Check logs
docker logs lottery-backend-new
```

### Step 4: Update Nginx

```bash
# Backup config
sudo cp /etc/nginx/conf.d/lottery.conf /etc/nginx/conf.d/lottery.conf.backup

# Edit config
sudo nano /etc/nginx/conf.d/lottery.conf
```

Change upstream from:

```nginx
upstream lottery_backend {
    server 127.0.0.1:8080 max_fails=3 fail_timeout=30s;
}
```

To:

```nginx
upstream lottery_backend {
    server 127.0.0.1:8082 max_fails=3 fail_timeout=30s;
    server 127.0.0.1:8080 backup;
}
```

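
The manual edit above can also be scripted. The following is a minimal sketch, assuming the config contains a single `server 127.0.0.1:<old_port> ...;` line in the form shown; the helper name `switch_upstream` is hypothetical, and this is not necessarily how `scripts/rolling-update.sh` performs the change. It uses `awk` with a temporary file rather than in-place `sed`, whose options differ between platforms.

```bash
#!/usr/bin/env bash
# Hypothetical helper: makes NEW_PORT the primary upstream and OLD_PORT the backup.
# Usage: switch_upstream CONF NEW_PORT OLD_PORT
switch_upstream() {
  local conf="$1" new_port="$2" old_port="$3" tmp
  cp "$conf" "$conf.backup" || return 1   # same backup name as the manual step
  tmp="$(mktemp)" || return 1
  awk -v new="$new_port" -v old="$old_port" '
    {
      addr = $2
      sub(/;$/, "", addr)
    }
    # The old primary line: a "server" directive for the old port, not yet a backup.
    $1 == "server" && addr == ("127.0.0.1:" old) && $0 !~ /backup/ {
      match($0, /^[ \t]*/)
      indent = substr($0, 1, RLENGTH)
      opts = $0
      sub(/^[ \t]*server[ \t]+[^ \t;]+/, "", opts)   # e.g. " max_fails=3 fail_timeout=30s;"
      print indent "server 127.0.0.1:" new opts
      print indent "server 127.0.0.1:" old " backup;"
      next
    }
    { print }
  ' "$conf" > "$tmp" || { rm -f "$tmp"; return 1; }
  cat "$tmp" > "$conf" && rm -f "$tmp"     # overwrite in place to keep file ownership and mode
}
```

For example, `sudo bash -c '. ./switch.sh; switch_upstream /etc/nginx/conf.d/lottery.conf 8082 8080'` (with `switch.sh` being a hypothetical file holding the function) would produce the second upstream block shown above. The result should still be validated with `nginx -t` before reloading.
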
### Step 5: Reload Nginx

```bash
# Test config
sudo nginx -t

# Reload (zero downtime)
sudo systemctl reload nginx
```

### Step 6: Stop Old Container

```bash
# Wait for active connections to finish
sleep 10

# Stop old container
docker-compose -f docker-compose.prod.yml stop backend
```

## Rollback Procedure

If something goes wrong, you can quickly roll back:

### Automatic Rollback

The script automatically rolls back if:

- Health check fails
- Nginx config test fails
- Nginx reload fails

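
The Nginx-related rollback conditions above follow one pattern: run a step, and restore the backed-up config if it fails. The sketch below illustrates that pattern only; the helper name `run_or_restore` is hypothetical, it assumes a backup already exists at `<config>.backup`, and it is not taken from `scripts/rolling-update.sh`.

```bash
#!/usr/bin/env bash
# Hypothetical helper: runs a command and restores CONF from CONF.backup if it fails.
# Usage: run_or_restore CONF command [args...]
run_or_restore() {
  local conf="$1"
  shift
  if "$@"; then
    return 0
  fi
  echo "step failed: $*; restoring $conf from backup" >&2
  cp "$conf.backup" "$conf"
  return 1
}
```

A caller could chain the steps, for example `run_or_restore /etc/nginx/conf.d/lottery.conf nginx -t && run_or_restore /etc/nginx/conf.d/lottery.conf systemctl reload nginx`. After a restore, the remaining cleanup (reloading Nginx and stopping `backend-new`) is the same as in the manual rollback.
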
### Manual Rollback

```bash
# 1. Restore Nginx config
sudo cp /etc/nginx/conf.d/lottery.conf.backup /etc/nginx/conf.d/lottery.conf
sudo systemctl reload nginx

# 2. Start old backend (if stopped)
cd /opt/app/backend/lottery-be
docker-compose -f docker-compose.prod.yml start backend

# 3. Stop new backend
docker-compose -f docker-compose.prod.yml --profile rolling-update stop backend-new
docker-compose -f docker-compose.prod.yml --profile rolling-update rm -f backend-new
```

## Configuration

### Health Check Settings

Edit `scripts/rolling-update.sh` to adjust:

```bash
HEALTH_CHECK_RETRIES=30    # Number of retries
HEALTH_CHECK_INTERVAL=2    # Seconds between retries
GRACE_PERIOD=10            # Seconds to wait before stopping old container
```

### Nginx Upstream Settings

Edit `/etc/nginx/conf.d/lottery.conf`:

```nginx
upstream lottery_backend {
    server 127.0.0.1:8082 max_fails=3 fail_timeout=30s;
    server 127.0.0.1:8080 backup;  # Old backend as backup
    keepalive 32;
}
```

## Monitoring

### During Deployment

```bash
# Watch container status
watch -n 1 'docker ps | grep lottery-backend'

# Monitor new backend logs
docker logs -f lottery-backend-new

# Check Nginx access logs
sudo tail -f /var/log/nginx/access.log

# Monitor memory usage
free -h
docker stats --no-stream
```

### After Deployment

```bash
# Verify new backend is serving traffic
curl http://localhost/api/health

# Check container status
docker ps | grep lottery-backend

# Verify Nginx upstream
curl http://localhost/actuator/health
```

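
The curl checks above confirm that some backend answers, but not which one Nginx treats as primary. As a hedged sketch, the loaded configuration can be filtered for `server` lines that are not marked `backup`. The function name `primary_upstreams` is hypothetical, and the filter assumes upstream servers are written as `host:port`, as in this guide.

```bash
#!/usr/bin/env bash
# Hypothetical helper: prints the address of each non-backup upstream "server"
# line read from stdin. Lines such as "server {" are skipped because their
# second field does not end in ":<port>".
primary_upstreams() {
  awk '$1 == "server" && $2 ~ /:[0-9]+;?$/ && $0 !~ /backup/ {
    addr = $2
    sub(/;$/, "", addr)
    print addr
  }'
}
```

Running `sudo nginx -T 2>/dev/null | primary_upstreams` after a deployment would be expected to list `127.0.0.1:8082` if the switch took effect.
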
## Troubleshooting

### Health Check Fails

```bash
# Check new container logs
docker logs lottery-backend-new

# Check if container is running
docker ps | grep lottery-backend-new

# Test health endpoint directly
curl -v http://127.0.0.1:8082/actuator/health/readiness

# Check overall health, including the database connection, from inside the container
docker exec lottery-backend-new wget -q -O- http://localhost:8080/actuator/health
```

### Nginx Reload Fails

```bash
# Test Nginx config
sudo nginx -t

# Check Nginx error logs
sudo tail -f /var/log/nginx/error.log

# Verify upstream syntax
sudo nginx -T | grep -A 5 upstream
```

### Memory Issues

If you run out of memory during deployment:

```bash
# Check memory usage
free -h
docker stats --no-stream

# Option 1: Reduce heap size temporarily
# Edit docker-compose.prod.yml, change JAVA_OPTS to use 8GB heap

# Option 2: Stop other services temporarily
docker stop lottery-phpmyadmin  # If not needed
```

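
Memory pressure is easier to handle before the second container starts than after. The sketch below reads `MemAvailable` from `/proc/meminfo` (Linux only) and compares it with a threshold. The function names `available_gib` and `has_enough_memory` are hypothetical, and the right threshold is an assumption to tune: roughly 12 GiB if the two containers behind the ~24 GB figure are sized equally.

```bash
#!/usr/bin/env bash
# Hypothetical helpers for a memory pre-check (Linux /proc/meminfo format).

# Prints MemAvailable in whole GiB; the file path is a parameter for testability.
available_gib() {
  local meminfo="${1:-/proc/meminfo}"
  awk '/^MemAvailable:/ { print int($2 / 1024 / 1024) }' "$meminfo"
}

# Succeeds only if at least REQUIRED_GIB of memory is available.
# Usage: has_enough_memory REQUIRED_GIB [MEMINFO_FILE]
has_enough_memory() {
  local required_gib="$1" meminfo="${2:-/proc/meminfo}" avail
  avail="$(available_gib "$meminfo")"
  [ -n "$avail" ] && [ "$avail" -ge "$required_gib" ]
}
```

A deployment could then begin with `has_enough_memory 12 || { echo "not enough free memory for a second backend" >&2; exit 1; }`.
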
### Old Container Won't Stop

```bash
# Stop by container name (sends SIGTERM, then SIGKILL after the stop timeout)
docker stop lottery-backend

# If still running, kill it immediately (SIGKILL)
docker kill lottery-backend

# Remove container
docker rm lottery-backend
```

## Best Practices

1. **Test in staging first** - Always test the deployment process in a staging environment

2. **Monitor during deployment** - Watch logs and metrics during the first few deployments

3. **Keep backups** - The script automatically backs up Nginx config, but keep your own backups too

4. **Database migrations** - Ensure migrations are backward compatible or run them separately, because old and new containers use the same database during the switch

5. **Gradual rollout** - For major changes, consider deploying during low-traffic periods

6. **Health checks** - Ensure your health check endpoint properly validates all dependencies

7. **Graceful shutdown** - Spring Boot graceful shutdown (30s) allows active requests to finish, provided the container's stop timeout is at least that long (Docker's default stop timeout is 10 seconds)

## Performance Considerations

- **Build time**: First build takes longer, subsequent builds use cache
- **Memory**: Two containers use ~24 GB during deployment (brief period)
- **Network**: No network interruption, Nginx handles the switch seamlessly
- **Database**: No impact, both containers share the same database

## Security Notes

- New container uses same secrets and configuration as old one
- No exposure of new port to internet (only localhost)
- Nginx handles all external traffic
- Health checks are internal only

## Next Steps

After successful deployment:

1. ✅ Monitor new backend for errors
2. ✅ Verify all endpoints are working
3. ✅ Check application logs
4. ✅ Remove old container image (optional): `docker image prune`

## Support

If you encounter issues:

1. Check logs: `docker logs lottery-backend-new`
2. Check Nginx: `sudo nginx -t && sudo tail -f /var/log/nginx/error.log`
3. Rollback if needed (see Rollback Procedure above)
4. Review this guide's Troubleshooting section