Deployment Guide

This guide covers environment configuration, Docker deployment, monitoring, and production best practices for the General Socket service.


Environment Variables

Core Configuration

# Service Identity
PORT=4000
SERVICE_NAME=general-socket

# MongoDB Connection
MONGO_HOST=mongodb://localhost:27017/dashclicks
MONGO_CONNECTION_STRING=mongodb://username:password@host:27017/dashclicks

# Redis Configuration (Required for Socket.IO adapter)
REDIS_HOST=localhost
REDIS_PORT=6379
REDIS_PASSWORD=your-redis-password

# API Endpoints
INTERNAL_API_URL=http://localhost:5002
EXTERNAL_API_URL=http://localhost:5003

# JWT Authentication
JWT_SECRET=your-jwt-secret-key
JWT_ISSUER=dashclicks-auth
JWT_AUDIENCE=dashclicks-api

# CORS Configuration
ALLOWED_ORIGINS=http://localhost:3000,https://app.dashclicks.com

Socket.IO Configuration

# Connection Settings
SOCKET_PING_TIMEOUT=60000 # 60 seconds
SOCKET_PING_INTERVAL=25000 # 25 seconds
SOCKET_MAX_HTTP_BUFFER_SIZE=1048576 # 1MB

# Performance Tuning
SOCKET_TRANSPORTS=["websocket", "polling"]
SOCKET_UPGRADE_TIMEOUT=10000 # 10 seconds
SOCKET_CLOSE_TIMEOUT=60000 # 60 seconds

Logging & Monitoring

# Log Level
LOG_LEVEL=info # debug, info, warn, error

# Monitoring
ENABLE_METRICS=true
METRICS_PORT=9090

# Sentry Error Tracking
SENTRY_DSN=https://your-sentry-dsn@sentry.io/project-id
SENTRY_ENVIRONMENT=production
SENTRY_TRACES_SAMPLE_RATE=0.1

Feature Flags

# Enable/Disable Namespaces
ENABLE_CONVERSATION=true
ENABLE_LIVECHAT=true
ENABLE_LEADFINDER=true
ENABLE_SHARED=true

# Rate Limiting
ENABLE_RATE_LIMITING=true
RATE_LIMIT_WINDOW_MS=60000 # 1 minute
RATE_LIMIT_MAX_REQUESTS=100 # per window
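
The `ENABLE_*` flags above can drive which namespaces the server mounts at startup. A sketch of the mapping: `/v1/conversation` appears elsewhere in these docs, while the other three paths are assumptions following the same pattern.

```javascript
// Map ENABLE_* environment variables to Socket.IO namespace paths.
// Only /v1/conversation is confirmed by the docs; the rest are assumed.
const NAMESPACE_FLAGS = {
  ENABLE_CONVERSATION: '/v1/conversation',
  ENABLE_LIVECHAT: '/v1/livechat',
  ENABLE_LEADFINDER: '/v1/leadfinder',
  ENABLE_SHARED: '/v1/shared',
};

// Env values are strings; anything except the literal 'false' counts as on.
function enabledNamespaces(env) {
  return Object.entries(NAMESPACE_FLAGS)
    .filter(([flag]) => env[flag] !== 'false')
    .map(([, path]) => path);
}

// Usage sketch: enabledNamespaces(process.env).forEach(path => io.of(path));
```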

Docker Deployment

Dockerfile

FROM node:18-alpine

WORKDIR /app

# Install dependencies
COPY package*.json ./
RUN npm ci --only=production

# Copy application code
COPY . .

# Copy shared models and utilities
# NOTE: COPY cannot reference paths outside the build context, so "../" will
# fail at build time. Build with the repository root as the context (adjusting
# the paths in this file accordingly), or vendor the shared code into this
# service's directory first.
COPY shared/models ./models
COPY shared/utilities ./utilities

# Expose port
EXPOSE 4000

# Health check
HEALTHCHECK --interval=30s --timeout=3s --start-period=40s --retries=3 \
CMD node -e "require('http').get('http://localhost:4000/health', (r) => r.statusCode === 200 ? process.exit(0) : process.exit(1))"

# Start service
CMD ["node", "index.js"]

Docker Compose

version: '3.8'

services:
  general-socket:
    build:
      context: .
      dockerfile: Dockerfile
    container_name: general-socket
    restart: unless-stopped
    ports:
      - '4000:4000'
    environment:
      - PORT=4000
      - MONGO_HOST=mongodb://mongo:27017/dashclicks
      - REDIS_HOST=redis
      - REDIS_PORT=6379
      - REDIS_PASSWORD=${REDIS_PASSWORD}
      - JWT_SECRET=${JWT_SECRET}
      - INTERNAL_API_URL=http://internal-api:5002
      - LOG_LEVEL=info
    depends_on:
      - mongo
      - redis
    networks:
      - dashclicks-network
    volumes:
      - ./logs:/app/logs

  redis:
    image: redis:7-alpine
    container_name: redis
    restart: unless-stopped
    ports:
      - '6379:6379'
    command: redis-server --requirepass ${REDIS_PASSWORD}
    volumes:
      - redis-data:/data
    networks:
      - dashclicks-network

  mongo:
    image: mongo:6
    container_name: mongo
    restart: unless-stopped
    ports:
      - '27017:27017'
    environment:
      - MONGO_INITDB_ROOT_USERNAME=${MONGO_USERNAME}
      - MONGO_INITDB_ROOT_PASSWORD=${MONGO_PASSWORD}
    volumes:
      - mongo-data:/data/db
    networks:
      - dashclicks-network

volumes:
  redis-data:
  mongo-data:

networks:
  dashclicks-network:
    driver: bridge

Multi-Instance Deployment (Horizontal Scaling)

version: '3.8'

services:
  general-socket-1:
    build: .
    environment:
      - PORT=4000
      - REDIS_HOST=redis
      - INSTANCE_ID=1
    ports:
      - '4001:4000'
    networks:
      - dashclicks-network

  general-socket-2:
    build: .
    environment:
      - PORT=4000
      - REDIS_HOST=redis
      - INSTANCE_ID=2
    ports:
      - '4002:4000'
    networks:
      - dashclicks-network

  general-socket-3:
    build: .
    environment:
      - PORT=4000
      - REDIS_HOST=redis
      - INSTANCE_ID=3
    ports:
      - '4003:4000'
    networks:
      - dashclicks-network

  nginx:
    image: nginx:alpine
    ports:
      - '4000:80'
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf:ro
    depends_on:
      - general-socket-1
      - general-socket-2
      - general-socket-3
    networks:
      - dashclicks-network

networks:
  dashclicks-network:
    driver: bridge

Nginx Load Balancer Configuration

http {
    upstream general_socket {
        ip_hash; # Sticky sessions for Socket.IO
        server general-socket-1:4000;
        server general-socket-2:4000;
        server general-socket-3:4000;
    }

    server {
        listen 80;
        server_name socket.dashclicks.com;

        location / {
            proxy_pass http://general_socket;
            proxy_http_version 1.1;
            proxy_set_header Upgrade $http_upgrade;
            proxy_set_header Connection "upgrade";
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header X-Forwarded-Proto $scheme;

            # Timeouts: long read/send timeouts keep idle WebSockets open.
            # proxy_connect_timeout cannot usually exceed 75 seconds.
            proxy_connect_timeout 75s;
            proxy_send_timeout 7d;
            proxy_read_timeout 7d;
        }

        location /health {
            proxy_pass http://general_socket/health;
            access_log off;
        }
    }
}

Kubernetes Deployment

Deployment Manifest

apiVersion: apps/v1
kind: Deployment
metadata:
  name: general-socket
  namespace: dashclicks
spec:
  replicas: 3
  selector:
    matchLabels:
      app: general-socket
  template:
    metadata:
      labels:
        app: general-socket
    spec:
      containers:
        - name: general-socket
          image: dashclicks/general-socket:latest
          ports:
            - containerPort: 4000
              name: http
          env:
            - name: PORT
              value: '4000'
            - name: REDIS_HOST
              value: redis-service
            - name: REDIS_PORT
              value: '6379'
            - name: MONGO_HOST
              valueFrom:
                secretKeyRef:
                  name: mongo-secret
                  key: connection-string
            - name: JWT_SECRET
              valueFrom:
                secretKeyRef:
                  name: jwt-secret
                  key: secret
          resources:
            requests:
              memory: '256Mi'
              cpu: '250m'
            limits:
              memory: '512Mi'
              cpu: '500m'
          livenessProbe:
            httpGet:
              path: /health
              port: 4000
            initialDelaySeconds: 30
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /health
              port: 4000
            initialDelaySeconds: 10
            periodSeconds: 5

Service Manifest

apiVersion: v1
kind: Service
metadata:
  name: general-socket-service
  namespace: dashclicks
spec:
  type: LoadBalancer
  sessionAffinity: ClientIP # Sticky sessions for Socket.IO
  selector:
    app: general-socket
  ports:
    - port: 4000
      targetPort: 4000
      protocol: TCP

Ingress Manifest

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: general-socket-ingress
  namespace: dashclicks
  annotations:
    nginx.ingress.kubernetes.io/websocket-services: general-socket-service
    nginx.ingress.kubernetes.io/proxy-read-timeout: '3600'
    nginx.ingress.kubernetes.io/proxy-send-timeout: '3600'
spec:
  rules:
    - host: socket.dashclicks.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: general-socket-service
                port:
                  number: 4000
  tls:
    - hosts:
        - socket.dashclicks.com
      secretName: socket-tls-secret
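
The fixed `replicas: 3` above can be made elastic with a HorizontalPodAutoscaler targeting the same Deployment. A sketch (the HPA name and the 70% CPU target are illustrative choices, not values from this project):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: general-socket-hpa
  namespace: dashclicks
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: general-socket
  minReplicas: 3
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```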

Monitoring & Observability

Health Check Endpoint

// Add to index.js (assumes the mongoose connection, a node-redis v4 client
// named redisClient, and the Socket.IO server io are all in scope)
app.get('/health', (req, res) => {
  const health = {
    status: 'UP',
    timestamp: new Date().toISOString(),
    uptime: process.uptime(),
    checks: {
      mongodb: mongoose.connection.readyState === 1 ? 'UP' : 'DOWN',
      redis: redisClient.isReady ? 'UP' : 'DOWN',
      socketConnections: io.engine.clientsCount,
    },
  };

  res.status(200).json(health);
});

Prometheus Metrics

// Install: npm install prom-client
const promClient = require('prom-client');

// Metrics
const socketConnectionsGauge = new promClient.Gauge({
  name: 'socket_connections_total',
  help: 'Total number of active socket connections',
});

const socketEventsCounter = new promClient.Counter({
  name: 'socket_events_total',
  help: 'Total number of socket events',
  labelNames: ['event', 'namespace'],
});

const socketEmissionLatency = new promClient.Histogram({
  name: 'socket_emission_duration_seconds',
  help: 'Socket emission latency',
  labelNames: ['event'],
});

// Update metrics
io.on('connection', socket => {
  socketConnectionsGauge.inc();

  socket.on('disconnect', () => {
    socketConnectionsGauge.dec();
  });

  socket.onAny((event, ...args) => {
    socketEventsCounter.inc({ event, namespace: socket.nsp.name });
  });
});

// Expose metrics endpoint (register.metrics() returns a Promise in prom-client v13+)
app.get('/metrics', async (req, res) => {
  res.set('Content-Type', promClient.register.contentType);
  res.end(await promClient.register.metrics());
});

Grafana Dashboard

Key Metrics to Monitor:

  • Active socket connections
  • Events per second (by namespace)
  • Message emission latency
  • Redis pub/sub lag
  • Memory usage
  • CPU usage
  • Error rate

Example Queries:

# Active connections
socket_connections_total

# Event rate
rate(socket_events_total[5m])

# 95th percentile latency (quantiles are computed from the histogram buckets)
histogram_quantile(0.95, rate(socket_emission_duration_seconds_bucket[5m]))

# Error rate
rate(socket_errors_total[5m])

Logging Best Practices

Structured Logging with Winston

const winston = require('winston');

const logger = winston.createLogger({
  level: process.env.LOG_LEVEL || 'info',
  format: winston.format.combine(
    winston.format.timestamp(),
    winston.format.errors({ stack: true }),
    winston.format.json(),
  ),
  defaultMeta: {
    service: 'general-socket',
    instance: process.env.INSTANCE_ID,
  },
  transports: [
    new winston.transports.File({
      filename: 'logs/error.log',
      level: 'error',
    }),
    new winston.transports.File({
      filename: 'logs/combined.log',
    }),
  ],
});

if (process.env.NODE_ENV !== 'production') {
  logger.add(
    new winston.transports.Console({
      format: winston.format.simple(),
    }),
  );
}

// Usage
io.on('connection', socket => {
  logger.info('Socket connected', {
    socketId: socket.id,
    userId: socket.userId,
    namespace: socket.nsp.name,
  });
});

Log Aggregation with ELK Stack

Filebeat Configuration:

filebeat.inputs:
  - type: log
    enabled: true
    paths:
      - /app/logs/*.log
    json.keys_under_root: true
    json.add_error_key: true

output.elasticsearch:
  hosts: ['elasticsearch:9200']
  index: 'general-socket-%{+yyyy.MM.dd}'

setup.kibana:
  host: 'kibana:5601'

Testing

Unit Tests

# Run all tests
npm test

# Run with coverage
npm run coverage

# Run specific test suite
npm test -- --testPathPattern=conversation

Integration Tests

// tests/integration/socket.test.js
const io = require('socket.io-client');

describe('Socket Connection', () => {
  let socket;

  beforeAll(() => {
    // validJWT: a token signed with the server's JWT_SECRET (create in test setup)
    socket = io('http://localhost:4000/v1/conversation', {
      auth: { token: validJWT },
    });
  });

  afterAll(() => {
    socket.close();
  });

  test('should connect with valid JWT', done => {
    socket.on('connect', () => {
      expect(socket.connected).toBe(true);
      done();
    });
  });

  test('should emit and receive message', done => {
    // Register the listener before emitting to avoid missing the response
    socket.on('convo_new_message', data => {
      expect(data.message).toBe('Hello');
      done();
    });

    socket.emit('convo_send_message', {
      conversationId: 'test-convo',
      message: 'Hello',
    });
  });
});

Load Testing with Artillery

# artillery-load-test.yml
config:
  target: 'http://localhost:4000'
  socketio:
    transports: ['websocket']
  phases:
    - duration: 60
      arrivalRate: 10
      name: 'Warm up'
    - duration: 300
      arrivalRate: 50
      name: 'Sustained load'
    - duration: 60
      arrivalRate: 100
      name: 'Spike test'

scenarios:
  - name: 'Socket Connection and Message'
    engine: socketio
    flow:
      - emit:
          channel: 'convo_join'
          data:
            conversationId: 'test-convo-{{ $randomNumber() }}'
      - think: 3
      - emit:
          channel: 'convo_send_message'
          data:
            conversationId: 'test-convo-{{ $randomNumber() }}'
            message: 'Load test message'
      - think: 5

Run Load Test:

artillery run artillery-load-test.yml

Production Checklist

Pre-Deployment

  • Environment variables configured
  • JWT secret set (strong, random)
  • Redis connection tested
  • MongoDB connection tested
  • CORS origins whitelisted
  • SSL/TLS certificates installed
  • Load balancer configured with sticky sessions
  • Health checks enabled
  • Monitoring dashboards created

Security

  • JWT authentication enforced (except LeadFinder)
  • Rate limiting enabled
  • Input validation on all socket events
  • NoSQL injection prevention (sanitize query operators; Mongoose schema casting helps)
  • XSS prevention (sanitize user input)
  • API key authentication for REST endpoints
  • Network isolation (internal services only)
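
The rate-limiting item above maps to the `RATE_LIMIT_*` variables from the environment section. A fixed-window, in-memory sketch of how they might be consumed (a multi-instance deployment would keep the counters in Redis instead):

```javascript
// Fixed-window rate limiter driven by RATE_LIMIT_WINDOW_MS / RATE_LIMIT_MAX_REQUESTS.
// Returns an allow(key) function; key is typically a socket or user id.
function createRateLimiter({ windowMs = 60000, maxRequests = 100 } = {}) {
  const windows = new Map(); // key -> { start, count }
  return function allow(key, now = Date.now()) {
    const w = windows.get(key);
    if (!w || now - w.start >= windowMs) {
      windows.set(key, { start: now, count: 1 }); // open a fresh window
      return true;
    }
    w.count += 1;
    return w.count <= maxRequests;
  };
}

// Wiring sketch (hypothetical event name):
// const allow = createRateLimiter({
//   windowMs: Number(process.env.RATE_LIMIT_WINDOW_MS) || 60000,
//   maxRequests: Number(process.env.RATE_LIMIT_MAX_REQUESTS) || 100,
// });
// socket.onAny(event => { if (!allow(socket.id)) socket.emit('rate_limited', { event }); });
```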

Performance

  • Redis adapter enabled for multi-instance
  • Connection pooling optimized
  • Socket timeout configured (60s)
  • Message size limits enforced (1MB)
  • Database indexes created
  • Query optimization completed
  • Caching strategy implemented

Monitoring

  • Prometheus metrics exposed
  • Grafana dashboards configured
  • Error tracking (Sentry) enabled
  • Log aggregation (ELK) configured
  • Alerts configured (high error rate, low connections)
  • Uptime monitoring (Pingdom/UptimeRobot)

Disaster Recovery

  • Database backups automated (daily)
  • Redis persistence enabled (AOF)
  • Graceful shutdown implemented
  • Auto-restart on crash (PM2/systemd)
  • Rollback plan documented
  • Incident response plan created
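
The "graceful shutdown" item above can be implemented with signal handlers in index.js. A sketch, assuming `server` is the HTTP server and `io` the Socket.IO instance:

```javascript
// Graceful shutdown: stop accepting new connections, close existing sockets,
// and fall back to a hard exit if cleanup stalls.
function registerGracefulShutdown(server, io, { timeoutMs = 10000 } = {}) {
  const shutdown = signal => {
    console.log(`${signal} received, shutting down`);
    server.close(() => process.exit(0)); // stop new HTTP/WebSocket connections
    io.close();                          // disconnect existing sockets
    // Hard deadline; unref() so the timer never keeps the process alive.
    setTimeout(() => process.exit(1), timeoutMs).unref();
  };
  process.on('SIGTERM', () => shutdown('SIGTERM'));
  process.on('SIGINT', () => shutdown('SIGINT'));
  return shutdown;
}
```

Pairing this with `restart: unless-stopped` (Docker) or the Kubernetes probes above lets the orchestrator replace the instance cleanly.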

Troubleshooting

Common Issues

Socket Connections Not Working

Symptoms: Clients can't connect, timeout errors

Solutions:

  1. Check CORS configuration matches frontend origin
  2. Verify JWT token is valid and not expired
  3. Check firewall allows WebSocket connections (port 4000)
  4. Ensure Nginx/Load balancer has WebSocket upgrade headers
  5. Check Redis connection (Redis adapter requirement)

Redis Connection Errors

Symptoms: Error: Redis connection refused

Solutions:

  1. Verify Redis is running: redis-cli ping
  2. Check Redis host/port in environment variables
  3. Verify Redis password if authentication enabled
  4. Check network connectivity to Redis server
  5. Review Redis logs for errors

High Memory Usage

Symptoms: Memory usage increasing over time

Solutions:

  1. Check for socket memory leaks (disconnect listeners)
  2. Monitor active socket connections (may have stale sockets)
  3. Implement socket connection limits per user
  4. Review message buffering strategy
  5. Enable Redis persistence to offload memory
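
Point 3 above (connection limits per user) can be sketched as an in-memory counter; a multi-instance deployment would keep the counts in Redis instead, and the limit of 5 is illustrative:

```javascript
// Cap concurrent sockets per user to bound memory from stale or abusive clients.
const MAX_SOCKETS_PER_USER = 5; // illustrative limit

const socketsPerUser = new Map();

function acquireSocketSlot(userId) {
  const count = socketsPerUser.get(userId) || 0;
  if (count >= MAX_SOCKETS_PER_USER) return false; // reject the connection
  socketsPerUser.set(userId, count + 1);
  return true;
}

function releaseSocketSlot(userId) {
  const count = (socketsPerUser.get(userId) || 1) - 1;
  if (count > 0) socketsPerUser.set(userId, count);
  else socketsPerUser.delete(userId); // drop empty entries to avoid leaks
}

// Wiring sketch:
// io.use((socket, next) =>
//   acquireSocketSlot(socket.userId) ? next() : next(new Error('too many connections')));
// io.on('connection', socket =>
//   socket.on('disconnect', () => releaseSocketSlot(socket.userId)));
```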

Message Delivery Failures

Symptoms: Messages not reaching clients

Solutions:

  1. Verify user is in correct Socket.IO room
  2. Check socket ID exists in userSocketObj
  3. Review emit logic (ensure correct namespace)
  4. Check client-side event listeners are registered
  5. Verify MongoDB Communication record created

Scaling Strategies

Vertical Scaling

  • Increase CPU/memory allocation
  • Optimize Node.js event loop
  • Use clustering (PM2)

Horizontal Scaling

  • Deploy multiple instances behind load balancer
  • Use Redis adapter for pub/sub
  • Implement sticky sessions (IP hash)
  • Database read replicas for queries
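
The Redis adapter in the list above is what lets a broadcast from one instance reach sockets connected to another. A wiring sketch: the package names are the official `@socket.io/redis-adapter` and `redis` packages, and the URL helper reflects the `REDIS_*` variables from the environment section.

```javascript
// Build a Redis connection URL from the REDIS_* environment variables.
function redisUrlFromEnv(env) {
  const auth = env.REDIS_PASSWORD ? `:${env.REDIS_PASSWORD}@` : '';
  return `redis://${auth}${env.REDIS_HOST || 'localhost'}:${env.REDIS_PORT || 6379}`;
}

// Attach the Redis adapter so emits fan out across all instances.
async function attachRedisAdapter(io, env = process.env) {
  const { createAdapter } = require('@socket.io/redis-adapter');
  const { createClient } = require('redis');
  const pubClient = createClient({ url: redisUrlFromEnv(env) });
  const subClient = pubClient.duplicate(); // pub/sub need separate connections
  await Promise.all([pubClient.connect(), subClient.connect()]);
  io.adapter(createAdapter(pubClient, subClient));
}
```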

Database Optimization

  • Index frequently queried fields
  • Use aggregation pipelines for complex queries
  • Implement caching layer (Redis)
  • Archive old messages/conversations

Redis Optimization

  • Use Redis Cluster for high availability
  • Enable Redis persistence (AOF + RDB)
  • Monitor Redis memory usage
  • Set eviction policy (allkeys-lru)

Maintenance

Regular Tasks

  • Daily: Monitor error logs and metrics
  • Weekly: Review socket connection patterns
  • Monthly: Database cleanup (old messages)
  • Quarterly: Security audit and dependency updates

Upgrade Process

  1. Test new version in staging environment
  2. Deploy to single instance (canary deployment)
  3. Monitor metrics for 1 hour
  4. Gradually roll out to remaining instances
  5. Keep previous version ready for rollback
