graceful-shutdown
About
The graceful-shutdown skill enables applications to handle SIGTERM signals properly by stopping new requests while completing in-flight operations. It ensures clean termination by draining connections, closing resources, and flushing logs before exit. Use this for containerized deployments, server restarts, and zero-downtime scenarios in Kubernetes, Docker, or process managers.
Documentation
Graceful Shutdown
Overview
Implement proper shutdown procedures to ensure all requests are completed, connections are closed, and resources are released before process termination.
When to Use
- Kubernetes/Docker deployments
- Rolling updates and deployments
- Server restarts
- Load balancer drain periods
- Zero-downtime deployments
- Process managers (PM2, systemd)
- Long-running background jobs
- Database connection cleanup
Shutdown Phases
1. Receive SIGTERM signal
2. Stop accepting new requests
3. Drain active connections
4. Complete in-flight requests
5. Close database connections
6. Flush logs and metrics
7. Exit process
Implementation Examples
1. Express.js Graceful Shutdown
import express from 'express';
import http from 'http';
class GracefulShutdownServer {
private app: express.Application;
private server: http.Server;
private isShuttingDown = false;
private activeConnections = new Set<any>();
private shutdownTimeout = 30000; // 30 seconds
constructor() {
this.app = express();
this.server = http.createServer(this.app);
this.setupMiddleware();
this.setupRoutes();
this.setupShutdownHandlers();
}
private setupMiddleware(): void {
// Track active connections
this.app.use((req, res, next) => {
if (this.isShuttingDown) {
res.set('Connection', 'close');
return res.status(503).json({
error: 'Server is shutting down'
});
}
this.activeConnections.add(res);
res.on('finish', () => {
this.activeConnections.delete(res);
});
res.on('close', () => {
this.activeConnections.delete(res);
});
next();
});
}
private setupRoutes(): void {
this.app.get('/health', (req, res) => {
if (this.isShuttingDown) {
return res.status(503).json({ status: 'shutting_down' });
}
res.json({ status: 'ok' });
});
this.app.get('/api/data', async (req, res) => {
// Simulate long-running request
await new Promise(resolve => setTimeout(resolve, 5000));
res.json({ data: 'response' });
});
}
private setupShutdownHandlers(): void {
const signals: NodeJS.Signals[] = ['SIGTERM', 'SIGINT'];
signals.forEach(signal => {
process.on(signal, () => {
console.log(`Received ${signal}, starting graceful shutdown...`);
this.gracefulShutdown(signal);
});
});
// Handle uncaught exceptions
process.on('uncaughtException', (error) => {
console.error('Uncaught exception:', error);
this.gracefulShutdown('UNCAUGHT_EXCEPTION');
});
process.on('unhandledRejection', (reason, promise) => {
console.error('Unhandled rejection:', reason);
this.gracefulShutdown('UNHANDLED_REJECTION');
});
}
private async gracefulShutdown(signal: string): Promise<void> {
if (this.isShuttingDown) {
console.log('Shutdown already in progress');
return;
}
this.isShuttingDown = true;
console.log(`Starting graceful shutdown (${signal})`);
// Set shutdown timeout
const shutdownTimer = setTimeout(() => {
console.error('Shutdown timeout reached, forcing exit');
process.exit(1);
}, this.shutdownTimeout);
try {
// 1. Stop accepting new connections
await this.stopAcceptingConnections();
// 2. Wait for active requests to complete
await this.waitForActiveConnections();
// 3. Close server
await this.closeServer();
// 4. Cleanup resources
await this.cleanupResources();
console.log('Graceful shutdown completed');
clearTimeout(shutdownTimer);
process.exit(0);
} catch (error) {
console.error('Error during shutdown:', error);
clearTimeout(shutdownTimer);
process.exit(1);
}
}
private async stopAcceptingConnections(): Promise<void> {
console.log('Stopping new connections...');
return new Promise((resolve) => {
this.server.close(() => {
console.log('Server stopped accepting new connections');
resolve();
});
});
}
private async waitForActiveConnections(): Promise<void> {
console.log(`Waiting for ${this.activeConnections.size} active connections...`);
const checkInterval = 100;
const maxWait = this.shutdownTimeout - 5000;
let waited = 0;
while (this.activeConnections.size > 0 && waited < maxWait) {
await new Promise(resolve => setTimeout(resolve, checkInterval));
waited += checkInterval;
if (waited % 1000 === 0) {
console.log(`Still waiting for ${this.activeConnections.size} connections...`);
}
}
if (this.activeConnections.size > 0) {
console.warn(`Force closing ${this.activeConnections.size} remaining connections`);
this.activeConnections.forEach((res: any) => {
res.destroy();
});
}
console.log('All connections closed');
}
private async closeServer(): Promise<void> {
// Server already closed in stopAcceptingConnections
console.log('Server closed');
}
private async cleanupResources(): Promise<void> {
console.log('Cleaning up resources...');
// Close database connections
await this.closeDatabaseConnections();
// Flush logs
await this.flushLogs();
// Close any other resources
await this.closeOtherResources();
console.log('Resources cleaned up');
}
private async closeDatabaseConnections(): Promise<void> {
// Close database connections
console.log('Closing database connections...');
// await db.close();
}
private async flushLogs(): Promise<void> {
// Flush any pending logs
console.log('Flushing logs...');
}
private async closeOtherResources(): Promise<void> {
// Close Redis, message queues, etc.
console.log('Closing other resources...');
}
start(port: number): void {
this.server.listen(port, () => {
console.log(`Server listening on port ${port}`);
});
}
}
// Usage
const server = new GracefulShutdownServer();
server.start(3000);
2. Kubernetes-Aware Shutdown
class KubernetesGracefulShutdown {
private isReady = true;
private isLive = true;
private shutdownDelay = 5000; // K8s propagation delay
setupProbes(app: express.Application): void {
// Readiness probe
app.get('/health/ready', (req, res) => {
if (this.isReady) {
res.status(200).json({ status: 'ready' });
} else {
res.status(503).json({ status: 'not_ready' });
}
});
// Liveness probe
app.get('/health/live', (req, res) => {
if (this.isLive) {
res.status(200).json({ status: 'alive' });
} else {
res.status(503).json({ status: 'not_alive' });
}
});
}
async shutdown(): Promise<void> {
console.log('Kubernetes graceful shutdown initiated');
// 1. Mark as not ready (fail readiness probe)
this.isReady = false;
console.log('Marked as not ready');
// 2. Wait for K8s to remove pod from service endpoints
console.log(`Waiting ${this.shutdownDelay}ms for endpoint propagation...`);
await new Promise(resolve => setTimeout(resolve, this.shutdownDelay));
// 3. Continue with normal graceful shutdown
// ... rest of shutdown logic
}
}
3. Worker Process Shutdown
import Queue from 'bull';
class WorkerShutdown {
private queue: Queue.Queue;
private isProcessing = new Map<string, boolean>();
constructor(queue: Queue.Queue) {
this.queue = queue;
this.setupWorker();
this.setupShutdownHandlers();
}
private setupWorker(): void {
this.queue.process('task', 5, async (job) => {
const jobId = job.id!.toString();
this.isProcessing.set(jobId, true);
try {
console.log(`Processing job ${jobId}`);
await this.processJob(job);
console.log(`Completed job ${jobId}`);
} finally {
this.isProcessing.delete(jobId);
}
});
}
private async processJob(job: Queue.Job): Promise<void> {
// Job processing logic
await new Promise(resolve => setTimeout(resolve, 5000));
}
private setupShutdownHandlers(): void {
process.on('SIGTERM', () => {
console.log('SIGTERM received, shutting down worker...');
this.shutdownWorker();
});
}
private async shutdownWorker(): Promise<void> {
console.log('Pausing queue...');
await this.queue.pause(true, true);
console.log(`Waiting for ${this.isProcessing.size} jobs to complete...`);
// Wait for current jobs to finish
const checkInterval = 500;
const maxWait = 30000;
let waited = 0;
while (this.isProcessing.size > 0 && waited < maxWait) {
await new Promise(resolve => setTimeout(resolve, checkInterval));
waited += checkInterval;
if (waited % 5000 === 0) {
console.log(`Still processing ${this.isProcessing.size} jobs...`);
}
}
if (this.isProcessing.size > 0) {
console.warn(`Forcing shutdown with ${this.isProcessing.size} jobs remaining`);
}
console.log('Closing queue...');
await this.queue.close();
console.log('Worker shutdown complete');
process.exit(0);
}
}
4. Database Connection Pool Shutdown
import { Pool } from 'pg';
class DatabaseShutdown {
private pool: Pool;
private activeQueries = new Set<Promise<any>>();
constructor(pool: Pool) {
this.pool = pool;
this.setupQueryTracking();
}
private setupQueryTracking(): void {
const originalQuery = this.pool.query.bind(this.pool);
this.pool.query = (...args: any[]) => {
const queryPromise = originalQuery(...args);
this.activeQueries.add(queryPromise);
queryPromise.finally(() => {
this.activeQueries.delete(queryPromise);
});
return queryPromise;
};
}
async shutdown(): Promise<void> {
console.log('Shutting down database connections...');
// Wait for active queries
if (this.activeQueries.size > 0) {
console.log(`Waiting for ${this.activeQueries.size} active queries...`);
await Promise.race([
Promise.all(Array.from(this.activeQueries)),
new Promise(resolve => setTimeout(resolve, 5000))
]);
}
// Close pool
console.log('Ending pool...');
await this.pool.end();
console.log('Database connections closed');
}
}
5. PM2 Graceful Shutdown
// ecosystem.config.js
module.exports = {
apps: [{
name: 'api-server',
script: './dist/server.js',
instances: 4,
exec_mode: 'cluster',
kill_timeout: 30000, // Wait 30s for graceful shutdown
wait_ready: true,
listen_timeout: 10000,
shutdown_with_message: true
}]
};
// server.ts
import express from 'express';
const app = express();
const port = process.env.PORT || 3000;
// ... setup routes ...
const server = app.listen(port, () => {
console.log(`Server started on port ${port}`);
// Signal to PM2 that app is ready
if (process.send) {
process.send('ready');
}
});
// Handle shutdown message from PM2
process.on('message', (msg) => {
if (msg === 'shutdown') {
console.log('Received shutdown message from PM2');
gracefulShutdown();
}
});
async function gracefulShutdown() {
console.log('Starting graceful shutdown...');
// Stop accepting new connections
server.close(() => {
console.log('Server closed');
process.exit(0);
});
// Force shutdown after timeout
setTimeout(() => {
console.error('Forced shutdown after timeout');
process.exit(1);
}, 28000); // Less than PM2's kill_timeout
}
6. Python/Flask Graceful Shutdown
import signal
import sys
import time
from flask import Flask, request, g
from threading import Lock
app = Flask(__name__)
class GracefulShutdown:
def __init__(self):
self.is_shutting_down = False
self.active_requests = 0
self.lock = Lock()
def before_request(self):
"""Track active requests."""
if self.is_shutting_down:
return {'error': 'Server is shutting down'}, 503
with self.lock:
self.active_requests += 1
def after_request(self, response):
"""Decrement active requests."""
with self.lock:
self.active_requests -= 1
return response
def shutdown(self, signum, frame):
"""Handle shutdown signal."""
print(f"Received signal {signum}, starting graceful shutdown...")
self.is_shutting_down = True
# Wait for active requests
max_wait = 30
waited = 0
while self.active_requests > 0 and waited < max_wait:
print(f"Waiting for {self.active_requests} active requests...")
time.sleep(1)
waited += 1
if self.active_requests > 0:
print(f"Force closing with {self.active_requests} requests remaining")
print("Graceful shutdown complete")
sys.exit(0)
# Setup graceful shutdown
shutdown_handler = GracefulShutdown()
app.before_request(shutdown_handler.before_request)
app.after_request(shutdown_handler.after_request)
signal.signal(signal.SIGTERM, shutdown_handler.shutdown)
signal.signal(signal.SIGINT, shutdown_handler.shutdown)
@app.route('/health')
def health():
if shutdown_handler.is_shutting_down:
return {'status': 'shutting_down'}, 503
return {'status': 'ok'}
if __name__ == '__main__':
app.run(host='0.0.0.0', port=5000)
Best Practices
✅ DO
- Handle SIGTERM and SIGINT signals
- Stop accepting new requests immediately
- Wait for in-flight requests to complete
- Set reasonable shutdown timeouts
- Close database connections properly
- Flush logs and metrics
- Fail health checks during shutdown
- Test shutdown procedures
- Log shutdown progress
- Use graceful shutdown in containers
❌ DON'T
- Ignore shutdown signals
- Force kill processes without cleanup
- Set unreasonably long timeouts
- Skip resource cleanup
- Forget to close connections
- Block shutdown indefinitely
Kubernetes Configuration
# deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: api-server
spec:
replicas: 3
template:
spec:
containers:
- name: api
image: api-server:latest
lifecycle:
preStop:
exec:
command: ["/bin/sh", "-c", "sleep 5"]
readinessProbe:
httpGet:
path: /health/ready
port: 3000
initialDelaySeconds: 5
periodSeconds: 5
livenessProbe:
httpGet:
path: /health/live
port: 3000
initialDelaySeconds: 15
periodSeconds: 10
terminationGracePeriodSeconds: 30
Resources
Quick Install
/plugin add https://github.com/aj-geddes/useful-ai-prompts/tree/main/graceful-shutdownCopy and paste this command in Claude Code to install this skill
GitHub 仓库
Related Skills
sglang
MetaSGLang is a high-performance LLM serving framework that specializes in fast, structured generation for JSON, regex, and agentic workflows using its RadixAttention prefix caching. It delivers significantly faster inference, especially for tasks with repeated prefixes, making it ideal for complex, structured outputs and multi-turn conversations. Choose SGLang over alternatives like vLLM when you need constrained decoding or are building applications with extensive prefix sharing.
evaluating-llms-harness
TestingThis Claude Skill runs the lm-evaluation-harness to benchmark LLMs across 60+ standardized academic tasks like MMLU and GSM8K. It's designed for developers to compare model quality, track training progress, or report academic results. The tool supports various backends including HuggingFace and vLLM models.
llamaguard
OtherLlamaGuard is Meta's 7-8B parameter model for moderating LLM inputs and outputs across six safety categories like violence and hate speech. It offers 94-95% accuracy and can be deployed using vLLM, Hugging Face, or Amazon SageMaker. Use this skill to easily integrate content filtering and safety guardrails into your AI applications.
langchain
MetaLangChain is a framework for building LLM applications using agents, chains, and RAG pipelines. It supports multiple LLM providers, offers 500+ integrations, and includes features like tool calling and memory management. Use it for rapid prototyping and deploying production systems like chatbots, autonomous agents, and question-answering services.
