Node.js Application Monitoring: A Comprehensive Implementation Guide

Farouk Ben. - Founder at Odown

Node.js has become a cornerstone technology for building high-performance, scalable web applications. However, its event-driven, non-blocking architecture introduces unique monitoring challenges that differ from traditional server environments. While understanding the business impact of website reliability provides the rationale for investing in monitoring, this guide focuses on the technical implementation specific to Node.js applications.

Key Performance Metrics for Node.js Applications

Effective Node.js monitoring requires tracking metrics that reflect its unique runtime characteristics. Unlike traditional thread-based servers, Node.js operates on a single-threaded event loop model with asynchronous I/O operations, necessitating specialized monitoring approaches.

Core Runtime Metrics

1. CPU Usage

Node.js applications are single-threaded for JavaScript execution but use a libuv-managed thread pool for operations such as file I/O, DNS lookups, and some crypto work. Monitoring CPU usage helps identify processing bottlenecks:

javascript

// Basic CPU usage monitoring implementation
const os = require('os');

function getCpuUsage() {
  const cpus = os.cpus();
  let totalIdle = 0;
  let totalTick = 0;
  for (const cpu of cpus) {
    for (const type in cpu.times) {
      totalTick += cpu.times[type];
    }
    totalIdle += cpu.times.idle;
  }
  return {
    idle: totalIdle / cpus.length,
    total: totalTick / cpus.length,
    usage: 100 - (totalIdle / totalTick * 100)
  };
}

// Track CPU usage over time
let lastCpuUsage = getCpuUsage();
setInterval(() => {
  const currentCpuUsage = getCpuUsage();
  const idleDiff = currentCpuUsage.idle - lastCpuUsage.idle;
  const totalDiff = currentCpuUsage.total - lastCpuUsage.total;
  const usagePercentage = 100 - (idleDiff / totalDiff * 100);

  console.log(`CPU Usage: ${usagePercentage.toFixed(2)}%`);

  // Alert on high CPU usage
  if (usagePercentage > 85) {
    notifyHighCpuUsage(usagePercentage);
  }
  lastCpuUsage = currentCpuUsage;
}, 5000);

Recommended Thresholds:

  • Warning: >70% sustained CPU usage
  • Critical: >85% sustained CPU usage
  • Alert trigger: >80% for 3+ consecutive measurement intervals (see the sketch below)
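
The sampling loop above alerts on any single reading over 85%. To implement the "3+ consecutive intervals" rule, you can track a streak counter between samples. A minimal sketch, assuming the threshold constants and the notifyHighCpuUsage hook are placeholders you adapt to your own alerting setup:

javascript

// Sketch: only alert after N consecutive intervals above the threshold
const CPU_ALERT_THRESHOLD = 80;  // percent
const CPU_ALERT_INTERVALS = 3;   // consecutive measurements
let highCpuIntervals = 0;

function checkSustainedCpuUsage(usagePercentage) {
  if (usagePercentage > CPU_ALERT_THRESHOLD) {
    highCpuIntervals += 1;
    if (highCpuIntervals >= CPU_ALERT_INTERVALS) {
      // Placeholder alerting hook - wire this to your alerting module
      notifyHighCpuUsage(usagePercentage);
      highCpuIntervals = 0; // Avoid re-alerting on every subsequent interval
    }
  } else {
    highCpuIntervals = 0; // Reset the streak when usage drops back down
  }
}

// Call checkSustainedCpuUsage(usagePercentage) from the setInterval loop above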

2. Memory Consumption

Node.js applications have a default heap size limit (configurable with the --max-old-space-size flag), and exceeding it crashes the process with an out-of-memory error. Track both total memory usage and the breakdown of different memory types:

javascript

// Memory usage monitoring
function getMemoryUsage() {
  const memoryUsage = process.memoryUsage();
  return {
    rss: memoryUsage.rss / 1024 / 1024, // Resident Set Size in MB
    heapTotal: memoryUsage.heapTotal / 1024 / 1024, // Total size of the allocated heap
    heapUsed: memoryUsage.heapUsed / 1024 / 1024, // Actual memory used during execution
    external: memoryUsage.external / 1024 / 1024, // Memory used by C++ objects bound to JavaScript
    arrayBuffers: memoryUsage.arrayBuffers / 1024 / 1024 // ArrayBuffers and SharedArrayBuffers
  };
}

// Rolling window of recent heap measurements used for leak detection
const memoryHistoryArray = [];

setInterval(() => {
  const memory = getMemoryUsage();
  console.log(`Memory RSS: ${memory.rss.toFixed(2)}MB | Heap Used: ${memory.heapUsed.toFixed(2)}MB / ${memory.heapTotal.toFixed(2)}MB`);

  // Check for potential memory leaks based on heap growth pattern
  memoryHistoryArray.push(memory.heapUsed);
  if (memoryHistoryArray.length > 10) {
    memoryHistoryArray.shift();
    const isConstantlyGrowing = memoryHistoryArray.every((value, index, array) =>
      index === 0 || value >= array[index - 1] * 1.01 // 1% growth threshold
    );
    if (isConstantlyGrowing && memory.heapUsed > 500) { // 500MB threshold
      notifyPotentialMemoryLeak(memory, memoryHistoryArray);
    }
  }
}, 30000);

Recommended Thresholds:

  • Warning: >70% of max old space size
  • Critical: >85% of max old space size (the sketch below shows how to read this limit at runtime)
  • Heap growth pattern: Alert on consistent upward trend without garbage collection drops
  • RSS growth: Alert when exceeding 3x the initial RSS after application warmup
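
The "% of max old space size" thresholds require knowing the actual heap limit, which varies with the Node.js version and the --max-old-space-size flag. One way to read it at runtime is sketched below with the built-in v8 module; the 70%/85% values mirror the thresholds above and the console output is a placeholder for your alerting hook:

javascript

// Sketch: compute heap utilization against the real heap limit
const v8 = require('v8');

function getHeapUtilization() {
  const { heap_size_limit } = v8.getHeapStatistics(); // Max heap size in bytes
  const { heapUsed } = process.memoryUsage();
  return (heapUsed / heap_size_limit) * 100;
}

setInterval(() => {
  const utilization = getHeapUtilization();
  if (utilization > 85) {
    console.error(`CRITICAL: heap at ${utilization.toFixed(2)}% of max old space size`);
  } else if (utilization > 70) {
    console.warn(`WARNING: heap at ${utilization.toFixed(2)}% of max old space size`);
  }
}, 30000);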

3. Event Loop Lag Monitoring

The event loop is the heart of a Node.js application. Monitoring its lag helps identify when the application is becoming unresponsive:

javascript

// Event loop lag monitoring
function monitorEventLoopLag() {
  let lastCheck = Date.now();
  setInterval(() => {
    const now = Date.now();
    const lag = now - lastCheck - 100; // We expect ~100ms between checks
    console.log(`Event Loop Lag: ${lag}ms`);
    if (lag > 200) { // More than 200ms of lag
      notifyEventLoopLag(lag);
    }
    lastCheck = now;
  }, 100);
}
monitorEventLoopLag();

For more accurate monitoring, consider using specialized libraries like loopbench or toobusy-js:

javascript

// Node.js event loop monitoring with toobusy-js
const toobusy = require('toobusy-js');

// Set maximum lag to 100ms
toobusy.maxLag(100);

// Express middleware to track event loop lag and respond with 503 when overloaded
app.use((req, res, next) => {
  if (toobusy()) {
    // Record the overload incident
    recordEventLoopOverload(toobusy.lag());
    // Respond with 503 Service Unavailable
    res.status(503).send("Server is too busy right now. Please try again later.");
    return;
  }
  next();
});

// Periodically log event loop lag
setInterval(() => {
  console.log(`Current event loop lag: ${toobusy.lag()}ms`);
}, 1000);
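
If you would rather avoid an extra dependency, Node.js 12+ ships an event loop delay histogram in perf_hooks. A minimal sketch using monitorEventLoopDelay follows; the 20ms resolution, the reporting interval, and the notifyEventLoopLag hook are assumptions to adapt:

javascript

// Built-in event loop delay monitoring via perf_hooks
const { monitorEventLoopDelay } = require('perf_hooks');

const histogram = monitorEventLoopDelay({ resolution: 20 }); // Sample every 20ms
histogram.enable();

setInterval(() => {
  // Histogram values are reported in nanoseconds
  const meanMs = histogram.mean / 1e6;
  const p99Ms = histogram.percentile(99) / 1e6;
  console.log(`Event loop delay - mean: ${meanMs.toFixed(2)}ms, p99: ${p99Ms.toFixed(2)}ms`);

  if (p99Ms > 200) {
    notifyEventLoopLag(p99Ms); // Placeholder alerting hook
  }
  histogram.reset(); // Start a fresh window for the next interval
}, 10000);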

Recommended Thresholds:

  • Warning: >100ms event loop lag
  • Critical: >200ms event loop lag
  • Request rejection threshold: Typically 500-1000ms, depending on application type

Application-Level Metrics

1. HTTP Request Metrics

For web applications, tracking request metrics provides insights into performance and usage patterns:

javascript

// Express middleware for request monitoring
const requestMonitoring = (req, res, next) => {
  // Track request start time
  const startTime = process.hrtime();
  // Track response size
  let responseSize = 0;
  const originalWrite = res.write;
  const originalEnd = res.end;

  res.write = function (chunk) {
    if (chunk) {
      responseSize += chunk.length;
    }
    return originalWrite.apply(res, arguments);
  };

  res.end = function (chunk) {
    if (chunk) {
      responseSize += chunk.length;
    }
    // Calculate duration
    const duration = getDurationInMs(startTime);
    // Log request details
    const requestLog = {
      method: req.method,
      url: req.url,
      statusCode: res.statusCode,
      duration: duration,
      responseSize: responseSize,
      userAgent: req.get('User-Agent'),
      timestamp: new Date().toISOString()
    };
    // Record request metrics
    recordRequestMetrics(requestLog);
    // Track slow requests
    if (duration > 1000) {
      notifySlowRequest(requestLog);
    }
    return originalEnd.apply(res, arguments);
  };

  next();
};

// Helper function to calculate duration in milliseconds
function getDurationInMs(startTime) {
  const diff = process.hrtime(startTime);
  return (diff[0] * 1e3) + (diff[1] * 1e-6);
}

// Apply middleware to Express app
app.use(requestMonitoring);

Recommended Thresholds:

  • Average response time: Alert on 50% increase from baseline
  • Slow requests: >1000ms for standard APIs, >3000ms for complex operations
  • Error rate: >1% of total requests (see the rolling-window sketch after this list)
  • Status code anomalies: Sudden increase in non-200 responses
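
The response-time and error-rate thresholds above assume you aggregate the per-request logs into a rolling window. The sketch below shows one possible in-memory implementation of the recordRequestMetrics placeholder used in the middleware; the window size, the naive first-window baseline, and the console alerts are assumptions, and in production you would push these metrics to a time-series store instead:

javascript

// Sketch: rolling error-rate and latency tracking over a fixed window
const requestWindow = []; // { timestamp, duration, isError }
const WINDOW_MS = 5 * 60 * 1000; // 5-minute window
let baselineAvgDuration = null;  // Captured after the first full window

function recordRequestMetrics(requestLog) {
  requestWindow.push({
    timestamp: Date.now(),
    duration: requestLog.duration,
    isError: requestLog.statusCode >= 500
  });
}

setInterval(() => {
  const cutoff = Date.now() - WINDOW_MS;
  // Drop entries that have fallen out of the window
  while (requestWindow.length && requestWindow[0].timestamp < cutoff) {
    requestWindow.shift();
  }
  if (requestWindow.length === 0) return;

  const errorCount = requestWindow.filter(r => r.isError).length;
  const errorRate = errorCount / requestWindow.length;
  const avgDuration = requestWindow.reduce((sum, r) => sum + r.duration, 0) / requestWindow.length;

  if (baselineAvgDuration === null) {
    baselineAvgDuration = avgDuration; // Naive baseline: first full window after warmup
  }

  if (errorRate > 0.01) {
    console.error(`Error rate ${(errorRate * 100).toFixed(2)}% exceeds 1% threshold`);
  }
  if (avgDuration > baselineAvgDuration * 1.5) {
    console.warn(`Average response time ${avgDuration.toFixed(0)}ms is 50% above baseline`);
  }
}, 60 * 1000);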

2. Database Connection Metrics

Most Node.js applications rely heavily on databases, making connection monitoring critical:

javascript

// MongoDB connection pool monitoring example
const mongoose = require('mongoose');

// Monitor connection events
mongoose.connection.on('connected', () => {
  console.log('MongoDB connected');
  startMongoConnectionMonitoring();
});
mongoose.connection.on('error', (err) => {
  console.error('MongoDB connection error:', err);
  notifyDatabaseConnectionError(err);
});
mongoose.connection.on('disconnected', () => {
  console.log('MongoDB disconnected');
  notifyDatabaseDisconnection();
});

// Monitor connection pool stats
function startMongoConnectionMonitoring() {
  setInterval(async () => {
    try {
      const adminDb = mongoose.connection.db.admin();
      const serverStatus = await adminDb.serverStatus();
      const connectionStats = {
        current: serverStatus.connections.current,
        available: serverStatus.connections.available,
        totalCreated: serverStatus.connections.totalCreated,
        utilization: serverStatus.connections.current /
          (serverStatus.connections.current + serverStatus.connections.available) * 100
      };
      console.log(`MongoDB Connections: ${connectionStats.current} active, ${connectionStats.available} available (${connectionStats.utilization.toFixed(2)}% utilization)`);
      // Alert on high connection utilization
      if (connectionStats.utilization > 85) {
        notifyHighConnectionUtilization(connectionStats);
      }
    } catch (error) {
      console.error('Error monitoring MongoDB connections:', error);
    }
  }, 30000);
}

Recommended Thresholds:

  • Connection pool utilization: >85%
  • Connection errors: >0 over 5-minute period
  • Connection reset frequency: >3 reconnects per hour (see the sketch after this list)
  • Query timeout rate: >0.1% of total queries
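
The reconnect-frequency threshold can be tracked directly from the mongoose connection events shown above. A small sketch that counts reconnects within a sliding one-hour window; the notifyFrequentReconnects hook is a placeholder for your alerting module:

javascript

// Sketch: track reconnect frequency over a sliding one-hour window
const reconnectTimestamps = [];
const RECONNECT_WINDOW_MS = 60 * 60 * 1000; // 1 hour
const RECONNECT_THRESHOLD = 3;

mongoose.connection.on('reconnected', () => {
  const now = Date.now();
  reconnectTimestamps.push(now);

  // Keep only reconnects that happened within the last hour
  while (reconnectTimestamps.length && reconnectTimestamps[0] < now - RECONNECT_WINDOW_MS) {
    reconnectTimestamps.shift();
  }

  if (reconnectTimestamps.length > RECONNECT_THRESHOLD) {
    notifyFrequentReconnects(reconnectTimestamps.length); // Placeholder alerting hook
  }
});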

3. External Service Dependencies

Track performance and availability of external services your application depends on:

javascript

// HTTP client instrumentation example
const axios = require('axios');
const originalRequest = axios.request;

// Wrap axios requests with monitoring
axios.request = function monitoredRequest(config) {
  const startTime = process.hrtime();
  const service = extractServiceName(config.url);

  return originalRequest.call(this, config)
    .then(response => {
      const duration = getDurationInMs(startTime);
      // Record successful dependency call
      recordDependencyCall({
        service,
        url: config.url,
        method: config.method,
        statusCode: response.status,
        duration,
        successful: true
      });
      // Alert on slow dependencies
      if (duration > 1000) {
        notifySlowDependency(service, config.url, duration);
      }
      return response;
    })
    .catch(error => {
      const duration = getDurationInMs(startTime);
      // Record failed dependency call
      recordDependencyCall({
        service,
        url: config.url,
        method: config.method,
        statusCode: error.response ? error.response.status : 0,
        duration,
        successful: false,
        errorMessage: error.message
      });
      // Alert on dependency failures
      notifyDependencyFailure(service, config.url, error);
      throw error;
    });
};

// Extract service name from URL
function extractServiceName(url) {
  try {
    const parsedUrl = new URL(url);
    return parsedUrl.hostname;
  } catch (e) {
    return 'unknown';
  }
}

Recommended Thresholds:

  • Response time: >1000ms average
  • Error rate: >1% of requests to a specific service
  • Availability: <99.5% over 5-minute window
  • Circuit breaker trigger: 5 consecutive failures (a minimal breaker sketch follows this list)
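
To act on the circuit breaker threshold, you can wrap outbound calls in a small state machine that opens after five consecutive failures and retries after a cooldown. A minimal sketch; the cooldown period and function names are assumptions, and libraries such as opossum provide a production-ready alternative:

javascript

// Sketch: minimal circuit breaker around an async dependency call
function createCircuitBreaker(fn, { failureThreshold = 5, cooldownMs = 30000 } = {}) {
  let consecutiveFailures = 0;
  let openedAt = null;

  return async function callWithBreaker(...args) {
    // While open, fail fast until the cooldown has elapsed
    if (openedAt !== null && Date.now() - openedAt < cooldownMs) {
      throw new Error('Circuit breaker is open');
    }
    try {
      const result = await fn(...args);
      consecutiveFailures = 0; // Success closes the breaker
      openedAt = null;
      return result;
    } catch (error) {
      consecutiveFailures += 1;
      if (consecutiveFailures >= failureThreshold) {
        openedAt = Date.now(); // Open the breaker
      }
      throw error;
    }
  };
}

// Usage: wrap a dependency call (URL is a placeholder)
// const getUser = createCircuitBreaker((id) => axios.get(`https://users.example.com/${id}`));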

Memory Leak Detection Strategies

Memory leaks are among the most common issues in long-running Node.js applications. Implement these strategies to detect and address them:

1. Heap Snapshot Analysis

Use the heapdump module (or the built-in v8.writeHeapSnapshot() on Node.js 12+) to capture heap snapshots for analysis:

javascript

// Heap snapshot management
const heapdump = require('heapdump');
const fs = require('fs');

// Enable heap snapshot creation on signal
process.on('SIGUSR2', () => {
  const filename = `${process.cwd()}/heapdump-${Date.now()}.heapsnapshot`;
  heapdump.writeSnapshot(filename, (err) => {
    if (err) console.error('Failed to create heapdump:', err);
    else console.log(`Heap snapshot written to ${filename}`);
  });
});

// Automatically create snapshots on threshold breach
let lastHeapUsed = 0;
const heapGrowthThreshold = 100; // MB

function checkHeapGrowth() {
  const memoryUsage = process.memoryUsage();
  const heapUsedMB = memoryUsage.heapUsed / 1024 / 1024;
  if (lastHeapUsed > 0 && heapUsedMB - lastHeapUsed > heapGrowthThreshold) {
    console.warn(`Significant heap growth detected: ${heapUsedMB.toFixed(2)}MB (increased by ${(heapUsedMB - lastHeapUsed).toFixed(2)}MB)`);
    const filename = `${process.cwd()}/heapdump-growth-${Date.now()}.heapsnapshot`;
    heapdump.writeSnapshot(filename, (err) => {
      if (err) console.error('Failed to create automatic heapdump:', err);
      else {
        console.log(`Growth-triggered heap snapshot written to ${filename}`);
        notifyAutomaticHeapdump(filename, heapUsedMB, lastHeapUsed);
      }
    });
  }

  lastHeapUsed = heapUsedMB;
}

// Check heap every 15 minutes in production environments
if (process.env.NODE_ENV === 'production') {
  setInterval(checkHeapGrowth, 15 * 60 * 1000);
}

2. Garbage Collection Metrics

Track garbage collection frequency and duration to identify potential issues:

javascript

// Tracking garbage collection with gc-stats
const gcStats = require('gc-stats')();

let gcMetrics = {
  totalTime: 0,
  count: 0,
  scavengeCount: 0,
  markSweepCount: 0,
  incrementalMarkingCount: 0,
  weakCallbackCount: 0
};

gcStats.on('stats', (stats) => {
  // gc-stats reports pause times in nanoseconds; convert to milliseconds
  const pauseMs = stats.pause / 1e6;

  // Update metrics
  gcMetrics.totalTime += pauseMs;
  gcMetrics.count += 1;

  // Update specific GC type counts
  switch (stats.gctype) {
    case 1: // Scavenge (Minor GC)
      gcMetrics.scavengeCount += 1;
      break;
    case 2: // Mark/Sweep/Compact (Major GC)
      gcMetrics.markSweepCount += 1;
      break;
    case 4: // Incremental Marking
      gcMetrics.incrementalMarkingCount += 1;
      break;
    case 8: // Weak/Phantom callback processing
      gcMetrics.weakCallbackCount += 1;
      break;
  }

  // Log GC activity
  console.log(`GC: type=${stats.gctype}, pause=${pauseMs.toFixed(2)}ms, heapBefore=${(stats.before.totalHeapSize / 1024 / 1024).toFixed(2)}MB, heapAfter=${(stats.after.totalHeapSize / 1024 / 1024).toFixed(2)}MB`);

  // Alert on concerning GC patterns
  if (pauseMs > 200) {
    notifyLongGCPause(stats);
  }
});

// Report GC metrics periodically
setInterval(() => {
  const gcPercentage = gcMetrics.totalTime / (5 * 60 * 1000) * 100;
  console.log(`GC Summary: ${gcMetrics.count} collections in last 5 minutes, total time: ${gcMetrics.totalTime.toFixed(2)}ms (${gcPercentage.toFixed(2)}% of time)`);

  // Alert if garbage collection is taking too much time
  if (gcPercentage > 10) {
    notifyExcessiveGC(gcMetrics);
  }

  // Reset metrics
  gcMetrics = {
    totalTime: 0,
    count: 0,
    scavengeCount: 0,
    markSweepCount: 0,
    incrementalMarkingCount: 0,
    weakCallbackCount: 0
  };
}, 5 * 60 * 1000);

3. Memory Growth Pattern Analysis

Implement trend analysis to detect consistent memory growth patterns:

javascript

// Memory trend analysis
const memoryHistory = {
  timestamps: [],
  measurements: [],
  maxSize: 60 // Store an hour of data at 1-minute intervals
};

function recordMemoryUsage() {
  const memoryUsage = process.memoryUsage();
  const heapUsedMB = memoryUsage.heapUsed / 1024 / 1024;
  memoryHistory.timestamps.push(Date.now());
  memoryHistory.measurements.push(heapUsedMB);

  // Keep history within size limit
  if (memoryHistory.timestamps.length > memoryHistory.maxSize) {
    memoryHistory.timestamps.shift();
    memoryHistory.measurements.shift();
  }

  // Only analyze after collecting enough data points
  if (memoryHistory.measurements.length >= 10) {
    analyzeMemoryTrend();
  }
}

function analyzeMemoryTrend() {
  // Calculate linear regression to detect growth pattern
  const n = memoryHistory.measurements.length;
  let sumX = 0, sumY = 0, sumXY = 0, sumXX = 0;
  for (let i = 0; i < n; i++) {
    sumX += i;
    sumY += memoryHistory.measurements[i];
    sumXY += i * memoryHistory.measurements[i];
    sumXX += i * i;
  }
  const slope = (n * sumXY - sumX * sumY) / (n * sumXX - sumX * sumX);
  const growthRatePerHour = slope * 60; // Slope is MB/minute at 1-minute sampling; convert to MB/hour

  // Detect consistent growth patterns that indicate potential leaks
  if (growthRatePerHour > 10) { // 10MB/hour growth threshold
    notifyMemoryGrowthPattern(growthRatePerHour, memoryHistory);
  }
}

// Record memory usage every minute
setInterval(recordMemoryUsage, 60 * 1000);

Setting Up Effective Node.js Monitoring with Odown

Implementing a comprehensive Node.js monitoring solution with Odown involves integrating several components to track both external availability and internal health metrics.

Application Health Check Endpoint

Start by implementing a health check endpoint that provides internal status information:

javascript

// health.js - Express health check route
const os = require('os');
const router = require('express').Router();

// Basic health status
router.get('/', (req, res) => {
  res.status(200).json({
    status: 'UP',
    timestamp: new Date()
  });
});

// Detailed health metrics
router.get('/details', (req, res) => {
  const memoryUsage = process.memoryUsage();
  res.status(200).json({
    status: 'UP',
    timestamp: new Date(),
    uptime: process.uptime(),
    memory: {
      rss: memoryUsage.rss / 1024 / 1024,
      heapTotal: memoryUsage.heapTotal / 1024 / 1024,
      heapUsed: memoryUsage.heapUsed / 1024 / 1024,
      external: memoryUsage.external / 1024 / 1024,
      memoryUtilization: memoryUsage.heapUsed / memoryUsage.heapTotal
    },
    cpu: {
      count: os.cpus().length,
      load: os.loadavg()
    },
    system: {
      platform: process.platform,
      arch: process.arch,
      nodeVersion: process.version
    }
  });
});

// Dependency health checks (checkDatabaseConnection, checkRedisConnection, and
// checkExternalAPI are application-specific probes you implement yourself)
router.get('/dependencies', async (req, res) => {
  try {
    const dependencyChecks = await Promise.allSettled([
      checkDatabaseConnection(),
      checkRedisConnection(),
      checkExternalAPI()
    ]);
    const dependencies = {
      database: dependencyChecks[0].status === 'fulfilled' ? dependencyChecks[0].value : { status: 'DOWN', error: dependencyChecks[0].reason },
      cache: dependencyChecks[1].status === 'fulfilled' ? dependencyChecks[1].value : { status: 'DOWN', error: dependencyChecks[1].reason },
      externalApi: dependencyChecks[2].status === 'fulfilled' ? dependencyChecks[2].value : { status: 'DOWN', error: dependencyChecks[2].reason }
    };

    // Overall status is UP only if all critical dependencies are UP
    const status = dependencies.database.status === 'UP' ? 'UP' : 'DOWN';
    res.status(status === 'UP' ? 200 : 503).json({
      status,
      timestamp: new Date(),
      dependencies
    });
  } catch (error) {
    res.status(500).json({
      status: 'ERROR',
      error: error.message
    });
  }
});

// Export the router so it can be mounted with app.use('/health', require('./health'))
module.exports = router;

Odown HTTP Check Configuration

Configure Odown to monitor your application's health endpoints:

javascript

// Example Odown monitor configuration
const nodejsMonitor = {
  name: "Node.js Application Monitoring",
  type: "http",
  url: "https://api.example.com/health",
  method: "GET",
  checkFrequency: 60, // Check every 60 seconds
  locations: ["us-east", "eu-west", "asia-east"],
  alertThreshold: 2, // Alert after two consecutive failures
  assertions: [
    { type: "statusCode", comparison: "equals", value: 200 },
    { type: "responseTime", comparison: "lessThan", value: 500 },
    { type: "jsonBody", path: "$.status", comparison: "equals", value: "UP" }
  ]
};

// Dependency health check
const dependencyMonitor = {
  name: "Node.js Dependencies Health",
  type: "http",
  url: "https://api.example.com/health/dependencies",
  method: "GET",
  checkFrequency: 120, // Check every 2 minutes
  locations: ["us-east"],
  assertions: [
    { type: "statusCode", comparison: "equals", value: 200 },
    { type: "jsonBody", path: "$.dependencies.database.status", comparison: "equals", value: "UP" },
    { type: "jsonBody", path: "$.dependencies.cache.status", comparison: "equals", value: "UP" }
  ]
};

Microservice Dependency Tracking

For applications with microservice architectures, implement dependency tracking between services:

javascript

// microservice-tracking.js
const opentracing = require('opentracing');
const axios = require('axios');

// Initialize tracer (using a specific implementation like Jaeger)
const tracer = initTracer('user-service');

// Track outgoing HTTP requests
function instrumentAxios() {
  const originalRequest = axios.request;
  axios.request = function tracedRequest(config) {
    const span = tracer.startSpan('http_request');

    // Add span context to headers
    const tracingHeaders = {};
    tracer.inject(span.context(), opentracing.FORMAT_HTTP_HEADERS, tracingHeaders);
    config.headers = {
      ...config.headers,
      ...tracingHeaders
    };

    // Add request details to span
    span.setTag('http.url', config.url);
    span.setTag('http.method', config.method);
    span.setTag('service.name', extractServiceName(config.url));

    const startTime = Date.now();
    return originalRequest.call(this, config)
      .then(response => {
        const duration = Date.now() - startTime;
        span.setTag('http.status_code', response.status);
        span.setTag('response.time', duration);
        span.finish();
        return response;
      })
      .catch(error => {
        const duration = Date.now() - startTime;
        span.setTag('http.status_code', error.response ? error.response.status : 0);
        span.setTag('response.time', duration);
        span.setTag('error', true);
        span.setTag('error.message', error.message);
        span.finish();

        throw error;
      });
  };
}

// Express middleware to extract and create spans
function traceMiddleware(req, res, next) {
  let span;

  // Try to extract parent span context from request headers
  const parentSpanContext = tracer.extract(opentracing.FORMAT_HTTP_HEADERS, req.headers);
  if (parentSpanContext) {
    span = tracer.startSpan('http_server', { childOf: parentSpanContext });
  } else {
    span = tracer.startSpan('http_server');
  }

  // Add request details to span
  span.setTag('http.url', req.url);
  span.setTag('http.method', req.method);
  span.setTag('service.name', 'user-service');

  // Store span in request for later use
  req.span = span;

  // Finish span on response completion
  let finished = false;
  const finishSpan = () => {
    if (finished) return; // Guard against finishing the span twice ('finish' then 'close')
    finished = true;
    span.setTag('http.status_code', res.statusCode);
    span.finish();
  };
  res.on('finish', finishSpan);
  res.on('close', finishSpan);

  next();
}

// Apply middleware to Express app
app.use(traceMiddleware);

This distributed tracing implementation allows you to visualize service dependencies and track performance across service boundaries.
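
The initTracer() call in the example above is left abstract. One possible way to wire it up is sketched below using the jaeger-client package; the sampler and reporter settings are assumptions to adapt, and newer projects may prefer the OpenTelemetry SDK instead:

javascript

// Sketch: initializing an OpenTracing-compatible Jaeger tracer
const { initTracer: initJaegerTracer } = require('jaeger-client');

function initTracer(serviceName) {
  const config = {
    serviceName,
    sampler: { type: 'const', param: 1 }, // Sample every request; tune for production
    reporter: { logSpans: false, agentHost: 'localhost', agentPort: 6832 }
  };
  return initJaegerTracer(config, {});
}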

Troubleshooting Common Node.js Performance Issues

Event Loop Blocking Detection

Identify operations that block the event loop and cause application unresponsiveness:

javascript

// event-loop-monitor.js
const blocked = require('blocked');

// Monitor event loop blocking
blocked((ms) => {
  console.warn(`Event loop blocked for ${ms}ms`);

  if (ms < 100) {
    // Minor blocking, log and record metrics
    recordEventLoopBlocking('minor', ms);
  } else if (ms < 500) {
    // Moderate blocking, alert for investigation
    recordEventLoopBlocking('moderate', ms);
    notifyEventLoopBlocking(ms);
  } else {
    // Severe blocking, high-priority alert
    recordEventLoopBlocking('severe', ms);
    notifyEventLoopBlockingEmergency(ms);
    // Capture diagnostic information
    captureDiagnostics();
  }
}, { threshold: 50 }); // Detect blocks over 50ms

// Capture diagnostic information during severe event loop blocks
function captureDiagnostics() {
  // Record CPU profile for 5 seconds
  const profiler = require('v8-profiler-node8');
  const profileName = `cpu-profile-${Date.now()}`;
  profiler.startProfiling(profileName);
  setTimeout(() => {
    const profile = profiler.stopProfiling(profileName);
    // Save profile to disk
    const fs = require('fs');
    fs.writeFileSync(`${profileName}.cpuprofile`, JSON.stringify(profile));
    // Cleanup
    profile.delete();
    // Notify about profile creation
    notifyDiagnosticsCreated(profileName);
  }, 5000);
}

Memory Leak Identification

Implement more advanced memory leak detection beyond basic growth tracking:

javascript

// memory-leak-detectors.js
const memwatch = require('@airbnb/node-memwatch');
const heapdump = require('heapdump');

// Listen for memory leak events
memwatch.on('leak', (info) => {
  console.warn('Memory leak detected:', info);

  // Generate heap snapshot
  const filename = `${process.cwd()}/heapdump-leak-${Date.now()}.heapsnapshot`;
  heapdump.writeSnapshot(filename, (err) => {
    if (err) {
      console.error('Failed to create leak heapdump:', err);
    } else {
      console.log(`Leak-triggered heap snapshot written to ${filename}`);
      // Notify about leak detection
      notifyMemoryLeak(info, filename);
    }
  });
});

// Track memory stats between garbage collections
memwatch.on('stats', (stats) => {
  console.log('Memory stats:', stats);
  // Alert on concerning memory patterns
  if (stats.current_base > stats.estimated_base * 1.5) {
    notifyMemoryBaseDrift(stats);
  }
});

// Leak detection using object growth tracking
const objectCounts = new Map();
let lastSampled = Date.now();

function trackObjectCounts() {
  try {
    // Sample object counts
    const objects = getObjectCounts();
    const now = Date.now();

    // Check for significant increases
    for (const [type, count] of Object.entries(objects)) {
      const previous = objectCounts.get(type) || 0;
      // Calculate growth rate per hour
      const growthRate = (count - previous) / ((now - lastSampled) / 3600000);
      // Alert on high growth rates for significant object counts
      if (previous > 1000 && count > previous * 1.2 && growthRate > 1000) {
        notifyObjectTypeGrowth(type, previous, count, growthRate);
      }
      // Update counts
      objectCounts.set(type, count);
    }
    lastSampled = now;
  } catch (error) {
    console.error('Error tracking object counts:', error);
  }
}

function getObjectCounts() {
  // V8 heap statistics - note that this is a simplified version
  // In real implementations, use v8-heap-snapshot or similar libraries
  const v8 = require('v8');
  const stats = v8.getHeapStatistics();
  return {
    total_heap_size: stats.total_heap_size,
    used_heap_size: stats.used_heap_size,
    heap_size_limit: stats.heap_size_limit
  };
}

// Sample object counts every 15 minutes
setInterval(trackObjectCounts, 15 * 60 * 1000);

CPU Profiling for Performance Bottlenecks

Implement on-demand CPU profiling to identify performance bottlenecks:

javascript

// cpu-profiler.js
const v8Profiler = require('v8-profiler-node8');
const fs = require('fs');

// Set up profiling endpoints for on-demand CPU profiling
function setupProfilingEndpoints(app) {
  // Secure with API key to prevent unauthorized profiling
  const API_KEY = process.env.PROFILING_API_KEY || 'development-only-key';

  app.post('/debug/cpu-profile', (req, res) => {
    // Verify API key
    if (req.headers['x-api-key'] !== API_KEY) {
      return res.status(401).json({ error: 'Unauthorized' });
    }

    const duration = parseInt(req.query.duration || '30', 10);
    const profileName = `cpu-profile-${Date.now()}`;

    // Start CPU profiling
    console.log(`Starting CPU profile: ${profileName} for ${duration} seconds`);
    v8Profiler.startProfiling(profileName, true);

    // Stop profiling after the specified duration
    setTimeout(() => {
      const profile = v8Profiler.stopProfiling(profileName);

      // Save profile to disk
      const profilePath = `${process.cwd()}/profiles/${profileName}.cpuprofile`;
      // Ensure directory exists
      fs.mkdirSync(`${process.cwd()}/profiles`, { recursive: true });
      // Write profile to file
      fs.writeFileSync(profilePath, JSON.stringify(profile));
      // Clean up
      profile.delete();

      // The HTTP response was already sent below, so only log completion here
      console.log(`CPU profile saved to ${profilePath}`);
    }, duration * 1000);

    // Respond immediately; the profile file becomes available once profiling completes
    res.json({
      status: 'Profiling started',
      profile: profileName,
      duration: duration
    });
  });

  // Endpoint to list available profiles
  app.get('/debug/cpu-profiles', (req, res) => {
    // Verify API key
    if (req.headers['x-api-key'] !== API_KEY) {
      return res.status(401).json({ error: 'Unauthorized' });
    }

    const profilesDir = `${process.cwd()}/profiles`;
    // Create directory if it doesn't exist
    if (!fs.existsSync(profilesDir)) {
      fs.mkdirSync(profilesDir, { recursive: true });
    }

    // Read directory and filter for CPU profiles
    const files = fs.readdirSync(profilesDir)
      .filter(file => file.endsWith('.cpuprofile'))
      .map(file => ({
        name: file.replace('.cpuprofile', ''),
        path: `${profilesDir}/${file}`,
        size: fs.statSync(`${profilesDir}/${file}`).size,
        created: fs.statSync(`${profilesDir}/${file}`).mtime
      }));

    res.json(files);
  });
}

module.exports = { setupProfilingEndpoints };

Handling Uncaught Exceptions and Promise Rejections

Proper error handling is essential for reliable Node.js applications:

javascript

// error-handling.js
const fs = require('fs');

// Create error log directory
const errorLogDir = `${process.cwd()}/logs`;
fs.mkdirSync(errorLogDir, { recursive: true });

// Track uncaught exceptions
process.on('uncaughtException', (error) => {
  // Get stack trace
  const stack = error.stack || new Error().stack;

  // Create detailed error log
  const errorLog = {
    timestamp: new Date().toISOString(),
    type: 'uncaughtException',
    error: {
      message: error.message,
      name: error.name,
      stack: stack
    },
    process: {
      pid: process.pid,
      uptime: process.uptime(),
      memory: process.memoryUsage()
    }
  };

  // Log to console
  console.error('Uncaught Exception:', error);

  // Write to log file
  const logFile = `${errorLogDir}/uncaught-exception-${Date.now()}.json`;
  fs.writeFileSync(logFile, JSON.stringify(errorLog, null, 2));

  // Report error to monitoring system
  reportFatalError(errorLog);

  // Graceful shutdown
  gracefulShutdown('uncaughtException')
    .catch(shutdownError => {
      console.error('Error during graceful shutdown:', shutdownError);
      process.exit(1);
    });
});

// Track unhandled promise rejections
process.on('unhandledRejection', (reason, promise) => {
  // Create error log
  const errorLog = {
    timestamp: new Date().toISOString(),
    type: 'unhandledRejection',
    error: {
      message: reason instanceof Error ? reason.message : String(reason),
      stack: reason instanceof Error ? reason.stack : 'No stack trace available',
      reason: reason instanceof Error ? reason : String(reason)
    },
    process: {
      pid: process.pid,
      uptime: process.uptime(),
      memory: process.memoryUsage()
    }
  };

  // Log to console
  console.error('Unhandled Promise Rejection:', reason);

  // Write to log file
  const logFile = `${errorLogDir}/unhandled-rejection-${Date.now()}.json`;
  fs.writeFileSync(logFile, JSON.stringify(errorLog, null, 2));

  // Report error to monitoring system
  reportNonFatalError(errorLog);
});

// Graceful shutdown function
async function gracefulShutdown(reason) {
  console.log(`Initiating graceful shutdown due to ${reason}...`);

  // Log shutdown event
  const shutdownLog = {
    timestamp: new Date().toISOString(),
    type: 'shutdown',
    reason: reason,
    process: {
      pid: process.pid,
      uptime: process.uptime(),
      memory: process.memoryUsage()
    }
  };

  // Write shutdown log
  const logFile = `${errorLogDir}/shutdown-${Date.now()}.json`;
  fs.writeFileSync(logFile, JSON.stringify(shutdownLog, null, 2));

  // Close database connections
  try {
    console.log('Closing database connections...');
    await closeDatabaseConnections();
  } catch (error) {
    console.error('Error closing database connections:', error);
  }

  // Close other resources (Redis, etc.)
  try {
    console.log('Closing other resources...');
    await closeOtherResources();
  } catch (error) {
    console.error('Error closing other resources:', error);
  }

  // Let existing requests finish (for HTTP servers)
  if (global.server) {
    console.log('Closing HTTP server...');
    await new Promise((resolve) => {
      global.server.close(resolve);
    });
  }

  console.log('Graceful shutdown complete.');
  process.exit(1);
}

Integrating with Alerting Systems

Configure alerts for different monitoring metrics based on their severity and impact:

javascript

// alerting.js
const axios = require('axios');

// Alert levels
const ALERT_LEVELS = {
  INFO: 'info',
  WARNING: 'warning',
  ERROR: 'error',
  CRITICAL: 'critical'
};

// Configure alert destinations
const alertConfig = {
  slack: {
    webhookUrl: process.env.SLACK_WEBHOOK_URL,
    enabled: true
  },
  email: {
    apiKey: process.env.EMAIL_API_KEY,
    recipients: process.env.ALERT_EMAIL_RECIPIENTS?.split(',') || [],
    enabled: true
  },
  pagerDuty: {
    serviceKey: process.env.PAGERDUTY_SERVICE_KEY,
    enabled: process.env.NODE_ENV === 'production'
  }
};

// Send alert to configured destinations
async function sendAlert(level, title, details) {
  const timestamp = new Date().toISOString();
  const environment = process.env.NODE_ENV || 'development';
  const serviceName = process.env.SERVICE_NAME || 'nodejs-app';

  console.log(`ALERT [${level}]: ${title}`);
  const alertPromises = [];

  // Send to Slack
  if (alertConfig.slack.enabled && alertConfig.slack.webhookUrl) {
    alertPromises.push(
      axios.post(alertConfig.slack.webhookUrl, {
        text: `[${environment.toUpperCase()}] [${level.toUpperCase()}] ${title}`,
        attachments: [
          {
            color: getColorForLevel(level),
            fields: [
              {
                title: 'Service',
                value: serviceName,
                short: true
              },
              {
                title: 'Environment',
                value: environment,
                short: true
              },
              {
                title: 'Timestamp',
                value: timestamp,
                short: true
              },
              {
                title: 'Details',
                value: typeof details === 'object' ? JSON.stringify(details, null, 2) : details
              }
            ]
          }
        ]
      }).catch(error => {
        console.error('Error sending Slack alert:', error.message);
      })
    );
  }

  // Send to PagerDuty for critical alerts
  if (alertConfig.pagerDuty.enabled && alertConfig.pagerDuty.serviceKey && level === ALERT_LEVELS.CRITICAL) {
    alertPromises.push(
      axios.post('https://events.pagerduty.com/v2/enqueue', {
        routing_key: alertConfig.pagerDuty.serviceKey,
        event_action: 'trigger',
        payload: {
          summary: `[${environment.toUpperCase()}] ${title}`,
          source: serviceName,
          severity: 'critical',
          timestamp: timestamp,
          custom_details: details
        }
      }).catch(error => {
        console.error('Error sending PagerDuty alert:', error.message);
      })
    );
  }

  // Wait for all alerts to be sent
  await Promise.all(alertPromises);
}

// Helper function to get color based on alert level
function getColorForLevel(level) {
  switch (level) {
    case ALERT_LEVELS.INFO:
      return '#3498db';
    case ALERT_LEVELS.WARNING:
      return '#f39c12';
    case ALERT_LEVELS.ERROR:
      return '#e74c3c';
    case ALERT_LEVELS.CRITICAL:
      return '#c0392b';
    default:
      return '#95a5a6';
  }
}

// Alert functions for different scenarios
function alertHighMemoryUsage(memoryUsage) {
  // memoryUsage is the raw process.memoryUsage() object (values in bytes)
  const heapUsedMB = memoryUsage.heapUsed / 1024 / 1024;
  const heapTotalMB = memoryUsage.heapTotal / 1024 / 1024;
  const title = `High Memory Usage: ${heapUsedMB.toFixed(2)}MB / ${heapTotalMB.toFixed(2)}MB (${(memoryUsage.heapUsed / memoryUsage.heapTotal * 100).toFixed(2)}%)`;
  sendAlert(
    memoryUsage.heapUsed / memoryUsage.heapTotal > 0.85 ? ALERT_LEVELS.CRITICAL : ALERT_LEVELS.WARNING,
    title,
    memoryUsage
  );
}

function alertEventLoopLag(lag) {
  const title = `Event Loop Lag: ${lag.toFixed(2)}ms`;
  const level = lag > 500 ? ALERT_LEVELS.CRITICAL : (lag > 100 ? ALERT_LEVELS.WARNING : ALERT_LEVELS.INFO);
  sendAlert(level, title, { lag });
}

// Export alert functions
module.exports = {
  alertHighMemoryUsage,
  alertEventLoopLag,
  alertHighCpuUsage: (usage) => {
    sendAlert(
      usage > 90 ? ALERT_LEVELS.CRITICAL : ALERT_LEVELS.WARNING,
      `High CPU Usage: ${usage.toFixed(2)}%`,
      { cpuUsage: usage }
    );
  },
  alertMemoryLeak: (info, snapshotPath) => {
    sendAlert(
      ALERT_LEVELS.CRITICAL,
      'Memory Leak Detected',
      { ...info, snapshotPath }
    );
  },
  alertHighErrorRate: (rate, timeWindow) => {
    sendAlert(
      rate > 0.1 ? ALERT_LEVELS.CRITICAL : ALERT_LEVELS.ERROR,
      `High Error Rate: ${(rate * 100).toFixed(2)}%`,
      { rate, timeWindow }
    );
  },
  alertDatabaseConnectionIssue: (error) => {
    sendAlert(
      ALERT_LEVELS.CRITICAL,
      'Database Connection Issue',
      { error: error.message, stack: error.stack }
    );
  },
  // Generic handlers used by monitoring.js for uncaught errors and long GC pauses
  alertError: (error) => {
    sendAlert(
      ALERT_LEVELS.ERROR,
      `Unhandled Error: ${error instanceof Error ? error.message : String(error)}`,
      { stack: error instanceof Error ? error.stack : undefined }
    );
  },
  alertLongGCPause: (stats) => {
    sendAlert(
      ALERT_LEVELS.WARNING,
      `Long GC Pause: ${(stats.pause / 1e6).toFixed(2)}ms`,
      stats
    );
  }
};

Complete Node.js Monitoring Implementation

Combining all the components creates a comprehensive monitoring solution:

javascript

// monitoring.js - Main monitoring integration module
const os = require('os');
const process = require('process');
const memwatch = require('@airbnb/node-memwatch');
const blocked = require('blocked');
const gcStats = require('gc-stats')();

// Import custom modules
const alerts = require('./alerting');
const { setupProfilingEndpoints } = require('./cpu-profiler');

// Initialize monitoring
function initializeMonitoring(app) {
  // Set up health check endpoints
  setupHealthChecks(app);
  // Set up profiling endpoints
  setupProfilingEndpoints(app);
  // Set up metrics collection
  setupMetricsCollection();
  // Configure error tracking
  setupErrorTracking();
  // Set up memory monitoring
  setupMemoryMonitoring();
  // Set up event loop monitoring
  setupEventLoopMonitoring();
  // Set up garbage collection monitoring
  setupGCMonitoring();
  // Log initialization
  console.log('Node.js monitoring initialized');
}

// Set up health check endpoints
function setupHealthChecks(app) {
  // Import and use health check router
  const healthRouter = require('./health');
  app.use('/health', healthRouter);
}

// Set up metrics collection
function setupMetricsCollection() {
  // Basic system metrics
  setInterval(() => {
    const memoryUsage = process.memoryUsage();
    const cpuUsage = getCpuUsagePercentage();

    // Record metrics
    recordMetrics({
      timestamp: Date.now(),
      memory: {
        rss: memoryUsage.rss / 1024 / 1024,
        heapTotal: memoryUsage.heapTotal / 1024 / 1024,
        heapUsed: memoryUsage.heapUsed / 1024 / 1024,
        external: memoryUsage.external / 1024 / 1024,
        memoryUtilization: memoryUsage.heapUsed / memoryUsage.heapTotal
      },
      cpu: {
        usage: cpuUsage,
        load: os.loadavg()
      },
      system: {
        uptime: process.uptime()
      }
    });

    // Check for high resource usage
    if (memoryUsage.heapUsed / memoryUsage.heapTotal > 0.8) {
      alerts.alertHighMemoryUsage(memoryUsage);
    }
    if (cpuUsage > 80) {
      alerts.alertHighCpuUsage(cpuUsage);
    }
  }, 30000); // Every 30 seconds
}

// Calculate CPU usage percentage
function getCpuUsagePercentage() {
  // This is a simplified implementation
  // For production, use a more sophisticated approach with multiple samples
  return os.loadavg()[0] * 100 / os.cpus().length;
}

// Set up error tracking
function setupErrorTracking() {
  // Track global unhandled errors
  process.on('uncaughtException', (error) => {
    console.error('Uncaught Exception:', error);
    // Record error
    recordError('uncaughtException', error);
    // Send alert
    alerts.alertError(error);
    // Attempt graceful shutdown
    process.exit(1);
  });
  process.on('unhandledRejection', (reason, promise) => {
    console.error('Unhandled Rejection:', reason);
    // Record error
    recordError('unhandledRejection', reason);
    // Send alert
    alerts.alertError(reason);
  });
}

// Set up memory monitoring
function setupMemoryMonitoring() {
  // Monitor for memory leaks
  memwatch.on('leak', (info) => {
    console.warn('Memory leak detected:', info);
    // Record leak
    recordMemoryLeak(info);
    // Send alert
    alerts.alertMemoryLeak(info);
  });
  // Monitor heap stats
  memwatch.on('stats', (stats) => {
    recordMemoryStats(stats);
  });
}

// Set up event loop monitoring
function setupEventLoopMonitoring() {
  // Monitor event loop blocking
  blocked((ms) => {
    console.warn(`Event loop blocked for ${ms}ms`);
    // Record blocking
    recordEventLoopBlocked(ms);
    // Alert on significant blocking
    if (ms > 100) {
      alerts.alertEventLoopLag(ms);
    }
  }, { threshold: 50 });
}

// Set up garbage collection monitoring
function setupGCMonitoring() {
  gcStats.on('stats', (stats) => {
    // Record GC stats
    recordGCStats(stats);
    // Alert on long GC pauses (gc-stats reports pause in nanoseconds)
    if (stats.pause / 1e6 > 200) {
      alerts.alertLongGCPause(stats);
    }
  });
}

// Record metrics (implement based on your metrics storage)
function recordMetrics(metrics) {
  // This would connect to your metrics storage system
  // Example: Prometheus, InfluxDB, etc.
  console.log('Metrics recorded:', metrics);
}

// Record errors
function recordError(type, error) {
  // Log error to your error tracking system
  console.error(`Error recorded (${type}):`, error);
}

// Record memory leak
function recordMemoryLeak(info) {
  console.warn('Memory leak recorded:', info);
}

// Record memory stats
function recordMemoryStats(stats) {
  console.log('Memory stats recorded:', stats);
}

// Record event loop blocking
function recordEventLoopBlocked(duration) {
  console.warn('Event loop blocked:', duration);
}

// Record garbage collection stats
function recordGCStats(stats) {
  console.log('GC stats recorded:', stats);
}

// Export the monitoring initialization function
module.exports = {
  initializeMonitoring
};
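
A possible way to wire this module into an application entry point is sketched below; the server.js filename, port, and route registrations are placeholders for your own application:

javascript

// server.js - example wiring (names and port are assumptions)
const express = require('express');
const { initializeMonitoring } = require('./monitoring');

const app = express();

// Register monitoring before application routes so health and debug
// endpoints are available even if later route setup fails
initializeMonitoring(app);

// ... register your application routes here ...

// Expose the server globally so gracefulShutdown() can close it
global.server = app.listen(process.env.PORT || 3000, () => {
  console.log(`Server listening on port ${process.env.PORT || 3000}`);
});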

Conclusion

Monitoring Node.js applications effectively requires a multifaceted approach that addresses the unique characteristics of the Node.js runtime. By implementing the strategies outlined in this guide, you can gain comprehensive visibility into your application's health, performance, and resource utilization.

Key takeaways from this guide include:

  1. Focus on Node.js-specific metrics like event loop lag, memory patterns, and garbage collection behavior that directly impact application performance.
  2. Implement both external availability monitoring with Odown and internal health metrics collection to get a complete picture of application reliability.
  3. Use specialized monitoring for memory leak detection, CPU profiling, and event loop blocking to address common Node.js performance challenges.
  4. Configure intelligent alerting thresholds based on application characteristics and business impact to ensure appropriate response to issues.
  5. Integrate monitoring with your CI/CD pipeline and development workflow to catch performance issues before they reach production.

By combining these approaches, you can build a robust monitoring system that helps maintain optimal performance for your Node.js applications while enabling rapid troubleshooting when issues arise.