Building Custom Monitoring Integrations: A Developer's Guide to Monitoring APIs
In modern DevOps environments, effective monitoring often requires integration with a variety of tools, services, and custom applications. While status pages provide visibility into monitoring results, developers frequently need to create custom integrations to extend monitoring capabilities or connect with existing systems. This guide walks through the general process of building custom integrations with monitoring APIs, enabling you to extend your monitoring infrastructure to meet specific organizational needs.
Understanding Monitoring API Architectures
Before diving into implementation details, it's important to understand common monitoring API architectures and organization patterns.
API Overview and Common Structures
Most monitoring APIs follow RESTful principles, with resources organized logically by functionality:
├── monitors/ # Monitor configuration and status
├── checks/ # Check results and historical data
├── alerts/ # Alert configuration and history
├── incidents/ # Incident management
├── status-pages/ # Status page configuration
└── teams/ # Team and user management
Each resource typically supports standard HTTP methods:
- GET: Retrieve resource information
- POST: Create new resources
- PUT or PATCH: Update existing resources
- DELETE: Remove resources
Monitoring APIs generally return JSON-formatted responses and use standard HTTP status codes to indicate success or failure.
Example API Response Pattern:
json
{
  "data": {
    "id": "mon_12345abcde",
    "type": "http",
    "name": "API Endpoint Monitor",
    "url": "https://api.example.com/health",
    "created_at": "2023-04-15T18:30:22Z",
    "updated_at": "2023-05-10T09:15:43Z",
    "status": "up",
    "last_checked_at": "2023-05-20T14:22:15Z",
    "check_frequency": 60
  },
  "meta": {
    "request_id": "req_7890xyz"
  }
}
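Client code usually separates the resource from the response metadata before using either. The sketch below assumes the `data`/`meta` envelope shown in the example response; `unwrapEnvelope` is a hypothetical helper name, and a real API may shape its envelope differently.

```javascript
// Hypothetical helper: splits a { data, meta } envelope into its parts.
function unwrapEnvelope(body) {
  if (body === null || typeof body !== 'object' || !('data' in body)) {
    throw new Error('Unexpected response shape: missing "data" field');
  }
  // Keep the request ID around; it is useful when reporting problems to a provider.
  const requestId = body.meta && body.meta.request_id ? body.meta.request_id : null;
  return { resource: body.data, requestId };
}

const sampleBody = JSON.parse(
  '{"data":{"id":"mon_12345abcde","status":"up","check_frequency":60},' +
  '"meta":{"request_id":"req_7890xyz"}}'
);
const { resource, requestId } = unwrapEnvelope(sampleBody);
console.log(resource.status, requestId);
```

Failing fast on an unexpected shape tends to surface API changes earlier than letting `undefined` values propagate through an integration.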
Authentication and Security Best Practices
Secure access to monitoring APIs requires proper authentication and adherence to security best practices.
Common Authentication Methods:
- API Keys: Simple string tokens included in request headers
- OAuth 2.0: More sophisticated authentication flow for applications needing different permissions
- Basic Authentication: Username/password encoded in base64
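As a rough illustration of how the first and third methods translate into request headers, the sketch below builds an `Authorization` header for each. `buildAuthHeaders` and the shape of its `auth` argument are assumptions made for this example; some providers expect API keys in a custom header rather than as a Bearer token, so check the provider's documentation.

```javascript
// Hypothetical helper: returns request headers for a given auth configuration.
function buildAuthHeaders(auth) {
  switch (auth.type) {
    case 'apiKey':
      // Assumes the provider accepts the key as a Bearer token.
      return { Authorization: `Bearer ${auth.key}` };
    case 'basic': {
      // Basic auth is "username:password" encoded as base64.
      const encoded = Buffer.from(`${auth.username}:${auth.password}`, 'utf8').toString('base64');
      return { Authorization: `Basic ${encoded}` };
    }
    default:
      throw new Error(`Unsupported auth type: ${auth.type}`);
  }
}

console.log(buildAuthHeaders({ type: 'apiKey', key: 'example-key' }));
console.log(buildAuthHeaders({ type: 'basic', username: 'user', password: 'pass' }));
```

Base64 is an encoding, not encryption, so Basic Authentication should only be sent over HTTPS.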
API Key Management Best Practices:
- Generate separate keys for different integrations
javascript
const MONITORING_API = {
  READ_ONLY_KEY: process.env.MONITOR_READ_KEY,
  WRITE_KEY: process.env.MONITOR_WRITE_KEY,
  ADMIN_KEY: process.env.MONITOR_ADMIN_KEY
};
// Use appropriate key based on operation
function getApiKey(operation) {
  if (operation === 'read') {
    return MONITORING_API.READ_ONLY_KEY;
  } else if (operation === 'write') {
    return MONITORING_API.WRITE_KEY;
  } else if (operation === 'admin') {
    return MONITORING_API.ADMIN_KEY;
  }
  throw new Error(`Unknown operation: ${operation}`);
}
- Use the principle of least privilege
  - Request only the permissions necessary for your integration
  - Avoid using admin-level API keys in automated systems
- Secure API key storage
  - Never hard-code API keys in application source code
  - Use environment variables or secure credential storage
  - Rotate keys periodically and during team member transitions
- Implement proper error handling
javascript
// Illustrative wrapper; the name callMonitoringApi is an assumption
async function callMonitoringApi(endpoint, method = 'GET', data = null) {
  try {
    const options = {
      method: method,
      headers: {
        'Authorization': `Bearer ${process.env.MONITORING_API_KEY}`,
        'Content-Type': 'application/json'
      }
    };
    if (data && (method === 'POST' || method === 'PUT' || method === 'PATCH')) {
      options.body = JSON.stringify(data);
    }
    const response = await fetch(`https://api.monitoring-service.com/v1/${endpoint}`, options);
    // Handle different HTTP status codes appropriately
    if (response.status === 401) {
      throw new Error('Authentication failed. Check API key validity.');
    }
    if (response.status === 403) {
      throw new Error('Permission denied. Check API key permissions.');
    }
    if (!response.ok) {
      // The error body may not be JSON, so fall back to an empty object
      const errorData = await response.json().catch(() => ({}));
      throw new Error(`API error: ${errorData.message || response.statusText}`);
    }
    return await response.json();
  } catch (error) {
    console.error('API call failed:', error);
    // Implement appropriate error handling for your application
    throw error;
  }
}
Rate Limiting and Performance Considerations
Most monitoring APIs implement rate limiting to ensure fair usage and system stability:
Typical Rate Limit Implementation:
- Default limits often range from 60-300 requests per minute
- Burst allowances for short-term higher usage
- Higher limits for paid/enterprise accounts
- Separate limits for read vs. write operations
Rate Limit Headers:
Many APIs include headers that provide information about your current rate limit status:
X-RateLimit-Remaining: 58
X-RateLimit-Reset: 1621523940
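A client can read these headers to decide whether to pause before its next request. The sketch below assumes `X-RateLimit-Reset` carries a Unix timestamp in seconds, as the sample value suggests; some APIs send a number of seconds remaining instead. `msUntilReset` is a hypothetical helper name.

```javascript
// Sketch: how long to wait before the rate-limit window resets.
// `headers` is anything with a get(name) method, such as fetch's response.headers.
function msUntilReset(headers, nowMs = Date.now()) {
  const remaining = parseInt(headers.get('X-RateLimit-Remaining') ?? '', 10);
  const resetEpochSeconds = parseInt(headers.get('X-RateLimit-Reset') ?? '', 10);
  if (Number.isNaN(remaining) || Number.isNaN(resetEpochSeconds)) {
    return 0; // headers absent or malformed: nothing to base a wait on
  }
  if (remaining > 0) {
    return 0; // budget left in the current window
  }
  return Math.max(0, resetEpochSeconds * 1000 - nowMs);
}

// A Map stands in for response.headers in this example.
const exhausted = new Map([
  ['X-RateLimit-Remaining', '0'],
  ['X-RateLimit-Reset', '1621523940']
]);
console.log(msUntilReset(exhausted, 1621523900 * 1000)); // 40000
```

Passing the current time as a parameter keeps the function deterministic, which makes it easier to unit test.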
Best Practices for Rate Limit Management:
- Implement exponential backoff for retries
javascript
const API_KEY = process.env.MONITORING_API_KEY;
async function makeApiRequestWithRetry(endpoint, maxRetries = 3) {
  let retries = 0;
  while (retries < maxRetries) {
    try {
      const response = await fetch(`https://api.monitoring-service.com/v1/${endpoint}`, {
        headers: {
          'Authorization': `Bearer ${API_KEY}`
        }
      });
      if (response.status === 429) { // Too Many Requests
        const resetTime = parseInt(response.headers.get('X-RateLimit-Reset') || '0', 10);
        const currentTime = Math.floor(Date.now() / 1000);
        const waitTime = resetTime - currentTime;
        // Wait for the shorter of: time until reset, or exponential backoff with jitter
        const backoffTime = Math.min(
          waitTime > 0 ? waitTime * 1000 : 1000,
          Math.pow(2, retries) * 1000 + Math.random() * 1000
        );
        console.warn(`Rate limited. Retrying in ${Math.round(backoffTime)}ms`);
        await new Promise(resolve => setTimeout(resolve, backoffTime));
        retries++;
      } else {
        return await response.json();
      }
    } catch (error) {
      retries++;
      if (retries >= maxRetries) {
        throw error;
      }
    }
  }
  // Every attempt was rate limited; fail loudly instead of returning undefined
  throw new Error(`Rate limit retries exhausted for ${endpoint}`);
}
- Cache responses where appropriate
javascript
// In-memory cache keyed by endpoint
const responseCache = new Map();
async function getCachedApiResponse(endpoint, ttlSeconds = 60) {
  const cacheKey = endpoint;
  const now = Date.now();
  if (responseCache.has(cacheKey)) {
    const cachedData = responseCache.get(cacheKey);
    if (now - cachedData.timestamp < ttlSeconds * 1000) {
      return cachedData.data;
    }
  }
  const response = await makeApiRequestWithRetry(endpoint);
  responseCache.set(cacheKey, {
    timestamp: now,
    data: response
  });
  return response;
}
- Batch operations when possible
javascript
// Run independent requests in parallel (inside an async function)
const promises = [
  makeApiRequestWithRetry('monitors/mon_123'),
  makeApiRequestWithRetry('monitors/mon_456'),
  makeApiRequestWithRetry('monitors/mon_789')
];
const monitors = await Promise.all(promises);
// Use batch endpoints when available
const batchedMonitors = await makeApiRequestWithRetry('monitors?ids=mon_123,mon_456,mon_789');
- Schedule non-urgent operations during off-peak times
javascript
function scheduleReportGeneration() {
  const now = new Date();
  let scheduledTime = new Date(now);
  // Schedule for 3 AM local time
  scheduledTime.setHours(3, 0, 0, 0);
  // If it's already past 3 AM, schedule for tomorrow
  if (now > scheduledTime) {
    scheduledTime.setDate(scheduledTime.getDate() + 1);
  }
  const delayMs = scheduledTime - now;
  setTimeout(() => {
    generateAndSendReport();
    // Schedule next report
    scheduleReportGeneration();
  }, delayMs);
}
Implementing Custom Data Collection and Reporting
One of the most common integration needs is extending monitoring with custom data collection and reporting capabilities.
Custom Data Collection Integration
Most monitoring platforms support custom metric collection through their APIs, allowing you to monitor virtually any system or service.
Creating Custom Monitors:
javascript
async function createCustomServiceMonitor(serviceName, endpoint, checkFrequency = 60) {
  const monitorData = {
    name: `${serviceName} Health Check`,
    type: "http",
    url: endpoint,
    method: "GET",
    check_frequency: checkFrequency,
    locations: ["us-east", "eu-west"],
    assertions: [
      { type: "statusCode", comparison: "equals", value: 200 },
      { type: "responseTime", comparison: "lessThan", value: 500 }
    ],
    alert_settings: {
      sensitivity: "medium",
      notification_channels: ["email", "slack"]
    }
  };
  const response = await fetch('https://api.monitoring-service.com/v1/monitors', {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${API_KEY}`,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify(monitorData)
  });
  if (!response.ok) {
    throw new Error(`Failed to create monitor: ${response.status} ${response.statusText}`);
  }
  const result = await response.json();
  return result.data.id;
}
Reporting Custom Check Results:
For systems that can't be directly accessed by the monitoring infrastructure, you can implement custom checks and report results via API:
javascript
async function reportCustomCheckResult(monitorId, isUp, responseTime, additionalData = {}) {
  const checkData = {
    monitor_id: monitorId,
    status: isUp ? "up" : "down",
    response_time: responseTime,
    check_time: new Date().toISOString(),
    additional_data: additionalData
  };
  const response = await fetch('https://api.monitoring-service.com/v1/checks', {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${API_KEY}`,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify(checkData)
  });
  return await response.json();
}
// Example usage for a database health check
// (`database` stands in for your own database client)
async function checkDatabaseHealth() {
  const startTime = Date.now();
  try {
    // Implement your database health check logic
    const connection = await database.createConnection();
    const result = await connection.query('SELECT 1');
    await connection.close();
    const responseTime = Date.now() - startTime;
    // Report successful check
    await reportCustomCheckResult(
      'mon_database123',
      true,
      responseTime,
      { connections: database.activeConnections, queryResult: result }
    );
  } catch (error) {
    const responseTime = Date.now() - startTime;
    // Report failed check
    await reportCustomCheckResult(
      'mon_database123',
      false,
      responseTime,
      { error: error.message, errorCode: error.code }
    );
  }
}
Implementing Complex Health Checks:
For sophisticated monitoring scenarios, you can implement custom health checks that aggregate multiple conditions:
javascript
async function checkMicroserviceHealth(serviceId) {
  // checkEndpoint, checkDatabase, checkRedisHealth, and checkDependencyServices are
  // application-specific helpers that resolve to { status, responseTime }
  const checks = {
    api: await checkEndpoint(`https://api.internal/${serviceId}/health`),
    database: await checkDatabase(serviceId),
    cache: await checkRedisHealth(serviceId),
    dependencies: await checkDependencyServices(serviceId)
  };
  // Determine overall health
  const isHealthy = Object.values(checks).every(check => check.status === "ok");
  const responseTime = Math.max(...Object.values(checks).map(check => check.responseTime));
  // Report to monitoring service
  await reportCustomCheckResult(
    `mon_service_${serviceId}`,
    isHealthy,
    responseTime,
    { checks }
  );
  return {
    isHealthy,
    checks
  };
}
Custom Reporting and Dashboarding
Monitoring APIs provide access to monitoring data that can be used to create custom reports and dashboards:
Fetching Monitoring Data:
javascript
async function getMonitorHistory(monitorId, days = 7) {
  const endDate = new Date();
  const startDate = new Date();
  startDate.setDate(startDate.getDate() - days);
  // URLSearchParams percent-encodes the ISO timestamps
  const params = new URLSearchParams({
    monitor_id: monitorId,
    start_time: startDate.toISOString(),
    end_time: endDate.toISOString()
  });
  const response = await fetch(
    `https://api.monitoring-service.com/v1/checks?${params}`,
    {
      headers: {
        'Authorization': `Bearer ${API_KEY}`
      }
    }
  );
  return await response.json();
}
Calculating Custom Metrics:
javascript
async function calculatePerformanceMetrics(monitorId, days = 30) {
  const history = await getMonitorHistory(monitorId, days);
  const checks = history.data;
  const totalChecks = checks.length;
  if (totalChecks === 0) {
    // No checks in the period: report empty metrics instead of dividing by zero
    return {
      monitorId,
      period: `${days} days`,
      metrics: {
        uptime: null,
        averageResponseTime: null,
        p95ResponseTime: null,
        totalChecks: 0,
        successfulChecks: 0,
        failedChecks: 0
      }
    };
  }
  // Calculate uptime percentage
  const successfulChecks = checks.filter(check => check.status === "up").length;
  const uptimePercentage = (successfulChecks / totalChecks) * 100;
  // Calculate average response time
  const totalResponseTime = checks.reduce((sum, check) => sum + check.response_time, 0);
  const averageResponseTime = totalResponseTime / totalChecks;
  // Calculate 95th percentile response time (nearest-rank method)
  const responseTimes = checks.map(check => check.response_time).sort((a, b) => a - b);
  const p95Index = Math.ceil(responseTimes.length * 0.95) - 1;
  const p95ResponseTime = responseTimes[p95Index];
  return {
    monitorId,
    period: `${days} days`,
    metrics: {
      uptime: uptimePercentage.toFixed(3),
      averageResponseTime: averageResponseTime.toFixed(2),
      p95ResponseTime: p95ResponseTime,
      totalChecks,
      successfulChecks,
      failedChecks: totalChecks - successfulChecks
    }
  };
}
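To make the arithmetic concrete, here is a small worked example on four hand-written check records. It uses the nearest-rank definition of a percentile, which is one common convention among several; `nearestRankPercentile` and the sample data are illustrative and do not come from any particular API.

```javascript
// Nearest-rank percentile: the smallest value with at least p% of the data at or below it.
function nearestRankPercentile(values, percentile) {
  if (values.length === 0) return null;
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.ceil((percentile / 100) * sorted.length);
  return sorted[Math.max(rank, 1) - 1];
}

const sampleChecks = [
  { status: 'up', response_time: 120 },
  { status: 'up', response_time: 95 },
  { status: 'down', response_time: 2000 },
  { status: 'up', response_time: 110 }
];

const upCount = sampleChecks.filter(check => check.status === 'up').length;
const sampleUptime = (upCount / sampleChecks.length) * 100;
const sampleTimes = sampleChecks.map(check => check.response_time);

console.log(sampleUptime);                            // 75
console.log(nearestRankPercentile(sampleTimes, 50));  // 110
console.log(nearestRankPercentile(sampleTimes, 95));  // 2000
```

With so few samples the 95th percentile is simply the slowest check, which is why percentile metrics are more informative over longer reporting periods.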
Creating Custom Reports:
javascript
async function generateWeeklyReport(teamId) {
  // Get all monitors for the team
  const monitorsResponse = await fetch(`https://api.monitoring-service.com/v1/monitors?team_id=${teamId}`, {
    headers: {
      'Authorization': `Bearer ${API_KEY}`
    }
  });
  const monitors = (await monitorsResponse.json()).data;
  // Calculate metrics for each monitor
  const monitorMetrics = await Promise.all(
    monitors.map(monitor => calculatePerformanceMetrics(monitor.id, 7))
  );
  // Generate report (getAlertCount is an application-specific helper)
  const report = {
    title: "Weekly Performance Report",
    generated_at: new Date().toISOString(),
    period: "Last 7 days",
    summary: {
      totalMonitors: monitors.length,
      healthyMonitors: monitors.filter(m => m.status === "up").length,
      averageUptime: (
        monitorMetrics.reduce((sum, m) => sum + parseFloat(m.metrics.uptime), 0) / monitors.length
      ).toFixed(2),
      alertsTriggered: await getAlertCount(teamId, 7)
    },
    monitors: monitorMetrics
  };
  return report;
}
Integration with Third-Party Dashboarding Tools:
javascript
async function exportToGrafana(monitorIds, grafanaApiKey, grafanaUrl) {
  // Fetch data for each monitor
  const monitorData = await Promise.all(
    monitorIds.map(async (monitorId) => {
      const monitor = await fetch(`https://api.monitoring-service.com/v1/monitors/${monitorId}`, {
        headers: {
          'Authorization': `Bearer ${API_KEY}`
        }
      }).then(res => res.json());
      const history = await getMonitorHistory(monitorId, 1); // Last 24 hours
      return {
        monitor: monitor.data,
        checks: history.data
      };
    })
  );
  // Transform data for Grafana
  const grafanaDatapoints = monitorData.flatMap(data => {
    return data.checks.map(check => ({
      target: data.monitor.name,
      datapoints: [
        [check.status === "up" ? 1 : 0, new Date(check.check_time).getTime()],
        [check.response_time, new Date(check.check_time).getTime()]
      ]
    }));
  });
  // Send to Grafana (the proxied data source path depends on your Grafana setup)
  const response = await fetch(`${grafanaUrl}/api/datasources/proxy/1/api/v1/write`, {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${grafanaApiKey}`,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify(grafanaDatapoints)
  });
  return response.ok;
}
Creating Advanced Alert Workflows
While monitoring platforms provide built-in alerting capabilities, integrating with external systems allows for advanced alert workflows tailored to your organization's needs.
Webhook Event Handling Patterns
Most monitoring systems can send webhook notifications for various events. Implementing proper webhook handlers enables integration with other systems.
Basic Webhook Handler Setup:
javascript
const express = require('express');
const crypto = require('crypto');
const app = express();
// Parse JSON bodies and keep the raw bytes for signature verification
app.use(express.json({
  verify: (req, res, buf) => { req.rawBody = buf; }
}));
// Verify webhook signature (common security practice)
function verifySignature(req) {
  const signature = req.headers['x-signature'];
  if (typeof signature !== 'string' || !req.rawBody) {
    return false;
  }
  // Sign the raw body: re-serializing req.body may not reproduce the sender's bytes
  const expected = crypto.createHmac('sha256', process.env.WEBHOOK_SECRET)
    .update(req.rawBody)
    .digest('hex');
  const expectedBuf = Buffer.from(expected, 'hex');
  const receivedBuf = Buffer.from(signature, 'hex');
  // Constant-time comparison avoids leaking information through timing
  return expectedBuf.length === receivedBuf.length &&
    crypto.timingSafeEqual(expectedBuf, receivedBuf);
}
// Handle monitor status change webhooks
app.post('/webhooks/monitor-status', (req, res) => {
  if (!verifySignature(req)) {
    return res.status(401).send('Invalid signature');
  }
  const { monitor, check, previous_status, current_status } = req.body;
  console.log(`Monitor ${monitor.name} changed from ${previous_status} to ${current_status}`);
  processMonitorStatusChange(monitor, check, previous_status, current_status);
  res.status(200).send('Webhook received');
});
// Handle alert triggered webhooks
app.post('/webhooks/alert-triggered', (req, res) => {
  if (!verifySignature(req)) {
    return res.status(401).send('Invalid signature');
  }
  const { alert, monitor, check } = req.body;
  console.log(`Alert triggered for monitor ${monitor.name}: ${alert.message}`);
  processAlertTriggered(alert, monitor, check);
  res.status(200).send('Webhook received');
});
// Start the server
app.listen(3000, () => {
  console.log('Webhook handler listening on port 3000');
});
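When testing a handler like this locally, it helps to produce signatures the same way a sender would. The sketch below assumes the scheme used above (hex-encoded HMAC-SHA256 of the request body in an `x-signature` header); real providers differ in header name, encoding, and what exactly is signed. `signPayload` and `signaturesMatch` are hypothetical helper names. It also shows why the exact bytes matter: a body that is parsed and re-serialized no longer matches a signature computed over the original text.

```javascript
const crypto = require('crypto');

// Compute the hex HMAC-SHA256 of a body, as a webhook sender might.
function signPayload(rawBody, secret) {
  return crypto.createHmac('sha256', secret).update(rawBody).digest('hex');
}

// Compare two hex signatures in constant time.
function signaturesMatch(expectedHex, receivedHex) {
  const expected = Buffer.from(expectedHex, 'hex');
  const received = Buffer.from(String(receivedHex), 'hex');
  return expected.length === received.length && crypto.timingSafeEqual(expected, received);
}

const testSecret = 'test-secret';
// Note the extra whitespace, which JSON.stringify would not reproduce.
const rawBody = '{ "monitor": { "name": "API" }, "current_status": "down" }';
const sentSignature = signPayload(rawBody, testSecret);
const reserialized = JSON.stringify(JSON.parse(rawBody));

console.log(signaturesMatch(signPayload(rawBody, testSecret), sentSignature));      // true
console.log(signaturesMatch(signPayload(reserialized, testSecret), sentSignature)); // false
```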
Implementing Webhook Queuing for Reliability:
javascript
const { Queue } = require('bullmq');
// Create a queue for webhook processing
const webhookQueue = new Queue('webhooks', {
  connection: {
    host: process.env.REDIS_HOST,
    port: Number(process.env.REDIS_PORT)
  }
});
// Add webhook job to queue in handler
app.post('/webhooks/alert-triggered', async (req, res) => {
  if (!verifySignature(req)) {
    return res.status(401).send('Invalid signature');
  }
  const eventId = req.headers['x-event-id'];
  try {
    // Add job to queue
    await webhookQueue.add('alert', {
      eventId,
      data: req.body,
      receivedAt: new Date().toISOString()
    }, {
      // Set job options
      attempts: 3,
      backoff: {
        type: 'exponential',
        delay: 5000
      }
    });
  } catch (error) {
    // Signal failure so the sender can retry delivery
    console.error('Failed to queue webhook:', error);
    return res.status(500).send('Failed to queue webhook');
  }
  // Acknowledge receipt once the job is queued
  res.status(202).send('Webhook queued for processing');
});
// Process queue in a separate worker
const { Worker } = require('bullmq');
const webhookWorker = new Worker('webhooks', async job => {
  const { eventId, data, receivedAt } = job.data;
  console.log(`Processing webhook ${eventId} received at ${receivedAt}`);
  switch (job.name) {
    case 'alert':
      await processAlertTriggered(data.alert, data.monitor, data.check);
      break;
    case 'monitor-status':
      await processMonitorStatusChange(data.monitor, data.check, data.previous_status, data.current_status);
      break;
    default:
      console.warn(`Unknown webhook type: ${job.name}`);
  }
}, {
  connection: {
    host: process.env.REDIS_HOST,
    port: Number(process.env.REDIS_PORT)
  }
});
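Senders commonly redeliver a webhook when they do not receive a timely acknowledgement, so a queue-based handler may see the same event more than once. A minimal in-memory sketch of deduplication by event ID follows. It assumes the `x-event-id` value stays the same across redeliveries, which depends on the provider; `SeenEvents` is a hypothetical class, and a deployment with several handler processes would need a shared store instead.

```javascript
// Sketch: remember recently seen event IDs so redelivered webhooks are handled once.
class SeenEvents {
  constructor(ttlMs = 10 * 60 * 1000) {
    this.ttlMs = ttlMs;
    this.seen = new Map(); // eventId -> expiry timestamp in ms
  }

  // Returns true the first time an ID is seen within the TTL, false for duplicates.
  markIfNew(eventId, nowMs = Date.now()) {
    // Drop expired entries so the map does not grow without bound.
    for (const [id, expiresAt] of this.seen) {
      if (expiresAt <= nowMs) this.seen.delete(id);
    }
    if (this.seen.has(eventId)) return false;
    this.seen.set(eventId, nowMs + this.ttlMs);
    return true;
  }
}

const seenEvents = new SeenEvents();
console.log(seenEvents.markIfNew('evt_1')); // true
console.log(seenEvents.markIfNew('evt_1')); // false
```

In a handler, a `false` result would typically lead to acknowledging the request without queuing a second job.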
Building Custom Alerting Workflows
Integrating with third-party systems allows for sophisticated alerting workflows:
Integration with Incident Management Systems:
javascript
async function createPagerDutyIncident(alert, monitor, check) {
  const response = await fetch('https://api.pagerduty.com/incidents', {
    method: 'POST',
    headers: {
      'Authorization': `Token token=${process.env.PAGERDUTY_API_KEY}`,
      'Accept': 'application/vnd.pagerduty+json;version=2',
      'Content-Type': 'application/json',
      // PagerDuty expects a From header with the email of a valid user when creating
      // incidents; PAGERDUTY_FROM_EMAIL is an assumed variable name
      'From': process.env.PAGERDUTY_FROM_EMAIL
    },
    body: JSON.stringify({
      incident: {
        type: 'incident',
        title: `${monitor.name}: ${alert.message}`,
        service: {
          // getPagerDutyServiceId is an application-specific lookup
          id: getPagerDutyServiceId(monitor),
          type: 'service_reference'
        },
        urgency: alert.severity === 'critical' ? 'high' : 'low',
        body: {
          type: 'incident_body',
          details: [
            `Monitor: ${monitor.name}`,
            `Status: ${check.status}`,
            `Response Time: ${check.response_time}ms`,
            `Error: ${check.error || 'N/A'}`,
            `Check Time: ${check.check_time}`
          ].join('\n')
        },
        custom_details: {
          monitor_id: monitor.id,
          check_id: check.id,
          alert_id: alert.id,
          url: monitor.url,
          response_code: check.status_code
        }
      }
    })
  });
  const result = await response.json();
  return result.incident.id;
}
Slack Integration for Alerts:
javascript
async function sendSlackAlertNotification(alert, monitor, check) {
  const color = alert.severity === 'critical' ? '#ff0000' :
    alert.severity === 'warning' ? '#ffa500' : '#36a64f';
  const response = await fetch(process.env.SLACK_WEBHOOK_URL, {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({
      attachments: [
        {
          color: color,
          title: `${alert.severity.toUpperCase()}: ${monitor.name}`,
          title_link: `https://monitoring-dashboard.example.com/monitors/${monitor.id}`,
          text: alert.message,
          fields: [
            { title: 'Status', value: check.status.toUpperCase(), short: true },
            { title: 'Response Time', value: `${check.response_time}ms`, short: true },
            { title: 'URL', value: monitor.url, short: false },
            { title: 'Error', value: check.error || 'None', short: false }
          ],
          actions: [
            { type: 'button', text: 'View Monitor', url: `https://monitoring-dashboard.example.com/monitors/${monitor.id}` },
            { type: 'button', text: 'Acknowledge', url: `https://monitoring-dashboard.example.com/alerts/${alert.id}/acknowledge` }
          ],
          footer: `Monitor Alert - ${new Date(check.check_time).toLocaleString()}`
        }
      ]
    })
  });
  return response.ok;
}
SMS Notifications with Twilio:
javascript
async function sendTwilioSmsAlert(phoneNumber, alert, monitor) {
  const twilioClient = require('twilio')(
    process.env.TWILIO_ACCOUNT_SID,
    process.env.TWILIO_AUTH_TOKEN
  );
  try {
    const message = await twilioClient.messages.create({
      body: `ALERT: ${monitor.name} is ${alert.status.toUpperCase()}. ${alert.message}`,
      from: process.env.TWILIO_PHONE_NUMBER,
      to: phoneNumber
    });
    return message.sid;
  } catch (error) {
    console.error('Failed to send SMS alert:', error);
    throw error;
  }
}
Implementing Escalation Logic:
javascript
async function checkForUnacknowledgedAlerts(monitoringApiBaseUrl, apiKey) {
  // Get all unacknowledged alerts older than 15 minutes
  const fifteenMinutesAgo = new Date();
  fifteenMinutesAgo.setMinutes(fifteenMinutesAgo.getMinutes() - 15);
  const response = await fetch(
    `${monitoringApiBaseUrl}/alerts?status=triggered&before=${encodeURIComponent(fifteenMinutesAgo.toISOString())}`,
    {
      headers: {
        'Authorization': `Bearer ${apiKey}`
      }
    }
  );
  const alerts = (await response.json()).data;
  // Process each unacknowledged alert
  for (const alert of alerts) {
    // Get monitor details
    const monitorResponse = await fetch(`${monitoringApiBaseUrl}/monitors/${alert.monitor_id}`, {
      headers: {
        'Authorization': `Bearer ${apiKey}`
      }
    });
    const monitor = (await monitorResponse.json()).data;
    // Escalate the alert
    await escalateAlert(alert, monitor);
  }
}
async function escalateAlert(alert, monitor) {
  console.log(`Escalating alert ${alert.id} for monitor ${monitor.name}`);
  // Get the team's escalation policy
  const teamId = monitor.team_id;
  const escalationPolicy = await getTeamEscalationPolicy(teamId);
  if (!escalationPolicy) {
    console.warn(`No escalation policy found for team ${teamId}`);
    return;
  }
  // Determine escalation level based on alert age
  const alertAge = (new Date() - new Date(alert.created_at)) / (1000 * 60); // in minutes
  let escalationLevel;
  if (alertAge > 60) {
    escalationLevel = 'management';
  } else if (alertAge > 30) {
    escalationLevel = 'secondary';
  } else {
    escalationLevel = 'primary';
  }
  // Get contacts for this escalation level (empty if the policy defines none)
  const contacts = escalationPolicy[escalationLevel] || [];
  // Notify all contacts in this escalation level
  for (const contact of contacts) {
    switch (contact.type) {
      case 'email':
        await sendEscalatedEmail(contact.value, alert, monitor);
        break;
      case 'sms':
        await sendTwilioSmsAlert(contact.value, alert, monitor);
        break;
      case 'phone':
        await initiatePhoneCall(contact.value, alert, monitor);
        break;
      default:
        console.warn(`Unknown contact type: ${contact.type}`);
    }
  }
  // Log escalation
  await logAlertEscalation(alert.id, escalationLevel, contacts);
}
Building Alert Aggregation Systems
For organizations with large monitoring deployments, alert aggregation can reduce notification fatigue:
javascript
class AlertAggregator {
  constructor(options = {}) {
    this.aggregationWindow = options.aggregationWindow || 5 * 60 * 1000;
    this.maxAlertsPerNotification = options.maxAlertsPerNotification || 10;
    this.pendingAlerts = new Map();
    this.aggregationTimers = new Map();
  }
  processAlert(alert, monitor) {
    const serviceKey = monitor.service || 'default';
    // Initialize service bucket if needed
    if (!this.pendingAlerts.has(serviceKey)) {
      this.pendingAlerts.set(serviceKey, []);
    }
    // Add alert to pending list
    this.pendingAlerts.get(serviceKey).push({
      alert,
      monitor,
      receivedAt: new Date()
    });
    // Create aggregation timer if missing
    if (!this.aggregationTimers.has(serviceKey)) {
      const timerId = setTimeout(() => {
        this.sendAggregatedNotification(serviceKey);
      }, this.aggregationWindow);
      this.aggregationTimers.set(serviceKey, timerId);
    }
    // Send immediately if alert limit reached
    if (this.pendingAlerts.get(serviceKey).length >= this.maxAlertsPerNotification) {
      this.sendAggregatedNotification(serviceKey);
    }
  }
  sendAggregatedNotification(serviceKey) {
    // Clear the timer
    if (this.aggregationTimers.has(serviceKey)) {
      clearTimeout(this.aggregationTimers.get(serviceKey));
      this.aggregationTimers.delete(serviceKey);
    }
    const alerts = this.pendingAlerts.get(serviceKey) || [];
    if (alerts.length === 0) return;
    const alertsBySeverity = {
      critical: alerts.filter(a => a.alert.severity === 'critical'),
      warning: alerts.filter(a => a.alert.severity === 'warning'),
      info: alerts.filter(a => a.alert.severity === 'info')
    };
    const notification = {
      service: serviceKey,
      timestamp: new Date(),
      total: alerts.length,
      severityCounts: {
        critical: alertsBySeverity.critical.length,
        warning: alertsBySeverity.warning.length,
        info: alertsBySeverity.info.length
      },
      alerts: {
        critical: alertsBySeverity.critical.map(this.formatAlertSummary),
        warning: alertsBySeverity.warning.map(this.formatAlertSummary),
        info: alertsBySeverity.info.map(this.formatAlertSummary)
      }
    };
    this.sendNotification(notification);
    this.pendingAlerts.set(serviceKey, []);
  }
  formatAlertSummary(alertData) {
    return {
      id: alertData.alert.id,
      monitorName: alertData.monitor.name,
      message: alertData.alert.message,
      receivedAt: alertData.receivedAt
    };
  }
  async sendNotification(notification) {
    // Example: Send to Slack
    try {
      const response = await fetch(process.env.SLACK_AGGREGATED_WEBHOOK_URL, {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({
          text: `Alert Digest for ${notification.service}`,
          attachments: [
            {
              color: notification.severityCounts.critical > 0 ? '#ff0000' :
                notification.severityCounts.warning > 0 ? '#ffa500' : '#36a64f',
              title: `${notification.total} alerts in the last ${this.aggregationWindow / 60000} minutes`,
              fields: [
                { title: 'Critical', value: notification.severityCounts.critical.toString(), short: true },
                { title: 'Warning', value: notification.severityCounts.warning.toString(), short: true },
                { title: 'Info', value: notification.severityCounts.info.toString(), short: true }
              ],
              text: this.formatAlertDigest(notification)
            }
          ]
        })
      });
      if (!response.ok) {
        console.error('Failed to send aggregated notification');
      }
    } catch (error) {
      console.error('Error sending aggregated notification:', error);
    }
  }
  formatAlertDigest(notification) {
    let digest = '';
    if (notification.severityCounts.critical > 0) {
      digest += 'Critical Alerts:\n';
      notification.alerts.critical.forEach(alert => {
        digest += `- ${alert.monitorName}: ${alert.message}\n`;
      });
      digest += '\n';
    }
    if (notification.severityCounts.warning > 0) {
      digest += 'Warning Alerts:\n';
      notification.alerts.warning.forEach(alert => {
        digest += `- ${alert.monitorName}: ${alert.message}\n`;
      });
      digest += '\n';
    }
    if (notification.severityCounts.info > 0) {
      digest += 'Info Alerts:\n';
      notification.alerts.info.forEach(alert => {
        digest += `- ${alert.monitorName}: ${alert.message}\n`;
      });
    }
    return digest;
  }
}
// Webhook usage example
const aggregator = new AlertAggregator({
  aggregationWindow: 10 * 60 * 1000, // 10 minutes
  maxAlertsPerNotification: 15
});
app.post('/webhooks/alert-triggered', (req, res) => {
  if (!verifySignature(req)) {
    return res.status(401).send('Invalid signature');
  }
  const { alert, monitor } = req.body;
  aggregator.processAlert(alert, monitor);
  res.status(200).send('Alert received');
});
Programmatic Alert Management
For sophisticated monitoring deployments, programmatic alert management can help reduce alert noise and improve response efficiency:
javascript
class AlertManager {
  constructor(apiBaseUrl, apiKey) {
    this.apiBaseUrl = apiBaseUrl;
    this.apiKey = apiKey;
  }
  // Get active alerts
  async getActiveAlerts(filters = {}) {
    let url = `${this.apiBaseUrl}/alerts?status=triggered`;
    if (filters.teamId) {
      url += `&team_id=${encodeURIComponent(filters.teamId)}`;
    }
    if (filters.severity) {
      url += `&severity=${encodeURIComponent(filters.severity)}`;
    }
    if (filters.monitorId) {
      url += `&monitor_id=${encodeURIComponent(filters.monitorId)}`;
    }
    const response = await fetch(url, {
      headers: {
        'Authorization': `Bearer ${this.apiKey}`
      }
    });
    return (await response.json()).data;
  }
  // Acknowledge an alert
  async acknowledgeAlert(alertId, acknowledgedBy, note = '') {
    const response = await fetch(`${this.apiBaseUrl}/alerts/${alertId}/acknowledge`, {
      method: 'POST',
      headers: {
        'Authorization': `Bearer ${this.apiKey}`,
        'Content-Type': 'application/json'
      },
      body: JSON.stringify({
        acknowledged_by: acknowledgedBy,
        note: note
      })
    });
    return await response.json();
  }
}
// Usage example
const alertManager = new AlertManager(
  "https://api.monitoring-service.com/v1",
  process.env.MONITORING_API_KEY
);
// setupAutoResolutionListener() and detectAndHandleAlertStorm() are additional
// AlertManager methods whose implementations are not shown here
// Set up auto-resolution
alertManager.setupAutoResolutionListener();
// Periodically check for alert storms
setInterval(() => {
  alertManager.detectAndHandleAlertStorm(5, 10)
    .catch(error => console.error("Error detecting alert storm:", error));
}, 2 * 60 * 1000); // Check every 2 minutes
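The core of alert storm detection can be expressed as a small pure function over a list of alerts. The sketch below is one possible interpretation, not the implementation behind the usage example: it assumes the two arguments mean a window in minutes and a minimum alert count, and that each alert carries a `created_at` timestamp. `isAlertStorm` is a hypothetical name.

```javascript
// Hypothetical helper: true when at least `threshold` alerts were created
// within the last `windowMinutes`.
function isAlertStorm(alerts, windowMinutes, threshold, nowMs = Date.now()) {
  const cutoff = nowMs - windowMinutes * 60 * 1000;
  const recent = alerts.filter(alert => new Date(alert.created_at).getTime() >= cutoff);
  return recent.length >= threshold;
}

const stormNow = Date.parse('2023-05-20T14:30:00Z');
const stormAlerts = [
  { id: 'alt_1', created_at: '2023-05-20T14:28:00Z' },
  { id: 'alt_2', created_at: '2023-05-20T14:28:30Z' },
  { id: 'alt_3', created_at: '2023-05-20T14:29:00Z' },
  { id: 'alt_4', created_at: '2023-05-20T14:00:00Z' } // outside a 5-minute window
];

console.log(isAlertStorm(stormAlerts, 5, 3, stormNow)); // true
console.log(isAlertStorm(stormAlerts, 5, 4, stormNow)); // false
```

Once a storm is detected, a typical response is to pause individual notifications and send a single summary instead.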
Practical Integration Examples
Here are some practical examples of common integration patterns with monitoring APIs.
Integrating with CI/CD Pipelines
Automated testing and deployment processes can benefit from integration with monitoring systems:
javascript
async function recordDeployment(apiBaseUrl, apiKey, environment, version, details) {
  const response = await fetch(`${apiBaseUrl}/events`, {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${apiKey}`,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({
      type: 'deployment',
      title: `Deployed v${version} to ${environment}`,
      description: details,
      environment: environment,
      metadata: {
        version,
        deployer: process.env.CI_USERNAME || 'CI system',
        commit: process.env.CI_COMMIT_SHA,
        build_number: process.env.CI_BUILD_NUMBER
      }
    })
  });
  return await response.json();
}
// In your CI/CD pipeline script
async function deploymentPostTask() {
  try {
    await recordDeployment(
      'https://api.monitoring-service.com/v1',
      process.env.MONITORING_API_KEY,
      process.env.DEPLOY_ENVIRONMENT,
      process.env.PACKAGE_VERSION,
      `Deployment of version ${process.env.PACKAGE_VERSION} completed successfully.`
    );
    await verifyMonitorsAfterDeployment();
  } catch (error) {
    console.error('Error in deployment post-task:', error);
    process.exit(1);
  }
}
// Verify monitors health after deployment
async function verifyMonitorsAfterDeployment() {
  const apiBaseUrl = 'https://api.monitoring-service.com/v1';
  const apiKey = process.env.MONITORING_API_KEY;
  const response = await fetch(
    `${apiBaseUrl}/monitors?service=${process.env.SERVICE_NAME}`,
    { headers: { 'Authorization': `Bearer ${apiKey}` } }
  );
  const monitors = (await response.json()).data;
  // Give the monitoring service time to run checks against the new version
  console.log('Waiting for monitoring checks to run...');
  await new Promise(resolve => setTimeout(resolve, 5 * 60 * 1000));
  let allHealthy = true;
  for (const monitor of monitors) {
    const statusResponse = await fetch(
      `${apiBaseUrl}/monitors/${monitor.id}`,
      { headers: { 'Authorization': `Bearer ${apiKey}` } }
    );
    const monitorStatus = (await statusResponse.json()).data;
    if (monitorStatus.status !== 'up') {
      console.error(`Monitor ${monitor.name} is ${monitorStatus.status} after deployment`);
      allHealthy = false;
    }
  }
  if (!allHealthy) {
    await recordDeployment(
      apiBaseUrl,
      apiKey,
      process.env.DEPLOY_ENVIRONMENT,
      process.env.PACKAGE_VERSION,
      'Deployment health check failed. Some monitors are not healthy.'
    );
    throw new Error('Post-deployment health check failed');
  }
  console.log('All monitors are healthy after deployment');
}
Scheduled Maintenance Integration
Automatically create and manage scheduled maintenance windows:
javascript
async function scheduleMaintenanceWindow(apiBaseUrl, apiKey, options) {
  const {
    title,
    description,
    startTime,
    endTime,
    affectedComponents = []
  } = options;
  const response = await fetch(`${apiBaseUrl}/maintenance`, {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${apiKey}`,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({
      title,
      description,
      start_time: startTime.toISOString(),
      end_time: endTime.toISOString(),
      affected_components: affectedComponents
    })
  });
  return await response.json();
}
// Usage in infrastructure automation script
async function performDatabaseMaintenance() {
  const apiBaseUrl = 'https://api.monitoring-service.com/v1';
  const apiKey = process.env.MONITORING_API_KEY;
  // Calculate maintenance window: starts in 30 minutes, lasts 2 hours
  const now = new Date();
  const startTime = new Date(now.getTime() + 30 * 60 * 1000);
  const endTime = new Date(startTime.getTime() + 2 * 60 * 60 * 1000);
  // Schedule maintenance window
  const maintenance = await scheduleMaintenanceWindow(apiBaseUrl, apiKey, {
    title: 'Database Maintenance',
    description: 'Scheduled database maintenance including version upgrade and index optimization.',
    startTime,
    endTime,
    affectedComponents: ['database', 'api', 'backend-services']
  });
  console.log(`Maintenance window scheduled: ${maintenance.data.id}`);
  // Perform actual maintenance when window starts
  const timeUntilStart = startTime - now;
  setTimeout(async () => {
    await updateMaintenanceStatus(apiBaseUrl, apiKey, maintenance.data.id, 'in_progress', 'Maintenance has started');
    try {
      // performDatabaseUpgrade and optimizeDatabaseIndexes are application-specific
      await performDatabaseUpgrade();
      await optimizeDatabaseIndexes();
      await updateMaintenanceStatus(
        apiBaseUrl,
        apiKey,
        maintenance.data.id,
        'completed',
        'Maintenance completed successfully'
      );
    } catch (error) {
      console.error('Error during maintenance:', error);
      await updateMaintenanceStatus(
        apiBaseUrl,
        apiKey,
        maintenance.data.id,
        'completed',
        `Maintenance completed with issues: ${error.message}`
      );
    }
  }, timeUntilStart);
}
// Update maintenance status
async function updateMaintenanceStatus(apiBaseUrl, apiKey, maintenanceId, status, message) {
  const response = await fetch(`${apiBaseUrl}/maintenance/${maintenanceId}`, {
    method: 'PATCH',
    headers: {
      'Authorization': `Bearer ${apiKey}`,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({ status, message })
  });
  return await response.json();
}
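Other parts of an integration, such as alert handlers, often need to know whether a maintenance window is currently active so they can suppress expected notifications. A minimal sketch follows, assuming a window object with the `start_time` and `end_time` ISO 8601 fields used in the request body above; `isInMaintenanceWindow` is a hypothetical helper, and whether the end time is inclusive is a choice made for this example.

```javascript
// Hypothetical helper: true when `at` falls inside the window (start inclusive, end exclusive).
function isInMaintenanceWindow(maintenanceWindow, at = new Date()) {
  const start = new Date(maintenanceWindow.start_time).getTime();
  const end = new Date(maintenanceWindow.end_time).getTime();
  const t = at.getTime();
  return t >= start && t < end;
}

const sampleWindow = {
  start_time: '2023-05-20T02:00:00Z',
  end_time: '2023-05-20T04:00:00Z'
};

console.log(isInMaintenanceWindow(sampleWindow, new Date('2023-05-20T03:00:00Z'))); // true
console.log(isInMaintenanceWindow(sampleWindow, new Date('2023-05-20T04:00:00Z'))); // false
```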
Conclusion
Building custom integrations with monitoring APIs enables you to extend monitoring capabilities, automate workflows, and connect with your existing tools and systems. By following the patterns and practices outlined in this guide, you can build robust, reliable monitoring integrations that provide deeper visibility into your infrastructure and applications.
Remember these key points when building your custom integrations:
- Security First: Always follow best practices for API key management and secure communication.
- Reliability Matters: Implement proper error handling, retries, and failover mechanisms in your integrations.
- Rate Limit Awareness: Design your integrations with rate limits in mind, using techniques like caching and batching to optimize API usage.
- Start Simple: Begin with basic integrations and add complexity as needed, building on a solid foundation.
- Test Thoroughly: Ensure your integrations work as expected under various conditions, including error scenarios and edge cases.
By leveraging monitoring API capabilities, you can create a tailored monitoring solution that fits seamlessly into your organization's workflows and infrastructure.