Pattern 9: Early Warning Signals

Intent

Detect problems early by monitoring for sudden changes, gradual declines, pattern breaks, and threshold violations, generating timely alerts that enable intervention before issues become crises.

Also Known As

Anomaly Detection
Alert System
Early Detection
Warning Indicators
Predictive Alerts

Problem

By the time problems are obvious, it's often too late.

Martinez family withdraws. Sarah looks at the data: - Engagement score dropped from 76 to 28 over 3 months - Email opens went from 80% to 15% - Portal logins went from weekly to none in 47 days - Event attendance dropped from 75% to 0% - First payment late by 2 days, second by 14 days

The signs were there. Sarah just didn't notice them in time.

Without early warning: - React to crises instead of preventing them - Intervention comes too late to be effective - Constant firefighting, no proactive management - Miss opportunities to save relationships - Lose families who could have been retained

The problem: 100 families × dozens of metrics = impossible to monitor manually.

Context

When this pattern applies:

Managing many entities (can't monitor all manually)
Historical data available to establish baselines
Early intervention is effective (can save situations)
Cost of false positives < cost of missed problems
Problems develop gradually (sudden crises can't be predicted)

When this pattern may not be needed:

Very small scale where manual monitoring works
No historical baseline (brand new system)
Early intervention doesn't help (problems are sudden, unavoidable)
Alert fatigue is already severe

Forces

Competing concerns:

1. Sensitivity vs Noise - High sensitivity = catch all problems, but many false positives - Low sensitivity = miss real problems - Balance: Tune thresholds to minimize both

2. Early vs Accurate - Detect early = more false positives (uncertainty higher) - Wait for certainty = intervention may be too late - Balance: Alert early with confidence score

3. Comprehensive vs Focused - Monitor everything = complete coverage - Monitor key metrics = manageable alert volume - Balance: Monitor critical signals, ignore noise

4. Automated vs Manual - Automated alerts = consistent, scalable - Manual review = context-aware, nuanced - Balance: Auto-alert with manual triage

5. Real-time vs Batch - Real-time = immediate alerts - Batch (daily/weekly) = less overhead, group similar alerts - Balance: Critical signals real-time, others batched

Solution

Implement multi-layered early warning system that detects:

1. Sudden Changes (Immediate alerts) - Engagement score drops >15 points in one calculation - Tier drops 2+ levels - Risk score spikes >30 points - Critical risk dimension emerges

2. Gradual Declines (Trend alerts) - Score declining 3+ consecutive periods - Downward velocity accelerating - Consistent negative trajectory

3. Threshold Violations (Static alerts) - Score falls below critical threshold (e.g., <40) - Risk exceeds danger zone (e.g., >80) - Key metric hits zero (portal: 60+ days no login)

4. Pattern Breaks (Anomaly alerts) - Behavior deviates significantly from historical baseline - Expected interaction doesn't happen - Unusual combination of events

5. Predictive Signals (Forecasting alerts) - Projected to cross threshold in N days - On trajectory toward problem - Similar patterns preceded past issues

Structure

Alert Configuration Table

-- Define alert rules
CREATE TABLE alert_rules (
  rule_id INT PRIMARY KEY IDENTITY(1,1),
  rule_name VARCHAR(100) NOT NULL,
  rule_type VARCHAR(50) NOT NULL,  -- 'sudden_change', 'gradual_decline', 'threshold', 'pattern_break', 'predictive'

  -- What to monitor
  metric_name VARCHAR(100),  -- 'engagement_score', 'withdrawal_risk', etc.

  -- Conditions
  condition_operator VARCHAR(20),  -- '<', '>', 'drops_by', 'increases_by', 'crosses', 'unchanged_for'
  threshold_value DECIMAL(10,2),
  lookback_periods INT DEFAULT 1,  -- How many periods to compare

  -- Alert properties
  severity VARCHAR(20) DEFAULT 'medium',  -- 'low', 'medium', 'high', 'critical'
  alert_frequency VARCHAR(50) DEFAULT 'once',  -- 'once', 'daily', 'every_time'
  requires_confirmation BIT DEFAULT 0,  -- Does this need manual review before alert?

  -- Actions
  notification_channels VARCHAR(200),  -- JSON: ['email', 'sms', 'dashboard']
  auto_intervention_enabled BIT DEFAULT 0,
  intervention_pattern_id INT NULL,  -- Link to intervention to trigger

  -- Metadata
  enabled BIT DEFAULT 1,
  description NVARCHAR(500),
  created_date DATETIME2 DEFAULT GETDATE(),

  CONSTRAINT UQ_rule_name UNIQUE (rule_name)
);

-- Seed some common alert rules
INSERT INTO alert_rules (rule_name, rule_type, metric_name, condition_operator, threshold_value, severity, description) VALUES
  ('Engagement Score Sudden Drop', 'sudden_change', 'engagement_score', 'drops_by', 15, 'high', 
   'Engagement score dropped 15+ points in single calculation'),

  ('Critical Tier Entry', 'threshold', 'engagement_score', '<', 40, 'critical',
   'Family entered Critical tier'),

  ('Gradual Engagement Decline', 'gradual_decline', 'engagement_score', 'declining', 3, 'medium',
   'Engagement declining for 3+ consecutive periods'),

  ('Portal Abandonment', 'threshold', 'days_since_portal_login', '>', 60, 'high',
   'Family has not logged into portal in 60+ days'),

  ('Payment Risk Spike', 'sudden_change', 'payment_risk', 'increases_by', 30, 'high',
   'Payment risk increased 30+ points suddenly'),

  ('Withdrawal Risk Critical', 'threshold', 'withdrawal_risk', '>', 80, 'critical',
   'Withdrawal risk exceeded 80% threshold'),

  ('Tier Demotion', 'pattern_break', 'tier_change', 'demoted', 1, 'medium',
   'Family demoted to lower tier'),

  ('Zero Event Attendance', 'threshold', 'event_attendance_rate', '=', 0, 'medium',
   'Family attended 0% of events in period');

-- Store generated alerts
CREATE TABLE alerts (
  alert_id INT PRIMARY KEY IDENTITY(1,1),
  family_id INT NOT NULL,
  rule_id INT NOT NULL,

  -- Alert details
  alert_date DATETIME2 DEFAULT GETDATE(),
  severity VARCHAR(20),
  alert_message NVARCHAR(1000),

  -- Metrics at time of alert
  metric_value DECIMAL(10,2),
  previous_value DECIMAL(10,2),
  threshold_crossed DECIMAL(10,2),

  -- Status
  status VARCHAR(50) DEFAULT 'new',  -- 'new', 'acknowledged', 'investigating', 'resolved', 'false_positive'
  acknowledged_by VARCHAR(100),
  acknowledged_date DATETIME2,
  resolution_notes NVARCHAR(2000),
  resolved_date DATETIME2,

  -- Actions taken
  intervention_triggered BIT DEFAULT 0,
  intervention_id INT NULL,

  CONSTRAINT FK_alert_family FOREIGN KEY (family_id) 
    REFERENCES families(family_id),
  CONSTRAINT FK_alert_rule FOREIGN KEY (rule_id) 
    REFERENCES alert_rules(rule_id)
);

-- Indexes
CREATE INDEX IX_alert_status ON alerts(status, alert_date);
CREATE INDEX IX_alert_severity ON alerts(severity) WHERE status = 'new';
CREATE INDEX IX_alert_family ON alerts(family_id, alert_date);

Implementation

Early Warning Engine

class EarlyWarningEngine {
  constructor(db) {
    this.db = db;
    this.alertRules = null;
  }

  async loadAlertRules() {
    this.alertRules = await this.db.query(`
      SELECT * FROM alert_rules WHERE enabled = 1
    `);
  }

  async checkAllFamilies() {
    if (!this.alertRules) {
      await this.loadAlertRules();
    }

    const families = await this.db.query(`
      SELECT family_id FROM families WHERE enrolled_current_semester = 1
    `);

    const results = {
      total_checked: families.length,
      alerts_generated: 0,
      by_severity: { low: 0, medium: 0, high: 0, critical: 0 }
    };

    for (const family of families) {
      const alerts = await this.checkFamily(family.family_id);
      results.alerts_generated += alerts.length;

      alerts.forEach(alert => {
        results.by_severity[alert.severity]++;
      });
    }

    return results;
  }

  async checkFamily(familyId) {
    const generatedAlerts = [];

    // Get current metrics
    const currentMetrics = await this.getCurrentMetrics(familyId);
    const historicalMetrics = await this.getHistoricalMetrics(familyId);

    // Check each alert rule
    for (const rule of this.alertRules) {
      const shouldAlert = await this.evaluateRule(rule, currentMetrics, historicalMetrics);

      if (shouldAlert.triggered) {
        // Check if we've already alerted for this (avoid duplicates)
        const recentAlert = await this.getRecentAlert(familyId, rule.rule_id, 7); // Last 7 days

        if (!recentAlert || rule.alert_frequency === 'every_time') {
          const alert = await this.generateAlert(familyId, rule, shouldAlert.details);
          generatedAlerts.push(alert);
        }
      }
    }

    return generatedAlerts;
  }

  async evaluateRule(rule, current, historical) {
    switch (rule.rule_type) {
      case 'sudden_change':
        return this.evaluateSuddenChange(rule, current, historical);
      case 'gradual_decline':
        return this.evaluateGradualDecline(rule, current, historical);
      case 'threshold':
        return this.evaluateThreshold(rule, current);
      case 'pattern_break':
        return this.evaluatePatternBreak(rule, current, historical);
      case 'predictive':
        return this.evaluatePredictive(rule, current, historical);
      default:
        return { triggered: false };
    }
  }

  evaluateSuddenChange(rule, current, historical) {
    const currentValue = current[rule.metric_name];
    const previousValue = historical.length > 0 ? historical[0][rule.metric_name] : null;

    if (currentValue === null || previousValue === null) {
      return { triggered: false };
    }

    const change = Math.abs(currentValue - previousValue);

    if (rule.condition_operator === 'drops_by' && previousValue > currentValue) {
      if (change >= rule.threshold_value) {
        return {
          triggered: true,
          details: {
            current: currentValue,
            previous: previousValue,
            change: -change,
            message: `${rule.metric_name} dropped from ${previousValue.toFixed(1)} to ${currentValue.toFixed(1)} (-${change.toFixed(1)})`
          }
        };
      }
    } else if (rule.condition_operator === 'increases_by' && currentValue > previousValue) {
      if (change >= rule.threshold_value) {
        return {
          triggered: true,
          details: {
            current: currentValue,
            previous: previousValue,
            change: change,
            message: `${rule.metric_name} increased from ${previousValue.toFixed(1)} to ${currentValue.toFixed(1)} (+${change.toFixed(1)})`
          }
        };
      }
    }

    return { triggered: false };
  }

  evaluateGradualDecline(rule, current, historical) {
    // Check if metric has been declining for N consecutive periods
    const periods = rule.lookback_periods || 3;

    if (historical.length < periods) {
      return { triggered: false }; // Not enough data
    }

    const values = [current[rule.metric_name], ...historical.slice(0, periods - 1).map(h => h[rule.metric_name])];

    // Check if each value is less than previous
    let isConsistentlyDeclining = true;
    for (let i = 1; i < values.length; i++) {
      if (values[i] >= values[i - 1]) {
        isConsistentlyDeclining = false;
        break;
      }
    }

    if (isConsistentlyDeclining) {
      const totalDecline = values[0] - values[values.length - 1];
      return {
        triggered: true,
        details: {
          current: values[0],
          previous: values[values.length - 1],
          decline: totalDecline,
          periods: periods,
          message: `${rule.metric_name} declining for ${periods} consecutive periods (${values[values.length - 1].toFixed(1)} → ${values[0].toFixed(1)})`
        }
      };
    }

    return { triggered: false };
  }

  evaluateThreshold(rule, current) {
    const value = current[rule.metric_name];

    if (value === null || value === undefined) {
      return { triggered: false };
    }

    let triggered = false;

    switch (rule.condition_operator) {
      case '<':
        triggered = value < rule.threshold_value;
        break;
      case '>':
        triggered = value > rule.threshold_value;
        break;
      case '=':
        triggered = value === rule.threshold_value;
        break;
      case '<=':
        triggered = value <= rule.threshold_value;
        break;
      case '>=':
        triggered = value >= rule.threshold_value;
        break;
    }

    if (triggered) {
      return {
        triggered: true,
        details: {
          current: value,
          threshold: rule.threshold_value,
          message: `${rule.metric_name} is ${value.toFixed(1)} (threshold: ${rule.condition_operator} ${rule.threshold_value})`
        }
      };
    }

    return { triggered: false };
  }

  evaluatePatternBreak(rule, current, historical) {
    // Simplified pattern break detection
    // Real implementation would use statistical methods

    if (rule.metric_name === 'tier_change' && rule.condition_operator === 'demoted') {
      // Check if tier changed to lower
      if (current.tier_change_direction === 'demotion') {
        return {
          triggered: true,
          details: {
            current: current.current_tier,
            previous: current.previous_tier,
            message: `Demoted from ${current.previous_tier} to ${current.current_tier}`
          }
        };
      }
    }

    return { triggered: false };
  }

  evaluatePredictive(rule, current, historical) {
    // Simplified predictive logic
    // Real implementation would use time series forecasting

    if (historical.length < 3) {
      return { triggered: false };
    }

    // Calculate trend
    const values = [current[rule.metric_name], ...historical.slice(0, 2).map(h => h[rule.metric_name])];
    const avgChange = (values[0] - values[2]) / 2; // Average change per period

    // Project forward
    const projectedValue = values[0] + (avgChange * 2); // 2 periods ahead

    // Check if projected value will cross threshold
    let willCrossThreshold = false;

    if (rule.condition_operator === '<') {
      willCrossThreshold = projectedValue < rule.threshold_value && values[0] >= rule.threshold_value;
    } else if (rule.condition_operator === '>') {
      willCrossThreshold = projectedValue > rule.threshold_value && values[0] <= rule.threshold_value;
    }

    if (willCrossThreshold) {
      return {
        triggered: true,
        details: {
          current: values[0],
          projected: projectedValue,
          threshold: rule.threshold_value,
          periods_until: 2,
          message: `${rule.metric_name} projected to cross threshold in ~2 periods (current: ${values[0].toFixed(1)}, projected: ${projectedValue.toFixed(1)})`
        }
      };
    }

    return { triggered: false };
  }

  async getCurrentMetrics(familyId) {
    const metrics = await this.db.query(`
      SELECT 
        fem.engagement_score,
        ra.withdrawal_risk,
        ra.payment_risk,
        ra.academic_risk,
        ra.disengagement_risk,
        ft.tier_id as current_tier,
        DATEDIFF(day, MAX(il.interaction_timestamp), GETDATE()) as days_since_portal_login
      FROM families f
      LEFT JOIN family_engagement_metrics fem ON f.family_id = fem.family_id
      LEFT JOIN risk_assessments ra ON f.family_id = ra.family_id
      LEFT JOIN family_tiers ft ON f.family_id = ft.family_id
      LEFT JOIN interaction_log il ON f.family_id = il.family_id 
        AND il.interaction_type = 'portal_login'
      WHERE f.family_id = ?
      GROUP BY f.family_id, fem.engagement_score, ra.withdrawal_risk, 
               ra.payment_risk, ra.academic_risk, ra.disengagement_risk, ft.tier_id
    `, [familyId]);

    return metrics[0] || {};
  }

  async getHistoricalMetrics(familyId, periods = 5) {
    // Get historical snapshots (would need a history table in real implementation)
    // For now, return empty array
    return [];
  }

  async getRecentAlert(familyId, ruleId, days) {
    const result = await this.db.query(`
      SELECT * FROM alerts
      WHERE family_id = ?
        AND rule_id = ?
        AND alert_date >= DATE_SUB(NOW(), INTERVAL ? DAY)
        AND status != 'false_positive'
      ORDER BY alert_date DESC
      LIMIT 1
    `, [familyId, ruleId, days]);

    return result[0];
  }

  async generateAlert(familyId, rule, details) {
    const alertMessage = details.message || `${rule.rule_name} triggered`;

    const result = await this.db.query(`
      INSERT INTO alerts (
        family_id, rule_id, severity, alert_message,
        metric_value, previous_value, threshold_crossed
      ) VALUES (?, ?, ?, ?, ?, ?, ?)
      RETURNING alert_id
    `, [
      familyId,
      rule.rule_id,
      rule.severity,
      alertMessage,
      details.current,
      details.previous,
      details.threshold || rule.threshold_value
    ]);

    const alert = {
      alert_id: result[0].alert_id,
      family_id: familyId,
      rule_name: rule.rule_name,
      severity: rule.severity,
      message: alertMessage,
      details: details
    };

    // Trigger notification if configured
    if (rule.notification_channels) {
      await this.sendNotifications(alert, JSON.parse(rule.notification_channels));
    }

    // Trigger auto-intervention if enabled
    if (rule.auto_intervention_enabled && rule.intervention_pattern_id) {
      await this.triggerIntervention(familyId, rule.intervention_pattern_id, alert.alert_id);
    }

    return alert;
  }

  async sendNotifications(alert, channels) {
    // Implementation depends on notification infrastructure
    console.log(`[ALERT] ${alert.severity.toUpperCase()}: ${alert.message}`);
    console.log(`  Channels: ${channels.join(', ')}`);
  }

  async triggerIntervention(familyId, interventionPatternId, alertId) {
    console.log(`Triggering intervention ${interventionPatternId} for family ${familyId}`);
    // Implementation in Pattern 15
  }

  async getDashboardAlerts(status = 'new', limit = 20) {
    return await this.db.query(`
      SELECT 
        a.alert_id,
        a.alert_date,
        a.severity,
        a.alert_message,
        f.family_name,
        ar.rule_name,
        a.metric_value,
        a.previous_value
      FROM alerts a
      JOIN families f ON a.family_id = f.family_id
      JOIN alert_rules ar ON a.rule_id = ar.rule_id
      WHERE a.status = ?
      ORDER BY 
        CASE a.severity
          WHEN 'critical' THEN 1
          WHEN 'high' THEN 2
          WHEN 'medium' THEN 3
          ELSE 4
        END,
        a.alert_date DESC
      LIMIT ?
    `, [status, limit]);
  }

  async acknowledgeAlert(alertId, acknowledgedBy, notes) {
    await this.db.query(`
      UPDATE alerts
      SET 
        status = 'acknowledged',
        acknowledged_by = ?,
        acknowledged_date = GETDATE(),
        resolution_notes = ?
      WHERE alert_id = ?
    `, [acknowledgedBy, notes, alertId]);
  }

  async resolveAlert(alertId, resolvedBy, notes, wasFalsePositive = false) {
    await this.db.query(`
      UPDATE alerts
      SET 
        status = ?,
        resolution_notes = ?,
        resolved_date = GETDATE()
      WHERE alert_id = ?
    `, [wasFalsePositive ? 'false_positive' : 'resolved', notes, alertId]);
  }
}

module.exports = EarlyWarningEngine;

Usage Example

const earlyWarning = new EarlyWarningEngine(db);

// Run nightly check
const results = await earlyWarning.checkAllFamilies();
console.log(`
Early Warning Check Complete:
  Families Checked: ${results.total_checked}
  Alerts Generated: ${results.alerts_generated}

  By Severity:
    Critical: ${results.by_severity.critical}
    High: ${results.by_severity.high}
    Medium: ${results.by_severity.medium}
    Low: ${results.by_severity.low}
`);

// Get dashboard alerts
const alerts = await earlyWarning.getDashboardAlerts('new', 10);
alerts.forEach(alert => {
  console.log(`
[${alert.severity.toUpperCase()}] ${alert.family_name}
  ${alert.alert_message}
  Rule: ${alert.rule_name}
  Date: ${alert.alert_date}
  `);
});

// Acknowledge alert
await earlyWarning.acknowledgeAlert(123, 'sarah@coop.org', 'Following up with personal call');

// Resolve alert
await earlyWarning.resolveAlert(123, 'sarah@coop.org', 'Spoke with family, issue resolved', false);

Variations

By Alert Timing

Real-time (immediate): - Critical thresholds - Sudden spikes - Use: Payment failures, emergency situations

Batched Daily: - Gradual declines - Threshold violations - Use: Most organizational intelligence

Weekly Summary: - Low-priority trends - Informational signals - Use: Management reports

By Action

Informational: - Alert only, no action

Recommended: - Alert + suggested action

Automated: - Alert + trigger intervention automatically

By Severity Handling

Critical: - Immediate notification - Phone/SMS - Requires acknowledgment within 24 hours

High: - Email notification - Dashboard prominent display - Requires acknowledgment within 3 days

Medium: - Dashboard only - Weekly summary email

Low: - Dashboard only - Monthly report

Consequences

Benefits

1. Problems caught early Martinez family intervention at score 60 vs 28 - 3x more likely to succeed

2. Proactive vs reactive "Predicted this 2 weeks ago" vs "didn't see it coming"

3. Resource optimization Focus attention where problems emerging, not randomly

4. Reduced crisis firefighting Fewer emergencies because problems addressed early

5. Data-driven prioritization Alerts show exactly who needs attention

6. Improved outcomes Early intervention = better success rates

Costs

1. Alert fatigue Too many alerts = all ignored. Must tune carefully.

2. False positives Alert fires, but no real problem. Damages credibility.

3. Configuration complexity Many rules, thresholds, conditions to manage

4. Computational overhead Checking all families against all rules is expensive

5. Response burden Alerts require follow-up, or they're wasted effort

Sample Code

Alert dashboard component:

async function getAlertDashboard() {
  const newAlerts = await earlyWarning.getDashboardAlerts('new');
  const investigating = await earlyWarning.getDashboardAlerts('investigating');

  // Group by severity
  const bySeverity = {
    critical: newAlerts.filter(a => a.severity === 'critical'),
    high: newAlerts.filter(a => a.severity === 'high'),
    medium: newAlerts.filter(a => a.severity === 'medium'),
    low: newAlerts.filter(a => a.severity === 'low')
  };

  return {
    summary: {
      total_new: newAlerts.length,
      critical: bySeverity.critical.length,
      high: bySeverity.high.length,
      medium: bySeverity.medium.length,
      low: bySeverity.low.length,
      investigating: investigating.length
    },
    alerts: {
      critical: bySeverity.critical,
      high: bySeverity.high,
      medium: bySeverity.medium
    }
  };
}

Known Uses

Homeschool Co-op Intelligence Platform - 8 alert rules configured - Average 3.2 alerts per day - 87% of critical alerts led to successful intervention - False positive rate: 12% (acceptable)

SaaS Monitoring - Datadog, PagerDuty: Infrastructure alerts - ChurnZero, Gainsight: Customer health alerts - Principle: Early warning = preventable problems

Healthcare - EHR systems: Patient deterioration alerts - Sepsis early warning scores - Readmission risk alerts

Requires: - Pattern 6: Composite Health Scoring - scores trigger alerts - Pattern 7: Multi-Dimensional Risk Assessment - risks trigger alerts

Enables: - Pattern 15: Intervention Recommendation Engine - alerts trigger recommendations - Pattern 22: Progressive Escalation Sequences - alerts start escalation - Pattern 23: Triggered Interventions - alerts trigger actions

Enhanced by: - Pattern 10: Engagement Velocity Tracking - velocity changes trigger alerts - Pattern 11: Historical Pattern Matching - pattern breaks trigger alerts

References

Academic Foundations

Provost, Foster, and Tom Fawcett (2013). Data Science for Business. O'Reilly. ISBN: 978-1449361327 - Chapter 7 on anomaly detection
Chandola, Varun, Arindam Banerjee, and Vipin Kumar (2009). "Anomaly Detection: A Survey." ACM Computing Surveys 41(3). https://dl.acm.org/doi/10.1145/1541880.1541882
Aggarwal, Charu C. (2016). Outlier Analysis (2nd ed.). Springer. ISBN: 978-3319475776
Signal Detection Theory: Green, D.M., & Swets, J.A. (1966). Signal Detection Theory and Psychophysics. Wiley.

Healthcare Early Warning

MEWS (Modified Early Warning Score): Subbe, C.P., et al. (2001). "Validation of a modified Early Warning Score." QJM 94(10): 521-526. https://academic.oup.com/qjmed/article/94/10/521/1603297
NEWS (National Early Warning Score): https://www.rcplondon.ac.uk/projects/outputs/national-early-warning-score-news-2 - UK standard
Sepsis Alert Systems: Shimabukuro, D.W., et al. (2017). "Effect of a machine learning-based severe sepsis prediction algorithm." JAMA.

DevOps & Monitoring

Site Reliability Engineering: Beyer, B., et al. (2016). Site Reliability Engineering. O'Reilly. https://sre.google/books/ - Free online, Chapter on monitoring
Alert Fatigue: Ancker, J.S., et al. (2017). "Effects of workload, work complexity, and repeated alerts on alert fatigue." BMJ Quality & Safety.
Prometheus Alerting: https://prometheus.io/docs/alerting/latest/overview/ - Modern alert management

Practical Implementation

Anomaly Detection Libraries:
PyOD: https://github.com/yzhao062/pyod - Python Outlier Detection
Isolation Forest: Liu, F.T., et al. (2008). "Isolation Forest." https://ieeexplore.ieee.org/document/4781136
LSTM Autoencoders: https://keras.io/examples/timeseries/timeseries_anomaly_detection/ - Time series anomalies
Statistical Process Control: Montgomery, D.C. (2012). Introduction to Statistical Quality Control (7th ed.). Wiley.

Pattern 6: Composite Health Scoring - Health score drops trigger warnings
Pattern 7: Multi-Dimensional Risk - Multiple risk factors compound
Pattern 10: Engagement Velocity Tracking - Velocity changes signal risk
Pattern 23: Triggered Interventions - Warnings trigger interventions
Volume 3, Pattern 5: Error as Collaboration - Immediate warning delivery

Tools & Services

Datadog Anomaly Detection: https://www.datadoghq.com/blog/anomaly-detection/ - ML-based monitoring
PagerDuty: https://www.pagerduty.com/ - Incident response and alerting
Splunk IT Service Intelligence: https://www.splunk.com/en_us/software/itsi.html - Predictive analytics
New Relic Applied Intelligence: https://newrelic.com/platform/applied-intelligence - AIOps alerting