Pattern 12: Risk Stratification Models
Intent
Build machine learning models that learn complex, non-linear relationships between behavioral features and outcomes, providing more accurate risk predictions than simple pattern matching alone, with continuous model improvement as new data arrives.
Also Known As
- Predictive Risk Models
- ML-Based Risk Scoring
- Supervised Learning for Outcomes
- Churn Prediction Models
- Propensity Models
Problem
Pattern 11 (Historical Pattern Matching) has limitations:
Simple similarity matching works, but it:
- Assumes linear relationships (engagement score ↓ = withdrawal risk ↑)
- Can't capture interactions (low engagement + new tenure = very high risk, but low engagement + long tenure = moderate risk)
- Is sensitive to feature weights (manually tuned, possibly suboptimal)
- Suffers from the curse of dimensionality (exact matches are hard to find with many features)
- Is limited to historical patterns (can't generalize beyond observed cases)
Real relationships are complex:
Withdrawal risk isn't just the sum of its factors. It's:
Risk = f(
engagement_score,
velocity,
tenure,
engagement_score × velocity, // Interaction
velocity × tenure, // Interaction
payment_issues AND low_engagement, // Combined effect
...and 50+ other complex relationships
)
Machine learning can learn these patterns automatically.
Martinez family:
- Pattern matching: 87% withdrawal risk (based on 8 similar cases)
- ML model: 94% withdrawal risk (learned from 200+ cases; detected interaction: declining engagement + new enrollment + payment stress = very high risk)
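The interaction effect described above can be made concrete with a toy logistic model. All coefficients and cutoffs here are invented for illustration; the point is that the interaction term produces a jump in risk that no "sum of factors" score can reproduce.

```python
import math

def logistic(z):
    """Standard logistic function, mapping any real number to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def withdrawal_risk(engagement, tenure_years):
    """Toy risk model with an engagement x tenure interaction term.

    Coefficients are invented for illustration: low engagement alone
    raises risk moderately, but low engagement combined with short
    tenure raises it disproportionately.
    """
    low_engagement = 1 if engagement < 40 else 0
    new_family = 1 if tenure_years < 1 else 0
    z = (-2.0
         + 1.0 * low_engagement
         + 0.5 * new_family
         + 1.5 * (low_engagement * new_family))  # the interaction term
    return logistic(z)

# Low engagement + long tenure: moderate risk (about 0.27)
print(withdrawal_risk(engagement=30, tenure_years=5))
# Low engagement + new family: far higher than the two effects added (about 0.73)
print(withdrawal_risk(engagement=30, tenure_years=0.5))
```

A linear score would add at most 1.0 + 0.5 to the baseline; the interaction term contributes another 1.5, which is exactly what tree-based models learn automatically.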
Context
When this pattern applies:
- Have sufficient training data (200+ cases with outcomes)
- Relationships are complex, non-linear
- Want highest possible prediction accuracy
- Can invest in model development and maintenance
- Have or can acquire ML expertise
When this pattern may not be needed:
- Limited training data (<100 cases)
- Simple relationships (pattern matching sufficient)
- Interpretability more important than accuracy
- No ML expertise available
- Rapid deployment needed (pattern matching faster to implement)
Best approach: use BOTH.
- Pattern matching for transparency and cold start
- ML models for accuracy once data accumulates
- Ensemble: combine both for best results
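A minimal sketch of the combined approach, assuming pattern matching and the ML model each produce a probability-like score on a 0-1 scale. The weighting scheme is an invented example that shifts trust toward the ML model as labeled cases accumulate, handling cold start:

```python
def blend_risk(pattern_score, ml_probability, labeled_cases, full_trust_at=500):
    """Blend a pattern-matching score with an ML probability.

    With few labeled cases (cold start) the pattern-matching score
    dominates; as cases accumulate, weight shifts linearly to the ML
    model, reaching full trust at `full_trust_at` cases.
    """
    ml_weight = min(labeled_cases / full_trust_at, 1.0)
    return (1 - ml_weight) * pattern_score + ml_weight * ml_probability

# Cold start: 50 labeled cases, so the blend stays close to pattern matching
print(blend_risk(0.87, 0.94, labeled_cases=50))    # close to 0.87
# Mature deployment: 600 cases, so the ML probability is used outright
print(blend_risk(0.87, 0.94, labeled_cases=600))   # 0.94
```

The linear ramp and the 500-case cutoff are assumptions; any monotone schedule (or a validation-driven weight) works the same way.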
Forces
Competing concerns:
1. Accuracy vs Interpretability
- Complex models (neural nets) are more accurate
- But they are black boxes, hard to explain
- Balance: start with interpretable models (logistic regression, decision trees)
2. Training Data vs Overfitting
- More features capture more patterns
- But they overfit with limited data
- Balance: regularization, cross-validation, feature selection
3. Static vs Dynamic Models
- Static models are easier to deploy
- But performance degrades over time
- Balance: scheduled retraining (monthly/quarterly)
4. Single Model vs Ensemble
- A single model is simpler
- An ensemble is more accurate
- Balance: start single, graduate to ensemble
5. Prediction vs Explanation
- Want accurate predictions
- But also want to know WHY
- Balance: use SHAP/LIME for model explanation
6. Model Accuracy vs Input Data Quality ⚠️
- ML models can only be as good as their training data
- Garbage in, garbage out: incomplete or biased data → unreliable predictions
- Poor form design (Volume 3) directly degrades model accuracy:
  - High abandonment rates → incomplete training data
  - Validation errors → incorrect feature values
  - User confusion → systematic data entry errors
  - Selection bias → models trained on non-representative samples
- Balance: invest in quality data capture (V3 Interaction Patterns) alongside model sophistication
- See Volume 3, Part II (Interaction Patterns) for form patterns that ensure quality training data
Solution
Build ML pipeline with:
- Feature Engineering - Transform raw data into predictive features
- Model Training - Learn from historical outcomes
- Model Validation - Test accuracy on holdout data
- Model Deployment - Serve predictions in production
- Continuous Learning - Retrain as new data arrives
- Model Explanation - Understand why predictions made
Recommended Models (in order of complexity):
Level 1: Logistic Regression
- Linear model, interpretable coefficients
- Good baseline, fast training
- Works well with 200-500 cases
Level 2: Random Forest
- Ensemble of decision trees
- Handles non-linearities and interactions
- Feature importance built in
- Works well with 500-2000 cases
Level 3: Gradient Boosting (XGBoost, LightGBM)
- State of the art for tabular data
- Handles complex patterns
- Requires 1000+ cases
- Most accurate for many problems
Level 4: Neural Networks
- Maximum flexibility
- Requires 5000+ cases
- Computationally expensive
- Overkill for most organizational intelligence
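The case-count guidance above can be captured in a small helper. The cutoffs simply restate the rules of thumb in this section (the ranges overlap, so these exact boundaries are a judgment call, not hard limits):

```python
def recommend_model(case_count, need_interpretability=False):
    """Suggest a model family from the labeled-case count.

    Cutoffs restate the levels above: logistic regression below ~500
    cases (or whenever interpretability is required), random forest
    through ~1000, gradient boosting beyond that, neural networks
    only worth considering past ~5000.
    """
    if need_interpretability or case_count < 500:
        return "logistic_regression"
    if case_count < 1000:
        return "random_forest"
    if case_count < 5000:
        return "gradient_boosting"
    return "gradient_boosting_or_neural_network"

print(recommend_model(247))                                 # logistic_regression
print(recommend_model(1500))                                # gradient_boosting
print(recommend_model(1500, need_interpretability=True))    # logistic_regression
```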
Structure
Model Management Tables
-- Store trained models
-- NOTE: this DDL uses SQL Server syntax, while the application queries later
-- in this pattern use MySQL syntax; standardize on one dialect in practice.
CREATE TABLE ml_models (
model_id INT PRIMARY KEY IDENTITY(1,1),
-- Model metadata
model_name VARCHAR(100) NOT NULL, -- 'withdrawal_risk_v1', 'payment_default_v2'
model_type VARCHAR(50), -- 'logistic_regression', 'random_forest', 'xgboost'
target_outcome VARCHAR(50), -- 'withdrawal', 'payment_default'
-- Training details
trained_date DATETIME2 DEFAULT GETDATE(),
training_cases INT,
training_start_date DATE,
training_end_date DATE,
-- Performance metrics
accuracy DECIMAL(5,2),
precision_score DECIMAL(5,2),
recall DECIMAL(5,2),
f1_score DECIMAL(5,2),
auc_roc DECIMAL(5,2), -- Area Under ROC Curve
-- Model artifact
model_path VARCHAR(500), -- File path to serialized model
feature_list NVARCHAR(MAX), -- JSON array of features used
hyperparameters NVARCHAR(MAX), -- JSON
-- Status
status VARCHAR(50) DEFAULT 'active', -- 'active', 'deprecated', 'testing'
deployed_date DATETIME2,
deprecated_date DATETIME2,
-- Version control
version VARCHAR(20),
previous_model_id INT,
CONSTRAINT FK_previous_model FOREIGN KEY (previous_model_id)
REFERENCES ml_models(model_id)
);
-- Store individual predictions
CREATE TABLE ml_predictions (
prediction_id INT PRIMARY KEY IDENTITY(1,1),
model_id INT NOT NULL,
family_id INT NOT NULL,
-- Prediction
prediction_date DATETIME2 DEFAULT GETDATE(),
predicted_probability DECIMAL(5,4), -- 0-1
predicted_class VARCHAR(50), -- Binary: 'at_risk' or 'safe'
confidence_level VARCHAR(20), -- 'high', 'medium', 'low'
-- Feature values at prediction time
feature_values NVARCHAR(MAX), -- JSON
-- SHAP values for explanation
shap_values NVARCHAR(MAX), -- JSON
top_contributing_features NVARCHAR(MAX), -- JSON
-- Actual outcome (for validation)
actual_outcome VARCHAR(50),
actual_outcome_date DATETIME2,
prediction_correct BIT,
CONSTRAINT FK_mlpred_model FOREIGN KEY (model_id)
REFERENCES ml_models(model_id),
CONSTRAINT FK_mlpred_family FOREIGN KEY (family_id)
REFERENCES families(family_id)
);
-- Store feature importance
CREATE TABLE model_feature_importance (
importance_id INT PRIMARY KEY IDENTITY(1,1),
model_id INT NOT NULL,
feature_name VARCHAR(100),
importance_score DECIMAL(8,6),
[rank] INT, -- bracketed: RANK is a reserved word in T-SQL
CONSTRAINT FK_importance_model FOREIGN KEY (model_id)
REFERENCES ml_models(model_id)
);
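A sketch of the prediction lifecycle against these tables: record a prediction, backfill the actual outcome when it arrives, and mark whether the prediction was correct. SQLite stands in for SQL Server here so the example is self-contained, and the columns are simplified.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ml_models (model_id INTEGER PRIMARY KEY, model_name TEXT);
CREATE TABLE ml_predictions (
    prediction_id INTEGER PRIMARY KEY,
    model_id INTEGER REFERENCES ml_models(model_id),
    family_id INTEGER,
    predicted_probability REAL,
    predicted_class TEXT,
    feature_values TEXT,          -- JSON, as in the full schema
    actual_outcome TEXT,
    prediction_correct INTEGER    -- stands in for BIT
);
""")

conn.execute("INSERT INTO ml_models VALUES (1, 'withdrawal_risk_v1')")
features = {"engagement_score": 32, "score_delta": -8}
conn.execute(
    "INSERT INTO ml_predictions (model_id, family_id, predicted_probability,"
    " predicted_class, feature_values) VALUES (?, ?, ?, ?, ?)",
    (1, 42, 0.94, "at_risk", json.dumps(features)),
)

# Months later the outcome arrives: backfill it and score the prediction
conn.execute("""
UPDATE ml_predictions
SET actual_outcome = ?,
    prediction_correct = CASE
        WHEN (? = 'withdrew') = (predicted_class = 'at_risk') THEN 1 ELSE 0 END
WHERE family_id = ? AND actual_outcome IS NULL
""", ("withdrew", "withdrew", 42))

row = conn.execute(
    "SELECT predicted_class, actual_outcome, prediction_correct"
    " FROM ml_predictions WHERE family_id = 42").fetchone()
print(row)  # ('at_risk', 'withdrew', 1)
```

The backfilled `prediction_correct` column is what later feeds retraining and the accuracy metrics stored in ml_models.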
Implementation
Feature Engineering
class FeatureEngineer {
constructor(db) {
this.db = db;
}
async extractFeatures(familyId, asOfDate = null) {
const cutoffDate = asOfDate || new Date();
// Get base features (from Pattern 11)
const baseFeatures = await this.extractBaseFeatures(familyId, cutoffDate);
// Engineer additional features
const engineeredFeatures = await this.engineerFeatures(familyId, cutoffDate, baseFeatures);
return {
...baseFeatures,
...engineeredFeatures
};
}
async extractBaseFeatures(familyId, cutoffDate) {
// Similar to Pattern 11, but more comprehensive
const metrics = await this.db.query(`
SELECT
fem.engagement_score,
fem.communication_score,
fem.platform_engagement_score,
fem.financial_health_score,
fem.participation_score,
fem.tenure_score,
fem.score_delta,
ra.withdrawal_risk,
ra.payment_risk,
ra.academic_risk,
ra.disengagement_risk
FROM family_engagement_metrics fem
LEFT JOIN risk_assessments ra ON fem.family_id = ra.family_id
WHERE fem.family_id = ?
AND fem.calculation_date <= ?
`, [familyId, cutoffDate]);
// Interaction patterns
const interactions = await this.db.query(`
SELECT
COUNT(*) as total_interactions,
COUNT(DISTINCT DATE(interaction_timestamp)) as active_days,
COUNT(CASE WHEN channel = 'email' THEN 1 END) as email_count,
COUNT(CASE WHEN channel = 'sms' THEN 1 END) as sms_count,
COUNT(CASE WHEN channel = 'phone' THEN 1 END) as phone_count,
COUNT(CASE WHEN outcome_category = 'success' THEN 1 END) as successful_interactions,
AVG(CASE WHEN interaction_type = 'email_sent'
THEN CAST(JSON_VALUE(metadata, '$.time_to_open_hours') AS FLOAT) END) as avg_email_response_time
FROM interaction_log
WHERE family_id = ?
AND interaction_timestamp <= ?
AND interaction_timestamp >= DATE_SUB(?, INTERVAL 90 DAY)
`, [familyId, cutoffDate, cutoffDate]);
const m = metrics[0] || {};
const i = interactions[0] || {};
return {
// Health scores
engagement_score: m.engagement_score || 50,
communication_score: m.communication_score || 50,
platform_engagement_score: m.platform_engagement_score || 50,
financial_health_score: m.financial_health_score || 50,
participation_score: m.participation_score || 50,
tenure_score: m.tenure_score || 50,
// Risk scores
withdrawal_risk: m.withdrawal_risk || 50,
payment_risk: m.payment_risk || 50,
academic_risk: m.academic_risk || 50,
disengagement_risk: m.disengagement_risk || 50,
// Velocity
score_delta: m.score_delta || 0,
// Interaction patterns
total_interactions: i.total_interactions || 0,
active_days: i.active_days || 0,
email_count: i.email_count || 0,
sms_count: i.sms_count || 0,
phone_count: i.phone_count || 0,
successful_interactions: i.successful_interactions || 0,
avg_email_response_time: i.avg_email_response_time || 0
};
}
async engineerFeatures(familyId, cutoffDate, baseFeatures) {
// Interaction features
const engagement_rate = baseFeatures.total_interactions > 0
? baseFeatures.successful_interactions / baseFeatures.total_interactions
: 0;
const days_per_interaction = baseFeatures.active_days > 0
? 90 / baseFeatures.active_days
: 999;
// Communication diversity (using multiple channels is good)
const channels_used = [
baseFeatures.email_count > 0 ? 1 : 0,
baseFeatures.sms_count > 0 ? 1 : 0,
baseFeatures.phone_count > 0 ? 1 : 0
].reduce((a, b) => a + b, 0);
// Feature interactions (THIS IS WHERE ML SHINES)
const engagement_velocity_interaction = baseFeatures.engagement_score * baseFeatures.score_delta;
const risk_score_product = baseFeatures.withdrawal_risk * baseFeatures.payment_risk;
const health_risk_gap = baseFeatures.engagement_score - baseFeatures.withdrawal_risk;
// Tenure-based features
const is_new_family = baseFeatures.tenure_score < 40 ? 1 : 0;
const is_established = baseFeatures.tenure_score > 80 ? 1 : 0;
// Declining engagement signal
const is_declining_rapidly = (baseFeatures.score_delta < -5) ? 1 : 0;
// Multi-dimensional risk
const high_risk_dimensions = [
baseFeatures.withdrawal_risk > 70 ? 1 : 0,
baseFeatures.payment_risk > 70 ? 1 : 0,
baseFeatures.academic_risk > 70 ? 1 : 0,
baseFeatures.disengagement_risk > 70 ? 1 : 0
].reduce((a, b) => a + b, 0);
// Payment patterns
const paymentHistory = await this.db.query(`
SELECT
COUNT(*) as recent_payments,
SUM(CASE WHEN outcome = 'paid_late' THEN 1 ELSE 0 END) as recent_late,
MAX(CASE WHEN outcome = 'paid_late'
THEN CAST(JSON_VALUE(metadata, '$.days_late') AS INT) ELSE 0 END) as max_days_late
FROM interaction_log
WHERE family_id = ?
AND interaction_type = 'payment_received'
AND interaction_timestamp <= ?
AND interaction_timestamp >= DATE_SUB(?, INTERVAL 6 MONTH)
`, [familyId, cutoffDate, cutoffDate]);
const ph = paymentHistory[0] || {};
const payment_deterioration = ph.recent_payments > 0 && ph.recent_late > 0
? ph.recent_late / ph.recent_payments
: 0;
return {
// Engineered interaction features
engagement_rate: engagement_rate,
days_per_interaction: Math.min(days_per_interaction, 90),
channels_used: channels_used,
// Feature interactions
engagement_velocity_interaction: engagement_velocity_interaction,
risk_score_product: risk_score_product,
health_risk_gap: health_risk_gap,
// Categorical flags
is_new_family: is_new_family,
is_established: is_established,
is_declining_rapidly: is_declining_rapidly,
// Multi-dimensional risk
high_risk_dimensions: high_risk_dimensions,
// Payment patterns
payment_deterioration: payment_deterioration,
max_days_late: ph.max_days_late || 0,
// Polynomial features (for logistic regression)
engagement_score_squared: Math.pow(baseFeatures.engagement_score, 2),
score_delta_squared: Math.pow(baseFeatures.score_delta, 2)
};
}
}
module.exports = FeatureEngineer;
Model Training (Python - using scikit-learn)
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score, roc_auc_score
import joblib
import json
class WithdrawalRiskModel:
def __init__(self, model_type='random_forest'):
self.model_type = model_type
self.model = None
self.feature_names = None
self.feature_importance = None
def prepare_training_data(self, db_connection):
"""Extract training data from database"""
query = """
SELECT
hp.pattern_id,
hp.family_id,
hp.engagement_score,
hp.engagement_velocity,
hp.communication_score,
hp.platform_engagement_score,
hp.financial_health_score,
hp.participation_score,
hp.tenure_score,
hp.withdrawal_risk,
hp.payment_risk,
hp.total_interactions,
hp.engagement_rate,
hp.channels_used,
hp.is_new_family,
hp.is_declining_rapidly,
hp.high_risk_dimensions,
hp.payment_deterioration,
hp.engagement_velocity_interaction,
hp.risk_score_product,
hp.health_risk_gap,
-- Target variable
CASE WHEN hp.outcome = 'withdrew' THEN 1 ELSE 0 END as withdrew
FROM historical_patterns hp
WHERE hp.outcome IN ('withdrew', 'remained')
AND hp.snapshot_date >= DATE_SUB(NOW(), INTERVAL 3 YEAR)
"""
df = pd.read_sql(query, db_connection)
# Separate features and target
feature_cols = [col for col in df.columns if col not in
['pattern_id', 'family_id', 'withdrew', 'outcome']]
X = df[feature_cols]
y = df['withdrew']
self.feature_names = feature_cols
return X, y, df
def train(self, X, y):
"""Train the model"""
# Split data
X_train, X_test, y_train, y_test = train_test_split(
X, y, test_size=0.2, random_state=42, stratify=y
)
print(f"Training set: {len(X_train)} samples")
print(f"Test set: {len(X_test)} samples")
print(f"Withdrawal rate: {y_train.mean():.1%}")
# Initialize model
if self.model_type == 'logistic_regression':
self.model = LogisticRegression(
max_iter=1000,
C=1.0,
random_state=42
)
elif self.model_type == 'random_forest':
self.model = RandomForestClassifier(
n_estimators=100,
max_depth=10,
min_samples_split=20,
min_samples_leaf=10,
random_state=42,
n_jobs=-1
)
elif self.model_type == 'gradient_boosting':
self.model = GradientBoostingClassifier(
n_estimators=100,
max_depth=5,
learning_rate=0.1,
random_state=42
)
# Train
print(f"\nTraining {self.model_type}...")
self.model.fit(X_train, y_train)
# Evaluate
y_pred = self.model.predict(X_test)
y_pred_proba = self.model.predict_proba(X_test)[:, 1]
metrics = {
'accuracy': accuracy_score(y_test, y_pred),
'precision': precision_score(y_test, y_pred),
'recall': recall_score(y_test, y_pred),
'f1': f1_score(y_test, y_pred),
'auc_roc': roc_auc_score(y_test, y_pred_proba)
}
print("\nTest Set Performance:")
for metric, value in metrics.items():
print(f" {metric}: {value:.3f}")
# Feature importance
if hasattr(self.model, 'feature_importances_'):
self.feature_importance = pd.DataFrame({
'feature': self.feature_names,
'importance': self.model.feature_importances_
}).sort_values('importance', ascending=False)
print("\nTop 10 Most Important Features:")
print(self.feature_importance.head(10))
# Cross-validation
cv_scores = cross_val_score(
self.model, X_train, y_train, cv=5, scoring='roc_auc'
)
print(f"\nCross-validation AUC: {cv_scores.mean():.3f} (+/- {cv_scores.std():.3f})")
return metrics
def predict(self, X):
"""Make predictions"""
probabilities = self.model.predict_proba(X)[:, 1]
predictions = self.model.predict(X)
return probabilities, predictions
def save_model(self, filepath):
"""Save trained model"""
model_data = {
'model': self.model,
'model_type': self.model_type,
'feature_names': self.feature_names,
'feature_importance': self.feature_importance.to_dict() if self.feature_importance is not None else None
}
joblib.dump(model_data, filepath)
print(f"Model saved to {filepath}")
def load_model(self, filepath):
"""Load trained model"""
model_data = joblib.load(filepath)
self.model = model_data['model']
self.model_type = model_data['model_type']
self.feature_names = model_data['feature_names']
self.feature_importance = pd.DataFrame(model_data['feature_importance']) if model_data['feature_importance'] else None
print(f"Model loaded from {filepath}")
# Training script
if __name__ == "__main__":
import mysql.connector
# Connect to database
conn = mysql.connector.connect(
host='localhost',
user='username',
password='password',
database='coop_intelligence'
)
# Train Random Forest model
model = WithdrawalRiskModel(model_type='random_forest')
X, y, df = model.prepare_training_data(conn)
metrics = model.train(X, y)
# Save model
model.save_model('models/withdrawal_risk_rf_v1.pkl')
conn.close()
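Withdrawal is usually a minority class, so the implicit 0.5 cutoff behind predict() is rarely the best operating point. One remedy is to pick the threshold on held-out data; the sketch below (stdlib only, with synthetic scores standing in for y_pred_proba and y_test from train()) scans thresholds and maximizes F1:

```python
def best_f1_threshold(probabilities, labels):
    """Scan candidate thresholds and return (threshold, f1) maximizing F1.

    probabilities: predicted P(withdraw) for held-out cases
    labels: 1 if the family actually withdrew, else 0
    """
    best = (0.5, 0.0)
    for t in [i / 100 for i in range(1, 100)]:
        tp = sum(1 for p, y in zip(probabilities, labels) if p >= t and y == 1)
        fp = sum(1 for p, y in zip(probabilities, labels) if p >= t and y == 0)
        fn = sum(1 for p, y in zip(probabilities, labels) if p < t and y == 1)
        if tp == 0:
            continue
        precision = tp / (tp + fp)
        recall = tp / (tp + fn)
        f1 = 2 * precision * recall / (precision + recall)
        if f1 > best[1]:
            best = (t, f1)
    return best

# Synthetic held-out scores: withdrawers cluster high, stayers cluster low
probs  = [0.62, 0.55, 0.71, 0.58, 0.15, 0.22, 0.31, 0.12, 0.45, 0.18]
labels = [1,    1,    1,    1,    0,    0,    0,    0,    0,    0]
threshold, f1 = best_f1_threshold(probs, labels)
print(f"threshold={threshold:.2f}, f1={f1:.2f}")  # threshold=0.46, f1=1.00
```

When false negatives (missed withdrawals) cost more than false alarms, optimize recall at a precision floor instead of F1; the scan is the same.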
Model Serving (Node.js)
const { spawn } = require('child_process');
const path = require('path');
class MLModelServer {
constructor(modelPath) {
this.modelPath = modelPath;
}
async predict(features) {
return new Promise((resolve, reject) => {
// Call Python script to make prediction
const python = spawn('python3', [
path.join(__dirname, 'predict.py'),
this.modelPath,
JSON.stringify(features)
]);
let result = '';
let error = '';
python.stdout.on('data', (data) => {
result += data.toString();
});
python.stderr.on('data', (data) => {
error += data.toString();
});
python.on('close', (code) => {
if (code !== 0) {
reject(new Error(`Python process exited with code ${code}: ${error}`));
} else {
try {
const prediction = JSON.parse(result);
resolve(prediction);
} catch (e) {
reject(new Error(`Failed to parse prediction: ${e.message}`));
}
}
});
});
}
async predictForFamily(familyId) {
// Assumes a shared `db` connection and the FeatureEngineer class above are in scope
const engineer = new FeatureEngineer(db);
const features = await engineer.extractFeatures(familyId);
const prediction = await this.predict(features);
// predict.py does not return a model_id, so look up the active withdrawal model
const [activeModel] = await db.query(`
SELECT model_id FROM ml_models
WHERE status = 'active' AND target_outcome = 'withdrawal'
`);
// Save to database
await db.query(`
INSERT INTO ml_predictions (
model_id,
family_id,
predicted_probability,
predicted_class,
confidence_level,
feature_values
) VALUES (?, ?, ?, ?, ?, ?)
`, [
activeModel.model_id,
familyId,
prediction.probability,
prediction.probability > 0.5 ? 'at_risk' : 'safe',
prediction.probability > 0.8 ? 'high' : (prediction.probability > 0.5 ? 'medium' : 'low'),
JSON.stringify(features)
]);
return prediction;
}
}
Prediction Script (Python)
# predict.py
import sys
import json
import joblib
import pandas as pd
def predict(model_path, features):
# Load model
model_data = joblib.load(model_path)
model = model_data['model']
feature_names = model_data['feature_names']
# Prepare features
X = pd.DataFrame([features])[feature_names]
# Predict
probability = model.predict_proba(X)[0, 1]
prediction = model.predict(X)[0]
return {
'probability': float(probability),
'prediction': int(prediction),
'confidence': 'high' if abs(probability - 0.5) > 0.3 else 'medium'
}
if __name__ == "__main__":
model_path = sys.argv[1]
features = json.loads(sys.argv[2])
result = predict(model_path, features)
print(json.dumps(result))
Variations
By Model Complexity
Simple: Logistic Regression
- Linear decision boundary
- Interpretable coefficients
- Fast training and prediction
- Good with 200-500 cases
Medium: Random Forest
- Non-linear, handles interactions
- Feature importance built in
- Robust to overfitting
- Good with 500-2000 cases
Advanced: Gradient Boosting (XGBoost)
- State-of-the-art accuracy
- Handles missing values
- Learns feature interactions
- Needs 1000+ cases
Expert: Neural Networks
- Maximum flexibility
- Deep learning possible
- Needs 5000+ cases
- Overkill for most scenarios
By Training Strategy
Batch Training:
- Retrain monthly/quarterly on all data
- Simple, predictable
- Model may drift between retrainings
Online Learning:
- Update the model continuously as new data arrives
- Always current
- More complex to implement
Incremental Training:
- Periodically add new data to the existing model
- Balance between batch and online
- A good compromise
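Whichever strategy is chosen, drift should be monitored so retraining is triggered by evidence rather than the calendar. A stdlib sketch: compute AUC over recently logged (probability, outcome) pairs via the rank-sum formulation and compare against the AUC recorded at training time (the 0.05 alert margin is an invented example):

```python
def auc(probabilities, labels):
    """AUC via the Mann-Whitney rank-sum formulation.

    Equals the probability that a randomly chosen positive case
    scores higher than a randomly chosen negative case (ties count half).
    """
    pos = [p for p, y in zip(probabilities, labels) if y == 1]
    neg = [p for p, y in zip(probabilities, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need both positive and negative outcomes")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def drift_alert(recent_auc, training_auc, margin=0.05):
    """Flag the model for retraining if recent AUC falls too far below training AUC."""
    return recent_auc < training_auc - margin

# Recent predictions with their now-known outcomes
probs  = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
labels = [1,   1,   0,   1,   0,   0]
print(auc(probs, labels))                             # about 0.889
print(drift_alert(auc(probs, labels), training_auc=0.91))  # False: within margin
```

In this schema, the inputs come from ml_predictions rows where actual_outcome has been backfilled, compared against ml_models.auc_roc.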
By Ensemble Strategy
Single Best Model:
- Use random forest OR gradient boosting
- Simpler deployment
Voting Ensemble:
- Train 3-5 different models
- Take a majority vote or average the probabilities
- More accurate, more complex
Stacking:
- Train multiple base models
- Train a meta-model on their predictions
- Maximum accuracy, maximum complexity
Consequences
Benefits
1. Higher accuracy: Random Forest is typically 5-10% more accurate than pattern matching.
2. Learns complex patterns: captures interactions and non-linearities automatically.
3. Feature importance: shows which features matter most.
4. Scales with data: more data = better predictions (unlike rules).
5. Generalizes beyond training: can predict for novel situations.
6. Continuous improvement: retraining improves the model automatically.
Costs
1. Requires ML expertise: need a data scientist or ML engineer.
2. Black box risk: complex models are hard to explain.
3. Computational cost: training can take hours and needs Python/R.
4. Maintenance burden: models need monitoring, retraining, and versioning.
5. Data requirements: needs 200+ labeled cases minimum.
6. Deployment complexity: a Python model in a Node.js app is an integration challenge.
Sample Code
Model comparison:
def compare_models(X, y):
models = {
'Logistic Regression': LogisticRegression(max_iter=1000),
'Random Forest': RandomForestClassifier(n_estimators=100),
'Gradient Boosting': GradientBoostingClassifier(n_estimators=100)
}
results = {}
for name, model in models.items():
# Cross-validation
scores = cross_val_score(model, X, y, cv=5, scoring='roc_auc')
results[name] = {
'mean_auc': scores.mean(),
'std_auc': scores.std()
}
print(f"{name}: AUC = {scores.mean():.3f} (+/- {scores.std():.3f})")
return results
Known Uses
Homeschool Co-op Intelligence Platform
- Random Forest model: 91% AUC
- Trained on 247 historical cases
- 15 features, 100 trees
- Retrains quarterly
SaaS Churn Prediction
- Standard practice in B2B SaaS
- Typically 85-95% AUC
- Gradient boosting common
- Predicts 30-60 days in advance
Credit Scoring
- FICO uses logistic regression + decision trees
- Highly regulated; interpretability critical
- Billions of predictions annually
Healthcare Risk Stratification
- CMS uses regression models for Medicare risk
- Predicts readmission and high-cost patients
- Lives depend on accuracy
Related Patterns
Requires:
- Pattern 1 (Universal Event Log): training data source
- Pattern 11 (Historical Pattern Matching): labeled outcomes needed
Complements:
- Pattern 11 (Historical Pattern Matching): use both in ensemble
- Pattern 13 (Confidence Scoring): ML confidence + uncertainty
Enables:
- Pattern 15 (Intervention Recommendation Engine): accurate predictions drive recommendations
- Pattern 26 (Feedback Loop Implementation): predictions validated, model improved
References
On Machine Learning for Prediction:
- Hastie, Trevor, Robert Tibshirani, and Jerome Friedman. The Elements of Statistical Learning, 2nd Edition. Springer, 2009. https://web.stanford.edu/~hastie/ElemStatLearn/ (free PDF, comprehensive ML theory)
- Géron, Aurélien. Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow, 3rd Edition. O'Reilly, 2022. (practical implementations)
- James, Gareth, et al. An Introduction to Statistical Learning. Springer, 2021. https://www.statlearning.com/ (free PDF, more accessible than Hastie)
On Ensemble Methods:
- Chen, Tianqi, and Carlos Guestrin. "XGBoost: A Scalable Tree Boosting System." KDD 2016. https://arxiv.org/abs/1603.02754 (state-of-the-art gradient boosting)
- Breiman, Leo. "Random Forests." Machine Learning 45(1), 2001: 5-32. (original random forest paper)
- Ke, Guolin, et al. "LightGBM: A Highly Efficient Gradient Boosting Decision Tree." NeurIPS 2017. https://papers.nips.cc/paper/6907-lightgbm-a-highly-efficient-gradient-boosting-decision-tree
On Risk Scoring:
- Thomas, L.C. "A Survey of Credit and Behavioural Scoring: Forecasting Financial Risk of Lending to Consumers." International Journal of Forecasting 16, 2000: 149-172.
- "FICO Score Methodology." https://www.fico.com/en/products/fico-score (industry-standard credit risk model)
On Model Ethics and Fairness:
- Barocas, Solon, and Andrew D. Selbst. "Big Data's Disparate Impact." California Law Review 104, 2016: 671-732.
- "Fairness Indicators." TensorFlow. https://www.tensorflow.org/responsible_ai/fairness_indicators/guide (tools for measuring fairness)
- "AI Fairness 360." IBM. https://aif360.mybluemix.net/ (open-source bias detection toolkit)
On Implementation:
- Scikit-learn: https://scikit-learn.org/stable/supervised_learning.html (ML library for Python)
- XGBoost Documentation: https://xgboost.readthedocs.io/ (gradient boosting library)
- LightGBM Documentation: https://lightgbm.readthedocs.io/ (fast gradient boosting)
- H2O.ai: https://docs.h2o.ai/ (AutoML platform)
Related Patterns in This Trilogy:
- Pattern 1 (Universal Event Log): feature source for models
- Pattern 7 (Multi-Dimensional Risk): composite risk assessment
- Pattern 11 (Historical Pattern Matching): simpler alternative to ML
- Pattern 13 (Confidence Scoring): quantifying prediction uncertainty
- Pattern 15 (Intervention Recommendation): acting on risk predictions
- Pattern 26 (Feedback Loop): improving models from outcomes
- Volume 3, Pattern 6 (Domain-Aware Validation): garbage in, garbage out; quality data is essential