Health Checks

MinimalDoc supports two approaches for monitoring service health.

Approach Comparison

Aspect	Static (CI/CD)	Real-Time (Client)
Updates	On pipeline trigger	Every page load
Server load	Zero	Every visitor polls endpoints
Latency display	No	Yes
External dependency	Prometheus, Datadog, etc.	CORS-enabled endpoints
Cost	Build/deploy only	No build cost; polling adds load to services
Accuracy	Delayed (pipeline latency)	Live

Approach 1: Static (CI/CD Driven)

Status updates only when your CI/CD pipeline runs. External monitoring triggers the update.

How It Works

Prometheus detects issue → Webhook triggers pipeline →
Pipeline updates component status and incident file → Site rebuilds → Static files uploaded

Configuration

No `health_endpoint` is needed. Set `status` manually or via automation:

# components.yaml
- id: api
  name: API
  status: operational  # Updated by CI/CD

Automation Example

GitHub Actions workflow triggered by external monitoring:

name: Update Status
on:
  repository_dispatch:
    types: [status-change]

jobs:
  update:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
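      - name: Update component status
        env:
          COMPONENT: ${{ github.event.client_payload.component }}
          STATUS: ${{ github.event.client_payload.status }}
        run: |
          # Pass payload values through env vars instead of interpolating them into the script
          yq -i '(.[] | select(.id == strenv(COMPONENT)) | .status) = strenv(STATUS)' \
            docs/__status__/components.yaml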
      - name: Create incident if needed
        if: github.event.client_payload.status != 'operational'
        env:
          COMPONENT: ${{ github.event.client_payload.component }}
          TITLE: ${{ github.event.client_payload.title }}
          SEVERITY: ${{ github.event.client_payload.severity }}
        run: |
          # Payload values arrive as env vars so they are never parsed as shell code
          cat > "docs/__status__/incidents/$(date +%Y-%m-%d)-${COMPONENT}.md" << EOF
          ---
          title: ${TITLE}
          status: investigating
          severity: ${SEVERITY}
          affected_components:
            - ${COMPONENT}
          created_at: $(date -u +%Y-%m-%dT%H:%M:%SZ)
          ---
          Investigating issue detected by monitoring.
          EOF
      - name: Build and deploy
        run: |
          minimaldoc build
          # deploy to hosting...
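
The workflow above waits for a `repository_dispatch` event of type `status-change`. As a minimal sketch of the sending side, a monitoring webhook handler could build that event for GitHub's repository dispatch REST endpoint as shown here. The function name, repository name, token, and payload values are hypothetical placeholders, and the request is only built, not sent.

```python
import json
import urllib.request


def build_dispatch_request(repo, token, component, status, title, severity):
    """Build (but do not send) a repository_dispatch request.

    `repo` is "owner/name". The client_payload keys mirror the fields
    the workflow reads from github.event.client_payload.
    """
    body = {
        "event_type": "status-change",
        "client_payload": {
            "component": component,
            "status": status,
            "title": title,
            "severity": severity,
        },
    }
    return urllib.request.Request(
        url=f"https://api.github.com/repos/{repo}/dispatches",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
        },
    )


# Hypothetical values: substitute your own repository, token, and payload.
request = build_dispatch_request(
    "example-org/status-site", "TOKEN", "api", "degraded", "API latency", "minor"
)
# urllib.request.urlopen(request)  # uncomment to actually send the event
```

Keeping request construction separate from sending makes the payload easy to inspect before wiring it to a real alert.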

Benefits

No client-side requests to your services
Works with any monitoring stack (Prometheus, Datadog, Pingdom, etc.)
Zero runtime cost after build
Full control over when status changes

Trade-offs

Status updates have pipeline latency (seconds to minutes)
Requires external monitoring infrastructure
No real-time latency display

Approach 2: Real-Time (Client Polling)

The browser polls each endpoint directly to show live status and latency.

How It Works

1. Browser loads status page
2. JavaScript polls each configured endpoint
3. Status and latency displayed
4. Countdown shows time to next check
5. Repeat at configured interval
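
MinimalDoc runs this loop as JavaScript in the browser. The Python sketch below only mirrors the steps above so the logic is easy to follow and test. The function names, the 5-second timeout, and the injectable `opener` and `sleep` parameters are assumptions made for illustration, not part of MinimalDoc.

```python
import time
import urllib.error
import urllib.request


def check_once(base_url, health_endpoint, timeout=5.0, opener=urllib.request.urlopen):
    """Request the health endpoint once and return (status_code, latency_ms).

    status_code is None when the request times out or fails at the network level.
    """
    url = base_url.rstrip("/") + "/" + health_endpoint.lstrip("/")
    started = time.perf_counter()
    try:
        with opener(url, timeout=timeout) as response:
            code = response.status
    except urllib.error.HTTPError as err:
        code = err.code  # the server answered, but with a 4xx/5xx status
    except OSError:
        code = None  # timeout, DNS failure, refused connection, ...
    latency_ms = (time.perf_counter() - started) * 1000
    return code, latency_ms


def poll(component, rounds, sleep=time.sleep, opener=urllib.request.urlopen):
    """Check one component `rounds` times, waiting health_interval seconds between checks."""
    results = []
    for i in range(rounds):
        results.append(
            check_once(component["url"], component["health_endpoint"], opener=opener)
        )
        if i < rounds - 1:
            sleep(component["health_interval"])
    return results
```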

Configuration

# components.yaml
- id: api
  name: API
  url: https://api.example.com
  health_endpoint: /health
  health_interval: 30

Field	Type	Description
`url`	string	Base service URL
`health_endpoint`	string	Health check path
`health_interval`	int	Seconds between checks

Benefits

Live latency display
Status updates within one polling interval, with no pipeline delay
No external monitoring needed
Simple setup

Trade-offs

Every visitor polls your endpoints
Requires CORS headers on health endpoints
Can increase load on services
Browser must be open to see updates

Real-Time Details

Display

┌─────────────────────────────────────────────────┐
│  API                                             │
│  https://api.example.com                        │
│                                                  │
│  Status: ● Healthy         Latency: 24ms        │
│  Last check: 10s ago       Next: 20s            │
└─────────────────────────────────────────────────┘

Status Indicators

Status	Indicator	Condition
Healthy	Green dot	2xx or 3xx response
Degraded	Yellow dot	Slow response (>1s)
Unhealthy	Red dot	4xx or 5xx response, timeout, or network error
Checking	Pulse animation	Request in progress

Latency Display

Range	Display
< 100ms	Green
100-500ms	Yellow
500ms-1s	Orange
> 1s	Red
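
The table leaves the exact boundaries open, so the sketch below makes them explicit. It assumes that exactly 100 ms and 500 ms count as yellow and exactly 1000 ms counts as orange. The function name and these boundary choices are assumptions, not documented MinimalDoc behavior.

```python
def latency_color(latency_ms):
    """Map a latency in milliseconds to the display color from the table."""
    if latency_ms < 100:
        return "green"
    if latency_ms <= 500:
        return "yellow"
    if latency_ms <= 1000:
        return "orange"
    return "red"  # strictly more than one second
```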

Health Endpoint

Your service should expose a health endpoint:

Simple Response

GET /health

200 OK
{
  "status": "ok"
}

Detailed Response

GET /health

200 OK
{
  "status": "ok",
  "version": "1.2.3",
  "uptime": 86400,
  "checks": {
    "database": "ok",
    "cache": "ok",
    "queue": "ok"
  }
}

Unhealthy Response

GET /health

503 Service Unavailable
{
  "status": "unhealthy",
  "checks": {
    "database": "error",
    "cache": "ok"
  }
}
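
The three responses above follow one pattern: report `ok` with a 200 when every dependency check passes, and `unhealthy` with a 503 otherwise. A minimal sketch of that rule, assuming each check reports the string `"ok"` on success, follows. The function name and signature are hypothetical.

```python
import json


def build_health_response(checks, version=None, uptime=None):
    """Return (http_status, json_body) for a dict of dependency check results."""
    healthy = all(state == "ok" for state in checks.values())
    body = {"status": "ok" if healthy else "unhealthy"}
    if version is not None:
        body["version"] = version
    if uptime is not None:
        body["uptime"] = uptime
    body["checks"] = dict(checks)
    return (200 if healthy else 503), json.dumps(body)
```

Only the HTTP status code decides how the component is shown, so the JSON body mainly helps people and other tools reading the endpoint.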

Status Determination

Response	Status
200-299	Healthy
300-399	Healthy (redirect)
400-499	Unhealthy
500-599	Unhealthy
Timeout	Unhealthy
Network error	Unhealthy
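
The table above can be read as a small decision function. The sketch below also applies the slow-response rule (>1s is Degraded) from the Status Indicators table. It assumes that rule only applies to responses that would otherwise be Healthy, and that `None` stands for a timeout or network error. The function name and that precedence are assumptions.

```python
def classify(status_code, latency_ms, slow_threshold_ms=1000):
    """Return "healthy", "degraded", or "unhealthy" for one check result."""
    if status_code is None or not 200 <= status_code < 400:
        return "unhealthy"  # 4xx, 5xx, timeout, or network error
    if latency_ms > slow_threshold_ms:
        return "degraded"  # successful but slow response
    return "healthy"
```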

Configuration Examples

Basic

- id: api
  name: API
  url: https://api.example.com
  health_endpoint: /health
  health_interval: 30

Frequent Checks

- id: critical-service
  name: Critical Service
  url: https://critical.example.com
  health_endpoint: /health
  health_interval: 10 # Every 10 seconds

Custom Endpoint

- id: legacy-service
  name: Legacy Service
  url: https://legacy.example.com
  health_endpoint: /api/v1/status
  health_interval: 60

Multiple Services

- id: api
  name: API
  url: https://api.example.com
  health_endpoint: /health
  health_interval: 30

- id: web
  name: Web App
  url: https://app.example.com
  health_endpoint: /health
  health_interval: 30

- id: cdn
  name: CDN
  url: https://cdn.example.com
  health_endpoint: /health
  health_interval: 60

CORS Requirements

Health endpoints must allow cross-origin requests:

Access-Control-Allow-Origin: *
Access-Control-Allow-Methods: GET, HEAD, OPTIONS
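
As a minimal sketch of a service that sends these headers, the handler below uses only Python's standard library. The class name, port, and fixed `{"status": "ok"}` body are illustrative assumptions. A real service would set the same headers in its own framework or reverse proxy.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

CORS_HEADERS = {
    "Access-Control-Allow-Origin": "*",
    "Access-Control-Allow-Methods": "GET, HEAD, OPTIONS",
}


class HealthHandler(BaseHTTPRequestHandler):
    def _respond(self, include_body):
        if self.path != "/health":
            self.send_error(404)
            return
        body = json.dumps({"status": "ok"}).encode("utf-8")
        self.send_response(200)
        for name, value in CORS_HEADERS.items():
            self.send_header(name, value)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        if include_body:
            self.wfile.write(body)

    def do_GET(self):
        self._respond(include_body=True)

    def do_HEAD(self):
        self._respond(include_body=False)

    def do_OPTIONS(self):
        # Answer CORS preflight requests with the same headers and no body.
        self.send_response(204)
        for name, value in CORS_HEADERS.items():
            self.send_header(name, value)
        self.end_headers()


def make_server(port=8080):
    """Create (but do not start) a local server exposing /health."""
    return HTTPServer(("127.0.0.1", port), HealthHandler)


# make_server().serve_forever()  # uncomment to run the example server
```

If the health response contains internal details, consider replacing `*` with the origin of your status page.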

Choosing an Approach

Use Case	Recommended
High traffic status page	Static (CI/CD)
Internal tools, low traffic	Real-Time (Client)
External services you don’t control	Static (CI/CD)
Services with CORS-enabled health endpoints	Either
No monitoring infrastructure	Real-Time (Client)

Static-Only Components

Components without health configuration use the static approach:

- id: external-service
  name: External Service
  status: operational  # Updated by CI/CD only

No live polling, no latency display. Status changes via pipeline.

Fallback Behavior

If a health check fails:

The component shows its configured static status
An error indicator is displayed
The check is retried at the next interval

JSON Output

Health configuration is included in status.json:

{
  "components": [
    {
      "id": "api",
      "name": "API",
      "url": "https://api.example.com",
      "health_endpoint": "/health",
      "health_interval": 30
    }
  ]
}

External tools can use this for their own monitoring.
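
As a sketch of such a tool, the function below turns a parsed status.json document into a list of URLs to check. It assumes components without `url` or `health_endpoint` are static-only and should be skipped. The function name is hypothetical, and the sample mirrors the JSON above.

```python
import json


def health_targets(status_document):
    """Return (id, full_health_url, interval_seconds) for each pollable component."""
    targets = []
    for component in status_document.get("components", []):
        endpoint = component.get("health_endpoint")
        if not component.get("url") or not endpoint:
            continue  # assumed static-only component: nothing to poll
        full_url = component["url"].rstrip("/") + "/" + endpoint.lstrip("/")
        targets.append((component["id"], full_url, component.get("health_interval")))
    return targets


sample = json.loads("""
{
  "components": [
    {
      "id": "api",
      "name": "API",
      "url": "https://api.example.com",
      "health_endpoint": "/health",
      "health_interval": 30
    }
  ]
}
""")
targets = health_targets(sample)
```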