Status Page

Incident Management

Create and manage incidents to keep your community informed during service disruptions.

What is an Incident?

An incident represents a service disruption, degradation, or other issue that affects your community. When you create an incident, all subscribers are notified automatically and the status page updates in real time.

Incident lifecycle: Investigating → Identified → Monitoring → Resolved
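
The lifecycle above can be read as a simple ordered sequence. The following Python sketch is illustrative only: the names IncidentStatus, LIFECYCLE, and next_status are hypothetical and not part of any documented API, and it assumes incidents move forward one stage at a time, which the product may not enforce.

```python
from enum import Enum
from typing import Optional


class IncidentStatus(Enum):
    """The four lifecycle stages, declared in lifecycle order."""
    INVESTIGATING = "Investigating"
    IDENTIFIED = "Identified"
    MONITORING = "Monitoring"
    RESOLVED = "Resolved"


# Enum members iterate in definition order, so this list mirrors the lifecycle.
LIFECYCLE = list(IncidentStatus)


def next_status(current: IncidentStatus) -> Optional[IncidentStatus]:
    """Return the stage that follows `current`, or None once resolved."""
    position = LIFECYCLE.index(current)
    if position + 1 < len(LIFECYCLE):
        return LIFECYCLE[position + 1]
    return None
```

For example, next_status(IncidentStatus.IDENTIFIED) returns IncidentStatus.MONITORING, and a resolved incident has no next stage.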

Creating an Incident

Follow these steps to report a service incident:

Step 1: Start Incident

Click "Create Incident" on your status page dashboard or from a component

Step 2: Add Details

Provide incident name, description, and affected components

Step 3: Set Status

Choose the incident status (Investigating, Identified, Monitoring, or Resolved)

Step 4: Publish

Incident is posted to status page and subscribers are notified
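
The details collected in these steps can be pictured as a small record that is checked before publishing. This is a sketch under stated assumptions, not the product's data model: IncidentDraft and problems are hypothetical names, and treating the name, description, and affected components as all required is an assumption the steps above do not confirm.

```python
from dataclasses import dataclass, field
from typing import List

# Status names taken from the steps above.
STATUSES = ("Investigating", "Identified", "Monitoring", "Resolved")


@dataclass
class IncidentDraft:
    """Hypothetical container for the details entered in Steps 2 and 3."""
    name: str
    description: str
    affected_components: List[str] = field(default_factory=list)
    status: str = "Investigating"

    def problems(self) -> List[str]:
        """List what is missing or invalid; an empty list means ready to publish."""
        found = []
        if not self.name.strip():
            found.append("name is required")
        if not self.description.strip():
            found.append("description is required")
        if not self.affected_components:
            # Assumption: at least one component must be selected.
            found.append("select at least one affected component")
        if self.status not in STATUSES:
            found.append("unknown status: " + self.status)
        return found
```

A draft such as IncidentDraft("API errors", "Elevated error rate", ["API"]) reports no problems, while one with an empty name reports that the name is required.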

Incident Status Levels

Investigating

You're aware of the issue and working on it. Unknown ETA.

Identified

Root cause found. You know what's wrong and are fixing it.

Monitoring

Fix deployed. Monitoring to ensure full recovery.

Resolved

Issue completely resolved. Services back to normal.

Incident Updates

Keep your community informed by adding updates to ongoing incidents:

Adding Updates

Click "Add Update" on an active incident to post a status update

Update Frequency

Post updates every 30-60 minutes during active incidents
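
As a rough illustration of this cadence, the sketch below flags an incident whose last update is older than the upper end of the recommended range. The function name update_is_overdue and the choice of 60 minutes as the cutoff are assumptions made for the example; the product is not known to expose such a check.

```python
from datetime import datetime, timedelta

# Upper bound of the recommended 30-60 minute cadence (assumed cutoff).
MAX_GAP = timedelta(minutes=60)


def update_is_overdue(last_update: datetime, now: datetime,
                      max_gap: timedelta = MAX_GAP) -> bool:
    """True when more than `max_gap` has passed since the last posted update.

    Both datetimes should be timezone-aware (for example, UTC) so the
    subtraction is unambiguous.
    """
    return now - last_update > max_gap
```

Passing max_gap=timedelta(minutes=30) applies the stricter end of the range instead.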

What to Include

  • Current status and latest findings
  • What's being done to resolve it
  • Estimated time to resolution
  • Affected services and workarounds if available

Example Incident Timeline

14:35 UTC

Incident Created

Database connectivity issue detected

14:42 UTC

Investigating

Team is investigating a problem with database server connections

15:10 UTC

Identified

Root cause: Network misconfiguration. Fix in progress

15:25 UTC

Monitoring

Fix deployed. Monitoring connection stability

15:35 UTC

Resolved

All systems back to normal operation
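
To make the timeline concrete, the time spent in each stage can be worked out from the timestamps above. The Python sketch below does that arithmetic; the helper names minutes_between and stage_durations are hypothetical, and it assumes all entries fall on the same UTC day.

```python
from datetime import datetime

# Timestamps and stages copied from the example timeline (all UTC).
TIMELINE = [
    ("14:35", "Incident Created"),
    ("14:42", "Investigating"),
    ("15:10", "Identified"),
    ("15:25", "Monitoring"),
    ("15:35", "Resolved"),
]


def minutes_between(start: str, end: str) -> int:
    """Minutes from `start` to `end`, both HH:MM on the same day."""
    fmt = "%H:%M"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds() // 60)


def stage_durations(timeline):
    """Pair each stage with the minutes spent in it before the next entry."""
    return [
        (label, minutes_between(start, end))
        for (start, label), (end, _next_label) in zip(timeline, timeline[1:])
    ]


# stage_durations(TIMELINE) gives:
#   Incident Created 7, Investigating 28, Identified 15, Monitoring 10
# Total time from creation to resolution: 60 minutes.
```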

Incident Types

Different incident types help categorize issues:

Outage

Service is completely offline

Degradation

Service running slowly or with reduced functionality

Partial Outage

Some users or features affected

Maintenance

Planned downtime for updates

Resolved Incidents

Resolved incidents are archived automatically:

Visibility

Appears in incident history for 90 days by default

Post-Mortem

Add a post-mortem report explaining what happened and lessons learned

Analytics

Track incident duration, affected components, and impact metrics

Scheduled Maintenance

Plan and announce maintenance windows:

Schedule Maintenance

Set start and end times for planned downtime

Advance Notice

Post maintenance notices days or weeks in advance

Auto-Resolution

Maintenance window automatically closes at scheduled end time

Component Status

Select which components will be affected by maintenance
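
The scheduling behavior described above can be sketched as a function of the window's start and end times. This is an illustration only: maintenance_state and the labels "scheduled", "in progress", and "completed" are hypothetical, chosen to show how a window could close on its own once the scheduled end time passes.

```python
from datetime import datetime


def maintenance_state(start: datetime, end: datetime, now: datetime) -> str:
    """Classify a maintenance window relative to the current time.

    All three datetimes should be timezone-aware and in the same zone.
    """
    if end <= start:
        raise ValueError("end time must be after start time")
    if now < start:
        return "scheduled"      # announced in advance, not yet started
    if now < end:
        return "in progress"    # inside the planned downtime
    return "completed"          # window closes at the scheduled end time
```

With a window from 02:00 to 04:00 UTC, a check at 03:00 returns "in progress" and a check at 04:00 returns "completed".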

Best Practices

  • Create incidents immediately when issues are detected
  • Update incidents regularly, at least every 30 minutes
  • Be transparent about the impact and timeline
  • Provide workarounds when possible
  • Post a brief post-mortem after resolution
  • Use clear, non-technical language
  • Mark as Monitoring before calling it fully Resolved