I have loved Sentry ever since I discovered it many years ago. Back in the day, self-hosting it was really easy: a single Docker image that you would use to spin up 2-3 separate services, plus Postgres, Redis, and a few lines of config, and you were done.

Nowadays (2024), self-hosting Sentry requires spinning up 50+ different services - and that is of course without any fancy HA setup. It’s still doable, especially with Kubernetes, but the learning curve is definitely steeper. Then again, the feature set of Sentry itself is much richer - it’s not just about catching errors anymore; you have full-fledged built-in performance monitoring, session recording, and tons of other observability-related goodies.

One thing that is kind of tricky is monitoring your Sentry instance itself. Here is how I do it.

Use built-in endpoints

…and it’s not enough

I ran into a situation where Relay went on a reconnecting spree (despite reporting live/ready) - I no longer remember the exact root cause, but it took me a while to realize that events were not being properly ingested. In the end, I configured a cron job that sends a Sentry event (an exception) to one of my projects every ~hour. Then I muted that exception, as it was obviously not actionable.
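The heartbeat job itself can be tiny. Here is a minimal sketch of what such a script could look like, assuming Node.js and the @sentry/node SDK; the file name, the error message, and the SENTRY_DSN environment variable are placeholders of mine, not something Sentry prescribes:

```javascript
// heartbeat.js - run from cron, e.g. hourly (file name and schedule are illustrative)

// `sentry` is expected to be the @sentry/node module; passing it in as an
// argument keeps the function easy to exercise with a stub.
async function sendHeartbeat(sentry, dsn) {
  sentry.init({ dsn });
  // Same message on every run, so the events should group into a single issue
  sentry.captureException(new Error('Sentry heartbeat'));
  // Wait up to 5 s for delivery, because the process exits right afterwards
  return sentry.flush(5000);
}

// Usage (assumes `npm install @sentry/node` and a SENTRY_DSN env variable):
//   sendHeartbeat(require('@sentry/node'), process.env.SENTRY_DSN);
```

The flush call matters for short-lived scripts: without it the process may exit before the event has left the machine.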

Once you have that exception in Sentry, it’s time to check if it’s being registered every hour.

Sentry exposes an API endpoint where you can obtain issue details; one interesting part of that JSON response is lastSeen - you can query this endpoint every ~hour, parse the response, and check that the timestamp is no older than ~2h (so you have some overlap). If it isn’t, Sentry is ingesting events nicely.
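The freshness check boils down to a bit of date arithmetic. As a rough sketch (the function name and the 2-hour default are my own choices, and I am assuming the issue JSON carries an ISO 8601 timestamp in lastSeen, which is the field the worker further down reads):

```javascript
// Returns true when the issue has received an event within the last `maxAgeHours`.
function isIngesting(issue, now = new Date(), maxAgeHours = 2) {
  const lastSeen = new Date(issue.lastSeen);
  // A missing or malformed timestamp counts as "not ingesting"
  if (Number.isNaN(lastSeen.getTime())) return false;
  const ageHours = (now.getTime() - lastSeen.getTime()) / (1000 * 60 * 60);
  return ageHours <= maxAgeHours;
}

const now = new Date('2024-05-01T12:00:00Z');
console.log(isIngesting({ lastSeen: '2024-05-01T11:10:00Z' }, now)); // true (50 minutes old)
console.log(isIngesting({ lastSeen: '2024-05-01T09:00:00Z' }, now)); // false (3 hours old)
```

With an hourly heartbeat, a 2-hour threshold tolerates one late or missed run before the check starts failing.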

You can do that using Cloudflare Workers. Here is part of the Terraform code I used to configure it:

resource "cloudflare_worker_script" "sentry_monitoring" {  
  account_id = var.cloudflare_account_id  
  name       = "sentry-monitoring"  
  content    = file("sentry.js")  
  
  plain_text_binding {  
    name = "ISSUE_ID"  
    text = "<your issue id>"  
  }  
  
  plain_text_binding {  
    name = "SENTRY_DOMAIN"  
    text = cloudflare_record.sentry.name // assuming you have this configured
  }  
}  
  
resource "cloudflare_worker_secret" "sentry_auth_token" {  
  account_id  = var.cloudflare_account_id  
  script_name = cloudflare_worker_script.sentry_monitoring.name  
  name        = "SENTRY_API_TOKEN" // issues read-only token 
  secret_text = var.sentry_api_token // I'm using Terraform Cloud to set this variable
}  
  
resource "cloudflare_worker_route" "sentry_monitoring" {  
  zone_id     = cloudflare_zone.your_zone_id.id // assuming you have this configured
  script_name = cloudflare_worker_script.sentry_monitoring.name  
  pattern     = "https://${cloudflare_record.sentry.name}/uptime*"  
}

Here is the worker JS file:

addEventListener('fetch', event => {
  event.respondWith(handleRequest())
})

async function handleRequest() {
  // you might want to pass project id as argument here
  const apiUrl = `https://${SENTRY_DOMAIN}/api/0/issues/${ISSUE_ID}/`
  const token = SENTRY_API_TOKEN;

  try {
    const response = await fetch(apiUrl, {
      headers: {
        'Authorization': `Bearer ${token}`
      }
    })
    if (!response.ok) {
      // Worker logs are currently in beta; you can enable them in the Cloudflare dashboard.
      // The Terraform provider seems to be lagging behind here.
      console.log(`Error: ${response.status} - ${response.statusText}`);
      return new Response('API Error', { status: 500 });
    }

    // Parse the body only after the status check, since error responses may not be JSON
    const data = await response.json()

    const lastEventStr = data.lastSeen
    const lastEventDate = new Date(lastEventStr)
    const currentTime = new Date()
    const diffInHours = (currentTime - lastEventDate) / (1000 * 60 * 60)

    if (diffInHours <= 2) {
      return new Response(lastEventStr, { status: 200 })
    } else {
      return new Response(lastEventStr, { status: 500 })
    }
  } catch (error) {
    console.log('Error:', error);
    return new Response("Unhandled error", { status: 500 })
  }
}

The Cloudflare worker now answers under <your Sentry domain>/uptime with a 200 or 500 response, so it’s easy to add that endpoint to the monitoring software of your choice as well.
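If your monitoring tool supports scripted checks, or you just want to test the endpoint by hand, a probe could look roughly like this. It is only a sketch: probeUptime and its return shape are hypothetical names of mine, and it assumes a runtime with a global fetch (Node 18+ or similar):

```javascript
// Probes the /uptime route served by the worker and reports whether it is healthy.
// `fetchImpl` is injectable so the function can be tried without network access.
async function probeUptime(url, fetchImpl = fetch) {
  try {
    const res = await fetchImpl(url);
    // On success the worker returns the lastSeen timestamp as the body
    const body = await res.text();
    return { healthy: res.status === 200, body };
  } catch (err) {
    // DNS failure, timeout, TLS error, etc. all count as unhealthy
    return { healthy: false, body: null };
  }
}

// Usage (replace the host with your own Sentry domain):
//   probeUptime('https://sentry.example.com/uptime').then(console.log);
```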

Now you should be monitoring 4 different endpoints, which should give you enough confidence that Sentry is indeed up and properly processing events.