3:47 AM: Inside a Real Incident Caught by an Open-Source SOC Stack

Most SOC content explains architecture. This one walks through an actual incident, alert by alert, on a production stack running for a mid-sized enterprise client — Wazuh for detection, DFIR-IRIS for case management, and a custom FastAPI integrator holding the two together. No SaaS licenses. No cloud SIEM bill. Every component either open-source or built in-house.

Details specific to the client have been generalized; the detection logic, timestamps, and system behavior are real.

If you want the full architecture first, we’ve covered that separately: how the stack was built, commit by commit, and what simpliSOC monitors end to end. This post is what happens when it’s watching.

3:47 AM — First Alert

A login succeeds on the VPN concentrator for a finance department account. Nothing unusual on its own — remote work means logins outside business hours aren’t automatically suspicious. Wazuh’s identity rule set logs it, scores it low, moves on.

3:52 AM — Second Login, Wrong Continent

Five minutes later, the same account authenticates again — this time from a residential IP block roughly 9,000 km from the first login’s geolocation. Neither login alone would raise an eyebrow in most SOCs. Together, they’re physically impossible.

This is where a stateless rule engine hits its ceiling. Wazuh can flag "login from new location" as a single event, but it has no memory of the login five minutes earlier and no way to compute distance-over-time between them. That correlation lives in the integrator layer: a PostgreSQL-backed worker that tracks recent authentication events per user and runs a haversine distance calculation against elapsed time. Two logins that would require faster-than-commercial-flight travel get flagged automatically — no analyst had to notice the pattern by eye.
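To make the arithmetic concrete: 9,000 km covered in 5 minutes implies a speed of about 108,000 km/h, roughly two orders of magnitude beyond any airliner. The sketch below shows one way such a check could look. It is not the integrator's actual code; the `Login` type, the function names, and the 900 km/h threshold are assumptions made for illustration, and the real worker reads its login history from PostgreSQL rather than taking two objects as arguments.

```python
from dataclasses import dataclass
from datetime import datetime
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0
# Assumed threshold: roughly the cruise speed of a commercial airliner.
MAX_PLAUSIBLE_KMH = 900.0


@dataclass(frozen=True)
class Login:
    """Hypothetical record of one authentication event."""
    user: str
    when: datetime
    lat: float
    lon: float


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two coordinates, in kilometres."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = radians(lat2 - lat1)
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    # Clamp to 1.0 to guard against floating-point overshoot near antipodes.
    return 2 * EARTH_RADIUS_KM * asin(min(1.0, sqrt(a)))


def implied_speed_kmh(prev: Login, curr: Login) -> float:
    """Speed a person would need to travel between the two login locations."""
    hours = abs((curr.when - prev.when).total_seconds()) / 3600
    distance = haversine_km(prev.lat, prev.lon, curr.lat, curr.lon)
    if hours == 0:
        return float("inf") if distance > 0 else 0.0
    return distance / hours


def is_impossible_travel(prev: Login, curr: Login,
                         max_kmh: float = MAX_PLAUSIBLE_KMH) -> bool:
    """True when the same user would have to exceed max_kmh between logins."""
    return prev.user == curr.user and implied_speed_kmh(prev, curr) > max_kmh
```

In practice the threshold is a tuning decision: IP geolocation is imprecise, so a margin above airliner speed or a minimum distance helps avoid flagging users who hop between nearby VPN exit points.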

The correlation engine opens a case in DFIR-IRIS. Severity: Critical. SLA clock starts: 1 hour to first response.

3:53 AM — Enrichment, Before a Human Even Looks

By the time the case appears in the analyst queue, it already has context attached. The integrator’s enrichment worker queried the second login’s source IP against VirusTotal and AbuseIPDB in the background: the IP comes back flagged across three community blocklists and associated with prior credential-stuffing activity. The resulting confidence score sits in the case notes as a structured field, not buried in a paragraph the analyst has to go find.

This matters more than it sounds. In a lot of SOC setups, enrichment happens after triage, or requires the analyst to pivot to a second tool. Here it happens before the case ever reaches a human, because IOC enrichment is async and runs in parallel with alert creation — it isn’t waiting in line behind anything.
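A minimal sketch of that "in parallel, not in line" idea, using Python's `asyncio`. Everything here is a stand-in: `lookup_reputation` and `create_case` are hypothetical placeholders that simulate latency instead of calling VirusTotal, AbuseIPDB, or DFIR-IRIS, and the provider names and 5-second timeout are assumptions.

```python
import asyncio


async def lookup_reputation(source: str, ip: str) -> dict:
    """Placeholder for one provider lookup (hypothetical, no real API call)."""
    await asyncio.sleep(0.01)  # stands in for network latency
    return {"source": source, "ip": ip, "flagged": True}


async def safe_lookup(source: str, ip: str, timeout: float = 5.0) -> dict:
    """Run one lookup; a slow or failing provider yields an error record."""
    try:
        return await asyncio.wait_for(lookup_reputation(source, ip), timeout)
    except Exception as exc:  # enrichment must never block the case
        return {"source": source, "ip": ip, "error": repr(exc)}


async def create_case(alert: dict) -> dict:
    """Placeholder for opening the case in the case-management tool."""
    await asyncio.sleep(0.01)
    return {"case_id": 1, "alert": alert, "enrichment": []}


async def handle_alert(alert: dict) -> dict:
    ip = alert["src_ip"]
    # Case creation and both lookups are scheduled together, so total
    # latency is roughly the slowest call, not the sum of all three.
    case, *results = await asyncio.gather(
        create_case(alert),
        safe_lookup("provider_a", ip),
        safe_lookup("provider_b", ip),
    )
    case["enrichment"] = results  # structured fields, one per provider
    return case
```

Calling `asyncio.run(handle_alert({"src_ip": "203.0.113.7"}))` returns a case dictionary with one enrichment entry per provider, in the order the lookups were scheduled.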

3:58 AM — The Alert That Didn’t Flood the Queue

The same source IP tries two more account logins in the next four minutes, and both fail. Each one would normally generate its own alert. Without deduplication, this is exactly the pattern that trains analysts to tune out the queue: three related alerts from one source IP (the suspicious login plus two failed attempts) for one underlying event, each demanding separate attention.

The integrator’s dedup worker groups them under the original case using a per-rule cooldown window — credential-abuse patterns get a set window per user/host, tuned from production alert volume, not guessed at. One case. One SLA clock. Three data points instead of three separate fires.
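The grouping logic can be sketched in a few lines. This is an illustration under stated assumptions, not the production worker: the rule name, the 15-minute and 5-minute windows, and the `Deduplicator` class are hypothetical, state is held in an in-memory dict rather than PostgreSQL, and the window is treated as sliding (each new alert extends it), which is one of several reasonable choices.

```python
from datetime import datetime, timedelta

# Hypothetical per-rule cooldown windows; real values would be tuned
# from production alert volume.
COOLDOWNS = {"credential_abuse": timedelta(minutes=15)}
DEFAULT_COOLDOWN = timedelta(minutes=5)


class Deduplicator:
    """Groups repeat alerts for the same rule and key under one case."""

    def __init__(self) -> None:
        self._open = {}  # (rule, key) -> (case_id, last_seen)
        self._next_case_id = 1

    def assign_case(self, rule: str, key: str, when: datetime):
        """Return (case_id, is_new) for an incoming alert."""
        window = COOLDOWNS.get(rule, DEFAULT_COOLDOWN)
        entry = self._open.get((rule, key))
        if entry is not None and when - entry[1] <= window:
            case_id = entry[0]
            # Sliding window: refresh last_seen so a sustained burst
            # stays attached to the same case.
            self._open[(rule, key)] = (case_id, when)
            return case_id, False
        case_id = self._next_case_id
        self._next_case_id += 1
        self._open[(rule, key)] = (case_id, when)
        return case_id, True
```

Fed the suspicious login and the two failed attempts with the same rule and key, `assign_case` returns the same case ID three times and reports only the first as new.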

4:04 AM — Analyst Picks It Up

The on-call analyst gets paged through PagerDuty — Critical severity routes there automatically. They open the IRIS case and find, in one place: both login events, the geolocation math, the IOC enrichment, the MITRE ATT&CK mapping (T1078 – Valid Accounts), and a triage checklist specific to this rule, loaded from YAML rather than written from memory under pressure at 4 AM.

The account is disabled. The finance department is notified. The session tokens are revoked. Total time from second login to containment: 23 minutes — inside the 1-hour Critical SLA, with the progress bar still green.
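For reference, the SLA arithmetic behind that green bar is simple: 23 minutes against a 60-minute budget is about 38% consumed. The sketch below is an assumption-laden illustration; only the 1-hour Critical budget comes from this deployment, while the function name, the status labels, and the 75% warning threshold are hypothetical.

```python
from datetime import datetime, timedelta

# Only the Critical tier is stated in this post; other tiers are omitted.
SLA_BY_SEVERITY = {"critical": timedelta(hours=1)}


def sla_status(opened: datetime, now: datetime,
               severity: str = "critical", warn_at: float = 0.75):
    """Return (fraction_of_budget_used, label) for a case's SLA clock."""
    budget = SLA_BY_SEVERITY[severity]
    used = (now - opened) / budget  # timedelta / timedelta gives a float
    if used >= 1.0:
        return used, "breached"
    if used >= warn_at:  # assumed warning threshold
        return used, "amber"
    return used, "green"
```

With the clock opened at 3:52 AM and checked 23 minutes later, the function reports roughly 0.38 of the budget used and a status of "green".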

4:31 AM — Closing the Loop

This is the step a lot of stacks skip. The analyst confirms the case in IRIS — but a confirmed correlation that only lives in the case management tool is half a detection. The integrator sends a syslog message back to Wazuh, registering the confirmed incident in the SIEM itself. Now both interfaces agree: the correlation didn’t just get investigated, it got remembered by the detection layer that will watch for the same pattern from this account going forward.
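A rough sketch of what that write-back could look like, using only the Python standard library. The message layout follows the classic BSD syslog shape (priority, timestamp, host, tag, message), but the JSON payload fields, the `soc-integrator` tag, the hostname, and the choice of UDP are assumptions for illustration. On the Wazuh side, a matching decoder and rule would still be needed to turn this line into an alert.

```python
import json
import socket
from datetime import datetime

FACILITY_LOCAL0 = 16
SEVERITY_NOTICE = 5


def build_syslog_line(case_id: int, user: str, rule: str,
                      when: datetime, hostname: str = "integrator") -> str:
    """Format a BSD-style syslog line describing a confirmed incident."""
    pri = FACILITY_LOCAL0 * 8 + SEVERITY_NOTICE  # priority = facility*8 + severity
    # BSD syslog timestamps space-pad the day of month ("Jan  5").
    stamp = f"{when.strftime('%b')} {when.day:>2} {when.strftime('%H:%M:%S')}"
    payload = json.dumps(
        {"event": "confirmed_incident", "case_id": case_id,
         "user": user, "rule": rule},  # hypothetical field names
        sort_keys=True,
    )
    return f"<{pri}>{stamp} {hostname} soc-integrator: {payload}"


def send_syslog(line: str, host: str, port: int = 514) -> None:
    """Send one syslog line over UDP (host and port are deployment-specific)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(line.encode("utf-8"), (host, port))
```

Keeping the payload as JSON means the SIEM can extract `case_id` and `user` as fields instead of parsing free text, which makes the follow-up rule for "same pattern, same account" easier to write.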

What This Would Have Looked Like Without the Integrator

Take away the correlation engine, and this is two low-severity, unrelated-looking alerts sitting in separate queues — the kind that get triaged individually and closed as "user error, VPN client cached location" without anyone connecting them.

Take away deduplication, and each failed login raises its own alert: four alerts instead of two, three of them tracing back to the same source IP and the same underlying event, all feeding the alert fatigue that makes real signals harder to see.

Take away the enrichment worker, and the analyst is opening a second tab to check VirusTotal manually — the 23-minute containment window gets longer every time a human has to context-switch.

None of these individual pieces are exotic. What makes the difference is that they’re wired together and running before a human is in the loop, not after.

The Numbers Behind This Stack

Metric	Value
Custom detection rules	95, across 8 categories
Alert sync cycle (Wazuh → IRIS)	every 5 seconds
IOC list refresh	every 4 hours
Correlation state	PostgreSQL-backed, persistent
Critical incident SLA	1 hour to first response
This incident: detection to containment	23 minutes
Deployment	100% on-premises, no cloud dependency

Why It’s Built This Way

Every detection rule in this stack ships in two versions: a simulation rule that fires on synthetic test data, and a production rule tuned to real event fields. That pairing means the correlation logic above was validated against a test harness before it ever watched a real login — the 3:52 AM alert wasn’t the first time that rule had fired, just the first time it mattered.

It’s also why the integrator runs as a dedicated FastAPI service rather than relying on Wazuh’s built-in active-response scripts: correlation needs state that persists between events, enrichment needs to happen asynchronously without blocking alert creation, and every decision needs to be replayable through a REST endpoint when you’re tuning thresholds after an incident like this one.

FAQ

Is this a real incident or a demonstration scenario?
The detection logic, timing, and system behavior are drawn from a real production deployment. Client-identifying details have been generalized to protect confidentiality.

Could this same setup catch things that aren’t credential abuse?
Yes — the same correlation engine handles lateral movement and privilege-escalation patterns using the same persistent-state approach; credential abuse is simply the clearest one to walk through end to end.

What does it take to get a stack like this running in our environment?
It depends on your existing log sources and compliance requirements. The fastest way to answer that is to see it running against a scenario close to yours.

Does this replace our existing firewall or endpoint tools?
No. This stack ingests from your existing perimeter and endpoint sources (firewall syslog, Windows AD, Sysmon, hypervisor logs) — it’s the detection, correlation, and case management layer sitting on top, not a replacement for what’s already logging.

See This Stack Catch a Live Scenario

Reading about a 23-minute containment window is one thing. Watching the correlation engine flag it in real time, the IRIS case build itself, and the loop close back to Wazuh is another. Request a demo and we’ll run a scenario close to your actual environment — your log sources, your compliance requirements, your alert volume.

hello@simplico.net
