The Baseline Fallacy in Defensive Cyber Operations
Both civilian cybersecurity practitioners and military defensive cyber operations personnel continue to push the same idea: that to identify malicious activity, one must “just get a baseline and then flag anything different.” If only! I call this the Baseline Fallacy:
The Baseline Fallacy: One need only capture a baseline and then flag any differences to uncover meaningful evidence of compromise.
I call it a “fallacy” rather than a strategy because it is infeasible to capture a useful baseline, infeasible to flag meaningful differences, and unreasonable to expect such a basic and flawed practice to uncover actionable evidence of compromise in a computer network.
"Capture a baseline..." #
The inherent flaw in this approach starts here, right at the beginning. “Capture a baseline” of what, exactly? The number of people who can answer that question is much smaller than the number of people who recommend doing so. And in the uncommon case that the question has an answer, it opens the door to a flood of secondary ones.
Consider, for example, user activity. Should you capture a baseline of user activity at the individual level, where lurking variables limit the applicability of any one data point to another, or try to capture a baseline at the organizational level, where the window necessary to accommodate the data’s spread makes it insensitive to the outliers this approach seeks to identify? What is the necessary time period to establish this baseline, and what is the appropriate amount of time to apply it before generating a new one? At what level of granularity must that baseline be captured in order to account for patterns of life at one installation or another, the differences in individual units, and individual agency?
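To make that trade-off concrete, here is a minimal sketch in Python using entirely hypothetical login-hour data and made-up usernames. An individual baseline is tight for a predictable user and meaningless for an erratic one, while an organization-wide baseline is so broad that almost nothing falls outside it.

```python
# A minimal sketch of the two baseline levels, using hypothetical
# login-hour data (all names and numbers are illustrative).
from statistics import mean, stdev

# Hypothetical login hours (24-hour clock) observed over two weeks.
login_hours = {
    "analyst_a": [8.9, 9.0, 9.1, 9.0, 8.8, 9.2, 9.0, 9.1, 8.9, 9.0],    # steady 9:00 starter
    "admin_b":   [6.5, 7.0, 13.0, 22.0, 2.0, 6.8, 23.5, 7.2, 1.5, 6.9], # on-call, erratic hours
}

# Individual-level baseline: tight for some users, meaningless for others.
for user, hours in login_hours.items():
    print(f"{user}: mean={mean(hours):.1f}h, stdev={stdev(hours):.1f}h")

# Organization-level baseline: the spread across all users is so wide
# that almost any single login falls comfortably inside it.
all_hours = [h for hours in login_hours.values() for h in hours]
print(f"org-wide: mean={mean(all_hours):.1f}h, stdev={stdev(all_hours):.1f}h")
```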
"... and then flag any differences..." #
The challenges of capturing a suitable baseline at an appropriate level of fidelity compound here, because events that diverge from the baseline vary widely in how much they actually matter.
Consider, again, the user activity example. If an individual logs in at 8:00 a.m. instead of their usual 9:00 a.m., does that warrant alarm? What is the appropriate threshold for an alert? Generating an alert for every difference is impractical, yet no standard rule can decide which differences matter. This is the essence of a wicked problem, a situation for which no perfect solution exists.
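As a toy illustration of the threshold question, consider flagging a login whenever its hour sits more than k standard deviations from that user's mean. The baseline values, the 8:00 a.m. login, and the rule itself are assumptions made up for this sketch, not a recommended detector.

```python
# Toy threshold rule: alert if today's login hour deviates from the user's
# baseline by more than k standard deviations. All values are hypothetical.
from statistics import mean, stdev

baseline = [9.0, 9.1, 8.9, 9.0, 9.2, 8.8, 9.0, 9.1]  # usual ~9:00 a.m. logins
observed = 8.0                                        # today's 8:00 a.m. login

mu, sigma = mean(baseline), stdev(baseline)
z = abs(observed - mu) / sigma

for k in (2, 3, 5, 10):
    verdict = "ALERT" if z > k else "ignore"
    print(f"k={k}: z={z:.1f} -> {verdict}")

# Against this user's tight baseline, an hour-early arrival looks wildly
# anomalous yet is almost certainly benign; a k loose enough to ignore it
# will also ignore genuinely suspicious deviations.
```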
"... to uncover meaningful evidence of compromise." #
Ultimately, the problems of capturing a suitable baseline and the difficulty of identifying meaningful deviations from it manifest at this stage: an imprecise baseline and fuzzy thresholds for the importance of unusual events either inundate analysts with false positives or, when the system is inevitably over-tuned, produce too many false negatives to uncover any meaningful evidence of compromise.
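A short simulation makes that trade-off visible. The scores below are randomly generated stand-ins for “deviation from baseline,” and the handful of “malicious” scores are invented for illustration; sweeping the alert threshold shows how cutting the flood of benign alerts also discards the rare events that matter.

```python
# A toy threshold sweep over simulated deviation scores. All data here is
# fabricated for illustration; it is not drawn from any real environment.
import random

random.seed(0)
benign_scores = [abs(random.gauss(0, 1)) for _ in range(10_000)]  # routine variation
malicious_scores = [2.5, 3.5, 4.5]  # a few hypothetical intrusion-related events

for threshold in (1, 2, 3, 4, 5):
    false_positives = sum(s > threshold for s in benign_scores)
    detections = sum(s > threshold for s in malicious_scores)
    print(f"threshold={threshold}: {false_positives} benign alerts, "
          f"{detections}/{len(malicious_scores)} malicious events caught")
```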
The Baseline Fallacy is not insurmountable. Baseline-driven analytics will never be the silver bullet for identifying malicious activity, just as no other class of analytics will completely solve this problem on its own. With reasonable assumptions, appropriate thresholds, transparent communication, and in combination with other techniques, though, even the problematic approaches derided in these scenarios can play an important role in illuminating subtle evidence of malicious activity. That is the real point of this post: detection is not a matter of just doing any one thing, capturing a baseline included, so let's stop pretending that it is.