Amateurs Talk Detection. Professionals Talk Collection.

I went to a presentation from a large cybersecurity firm the other day. The salesmen (and they were all salesmen, as even a few post-presentation questions made clear) focused on the intelligence their company produced, but knew little about the rest of their company’s intelligence cycle. As a consumer of their products, though, I consider knowledge of the process that created them critical. After an hour on finished intelligence, and in particular a slide that touted a “globally deployed sensor grid”, I wanted to know more about that production process in general, and about their collection specifically. Unfortunately, when asked, the salesmen offered a handful of handwavy statements like “We have millions of sensors across the globe” and “We ingest billions of events per day”, but little beyond that.

I left the venue disappointed but not surprised. Whether dealing with cyber threat intelligence specifically, intelligence in general, or even the types of analysis SOC personnel perform, almost no one wants to talk about the process; they just want to hear about the results. “APT X used Malware Y.” “The SOC identified a compromise in Network Z.” This myopic focus on the assessments themselves precludes an understanding of the entire process and, critically, of how biases at each stage of the intelligence cycle affect the end result. For example, how did guidance given during Planning and Direction inform Collection? Did limited or directed collection, a statistical bias[1], exacerbate human or systemic biases during Processing and Exploitation or Analysis and Production? How did decisions at each phase ultimately impact the accuracy of the assessment delivered in Dissemination and Integration? Answers to questions like these provide the background needed to contextualize finished intelligence products, which are otherwise the result of an opaque process and subject to unclear biases at each stage.

This is not a novel insight. As Andrew Thompson frequently quips, “If your collection is trash, your analysis will be trash.” That holds for intelligence in general, and cyber threat intelligence specifically. Exquisite analysis can never overcome incomplete data. Over the last few days, I realized that the rule also applies to the type of analysis SOC personnel perform.[2] Effective analysis, whether in the intelligence or information security field, requires a transparent and rigorous process, but it relies on collection. Thanks to the fantastic work threat intelligence companies and individual researchers publish, and the myriad rule repositories available to SOC analysts[3], the challenge in uncovering malicious activity is almost never in its actual identification; that is, for the most part, straightforward. The challenge is almost always in collecting the data necessary to enable its detection. “If your collection is trash, your analysis will be trash” applies to the information security field too.
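To make the point concrete, here is a minimal sketch of a collection coverage check: given a set of detection rules and the log sources each one depends on, it reports which rules can actually run against what is being collected. The rule names, log source names, and the `Rule` and `coverage` helpers are all hypothetical, invented for this illustration; they are not taken from any of the rule repositories mentioned above.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """A detection rule and the log sources it needs (hypothetical names)."""
    name: str
    required_sources: frozenset[str]


def coverage(rules, collected):
    """Split rules into those the collected data can support and those it cannot.

    Returns (runnable, blocked), where blocked maps each rule name to the
    sorted list of log sources it is missing.
    """
    runnable = []
    blocked = {}
    for rule in rules:
        missing = rule.required_sources - collected
        if missing:
            blocked[rule.name] = sorted(missing)
        else:
            runnable.append(rule.name)
    return runnable, blocked


# Illustrative inventory; every rule and source name here is made up.
RULES = [
    Rule("suspicious_powershell", frozenset({"process_creation"})),
    Rule("dns_tunneling", frozenset({"dns_query"})),
    Rule("lateral_movement_smb",
         frozenset({"network_connection", "process_creation"})),
]
COLLECTED = frozenset({"process_creation"})

if __name__ == "__main__":
    runnable, blocked = coverage(RULES, COLLECTED)
    print(f"runnable: {runnable}")
    for name, missing in blocked.items():
        print(f"blocked: {name} (missing {', '.join(missing)})")
```

In this toy inventory, two of the three rules are blocked by collection gaps rather than by any weakness in the detection logic itself, which is the situation the quote describes.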

General Omar Bradley once said, “Amateurs talk strategy. Professionals talk logistics.” This does not devalue strategy, but rather highlights the importance of logistics as an enabler of strategic plans. I propose a similar statement for the information security space not to devalue detection, but rather to highlight the importance of the collection upon which it relies: “Amateurs talk detection. Professionals talk collection.”

[1] NIST Special Publication 1270: Proposal for Identifying and Managing Bias in Artificial Intelligence does a nice job of identifying sources of bias. Whereas most think of biases as logical fallacies, the authors of SP 1270 correctly identify three distinct sources: statistical biases “such as representativeness of datasets and fairness of machine learning algorithms”, human biases, and systemic biases.

[2] Again, as I said in SOC Metrics, Part I: Foundational Metrics, “I use ‘SOC’ as a generic term for all groups responsible for securing an organization’s information systems. This includes the IT administrators who provision and maintain those systems, security and compliance monitors who deal with known threats and policy violations, and threat hunters who deal with emerging and novel threats. Each of these entities plays a distinct but critical role in an effective security program, but I will use ‘SOC’ as a general term for all of them here.”

[3] For instance, start with MITRE’s Cyber Analytics Repository (CAR) or the famous Sigma project. Also check out rules from Azure Sentinel, Google Chronicle, DNIF, Splunk, Elastic, FalconForce, Panther Labs, Emerging Threats, and SOC Prime. Don’t forget about advanced hunting queries for Microsoft 365 Defender, Elastic’s Prebuilt Rules, Google’s Community Security Analytics (CSA), and Open Threat Research Forge’s ThreatHunter Playbook, too.