Initial Thoughts From Round 2 Of MITRE’s Enterprise ATT&CK Evaluation
What an amazing year it’s been for the ATT&CK evals team, going from an initial cohort of seven vendors in round 1 to 21 vendors in round 2. The industry adoption of this evaluation has been nothing short of remarkable and is well deserved. With that said, I’m pleased to once again contribute my thoughts and analysis on the outputs of this evaluation to help clients understand the strengths of each of the evaluated offerings.
As I did last year, I’ve released source code on GitHub that breaks out some key metrics for understanding the strengths of these products and generates an Excel workbook to make it easier to parse the results. Admittedly, it does feel like getting in the dunk tank, releasing code knowing it’s going to be pored over by the same engineers whose products I evaluated in a Forrester Wave™ evaluation a few months ago. 🙂
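To give a sense of what that tooling does, here’s a minimal sketch of tallying detection types per vendor and writing the results to a workbook. The input layout, field names, and file paths below are illustrative assumptions, not MITRE’s actual schema or the code I released.

```python
# Minimal sketch (not the released tooling): count detection types per vendor
# from simplified result files and write a summary workbook with pandas.
# The JSON layout and the "detection_type" field are assumptions for illustration.
import json
from collections import Counter

import pandas as pd


def tally_detections(path):
    """Count detection types (e.g., Technique, General, None) in one vendor file."""
    with open(path) as f:
        steps = json.load(f)  # assumed: a list of per-step dicts
    counts = Counter()
    for step in steps:
        counts[step.get("detection_type", "None")] += 1
    return counts


def write_workbook(vendor_files, out_path="attack_eval_summary.xlsx"):
    """Build one row per vendor, one column per detection type, and save to Excel."""
    rows = {vendor: tally_detections(path) for vendor, path in vendor_files.items()}
    df = pd.DataFrame.from_dict(rows, orient="index").fillna(0).astype(int)
    df.to_excel(out_path)


# write_workbook({"VendorA": "vendor_a.json", "VendorB": "vendor_b.json"})
```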
Here are a few of my initial thoughts from the evaluation:
The evaluation contains a mini game of whack-a-mole, but you should ignore all the MSSP stuff.
One of the first things that stood out to me in this evaluation was that some vendors saw it as a good opportunity to show off their managed security service provider (MSSP) offerings by demonstrating a human detection for everything. MITRE added the MSSP detection type in response to some vendors leveraging hunt teams in the previous evaluation, and it helps differentiate between what the human caught and what the technology caught. I’m totally in favor of this addition, mostly because it exposed this silliness.
Having your MSSP or IR investigators look at a small, clean environment and call out everything bad they find is closer to the certification exam they probably took to get the position than to any reality your organization is going to present them with. Looking at these results, it’s more embarrassing to see these vendors trying so hard and still not detecting everything than it is any victory they’re going to claim by talking about it in their blogs. My advice is to just ignore this noise.
Sometimes the numbers don’t add up to what you expect.
This is a bit of a response to some of the questions I’ve gotten about my code, specifically why the number of “None” detections reported by MITRE doesn’t add up to the number of “None” detections I’m scoring. The reason is that “None” is the default condition for not having a detection, but in some cases it isn’t explicitly called out, such as when a step has only an MSSP detection or when every detection involved a configuration change (which I ignore). In both cases, the default condition rises to the top, and there’s a “None” you wouldn’t have found by counting the published results.
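To make that default-to-None logic concrete, here’s a minimal sketch of how a step can end up scored as “None” even though MITRE’s published results never list one explicitly. The field names and values are assumptions for illustration, not MITRE’s schema or my actual scoring code.

```python
# Minimal sketch of the default-to-None logic described above: a step scores
# "None" when, after dropping MSSP detections and detections tagged with a
# configuration change, nothing remains. Field names are illustrative assumptions.

def effective_detections(detections):
    """Return the detections that count toward scoring, or "None" if nothing is left."""
    counted = [
        d for d in detections
        if d.get("type") != "MSSP"
        and "Configuration Change" not in d.get("modifiers", [])
    ]
    return counted if counted else "None"


# A step with only an MSSP detection falls back to the default condition.
print(effective_detections([{"type": "MSSP", "modifiers": []}]))  # prints: None
```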
Why do you hate configuration changes?
These evaluations need to be for the buyers and, therefore, reflect the buyers’ environments. I’m assuming these products are tuned going into the evaluation; you can’t get around that. But if a product then requires further tuning in the middle of the evaluation, that takes it further and further away from what the customer experience is going to be. I don’t forsake this data point altogether, as I calculate and expose it in a property I call “dfir,” for digital forensics and incident response (DFIR), because the absolute visibility afforded by these configuration changes may be interesting to that buyer persona.
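For anyone curious how that split might look in practice, here’s a hypothetical sketch that keeps configuration-change-assisted detections out of the primary tally and counts them in a separate “dfir” bucket instead. The structure and field names are assumptions, not the released code.

```python
# Hypothetical sketch: exclude configuration-change-assisted detections from the
# primary count and track them separately as a "dfir" metric.
# Field names ("type", "modifiers") are assumptions for illustration.

def score_step(detections):
    """Return primary vs. "dfir" counts for a single evaluation step."""
    scores = {"primary": 0, "dfir": 0}
    for d in detections:
        if d.get("type") == "MSSP":
            continue  # human-driven MSSP detections are ignored entirely
        if "Configuration Change" in d.get("modifiers", []):
            scores["dfir"] += 1  # visibility that required mid-evaluation tuning
        else:
            scores["primary"] += 1  # out-of-the-box detection
    return scores
```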
Are you looking at X to decide whether or not to give a vendor credit for this step?
It’s not my job to second-guess the results that MITRE published. I’ve booked research calls with all the participants to discuss the results and process, but the evaluation happened months ago, and I’m certainly not going to make my own judgments on scores based on limited information when MITRE has already reviewed far more and rendered its own.
So who won?
The end user.
Seriously, one of my biggest regrets from the analysis I performed on round 1 was releasing a “simple score” that’s still being used to demonstrate who has the best product. One truth about this industry is that every product has a vision and capabilities designed with a buyer persona in mind. My goal in any of the research I do is to help buyers figure out which product aligns and delivers best for their needs.