Little Boxes Made out of Ticky Tacky, and They’re All the Same

By Shawn Moyer

On any given new client call or meeting about a penetration test or research engagement, something that inevitably comes up is what I call the Box Color and Team Color conversation:

"Does your team prefer a black, white, or grey box approach?"

"Do you consider this to be a red team or blue team sort of engagement?"

These are harder questions to answer than you might think, especially when posed in an RFP or an email, where I have to craft a coherent, well-formed reply to what is often a far more amorphous question than it appears.

To me, box colors and team colors have become so overused in our line of work that they've reached semantic satiation: they're open enough to interpretation that they've outlived their usefulness as points of reference.

Don't get me wrong… We all sort of know what someone means when using the term "red team penetration test" or "black box assessment." We can start with those words and have some kind of entry into a conversation about the type of testing a particular client is looking for.

Still, the interpretation of box colors varies so widely that I usually respond to the question by asking what the client means when they use those words, ideally using a definition that doesn't include yet another team or box color.

If circumstances force me to keep to the convention of boxes and team colors, I often refer to our preferred approach as a "hybrid assessment." You could arguably call this a "grey box" engagement as well. I guess I like to color outside the lines. Hackers do that, you know.

The term "red team" appears to date as far back as World War I, and like many other battlefield analogies, it has been bolted onto our collective language as security professionals, regardless of how well it fits. To the extent that a formal definition exists for computer security, NIST 800-53 defines a red team as:

"An exercise, reflecting real-world conditions, that is conducted as a simulated adversarial attempt to compromise organizational missions and/or business processes to provide a comprehensive assessment of the security capability of the information system and organization."

Sound specific enough? Sure. As always, the devil is in the details.

The fun begins when we try to define "real-world conditions." Does our team have zero knowledge of your environment, your company, or the target organization? Do we have any idea what you are really trying to evaluate? Like real-world adversaries, do we have zero rules of engagement and unlimited time for our testing? Not likely. Perhaps you've just spent three months creating a security awareness program and are hoping to show your CIO how resilient your users are to phishing and social engineering attacks. Sorry, we just compromised your SharePoint server via an insecure upload directory and used it to steal credentials and infiltrate your network instead. Whoops.

Okay, so let's specifically plan for a phishing exercise. But wait, NIST SP 800-53 says a red team exercise should be a "comprehensive assessment," so our team should exhaust any and all potential avenues of attack: the SharePoint site should be in scope, but so should every other public or semi-public website you manage, as well as your VPN, user-specific phishing, and any other pretexting, physical access, and dumpster diving, to say nothing of compromising your corporate wireless LAN or stealing credentials with rogue access points. Perhaps we might try to bribe a sysadmin or two as well, just for good measure. Great. This engagement should take a couple of decades.

Blue teams are just as open to interpretation. NISTIR 7298 describes a blue team either as the "group responsible for defending an enterprise's use of information systems" (the typical definition in wargaming or CTF events), or as "a group of individuals that conduct operational network vulnerability evaluations and provide mitigation techniques to customers."

Alright then, so the "blue team" is either the folks who defend the network or the ones who figure out how to break into it. Clear as mud?

White and black boxes (or clear boxes, or glass boxes, or transparent boxes, or grey boxes, or semi-opaque boxes with an SPF rating of 50 or above), all of which are shorthand borrowed from circuit design and software unit testing rather than the battlefield, aren't any easier to define in a real and practical sense either.

To some, "black box" testing might mean that our team has no knowledge about the application and no credentials to the application or network under review. To others, it might mean we aren't provided with source code but are still given credentials and an overview of how the application or environment works. Likewise "white box" assessment might mean source code is provided, or it might mean the engineering team provides us with data flow diagrams and a functional spec so that our team knows what the application or device or firmware is actually supposed to do in the real world but little else.

When trapped in a box color and team color conversation, I try to steer toward the important questions:

  • What controls are you attempting to evaluate with this engagement?
  • What is it about this particular target that is keeping you up at night?
  • What are the outcomes from testing you don't want to happen, no matter what?
  • What are the specific business reasons you have for a third-party security review?

By asking these questions, we suddenly find ourselves talking about the actual problems we're trying to solve, instead of trying to color inside the lines of a picture someone else drew, one that looks nothing like what's right before our eyes.