Microsoft Cyber Defense shores up quality with end-to-end testing

Microsoft Defender Advanced Threat Protection (ATP) is a platform designed to help enterprise networks prevent, detect, investigate, and respond to advanced threats. Thousands of companies and organizations depend on Microsoft Defender ATP to keep networks and critical business systems safe from cybersecurity threats.

Microsoft Defender ATP’s cloud-based portal is used daily by security operations personnel. Defects are immediately visible and can hurt customer satisfaction or potentially compromise security. Ensuring that the portal works as intended and that updates don’t break existing functionality is a critical concern.

We talked with Ran Mizrachi, a software engineering manager in Microsoft’s Windows Cyber Defense group.

Challenge:

The Defender ATP product is only a few years old. When the project started, Microsoft had about 25 software engineers on the project working at breakneck speed. They were checking in code daily, but there was very little deployment automation. Testing was manual—primarily sanity checks. The team was on an unsustainable path.

“We didn’t have the manpower to add coverage to any existing code,” said Mizrachi.

He continued, “We allowed a culture to develop that wasn’t sustainable. We needed to make testing part of our core processes, and that required more than a tool change.”

The team wanted to implement functional end-to-end test automation but didn’t want to spend a lot of time building test coverage for existing functionality. At the same time, they recognized the need to test for regression of existing code as they released new content.

Key requirements:

  • Easy to create tests—for developers, QA, and product managers

  • Fast onboarding time

  • Integration with their DevOps pipelines

  • No changes required to their existing codebase

The team considered Selenium but felt that its learning curve was too steep. They also considered Applitools and liked the visual validation aspect, but it lacked the functional validation capabilities they needed.

Solution:

Microsoft Cyber Defense Group chose Testim for automated functional and end-to-end tests.

According to Ran, the onboarding process was straightforward and fast. The team conducted a workshop with about 35 of its developers, with some support from Testim. Each team prepared a list of scenarios that it wanted to validate in the workshop. The first part of the workshop focused on learning about Testim: its features, how to use it, and some best practices to help structure tests. The second part applied this knowledge to create tests and build coverage.

After the six-hour workshop, the teams had created 120 tests, including 80 E2E tests that are still running today.

The Microsoft Defender ATP team has since integrated Testim into its development pipelines on Azure, running various tests at specific process gates such as pull requests, pre-deployment, releases, post-deployment and nightly. Developers are responsible for creating Testim tests for their code. “Every new piece of content that gets released should have proper test coverage in Testim,” said Mizrachi.
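As a rough illustration of running different tests at different process gates, the sketch below maps each gate named above to a list of test suites. It is only a sketch under stated assumptions: the suite labels (`smoke`, `regression`, `full_e2e`), the `suites_for` helper, and the particular gate-to-suite assignments are hypothetical and do not describe the team's actual pipeline or any Testim API.

```python
from enum import Enum


class Gate(Enum):
    """Process gates named in the article."""
    PULL_REQUEST = "pull_request"
    PRE_DEPLOYMENT = "pre_deployment"
    RELEASE = "release"
    POST_DEPLOYMENT = "post_deployment"
    NIGHTLY = "nightly"


# Hypothetical suite labels. The article does not say which tests
# run at which gate, so this assignment is purely illustrative.
SUITES_BY_GATE = {
    Gate.PULL_REQUEST: ["smoke"],
    Gate.PRE_DEPLOYMENT: ["smoke", "regression"],
    Gate.RELEASE: ["regression"],
    Gate.POST_DEPLOYMENT: ["smoke"],
    Gate.NIGHTLY: ["smoke", "regression", "full_e2e"],
}


def suites_for(gate_name: str) -> list[str]:
    """Return the suite labels to run for a gate name supplied by the pipeline."""
    try:
        gate = Gate(gate_name)
    except ValueError:
        valid = ", ".join(g.value for g in Gate)
        raise ValueError(
            f"unknown gate {gate_name!r}; expected one of: {valid}"
        ) from None
    # Return a copy so callers cannot mutate the shared mapping.
    return list(SUITES_BY_GATE[gate])
```

A pipeline step could call `suites_for("pull_request")` and hand the resulting labels to whatever test runner it uses. Keeping the mapping in one place makes it easy to see, and to change, how much coverage each gate gets.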

The new (nearly) continuous testing strategy is paying off. According to Ran, it is helping to identify a lot of bugs. Coverage for the existing codebase grows incrementally: when something stops working, the team adds tests for that area so that it is checked for regressions from then on.

One of the nice features of Testim is the ability to troubleshoot failed tests. Ran emphasized, “This is really important because we have many developers working on the same portal. Being able to quickly determine why a test failed, even when someone else wrote it, saves a lot of time.”

Automated testing helps Microsoft get closer to its goal of continuous delivery. As Ran put it, “Testim is helping to make the CI/CD dream possible. You can’t get to continuous delivery without proper test coverage.”

Benefits:

  • Fast test creation: 20 minutes on average per stable test

  • Simple to create tests, even for non-technical product managers

  • Easy onboarding: after a six-hour workshop, they had created 80 stable tests

  • Easy to see why a test failed, even if you didn’t write it

  • Helping to identify previously undetected bugs

  • Integrates into their dev processes to trigger tests at process gates

  • Responsive support from Testim