What's Up DOCC?

March 9, 2018
Maggie A

What's Up DOCC? The Data Operations and Cleaning Center: Training Machines to Recognize Breaking News for Intelligence and Security

A view of the Stabilitas dashboard. The data on the map is generated via machine learning and represents breaking news events that impact security.

Stabilitas, an A.I.-powered technology startup in Seattle, WA, uses machine learning to gather threat intelligence data from around the world and analyze it in real time to keep people safe. One of Stabilitas’s most valuable innovations thus far has been creating a Data Operations and Cleaning Center (DOCC), powered by a team of analysts affectionately called Data Referees. Many of these Data Referees were recruited from the Bunker Labs network, a not-for-profit organization that inspires, educates and connects veterans and military spouses with their next mission.  

“The early results are in and are really promising,” said Greg Adams, Co-Founder and CEO at Stabilitas. “Bunker Labs has been a fantastic resource. We are looking at growing our team of Data Referees, and thinking about how to make them a permanent competitive feature of Stabilitas. This kind of data organizational know-how is an incredible asset. On top of that, our machine learning pipeline is generating real results.”

Looking for more than a few good acronyms

Machine learning cannot happen without training data, and quality training data is hard to come by. Dr. Mikhail Zaydman, Stabilitas’s Lead Data Scientist, set out to fix the problem. First, he enforced strict definitions of terms and categories. It was a tough, iterative process to reach consensus across the company. Second, he worked with David Ulrich, Stabilitas’s Data Engineer, to build a cleaning tool in the form of an informational dashboard. Harking back to Stabilitas’s military heritage, the Data Team knew they would need a new acronym for this dashboard, which they decided to call the Human Intervention Tool (“HIT”). Third, they got to work cleaning and grading the data.

Military veteran Dylan Cooper shifting gears after working a shift with Stabilitas’s DOCC Team at the Stabilitas office in Seattle.


The Human Intervention Tool (HIT) is a way for the Data Referees to analyze and assemble data. Initially, the machine learning models had little training data to work with and were quite noisy. The DOCC solved that problem by supplying more training data for the models.

Teamwork makes the dream work

“In the beginning, I wasn’t quite sure where to go to get help for this,” said Greg, “but then realized that with Bunker there is a great community of people with a vision aligned with our own. Vets are good team players, appreciate our mission, and I think there’s a great chance to introduce them to cutting-edge machine learning technology. A huge thanks to Jake at Bunker Labs, Seattle and the rest of the Bunker Network for helping us make this happen.”

After a couple of months of trial and error, and after creating an internal training program to bring Data Referees up to speed, the Data Referees and the DOCC started producing real results. Stabilitas is pleased to report that it is looking to accelerate its machine learning model to market.

Greg Adams, Co-Founder and CEO at Stabilitas.

“We looked at this as an experiment as part of our National Science Foundation grant effort, and I have to say: it’s blown away my expectations,” said Greg.

As part of the DOCC, most Data Referees interface with the Stabilitas team via Slack, but they are always welcome to visit the Seattle office and periodically join group feedback calls. The company has even extended additional contract work to Data Referees who are interested in data science careers.

As Dylan Cooper, who has been working with Stabilitas since November of 2017, shared, “I saw this as a growth opportunity, along with a mission that was meaningful to me in a real way.”

Dylan, on the right, works on the HIT while David, a data engineer, manages the data pipeline.

Greg says the company is learning a lot about itself in the process of setting up DOCC. “DOCC is something of the trenches of data science. For our company, I’m coming to see Data Referee work as a selection tool to understand which contractors or potential employees work well with others, are reliable, and offer solutions to problems that come up every day. Data Referee work is a great way to gain insight on what the company is all about.”

As for the technology and its application to intelligence and security? There is little doubt:

“Machine Learning is the future,” said Dylan.

