What's Up DOCC? The Data Operations and Cleaning Center: Training Machines to Recognize Breaking News for Intelligence and Security
Stabilitas, an A.I.-powered technology startup in Seattle, WA, uses machine learning to gather threat intelligence data from around the world and analyze it in real time to keep people safe. One of Stabilitas’s most valuable innovations thus far has been creating a Data Operations and Cleaning Center (DOCC), powered by a team of analysts affectionately called Data Referees. Many of these Data Referees were recruited from the Bunker Labs network, a not-for-profit organization that inspires, educates and connects veterans and military spouses with their next mission.
“The early results are in and are really promising,” said Greg Adams, Co-Founder and CEO at Stabilitas. “Bunker Labs has been a fantastic resource. We are looking at growing our team of Data Referees, and thinking about how to make them a permanent competitive feature of Stabilitas. This kind of data organizational know-how is an incredible asset. On top of that, our machine learning pipeline is generating real results.”
Looking for more than a few good acronyms
Machine learning cannot happen without some sort of training data, and quality training data is hard to come by. Dr. Mikhail Zaydman, Stabilitas’s Lead Data Scientist, set out to fix the problem. First, he enforced strict definitions of terms and categories. It was a tough, iterative process to get consensus across the company. Second, he worked with David Ulrich, Stabilitas’s Data Engineer, to build a cleaning tool in the form of an informational dashboard. Harking back to Stabilitas’s founding military heritage, it was not lost on the Data Team that they would need a new acronym for this dashboard, which they decided to call the Human Intervention Tool (“HIT”). Third, they got to work cleaning and grading the data.
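The three steps above (agree on categories, give humans a tool to review records, then grade the data) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the category names, the `Report` fields, and the helper functions are hypothetical and are not taken from Stabilitas's actual HIT or pipeline.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional, Tuple


class Category(Enum):
    # Hypothetical categories; the article does not list the real ones.
    PROTEST = "protest"
    NATURAL_DISASTER = "natural_disaster"
    NOT_RELEVANT = "not_relevant"


@dataclass
class Report:
    text: str
    machine_label: Category                   # the model's guess
    referee_label: Optional[Category] = None  # filled in by a human reviewer


def grade(report: Report, label: str) -> Report:
    """Record a referee's label, rejecting anything outside the agreed categories."""
    # Category(label) raises ValueError for a label that was never defined.
    report.referee_label = Category(label)
    return report


def training_rows(reports: List[Report]) -> List[Tuple[str, str]]:
    """Keep only human-graded reports as (text, label) training pairs."""
    return [
        (r.text, r.referee_label.value)
        for r in reports
        if r.referee_label is not None
    ]


def agreement_rate(reports: List[Report]) -> float:
    """Fraction of graded reports where the model and the referee agree."""
    graded = [r for r in reports if r.referee_label is not None]
    if not graded:
        return 0.0
    return sum(r.machine_label == r.referee_label for r in graded) / len(graded)
```

In this sketch the fixed `Category` enum stands in for the strict definitions: a reviewer cannot introduce a label the team has not agreed on, and only graded records are passed on as training data.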
The Human Intervention Tool (HIT), shown below, is how the Data Referees analyze and assemble data.
Teamwork makes the dream work
“In the beginning, I wasn’t quite sure where to go to get help for this,” said Greg, “but then realized that with Bunker there is a great community of people with a vision aligned with our own. Vets are good team players, appreciate our mission, and I think there’s a great chance to introduce them to cutting edge machine learning technology. A huge thanks to Jake at Bunker Labs, Seattle and the rest of the Bunker Network for helping us make this happen.”
After a couple of months of trial and error, and after creating an internal training program to bring Data Referees up to speed, the Data Referees and DOCC started producing real results. Stabilitas is pleased to report that it is looking to accelerate its machine learning model to market.
“We looked at this as an experiment as part of our National Science Foundation grant effort, and I have to say: it’s blown away my expectations,” said Greg.
As part of DOCC, most Data Referees interface with the Stabilitas team via Slack, but they are always welcome to visit the Seattle office and periodically join group feedback calls. The company has even extended additional contract work to Data Referees who are interested in data science careers.
As Dylan Cooper, who has been working with Stabilitas since November of 2017, shared, “I saw this as a growth opportunity, along with a mission that was meaningful to me in a real way.”
Greg says the company is learning a lot about itself in the process of setting up DOCC. “DOCC is something of the trenches of data science. For our company, I’m coming to see Data Referee work as a selection tool to understand which contractors or potential employees work well with others, are reliable, and offer solutions to problems that come up every day. Data Referee work is a great way to gain insight on what the company is all about.”
As for the technology and its application to intelligence and security? There is little doubt:
“Machine Learning is the future,” said Dylan.