Training Army Analysts to Use the Big Data Platform
Created January 2019
To handle the challenges and leverage the advantages of big data in cyberspace, the Army must cultivate a new generation of data scientists for its cyber workforce. Army Cyber Command (ARCYBER) is addressing this need by teaming with the SEI CERT Division to create training capabilities that help Army analysts develop the necessary skills for using its Big Data Platform.
A Flood of Data Reveals a Shortage of Data Scientists
From networks, sensors, weapons systems, and communications that span the globe, the Army generates and receives petabytes of data each day. To identify, defend against, and pursue threats to its networks; analyze intelligence; and monitor missions, cyber operators must aggregate and analyze this data. Data science and big data analytics can help analysts make sense of this flood of information and use it to support cyber operations. However, while the Army has an exceptional workforce skilled in a range of cyber operations, few are specifically trained in the three disciplines that apply data science in the cyber domain—operations and intelligence, mathematics and statistics, and computer science.
ARCYBER data scientists, in conjunction with the Defense Information Systems Agency and the Army Program Executive Office for Enterprise Information Systems, developed the Big Data Platform (BDP) to enhance the data analysis mission of the Department of Defense. The BDP is a complex environment that monitors cyber data, provides data aggregation, and performs analysis. The Army currently faces a usability gap with the BDP; few analysts outside ARCYBER have the skills and knowledge to use it.
How is the Army solving its big data problem? In Army magazine, Maj. Gen. John W. Baker and Dr. Steven J. Henderson presented a plan based on their experience with the 7th Signal Command in Fort Gordon, Georgia. They argued that, in the short term, the Army should provide training to those in its enlisted and civilian ranks who have expertise in one of the three disciplines of data science. In the long term, the Army should create an initiative to recruit and develop data scientists.
The SEI CERT Division is collaborating with Army Cyber Command to address this gap. ARCYBER integrates and conducts full-spectrum cyberspace operations, electronic warfare, and information operations, ensuring freedom of action for friendly forces in and through the cyber domain and the information environment.
Hands-On Courses Train Army Analysts to Use the Big Data Platform
To extend the BDP's capabilities throughout the Army, ARCYBER must lead the effort to grow an organic data science workforce. As part of this effort, ARCYBER partnered with the SEI CERT Division to develop a model of the BDP that analysts can use to gain experience that they can apply to the real system. ARCYBER is also working with the SEI to develop realistic courses to train analysts from the broader Army and DoD.
The SEI used its expertise in data science, modeling, and simulation to develop a running instance of the BDP in its own training environment. An advantage of the BDP training environment is its deterministic data set. When analysts run the same queries on the data, they get the same results. This predictability facilitates learning because it guarantees that if analysts go through the proper steps in the training exercises, they get the expected answers.
The SEI also worked with ARCYBER analysts to develop three courses:
- The Big Data Platform Primer course provides an introduction to data science, the BDP, and the BDP interface.
- The Introduction to the Big Data Platform course covers BDP data sources and a more in-depth examination of the tools accessible through the BDP; it also includes a series of labs for practical hands-on experience.
- The Cyber Data Science for the BDP course takes an in-depth look at the Army Cyber Data Science process and how it can be used with the BDP to produce actionable data that supports cyber operations.
All courses use the CERT Simulation, Training, and Exercise Platform (STEP) and are hosted in its Private Cyber Training Cloud (PCTC). Army analysts log in to the PCTC to access the courses, launch videos, and work through the hands-on labs.
Last year, ARCYBER issued an operations order requiring anyone who wants access to the Army's instance of the BDP, called Gabriel Nimbus, to take the SEI Big Data Platform Primer course and earn a completion certificate. Since then, more than 580 analysts have enrolled in the SEI Big Data Platform Primer course, and 459 have earned the certificate. In addition, 341 participants have enrolled in the Introduction to the Big Data Platform course, and 89 have completed the training.
Fully leveraging data science to support cyber operations and missions remains a challenge for the Army and DoD as a whole. The SEI continues to work with the Army to create workforce development tools and capabilities that better equip soldiers with the skills they need to bring to bear a more robust cyber data science capability within the DoD.
Big Data Platform
January 29, 2018 Presentation
In this presentation, the author discusses the evolution of the Big Data Platform, examples of how it is being used today, and key lessons learned in its development.
The Cyber Data Science Process
June 01, 2017 Article
Major General John W. Baker (U.S. Army), Dr. Steve Henderson
This article outlines the Cyber Data Science Process, a workflow of specific activities that define how data science should be incorporated with cyber operations.
Making the Case for Army Data Scientists
August 01, 2016 Article
Major General John W. Baker (U.S. Army), Steven J. Henderson
The Army faces threats of scale, persistence and reach in the cyber domain. To fight and win, game changers are needed in the science of data.