How To Analyze A Data Set

This post does not elaborate on the process of gathering a defensible data set. You’ll find that information here.

A tactical analyst can offer valuable day-to-day investigative support, but once a threat has been addressed, it is equally helpful for a strategic analyst to examine the incident through a broader lens, such as a comparison of similar incidents over a long timeframe.

An example of this is a published report from the United States Secret Service, which assessed mass attacks in public spaces from 2016 to 2020. By examining a wider data set, the agency was able to identify “a range of observable [emphasis theirs] concerning behaviors across a variety of community systems as they escalate toward violence.” The writers said their study gave stakeholders information that may be useful to intervene before a mass attack occurred.

Generally, when analyzing a data set, it is helpful to examine the input in two ways: quantitatively and qualitatively. With the first, you’ll look at the numbers, such as upward, downward, or steady trends, statistical significance, modes, medians, means, and so forth. A qualitative analysis takes into account the societal, social, generational, geographical, historical, political, and economic circumstances that surround a topic. The process is not necessarily linear. You’ll likely go back and forth between the two as you proceed with your work.
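
As a rough illustration of the quantitative side, here is a minimal Python sketch using only the standard library. The yearly counts are invented for the example and do not come from any real data set.

```python
from statistics import mean, median, multimode

# Hypothetical yearly incident counts, invented purely for illustration.
yearly_counts = [12, 15, 15, 18, 22, 15, 40]

average = mean(yearly_counts)            # arithmetic mean of all values
center = median(yearly_counts)           # middle value once the data are sorted
most_common = multimode(yearly_counts)   # list of the most frequent value(s)

print(f"mean={average:.1f}, median={center}, mode(s)={most_common}")
```

When the mean sits well above the median, as it does here, that is often a hint that one or two high values are pulling the average up and deserve a closer look.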

How much data do you need? More is always preferable, as long as you can manage the volume. If you’re not finding enough data, you may want to expand your parameters. If you’re finding too much, narrow them. In terms of timespan, if you’re examining a trend, your results will benefit from a decade or more of data points. If you have a source from which you can collect more information with consistent parameters, and you have the time to study a larger amount of input, then the longer the timespan the better to judge the meaning and relevance of your data.

These pointers will guide you through a data set analysis:

  1. Begin with a question to frame your analysis. Are you merely gathering/reporting data (intelligence study), or are you seeking to answer an intelligence question (intelligence assessment)?
  2. Once you have established your goal, gather your data and enter them into a sortable database. Break each record into separate attributes, one per column. You may not have plans to study all of these attributes individually, but it’s easier to do the work up front than to play catch up later. Along the same line, think ahead about how your data will be used in charts and graphs. It’s a good idea to include links or some indexing method to get back to the original source. Finally, include a free text field to capture analyst notes, as well as facts that may be important, but don’t have a logical column on your spreadsheet.
  3. Sort your data in a number of ways, as appropriate to what you’re studying: e.g., dates, locations, weapons, perpetrators’ and victims’ demographics, gaps between dates, and others. For numerical data, look at the mean (average), median (center point), and mode (most frequently occurring value), which provide different perspectives, and study outliers. In terms of outliers, consider whether they offer valuable insight, or if their importance can be minimized.
  4. Most software programs that support numerical data and calculations also allow data to be put into graphs, which make the results easier to visualize. You’ll have a number of graphs from which to choose. If a bar or line graph is appropriate, but it’s unclear which direction your data are heading, you can add a trend line. There are different types of trend lines, including a moving average, which will help smooth your data and offer a short-range forecast.
  5. It is helpful to test the statistical significance of certain findings if you’re comfortable with the techniques. When viewed alone, a finding may appear more or less weighty than it is. Applying statistical tests can help you understand the relevance of your results. If you choose to apply statistical methods and think they will add to your written narrative, be aware that your audience may be less familiar with these techniques than you are, so summarize your results in a way that is broadly understood.
  6. Study the legal, societal, social, generational, geographical, historical, political, economic, and other circumstances that surround your issue. You don’t need to look at all of these. There are probably a few that are more relevant than others, but be open to new directions. These elements can help you understand why the matter you’re studying developed as it did and what factors may continue to influence it.
  7. As you proceed with your analysis, do regular sanity checks to consider whether your findings are logical, or if things appear to be distorted. What might be causing the distortion? For example, if you’re looking at specific violations, might different districts be using different reporting/prosecution standards? Have new laws been enacted that may have led to an increase/decrease in arrests? Has publicity/public attitude around a crime led to increased/decreased reporting? Ultimately, are there additional attributes that may need to be added to your data set?
  8. Along the way, you may find tangents that you want to explore either in your current project or in a future project. If the latter, include your plans in the outlook section of your final report.
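
For readers who work in code rather than a spreadsheet, steps 2 through 5 above could be sketched roughly as follows in Python, using only the standard library. Everything here is an assumption made for illustration: the `Incident` fields, the function names, the 1.5 × IQR outlier rule, and the permutation test are common choices, not methods prescribed by this post.

```python
import random
from dataclasses import dataclass
from datetime import date
from statistics import mean, quantiles


@dataclass
class Incident:
    """One row of a hypothetical data set (field names are illustrative)."""
    when: date
    location: str
    source_url: str   # link back to the original source
    notes: str = ""   # free-text analyst notes


def gaps_in_days(incidents):
    """Sort incidents by date and return the days between consecutive ones."""
    ordered = sorted(incidents, key=lambda i: i.when)
    return [(b.when - a.when).days for a, b in zip(ordered, ordered[1:])]


def moving_average(values, window=3):
    """Simple moving average; yields len(values) - window + 1 smoothed points."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]


def iqr_outliers(values):
    """Flag values more than 1.5 * IQR outside the quartiles (rule of thumb)."""
    q1, _, q3 = quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - 1.5 * spread, q3 + 1.5 * spread
    return [v for v in values if v < low or v > high]


def permutation_p_value(group_a, group_b, trials=2000, seed=0):
    """Share of random relabelings whose mean gap is at least the observed gap."""
    rng = random.Random(seed)  # fixed seed so the result is repeatable
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    hits = 0
    for _ in range(trials):
        rng.shuffle(pooled)
        a, b = pooled[:len(group_a)], pooled[len(group_a):]
        if abs(mean(a) - mean(b)) >= observed:
            hits += 1
    return hits / trials
```

A small p-value from `permutation_p_value` suggests the difference between two groups (say, counts before and after a new law) is unlikely to be chance alone. It says nothing about why the difference exists, which is where the qualitative work in step 6 comes in.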

There is a frustrating truth that sometimes accompanies data set analysis, which is that “a-ha” moments are rare. Intelligence findings are generally nuanced. Our inclination may be to shape our final report in a way that garners attention. However, the purpose of data set analysis is to add context and perspective to an issue and to reduce emotions, not ignite them.
