
Digging Deeper Into Data
In our increasingly complex, fast-paced world, effective and efficient information analysis is a hot commodity. It is also key to decision making in a multitude of critical real-world applications.
Scientists at Pacific Northwest National Laboratory have been developing information analytics solutions for decades, mostly targeted to applications in homeland security, one of the laboratory’s major ongoing R&D mission thrusts. Perhaps some of the more colorful work lies in the visual analytics realm—tools that turn data into something an analyst can see and interpret in more efficient and meaningful ways.
“To simply use the terms ‘information analysis software’ does little justice to the power these solutions are offering and gives even less indication as to the complexity and sheer magnitude implied when we say ‘data’ in the context of real life applications,” said John McEntire, PNNL commercialization manager for information technologies. The data collections that information analysts sift through to investigate a single issue are often massive, dynamic and ambiguous, containing thousands of pieces of frequently conflicting information. “Somehow, analysts have to determine not only what is there, but also what is relevant and beyond that, what is truly connected and critical to solving the challenge du jour,” McEntire said.
This is no easy feat when one considers all of the different sources of information and the forms that information can take, and the convergence of those forms into a single massive collection that somehow must be used to paint a cohesive, accurate picture. But the lab has managed to produce some interesting results on this front in the past several years.
For example, information analytics experts have recently combined several individual tools into a single software suite called the Fused Analytic Desktop Environment, or FADE, a breakthrough product with a robust collection of information analysis tools not previously available in a consolidated solution. Built from existing PNNL software solutions developed with government funding over years of research, FADE enables users in a variety of information-driven fields to spend their time obtaining useful results rather than sifting through massive volumes of information that are often heterogeneous in source, format and organizational construct.
Instead of sorting and formatting data for individual analyses, which is highly cumbersome but still frequently a reality, FADE allows a one-time ingest of all data, internal parsing, centralized storage and retrieval, automatic statistical and user-trained thematic data triage, customizable organization using a readily familiar interface and a variety of options for visual data interpretation. The ability to employ multiple analysis tools in a single environment allows analysts to uncover relationships among entities in data that likely would have otherwise gone undiscovered—greatly strengthening understanding and decision-making.
While FADE’s components were initially designed with the needs of the intelligence community in mind (for things like tracking potential terrorist activity and preventing proliferation of weapons of mass destruction), their combined power as part of the FADE suite also benefits information analysts in business intelligence, financial analysis, law enforcement and first response, cyber security and beyond. With its combination of powerful embedded analytical technologies, plug-and-play compatibility with existing tools outside the suite, and user-friendly interfaces and document organization options, FADE represents a breakthrough in flexible visual analysis of ad hoc information.
Digging even deeper into this particular technology, key elements of FADE include a collaborative analytic toolkit, called CAT, that enables users to easily bring new data and information into the FADE environment. On a PC, it's as simple as dragging files from the desktop into the collaborative framework. Once in CAT, files are automatically ingested, indexed and prepared for use in other analytic tools, and they are stored in an underlying document management system modeled on Windows Explorer.
To analyze large collections of text documents, FADE relies on PNNL's IN-SPIRE text analytics platform. Its integration in this environment supports easy visualization of folders of documents to identify key themes and concepts. Brightly colored graphs make it easy for users to identify the most common themes detected within the data, allowing them to quickly drill down to deeper levels of detail.
Frame of Reference Visualization, or FoRViz, provides a visual representation of the documents that have been categorized. Using both the relationships provided by the folder hierarchy in which the documents reside and those determined through statistical analysis of word usage and placement, FoRViz makes the similarity between documents, or the lack of it, immediately recognizable. For instance, two documents may relate to “shipping and the Mediterranean Sea,” but one may be about cargo ships carrying shipping containers while the other may be about the operations of UPS-air in the area surrounding the Mediterranean Sea. In this case, the user may only be interested in activity related to shipping via cargo ships. FoRViz distinguishes between the two.
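To give a rough sense of how word usage alone can separate documents like these, here is a minimal Python sketch. It is not FoRViz's actual method, which the article does not detail; it assumes a simple bag-of-words comparison using cosine similarity, and the function names and sample documents are invented for illustration.

```python
import math
import re
from collections import Counter

def term_counts(text):
    """Lowercase the text and count its word tokens."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine_similarity(a, b):
    """Cosine of the angle between two term-count vectors (0.0 to 1.0)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Hypothetical sample documents, made up for this sketch.
docs = {
    "cargo_1": "cargo ship carrying shipping containers across the Mediterranean Sea",
    "cargo_2": "cargo ship unloading shipping containers at a Mediterranean port",
    "air_1": "air freight operations and parcel flights near the Mediterranean Sea",
}
vectors = {name: term_counts(text) for name, text in docs.items()}

# Compare the first cargo document with the other cargo document
# and with the air-freight document.
sim_cargo = cosine_similarity(vectors["cargo_1"], vectors["cargo_2"])
sim_air = cosine_similarity(vectors["cargo_1"], vectors["air_1"])
print(f"cargo vs cargo: {sim_cargo:.2f}, cargo vs air: {sim_air:.2f}")
```

In this toy case the two cargo documents score higher with each other than the cargo and air-freight documents do, even though all three mention the Mediterranean. A production tool would likely weight terms and account for word placement as well.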
FADE's semantic graph analysis environment provides a way to automatically construct link analysis diagrams from source information and see connections across data sources. For example, an analyst could track shipments on several vessels traveling between a set of ports, including people, places, items, schedules, and related electronic correspondence—and how all of those entities might be related—to investigate trafficking of illegal items or potential terrorist activity.
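The idea behind link analysis can be illustrated with a short Python sketch. This is an assumption-laden toy and not FADE's implementation: it assumes entities have already been extracted from each source, links any two entities that appear in the same record, and then follows those links to find everything connected to a starting entity. All record names, entity names and function names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical records; each lists the entities mentioned together in one source.
records = [
    {"source": "manifest_17", "entities": ["Vessel A", "Port X", "Container 42"]},
    {"source": "email_03", "entities": ["Person P", "Container 42"]},
    {"source": "schedule_09", "entities": ["Vessel A", "Port Y"]},
]

def build_link_graph(records):
    """Link every pair of entities that co-occur in a record, remembering the source."""
    graph = defaultdict(lambda: defaultdict(set))
    for record in records:
        for a, b in combinations(sorted(set(record["entities"])), 2):
            graph[a][b].add(record["source"])
            graph[b][a].add(record["source"])
    return graph

def connected_entities(graph, start):
    """Return all entities reachable from start via any chain of links."""
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for neighbor in graph[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                stack.append(neighbor)
    return seen - {start}

graph = build_link_graph(records)
reachable = connected_entities(graph, "Person P")
print(sorted(reachable))
```

Here a person who appears only in one piece of correspondence ends up connected, through a shared container and vessel, to two ports that are never mentioned alongside that person. Keeping the source name on each link lets an analyst trace every connection back to the document that supports it.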
Ali Madison is a PNNL communications specialist.

Copyright © 2012 | Innovation America