There is a quiet revolution going on in the deepest recesses of the big data world. Scores of big data experts are starting to wonder aloud if U.S. companies are wasting precious time and effort collecting large volumes of data they cannot possibly use. And if so, those same experts think it might be time to turn our attention to data quality instead.
Discussions of quality versus volume generally revolve around big data analytics, the science of analyzing data for the purpose of making business decisions. Yet common sense dictates that the question goes well beyond analytics itself. If data quality is more important than volume, the principle should hold across the entire spectrum of business, healthcare, education, government, etc.
One of the primary objectives at Rock West Solutions in Southern California is to combine signal processing and big data in order to help customers make sense of the information they collect. Rock West technologies have been applied to commercial, healthcare, military, and law enforcement needs.
Engineers at Rock West say signal processing is necessary if big data is to be used to its full potential. That being the case, there may be some legitimacy to the idea that quality is more important than volume.
Reducing the Size of the Net
If big data were comparable to commercial fishing, examining how fishing operations use nets could help in understanding the quality of collected data versus its volume. For purposes of illustration, imagine a commercial fishing operation setting out on its annual lobster run.
The captain would not even think about casting huge nets and dragging them along the sea floor to catch lobster. First, the nets would end up capturing all sorts of fish the crew does not want. Second, the total number of lobsters caught in the large nets would not be enough to make the operation worthwhile.
What is the solution? Replacing those incredibly large nets with smaller lobster pots. The pots are then baited in such a way as to attract lobsters. The pots still manage to catch some unwanted fish, but the vast majority of the catch is lobster.
The concept of focusing on data quality over volume is similar to using specialized pots to catch lobster. Rather than casting a wide net to collect as much data as possible, you focus on a specific kind of data that is more valuable – even in its raw form.
Identifying Usable Data Streams
Our fishing analogy breaks down in the sense that, unlike lobster fishermen, analytics experts do not necessarily know what kinds of data they need until after they’ve already done some comprehensive analytics. So in order to get down to the lobster pot scale, they first have to cast a very large net to learn what is out there.
Transitioning from volume to quality is a matter of identifying usable data streams before data collection ever begins. Rock West says signal processing can be equally valuable in this sort of endeavor. Correctly applied signal processing algorithms can reveal the intricate details of a given data stream for the purposes of determining whether that stream has any value.
Once a high-value data stream has been identified, analysts can concentrate on harvesting its data. At the same time, they can ignore those streams with little or no value. In theory, this makes the big data paradigm more efficient and productive.
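To make the idea of scoring streams concrete, here is a minimal sketch in Python. It is not Rock West's method, which the article does not describe. It assumes a very simple stand-in for "value": a rough signal-to-noise ratio computed by smoothing each stream with a moving average and comparing the variance of the smoothed series to the variance of what is left over. The function names, the window size, the threshold, and the sample streams are all hypothetical.

```python
import math
import random
from statistics import pvariance


def moving_average(samples, window=5):
    """Smooth a sequence with a centered moving average (shorter windows at the edges)."""
    half = window // 2
    smoothed = []
    for i in range(len(samples)):
        lo = max(0, i - half)
        hi = min(len(samples), i + half + 1)
        smoothed.append(sum(samples[lo:hi]) / (hi - lo))
    return smoothed


def snr_estimate(samples, window=5):
    """Rough signal-to-noise ratio: variance of the smoothed series
    divided by the variance of the residual (sample minus smoothed)."""
    smoothed = moving_average(samples, window)
    residual = [s - m for s, m in zip(samples, smoothed)]
    signal_power = pvariance(smoothed)
    noise_power = pvariance(residual)
    if noise_power == 0:
        # No residual at all: treat a flat stream as worthless, anything else as clean.
        return math.inf if signal_power > 0 else 0.0
    return signal_power / noise_power


def select_streams(streams, threshold=1.0):
    """Return the names of streams whose estimated SNR meets the (assumed) threshold."""
    return [name for name, samples in streams.items()
            if snr_estimate(samples) >= threshold]


# Hypothetical example data: one slowly varying stream with light noise,
# and one stream that is nothing but noise.
rng = random.Random(0)
streams = {
    "sensor": [math.sin(2 * math.pi * i / 50) + rng.gauss(0, 0.1) for i in range(500)],
    "static": [rng.gauss(0, 1) for _ in range(500)],
}
print(select_streams(streams))
```

In this sketch the slowly varying stream scores far above the threshold and the pure-noise stream falls well below it, so only the first would be kept for collection. A real system would need a definition of value suited to its own data; the moving-average ratio is only one simple choice.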
It seems clear that big data operations in their current state are collecting far more data than anyone can use. Perhaps it is time to turn our attention from data volume to data quality.