The availability of data from corporate systems, legacy systems, social media, public databases, the Internet of Things and numerous other sources has been much discussed. Most, and perhaps all, of these sources are commonly grouped under the label of Big Data.
While companies are coming to recognize the growing importance of big data, actually using it remains a challenge. Some of the data is structured (organized in standard formats and understandable on its own). Other data is unstructured, meaning useful analysis can only come after the data has been restructured so that the analytics tools being used can understand it. Some of these tools, often based on the Hadoop framework, can handle unstructured data; others have difficulty.
The difficulty is compounded by the fact that the data of interest is increasingly available in forms other than simple numeric data. It includes text (for which analytical tools have been available for years), video, audio, graphics and other forms. The non-text forms are very difficult to structure, particularly with tools that can be used across platforms.
The answer comes in different forms. One approach is to structure as much data as possible, using recognized standards such as XML and XBRL. But that approach generally works well only for numeric data or structured non-numeric data.
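To make the structuring idea concrete, here is a minimal sketch in Python of wrapping a few numeric records in XML so that any XML-aware tool can read the values back by name. The sample records, the tag names (`sales`, `sale`) and the helper `records_to_xml` are hypothetical and chosen only for illustration. This is plain XML rather than XBRL, which additionally relies on published taxonomies.

```python
import xml.etree.ElementTree as ET


def records_to_xml(records, root_tag="sales", row_tag="sale"):
    """Wrap a list of flat dicts in a simple XML document (illustrative helper)."""
    root = ET.Element(root_tag)
    for record in records:
        row = ET.SubElement(root, row_tag)
        for field, value in record.items():
            # Each field becomes a named child element holding the value as text.
            child = ET.SubElement(row, field)
            child.text = str(value)
    return ET.tostring(root, encoding="unicode")


# Hypothetical sample data, not taken from any real system.
records = [
    {"region": "East", "quarter": "Q1", "revenue": 125000.0},
    {"region": "West", "quarter": "Q1", "revenue": 98000.5},
]

xml_text = records_to_xml(records)
print(xml_text)

# Once structured, the values can be read back by name rather than by position.
parsed = ET.fromstring(xml_text)
total = sum(float(sale.findtext("revenue")) for sale in parsed.iter("sale"))
print(total)
```

The same trick does not carry over easily to video, audio or graphics, which is why this kind of structuring tends to be limited to numeric and already-organized data.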
Besides structuring data, another approach is to build larger data storage areas, where tools can be applied across a variety of formats in a consistent way. While this is not a magic wand that fixes every analysis issue, it does allow for a more coordinated approach to data analytics and management. Many companies are going this route.
Check out this link.