Big data is a
poorly defined term, even though it is one of the most widely used terms
today. It is generally recognized that data is expanding rapidly
in volume and is being put to more uses, thanks to the
many tools now available for working with it.
Indeed, it’s become
something of a mantra that data is a major asset of many a business.
This is all the more
compelling given the growth of the Internet of Things, where
virtually every physical asset owned by a business is potentially connected
to the internet and a source of reams of new data. The potential value of this
data is not lost on the management and stakeholders of companies.
This gives rise
to several questions: Does the data need to be standardized? How should it be
stored? Which tools are best suited to using it?
Data lakes constitute one
of the current favoured approaches to these questions.
Traditionally, data has been
stored in data warehouses, which require some standardization and structure.
They are therefore limited in what they can accommodate, which makes them less useful
for big data, which comes in all kinds of forms (structured, semi-structured, unstructured,
etc.) from many sources (just about anything imaginable).
Data lakes accommodate
just about any form of data and are tied into analytical tools, such as Hadoop,
that can handle such data.
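To make the contrast concrete, here is a minimal sketch in plain Python. It is an illustration only, not how any particular warehouse or data lake product works, and it does not use Hadoop. The column names and the functions load_into_warehouse, load_into_lake and read_temperatures are hypothetical, invented for this example. The warehouse-style loader rejects anything that does not match a fixed structure, while the lake-style loader keeps whatever arrives and leaves interpretation to the tool that reads it later.

```python
import json

# Hypothetical fixed structure that a warehouse-style table might enforce.
WAREHOUSE_COLUMNS = ("device_id", "timestamp", "temperature")


def load_into_warehouse(record):
    """Accept a record only if it matches the fixed column set exactly."""
    if not isinstance(record, dict) or set(record) != set(WAREHOUSE_COLUMNS):
        raise ValueError("record does not match the warehouse schema")
    return tuple(record[col] for col in WAREHOUSE_COLUMNS)


def load_into_lake(lake, source, raw):
    """Store the payload as-is, tagged with its source; no structure is required."""
    lake.append({"source": source, "raw": raw})


def read_temperatures(lake):
    """Interpret raw entries at read time, skipping anything without a temperature."""
    readings = []
    for entry in lake:
        raw = entry["raw"]
        if isinstance(raw, str):
            try:
                raw = json.loads(raw)
            except ValueError:
                continue  # free text, not JSON
        if isinstance(raw, dict) and "temperature" in raw:
            readings.append(raw["temperature"])
    return readings


# Three made-up inputs: a structured record, a JSON string, and free text.
lake = []
load_into_lake(lake, "sensor", {"device_id": "a1", "timestamp": 1, "temperature": 21.5})
load_into_lake(lake, "sensor", '{"device_id": "b2", "temperature": 19.0}')
load_into_lake(lake, "support", "Customer reports the unit is overheating.")
```

In this sketch the lake holds all three inputs, including the free-text note, whereas load_into_warehouse would reject the second and third. The cost is that every reader, such as read_temperatures here, has to decide for itself how to make sense of what was stored.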
Like big data, data
lakes are poorly defined, but the two fit well together. Data, big data, data
lakes, and Hadoop-type tools together form a critical new field for modern management,
one that cannot be ignored.