The term Big Data was introduced to describe data sets that keep growing in Volume, Variety and Velocity, collectively known as the 3Vs. Such data sets are difficult to capture, process or analyse with conventional applications or computer systems, and therefore require a specialised set of technologies and tools to access and manipulate them. As the amount of data in use grows every day, the 3Vs are continually managed to enable better decision making and process optimisation over such data sets. Over time, these three characteristics have been given formal definitions and expanded with additional characteristics called Veracity and Variability.
Features of Big Data
How is Big Data Managed?
Big Data is processed and generated using a programming model called MapReduce. The model runs on a cluster of machines and consists of two main procedures, Map() and Reduce(). The Map() procedure filters and sorts the data, for example splitting an address field into a local and a permanent address. The Reduce() procedure then performs a summary operation on the mapped data, such as counting the number of entries in the data set. MapReduce allows distributed processing of data in parallel; that is, many tasks can execute across multiple devices at the same time. This parallel execution also makes the system more fault tolerant: errors are uncovered quickly, and a failed task can be rescheduled on another machine. Hadoop is a widely used open-source framework that implements MapReduce, helping to structure large data sets reliably and making them easier to scale and maintain.
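The Map and Reduce phases described above can be sketched in plain Python. This is a minimal, single-machine illustration of the pattern, not Hadoop's actual API; the function names and the word-count task are assumptions chosen for clarity, and a real framework would distribute each phase across many machines.

```python
from collections import defaultdict

def map_phase(documents):
    """Map: emit a (key, value) pair for every word in every document."""
    for doc in documents:
        for word in doc.split():
            yield (word, 1)

def shuffle_phase(pairs):
    """Shuffle: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    """Reduce: aggregate the values for each key (here, count occurrences)."""
    return {key: sum(values) for key, values in grouped.items()}

docs = ["big data big ideas", "big data tools"]
counts = reduce_phase(shuffle_phase(map_phase(docs)))
print(counts)  # {'big': 3, 'data': 2, 'ideas': 1, 'tools': 1}
```

Because each document in the Map phase and each key in the Reduce phase can be handled independently, these steps parallelise naturally, which is what makes the model suitable for clusters.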
Who Uses Big Data?
Big Data techniques can be applied in any environment that deals with large amounts of data. As data volumes have grown, so has the need to classify and analyse that data effectively. Some of the domains in which Big Data is most commonly applied are listed as follows.