Understanding Big Data
The term Big Data is commonly attributed to Doug Laney, a Western industry analyst, in the early 2000s. Broadly, Big Data is data about many things that is collected in very large volumes and at high speed. Big Data can be analyzed and processed to support decision making, business strategy, and business forecasting.
In classical data management terms, and with respect to the growth in volume, Big Data can be regarded as data that cannot be handled with a traditional database or traditional data processing applications. Why mention databases? Because in its implementation, Big Data can be thought of as a database of very large size, a Very Large Database (VLDB), that is configured with a Database Management System (DBMS).
Big Data mixes structured and unstructured data. If you think NoSQL is complicated, Big Data is considerably more complicated than that. Even where programs or applications are designed specifically to manage it, those applications rely on algorithm designs and queries that are not publicly available.
The frameworks and applications used to manage large data are not connected directly to all of the data; instead, they work through analysis methods. A framework or application for managing large data is commonly called a 'big data analysis framework', although some simply call it a 'big data tool'.
Benefits of Big Data
Big Data only becomes useful after it has been analyzed. The same kind of analysis can be done on a much smaller scale, for example when we query a database on an SQL server. On a very large scale, however, the types of data are more varied, the volume is greater, and the structure is more complex. Since the concept was coined, implemented, and developed, Big Data frameworks have been able to provide benefits to human life.
The following is a brief summary, citing techinasia.com, of examples of Big Data use in Indonesia that were presented at the 'Big Data Week Indonesia' conference in 2015.
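To make the small-scale case concrete, here is a minimal sketch of the kind of query-based analysis described above. It uses Python's built-in sqlite3 module as a stand-in for an SQL server, and the table name, column names, and sample rows are all made up for illustration.

```python
import sqlite3


def total_by_region(rows):
    """Sum the amounts per region with an ordinary SQL query."""
    # An in-memory SQLite database stands in for a real SQL server here.
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
        conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
        cursor = conn.execute(
            "SELECT region, SUM(amount) FROM sales "
            "GROUP BY region ORDER BY region"
        )
        return cursor.fetchall()
    finally:
        conn.close()


# Hypothetical sample data, small enough to fit in one table.
sample_rows = [("West Java", 120.0), ("Bali", 80.0), ("West Java", 30.0)]
print(total_by_region(sample_rows))  # [('Bali', 80.0), ('West Java', 150.0)]
```

A query like this works well while the data fits in one table on one machine. The point of the paragraph above is that the same question becomes much harder once the data is too varied and too large for that setup.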
1. Agricultural information systems
Regi Wahyu, CEO of Mediatrac, a Big Data analysis company, recruited a number of talented students from Padjadjaran University to conduct research in a rice field in West Java.
The information obtained from this research was collected into a Big Data set that farmers can use to increase crop production, predict the right time to plant, and so on.
2. Taxation information system
Big Data analysis at the Directorate General of Taxes is still in the development stage. It is expected to help address the problem of low public awareness of paying taxes.
Iwan Djuniardi, who held a head position at the Directorate General of Taxes at the time, showed detailed visualizations in his demo presentation, such as analyses of family lineage, property types and items, tax types, and tax payment status.
3. Disaster information system
Quick Disaster is an application for Google Glass that helps users during and after a disaster. For example, when an earthquake occurs, Google Glass tells users what they need to do and then recommends evacuation routes once the disaster has passed. The Quick Disaster application was developed by Daniel Oscar Baskoro, a researcher from Gadjah Mada University (UGM).
4. Health information system
Also from UGM, a health sector researcher named Anis Fuad explained that clinics and hospitals in Indonesia still use their own separate applications to record patient data. The data sent to the Department of Health is still minimal and incomplete.
Using Big Data analysis in the health sector would improve the accuracy of disease prediction and of measuring the population's health level across the country in a centralized way. At present, the problem is slowly being addressed, starting with the construction of a database in the online BPJS system.
5. Language information system
Ruli Manurung from the University of Indonesia (UI) stated that Big Data can be used to group and classify millions of Indonesian words. It can also be used to map sentences in support of applications that translate foreign languages into Indonesian and vice versa.
Characteristics of Big Data (5V)
Big Data originally had three basic characteristics, the 3Vs: Volume, Velocity, and Variety. Value and Veracity were added later, so it is now described by 5Vs. The five characteristics are described below.
1. Volume
This means the data is very large in volume and sometimes unstructured. Examples include Twitter feeds, Instagram feeds, text chat data and WhatsApp statuses, and user clickstreams from web pages. These data flows can add up to thousands of terabytes (TB).
2. Velocity
Data can be accessed so quickly that it can be used right away. Since cloud storage and cloud computing took off in recent years, internet users have experienced this speed of data access first hand.
3. Variety
This means the data contains many types of files, both structured and unstructured. Analyzing unstructured data such as text, images, sound, and video requires somewhat different algorithms.
Such data also takes more time to process, because unstructured data may contain further data that can be extracted. For example, MP3 files carry ID3v1 and ID3v2 tags, JPEG files record the type of camera used, PDF files record the name of the application that created them, and so on.
4. Value
Value refers to how valuable or meaningful the data is. For example, the biographical data of an employee at a printing company has no value for analyzing recruitment predictions at a pharmaceutical company.
Such data may be unimportant in one case but very important and valuable in another. Data that has no value in any case will not pass the filter into the Big Data analysis system.
5. Veracity (Honesty)
Veracity refers to how accurate and reliable the data is. Continuing the MP3 example above, a file's ID3v1 tag may have been modified, which calls the authenticity of the file into question. The tag may have changed because the file was output by a sound processing application or an MP3 converter. Data that lacks honesty or authenticity will not pass the filter into the analysis system.
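The MP3 tag mentioned above is simple enough to inspect directly, so here is a minimal sketch of how such embedded metadata can be extracted. It assumes the standard ID3v1 layout: the last 128 bytes of the file, starting with the marker TAG, followed by fixed-width title, artist, album, year, and comment fields and one genre byte. The function names and the year check are illustrative assumptions, not part of any particular Big Data tool.

```python
def parse_id3v1(data: bytes):
    """Return the ID3v1 fields found in raw MP3 bytes, or None if there is no tag."""
    tag = data[-128:]
    if len(tag) < 128 or not tag.startswith(b"TAG"):
        return None

    def text(start, end):
        # Fields are fixed width and padded with NUL bytes or spaces.
        return tag[start:end].split(b"\x00")[0].decode("latin-1").strip()

    return {
        "title": text(3, 33),
        "artist": text(33, 63),
        "album": text(63, 93),
        "year": text(93, 97),
        "comment": text(97, 127),
        "genre": tag[127],
    }


def has_plausible_year(fields):
    # Hypothetical veracity rule: a missing tag or a non-numeric year is suspicious.
    return fields is not None and len(fields["year"]) == 4 and fields["year"].isdigit()


def _field(value, width):
    # Helper for building a fake tag: encode and pad to the fixed field width.
    return value.encode("latin-1").ljust(width, b"\x00")


# Fake file contents for illustration: some non-tag bytes followed by a 128-byte tag.
fake_audio = b"\xff\xfb" + b"\x00" * 64
sample = (
    fake_audio + b"TAG"
    + _field("Song", 30) + _field("Artist", 30) + _field("Album", 30)
    + _field("2015", 4) + _field("", 30) + bytes([12])
)

print(parse_id3v1(sample))
print(has_plausible_year(parse_id3v1(fake_audio)))
```

A real pipeline would need far stronger checks than this, since a modified tag can still look well formed. The sketch only shows where the extra data lives and how a simple filter rule could use it.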
Example of a Big Data Analysis Framework
Apache Hadoop is a collection of open-source applications from Apache that is used to collect and analyze data from online services. Many people simply call it Hadoop. Development of Hadoop began around 2005, and it was officially released in 2006 under the name Apache Hadoop.
Hadoop was written in the Java programming language, so it can run on a variety of platforms and operating systems. Hadoop is a collection of applications that can act as base modules, submodules, or an ecosystem: a set of additional software packages that can be installed on top of or alongside the main Hadoop system. The collection of Hadoop applications includes Apache Pig, Apache Hive, Apache HBase, Apache Phoenix, Apache Spark, Apache ZooKeeper, Cloudera Impala, Apache Flume, Apache Sqoop, Apache Oozie, and Apache Storm.
The history and concepts of Big Data go back to the 1970s, when information technology practitioners began to take an interest in data analysis and its relationship to statistics. This continued into the 2000s, a period when social media began to grow rapidly and made people increasingly aware of the importance of analyzing the data on these platforms.
The data flowing into social media was too large to be stored and processed centrally in a single storage medium. New technologies gradually emerged to overcome this problem: the NoSQL concept developed in Apache Cassandra and the Big Data analysis framework in Apache Hadoop.
That concludes this explanation of what Big Data is, along with its benefits and an example of an application that can be used to analyze it. Hopefully it is useful and easy to understand!
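The idea of not storing and processing everything in one place can be sketched in a few lines. The example below is plain Python, not the Hadoop API: it splits a list of posts into chunks, counts words in each chunk separately, and then merges the partial counts, which is the split-then-combine style that Hadoop's MapReduce model is known for. The function names, the chunk size, and the sample posts are all assumptions made for illustration.

```python
from collections import Counter
from functools import reduce


def split_into_chunks(records, chunk_size):
    """Cut the records into pieces that could each live on a different machine."""
    return [records[i:i + chunk_size] for i in range(0, len(records), chunk_size)]


def map_chunk(chunk):
    """Count the words in one chunk without looking at any other chunk."""
    counts = Counter()
    for line in chunk:
        counts.update(line.lower().split())
    return counts


def reduce_counts(left, right):
    """Merge two partial results into one."""
    return left + right


def word_count(records, chunk_size=2):
    chunks = split_into_chunks(records, chunk_size)
    # In a real cluster each chunk would be processed on a separate node;
    # here the chunks are simply handled one after another.
    partials = [map_chunk(chunk) for chunk in chunks]
    return reduce(reduce_counts, partials, Counter())


# Hypothetical social media posts.
posts = ["big data is big", "data moves fast", "fast data big value"]
print(word_count(posts).most_common(3))
```

Because each chunk is counted independently, the work can be spread over many machines and only the small partial results need to be brought together at the end.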