What is Big Data?
Why Big Data?
Today, an application without data analytics is like a car without a steering wheel. It will go, but there’s no controlling its direction.
Data management has become extremely complex in recent years because data is growing exponentially, and storing and analyzing that data is the biggest challenge facing the IT industry. One-third of organizations are experiencing data growth of 25 percent or more annually. Healthcare is one of the fastest-growing segments, growing at 48 percent per year compared with 40 percent for the overall digital universe.
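To get a feel for what these annual growth rates mean, here is a minimal sketch of the compound-growth arithmetic. The function names are made up for illustration, and it assumes the rate stays constant from year to year, which real data growth may not do.

```python
import math


def grow(initial_size, annual_rate, years):
    """Size after `years` of compound growth at `annual_rate` (0.25 = 25%)."""
    return initial_size * (1 + annual_rate) ** years


def years_to_double(annual_rate):
    """Years needed for the data volume to double at a constant annual rate."""
    return math.log(2) / math.log(1 + annual_rate)
```

Under that constant-rate assumption, years_to_double(0.25) is roughly 3.1 years, years_to_double(0.40) is roughly 2.1 years, and years_to_double(0.48) is roughly 1.8 years, so healthcare data at 48 percent would double noticeably sooner than the overall digital universe at 40 percent.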
Data that meets three criteria is called “Big Data”:
Volume: How much data there is
Variety: The various types of data
Velocity: How fast data is processed
Volume: Big data implies enormous volumes of data (petabytes of data), and this is the most immediate challenge of big data. Data used to be created by employees; now it is generated by the interaction of humans, networks and machines on systems like social media. This requires scalable storage and support for complex, distributed queries across multiple data sources. The big challenge is to identify, locate, analyze and aggregate particular pieces of data.
Variety: This refers to the many types of data, both structured and unstructured. We used to store data from sources like databases, text files and spreadsheets, but data now comes in the form of photos, videos, PDFs, audio, emails, social media, blogs, web server logs, GPS, RFID and web content. This variety of unstructured data creates problems for storing, mining and analyzing data. The standard techniques and technologies we normally use were built for large volumes of structured data. It is a very big challenge for the industry to process and analyze large volumes of structured and unstructured data and convert that data into meaningful information.
Velocity: Analytics for traditional data warehouses tend to be based on periodic (daily, weekly, monthly) modification of data. In the healthcare industry, in areas such as patient analysis and clinical decision support, it is very important to have real-time or near real-time data in order to eliminate errors and make timely decisions. Automated decision making requires real-time data. You can’t use 5-minute-old data for automated decision making, because manual review of such decisions is expensive and time-consuming.
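The velocity requirement can be sketched as a simple freshness check before an automated decision is made. This is only an illustration: the 30-second threshold, the function names and the record layout are assumptions, not values taken from any particular healthcare system.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical freshness limit; a real system would choose its own threshold.
MAX_AGE = timedelta(seconds=30)


def is_fresh(reading_time, now, max_age=MAX_AGE):
    """Return True if the reading is recent enough for an automated decision."""
    return now - reading_time <= max_age


def route_decision(reading, now):
    """Send fresh readings to automation and stale ones to manual review."""
    if is_fresh(reading["timestamp"], now):
        return "automated"
    return "manual_review"
```

With this sketch, a reading that is five minutes old is routed to manual review, which is exactly the expensive path the text says real-time data is meant to avoid.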
How Does Big Data Help?
Many companies have built tools that help industry deal with the high volume and variety of data. Using these tools, an industry can easily analyze customer data. For example, in the healthcare industry these tools help to incentivize and compensate providers for keeping patients healthy. On the other side, patients are also demanding information about their healthcare options so that they understand their choices and can participate in decisions about their care.
Big Data Tools
IBM Text Analytics: Using this tool we can extract structured information from unstructured and semi-structured documents. Annotation Query Language (AQL) is used to extract this information from unstructured documents.
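To illustrate the general idea of turning unstructured text into structured fields, here is a small sketch using Python regular expressions. It is not AQL and does not use any IBM API; the sample note format, the patterns and the field names are all assumptions made up for this example.

```python
import re

# Hypothetical patterns for a made-up clinical note format.
DRUG_PATTERN = re.compile(r"(?P<drug>[A-Z][a-z]+) (?P<dose>\d+) ?mg")
DATE_PATTERN = re.compile(r"\d{4}-\d{2}-\d{2}")


def extract_prescription(note):
    """Pull a drug name, dose in mg and ISO date out of free text, if present."""
    drug_match = DRUG_PATTERN.search(note)
    date_match = DATE_PATTERN.search(note)
    return {
        "drug": drug_match.group("drug") if drug_match else None,
        "dose_mg": int(drug_match.group("dose")) if drug_match else None,
        "date": date_match.group(0) if date_match else None,
    }
```

For a note such as "Patient prescribed Metformin 500 mg on 2015-03-02." this returns a record with the drug, dose and date as separate fields, which can then be stored and queried like any other structured data. Hand-written patterns like these are brittle, which is one reason dedicated text analytics tools exist.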
Big R: It helps to analyze, manipulate and visualize big data. It uses the open source R language to enable rich statistical analysis and predictive modeling.
IBM InfoSphere Streams: It is an advanced analytics platform that allows users to develop applications quickly and to analyze and correlate information arriving from thousands of real-time sources. It handles very high data throughput rates, up to millions of events and messages per second.
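The kind of computation a stream processing platform performs can be sketched with a sliding-window event counter. This is plain Python and not the InfoSphere Streams API; the class name is hypothetical, and the sketch assumes events arrive with non-decreasing timestamps given in seconds.

```python
from collections import deque


class SlidingWindowCounter:
    """Count events seen within the last `window_seconds` seconds."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self._times = deque()  # event timestamps, oldest first

    def _evict(self, now):
        # Drop events that have fallen out of the window ending at `now`.
        while self._times and self._times[0] <= now - self.window_seconds:
            self._times.popleft()

    def add(self, timestamp):
        """Record one event; assumes timestamps never go backwards."""
        self._times.append(timestamp)
        self._evict(timestamp)

    def count(self, now):
        """Number of events still inside the window at time `now`."""
        self._evict(now)
        return len(self._times)
```

Each event is appended and evicted at most once, so the per-event cost stays small even at high throughput. A production platform adds what this sketch leaves out, such as distribution across machines and fault tolerance.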
IBM Accelerator: It is used for machine data analytics, to import, extract, index, search for patterns in and analyze significant machine data files.
Apache Spark: It is an open source big data processing framework. It enables applications in Hadoop clusters to run up to 100 times faster in memory and 10 times faster even when running on disk.