What is Big Data?
Big data refers to data that has the following properties:
- Volume - huge amounts of data
- Velocity - data arrives at high speed
- Variety - many different types of data
- Veracity (accuracy) - how accurate and trustworthy the data is
In practice, Big Data is also a method for understanding the patterns and behaviors of people, such as clicks on social media apps and corporate websites (among many others). Why would a business want to do that? So it can determine what customers want (based on their behavior) and provide them with a better “digital” experience, so that they will buy more over time. As advertising moves to apps and the web, this capability becomes more and more important to the businesses that sell to you and me. By the way, IT professionals consider this type of data to be “unstructured”, which is another way of saying the data sits in loose files that have to be gathered, integrated, and analyzed. Think about taking hundreds of thousands of handwritten notes and looking for trends in the information across them. Painful, right?
What is data science?
Types of data
- unstructured data
- semistructured data
- structured data
Semistructured data has structure, but that structure depends on the source. You work with semistructured data all the time. Your email is semistructured data. It has a pretty consistent structure: you always have a sender and a recipient, but the names and contents of the other fields might vary. Data science teams will typically work with more semistructured data than structured data. This includes the volumes of email, weblogs, and social network content that can be analyzed.
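To make the email example concrete, here is a minimal sketch in Python using the standard library's email module. The two messages and the names RAW_MESSAGES and header_names are made up for illustration; they are assumptions, not part of any real dataset. The sketch shows which header fields every message shares and which ones vary from message to message.

```python
from email import message_from_string

# Two hypothetical messages, invented for this illustration.
RAW_MESSAGES = [
    "From: alice@example.com\nTo: bob@example.com\nSubject: Lunch\n\n"
    "Are you free at noon?",
    "From: carol@example.com\nTo: dave@example.com\nCc: erin@example.com\n"
    "X-Priority: 1\n\nPlease review the report.",
]


def header_names(raw):
    """Return the set of header field names found in one raw message."""
    msg = message_from_string(raw)
    return set(msg.keys())


field_sets = [header_names(raw) for raw in RAW_MESSAGES]

# Fields present in every message: the consistent part of the structure.
common_fields = set.intersection(*field_sets)

# Fields present in only some messages: the part that depends on the source.
varying_fields = set.union(*field_sets) - common_fields

print(sorted(common_fields))   # ['From', 'To']
print(sorted(varying_fields))  # ['Cc', 'Subject', 'X-Priority']
```

The sender and recipient show up in both messages, while the remaining fields differ. That mix of a fixed core and a variable remainder is what makes the data semistructured rather than fully structured.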