Data Science apparently is now becoming a trend. It basically a collection of methods and paradigms in querying, analysing, processing and visualizing big data. Oil companies, business intelligence and marketing, social media, Artificial Intelligence, etc. want that.
Definition
Big Data: Extremely large data sets that may be analyzed computationally to revel patterns, trends, and associations, especially relating to human behavior an interactions.
Definition
Big Data: Extremely large data sets that may be analyzed computationally to revel patterns, trends, and associations, especially relating to human behavior an interactions.
- Cluster: a collection of computers working together in parallel (distributed processing) and in the same local network and using the similar hardware
- Grid: Similar to grid, but computers are geographically spread out and use more heterogeneous hardware.
- Hadoop: a Java-based programming framework that supports the processing and storage of extremely large data sets by using MapReduce framework. It's open source under Apache license.
- MapReduce, a core component of the Apache's Hadoop Software Framework. Originated from Google Research paper with the same name.
- Hiv
- Pig: a high-level platform for creating programs that run on Apache Hadoop. It consists of high-level language called Pig Latin abstracting underlying Java on Hadoop.
- SQL
- Full-stack
- MIKE2.0
- Groovy
- Tez