Large Scale Text Processing and Sentiment Analysis Project with MapReduce, Hive, and Spark

This project is based on the final project I completed with two teammates for the Cloud Computing and Big Data Application course. Motivation: Apache Hive is data warehouse software built on top of Apache Hadoop for querying, analyzing, and integrating data. Among these applications, sentiment analysis is one... Continue Reading →

Why we should not over-trust data

Terms like "Big Data" and "Data Mining" are very popular these days. Business people use them for market analysis and investment planning, and researchers use them to expand the bounds of human knowledge. Even in 'Sherlock Holmes', the main character expresses the most intriguing idea of data mining: The world is woven from billions of lives, every... Continue Reading →
