Stuart's Capstone blog: February 2012

Tuesday, February 28, 2012

Paper 5: Thumbs up?: sentiment classification using machine learning techniques

Title: Thumbs up?: sentiment classification using machine learning techniques
Authors: Bo Pang, Lillian Lee, Shivakumar Vaithyanathan
Link: http://dl.acm.org/citation.cfm?id=1118693.1118704

Prompted by the recent emergence of online opinion sites such as Rotten Tomatoes, the researchers in this paper set out to use machine learning to classify the sentiment expressed in a review. Their system analyzes the text of a movie review and determines how the reviewer felt, without the reviewer having to assign a fixed score. Using these analyses, their system can determine whether the aggregate feedback across multiple reviewers is positive, negative, or neutral.

Their results were very promising: the machine learning approach outperformed the earlier methods it was compared against, as well as a simple random-choice baseline. The one area where their algorithm struggled was correctly identifying sentiment in reviews with a "narrative" structure. When an author starts a review with a statement such as "I went into this movie expecting to hate it" but ends with an overall positive review, the system has trouble separating the true sentiment from the author's earlier stated expectations.
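To make the classification step more concrete, here is a minimal sketch of a bag-of-words Naive Bayes classifier. Naive Bayes is a standard text-classification technique that I am using for illustration; this is not the authors' implementation, and the toy reviews, labels, and function names are all made up for the example.

```python
import math
from collections import Counter

PUNCTUATION = ".,!?\"'"


def tokenize(text):
    """Lowercase the text, split on whitespace, and trim basic punctuation."""
    words = (w.strip(PUNCTUATION) for w in text.lower().split())
    return [w for w in words if w]


def train(labeled_reviews):
    """Count words per label; these counts are the whole Naive Bayes model."""
    word_counts = {}       # label -> Counter of word occurrences
    doc_counts = Counter()  # label -> number of training reviews
    for text, label in labeled_reviews:
        doc_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(tokenize(text))
    vocabulary = set()
    for counts in word_counts.values():
        vocabulary.update(counts)
    return word_counts, doc_counts, vocabulary


def classify(text, model):
    """Return the label with the highest log-probability for the text."""
    word_counts, doc_counts, vocabulary = model
    total_docs = sum(doc_counts.values())
    words = [w for w in tokenize(text) if w in vocabulary]  # skip unseen words
    best_label, best_score = None, float("-inf")
    for label in sorted(doc_counts):
        total_words = sum(word_counts[label].values())
        score = math.log(doc_counts[label] / total_docs)
        for word in words:
            # Add-one smoothing so a word never seen with this label
            # does not produce log(0).
            smoothed = (word_counts[label][word] + 1) / (total_words + len(vocabulary))
            score += math.log(smoothed)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy training data, not taken from the paper.
TOY_REVIEWS = [
    ("a brilliant and moving film", "positive"),
    ("wonderful acting and a great story", "positive"),
    ("a dull and boring mess", "negative"),
    ("terrible plot and awful acting", "negative"),
]

model = train(TOY_REVIEWS)
print(classify("a great and moving story", model))
print(classify("boring and awful", model))
```

Because a bag-of-words model like this ignores word order, it has no way to tell an opening expectation from a closing verdict, which is one plausible reason "narrative" reviews are hard for this kind of approach.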

Tuesday, February 14, 2012

Paper 4: MovieBase: A movie database for event detection and behavioral analysis

Title: MovieBase: a movie database for event detection and behavioral analysis
Authors: Tat-Seng Chua, Sheng Tang, Remi Trichet, Hung Khoon Tan, Yan Song
Link: http://dl.acm.org/citation.cfm?id=1631135.1631143

The University of Singapore team attempted to improve methods of cataloging and organizing video clips. Their sample consisted of 69,129 "shots" drawn from both feature films and YouTube clips. Each shot was hand-tagged using 7 audio and 11 visual categories.

Their system offered several ways of analyzing clips. Based on these analyses, the researchers were able to glean various statistics about the differences between shots from feature films and shots from YouTube videos. While the statistics proved useful, the researchers lamented the fact that they had to tag each individual shot by hand. They suggested further research into automated scene tagging, as well as social tagging for sites such as YouTube.

Thursday, February 9, 2012

Paper 3: A Web Service for Flexible Integration of Mobile Applications with Social Networks

Title: A Web Service for Flexible Integration of Mobile Applications with Social Networks
Authors: Victor Pantoja, Markus Endler
Link: http://dl.acm.org/citation.cfm?id=2090316.2090320

In their paper, the authors discuss their Mobile Social Gateway service, or MoSoGw for short. The framework is designed to connect to social networks while tapping into the physical hardware capabilities of the phone. MoSoGw works as an intermediary between the social networks and the phone, letting the two pass context information back and forth.

The framework is designed to work with multiple social networks, such as Facebook and Twitter. The MoSoGw server application uses standard web technologies such as MySQL databases, and handles data transfers using HTTP requests and JSON. The server application is written in Python, while the Android client is written in Java.
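As a rough illustration of what passing context information over HTTP and JSON could look like, here is a minimal Python sketch that builds (but does not send) a POST request carrying a JSON body. The gateway URL, the field names, and the idea of sending a location reading are all my own assumptions for the example; the paper's actual message format is not described in this summary.

```python
import json
import urllib.request


def build_context_request(gateway_url, user_id, latitude, longitude):
    """Build an HTTP POST request whose body is a JSON context update.

    The payload layout here is hypothetical, not MoSoGw's real format.
    """
    payload = {
        "user_id": user_id,
        "context": {"latitude": latitude, "longitude": longitude},
    }
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        gateway_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical gateway address and user; nothing is sent over the network.
request = build_context_request(
    "http://gateway.example/context", "user-1", -22.98, -43.23
)

# The server side would decode the same JSON body back into a dictionary.
decoded = json.loads(request.data.decode("utf-8"))
print(request.get_method(), request.full_url)
print(decoded["context"])
```

Keeping the message as plain JSON over HTTP is what would let a Java client and a Python server exchange data without sharing any code.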

Tuesday, February 7, 2012

Paper 2: A mobile peer-to-peer system for opportunistic content-centric networking

Title: A mobile peer-to-peer system for opportunistic content-centric networking
Authors: Ólafur R. Helgason, Emre A. Yavuz, Sylvia T. Kouyoumdjieva, Ljubica Pajevic, and Gunnar Karlsson
Link: http://dl.acm.org/citation.cfm?id=1851322.1851330

In their paper, the authors discussed their middleware solution for connecting multiple devices together on a peer-to-peer network. They focused on the Android platform, and coded their system in Java. While the devices all connected to each other wirelessly, they required a Wi-Fi base station.

Their research focused on the battery power drawn by their system, and on the logic for determining which phone should function as the base node. The developers concluded that Bluetooth would make a much better implementation, though its limited bandwidth would pose a challenge. The researchers plan to look into caching technology in order to improve their system.

Thursday, February 2, 2012

Paper 1: Lessons from the Netflix Prize Challenge

Title: Lessons from the Netflix Prize Challenge
Authors: Robert M. Bell and Yehuda Koren
Link: http://dl.acm.org/citation.cfm?id=1345448.1345465

In 2006, the movie rental website Netflix.com launched a competition in order to improve their movie recommendation engine. While no one had achieved the target of a 10% improvement over their existing engine when the paper was written, a team out of AT&T Labs was still able to make significant progress. The team describes four main improvements:

A new method for computing nearest neighbor interpolation weights that better accounts for interactions among neighbors.
A neighborhood-aware factorization method that improves standard factorization models by optimizing criteria more specific to targets of specific predictions.
Integration of information about which movies a user rated into latent factor models for the ratings themselves.
New regularization methods across a variety of models, including both neighborhood and latent factor models.
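To show roughly what a regularized latent factor model from the list above looks like, here is a minimal Python sketch that learns user and movie factors by stochastic gradient descent with an L2 penalty. This is a generic textbook-style version, not the team's actual method; the ratings, names, and parameter values are all invented for the example.

```python
import random


def train_latent_factors(ratings, num_factors=2, learning_rate=0.05,
                         regularization=0.05, epochs=500, seed=0):
    """Fit rating ~ mean + dot(user_factors, item_factors) by SGD.

    The regularization term shrinks factors toward zero so the model
    does not simply memorize the few ratings it has seen.
    """
    rng = random.Random(seed)
    users = sorted({user for user, _, _ in ratings})
    items = sorted({item for _, item, _ in ratings})
    mean = sum(rating for _, _, rating in ratings) / len(ratings)
    user_f = {u: [rng.uniform(-0.1, 0.1) for _ in range(num_factors)] for u in users}
    item_f = {i: [rng.uniform(-0.1, 0.1) for _ in range(num_factors)] for i in items}
    for _ in range(epochs):
        for user, item, rating in ratings:
            p, q = user_f[user], item_f[item]
            error = rating - (mean + sum(a * b for a, b in zip(p, q)))
            for k in range(num_factors):
                pk, qk = p[k], q[k]  # use the old values for both updates
                p[k] += learning_rate * (error * qk - regularization * pk)
                q[k] += learning_rate * (error * pk - regularization * qk)
    return mean, user_f, item_f


def predict(model, user, item):
    """Predicted rating for a user and item that both appeared in training."""
    mean, user_f, item_f = model
    return mean + sum(a * b for a, b in zip(user_f[user], item_f[item]))


def rmse(model, ratings):
    """Root mean squared error of the model on the given ratings."""
    squared = [(r - predict(model, u, i)) ** 2 for u, i, r in ratings]
    return (sum(squared) / len(squared)) ** 0.5


# Hypothetical (user, movie, rating) triples on a 1-5 scale.
RATINGS = [
    ("ann", "alien", 5), ("ann", "brazil", 4), ("ann", "clue", 1),
    ("bob", "alien", 4), ("bob", "clue", 2),
    ("cat", "brazil", 1), ("cat", "clue", 5),
    ("dan", "alien", 5), ("dan", "brazil", 5),
]

untrained = train_latent_factors(RATINGS, epochs=0)
trained = train_latent_factors(RATINGS)
print("RMSE before training:", rmse(untrained, RATINGS))
print("RMSE after training: ", rmse(trained, RATINGS))
```

Raising the regularization value trades a worse fit on the known ratings for (hopefully) better predictions on unseen ones, which is the kind of tuning the fourth item in the list is about.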

The article was very interesting overall. Rather than trying to come up with a completely new, more efficient algorithm, this paper focused on improving existing recommendation engines. It was nice to see a modern, real world example as well, rather than a strictly academic work.