Sentiment analysis and online reputation

Sample of tourist establishments in downtown Madrid, Spain. Data from Sep 2017. Proof of concept.
196 establishments and 1649 reviews
This dashboard shows a sample of user reviews extracted programmatically from the Google Places API (restaurants and hotels) and the HomeAway API (apartments). Other APIs available for integration and analytics include Booking, Uber, TripAdvisor, Expedia, Zomato, etc., each with different restrictions, requirements and costs.
We use the sentiment analysis and keyword extraction machine learning API services at indico.io. Other available machine learning services include Google Natural Language API, IBM's Watson Natural Language Understanding, Microsoft Cognitive Services, etc. Custom models can also be built (for specific terminologies, legal documents, etc.); see for instance b4msa, NLTK, etc.

1. Inspect establishments

Each user review has a text body and a rating assigned by the user. We compute a sentiment analysis score from the review text body. The rating and the sentiment score should agree: a high rating should come with a positive sentiment score, and a low rating with a negative one. Ratings range from 0 (bad) to 5 (good). Sentiment scores range from 0.0 (negative) to 1.0 (positive).
Hover over any dot on the map to see its reviews, ratings and sentiment analysis scores.
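The comparison between the two scales described above can be sketched as follows. This is only an illustration: the function names and the idea of dividing the rating by 5 to bring it onto the 0.0 to 1.0 scale are assumptions, not necessarily how the dashboard computes anything.

```python
def rating_to_unit(rating, max_rating=5.0):
    """Map a 0-5 user rating onto the 0.0-1.0 scale used by sentiment scores.

    Linear rescaling is an assumption made for this sketch.
    """
    if not 0 <= rating <= max_rating:
        raise ValueError(f"rating {rating!r} is outside 0..{max_rating}")
    return rating / max_rating


def coherence_gap(rating, sentiment):
    """Absolute difference between the rescaled rating and the sentiment score.

    0.0 means rating and sentiment agree fully; 1.0 means they are opposite.
    """
    if not 0.0 <= sentiment <= 1.0:
        raise ValueError(f"sentiment {sentiment!r} is outside 0.0..1.0")
    return abs(rating_to_unit(rating) - sentiment)
```

A review rated 5 with a sentiment score of 1.0 has a gap of 0.0, whereas a review rated 0 with the same sentiment score has a gap of 1.0.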

2. Competition view per sector

Compare the ratings and sentiment analysis scores of individual establishments within each sector.
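A per-sector view like this one could be produced by grouping reviews by sector and establishment and averaging both scores. The sketch below is hypothetical: the record keys (`sector`, `establishment`, `rating`, `sentiment`) and the choice to order by mean rating, then mean sentiment, are assumptions for illustration and are not taken from the dashboard.

```python
from collections import defaultdict
from statistics import mean


def rank_by_sector(reviews):
    """Return {sector: [(establishment, mean_rating, mean_sentiment, n_reviews), ...]}.

    `reviews` is an iterable of dicts with the (assumed) keys
    'sector', 'establishment', 'rating' and 'sentiment'.
    Rows are sorted best first by mean rating, then by mean sentiment.
    """
    grouped = defaultdict(lambda: defaultdict(list))
    for review in reviews:
        grouped[review["sector"]][review["establishment"]].append(
            (review["rating"], review["sentiment"])
        )

    ranking = {}
    for sector, establishments in grouped.items():
        rows = []
        for name, pairs in establishments.items():
            ratings = [rating for rating, _ in pairs]
            sentiments = [sentiment for _, sentiment in pairs]
            rows.append((name, mean(ratings), mean(sentiments), len(pairs)))
        rows.sort(key=lambda row: (row[1], row[2]), reverse=True)
        ranking[sector] = rows
    return ranking
```

Keeping the review count in each row makes it easier to spot establishments whose position rests on only one or two reviews.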

Sectorial ranking




3. Anomalous reviews

We show anomalous reviews, where the user rating and the sentiment analysis score disagree (one is low when the other is high). This might be due to a user mistake when rating, or to the sentiment analysis engine failing to detect the user's actual intention. Anomalous reviews are grouped by establishment type in the graph below and account for the following proportions of reviews: home_away[APARTMENT] 2.09%, google[RESTAURANT] 1.68%, google[LODGING] 1.00%. Note the many HomeAway reviews with a zero rating and a high sentiment score, which might point to deficiencies in the design of their web page or application.
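The notion of an anomalous review described above can be sketched in a few lines. The thresholds that define "low" and "high" are not stated in the dashboard, so the values below are assumptions chosen only for illustration, as are the function names and the tuple layout of each review.

```python
from collections import Counter

# Hypothetical cut-offs; the dashboard does not state the ones it uses.
LOW_RATING, HIGH_RATING = 1.0, 4.0
LOW_SENTIMENT, HIGH_SENTIMENT = 0.2, 0.8


def is_anomalous(rating, sentiment):
    """True when one of rating/sentiment is low while the other is high."""
    low_rating_high_sentiment = rating <= LOW_RATING and sentiment >= HIGH_SENTIMENT
    high_rating_low_sentiment = rating >= HIGH_RATING and sentiment <= LOW_SENTIMENT
    return low_rating_high_sentiment or high_rating_low_sentiment


def anomaly_percentages(reviews):
    """Percentage of anomalous reviews per establishment type.

    `reviews` is an iterable of (establishment_type, rating, sentiment) tuples.
    """
    totals = Counter()
    anomalies = Counter()
    for kind, rating, sentiment in reviews:
        totals[kind] += 1
        if is_anomalous(rating, sentiment):
            anomalies[kind] += 1
    return {kind: 100.0 * anomalies[kind] / totals[kind] for kind in totals}
```

With these cut-offs, a review rated 0 whose text scores 0.95 is flagged, which corresponds to the zero-rating, high-sentiment pattern noted above.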

4. Distributions of ratings and sentiment scores

The histograms below show the distributions of the ratings given by users and of the sentiment scores assigned by the machine learning model. Observe that the model tends to assign extreme sentiment scores (close to 1 or close to 0), whereas users tend to spread their ratings out more. In general, HomeAway ratings appear to be higher and to agree more closely with the sentiment scores.
Hover over any dot on the graph above to see review details.
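The observation that sentiment scores pile up near 0 and 1 can be checked by binning the scores and measuring how much mass falls in the outermost bins. The sketch below is illustrative only: the bin count, the function names and the use of the two outermost bins as the definition of "extreme" are assumptions.

```python
def histogram(values, low, high, bins):
    """Count values into `bins` equal-width bins covering [low, high]."""
    if bins < 1 or high <= low:
        raise ValueError("need bins >= 1 and high > low")
    counts = [0] * bins
    width = (high - low) / bins
    for value in values:
        if not low <= value <= high:
            raise ValueError(f"value {value!r} is outside {low}..{high}")
        # A value equal to `high` belongs to the last bin.
        index = min(int((value - low) / width), bins - 1)
        counts[index] += 1
    return counts


def extreme_share(values, low=0.0, high=1.0, bins=10):
    """Fraction of values falling in the lowest or highest bin."""
    if not values:
        raise ValueError("no values given")
    counts = histogram(values, low, high, bins)
    if bins == 1:
        return 1.0
    return (counts[0] + counts[-1]) / len(values)
```

Applying `extreme_share` to the sentiment scores and to the ratings rescaled to the 0.0 to 1.0 range gives two numbers that can be compared directly.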