Document Level Sentiment Analysis: A survey

Sentiment analysis has become a very active research area in the text mining field. It aims to extract people's opinions, sentiments, and subjectivity from texts. Sentiment analysis can be performed at three levels: document level, sentence level and aspect level. A large part of the research effort focuses on document level sentiment classification, including work on the opinion classification of reviews. This survey provides a comprehensive overview of recent work on sentiment analysis at document level. Its main goal is to give a nearly complete picture of the applications, challenges and techniques of sentiment analysis at this level. In addition, some future research issues are presented.


Introduction
With the emergence of Web 2.0, users are no longer passive receivers of information: they can share their opinions and sentiments on a variety of topics in new interactive forms. Because this information matters in several areas (political, commercial or individual), it is worthwhile to process opinions automatically. The term "sentiment analysis" refers to the automatic processing of opinions, sentiments and subjectivity in texts. The field is known as opinion mining [16] or sentiment analysis [7].
Sentiment analysis is an extremely active field of research in natural language processing that makes it possible to extract opinions from a set of documents. It can be investigated at different levels [7]:

• Document level analysis
The task at this level is to determine the overall opinion of the document. Sentiment analysis at document level assumes that each document expresses opinions on a single entity.

• Sentence level analysis
The task at this level is to determine whether each sentence expresses an opinion. This level distinguishes objective sentences, which express factual information, from subjective sentences, which express opinions. The treatment is twofold: first identify whether the sentence expresses an opinion, then assess the polarity of that opinion. The main difficulty is that objective sentences can also carry opinions.

• Aspect level analysis
This level performs a finer-grained analysis and requires the use of natural language processing. At this level, an opinion is characterized by a polarity and a target. The treatment is again twofold: first identify the entity and its aspects, then assess the opinion on each aspect.

This paper is an extended version of the work published in [37]. The rest of this paper is organized as follows: Section 2 presents applications of sentiment analysis. Section 3 describes the challenges of sentiment analysis. Section 4 provides background information and related work on document level sentiment classification. Section 5 presents a comparative study of the surveyed works. The last section concludes our study and discusses some future directions for research.

Applications of sentiment analysis
Sentiment analysis has various applications, ranging from identifying customer opinions about products and services [7] to gauging voters' reactions to political adverts. Other application areas in which sentiment analysis can be very useful are:

Business Intelligence
Sentiment analysis plays a very important role in many business intelligence applications such as credit rating or company reputation. It is useful to classify each opinion according to the aspect of the business or transaction it describes, e.g., product quality, ordering, or integrity [34].
Sentiment analysis helps to evaluate the limitations of particular products and then to exploit this information to improve the products or services. It also helps enterprises to understand their customers and to plan future products and services [34].

Recommendation system
Sentiment analysis helps to determine what an individual or a customer thinks of a product, service or issue. Customers make their decisions on the basis of these reviews, and the same information can be used to predict the effect of a topic on other domains, for example the impact of demonetization on economic and social fronts [33].

Summarization of Reviews
Sentiment analysis makes it possible to extract the opinions about a particular entity and to provide an overall rating for a given topic. Without such a summary, a customer may not be able to read all the reviews and make an informed decision about a product, and the manufacturer may not be able to keep track of consumer opinion [33].

Government Intelligence
In the domain of government intelligence, sentiment analysis makes it possible to extract opinions on government policies and decisions in order to infer the possible public reaction to the implementation of certain policies [35].

Challenges of sentiment analysis
Opinion words are the most important indicators of opinions. They are necessary but not sufficient for sentiment analysis. The main challenges of sentiment analysis are briefly discussed below:

Contextual polarity ambiguity
Contextual polarity ambiguity is an important problem in sentiment analysis [36]. The same opinion-bearing word can carry different polarities in different contexts.

Sarcasm handling
Sarcasm is very difficult to deal with. In sentiment analysis, it means that when someone says something positive, they actually mean something negative, and vice versa [7].

Negation handling
Negation handling plays an important role in sentiment analysis because a negation reverses the polarity of the opinion expressed.
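The polarity reversal caused by negation can be illustrated with a minimal sketch. The tiny lexicon, the list of negation cues and the function name `negation_aware_score` are illustrative assumptions, not taken from any of the surveyed systems; real systems use larger lexicons and a more careful definition of the negation scope.

```python
# Toy polarity lexicon and negation cues: both are illustrative assumptions.
LEXICON = {"good": 1, "great": 1, "bad": -1, "boring": -1}
NEGATIONS = {"not", "no", "never"}


def negation_aware_score(text):
    """Sum word polarities, flipping the next opinion word after a negation cue."""
    total = 0
    negate = False
    for token in text.lower().split():
        if token in NEGATIONS:
            negate = True
            continue
        polarity = LEXICON.get(token, 0)
        if polarity != 0:
            total += -polarity if negate else polarity
            negate = False  # the negation has been consumed by this opinion word
    return total
```

With this sketch, "the movie was not good" receives a negative score although it contains only a positive opinion word.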

Opinion expressions
On the one hand, many sentences with opinion words may not express any opinion. On the other hand, many sentences without opinion words can express opinions; many of these are objective sentences that are used to express factual information [7].

Related Work
The field of sentiment analysis is vast and has grown spectacularly because of the commercial stakes. A relatively exhaustive state of the art was drawn up in 2008 by Pang and Lee [16], who focused on the applications and challenges of sentiment analysis and described the techniques used to solve each problem.
In his book, Bing Liu [7] presents a synthesis of work on sentiment analysis. It updates the state of the art of Pang and Lee (2008) and distinguishes three levels of analysis: document, sentence and aspect level.
Existing approaches to sentiment analysis are categorized into three main classes: machine learning approaches, lexicon based approaches and hybrid approaches [1].

Machine Learning Approaches
These approaches require a corpus that has been labelled in advance (positive, negative or neutral) [28]. The main features used are words, bigrams, trigrams, parts of speech and polarity. Several supervised techniques are used, but two of them appear to provide the best results: the Support Vector Machine (SVM) and Naive Bayes (NB) classifiers [9] [21] [25].
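As an illustration of the word, bigram and trigram features mentioned above, the following minimal sketch extracts them from a whitespace-tokenized text. The function names and the simple tokenization are assumptions made for the example; the surveyed works use their own pre-processing pipelines.

```python
def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list, joined by spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def extract_features(text, max_n=3):
    """Collect unigrams, bigrams and trigrams (up to max_n) from a text."""
    tokens = text.lower().split()  # naive whitespace tokenization (assumption)
    features = []
    for n in range(1, max_n + 1):
        features.extend(ngrams(tokens, n))
    return features
```

For the text "not a good film", the bigram "a good" and the trigram "not a good" appear among the features, which shows how higher-order n-grams can capture some local context that single words miss.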
"Bag of words" models are based only on collections of n-grams of words, so approaches based on n-grams cannot correctly identify complex sentiment expressions. To address this, the authors of [13] use the Incremental Parser (XIP) to construct the dependency tree. They tested the model on a set of French reviews of video games, developed as part of the DOXA project [32] on opinion mining. They were thus able to show that an SVM classifier using features built from sub-graphs extracted from dependency trees gives better results than traditional systems based on unigrams.
In [17], the authors propose the use of a neural network to learn an effective model for sentiment classification. They compared their work with an SVM model on the multi-thematic Amazon corpus. The experimental results show identical performance.
In [30], the objective of the work is the classification of Chinese mobile reviews with SVM and NB classifiers; these reviews are much shorter on average than those written on a PC. The scoring used in this work is the iTunes score. iTunes uses a scale of 1 to 5 points: reviews with 1 or 2 points are marked as negative, reviews with 4 or 5 points are marked as positive, and those with 3 points are marked as neutral. The results show that the NB classifier performs better.
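The labelling rule described for [30] can be written as a short function. This is a sketch of the rule as stated in the text; the function name `label_from_rating` and the error handling are assumptions added for the example.

```python
def label_from_rating(rating):
    """Map a 1-5 star rating to a sentiment label, following the rule described for [30]."""
    if rating not in (1, 2, 3, 4, 5):
        raise ValueError("rating must be an integer from 1 to 5")
    if rating <= 2:
        return "negative"
    if rating == 3:
        return "neutral"
    return "positive"
```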
Vinodhini and Chandrasekaran [26] evaluated the effect of PCA (Principal Component Analysis) on two sentiment classification methods: SVM and NB. The experiments were performed on product reviews. Performance improved when PCA was used as a feature reduction method.
Several works on document level sentiment analysis deal with movie reviews. Pang [14] was the first to experiment with machine learning for this task. The proposed methods, which had proved to be good in text categorization, did not achieve equally good performance for sentiment classification. He also demonstrated that the binary representation is more effective than the frequency representation.
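The difference between the binary and the frequency representation can be made concrete with a minimal sketch. The vocabulary and the function names below are illustrative assumptions; they only show how the same document yields two different vectors.

```python
from collections import Counter


def frequency_vector(tokens, vocabulary):
    """Frequency representation: how many times each vocabulary word occurs."""
    counts = Counter(tokens)
    return [counts[word] for word in vocabulary]


def binary_vector(tokens, vocabulary):
    """Binary representation: 1 if the vocabulary word occurs at least once, else 0."""
    present = set(tokens)
    return [1 if word in present else 0 for word in vocabulary]
```

For the tokens of "good good plot bad" and the vocabulary `["good", "bad", "plot", "acting"]`, the frequency vector records that "good" occurs twice, whereas the binary vector only records that it is present.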
The authors of [28] used Classification by Minimizing the Error (CME) to assign an opinion score to each sentence. They then used an SVM classifier to assign a score to each document based on the values of some features (these features are defined from the subjectivity and the relevancy of all the sentences). Blogs are then classified according to their final score, which is the relevancy score multiplied by the opinion score.
The first freely available Arabic corpus for sentiment analysis, the Opinion Corpus for Arabic (OCA), was proposed by Rushdi Saleh et al. [19]. The OCA corpus consists of 500 movie reviews collected from different Arabic web pages, 250 of them considered as positive and the rest as negative opinions. In addition, various experiments were conducted on this corpus using NB and SVM classifiers.
Govindarajan [6] proposed a hybrid classification method based on the coupling of NB and a genetic algorithm (GA). In this method, the two base classifiers NB and GA are first built to assign an opinion score, and the classification of a new review is then obtained by combining the predictions of the two base classifiers by a majority vote. The author used a set of 2000 movie reviews extracted from the corpus of Bo Pang. The hybrid method was compared with the two base classifiers NB and GA. The results showed that the hybrid method improved performance.
Nguyen et al. [10] proposed a new type of feature named "rating-based feature". It is based on the fact that the scores that users assign to entities in reviews can provide useful information for improving the performance of opinion classification. For a review with no associated score, the authors use a regression model to predict the score. They combine the rating-based feature with unigrams, bigrams and trigrams.
In [23], the authors propose a model for sentiment classification. First, various pre-processing schemes are applied to the data set. Second, the behaviour of the NB and SVM classifiers is studied in combination with different feature selection schemes. The classification results clearly show that the linear SVM gives higher precision than the NB classifier.
In [5], the authors propose a method for sentiment analysis of Arabic tweets containing dialectal words. These words were replaced with their corresponding words in Modern Standard Arabic (MSA) by means of a dialect lexicon. Both NB and SVM classifiers were used to determine the polarity of the tweets. Two versions of the same data set are used: the first consists of tweets containing dialectal words and the second consists of tweets containing the translated words. The results show that replacing the dialectal words improves the accuracy of the classifiers (by 3%).

Deep-learning-based approach
Deep learning is the part of machine learning that relies on deep neural networks. These include CNN (Convolutional Neural Networks), RNN (Recurrent Neural Networks), Recursive Neural Networks and DBN (Deep Belief Networks), among others. In recent years, deep learning approaches have captured the attention of researchers because they have significantly outperformed traditional methods. For example, in [38] Paredes-Valverde et al. proposed a deep learning approach to build a classifier for sentiment detection. This approach is divided into three main modules: (1) a pre-processing module, (2) word embeddings, and (3) a CNN model. The results show that the CNN outperformed traditional models such as SVM and NB.

Lexicon based Approaches
Lexicon based approaches exploit a sentiment lexicon that is either built independently of any corpus (from existing dictionaries) or generated from the corpus (opinion words are extracted directly from the corpus). The objective of these lexicons is to index as many opinion-bearing words as possible. If a document contains many subjective words, it is considered to be a document containing opinions [8] [11].
Turney presents a simple algorithm for the sentiment classification of reviews [24]. The review is classified by the average semantic orientation (SO) of the phrases it contains. A phrase has a positive SO when it has good associations and a negative SO when it has bad associations. The SO is calculated as the Pointwise Mutual Information (PMI) between the given phrase and the word "excellent" minus the PMI between the given phrase and the word "poor". Finally, the review is classified according to the average semantic orientation of its phrases.
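The semantic orientation used in [24] can be sketched from co-occurrence counts. Writing SO as the PMI with "excellent" minus the PMI with "poor", the probability of the text unit itself cancels, which leaves a ratio of four counts. The function and parameter names below are assumptions for illustration, as are the small smoothing constant that avoids division by zero and the "positive"/"negative" output labels.

```python
import math


def semantic_orientation(near_excellent, near_poor, hits_excellent, hits_poor,
                         smoothing=0.01):
    """SO = PMI(unit, "excellent") - PMI(unit, "poor"), computed from counts.

    near_excellent / near_poor: co-occurrence counts of the unit with each seed word.
    hits_excellent / hits_poor: overall counts of each seed word (must be > 0).
    smoothing: small constant added to the co-occurrence counts (assumption).
    """
    numerator = (near_excellent + smoothing) * hits_poor
    denominator = (near_poor + smoothing) * hits_excellent
    return math.log2(numerator / denominator)


def classify_review(orientations):
    """Classify a review by the average semantic orientation of its units."""
    if not orientations:
        raise ValueError("at least one semantic orientation value is required")
    average = sum(orientations) / len(orientations)
    return "positive" if average > 0 else "negative"
```

A unit that co-occurs ten times more often with "excellent" than with "poor", when both seed words are equally frequent, receives an SO of about log2(10).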
To overcome the problem of domain dependency in sentiment analysis, Rothfels and Tibshirani [18] propose an approach for treating movie reviews that uses the automatic selection of items with positive or negative opinions. They choose two seed reference sets, one positive and one negative, to calculate the semantic orientation (SO). The SO is calculated as the PMI between the given sentence and the seed reference set.
Baloglu and Aktas [2] proposed a lexicon-based approach divided into three phases. Phase 1 is the crawling phase, in which the data are collected from blogs on the Web. Phase 2 is the analysis phase, in which the data are analyzed to extract useful information (predefined keywords), SentiWordNet is used to determine the sentiment score of each keyword, and the review is finally classified based on the average of these scores. Phase 3 is the visualization phase, in which the information is displayed.
A sentiment analysis system named Document based Sentiment Orientation System is proposed in [20]. It uses a lexicon based approach that determines the SO of reviews, and WordNet to identify synonyms and antonyms of the words in the opinion word list. Negation is also handled in this system. This approach provides a summary of the total number of positive and negative documents.
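The analysis phase described for [2] averages the lexicon scores of the keywords found in a review. The sketch below reproduces only that averaging step. The small hand-written dictionary stands in for SentiWordNet and its scores are invented for illustration; the function name and the "neutral" fallback are likewise assumptions.

```python
# Hypothetical keyword scores standing in for a real lexicon such as SentiWordNet.
KEYWORD_SCORES = {"excellent": 0.9, "good": 0.6, "poor": -0.6, "terrible": -0.9}


def classify_by_lexicon(text, lexicon=KEYWORD_SCORES):
    """Classify a text by the average score of the lexicon keywords it contains."""
    scores = [lexicon[token] for token in text.lower().split() if token in lexicon]
    if not scores:
        return "neutral"  # no keyword found (assumption: treat as neutral)
    average = sum(scores) / len(scores)
    if average > 0:
        return "positive"
    if average < 0:
        return "negative"
    return "neutral"
```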

Hybrid Approaches
Hybrid approaches combine the strengths of machine learning and lexicon-based approaches: the linguistic processing of lexicon-based approaches is applied before the learning process of machine learning approaches starts.
Ohana and Tierney [12] assess the use of SentiWordNet. First, the lexicon was employed to calculate the score of the terms found in a document and to determine the sentiment orientation. This method was then improved by constructing relevant features with SentiWordNet as a source and applying an SVM classifier. The results indicate that SentiWordNet can serve as such a source.
In [29], the authors proposed an ensemble learning method based on the behaviour-knowledge space (BKS), in which four base classifiers are used: single weighted sum of opinion words (SWS), weighted sum of opinion words (WSC), SVM and k-nearest neighbours (KNN). The results show the effectiveness of the proposed method, which performs considerably better than the base classifiers.
Different ways to combine discourse analysis based on RST (Rhetorical Structure Theory) with sentiment analysis are proposed in [3]: (i) a recursive neural network over the RST structure and (ii) a reweighting of discourse units. The authors show that reweighting discourse units can lead to substantial improvements for lexicon-based sentiment analysis, and that the recursive neural network using the RST structure offers significant improvements over basic classification methods.

Comparative study
We use the following components to make a comparative study:

• Approach used
• Technique used
• Data source used
• Feature construction used
• Quantitative evaluation

As we have seen in the surveyed works, the three main classes of approaches used for sentiment analysis are the machine learning approach, the lexicon based approach and the hybrid approach. Table 1 groups the surveyed works into these categories.
The goal of feature construction is to select good features for sentiment classification. Many features are used for sentiment analysis: unigrams, bigrams, n-grams, POS and opinion words. Table 4 presents the feature construction used in the surveyed works. The performance of the different works on opinion mining is evaluated by calculating various metrics such as accuracy. Accuracy is calculated by using the following equation.

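$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$

where TP and TN are the numbers of positive and negative documents that are classified correctly, and FP and FN are the numbers of documents that are wrongly classified as positive and as negative, respectively.

From the figure (Fig. 1), we find that work [6], which used a hybrid method of classification based on the coupling of NB and GA, gave the best results in terms of accuracy. Work [10], which used a new type of feature named "rating-based feature", also achieved good results.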
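Accuracy, the metric used in this comparison, is the proportion of documents whose predicted label matches the reference label. A minimal sketch follows; the function name and the input checks are assumptions added for the example.

```python
def accuracy(true_labels, predicted_labels):
    """Fraction of documents whose predicted label equals the reference label."""
    if len(true_labels) != len(predicted_labels) or not true_labels:
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)
```

For example, if three out of four reviews are labelled correctly, the accuracy is 0.75.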

Conclusion and Future Work
The number of documents expressing opinions is constantly increasing on the World Wide Web. Document level sentiment classification provides an overall opinion of the document on a single entity. In this article, we have presented an overview of related work on sentiment analysis at document level; the machine learning approach dominates at this level. The main classifiers used are SVM and NB. The most widely used text representation is the "bag of words" representation, but supervised approaches using n-gram features cannot properly model negation and cannot correctly identify complex sentiment expressions, because of the loss of information incurred when texts are represented with bag of words models. Most of the works use movie review data for classification. In recent years, however, deep learning approaches have captured the attention of researchers because they have significantly outperformed traditional methods such as SVM and NB.
Classification at the document level is not always adequate:

• In many applications, the user needs to know which aspects of entities are liked and disliked by consumers, but this level of classification cannot extract them.
• Document level classification is not easily applicable to texts that compare several entities, such as forum discussions, blogs and news articles.
• Different emotions about the different aspects of an entity cannot be extracted separately.

It is therefore necessary to move to the sentence level, i.e., to classify the opinion expressed in each sentence. However, there is no fundamental difference between document level and sentence level sentiment classification, since a sentence can be regarded as a short document.
In future work, more effort should be made to improve the performance measures; for this purpose, there is a need for finer-grained analysis at the aspect level. Furthermore, the language that has been studied most is English. At present, very little research has been conducted on the sentiment classification of other languages such as Arabic. We intend to propose a new approach for aspect based sentiment analysis of Arabic texts.

In [13], Pak and Paroubek developed a new sub-graph-based representation extracted from syntactic dependency trees. They represent a text as a collection of sub-graphs, where the nodes are words (or word classes) and the arcs are the syntactic dependencies between them. Such a representation avoids the loss of information associated with the use of "bag of words" models.

EAI Endorsed Transactions on Context-aware Systems and Applications 07 2017 -03 2018 | Volume 4 | Issue 13 | e2

In [39], Tang et al. introduce neural network models (Conv-GRNN and LSTM-GRNN) for document level sentiment classification. The model first learns sentence representations with a convolutional neural network or a long short-term memory network. Afterwards, the semantics of sentences and their relations are adaptively encoded in a document representation with a gated recurrent neural network. They conducted document level sentiment classification on four large-scale review datasets from IMDB and the Yelp Dataset Challenge. Experimental results show that LSTM performs better than a multi-filtered CNN in modelling sentence representations.

Sentiment classification techniques are usually distinguished by the approach used. Several machine learning algorithms are used as techniques for document classification. Prominent methods are NB, Maximum Entropy, KNN and SVM. In recent years, deep learning approaches have captured the attention of researchers because they have significantly outperformed traditional methods. Prominent methods are CNN, RNN, DBN, LSTM and GRNN. The lexicon based methods are divided into dictionary-based and corpus-based methods. The surveyed works used different techniques. We have collected the various techniques used.

Figure 1. Sentiment analysis for movie review dataset

Table 2 presents the results.

Table 2. Comparative study by technique

Table 3. Comparative study by data sources

Table 4. Comparative study by feature construction