Elasticsearch Installation and Configuration: Quickly Setting Up a Local Environment

金刚豆 2024-06-03
Front-end Notes

3. Web Content Mining

3.1 Introduction to Sentiment Analysis / Opinion Mining

Detecting stances and opinions towards people, companies, and products or services has tremendous business value: improving products and services, targeting advertising, revealing trends in election campaigns, …

Sentiment analysis, or opinion mining, is the computational study of people's opinions, appraisals, attitudes, and emotions towards entities, individuals, issues, events, topics, and their attributes (aspects).

A general sentiment analysis framework aims to answer four questions:

  1. Who is the opinion holder? -> Opinion holder
  2. Towards whom or what is opinion/sentiment expressed? -> Target
  3. What is the polarity and intensity of the opinion?
  4. Is an opinion associated with a time-span?
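One way to make the four questions concrete is to store each opinion as a small record with one field per question. The sketch below is only an illustration: the class name `Opinion`, the field names, the polarity labels, and the [0, 1] intensity scale are all assumptions, not part of the framework itself.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Opinion:
    """One opinion, with a field for each question in the framework."""
    holder: str                 # 1. who holds the opinion
    target: str                 # 2. the entity or aspect it is about
    polarity: str               # 3. "positive", "negative", or "neutral" (assumed labels)
    intensity: float            # 3. strength in [0, 1] (assumed scale)
    time: Optional[str] = None  # 4. when it was expressed, if known


# Hypothetical example record.
example = Opinion(
    holder="reviewer_42",
    target="phone battery",
    polarity="negative",
    intensity=0.8,
    time="2024-06",
)
```

Leaving `time` optional reflects question 4: not every opinion comes with a time-span.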

Opinion

3.2 Constructing Sentiment Lexicons

Sentiment clues (opinion words, sentiment-bearing words) – words and phrases used to express some desired or undesired state
Positive clues: good, amazing, beautiful
Negative clues: bad, awful, terrible, poor

Sentiment clues are often domain-dependent => Separate sentiment lexicons need to be constructed for different domains
Example: quiet speakerphone vs. quiet car engine. "Quiet" is a negative clue for a speakerphone but a positive clue for a car engine.
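A minimal sketch of what separate lexicons per domain could look like. The domain names, words, and polarity values below are invented for illustration, and `clue_polarity` is a hypothetical helper.

```python
# Hypothetical per-domain lexicons: +1 marks a positive clue, -1 a negative one.
DOMAIN_LEXICONS = {
    "speakerphone": {"quiet": -1, "clear": +1, "crackling": -1},
    "car_engine": {"quiet": +1, "smooth": +1, "noisy": -1},
}


def clue_polarity(word: str, domain: str) -> int:
    """Return the polarity of `word` in `domain`, or 0 if it is not a known clue."""
    return DOMAIN_LEXICONS.get(domain, {}).get(word.lower(), 0)
```

With these toy entries, `clue_polarity("quiet", "speakerphone")` is -1 while `clue_polarity("quiet", "car_engine")` is +1, so the same word gets opposite polarity depending on the domain.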

3.2.1 Automated acquisition of sentiment lexicons

Automated acquisition of sentiment lexicons is most often semi-supervised (or weakly supervised):

  1. Start from a small seed lexicon of sentiment words
  2. Iteratively augment the lexicon based on links between words already in the lexicon and words in a large general lexicon or a large corpus
  3. Stop when there are no more reliable candidate words to be added to the lexicon
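The three steps above can be sketched as a generic bootstrapping loop. This is a simplified illustration, not a specific published algorithm. It assumes that links between words come with a numeric strength, and that "reliable" means the strength reaches a fixed threshold. `bootstrap_lexicon`, `LINKS`, and the values in it are hypothetical.

```python
def bootstrap_lexicon(seeds, links, threshold=0.5, max_iters=10):
    """Grow a lexicon from `seeds` by following sufficiently strong links.

    links: dict mapping word -> {neighbour: link strength in [0, 1]}.
    """
    lexicon = set(seeds)                      # step 1: start from the seed lexicon
    for _ in range(max_iters):
        candidates = set()
        for word in lexicon:
            for neighbour, strength in links.get(word, {}).items():
                if neighbour not in lexicon and strength >= threshold:
                    candidates.add(neighbour)
        if not candidates:                    # step 3: no reliable candidates remain
            break
        lexicon |= candidates                 # step 2: augment the lexicon
    return lexicon


# Toy link strengths, invented for illustration only.
LINKS = {
    "good": {"great": 0.9, "fine": 0.6, "okay": 0.3},
    "great": {"superb": 0.8},
    "fine": {"thin": 0.2},
}

grown = bootstrap_lexicon({"good"}, LINKS)
```

On this toy data the loop adds "great" and "fine" in the first pass and "superb" in the second, and it never adds "okay" or "thin" because their links fall below the threshold.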

Approaches for constructing sentiment lexicons are either dictionary-based or corpus-based.

Often there is a final step of manual cleansing of automatically derived sentiment lexicons

3.2.1.1 Dictionary-Based Sentiment Lexicon Acquisition

Bootstrapping using a small seed sentiment lexicon, e.g., 10 positive and 10 negative sentiment words
Idea: exploit semantic links between words in the general lexicon, e.g., synonymy and antonymy links in WordNet. The procedure is typically iterative
Additional information can be used to make better lists: WordNet glosses or machine learning (classification based on concept definitions)
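A minimal sketch of the synonym/antonym idea, assuming that synonyms keep the polarity of a word and antonyms flip it. To stay self-contained it uses two tiny hand-made dictionaries, `SYNONYMS` and `ANTONYMS`, as a stand-in for a real resource such as WordNet. Their entries and the function `expand` are invented for illustration.

```python
# Toy stand-in for a general lexicon; the entries are invented.
SYNONYMS = {
    "good": {"fine", "decent"},
    "bad": {"poor", "awful"},
    "awful": {"terrible"},
}
ANTONYMS = {
    "good": {"bad"},
    "decent": {"indecent"},
}


def expand(positive_seeds, negative_seeds, max_iters=5):
    """Iteratively propagate polarity along synonym and antonym links."""
    pos, neg = set(positive_seeds), set(negative_seeds)
    for _ in range(max_iters):
        new_pos, new_neg = set(), set()
        for word in pos:
            new_pos |= SYNONYMS.get(word, set())  # synonyms keep polarity
            new_neg |= ANTONYMS.get(word, set())  # antonyms flip polarity
        for word in neg:
            new_neg |= SYNONYMS.get(word, set())
            new_pos |= ANTONYMS.get(word, set())
        # Keep only unseen words, and skip words that received both polarities.
        new_pos -= pos | neg
        new_neg -= pos | neg
        conflicts = new_pos & new_neg
        new_pos -= conflicts
        new_neg -= conflicts
        if not new_pos and not new_neg:
            break
        pos |= new_pos
        neg |= new_neg
    return pos, neg


positive, negative = expand({"good"}, {"bad"})
```

Words that would receive both polarities in the same pass are simply dropped here, which is one place where manual cleansing or a classifier could make the resulting lists better.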

Cons:

  1. Limited coverage: dictionary-based lexicons may miss nuanced or domain-specific sentiment.
  2. Lack of context understanding: these approaches often treat words in isolation without considering their context.
  3. Difficulty handling negations and modifiers: sentiment dictionaries may struggle with negations (e.g., "not good") or modifiers (e.g., "very good").
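The third point can be illustrated with a small scorer. The sketch below compares a naive lookup that treats each word in isolation with a variant that lets a directly preceding negation or intensifier modify the next clue. The lexicon values, the negation list, and the intensifier multipliers are all assumptions, and tokenization is plain whitespace splitting.

```python
LEXICON = {"good": 1.0, "amazing": 2.0, "bad": -1.0, "terrible": -2.0}
NEGATIONS = {"not", "never", "no"}
INTENSIFIERS = {"very": 1.5, "extremely": 2.0}  # multipliers are assumptions


def naive_score(text: str) -> float:
    """Sum word polarities, treating every word in isolation."""
    return sum(LEXICON.get(tok, 0.0) for tok in text.lower().split())


def context_score(text: str) -> float:
    """Like naive_score, but negations and intensifiers modify the next word."""
    score, sign, weight = 0.0, 1.0, 1.0
    for tok in text.lower().split():
        if tok in NEGATIONS:
            sign = -sign                    # flip the polarity of what follows
        elif tok in INTENSIFIERS:
            weight *= INTENSIFIERS[tok]     # scale the strength of what follows
        else:
            if tok in LEXICON:
                score += sign * weight * LEXICON[tok]
            sign, weight = 1.0, 1.0         # reset after any non-modifier token
    return score
```

The naive scorer gives "not good" and "very good" the same score as "good", while the context-aware variant separates them. Even that variant is crude: in "not a good idea" the negation is reset by "a" before it reaches "good".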