1. Problem
2. Applications
3. Datasets
4. Methods
   1. Naive Bayes
   2. RNN
   3. CNN
   4. Others
5. References

Notes on sentence classification: classifying sentence datasets with deep learning methods.

Problem

Sentence classification is the task of labeling a given sentence with one of a number of predefined categories.

Sentence classification includes tasks such as sentiment analysis and question classification. Sentiment analysis, also known as opinion extraction, opinion mining, sentiment mining, or subjectivity analysis, is the process of analyzing, processing, summarizing, and reasoning over subjective text that carries sentiment, for example analyzing review text for a user's attitude toward attributes of a digital camera such as zoom, price, size, weight, flash, and ease of use.

Applications

Understanding positive and negative opinions about movies, products, tweets, and the like, in order to improve products and services, uncover competitors' strengths and weaknesses, predict stock movements, and so on.

Datasets

Data   c  l   N      |V|    |V_pre|  Test
MR     2  20  10662  18765  16448    CV
SST-1  5  18  11855  17836  16262    2210
SST-2  2  19  9613   16185  14838    1821
Subj   2  23  10000  21323  17913    CV
TREC   6  10  5952   9592   9125     500
CR     2  19  3775   5340   5046     CV
MPQA   2  3   10606  6246   6083     CV

(c: number of target classes; l: average sentence length; N: dataset size; |V|: vocabulary size; |V_pre|: number of words also present in the pre-trained word vectors; Test: test set size, where CV means there is no standard split and 10-fold cross-validation is used — an evaluation sketch follows the list below.)

  • MR: Movie reviews; each review consists of a single sentence. [1]

  • SST-1: Stanford Sentiment Treebank, an extension of MR that comes with train/dev/test splits and five fine-grained labels (very positive, positive, neutral, negative, very negative).

  • SST-2: Same as SST-1, but with neutral reviews removed and binary labels. [2]

  • Subj: Subjectivity dataset, where the task is to classify a sentence as subjective or objective. [3]

  • TREC: TREC question dataset, where the task is to classify a question into one of 6 question types (about person, location, numeric information, etc.). [4]

  • CR: Customer reviews of various products, where the task is to predict positive/negative reviews. [5]

  • MPQA: Opinion polarity detection subtask of the MPQA dataset. [6]
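
For the datasets marked CV in the table, accuracy is reported with 10-fold cross-validation because there is no standard train/test split. A minimal sketch of that protocol with scikit-learn; load_sentences is an assumed helper, not part of any dataset distribution:

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score
    from sklearn.pipeline import make_pipeline

    # Assumed helper: returns parallel lists of raw sentences and integer
    # labels, e.g. for the MR dataset (1 = positive, 0 = negative).
    texts, labels = load_sentences("MR")

    model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
    scores = cross_val_score(model, texts, labels, cv=10, scoring="accuracy")
    print(f"10-fold CV accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")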

Methods

The task is usually decomposed into several subtasks (a minimal end-to-end sketch follows the list):

  1. Tokenization

    Split the sentence into words; this can also involve removing stop words, part-of-speech tagging, converting words into word vectors, and similar operations.

  2. Feature extraction

    The tokens are not always fed to the classifier directly; sometimes features are extracted from them to make classification easier.

    Common features: TF-IDF, LDA, LSI

  3. Building a classifier

    Given the features or word vectors as input, a model assigns the sentence to a class.
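
A minimal end-to-end sketch of these three steps with scikit-learn on toy data; a real setup would load one of the datasets above and tune the vectorizer and classifier:

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.svm import LinearSVC

    train_texts = ["a thoughtful , moving film", "a dull and tired retread"]
    train_labels = [1, 0]  # toy data: 1 = positive, 0 = negative

    # Step 1, tokenization: TfidfVectorizer's default token pattern splits the
    # raw sentence into words; a custom tokenizer (stop-word removal, POS
    # filtering, etc.) could be passed via the `tokenizer` argument.
    # Step 2, feature extraction: the same vectorizer maps each sentence to a
    # TF-IDF weighted bag-of-words vector.
    vectorizer = TfidfVectorizer()
    X = vectorizer.fit_transform(train_texts)

    # Step 3, classifier: any model that accepts feature vectors works here.
    clf = LinearSVC().fit(X, train_labels)
    print(clf.predict(vectorizer.transform(["a moving , thoughtful retread"])))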

Naive Bayes

NBSVM: Naive Bayes SVM (sketched after the results table)

MNB: Multinomial Naive Bayes [7]

combine-skip: concatenated uni-skip and bi-skip Skip-Thought sentence vectors

combine-skip + NB: combine-skip vectors combined with Naive Bayes features [8]

Model            MR    SST-1  SST-2  Subj  TREC  CR    MPQA
NBSVM            79.4  -      -      93.2  -     81.8  86.3
MNB              79.0  -      -      93.6  -     80.0  86.3
combine-skip     76.5  -      -      93.6  92.2  80.1  87.1
combine-skip+NB  80.4  -      -      93.6  -     81.3  87.5
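
NBSVM from [7] trains a linear classifier on binarized bag-of-words features scaled element-wise by the Naive Bayes log-count ratio. A rough sketch with toy data (unigram features and alpha = 1 smoothing; the paper's bigrams and SVM/NB weight interpolation are omitted, and logistic regression stands in for the SVM):

    import numpy as np
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.linear_model import LogisticRegression

    # Toy corpus; 1 = positive, 0 = negative.
    texts = ["good great fun", "boring bad slow", "great acting", "bad plot"]
    y = np.array([1, 0, 1, 0])

    # Binarized bag-of-words features (binarization helps MNB in the paper).
    X = (CountVectorizer().fit_transform(texts) > 0).astype(np.float64)

    # Smoothed per-class feature counts and the NB log-count ratio r.
    alpha = 1.0
    p = alpha + np.asarray(X[y == 1].sum(axis=0)).ravel()
    q = alpha + np.asarray(X[y == 0].sum(axis=0)).ravel()
    r = np.log((p / p.sum()) / (q / q.sum()))

    # NBSVM: a linear classifier over NB-scaled features.
    clf = LogisticRegression().fit(X.multiply(r), y)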

RNN

RCNN: Recurrent Convolutional Neural Networks [9]

S-LSTM: Long Short-Term Memory Over Recursive Structures [10]

LSTM: Long Short-Term Memory

BLSTM: Bidirectional Long Short-Term Memory (sketched after the results table)

Tree-LSTM: Tree-structured Long Short-Term Memory [11]

LSTMN: Long Short-Term Memory-Network [12]

Multi-Task: Recurrent Neural Network for Text Classification with Multi-Task Learning [13]

BLSTM-Att: Bidirectional Long Short-Term Memory, attention-based model

BLSTM-2DPooling: Bidirectional Long Short-Term Memory Networks with Two-Dimensional Max Pooling

BLSTM-2DCNN: Bidirectional Long Short-Term Memory Networks with 2D convolution [14]

Model            MR    SST-1  SST-2  Subj  TREC  CR  MPQA
RCNN             -     47.21  -      -     -     -   -
S-LSTM           -     -      81.9   -     -     -   -
LSTM             -     46.4   84.9   -     -     -   -
BLSTM            -     49.1   87.5   -     -     -   -
Tree-LSTM        -     51.0   88.0   -     -     -   -
LSTMN            -     49.3   87.3   -     -     -   -
Multi-Task       -     49.6   87.9   94.1  -     -   -
BLSTM            80.0  49.1   87.6   92.1  93.0  -   -
BLSTM-Att        81.0  49.8   88.2   93.5  93.8  -   -
BLSTM-2DPooling  81.5  50.5   88.3   93.7  94.8  -   -
BLSTM-2DCNN      82.3  52.4   89.5   94.0  96.1  -   -
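
Most of the BLSTM-based models above share the same backbone: embed the tokens, run a bidirectional LSTM, pool over time, and classify. A minimal PyTorch sketch of that backbone (the dimensions and max-over-time pooling are illustrative choices, not the exact configurations of the cited papers; BLSTM-2DPooling and BLSTM-2DCNN instead treat the LSTM output as a 2D feature map):

    import torch
    import torch.nn as nn

    class BLSTMClassifier(nn.Module):
        def __init__(self, vocab_size, num_classes, embed_dim=100, hidden_dim=100):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, embed_dim)
            self.lstm = nn.LSTM(embed_dim, hidden_dim,
                                batch_first=True, bidirectional=True)
            self.fc = nn.Linear(2 * hidden_dim, num_classes)

        def forward(self, token_ids):                 # (batch, seq_len)
            h, _ = self.lstm(self.embed(token_ids))   # (batch, seq, 2*hidden)
            pooled, _ = h.max(dim=1)                  # max over time steps
            return self.fc(pooled)                    # (batch, classes) logits

    logits = BLSTMClassifier(vocab_size=20000, num_classes=2)(
        torch.randint(0, 20000, (4, 30)))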

CNN

DCNN: Dynamic Convolutional Neural Network [15]

CNN-non-static: Convolutional Neural Networks; the pre-trained word vectors are fine-tuned for each task (sketched after the results table)

CNN-multichannel: Convolutional Neural Networks with two sets of word vectors [16]

TBCNN: Tree-based Convolutional Neural Network [17]

Molding-CNN: Molding Convolutional Neural Networks [18]

CNN-Ana: Non-static GloVe+word2vec CNN [19]

MVCNN: Multichannel Variable-Size Convolution [20]

DSCNN: Dependency Sensitive Convolutional Neural Networks [21]

Model             MR     SST-1  SST-2  Subj   TREC   CR     MPQA
DCNN              -      48.5   86.8   -      93.0   -      -
CNN-non-static    81.5   48.0   87.2   93.4   93.6   84.3   89.5
CNN-multichannel  81.1   47.4   88.1   93.2   92.2   85.0   89.4
TBCNN             -      51.4   87.9   -      96.0   -      -
Molding-CNN       -      51.2   88.6   -      -      -      -
CNN-Ana           81.02  45.98  85.45  93.66  91.37  84.65  89.55
MVCNN             -      49.6   89.4   -      -      -      -
DSCNN             81.5   49.7   89.1   93.2   95.4   -      -
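
The CNN-non-static and CNN-multichannel models of [16] apply convolutions of several widths over the word-embedding matrix, max-over-time pooling, and a final softmax layer. A minimal single-channel PyTorch sketch using the common 3/4/5-width, 100-filter setup (dropout and the loading/fine-tuning of pre-trained vectors are omitted):

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class TextCNN(nn.Module):
        def __init__(self, vocab_size, num_classes,
                     embed_dim=300, widths=(3, 4, 5), filters=100):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, embed_dim)
            self.convs = nn.ModuleList(
                nn.Conv1d(embed_dim, filters, w) for w in widths)
            self.fc = nn.Linear(filters * len(widths), num_classes)

        def forward(self, token_ids):                  # (batch, seq_len)
            x = self.embed(token_ids).transpose(1, 2)  # (batch, embed, seq)
            # One feature map per filter width, max-pooled over time.
            pooled = [F.relu(conv(x)).max(dim=2).values for conv in self.convs]
            return self.fc(torch.cat(pooled, dim=1))   # (batch, classes)

    logits = TextCNN(vocab_size=20000, num_classes=2)(
        torch.randint(0, 20000, (4, 30)))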

Others

RAE: Recursive Autoencoders with pre-trained word vectors from Wikipedia [22]

AdaSent: self-adaptive hierarchical sentence model [23]

RNTN: Recursive Neural Tensor Network [24]

DRNN: Deep Recursive Neural Networks [25] (the composition step shared by these recursive models is sketched after the results table)

Model    MR    SST-1  SST-2  Subj  TREC  CR    MPQA
RAE      77.7  43.2   82.4   -     -     -     86.4
AdaSent  83.1  -      -      95.5  92.4  86.3  93.3
RNTN     -     45.7   85.4   -     -     -     -
DRNN     -     49.8   86.6   -     -     -     -
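
The recursive models above (RAE, RNTN, DRNN) share one core operation: a parent node's vector is computed from its children's vectors bottom-up along a parse tree, and the root vector is classified. A toy NumPy sketch of that composition step (untrained random weights; RNTN additionally adds a bilinear tensor term, RAE adds a reconstruction loss, and DRNN stacks several recursive layers):

    import numpy as np

    rng = np.random.default_rng(0)
    d = 4                                       # toy vector dimensionality
    W = rng.normal(scale=0.1, size=(d, 2 * d))  # composition weights (untrained)
    b = np.zeros(d)

    def compose(left, right):
        """Parent vector from two child vectors, as in a recursive net."""
        return np.tanh(W @ np.concatenate([left, right]) + b)

    # Compose "not" and "good" into a phrase vector along a tiny parse tree;
    # a classifier (e.g. softmax) on the root vector would give the label.
    v_not, v_good = rng.normal(size=d), rng.normal(size=d)
    root = compose(v_not, v_good)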

References


  1. (ACL 2005) Seeing Stars: Exploiting Class Relationships For Sentiment Categorization With Respect To Rating Scales https://www.cs.cornell.edu/people/pabo/movie-review-data/

  2. (EMNLP 2013) Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank https://nlp.stanford.edu/sentiment/

  3. (ACL 2004) A Sentimental Education: Sentiment Analysis Using Subjectivity Summarization Based on Minimum Cuts http://www.cs.cornell.edu/people/pabo/movie-review-data

  4. (ACL 2002) Learning Question Classifiers http://cogcomp.org/Data/QA/QC/

  5. (SIGKDD 2004) Mining and Summarizing Customer Reviews http://www.cs.uic.edu/~liub/FBS/sentiment-analysis.html

  6. (Language Resources and Evaluation 2005) Annotating Expressions of Opinions and Emotions in Language http://mpqa.cs.pitt.edu/

  7. (ACL 2012) Baselines and Bigrams: Simple, Good Sentiment and Topic Classification

  8. (NIPS 2015) Skip-Thought Vectors

  9. (AAAI 2015) Recurrent Convolutional Neural Networks for Text Classification

  10. (ICML 2015) Long Short-Term Memory Over Recursive Structures

  11. (ACL 2015) Improved Semantic Representations From Tree-Structured Long Short-Term Memory Networks

  12. (EMNLP 2016) Long Short-Term Memory-Networks for Machine Reading

  13. (IJCAI 2016) Recurrent Neural Network for Text Classification with Multi-Task Learning

  14. (COLING 2016) Text Classification Improved by Integrating Bidirectional LSTM with Two-Dimensional Max Pooling

  15. (ACL 2014) A Convolutional Neural Network for Modelling Sentences

  16. (EMNLP 2014) Convolutional Neural Networks for Sentence Classification

  17. (EMNLP 2015) Discriminative Neural Sentence Modeling by Tree-Based Convolution

  18. (EMNLP 2015) Molding CNNs for Text: Non-linear, Non-consecutive Convolutions

  19. (IJCNLP 2017) A Sensitivity Analysis of (and Practitioners' Guide to) Convolutional Neural Networks for Sentence Classification

  20. (CoNLL 2015) Multichannel Variable-Size Convolution for Sentence Classification

  21. (NAACL 2016) Dependency Sensitive Convolutional Neural Networks for Modeling Sentences and Documents

  22. (EMNLP 2011) Semi-Supervised Recursive Autoencoders for Predicting Sentiment Distributions

  23. (IJCAI 2015) Self-Adaptive Hierarchical Sentence Model

  24. (EMNLP 2013) Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank

  25. (NIPS 2014) Deep Recursive Neural Networks for Compositionality in Language