Abstract
We describe a bootstrapping algorithm for learning from partially labeled data, and report the results of an empirical study that uses it to improve sentiment classification performance with up to 15 million unlabeled Amazon product reviews. Our experiments cover semi-supervised learning, domain adaptation, and weakly supervised learning. In some cases, our methods reduced test error by more than half by using such a large amount of data.
Original language | English |
---|---|
Pages (from-to) | 175-190 |
Number of pages | 16 |
Journal | Journal of Machine Learning Research |
Volume | 25 |
State | Published - 2012 |
Event | 4th Asian Conference on Machine Learning, ACML 2012 - Singapore, Singapore; Duration: 4 Nov 2012 → 6 Nov 2012 |
Keywords
- Domain adaptation
- Semi-supervised learning
- Sentiment analysis
- Weakly supervised learning
All Science Journal Classification (ASJC) codes
- Software
- Control and Systems Engineering
- Statistics and Probability
- Artificial Intelligence