Opinion Mining and Sentiment Analysis
A number of faculty at SF State are interested in textual analysis, natural language processing, and data mining. This article, "Opinion Mining and Sentiment Analysis" by B. Pang and L. Lee (Foundations and Trends in Information Retrieval, 2, 2008), discusses the rapid emergence, since 2001, of the area of sentiment analysis: information extraction aimed expressly at summarizing sentiment and opinion. With applications ranging from intelligence analysis to marketing and movie reviews, the article is a gold mine of information on this fast-moving and methodologically complex field.
FOUNDATIONS AND TRENDS IN INFORMATION RETRIEVAL BOOKS
AUTHORSHIP ATTRIBUTION
by Patrick Juola (Duquesne University, USA)
Authorship attribution, the science of inferring characteristics of the author from the characteristics of documents written by that author, is a problem with a long history and a wide range of application. It is an important problem not only in information retrieval but in many other disciplines as well, from technology to teaching and from finance to forensics. The idea that authors have a statistical "fingerprint" that can be detected by computers is a compelling one that has received a lot of research attention.
Authorship Attribution surveys the history and present state of the discipline, presenting some comparative results where available. It also provides a theoretical and empirically tested basis for further work. Many modern techniques are described and evaluated, along with some insights into their application, for novices and experts alike.
Authorship Attribution will be of particular interest to information retrieval researchers and students who want to keep up with the latest techniques and their applications. It is also a useful resource for people in other disciplines, be it the teacher interested in plagiarism detection or the historian interested in who wrote a particular document.
MUSIC RETRIEVAL
A Tutorial and Review
by Nicola Orio (University of Padova, Italy)
Music Accessing and Retrieval is the first comprehensive survey of the vast new field of Music Information Retrieval (MIR). It describes a number of issues which are peculiar to the language of music — including forms, formats, and dimensions of music — together with the typologies of users and their information needs. To fulfil these needs a number of approaches are discussed, from direct search to information filtering and clustering of music documents. The emphasis is on tools, techniques, and approaches for content-based MIR, rather than on the systems that implement them. The interested reader can, however, find descriptions of more than 35 systems for music retrieval with links to their Web sites.
Music Accessing and Retrieval can be used as both a guide for beginners who are embarking on research in this relatively new area, and a useful reference for established researchers in this field.
OPEN-DOMAIN QUESTION ANSWERING
by John Prager (IBM T.J. Watson Research Center)
Open-Domain Question Answering is an introduction to the field of Question Answering (QA). It covers the basic principles of QA along with a selection of systems that have exhibited interesting and significant techniques, so it serves more as a tutorial than as an exhaustive survey of the field.
Starting with a brief history of the field, it goes on to describe the architecture of a QA system before analysing in detail some of the specific approaches that have been successfully deployed by academia and industry in designing and building such systems.
Open-Domain Question Answering is both a guide for beginners who are embarking on research in this area, and a useful reference for established researchers and practitioners in this field.
OPINION MINING AND SENTIMENT ANALYSIS
by Bo Pang (Yahoo! Research, USA) & Lillian Lee (Cornell University, USA)
An important part of our information-gathering behavior has always been to find out what other people think. With the growing availability and popularity of opinion-rich resources such as online review sites and personal blogs, new opportunities and challenges arise as people can, and do, actively use information technologies to seek out and understand the opinions of others. The sudden eruption of activity in the area of opinion mining and sentiment analysis, which deals with the computational treatment of opinion, sentiment, and subjectivity in text, has thus occurred at least in part as a direct response to the surge of interest in new systems that deal directly with opinions as a first-class object.
Opinion Mining and Sentiment Analysis covers techniques and approaches that promise to directly enable opinion-oriented information-seeking systems. The focus is on methods that seek to address the new challenges raised by sentiment-aware applications, as compared to those that are already present in more traditional fact-based analysis. The survey includes an enumeration of the various applications and a look at general challenges, and it discusses categorization, extraction, and summarization. Finally, it moves beyond just the technical issues, devoting significant attention to the broader implications of the development of opinion-oriented information-access services: questions of privacy, vulnerability to manipulation, and whether or not reviews can have measurable economic impact. To facilitate future work, a discussion of available resources, benchmark datasets, and evaluation campaigns is also provided.
Opinion Mining and Sentiment Analysis is the first such comprehensive survey of this vibrant and important research area and will appeal to anyone with an interest in opinion-oriented information-seeking systems.