Unsupervised brand name extraction using domain adaptation

Date
2019-08-01
Abstract
Business intelligence and analytics is an area of research that analyzes existing business data to extract the insights needed for successful business planning. Textual data from tweets, forum posts, and blogs span different business domains and contain useful information for organizations. This thesis proposes a method for extracting brand and product names from text. Brand names, as a subset of named entities, can reveal a great deal about a document as a whole. The method has three parts. First, a context window is defined to capture the context of a word in a sentence. Second, a word embedding model is trained locally to obtain a domain-specific model. Finally, a domain adaptation technique is employed to transfer knowledge from one domain with labeled data to a new domain. The results indicate a significant improvement in recall for extracting brand names from a new domain.
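The context window mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state the window size, the padding scheme, or the tokenization, so the function name `context_window`, the `PAD` token, the default size of 2, and the example sentence with the made-up brand "Acme" are all assumptions for illustration, not the thesis's actual implementation.

```python
from typing import List

PAD = "<PAD>"  # hypothetical padding token, not taken from the thesis


def context_window(tokens: List[str], index: int, size: int = 2) -> List[str]:
    """Return the target token with `size` neighbors on each side.

    Positions that fall outside the sentence are filled with PAD so that
    every window has the same length (2 * size + 1).
    """
    if not 0 <= index < len(tokens):
        raise IndexError("index out of range")
    left = tokens[max(0, index - size):index]
    right = tokens[index + 1:index + 1 + size]
    # Pad on the outer edges so the target always sits in the middle.
    left = [PAD] * (size - len(left)) + left
    right = right + [PAD] * (size - len(right))
    return left + [tokens[index]] + right


# Whitespace tokenization is an assumption made to keep the sketch short.
sentence = "I bought a new Acme blender yesterday".split()
window = context_window(sentence, sentence.index("Acme"))
```

In a pipeline like the one the abstract outlines, each token in such a window would presumably be mapped to its vector from the locally trained embedding model before being passed to a classifier. That step is omitted here because the abstract gives no details about it.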
Keywords
Natural language processing, Named entity recognition, Word embedding, Domain adaptation