Unsupervised brand name extraction using domain adaptation

Business intelligence and analytics is an area of research that analyzes existing business data to extract the insights needed for successful business planning. Textual data derived from tweets, forum posts, and blogs span different business domains and contain information that is useful to organizations. This thesis proposes a method for extracting brand and product names from text; brand names, as a subset of named entities, can reveal a great deal about the document as a whole. A context window is defined to capture the context of a word in a sentence. In addition, a word embedding model is trained locally to obtain a domain-specific model. Finally, a domain adaptation technique is employed to transfer knowledge from a domain with labeled data to a new domain. The results indicate a significant improvement in recall for extracting brand names from a new domain.
Natural language processing, Named entity recognition, Word embedding, Domain adaptation
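
As an illustration of the context window mentioned in the abstract, the following is a minimal sketch only, not the thesis's implementation. The function name `context_window`, the symmetric window of `size` tokens on each side, and the `<PAD>` token are all assumptions made for this example; the abstract does not specify how the window is defined.

```python
from typing import List


def context_window(tokens: List[str], index: int, size: int = 2,
                   pad: str = "<PAD>") -> List[str]:
    """Return `size` tokens on each side of tokens[index].

    Hypothetical helper: the window shape and padding scheme are
    assumptions for illustration, not taken from the thesis.
    """
    if not 0 <= index < len(tokens):
        raise IndexError("index out of range")
    # Tokens to the left and right of the target word (target excluded).
    left = tokens[max(0, index - size):index]
    right = tokens[index + 1:index + 1 + size]
    # Pad at sentence boundaries so every window has the same length.
    left = [pad] * (size - len(left)) + left
    right = right + [pad] * (size - len(right))
    return left + right


tokens = "I bought a new Nikon camera yesterday".split()
window = context_window(tokens, tokens.index("Nikon"), size=2)
print(window)
```

In a pipeline of the kind the abstract describes, the tokens in such a window could be mapped to vectors from the locally trained word embedding model and used as features for deciding whether the target word is a brand name.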