Unified processing of natural language and relational data

dc.contributor.advisor: Pu, Ken
dc.contributor.advisor: Davoudi, Kourosh
dc.contributor.author: Stoica, Andrei
dc.date.accessioned: 2022-10-17T19:40:42Z
dc.date.available: 2022-10-17T19:40:42Z
dc.date.issued: 2022-09-01
dc.degree.discipline: Computer Science
dc.degree.level: Master of Science (MSc)
dc.description.abstract: This work outlines a method for performing natural language tasks within a relational framework. It uses PostgreSQL as the relational database and relies on its extensibility to support word embedding without leaving the database. The system can be extended to incorporate several natural language processing (NLP) techniques, such as latent Dirichlet allocation (LDA), or modern models, such as BERT. Combining NLP with relational operations allows text to be analyzed, and data extracted from it, in the same interface used for general data analysis. This makes it possible to gather richer information from existing sources and makes it all available from one standard interface. The declarative nature of SQL allows for more ad hoc application of NLP techniques. Two case studies using the DBLP dataset demonstrate the power of this integration. The case studies involve building an LDA model, augmenting the topic labels for greater descriptiveness, and applying preexisting models for semantic analysis. [en]
dc.description.sponsorship: University of Ontario Institute of Technology [en]
dc.identifier.uri: https://hdl.handle.net/10155/1547
dc.language.iso: en [en]
dc.subject: Query language [en]
dc.subject: Database [en]
dc.subject: Natural language processing [en]
dc.subject: Embedding vectors [en]
dc.subject: Text processing [en]
dc.title: Unified processing of natural language and relational data [en]
dc.type: Thesis [en]
thesis.degree.discipline: Computer Science
thesis.degree.grantor: University of Ontario Institute of Technology
thesis.degree.name: Master of Science (MSc)

Files

Original bundle

Name: Stoica_Andrei.pdf
Size: 742.78 KB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 1.68 KB
Format: Item-specific license agreed upon to submission