Data curation with ontology functional dependences
dc.contributor.advisor | Szlichta, Jaroslaw | |
dc.contributor.author | Keller, Alexander | |
dc.date.accessioned | 2017-08-03T19:59:40Z | |
dc.date.accessioned | 2022-03-29T17:39:20Z | |
dc.date.available | 2017-08-03T19:59:40Z | |
dc.date.available | 2022-03-29T17:39:20Z | |
dc.date.issued | 2017-04-01 | |
dc.degree.discipline | Computer Science | |
dc.degree.level | Master of Science (MSc) | |
dc.description.abstract | Poor data quality has become a pervasive issue due to the increasing complexity and size of modern datasets. Existing cleaning solutions use functional dependencies (FDs) to model syntactic equivalence, but FDs cannot model semantic equivalence. We advance the state of data quality constraints by defining, discovering, and cleaning with Ontology Functional Dependencies. We define their theoretical foundations, including sound and complete axioms and a linear inference procedure. We develop algorithms for data verification, constraint discovery, data cleaning, and identification of inconsistencies between ontology and data, along with optimizations for each. Our experimental evaluation shows the scalability and accuracy of our algorithms. We show that ontology FDs are useful for capturing domain attribute relationships and can significantly reduce the number of false positive errors in data cleaning techniques that rely on traditional FDs. | en |
dc.description.sponsorship | University of Ontario Institute of Technology | en |
dc.identifier.uri | https://hdl.handle.net/10155/792 | |
dc.language.iso | en | en |
dc.subject | Constraints | en |
dc.subject | Data | en |
dc.subject | Quality | en |
dc.subject | Cleaning | en |
dc.subject | Discovery | en |
dc.title | Data curation with ontology functional dependences | en |
dc.type | Thesis | en |
thesis.degree.discipline | Computer Science | |
thesis.degree.grantor | University of Ontario Institute of Technology | |
thesis.degree.name | Master of Science (MSc) | |