LLM-powered active learning for cost-effective text classification

dc.contributor.advisor: Makrehchi, Masoud
dc.contributor.author: Rouzegar, Hamidreza
dc.date.accessioned: 2024-12-03T16:53:03Z
dc.date.available: 2024-12-03T16:53:03Z
dc.date.issued: 2024-10-01
dc.description.abstract: This thesis presents an LLM-powered active learning framework for cost-effective text classification, addressing the challenge of potential LLM annotation errors while balancing annotation quality and model accuracy. Our methodology combines human and large language model (LLM) annotations using uncertainty sampling and confidence scoring. Starting with a small, labeled seed set, the model iteratively selects the most informative data points for annotation, reducing labeling costs while maximizing performance. To simulate real-world scenarios, a dynamically updated proxy validation set mirrors the distribution of the unlabeled pool, enabling reliable performance estimation throughout training. The Performance Improvement Cost Ratio (PICR) is introduced as an objective stopping criterion to optimize the balance between costs and accuracy gains. Additionally, role-based prompting enhances annotation quality, creating a scalable framework adaptable to diverse text classification tasks. Experimental results demonstrate that the proposed approach achieves human-comparable performance at reduced costs, underscoring its potential for practical applications.
dc.identifier.uri: https://hdl.handle.net/10155/1867
dc.language.iso: en
dc.subject.other: LLMs
dc.subject.other: Text classification
dc.subject.other: Active learning
dc.subject.other: Smart annotation
dc.subject.other: Role design
dc.title: LLM-powered active learning for cost-effective text classification
dc.type: Thesis
thesis.degree.discipline: Electrical and Computer Engineering
thesis.degree.grantor: University of Ontario Institute of Technology
thesis.degree.name: Master of Applied Science (MASc)
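
Illustrative note: the abstract describes selecting the most informative points by uncertainty sampling and combining LLM and human annotations through confidence scoring. The sketch below shows one common way such a step can look, using predictive entropy as the uncertainty measure and a fixed confidence cutoff for routing. It is not taken from the thesis: the function names, the entropy criterion, and the 0.8 threshold are all assumptions made for illustration, and the PICR stopping criterion is not shown because its definition is not given in this record.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_most_uncertain(pool_probs, k):
    """Return indices of the k pool items with the highest predictive entropy.

    pool_probs is a list of per-item class probability lists, as a
    classifier's probability output would provide (hypothetical input).
    """
    ranked = sorted(
        range(len(pool_probs)),
        key=lambda i: entropy(pool_probs[i]),
        reverse=True,
    )
    return ranked[:k]


def route_annotation(llm_label, llm_confidence, threshold=0.8):
    """Keep the LLM label if its confidence clears the threshold.

    Otherwise the item is deferred to a human annotator. The threshold
    value is an assumption, not a figure from the thesis.
    """
    if llm_confidence >= threshold:
        return ("llm", llm_label)
    return ("human", None)


# Toy unlabeled pool: class probabilities for four documents.
pool_probs = [[0.98, 0.02], [0.55, 0.45], [0.70, 0.30], [0.51, 0.49]]
picked = select_most_uncertain(pool_probs, k=2)
print(picked)
print(route_annotation("positive", 0.92))
print(route_annotation("positive", 0.40))
```

In this toy pool the two documents whose probabilities are closest to an even split are selected first. A confident LLM label is accepted directly, while a low-confidence one is sent to a human, which is the cost and quality trade-off the abstract refers to.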

Files

Original bundle

Name: Rouzegar, Hamidreza.pdf
Size: 1.1 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 1.89 KB
Format: Item-specific license agreed upon to submission