New frontiers in population-based multi-objective feature selection

dc.contributor.advisorRahnamayan, Shahryar
dc.contributor.advisorMakrehchi, Masoud
dc.contributor.authorZanjani Miyandoab, Sevil
dc.date.accessioned2024-06-25T16:12:06Z
dc.date.available2024-06-25T16:12:06Z
dc.date.issued2024-04-01
dc.degree.disciplineElectrical and Computer Engineering
dc.degree.levelMaster of Applied Science (MASc)
dc.description.abstractFeature selection is a persistent challenge aimed at minimizing the number of features while maximizing accuracy of classification, or any other machine learning and data mining task, by mitigating the curse of dimensionality. We frame feature selection as a multi-objective binary optimization task with the objectives of enhancing accuracy and reducing the feature count. Given that feature selection, along with binary optimization problems in general, is a NP-hard problem since the size of search space increases exponentially by the increase of the number of features, so especially in high-dimensional spaces, it can be very challenging. We propose three innovative approaches to tackle large-scale multi-objective feature selection. The first technique involves an augmentation to the diversity of the population in the well-established multi-objective scheme of the genetic algorithm, NSGA-II, which is achieved through the substitution of the worst individuals with new randomly generated individuals with a limited number of features in each generation. As the second method, a binary Compact NSGA-II (CNSGA-II) algorithm has been introduced for feature selection for the first time, which represents the population as a probability distribution not only to be more memory-efficient but also to accelerate finding a better candidate solution. Additionally, we present a novel binary multi-objective coordinate search (MOCS) algorithm, which, to the best of our knowledge, is the first of its kind, demonstrating effectiveness in solving multi-objective binary optimization problems. A comparative study on the proposed methods and NSGA-II showcases the promising performance of our methods in feature selection tasks involving large-scale datasets, surpassing the renowned NSGA-II algorithm. These methods are not limited to feature selection but can be applied to various binary optimization domains, including the Knapsack problem and training Binary Deep Neural Networks.
dc.description.sponsorshipUniversity of Ontario Institute of Technology
dc.identifier.urihttps://hdl.handle.net/10155/1804
dc.language.isoen
dc.subject.otherLarge-scale feature selection
dc.subject.otherMulti-objective feature selection
dc.subject.otherMulti-objective optimization
dc.subject.otherNSGA-II
dc.subject.otherMulti-objective coordinate search
dc.titleNew frontiers in population-based multi-objective feature selection
dc.typeThesis
thesis.degree.disciplineElectrical and Computer Engineering
thesis.degree.grantorUniversity of Ontario Institute of Technology
thesis.degree.nameMaster of Applied Science (MASc)

Files

Original bundle

Now showing 1 - 1 of 1
Loading...
Thumbnail Image
Name:
Zanjani_Miyandoab_Sevil.pdf
Size:
2.23 MB
Format:
Adobe Portable Document Format

License bundle

Now showing 1 - 1 of 1
No Thumbnail Available
Name:
license.txt
Size:
1.92 KB
Format:
Item-specific license agreed upon to submission
Description: