Thicker Pareto front #364

Open
MilesCranmer opened this issue Oct 27, 2024 · 3 comments

@MilesCranmer
Owner

I'm doing some hyperparameter tuning and it really seems to prefer migrating from the hall of fame rather than between populations. It could be because the tuning was only over 2-minute runs, so the hyperparameters are over-optimized for quickly finding a result, but it makes me curious whether a thicker Pareto front would help (i.e., multiple best expressions), sort of like NSGA-II. The tricky thing is making sure the other levels of the Pareto front are diverse.
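
For reference, a minimal sketch of NSGA-II-style non-dominated sorting, where a "thicker" front would keep the first k fronts rather than only the best one. The (loss, complexity) pairs and the two-front cutoff below are illustrative, not anything from the codebase:

```python
# Sort candidates into successive Pareto fronts over objectives to minimize.
def dominates(a, b):
    """True if a is no worse than b everywhere and strictly better somewhere."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def nondominated_fronts(scores):
    """Front 0 is the usual Pareto front; later fronts are successively worse."""
    remaining = list(range(len(scores)))
    fronts = []
    while remaining:
        front = [i for i in remaining
                 if not any(dominates(scores[j], scores[i]) for j in remaining)]
        fronts.append(front)
        remaining = [i for i in remaining if i not in front]
    return fronts

# Hypothetical (loss, complexity) pairs for five expressions:
scores = [(0.10, 12), (0.12, 8), (0.30, 3), (0.11, 12), (0.05, 20)]
fronts = nondominated_fronts(scores)
thick_front = [i for f in fronts[:2] for i in f]  # keep the first two fronts
print(fronts, thick_front)
```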

@larsentom

Could you use a secondary complexity metric, like mean subtree size, alongside the total expression size? In my experiments with different tools, mean subtree size together with the total number of nodes has worked well for finding better equations. This turns it into a ≥3-objective problem, but if the metrics are chosen well, it can allow more diversity on the front without blowing out the number of nondominated solutions too much.
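
A rough sketch of such a metric, assuming expressions are plain nested tuples of the form (symbol, *children); the exact definition of mean subtree size may differ from the one intended above:

```python
def subtree_size(tree):
    """Node count of a tree given as (symbol, child, child, ...)."""
    return 1 + sum(subtree_size(c) for c in tree[1:])

def all_subtree_sizes(tree):
    """Sizes of the subtrees rooted at every node, root first."""
    sizes = [subtree_size(tree)]
    for child in tree[1:]:
        sizes.extend(all_subtree_sizes(child))
    return sizes

# Illustrative expression: x + x * c
expr = ("+", ("x",), ("*", ("x",), ("c",)))
sizes = all_subtree_sizes(expr)
print(sizes[0], sum(sizes) / len(sizes))  # total nodes, mean subtree size
```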

You could also keep a fixed-size 'central island', à la the SPEA2 archive, of say 25–100 individuals, where the closest individuals (by normalized Euclidean distance) are removed to hold the archive at size. That gives you a very diverse pool of solutions to potentially feed back into the island populations.
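
A sketch of that truncation scheme, again assuming each member is a tuple of objective values. Note that real SPEA2 truncates by k-th-nearest-neighbor distance; this simplified version just drops one member of the closest pair, as described above:

```python
import math

def normalize(vectors):
    """Min-max normalize each objective across the archive."""
    lo = [min(v[i] for v in vectors) for i in range(len(vectors[0]))]
    hi = [max(v[i] for v in vectors) for i in range(len(vectors[0]))]
    return [tuple((x - l) / (h - l) if h > l else 0.0
                  for x, l, h in zip(v, lo, hi)) for v in vectors]

def truncate(archive, capacity):
    """Repeatedly remove one member of the closest pair until at capacity."""
    archive = list(archive)
    while len(archive) > capacity:
        norm = normalize(archive)
        closest = None
        for i in range(len(norm)):
            for j in range(i + 1, len(norm)):
                d = math.dist(norm[i], norm[j])
                if closest is None or d < closest[0]:
                    closest = (d, j)
        archive.pop(closest[1])  # drop one of the two most crowded members
    return archive

archive = [(0.10, 12), (0.11, 12), (0.30, 3), (0.05, 20), (0.12, 8)]
print(truncate(archive, 3))
```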

@larsentom

What do you think about using something like the hash-based tree similarity measure from Burlacu, Affenzeller, Kronberger, and Kommenda to improve diversity across NSGA-II fronts?

In my experiments, standard NSGA-II implementations often end up with individuals that are pretty similar across fronts. SPEA2 does seem to drive more diversity in both solutions and their objectives, but it’s much slower.
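
A rough Python sketch in the spirit of that measure (not a faithful reimplementation of the paper): hash every subtree bottom-up, then score two trees by the Sørensen–Dice overlap of their subtree-hash multisets:

```python
from collections import Counter

def subtree_hashes(tree, acc=None):
    """Return (root_hash, all_hashes) for a (symbol, *children) tuple tree."""
    if acc is None:
        acc = []
    child_hashes = tuple(subtree_hashes(c, acc)[0] for c in tree[1:])
    h = hash((tree[0], child_hashes))  # Merkle-style: hash symbol + child hashes
    acc.append(h)
    return h, acc

def similarity(t1, t2):
    """Dice coefficient over subtree-hash multisets (1.0 = structurally identical)."""
    h1 = Counter(subtree_hashes(t1)[1])
    h2 = Counter(subtree_hashes(t2)[1])
    common = sum((h1 & h2).values())
    return 2.0 * common / (sum(h1.values()) + sum(h2.values()))

a = ("+", ("x",), ("*", ("x",), ("c",)))
b = ("+", ("x",), ("*", ("x",), ("x",)))
print(similarity(a, b))  # the trees share only their x leaves -> 0.4
```

One refinement from the paper's setting that this sketch omits: sorting child hashes for commutative operators, so that x + y and y + x hash identically.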

@MilesCranmer
Owner Author

Very interesting, thanks! For measuring similarity, I wonder if we should just measure it in the predicted-y space rather than trying to do so in symbolic space, which seems unsolvable.
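
A minimal sketch of that idea: evaluate both candidates on a shared input grid and compare the prediction vectors, here via absolute Pearson correlation (the metric and the grid are both assumptions, not anything settled above):

```python
import numpy as np

def behavioral_similarity(f, g, X):
    """|corr| between two models' predictions on the same inputs X."""
    ya, yb = f(X), g(X)
    if np.std(ya) == 0 or np.std(yb) == 0:
        return float(np.allclose(ya, yb))  # constant outputs: exact-match check
    return abs(np.corrcoef(ya, yb)[0, 1])

X = np.linspace(-3, 3, 100)
# sin(x) vs. its cubic Taylor approximation: symbolically distant, behaviorally close
print(behavioral_similarity(np.sin, lambda x: x - x**3 / 6, X))
```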
