Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cli improvements #33

Open
wants to merge 5 commits into
base: main
Choose a base branch
from
Open

Cli improvements #33

wants to merge 5 commits into from

Conversation

gkronber
Copy link
Member

@gkronber gkronber commented Feb 1, 2024

Additionally, the solutions in the pareto front could also be linearly scaled before printing (as for the best individual).

@foolnotion
Copy link
Member

This is great, thanks! My only concern is that this extra output will make it more difficult to call the cli program from scripts and parse the output. Perhaps the cli tools should have a verbosity setting or some dedicated command line argument for this?

@gkronber
Copy link
Member Author

gkronber commented Feb 1, 2024

A command line option to turn on output of the Pareto front would be good yes.

…s in the Pareto front, and apply scaling for the reports only when linear-scaling is turned on.
@gkronber
Copy link
Member Author

gkronber commented Apr 4, 2024

In my last commit I also changed the code to scale the evaluation results for the progress report only when linear scaling is turned on. Instead of the if-statements in the lambdas I tried to change the taskflow graph (see commented code), but this produced segfaults in my tests.

@foolnotion
Copy link
Member

In my last commit I also changed the code to scale the evaluation results for the progress report only when linear scaling is turned on. Instead of the if-statements in the lambdas I tried to change the taskflow graph (see commented code), but this produced segfaults in my tests.

I believe the segfaults were caused by the fact that the linear scaling subtasks were still added to the taskflow (via taskflow.emplace(...)) even though when scaling == false they are never part of the graph (no .precede or .succeed). This patch solves the issue:

diff --git a/cli/source/operon_nsgp.cpp b/cli/source/operon_nsgp.cpp
index 867dd24..685a52c 100644
--- a/cli/source/operon_nsgp.cpp
+++ b/cli/source/operon_nsgp.cpp
@@ -299,7 +299,8 @@ auto main(int argc, char** argv) -> int
             // scale values
             Operon::Scalar a{1.0};
             Operon::Scalar b{0.0};
-            auto linearScaling = taskflow.emplace([&]() {
+
+            auto fitLeastSquares = [&]() {
                 auto [a_, b_] = Operon::FitLeastSquares(estimatedTrain, targetTrain);
                 a = static_cast<Operon::Scalar>(a_);
                 b = static_cast<Operon::Scalar>(b_);
@@ -317,7 +318,17 @@ auto main(int argc, char** argv) -> int
                 if (nodes.size() > sz) {
                     best.Genotype.UpdateNodes();
                 }
-            });
+            };
+
+            auto scaleTrain = [&]() {
+                Eigen::Map<Eigen::Array<Operon::Scalar, -1, 1>> estimated(estimatedTrain.data(), std::ssize(estimatedTrain));
+                estimated = estimated * a + b;
+            };
+
+            auto scaleTest = [&]() {
+                Eigen::Map<Eigen::Array<Operon::Scalar, -1, 1>> estimated(estimatedTest.data(), std::ssize(estimatedTest));
+                estimated = estimated * a + b;
+            };
 
             double r2Train{};
             double r2Test{};
@@ -326,16 +337,6 @@ auto main(int argc, char** argv) -> int
             double maeTrain{};
             double maeTest{};
 
-            auto scaleTrain = taskflow.emplace([&]() {
-                Eigen::Map<Eigen::Array<Operon::Scalar, -1, 1>> estimated(estimatedTrain.data(), std::ssize(estimatedTrain));
-                estimated = estimated * a + b;
-            });
-
-            auto scaleTest = taskflow.emplace([&]() {
-                Eigen::Map<Eigen::Array<Operon::Scalar, -1, 1>> estimated(estimatedTest.data(), std::ssize(estimatedTest));
-                estimated = estimated * a + b;
-            });
-
             auto calcStats = taskflow.emplace([&]() {
                 // negate the R2 because this is an internal fitness measure (minimization) which we here repurpose
                 r2Train = -Operon::R2{}(estimatedTrain, targetTrain);
@@ -358,9 +359,16 @@ auto main(int argc, char** argv) -> int
             auto calculateOffMemory = taskflow.transform_reduce(off.begin(), off.end(), totalMemory, std::plus{}, [&](auto const& ind) { return getSize(ind); });
 
             // define task graph
-            linearScaling.succeed(evalTrain, evalTest);
-            linearScaling.precede(scaleTrain, scaleTest);
-            calcStats.succeed(scaleTrain, scaleTest);
+            if (scale) {
+                auto scaleModelTask = taskflow.emplace(fitLeastSquares);
+                auto scaleTrainTask = taskflow.emplace(scaleTrain);
+                auto scaleTestTask = taskflow.emplace(scaleTest);
+                scaleModelTask.precede(scaleTrainTask, scaleTestTask);
+                scaleModelTask.succeed(evalTrain, evalTest);
+                calcStats.succeed(scaleTrainTask, scaleTestTask);
+            } else {
+                calcStats.succeed(evalTrain, evalTest);
+            }
             calcStats.precede(calculateLength, calculateQuality, calculatePopMemory, calculateOffMemory);
 
             executor.corun(taskflow);

@gkronber
Copy link
Member Author

gkronber commented Apr 8, 2024

Thanks. I will incorporate this in the branch (also for operon_gp), and will move the Scale() function to utils.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants