diff --git a/.github/ISSUE_TEMPLATE/bug-report.md b/.github/ISSUE_TEMPLATE/bug-report.md new file mode 100644 index 0000000..36636d2 --- /dev/null +++ b/.github/ISSUE_TEMPLATE/bug-report.md @@ -0,0 +1,20 @@ +--- +name: Bug Report +about: Create a bug report to help us improve +title: "[BUG]: " +labels: bug +assignees: '' + +--- + +**Describe the bug** +A clear and concise description of what the bug is. + +**To Reproduce** +Steps to reproduce the behaviour. + +**Expected behaviour** +A clear and concise description of what you expected to happen. + +**Additional context** +Add any other context about the problem here, including any relevant system information and python version. diff --git a/.github/ISSUE_TEMPLATE/feature-request.md b/.github/ISSUE_TEMPLATE/feature-request.md new file mode 100644 index 0000000..5bbb680 --- /dev/null +++ b/.github/ISSUE_TEMPLATE/feature-request.md @@ -0,0 +1,20 @@ +--- +name: Feature Request +about: Suggest an idea for this project +title: "[FEATURE]: " +labels: enhancement +assignees: '' + +--- + +**Is your feature request related to a problem? Please describe.** +A clear and concise description of what the problem is. Ex. I'm always frustrated when [...] + +**Describe the solution you'd like** +A clear and concise description of what you want to happen. + +**Describe alternatives you've considered** +A clear and concise description of any alternative solutions or features you've considered. + +**Additional context** +Add any other context or screenshots about the feature request here. 
diff --git a/.github/ISSUE_TEMPLATE/framework-request.md b/.github/ISSUE_TEMPLATE/framework-request.md
new file mode 100644
index 0000000..f009de1
--- /dev/null
+++ b/.github/ISSUE_TEMPLATE/framework-request.md
@@ -0,0 +1,23 @@
+---
+name: Algorithm Request
+about: Suggest a new algorithm to be added to this project
+title: "[ALGORITHM]: "
+labels: enhancement
+assignees: ''
+
+---
+
+**Which algorithm would you like added to this project?**
+- algorithm name
+- link to academic paper
+
+**Why should this algorithm be added?**
+What benefit is there to adding this algorithm?
+
+**Short summary of algorithm**
+What is the core idea behind the algorithm, in simple terms? Please give a general overview rather than advanced algorithmic concepts.
+
+**Which algorithm does this build upon, if any?**
+
+**Are you willing to submit a PR?**
+Are you willing to work on this implementation and submit a PR?
diff --git a/.github/ISSUE_TEMPLATE/question.md b/.github/ISSUE_TEMPLATE/question.md
new file mode 100644
index 0000000..bddea02
--- /dev/null
+++ b/.github/ISSUE_TEMPLATE/question.md
@@ -0,0 +1,13 @@
+---
+name: Question
+about: Ask a question about this project
+title: "[Q]: "
+labels: question
+assignees: ''
+
+---
+
+**Ask Away!**
+
+- Please double-check the [docs](https://ruck.dontpanic.sh) to make sure that your question is not already answered there.
+- Please double-check the issues to make sure that your question has not been answered before.
diff --git a/.github/workflows/python-publish.yml b/.github/workflows/python-publish.yml new file mode 100644 index 0000000..0c23870 --- /dev/null +++ b/.github/workflows/python-publish.yml @@ -0,0 +1,34 @@ +# This workflow will upload a Python Package +# using Twine when a release is created + +name: publish + +on: + release: + types: [created] + +jobs: + deploy: + + runs-on: ubuntu-latest + + steps: + - uses: actions/checkout@v2 + + - name: Set up Python + uses: actions/setup-python@v2 + with: + python-version: '3.x' + + - name: Install dependencies + run: | + python3 -m pip install --upgrade pip + python3 -m pip install setuptools wheel twine + + - name: Build and publish + env: + TWINE_USERNAME: __token__ + TWINE_PASSWORD: ${{ secrets.PYPI_TOKEN }} + run: | + python3 setup.py sdist bdist_wheel + python3 -m twine upload dist/* diff --git a/.github/workflows/python-test.yml b/.github/workflows/python-test.yml new file mode 100644 index 0000000..e84d369 --- /dev/null +++ b/.github/workflows/python-test.yml @@ -0,0 +1,44 @@ +# This workflow will install Python dependencies, +# then run tests over a variety of Python versions. 
+ +name: test + +on: + push: + branches: [ main, dev ] + tags: [ '*' ] + pull_request: + branches: [ main, dev ] + +jobs: + build: + runs-on: ${{ matrix.os }} + strategy: + matrix: + os: [ubuntu-latest] # [ubuntu-latest, windows-latest, macos-latest] + python-version: [3.8] + + steps: + - uses: actions/checkout@v2 + + - name: Set up Python ${{ matrix.python-version }} + uses: actions/setup-python@v2 + with: + python-version: ${{ matrix.python-version }} + + - name: Install dependencies + run: | + python3 -m pip install --upgrade pip + python3 -m pip install -r requirements.txt + python3 -m pip install -r requirements-test.txt + + - name: Test with pytest + run: | + python3 -m pytest --cov=ruck tests/ + + - uses: codecov/codecov-action@v1 + with: + token: ${{ secrets.CODECOV_TOKEN }} + fail_ci_if_error: true + # codecov automatically merges all generated files + # if: matrix.os == 'ubuntu-latest' && matrix.python-version == 3.9 diff --git a/README.md b/README.md index bee9f26..4009b92 100644 --- a/README.md +++ b/README.md @@ -1,2 +1,280 @@ -# ruck -Efficient Parallel Genetic Algorithms For Python + +

+🧬 Ruck
+
+Performant evolutionary algorithms for Python
+
+[badges: license, python versions, pypi version, tests status]
+
+Visit the docs for more info, or browse the releases.
+
+Contributions are welcome!
+
+------------------------
+
+## Goals
+
+Ruck aims to meet the following criteria:
+
+1. Provide **high quality**, **readable** implementations of algorithms.
+2. Be easily **extensible** and **debuggable**.
+3. Be **performant** while remaining simple.
+
+## Citing Ruck
+
+Please use the following citation if you use Ruck in your research:
+
+```bibtex
+@Misc{Michlo2021Ruck,
+  author =       {Nathan Juraj Michlo},
+  title =        {Ruck - Performant evolutionary algorithms for Python},
+  howpublished = {GitHub},
+  year =         {2021},
+  url =          {https://github.com/nmichlo/ruck}
+}
+```
+
+## Overview
+
+Ruck takes inspiration from PyTorch Lightning's module system. The population creation,
+offspring generation, evaluation and selection steps are all contained within a single module
+inheriting from `EaModule`, while the training logic is kept separate in the `Trainer` class.
+
+`Member`s of a `Population` (a list of `Member`s) are intended to be read-only. Members should not
+be modified; instead, new members should be created with the modified values. This makes it
+easy to implement efficient multi-threading, see below!
+
+The trainer automatically constructs `HallOfFame` and `Logbook` objects which keep track of your
+population and offspring. `EaModule` provides defaults for `get_stats_groups` that can be overridden
+if you wish to customize the tracked statistics.
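The read-only convention described in the overview can be illustrated without Ruck itself. The following is a minimal sketch using only the standard library; `FrozenMember` and `flip_bits` are hypothetical names chosen for this illustration and are not part of Ruck's API. It shows a mutation step that returns a new, unevaluated member instead of modifying its parent.

```python
import random
from dataclasses import dataclass
from typing import Optional, Tuple


# Hypothetical stand-in for a read-only population member. This is NOT
# Ruck's `Member` class, only an illustration of the convention.
@dataclass(frozen=True)
class FrozenMember:
    value: Tuple[bool, ...]
    fitness: Optional[float] = None


def flip_bits(member: FrozenMember, p: float, rng: random.Random) -> FrozenMember:
    """Return a NEW unevaluated member with each bit flipped with probability p."""
    flipped = tuple(bit ^ (rng.random() < p) for bit in member.value)
    return FrozenMember(value=flipped)


rng = random.Random(0)
parent = FrozenMember(value=(False,) * 8, fitness=0.0)
child = flip_bits(parent, p=0.5, rng=rng)

# the parent is left untouched, so it can be shared safely between workers
assert parent.value == (False,) * 8
assert child is not parent and child.fitness is None
```

Because a parent is never changed after creation, several workers can read the same member concurrently without locks or defensive copies, which is the property the overview relies on.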
+
+
+### Minimal OneMax Example
+
+```python
+import random
+import numpy as np
+from ruck import *
+
+
+class OneMaxMinimalModule(EaModule):
+    """
+    Minimal onemax example
+    - The goal is to flip all the bits of a boolean array to True
+    - Offspring are generated as bit flipped versions of the previous population
+    - Selection tournament is performed between the previous population and the offspring
+    """
+
+    # evaluate unevaluated members according to their values
+    def evaluate_values(self, values):
+        return [v.sum() for v in values]
+
+    # generate 300 random members of size 100, each bit is set with 50% probability
+    def gen_starting_values(self):
+        return [np.random.random(100) < 0.5 for _ in range(300)]
+
+    # randomly flip roughly 5% of the bits of each member in the population
+    # the previous population members should never be modified
+    def generate_offspring(self, population):
+        return [Member(m.value ^ (np.random.random(m.value.shape) < 0.05)) for m in population]
+
+    # selection tournament between population and offspring
+    def select_population(self, population, offspring):
+        combined = population + offspring
+        return [max(random.sample(combined, k=3), key=lambda m: m.fitness) for _ in range(len(population))]
+
+
+if __name__ == '__main__':
+    # create and train the population
+    module = OneMaxMinimalModule()
+    pop, logbook, halloffame = Trainer(generations=100, progress=True).fit(module)
+
+    print('initial stats:', logbook[0])
+    print('final stats:', logbook[-1])
+    print('best member:', halloffame.members[0])
+```
+
+### Advanced OneMax Example
+
+Ruck provides various helper functions and implementations of evolutionary algorithms for convenience.
+The following example makes use of these additional features so that components and behaviour can
+easily be swapped out.
+ +The three basic evolutionary algorithms provided are based around [deap's](http://www.github.com/deap/deap) +default algorithms from `deap.algorithms`: `eaSimple`, `eaMuPlusLambda`, and `eaMuCommaLambda`. These +algorithms can be accessed from `ruck.functional` which has the alias `R`: `R.factory_simple_ea`, +`R.factory_mu_plus_lambda` and `R.factory_mu_comma_lambda`. + + +
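For readers unfamiliar with the naming, the (mu + lambda) and (mu, lambda) schemes differ in which pool the next generation is drawn from. The sketch below follows the standard definitions of the two schemes and does not use Ruck's factories; the function names are hypothetical, and plain truncation selection is assumed as the selector purely to keep the example short.

```python
from typing import List, Tuple

# (name, fitness) pairs: a hypothetical stand-in for evaluated members
Individual = Tuple[str, float]


def select_best(pool: List[Individual], mu: int) -> List[Individual]:
    """Truncation selection: keep the `mu` fittest individuals (an assumed selector)."""
    return sorted(pool, key=lambda ind: ind[1], reverse=True)[:mu]


def survivors_mu_plus_lambda(parents, offspring, mu):
    # (mu + lambda): parents compete with their offspring for survival
    return select_best(parents + offspring, mu)


def survivors_mu_comma_lambda(parents, offspring, mu):
    # (mu, lambda): parents are discarded, only offspring can survive
    assert len(offspring) >= mu, 'lambda must be at least mu'
    return select_best(offspring, mu)


parents = [('p0', 9.0), ('p1', 1.0)]
offspring = [('c0', 5.0), ('c1', 4.0), ('c2', 3.0)]

plus = survivors_mu_plus_lambda(parents, offspring, mu=2)
comma = survivors_mu_comma_lambda(parents, offspring, mu=2)

assert plus == [('p0', 9.0), ('c0', 5.0)]   # the strong parent survives
assert comma == [('c0', 5.0), ('c1', 4.0)]  # the strong parent is dropped
```

The plus scheme is elitist, since a good parent can persist indefinitely, while the comma scheme forgets parents every generation, which can help escape local optima at the cost of occasionally losing the best solution.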
Code Example +

+ +```python +""" +OneMax serial example based on: +https://github.com/DEAP/deap/blob/master/examples/ga/onemax_numpy.py +""" + +import functools +import numpy as np +from ruck import * + + +class OneMaxModule(EaModule): + + def __init__( + self, + population_size: int = 300, + member_size: int = 100, + p_mate: float = 0.5, + p_mutate: float = 0.5, + ): + # save the arguments to the .hparams property. values are taken from the + # local scope so modifications can be captured if the call to this is delayed. + self.save_hyperparameters() + # implement the required functions for `EaModule` + self.generate_offspring, self.select_population = R.factory_simple_ea( + mate_fn=R.mate_crossover_1d, + mutate_fn=functools.partial(R.mutate_flip_bit_groups, p=0.05), + select_fn=functools.partial(R.select_tournament, k=3), + p_mate=self.hparams.p_mate, + p_mutate=self.hparams.p_mutate, + ) + + def evaluate_values(self, values): + return map(np.sum, values) + + def gen_starting_values(self) -> Population: + return [ + np.random.random(self.hparams.member_size) < 0.5 + for i in range(self.hparams.population_size) + ] + + +if __name__ == '__main__': + # create and train the population + module = OneMaxModule(population_size=300, member_size=100) + pop, logbook, halloffame = Trainer(generations=40, progress=True).fit(module) + + print('initial stats:', logbook[0]) + print('final stats:', logbook[-1]) + print('best member:', halloffame.members[0]) +``` + +

+
+
+### Multithreading OneMax Example (Ray)
+
+If we need to scale up the computational requirements, for example by increasing the
+member and population sizes, the above serial implementations will soon run into performance problems.
+
+The basic Ruck implementations of various evolutionary algorithms are designed around a `map`
+function that can be swapped out to add multi-threading support. We can easily do this using
+[ray](https://github.com/ray-project/ray) and we even provide various helper functions that
+enhance ray support.
+
+1. We begin by placing members' values into shared memory using ray's read-only object store
+and the `ray.put` function. The resulting [`ObjectRef`s](https://docs.ray.io/en/latest/memory-management.html)
+point to the original `np.ndarray` values. When retrieved with `ray.get`, the original
+arrays are obtained using an efficient zero-copy procedure. This is advantageous over something like Python's multiprocessing module, which uses
+expensive pickle operations to pass data around.
+
+2. The second step is to swap out the aforementioned `map` function in the previous example for a
+multiprocessing equivalent. We provide the `ray_map` function that can be used instead, which
+automatically wraps functions using `ray.remote`, and provides additional benefits when using `ObjectRef`s.
+
+3. Finally, we need to update our `mate` and `mutate` functions to handle `ObjectRef`s. We provide a convenient
+wrapper that automatically calls `ray.get` on function arguments and `ray.put` on function results so that
+you can re-use your existing code.
+
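The idea of designing an algorithm around a swappable `map` function can be shown with the standard library alone. In this sketch, `evaluate_all` and its `map_fn` parameter are hypothetical names, and `ThreadPoolExecutor.map` merely stands in for a parallel map such as `ray_map`; Ruck's own signatures may differ.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, Sequence


def count_ones(bits: Sequence[bool]) -> int:
    """OneMax fitness: the number of True bits."""
    return sum(bits)


def evaluate_all(
    values: Iterable[Sequence[bool]],
    map_fn: Callable = map,  # hypothetical parameter: any map-like callable
) -> List[int]:
    # the algorithm depends only on the map interface, not on how it is executed
    return list(map_fn(count_ones, values))


population = [[True, False, True], [True, True, True], [False, False, False]]

# serial evaluation using the builtin map
serial = evaluate_all(population)

# parallel evaluation: the executor's map has the same calling convention
with ThreadPoolExecutor(max_workers=2) as pool:
    parallel = evaluate_all(population, map_fn=pool.map)

assert serial == parallel == [2, 3, 0]
```

Since both calls go through the same interface and preserve input order, switching between serial and parallel execution changes a single argument rather than the algorithm's code.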
Code Example +

+
+```python
+"""
+OneMax parallel example using ray's object store.
+
+1 byte (bool) * 1_000_000 * 128 members ~= 128 MB of memory to store this population.
+This is quite a bit of processing that needs to happen! But using ray
+and its object store we can do this efficiently!
+"""
+
+from functools import partial
+import numpy as np
+import ray
+from ruck import *
+from ruck.util import *
+
+
+class OneMaxRayModule(EaModule):
+
+    def __init__(
+        self,
+        population_size: int = 300,
+        member_size: int = 100,
+        p_mate: float = 0.5,
+        p_mutate: float = 0.5,
+    ):
+        self.save_hyperparameters()
+        # implement the required functions for `EaModule`
+        # - decorate the functions with `ray_refs_wrapper` which
+        #   automatically `ray.get` arguments and `ray.put` returned results
+        self.generate_offspring, self.select_population = R.factory_simple_ea(
+            mate_fn=ray_refs_wrapper(R.mate_crossover_1d, iter_results=True),
+            mutate_fn=ray_refs_wrapper(partial(R.mutate_flip_bit_groups, p=0.05)),
+            select_fn=partial(R.select_tournament, k=3),  # OK to compute locally, because we only look at the fitness
+            p_mate=self.hparams.p_mate,
+            p_mutate=self.hparams.p_mutate,
+            map_fn=ray_map,  # specify the map function to enable multiprocessing
+        )
+
+    def evaluate_values(self, values):
+        # values is a list of `ray.ObjectRef`s not `np.ndarray`s
+        # ray_map automatically converts np.sum to a `ray.remote` function which
+        # automatically handles `ray.get`ing of `ray.ObjectRef`s passed as arguments
+        return ray_map(np.sum, values)
+
+    def gen_starting_values(self):
+        # generate objects and place in ray's object store
+        return [
+            ray.put(np.random.random(self.hparams.member_size) < 0.5)
+            for i in range(self.hparams.population_size)
+        ]
+
+
+if __name__ == '__main__':
+    # initialize ray to use the specified system resources
+    ray.init()
+
+    # create and train the population
+    module = OneMaxRayModule(population_size=128, member_size=1_000_000)
+    pop, logbook, halloffame = Trainer(generations=100,
progress=True).fit(module) + + print('initial stats:', logbook[0]) + print('final stats:', logbook[-1]) + print('best member:', halloffame.members[0]) +``` + +

+
diff --git a/examples/onemax.py b/examples/onemax.py new file mode 100644 index 0000000..20321c1 --- /dev/null +++ b/examples/onemax.py @@ -0,0 +1,73 @@ +# ~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~ +# MIT License +# +# Copyright (c) 2021 Nathan Juraj Michlo +# +# Permission is hereby granted, free of charge, to any person obtaining a copy +# of this software and associated documentation files (the "Software"), to deal +# in the Software without restriction, including without limitation the rights +# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +# copies of the Software, and to permit persons to whom the Software is +# furnished to do so, subject to the following conditions: +# +# The above copyright notice and this permission notice shall be included in +# all copies or substantial portions of the Software. +# +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +# SOFTWARE. +# ~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~ + +""" +OneMax serial example based on: +https://github.com/DEAP/deap/blob/master/examples/ga/onemax_numpy.py +""" + +import functools +import numpy as np +from ruck import * + + +class OneMaxModule(EaModule): + + def __init__( + self, + population_size: int = 300, + member_size: int = 100, + p_mate: float = 0.5, + p_mutate: float = 0.5, + ): + # save the arguments to the .hparams property. values are taken from the + # local scope so modifications can be captured if the call to this is delayed. 
+ self.save_hyperparameters() + # implement the required functions for `EaModule` + self.generate_offspring, self.select_population = R.factory_simple_ea( + mate_fn=R.mate_crossover_1d, + mutate_fn=functools.partial(R.mutate_flip_bit_groups, p=0.05), + select_fn=functools.partial(R.select_tournament, k=3), + p_mate=self.hparams.p_mate, + p_mutate=self.hparams.p_mutate, + ) + + def evaluate_values(self, values): + return map(np.sum, values) + + def gen_starting_values(self) -> Population: + return [ + np.random.random(self.hparams.member_size) < 0.5 + for i in range(self.hparams.population_size) + ] + + +if __name__ == '__main__': + # create and train the population + module = OneMaxModule(population_size=300, member_size=100) + pop, logbook, halloffame = Trainer(generations=40, progress=True).fit(module) + + print('initial stats:', logbook[0]) + print('final stats:', logbook[-1]) + print('best member:', halloffame.members[0]) diff --git a/examples/onemax_minimal.py b/examples/onemax_minimal.py new file mode 100644 index 0000000..1d30f8e --- /dev/null +++ b/examples/onemax_minimal.py @@ -0,0 +1,63 @@ +# +# Copyright (c) 2021 Nathan Juraj Michlo +# +# Permission is hereby granted, free of charge, to any person obtaining a copy +# of this software and associated documentation files (the "Software"), to deal +# in the Software without restriction, including without limitation the rights +# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +# copies of the Software, and to permit persons to whom the Software is +# furnished to do so, subject to the following conditions: +# +# The above copyright notice and this permission notice shall be included in +# all copies or substantial portions of the Software. +# +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE
+# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+# SOFTWARE.
+# ~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~
+
+
+import random
+import numpy as np
+from ruck import *
+
+
+class OneMaxMinimalModule(EaModule):
+    """
+    Minimal onemax example
+    - The goal is to flip all the bits of a boolean array to True
+    - Offspring are generated as bit flipped versions of the previous population
+    - Selection tournament is performed between the previous population and the offspring
+    """
+
+    # evaluate unevaluated members according to their values
+    def evaluate_values(self, values):
+        return [v.sum() for v in values]
+
+    # generate 300 random members of size 100, each bit is set with 50% probability
+    def gen_starting_values(self):
+        return [np.random.random(100) < 0.5 for _ in range(300)]
+
+    # randomly flip roughly 5% of the bits of each member in the population
+    # the previous population members should never be modified
+    def generate_offspring(self, population):
+        return [Member(m.value ^ (np.random.random(m.value.shape) < 0.05)) for m in population]
+
+    # selection tournament between population and offspring
+    def select_population(self, population, offspring):
+        combined = population + offspring
+        return [max(random.sample(combined, k=3), key=lambda m: m.fitness) for _ in range(len(population))]
+
+
+if __name__ == '__main__':
+    # create and train the population
+    module = OneMaxMinimalModule()
+    pop, logbook, halloffame = Trainer(generations=100, progress=True).fit(module)
+
+    print('initial stats:', logbook[0])
+    print('final stats:', logbook[-1])
+    print('best member:', halloffame.members[0])
diff --git a/examples/onemax_ray.py b/examples/onemax_ray.py
new file mode 100644
index 0000000..d2691fd
--- /dev/null
+++ b/examples/onemax_ray.py
@@ -0,0
+1,86 @@
+# ~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~
+# MIT License
+#
+# Copyright (c) 2021 Nathan Juraj Michlo
+#
+# Permission is hereby granted, free of charge, to any person obtaining a copy
+# of this software and associated documentation files (the "Software"), to deal
+# in the Software without restriction, including without limitation the rights
+# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+# copies of the Software, and to permit persons to whom the Software is
+# furnished to do so, subject to the following conditions:
+#
+# The above copyright notice and this permission notice shall be included in
+# all copies or substantial portions of the Software.
+#
+# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+# SOFTWARE.
+# ~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~
+
+"""
+OneMax parallel example using ray's object store.
+
+1 byte (bool) * 1_000_000 * 128 members ~= 128 MB of memory to store this population.
+This is quite a bit of processing that needs to happen! But using ray
+and its object store we can do this efficiently!
+""" + +from functools import partial +import numpy as np +import ray +from ruck import * +from ruck.util import * + + +class OneMaxRayModule(EaModule): + + def __init__( + self, + population_size: int = 300, + member_size: int = 100, + p_mate: float = 0.5, + p_mutate: float = 0.5, + ): + self.save_hyperparameters() + # implement the required functions for `EaModule` + # - decorate the functions with `ray_refs_wrapper` which + # automatically `ray.get` arguments and `ray.put` returned results + self.generate_offspring, self.select_population = R.factory_simple_ea( + mate_fn=ray_refs_wrapper(R.mate_crossover_1d, iter_results=True), + mutate_fn=ray_refs_wrapper(partial(R.mutate_flip_bit_groups, p=0.05)), + select_fn=partial(R.select_tournament, k=3), # OK to compute locally, because we only look at the fitness + p_mate=self.hparams.p_mate, + p_mutate=self.hparams.p_mutate, + map_fn=ray_map, # specify the map function to enable multiprocessing + ) + + def evaluate_values(self, values): + # values is a list of `ray.ObjectRef`s not `np.ndarray`s + # ray_map automatically converts np.sum to a `ray.remote` function which + # automatically handles `ray.get`ing of `ray.ObjectRef`s passed as arguments + return ray_map(np.sum, values) + + def gen_starting_values(self): + # generate objects and place in ray's object store + return [ + ray.put(np.random.random(self.hparams.member_size) < 0.5) + for i in range(self.hparams.population_size) + ] + + +if __name__ == '__main__': + # initialize ray to use the specified system resources + ray.init() + + # create and train the population + module = OneMaxRayModule(population_size=128, member_size=1_000_000) + pop, logbook, halloffame = Trainer(generations=100, progress=True).fit(module) + + print('initial stats:', logbook[0]) + print('final stats:', logbook[-1]) + print('best member:', halloffame.members[0]) diff --git a/pytest.ini b/pytest.ini new file mode 100644 index 0000000..a6fb8d5 --- /dev/null +++ b/pytest.ini @@ -0,0 +1,9 @@ + 
+[pytest] +minversion = 6.0 +testpaths = + tests + ruck +python_files = + test_*.py + __test__*.py diff --git a/requirements-test.txt b/requirements-test.txt new file mode 100644 index 0000000..412de73 --- /dev/null +++ b/requirements-test.txt @@ -0,0 +1,2 @@ +pytest>=6.2.4 +pytest-cov>=2.12.1 diff --git a/requirements.txt b/requirements.txt new file mode 100644 index 0000000..f1f65b4 --- /dev/null +++ b/requirements.txt @@ -0,0 +1,4 @@ +pip>=21.0 +numpy>=1.21.0 +tqdm>=4.60.0 +ray>=1.6.0 diff --git a/ruck/__init__.py b/ruck/__init__.py new file mode 100644 index 0000000..e1ba995 --- /dev/null +++ b/ruck/__init__.py @@ -0,0 +1,36 @@ +# ~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~ +# MIT License +# +# Copyright (c) 2021 Nathan Juraj Michlo +# +# Permission is hereby granted, free of charge, to any person obtaining a copy +# of this software and associated documentation files (the "Software"), to deal +# in the Software without restriction, including without limitation the rights +# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +# copies of the Software, and to permit persons to whom the Software is +# furnished to do so, subject to the following conditions: +# +# The above copyright notice and this permission notice shall be included in +# all copies or substantial portions of the Software. +# +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +# SOFTWARE. 
+# ~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~ + + +# base +from ruck._member import Member +from ruck._member import Population +from ruck._module import EaModule + +# training +from ruck._train import Trainer +from ruck._train import yield_population_steps + +# functional utils +from ruck import functional as R diff --git a/ruck/_history.py b/ruck/_history.py new file mode 100644 index 0000000..8b1796f --- /dev/null +++ b/ruck/_history.py @@ -0,0 +1,245 @@ +# ~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~ +# MIT License +# +# Copyright (c) 2021 Nathan Juraj Michlo +# +# Permission is hereby granted, free of charge, to any person obtaining a copy +# of this software and associated documentation files (the "Software"), to deal +# in the Software without restriction, including without limitation the rights +# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +# copies of the Software, and to permit persons to whom the Software is +# furnished to do so, subject to the following conditions: +# +# The above copyright notice and this permission notice shall be included in +# all copies or substantial portions of the Software. +# +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +# SOFTWARE. 
+# ~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~ + +import dataclasses +import heapq +from typing import Any +from typing import Callable +from typing import Dict +from typing import Generic +from typing import List +from typing import TypeVar + +from ruck._member import Population + + +T = TypeVar('T') +V = TypeVar('V') + + +# ========================================================================= # +# Type Hints # +# ========================================================================= # + + +ValueFnHint = Callable[[T], V] +StatFnHint = Callable[[V], Any] + + +# ========================================================================= # +# Logbook # +# ========================================================================= # + + +class StatsGroup(Generic[T, V]): + + def __init__(self, value_fn: ValueFnHint[T, V] = None, **stats_fns: StatFnHint[V]): + assert all(str.isidentifier(key) for key in stats_fns.keys()) + assert stats_fns + self._value_fn = value_fn + self._stats_fns = stats_fns + + @property + def keys(self) -> List[str]: + return list(self._stats_fns.keys()) + + def compute(self, value: T) -> Dict[str, Any]: + if self._value_fn is not None: + value = self._value_fn(value) + return { + key: stat_fn(value) + for key, stat_fn in self._stats_fns.items() + } + + +class Logbook(Generic[T]): + + def __init__(self, *external_keys: str, **stats_groups: StatsGroup[T, Any]): + self._all_ordered_keys = [] + self._external_keys = [] + self._stats_groups = {} + self._history = [] + # register values + for k in external_keys: + self.register_external_stat(k) + for k, v in stats_groups.items(): + self.register_stats_group(k, v) + + def _assert_key_valid(self, name: str): + if not str.isidentifier(name): + raise ValueError(f'stat name is not a valid identifier: {repr(name)}') + return name + + def _assert_key_available(self, name: str): + if name in self._external_keys: + raise ValueError(f'external stat already named: 
{repr(name)}')
+        if name in self._stats_groups:
+            raise ValueError(f'stat group already named: {repr(name)}')
+        return name
+
+    def register_external_stat(self, name: str):
+        # the name must be a valid identifier that is not yet in use
+        self._assert_key_available(self._assert_key_valid(name))
+        # add stat
+        self._external_keys.append(name)
+        self._all_ordered_keys.append(name)
+        return self
+
+    def register_stats_group(self, name: str, stats_group: StatsGroup[T, Any]):
+        # the name must be a valid identifier that is not yet in use
+        self._assert_key_available(self._assert_key_valid(name))
+        assert isinstance(stats_group, StatsGroup)
+        assert stats_group not in self._stats_groups.values()
+        # add stat group
+        self._stats_groups[name] = stats_group
+        self._all_ordered_keys.extend(f'{name}:{key}' for key in stats_group.keys)
+        return self
+
+    def record(self, population: Population[T], **external_values):
+        # extra stats
+        if set(external_values.keys()) != set(self._external_keys):
+            raise KeyError(f'required external_values: {sorted(self._external_keys)}, got: {sorted(external_values.keys())}')
+        # external values
+        stats = dict(external_values)
+        # generate stats
+        for name, stat_group in self._stats_groups.items():
+            for key, value in stat_group.compute(population).items():
+                stats[f'{name}:{key}'] = value
+        # order stats
+        assert set(stats.keys()) == set(self._all_ordered_keys)
+        record = {k: stats[k] for k in self._all_ordered_keys}
+        # record and return stats
+        self._history.append(record)
+        return dict(record)
+
+    @property
+    def history(self) -> List[Dict[str, Any]]:
+        return list(self._history)
+
+    def __getitem__(self, idx: int):
+        assert isinstance(idx, int)
+        return dict(self._history[idx])
+
+    def __len__(self):
+        return len(self._history)
+
+    def __iter__(self):
+        for i in range(len(self)):
+            yield self[i]
+
+
+# ========================================================================= #
+# HallOfFame                                                                #
+# ========================================================================= #
+
+
+@dataclasses.dataclass(order=True)
+class HallOfFameItem:
+    fitness: float
+    member:
Any = dataclasses.field(compare=False) + + +class HallOfFameFrozenError(Exception): + pass + + +class HallOfFameNotFrozenError(Exception): + pass + + +class HallOfFame(Generic[T]): + + def __init__(self, n_best: int = 5, maximize: bool = True): + self._maximize = maximize + assert maximize + self._n_best = n_best + # update values + self._heap = [] # element 0 is always the smallest + self._scores = {} + # frozen values + self._frozen = False + self._frozen_members = None + self._frozen_values = None + self._frozen_scores = None + + def update(self, population: Population[T]): + if self.is_frozen: + raise HallOfFameFrozenError('The hall of fame has been frozen, no more members can be added!') + # get potential best in population + best = sorted(population, key=lambda m: m.fitness, reverse=True)[:self._n_best] + # add the best + for member in best: + # try add to hall of fame + item = HallOfFameItem(fitness=member.fitness, member=member) + # skip if we already have the same score ... + # TODO: this should not ignore members with the same scores, this is hacky + if item.fitness in self._scores: + continue + # checks + self._scores[item.fitness] = item + if len(self._heap) < self._n_best: + heapq.heappush(self._heap, item) + else: + removed = heapq.heappushpop(self._heap, item) + del self._scores[removed.fitness] + + def freeze(self) -> 'HallOfFame': + if self.is_frozen: + raise HallOfFameFrozenError('The hall of fame has already been frozen, cannot freeze again!') + # freeze + self._frozen = True + self._frozen_members = [m.member for m in sorted(self._heap, reverse=True)] # 0 is best, -1 is worst + # reset values + self._scores = None + self._heap = None + return self + + @property + def is_frozen(self) -> bool: + return self._frozen + + @property + def members(self) -> Population[T]: + return list(self._frozen_members) + + def __getitem__(self, idx: int): + if not self.is_frozen: + raise HallOfFameNotFrozenError('The hall of fame has not yet been frozen by a 
completed training run, cannot access members!') + assert isinstance(idx, int) + return self._frozen_members[idx] + + def __len__(self): + if not self.is_frozen: + raise HallOfFameNotFrozenError('The hall of fame has not yet been frozen by a completed training run, cannot access length!') + return len(self._frozen_members) + + def __iter__(self): + if not self.is_frozen: + raise HallOfFameNotFrozenError('The hall of fame has not yet been frozen by a completed training run, cannot access members!') + for i in range(len(self)): + yield self[i] + + + +# ========================================================================= # +# END # +# ========================================================================= # diff --git a/ruck/_member.py b/ruck/_member.py new file mode 100644 index 0000000..283af9a --- /dev/null +++ b/ruck/_member.py @@ -0,0 +1,115 @@ +# ~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~ +# MIT License +# +# Copyright (c) 2021 Nathan Juraj Michlo +# +# Permission is hereby granted, free of charge, to any person obtaining a copy +# of this software and associated documentation files (the "Software"), to deal +# in the Software without restriction, including without limitation the rights +# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +# copies of the Software, and to permit persons to whom the Software is +# furnished to do so, subject to the following conditions: +# +# The above copyright notice and this permission notice shall be included in +# all copies or substantial portions of the Software. +# +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE +# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +# SOFTWARE. +# ~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~ +import re +from typing import Generic +from typing import List +from typing import Optional +from typing import TypeVar + +import numpy as np + + +# ========================================================================= # +# Members # +# ========================================================================= # + + +class MemberIsNotEvaluatedError(Exception): + pass + + +class MemberAlreadyEvaluatedError(Exception): + pass + + +T = TypeVar('T') + + +_RE_WHITESPACE = re.compile(r'\s\s+') + + +class Member(Generic[T]): + + def __init__(self, value: T, fitness: float = None): + self._value = value + self._fitness = None + # set fitness + if fitness is not None: + self.fitness = fitness + + @property + def value(self) -> T: + return self._value + + @property + def fitness_unsafe(self) -> Optional[float]: + return self._fitness + + @property + def fitness(self) -> float: + if not self.is_evaluated: + raise MemberIsNotEvaluatedError('The member has not been evaluated, the fitness has not yet been set.') + return self._fitness + + @fitness.setter + def fitness(self, fitness: float): + if self.is_evaluated: + raise MemberAlreadyEvaluatedError('The member has already been evaluated, the fitness can only ever be set once. 
Create a new member instead!') + if np.isnan(fitness): + raise ValueError('fitness values cannot be NaN, this is an error!') + self._fitness = float(fitness) + + def set_fitness(self, fitness: float) -> 'Member[T]': + self.fitness = fitness + return self + + @property + def is_evaluated(self) -> bool: + return (self._fitness is not None) + + def __str__(self): + return repr(self) + + def __repr__(self): + value_str = _RE_WHITESPACE.sub(' ', repr(self.value)) + # cut short + if len(value_str) > 33: + value_str = f'{value_str[:14].rstrip(" ")} ... {value_str[-14:].lstrip(" ")}' + # get fitness + fitness_str = f', {self.fitness}' if self.is_evaluated else '' + # combine + return f'{self.__class__.__name__}({value_str}{fitness_str})' + + +# ========================================================================= # +# Population # +# ========================================================================= # + + +Population = List[Member[T]] + + +# ========================================================================= # +# END # +# ========================================================================= # diff --git a/ruck/_module.py b/ruck/_module.py new file mode 100644 index 0000000..8ae8c64 --- /dev/null +++ b/ruck/_module.py @@ -0,0 +1,74 @@ +# ~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~ +# MIT License +# +# Copyright (c) 2021 Nathan Juraj Michlo +# +# Permission is hereby granted, free of charge, to any person obtaining a copy +# of this software and associated documentation files (the "Software"), to deal +# in the Software without restriction, including without limitation the rights +# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +# copies of the Software, and to permit persons to whom the Software is +# furnished to do so, subject to the following conditions: +# +# The above copyright notice and this permission notice shall be included in +# all copies or substantial portions of the Software. 
+# +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +# SOFTWARE. +# ~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~ + +from typing import Any +from typing import Dict +from typing import Generic +from typing import List +from typing import Sequence +from typing import TypeVar + +from ruck._history import StatsGroup +from ruck._member import Population +from ruck.util._args import HParamsMixin + + +# ========================================================================= # +# Module # +# ========================================================================= # + + +T = TypeVar('T') + + +class EaModule(Generic[T], HParamsMixin): + + # OVERRIDABLE DEFAULTS + + def get_stats_groups(self) -> Dict[str, StatsGroup[T, Any]]: + # additional stats to be recorded + return {} + + def get_progress_stats(self) -> Sequence[str]: + # which stats are included in the progress bar + return ('evals', 'fit:max',) + + # REQUIRED + + def gen_starting_values(self) -> List[T]: + raise NotImplementedError + + def generate_offspring(self, population: Population[T]) -> Population[T]: + raise NotImplementedError + + def select_population(self, population: Population[T], offspring: Population[T]) -> Population[T]: + raise NotImplementedError + + def evaluate_values(self, values: List[T]) -> List[float]: + raise NotImplementedError + + +# ========================================================================= # +# END # +# ========================================================================= # diff --git a/ruck/_train.py b/ruck/_train.py new 
file mode 100644 index 0000000..01f90d0 --- /dev/null +++ b/ruck/_train.py @@ -0,0 +1,161 @@ +# ~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~ +# MIT License +# +# Copyright (c) 2021 Nathan Juraj Michlo +# +# Permission is hereby granted, free of charge, to any person obtaining a copy +# of this software and associated documentation files (the "Software"), to deal +# in the Software without restriction, including without limitation the rights +# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +# copies of the Software, and to permit persons to whom the Software is +# furnished to do so, subject to the following conditions: +# +# The above copyright notice and this permission notice shall be included in +# all copies or substantial portions of the Software. +# +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +# SOFTWARE. 
+# ~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~ + +import itertools +import logging +from typing import Generic +from typing import Iterator +from typing import Tuple +from typing import TypeVar + +import numpy as np +from tqdm import tqdm + +from ruck._history import HallOfFame +from ruck._history import Logbook +from ruck._history import StatsGroup +from ruck._member import Member +from ruck._member import Population +from ruck._module import EaModule + + +log = logging.getLogger(__name__) + + +T = TypeVar('T') + + +# ========================================================================= # +# Utils Trainer # +# ========================================================================= # + + +def _check_population(population: Population[T], required_size: int) -> Population[T]: + assert len(population) > 0, 'population must not be empty' + assert len(population) == required_size, 'population size is invalid' + assert all(isinstance(member, Member) for member in population), 'items in population are not members' + return population + + +# ========================================================================= # +# Evaluate Invalid # +# ========================================================================= # + + +def _evaluate_unevaluated(module: EaModule[T], members: Population[T]) -> int: + # get unevaluated members + unevaluated = [m for m in members if not m.is_evaluated] + # get fitness values + fitnesses = list(module.evaluate_values([m.value for m in unevaluated])) + # save fitness values + assert len(unevaluated) == len(fitnesses) + for m, f in zip(unevaluated, fitnesses): + m.fitness = f + # return the count + return len(unevaluated) + + +# ========================================================================= # +# Functional Trainer # +# ========================================================================= # + + +def yield_population_steps(module: EaModule[T]) -> Iterator[Tuple[int, Population[T], 
Population[T], int]]: + # 1. create population + population = [Member(m) for m in module.gen_starting_values()] + population_size = len(population) + population = _check_population(population, required_size=population_size) + + # 2. evaluate population + evals = _evaluate_unevaluated(module, population) + + # yield initial population + yield 0, population, population, evals + + # training loop + for i in itertools.count(1): + # 1. generate offspring + offspring = module.generate_offspring(population) + # 2. evaluate + evals = _evaluate_unevaluated(module, offspring) + # 3. select + population = module.select_population(population, offspring) + population = _check_population(population, required_size=population_size) + + # yield steps + yield i, population, offspring, evals + + +# ========================================================================= # +# Class Trainer # +# ========================================================================= # + + +class Trainer(Generic[T]): + + def __init__( + self, + generations: int = 100, + progress: bool = True, + history_n_best: int = 5, + offspring_generator=yield_population_steps, + ): + self._generations = generations + self._progress = progress + self._history_n_best = history_n_best + self._offspring_generator = offspring_generator + assert self._history_n_best > 0 + + def fit(self, module: EaModule[T]) -> Tuple[Population[T], Logbook[T], HallOfFame[T]]: + assert isinstance(module, EaModule) + # history trackers + logbook, halloffame = self._create_default_trackers(module) + # progress bar and training loop + with tqdm(total=self._generations, desc='generation', disable=not self._progress, ncols=120) as p: + for gen, population, offspring, evals in itertools.islice(self._offspring_generator(module), self._generations): + # update statistics with new population + halloffame.update(offspring) + stats = logbook.record(population, gen=gen, evals=evals) + # update progress bar + p.update() + p.set_postfix({k: stats[k] 
for k in module.get_progress_stats()}) + # done + return population, logbook, halloffame.freeze() + + def _create_default_trackers(self, module: EaModule[T]) -> Tuple[Logbook[T], HallOfFame[T]]: + halloffame = HallOfFame( + n_best=self._history_n_best, + maximize=True, + ) + logbook = Logbook( + 'gen', 'evals', + fit=StatsGroup(lambda pop: [m.fitness for m in pop], min=np.min, max=np.max, mean=np.mean), + **module.get_stats_groups() + ) + return logbook, halloffame + + +# ========================================================================= # +# END # +# ========================================================================= # diff --git a/ruck/functional/__init__.py b/ruck/functional/__init__.py new file mode 100644 index 0000000..f18cc3d --- /dev/null +++ b/ruck/functional/__init__.py @@ -0,0 +1,30 @@ +# ~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~ +# MIT License +# +# Copyright (c) 2021 Nathan Juraj Michlo +# +# Permission is hereby granted, free of charge, to any person obtaining a copy +# of this software and associated documentation files (the "Software"), to deal +# in the Software without restriction, including without limitation the rights +# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +# copies of the Software, and to permit persons to whom the Software is +# furnished to do so, subject to the following conditions: +# +# The above copyright notice and this permission notice shall be included in +# all copies or substantial portions of the Software. +# +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE +# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +# SOFTWARE. +# ~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~ + +from ruck.functional._mate import * +from ruck.functional._mutate import * +from ruck.functional._select import * + +# helper -- should be replaced +from ruck.functional._algorithm import * diff --git a/ruck/functional/_algorithm.py b/ruck/functional/_algorithm.py new file mode 100644 index 0000000..4519e78 --- /dev/null +++ b/ruck/functional/_algorithm.py @@ -0,0 +1,259 @@ +# ~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~ +# MIT License +# +# Copyright (c) 2021 Nathan Juraj Michlo +# +# Permission is hereby granted, free of charge, to any person obtaining a copy +# of this software and associated documentation files (the "Software"), to deal +# in the Software without restriction, including without limitation the rights +# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +# copies of the Software, and to permit persons to whom the Software is +# furnished to do so, subject to the following conditions: +# +# The above copyright notice and this permission notice shall be included in +# all copies or substantial portions of the Software. +# +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +# SOFTWARE. 
+# ~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~ + +from typing import Optional +from typing import Tuple +from typing import TypeVar + +from ruck._member import Member +from ruck._member import Population +from ruck.functional._select import SelectFnHint +from ruck.functional._mate import MateFnHint +from ruck.functional._mutate import MutateFnHint +from ruck.util._iter import replaced_random_taken_pairs +from ruck.util._iter import replaced_random_taken_elems + + +import random +import numpy as np + + +# ========================================================================= # +# Helper # +# ========================================================================= # + + +T = TypeVar('T') + + +# ========================================================================= # +# Crossover & Mutate Helpers # +# ========================================================================= # + + +def _mate_wrap_unwrap_values(mate_fn: MateFnHint[T]): + def wrapper(ma: Member[T], mb: Member[T]) -> Tuple[Member[T], Member[T]]: + va, vb = mate_fn(ma.value, mb.value) + return Member(va), Member(vb) + return wrapper + + +def _mutate_wrap_unwrap_values(mutate_fn: MutateFnHint[T]): + def wrapper(m: Member[T]) -> Member[T]: + v = mutate_fn(m.value) + return Member(v) + return wrapper + + +# ========================================================================= # +# Function Wrappers # +# ========================================================================= # + + +def apply_mate( + population: Population[T], + mate_fn: MateFnHint[T], + p: float = 0.5, + keep_order: bool = True, + map_fn=map, +) -> Population[T]: + # randomize order so we have randomized pairs + if keep_order: + indices = np.arange(len(population)) + np.random.shuffle(indices) + offspring = [population[i] for i in indices] + else: + offspring = list(population) + np.random.shuffle(offspring) + # apply mating to population + offspring = replaced_random_taken_pairs( 
fn=_mate_wrap_unwrap_values(mate_fn), + items=offspring, + p=p, + map_fn=map_fn, + ) + # undo random order + if keep_order: + offspring = [offspring[i] for i in np.argsort(indices)] + # done! + assert len(offspring) == len(population) + return offspring + + +def apply_mutate( + population: Population, + mutate_fn: MutateFnHint, + p: float = 0.5, + map_fn=map, +) -> Population: + # apply mutations to population + offspring = replaced_random_taken_elems( + fn=_mutate_wrap_unwrap_values(mutate_fn), + items=population, + p=p, + map_fn=map_fn, + ) + # done! + assert len(offspring) == len(population) + return offspring + + +def apply_mate_and_mutate( + population: Population[T], + mate_fn: MateFnHint[T], + mutate_fn: MutateFnHint[T], + p_mate: float = 0.5, + p_mutate: float = 0.5, + map_fn=map, +) -> Population[T]: + """ + Apply crossover AND mutation + + NOTE: + - Modified individuals need their fitness re-evaluated + - Mate & Mutate should always return copies of the received values. + + ** Should be equivalent to varAnd from DEAP ** + """ + offspring = apply_mate(population, mate_fn, p=p_mate, keep_order=True, map_fn=map_fn) + offspring = apply_mutate(offspring, mutate_fn, p=p_mutate, map_fn=map_fn) + return offspring + + +def _get_generate_member_fn( + mate_fn: MateFnHint[T], + mutate_fn: MutateFnHint[T], + p_mate: float = 0.5, + p_mutate: float = 0.5, +): + def _generate_member(a_b_r: Tuple[Member[T], Optional[Member[T]], float]) -> Member[T]: + ma, mb, r = a_b_r + if r < p_mate: return Member(mate_fn(ma.value, mb.value)[0]) # Apply crossover | only take first item | mb is only defined for this case + elif r < p_mate + p_mutate: return Member(mutate_fn(ma.value)) # Apply mutation + else: return ma # Apply reproduction + return _generate_member + + +def apply_mate_or_mutate_or_reproduce( + population: Population[T], + num_offspring: int, # lambda_ + mate_fn: MateFnHint[T], + mutate_fn: MutateFnHint[T], + p_mate: float = 0.5, + p_mutate: float = 0.5, + map_fn=map, +) 
-> Population[T]: + """ + Apply crossover OR mutation OR reproduction + + NOTE: + - Modified individuals need their fitness re-evaluated + - Mate & Mutate should always return copies of the received values. + + ** Should be equivalent to varOr from DEAP, but significantly faster for larger populations ** + """ + assert (p_mate + p_mutate) <= 1.0, 'The sum of the crossover and mutation probabilities must be less than or equal to 1.0.' + + # choose which action should be taken for each element + probabilities = np.random.random(num_offspring) + # select offspring + choices_a = [random.choice(population) for _ in probabilities] + choices_b = [random.choice(population) if (p < p_mate) else None for p in probabilities] # these are only needed for crossover, when (p < p_mate) + # get function to generate offspring + # - we create the function so that we don't accidentally pickle anything else + fn = _get_generate_member_fn(mate_fn=mate_fn, mutate_fn=mutate_fn, p_mate=p_mate, p_mutate=p_mutate) + # generate offspring + # - TODO: this is actually not optimal! we should only pass mate and + # mutate operations to the map function, we could distribute + # work unevenly between processes if map_fn is replaced + offspring = list(map_fn(fn, zip(choices_a, choices_b, probabilities))) + # done! 
+ assert len(offspring) == num_offspring + return offspring + + +# ========================================================================= # +# Gen & Select # +# ========================================================================= # + + +def factory_simple_ea( + mate_fn: MateFnHint[T], + mutate_fn: MutateFnHint[T], + select_fn: SelectFnHint[T], + p_mate: float = 0.5, + p_mutate: float = 0.5, + map_fn=map, +): + def generate(population): + return apply_mate_and_mutate(population=select_fn(population, len(population)), p_mate=p_mate, mate_fn=mate_fn, p_mutate=p_mutate, mutate_fn=mutate_fn, map_fn=map_fn) + + def select(population, offspring): + return offspring + + return generate, select + + +def factory_mu_plus_lambda( + mate_fn: MateFnHint[T], + mutate_fn: MutateFnHint[T], + select_fn: SelectFnHint[T], + offspring_num: int, # lambda + p_mate: float = 0.5, + p_mutate: float = 0.5, + map_fn=map, +): + def generate(population): + num = len(population) if (offspring_num is None) else offspring_num + return apply_mate_or_mutate_or_reproduce(population, num, mate_fn=mate_fn, mutate_fn=mutate_fn, p_mate=p_mate, p_mutate=p_mutate, map_fn=map_fn) + + def select(population: Population[T], offspring: Population[T]): + return select_fn(population + offspring, len(population)) + + return generate, select + + +def factory_mu_comma_lambda( + mate_fn: MateFnHint[T], + mutate_fn: MutateFnHint[T], + select_fn: SelectFnHint[T], + offspring_num: Optional[int] = None, # lambda + p_mate: float = 0.5, + p_mutate: float = 0.5, + map_fn=map, +): + def generate(population): + num = len(population) if (offspring_num is None) else offspring_num + return apply_mate_or_mutate_or_reproduce(population, num, mate_fn=mate_fn, mutate_fn=mutate_fn, p_mate=p_mate, p_mutate=p_mutate, map_fn=map_fn) + + def select(population, offspring): + assert len(offspring) >= len(population), f'invalid arguments, the number of offspring: {len(offspring)} (lambda) must be greater than or equal to the size of 
the population: {len(population)} (mu)' + return select_fn(offspring, len(population)) + + return generate, select + + +# # ========================================================================= # +# # END # +# # ========================================================================= # diff --git a/ruck/functional/_mate.py b/ruck/functional/_mate.py new file mode 100644 index 0000000..efdb718 --- /dev/null +++ b/ruck/functional/_mate.py @@ -0,0 +1,73 @@ +# ~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~ +# MIT License +# +# Copyright (c) 2021 Nathan Juraj Michlo +# +# Permission is hereby granted, free of charge, to any person obtaining a copy +# of this software and associated documentation files (the "Software"), to deal +# in the Software without restriction, including without limitation the rights +# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +# copies of the Software, and to permit persons to whom the Software is +# furnished to do so, subject to the following conditions: +# +# The above copyright notice and this permission notice shall be included in +# all copies or substantial portions of the Software. +# +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +# SOFTWARE. 
+# ~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~ + +from functools import wraps +from typing import Callable +from typing import Tuple +from typing import TypeVar + +import numpy as np + + +# ========================================================================= # +# Mate Helper # +# ========================================================================= # + + +F = TypeVar('F') +T = TypeVar('T') +MateFnHint = Callable[[T, T], Tuple[T, T]] + + +def check_mating(fn: F) -> F: + @wraps(fn) + def wrapper(value_a: T, value_b: T, *args, **kwargs) -> Tuple[T, T]: + mated_a, mated_b = fn(value_a, value_b, *args, **kwargs) + assert mated_a is not value_a, f'Mate function: {fn} should return new values' + assert mated_a is not value_b, f'Mate function: {fn} should return new values' + assert mated_b is not value_a, f'Mate function: {fn} should return new values' + assert mated_b is not value_b, f'Mate function: {fn} should return new values' + return mated_a, mated_b + return wrapper + + +# ========================================================================= # +# Mate # +# ========================================================================= # + + +@check_mating +def mate_crossover_1d(a: np.ndarray, b: np.ndarray) -> Tuple[np.ndarray, np.ndarray]: + assert a.ndim == 1 + assert a.shape == b.shape + i, j = np.random.randint(0, len(a), size=2) + i, j = min(i, j), max(i, j) + new_a = np.concatenate([a[:i], b[i:j], a[j:]], axis=0) + new_b = np.concatenate([b[:i], a[i:j], b[j:]], axis=0) + return new_a, new_b + + +# ========================================================================= # +# END # +# ========================================================================= # diff --git a/ruck/functional/_mutate.py b/ruck/functional/_mutate.py new file mode 100644 index 0000000..f493982 --- /dev/null +++ b/ruck/functional/_mutate.py @@ -0,0 +1,73 @@ +# ~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~ +# MIT 
License +# +# Copyright (c) 2021 Nathan Juraj Michlo +# +# Permission is hereby granted, free of charge, to any person obtaining a copy +# of this software and associated documentation files (the "Software"), to deal +# in the Software without restriction, including without limitation the rights +# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +# copies of the Software, and to permit persons to whom the Software is +# furnished to do so, subject to the following conditions: +# +# The above copyright notice and this permission notice shall be included in +# all copies or substantial portions of the Software. +# +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +# SOFTWARE. 
+# ~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~ + +from functools import wraps +from typing import Callable +from typing import TypeVar + +import numpy as np + + +# ========================================================================= # +# Mutate Helper # +# ========================================================================= # + + +F = TypeVar('F') +T = TypeVar('T') +MutateFnHint = Callable[[T], T] + + +def check_mutation(fn: F) -> F: + @wraps(fn) + def wrapper(value: T, *args, **kwargs): + mutated = fn(value, *args, **kwargs) + assert mutated is not value, f'Mutate function: {fn} should return a new value' + return mutated + return wrapper + + +# ========================================================================= # +# Mutate # +# ========================================================================= # + + +@check_mutation +def mutate_flip_bits(a: np.ndarray, p: float = 0.05) -> np.ndarray: + return a ^ (np.random.random(a.shape) < p) + + +@check_mutation +def mutate_flip_bit_groups(a: np.ndarray, p: float = 0.05) -> np.ndarray: + if np.random.random() < 0.5: + # flip set bits + return a ^ ((np.random.random(a.shape) < p) & a) + else: + # flip unset bits + return a ^ ((np.random.random(a.shape) < p) & ~a) + + +# ========================================================================= # +# END # +# ========================================================================= # diff --git a/ruck/functional/_select.py b/ruck/functional/_select.py new file mode 100644 index 0000000..392560f --- /dev/null +++ b/ruck/functional/_select.py @@ -0,0 +1,84 @@ +# ~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~ +# MIT License +# +# Copyright (c) 2021 Nathan Juraj Michlo +# +# Permission is hereby granted, free of charge, to any person obtaining a copy +# of this software and associated documentation files (the "Software"), to deal +# in the Software without restriction, including without limitation 
the rights +# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +# copies of the Software, and to permit persons to whom the Software is +# furnished to do so, subject to the following conditions: +# +# The above copyright notice and this permission notice shall be included in +# all copies or substantial portions of the Software. +# +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +# SOFTWARE. +# ~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~ + +import random +from functools import wraps +from typing import Callable +from typing import TypeVar + +from ruck._member import Population + + +# ========================================================================= # +# Select Helper # +# ========================================================================= # + + +F = TypeVar('F') +T = TypeVar('T') +SelectFnHint = Callable[[Population[T], int], Population[T]] + + +def check_selection(fn: F) -> F: + @wraps(fn) + def wrapper(population: Population[T], num: int, *args, **kwargs) -> Population[T]: + selected = fn(population, num, *args, **kwargs) + assert selected is not population, f'Select function: {fn} should return a new list' + assert len(selected) == num, f'Select function: {fn} returned an incorrect number of elements, got: {len(selected)}, should be: {num}' + return selected + return wrapper + + +# ========================================================================= # +# Select # +# ========================================================================= # + + +@check_selection +def 
select_best(population: Population[T], num: int) -> Population[T]: + return sorted(population, key=lambda m: m.fitness, reverse=True)[:num] + + +@check_selection +def select_worst(population: Population[T], num: int) -> Population[T]: + return sorted(population, key=lambda m: m.fitness, reverse=False)[:num] + + +@check_selection +def select_random(population: Population[T], num: int) -> Population[T]: + return random.sample(population, k=num) + + +@check_selection +def select_tournament(population: Population[T], num: int, k: int = 3) -> Population[T]: + key = lambda m: m.fitness + return [ + max(random.sample(population, k=k), key=key) + for _ in range(num) + ] + + +# ========================================================================= # +# Selection # +# ========================================================================= # diff --git a/ruck/util/__init__.py b/ruck/util/__init__.py new file mode 100644 index 0000000..1a85507 --- /dev/null +++ b/ruck/util/__init__.py @@ -0,0 +1,29 @@ +# ~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~ +# MIT License +# +# Copyright (c) 2021 Nathan Juraj Michlo +# +# Permission is hereby granted, free of charge, to any person obtaining a copy +# of this software and associated documentation files (the "Software"), to deal +# in the Software without restriction, including without limitation the rights +# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +# copies of the Software, and to permit persons to whom the Software is +# furnished to do so, subject to the following conditions: +# +# The above copyright notice and this permission notice shall be included in +# all copies or substantial portions of the Software. +# +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE +# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +# SOFTWARE. +# ~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~ + + +from ruck.util._ray import ray_map +from ruck.util._ray import ray_refs_wrapper + +from ruck.util._timer import Timer diff --git a/ruck/util/_args.py b/ruck/util/_args.py new file mode 100644 index 0000000..c74ec6d --- /dev/null +++ b/ruck/util/_args.py @@ -0,0 +1,77 @@ +# ~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~ +# MIT License +# +# Copyright (c) 2021 Nathan Juraj Michlo +# +# Permission is hereby granted, free of charge, to any person obtaining a copy +# of this software and associated documentation files (the "Software"), to deal +# in the Software without restriction, including without limitation the rights +# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +# copies of the Software, and to permit persons to whom the Software is +# furnished to do so, subject to the following conditions: +# +# The above copyright notice and this permission notice shall be included in +# all copies or substantial portions of the Software. +# +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +# SOFTWARE. 
+# ~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~ + +from argparse import Namespace +from typing import Optional +from typing import Sequence + + +# ========================================================================= # +# Hyper Parameters # +# ========================================================================= # + + +class HParamsMixin(object): + + __hparams = None + + def save_hyperparameters(self, ignore: Optional[Sequence[str]] = None, include: Optional[Sequence[str]] = None): + import inspect + import warnings + # get ignored values + ignored = set() if (ignore is None) else set(ignore) + included = set() if (include is None) else set(include) + assert all(str.isidentifier(k) for k in ignored) + assert all(str.isidentifier(k) for k in included) + # get function params & signature + locals = inspect.currentframe().f_back.f_locals + params = inspect.signature(self.__class__.__init__) + # get values + (self_param, *params) = params.parameters.items() + # check that self is correct & skip it + assert self_param[0] == 'self' + assert locals[self_param[0]] is self + # get other values + values = {} + for k, v in params: + if k in ignored: continue + if v.kind == v.VAR_KEYWORD: warnings.warn('variable keyword argument saved, consider converting to explicit arguments.') + if v.kind == v.VAR_POSITIONAL: warnings.warn('variable positional argument saved, consider converting to explicit named arguments.') + values[k] = locals[k] + # get extra values + for k in included: + assert k != 'self' + assert k not in values, f'{k!r} has already been included' + values[k] = locals[k] + # done!
+ self.__hparams = Namespace(**values) + + @property + def hparams(self): + return self.__hparams + + +# ========================================================================= # +# END # +# ========================================================================= # diff --git a/ruck/util/_iter.py b/ruck/util/_iter.py new file mode 100644 index 0000000..20acc2a --- /dev/null +++ b/ruck/util/_iter.py @@ -0,0 +1,131 @@ +# ~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~ +# MIT License +# +# Copyright (c) 2021 Nathan Juraj Michlo +# +# Permission is hereby granted, free of charge, to any person obtaining a copy +# of this software and associated documentation files (the "Software"), to deal +# in the Software without restriction, including without limitation the rights +# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +# copies of the Software, and to permit persons to whom the Software is +# furnished to do so, subject to the following conditions: +# +# The above copyright notice and this permission notice shall be included in +# all copies or substantial portions of the Software. +# +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +# SOFTWARE. 
+# ~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~ + + +import itertools +import random +from typing import Any +from typing import Callable +from typing import Iterable +from typing import Iterator +from typing import List +from typing import Sequence +from typing import Tuple +from typing import TypeVar + +import numpy as np + + +# ========================================================================= # +# Helper # +# ========================================================================= # + + +T = TypeVar('T') + + +# ========================================================================= # +# iter # +# ========================================================================= # + + +# NOTE: +# Iterable: objects that return Iterators when passed to `iter()` +# Iterator: objects that return the next item when passed to `next()` +# every Iterator is ALSO an Iterable + + +def ipairs(items: Iterable[T]) -> Iterator[Tuple[T, T]]: + itr_a, itr_b = itertools.tee(items) + itr_a = itertools.islice(itr_a, 0, None, 2) + itr_b = itertools.islice(itr_b, 1, None, 2) + return zip(itr_a, itr_b) + + +# ========================================================================= # +# lists # +# ========================================================================= # + + +def chained(list_of_lists: Iterable[Iterable[T]]) -> List[T]: + return list(itertools.chain(*list_of_lists)) + + +def splits(items: Sequence[Any], num_chunks: int, keep_empty: bool = False) -> List[List[Any]]: + # np.array_split will return empty elements if required + if not keep_empty: + num_chunks = min(num_chunks, len(items)) + # we return a list of lists, not a list of + # tuples, so that it is compatible with ray.get + return [list(items) for items in np.array_split(items, num_chunks)] + + +# ========================================================================= # +# random -- used for ruck.functional._algorithm # +#
========================================================================= # + + +def replaced_random_taken_pairs(fn: Callable[[T, T], Tuple[T, T]], items: Iterable[T], p: float, map_fn=map) -> List[T]: + # shallow copy because we want to update elements in this list + # - we need to take care to handle the special case where the length + # of items is odd, thus we cannot just call random_map with modified + # args using pairs and chaining the output + items = list(items) + # select random items + idxs, vals = [], [] + for i, pair in enumerate(zip(items[0::2], items[1::2])): + if random.random() < p: + vals.append(pair) + idxs.append(i) + # map selected values + vals = map_fn(lambda pair: fn(pair[0], pair[1]), vals) + # update values + for i, (v0, v1) in zip(idxs, vals): + items[i*2+0] = v0 + items[i*2+1] = v1 + # done! + return items + + +def replaced_random_taken_elems(fn: Callable[[T], T], items: Iterable[T], p: float, map_fn=map) -> List[T]: + # shallow copy because we want to update elements in this list + items = list(items) + # select random items + idxs, vals = [], [] + for i, v in enumerate(items): + if random.random() < p: + vals.append(v) + idxs.append(i) + # map selected values + vals = map_fn(fn, vals) + # update values + for i, v in zip(idxs, vals): + items[i] = v + # done! 
+ return items + + +# ========================================================================= # +# END # +# ========================================================================= # diff --git a/ruck/util/_ray.py b/ruck/util/_ray.py new file mode 100644 index 0000000..6c5dea2 --- /dev/null +++ b/ruck/util/_ray.py @@ -0,0 +1,113 @@ +# ~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~ +# MIT License +# +# Copyright (c) 2021 Nathan Juraj Michlo +# +# Permission is hereby granted, free of charge, to any person obtaining a copy +# of this software and associated documentation files (the "Software"), to deal +# in the Software without restriction, including without limitation the rights +# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +# copies of the Software, and to permit persons to whom the Software is +# furnished to do so, subject to the following conditions: +# +# The above copyright notice and this permission notice shall be included in +# all copies or substantial portions of the Software. +# +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +# SOFTWARE. 
+# ~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~ + +import functools +from typing import Any +from typing import List +from typing import Sequence + +import ray +from ray.remote_function import RemoteFunction + + +# ========================================================================= # +# ray # +# ========================================================================= # + + +@functools.lru_cache(maxsize=16) +def _to_remote_ray_fn(fn): + if not isinstance(fn, RemoteFunction): + fn = ray.remote(fn) + return fn + + +def ray_map(ray_fn, items: Sequence[Any]) -> List[Any]: + """ + A more convenient alternative to the `map` function of `ray.util.multiprocessing.Pool`. + It uses an API similar to Python's `map`, but returns a list of mapped values + instead of an iterator. + + The advantage of this function is that passed functions are automatically wrapped as + ray.remote functions, and the resulting ObjectRef values are fetched automatically. + """ + # make sure the function is a remote function + ray_fn = _to_remote_ray_fn(ray_fn) + # pass each item to ray and wait for the result + return ray.get(list(map(ray_fn.remote, items))) + + +# ========================================================================= # +# ray - object store # +# ========================================================================= # + + +def ray_refs_wrapper(fn = None, get: bool = True, put: bool = True, iter_results: bool = False): + """ + Wrap a function so that we automatically ray.get + all the arguments and ray.put the result.
+ + iter_results=True instead treats the result as an + iterable and applies ray.put to each result item + + for example: + >>> def mate(a, b): + >>> a, b = ray.get(a), ray.get(b) + >>> a, b = R.mate_crossover_1d(a, b) + >>> return ray.put(a), ray.put(b) + + becomes: + >>> @ray_refs_wrapper(iter_results=True) + >>> def mate(a, b): + >>> return R.mate_crossover_1d(a, b) + """ + + def wrapper(fn): + @functools.wraps(fn) + def inner(*args): + # get values from object store + if get: + args = (ray.get(v) for v in args) + # call function + result = fn(*args) + # store values in the object store + if put: + if iter_results: + result = tuple(ray.put(v) for v in result) + else: + result = ray.put(result) + # done! + return result + return inner + + # handle correct case + if fn is None: + return wrapper + else: + return wrapper(fn) + + +# ========================================================================= # +# END # +# ========================================================================= # diff --git a/ruck/util/_timer.py b/ruck/util/_timer.py new file mode 100644 index 0000000..16cc089 --- /dev/null +++ b/ruck/util/_timer.py @@ -0,0 +1,43 @@ +# ~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~ +# MIT License +# +# Copyright (c) 2021 Nathan Juraj Michlo +# +# Permission is hereby granted, free of charge, to any person obtaining a copy +# of this software and associated documentation files (the "Software"), to deal +# in the Software without restriction, including without limitation the rights +# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +# copies of the Software, and to permit persons to whom the Software is +# furnished to do so, subject to the following conditions: +# +# The above copyright notice and this permission notice shall be included in +# all copies or substantial portions of the Software. 
+# +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +# SOFTWARE. +# ~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~ + +import contextlib +import time + + +# ========================================================================= # +# Timer # +# ========================================================================= # + + +@contextlib.contextmanager +def Timer(name: str): + t = time.time() + yield + print(name, time.time() - t, 'seconds') + + +# ========================================================================= # +# lists # +# ========================================================================= # diff --git a/setup.py b/setup.py new file mode 100644 index 0000000..108817a --- /dev/null +++ b/setup.py @@ -0,0 +1,73 @@ +# ~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~ +# MIT License +# +# Copyright (c) 2021 Nathan Juraj Michlo +# +# Permission is hereby granted, free of charge, to any person obtaining a copy +# of this software and associated documentation files (the "Software"), to deal +# in the Software without restriction, including without limitation the rights +# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +# copies of the Software, and to permit persons to whom the Software is +# furnished to do so, subject to the following conditions: +# +# The above copyright notice and this permission notice shall be included in +# all copies or substantial portions of the Software. 
+# +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +# SOFTWARE. +# ~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~ + +import setuptools + + +# ========================================================================= # +# HELPER # +# ========================================================================= # + + +with open("README.md", "r", encoding="utf-8") as file: + long_description = file.read() + +with open('requirements.txt', 'r') as f: + install_requires = (req[0] for req in map(lambda x: x.split('#'), f.readlines())) + install_requires = [req for req in map(str.strip, install_requires) if req] + + +# ========================================================================= # +# SETUP # +# ========================================================================= # + + +setuptools.setup( + name="ruck", + author="Nathan Juraj Michlo", + author_email="NathanJMichlo@gmail.com", + + version="0.0.1.dev1", + python_requires=">=3.8", + packages=setuptools.find_packages(), + + install_requires=install_requires, + + url="https://github.com/nmichlo/ruck", + description="Performant evolutionary algorithms for Python.", + long_description=long_description, + long_description_content_type="text/markdown", + + classifiers=[ + "License :: OSI Approved :: MIT License", + "Operating System :: OS Independent", + "Programming Language :: Python :: 3.8", + "Intended Audience :: Science/Research", + ], +) + + +# ========================================================================= # +# END # +# 
========================================================================= # diff --git a/tests/test.py b/tests/test.py new file mode 100644 index 0000000..8d6f2ee --- /dev/null +++ b/tests/test.py @@ -0,0 +1,160 @@ +# ~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~ +# MIT License +# +# Copyright (c) 2021 Nathan Juraj Michlo +# +# Permission is hereby granted, free of charge, to any person obtaining a copy +# of this software and associated documentation files (the "Software"), to deal +# in the Software without restriction, including without limitation the rights +# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +# copies of the Software, and to permit persons to whom the Software is +# furnished to do so, subject to the following conditions: +# +# The above copyright notice and this permission notice shall be included in +# all copies or substantial portions of the Software. +# +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +# SOFTWARE. 
+# ~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~ + + +import functools +import random +import numpy as np +import pytest + +from examples.onemax import OneMaxModule +from examples.onemax_minimal import OneMaxMinimalModule +from ruck import Member +from ruck import Trainer +from ruck import R + + +# ========================================================================= # +# TESTS # +# ========================================================================= # + + +def test_mate_keep_order(): + random.seed(77) + np.random.seed(77) + # checks + offspring = R.apply_mate( + population=[Member(c) for c in 'abcde'], + mate_fn=lambda a, b: (a.upper(), b.upper()), + p=0.5, + keep_order=True, + ) + # done + assert ''.join(m.value for m in offspring) == 'ABcde' + + +def test_mate_random_order(): + random.seed(77) + np.random.seed(77) + # checks + offspring = R.apply_mate( + population=[Member(c) for c in 'abcde'], + mate_fn=lambda a, b: (a.upper(), b.upper()), + p=0.5, + keep_order=False, + ) + # done + assert ''.join(m.value for m in offspring) == 'cdBAe' + + +def test_onemax_minimal(): + module = OneMaxMinimalModule() + pop, logbook, halloffame = Trainer(generations=40, progress=False).fit(module) + assert logbook[0]['fit:max'] < logbook[-1]['fit:max'] + + +def test_onemax(): + module = OneMaxModule(population_size=300, member_size=100) + pop, logbook, halloffame = Trainer(generations=40, progress=False).fit(module) + assert logbook[0]['fit:max'] < logbook[-1]['fit:max'] + + +def test_onemax_ea_simple(): + module = OneMaxModule(population_size=300, member_size=100) + + # EA SIMPLE + module.generate_offspring, module.select_population = R.factory_simple_ea( + mate_fn=R.mate_crossover_1d, + mutate_fn=functools.partial(R.mutate_flip_bit_groups, p=0.05), + select_fn=functools.partial(R.select_tournament, k=3), + p_mate=module.hparams.p_mate, + p_mutate=module.hparams.p_mutate, + ) + + pop, logbook, halloffame = Trainer(generations=40, 
progress=False).fit(module) + assert logbook[0]['fit:max'] < logbook[-1]['fit:max'] + + +def test_onemax_mu_plus_lambda(): + module = OneMaxModule(population_size=300, member_size=100) + + # MU PLUS LAMBDA + module.generate_offspring, module.select_population = R.factory_mu_plus_lambda( + mate_fn=R.mate_crossover_1d, + mutate_fn=functools.partial(R.mutate_flip_bit_groups, p=0.05), + select_fn=functools.partial(R.select_tournament, k=3), + offspring_num=250, + p_mate=module.hparams.p_mate, + p_mutate=module.hparams.p_mutate, + ) + + pop, logbook, halloffame = Trainer(generations=40, progress=False).fit(module) + assert logbook[0]['fit:max'] < logbook[-1]['fit:max'] + + +def test_onemax_mu_comma_lambda(): + module = OneMaxModule(population_size=300, member_size=100) + + # MU COMMA LAMBDA + module.generate_offspring, module.select_population = R.factory_mu_comma_lambda( + mate_fn=R.mate_crossover_1d, + mutate_fn=functools.partial(R.mutate_flip_bit_groups, p=0.05), + select_fn=functools.partial(R.select_tournament, k=3), + offspring_num=250, # INVALID + p_mate=module.hparams.p_mate, + p_mutate=module.hparams.p_mutate, + ) + + with pytest.raises(AssertionError, match=r'invalid arguments, the number of offspring: 250 \(lambda\) must be greater than or equal to the size of the population: 300 \(mu\)'): + pop, logbook, halloffame = Trainer(generations=40, progress=False).fit(module) + + # MU COMMA LAMBDA + module.generate_offspring, module.select_population = R.factory_mu_comma_lambda( + mate_fn=R.mate_crossover_1d, + mutate_fn=functools.partial(R.mutate_flip_bit_groups, p=0.05), + select_fn=functools.partial(R.select_tournament, k=3), + offspring_num=400, + p_mate=module.hparams.p_mate, + p_mutate=module.hparams.p_mutate, + ) + + pop, logbook, halloffame = Trainer(generations=40, progress=False).fit(module) + assert logbook[0]['fit:max'] < logbook[-1]['fit:max'] + + + +def test_member(): + m = Member('abc') + assert str(m) == "Member('abc')" + m = Member('abc', 0.5) + 
assert str(m) == "Member('abc', 0.5)" + m = Member('abc'*100, 0.5) + assert str(m) == "Member('abcabcabcabca ... cabcabcabcabc', 0.5)" + m = Member('abc '*100, 0.5) + assert str(m) == "Member('abc abc abc a ... abc abc abc ', 0.5)" + + +# ========================================================================= # +# END # +# ========================================================================= # diff --git a/tests/util.py b/tests/util.py new file mode 100644 index 0000000..c2fe6ca --- /dev/null +++ b/tests/util.py @@ -0,0 +1,70 @@ +# ~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~ +# MIT License +# +# Copyright (c) 2021 Nathan Juraj Michlo +# +# Permission is hereby granted, free of charge, to any person obtaining a copy +# of this software and associated documentation files (the "Software"), to deal +# in the Software without restriction, including without limitation the rights +# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +# copies of the Software, and to permit persons to whom the Software is +# furnished to do so, subject to the following conditions: +# +# The above copyright notice and this permission notice shall be included in +# all copies or substantial portions of the Software. +# +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +# SOFTWARE. 
+# ~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~=~ + +import contextlib +import os +import sys +from contextlib import contextmanager + + +# ========================================================================= # +# TEST UTILS # +# ========================================================================= # + + +@contextmanager +def no_stdout(): + old_stdout = sys.stdout + sys.stdout = open(os.devnull, 'w') + yield + sys.stdout = old_stdout + + +@contextmanager +def no_stderr(): + old_stderr = sys.stderr + sys.stderr = open(os.devnull, 'w') + yield + sys.stderr = old_stderr + + +@contextlib.contextmanager +def temp_wd(new_wd): + old_wd = os.getcwd() + os.chdir(new_wd) + yield + os.chdir(old_wd) + + +@contextlib.contextmanager +def temp_sys_args(new_argv): + old_argv = sys.argv + sys.argv = new_argv + yield + sys.argv = old_argv + + +# ========================================================================= # +# END # +# ========================================================================= #
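Review note on `tests/util.py`: the helpers above restore `sys.stdout`, `sys.stderr`, the working directory and `sys.argv` only when the `with` body finishes normally, because nothing after `yield` runs if the body raises. A minimal sketch of an exception-safe variant follows. The names `quiet_stdout` and `temp_argv` are hypothetical and are not part of this patch; the sketch relies only on the standard library (`contextlib.redirect_stdout` and `try`/`finally`).

```python
import contextlib
import os
import sys


@contextlib.contextmanager
def quiet_stdout():
    # Hypothetical replacement for `no_stdout` (assumption, not in the patch).
    # `redirect_stdout` restores sys.stdout on exit, even if the body raises,
    # and the `with open(...)` block closes the devnull handle afterwards.
    with open(os.devnull, 'w') as devnull:
        with contextlib.redirect_stdout(devnull):
            yield


@contextlib.contextmanager
def temp_argv(new_argv):
    # Hypothetical replacement for `temp_sys_args` (assumption, not in the patch).
    old_argv = sys.argv
    sys.argv = new_argv
    try:
        yield
    finally:
        # runs on both normal exit and exceptions
        sys.argv = old_argv


if __name__ == '__main__':
    with quiet_stdout():
        print('this line is discarded')
    with temp_argv(['prog', '--flag']):
        assert sys.argv == ['prog', '--flag']
```

The same `try`/`finally` shape would apply to `no_stderr` and `temp_wd`; whether the extra safety is worth it for test-only helpers is a judgment call for the author.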