2016-09-19 20:15:24 -06:00
|
|
|
2016 09/19b
|
|
|
|
|
|
|
|
After another 2 hours of trouble shooting, I learned that sympy.subs throws the 'zoo' error for a divide-by-zero if
|
|
|
|
working with integers--and stalls (as I witnessed last year when developing Karoo), but if throws 'inf' or '-inf' if
|
|
|
|
working with floats and continues to process unencumbered. This means the 'zoo' trap has not been used for over a year.
|
|
|
|
It is now removed from the method fx_eval_subs().
|
|
|
|
|
|
|
|
|
2016-09-19 01:13:59 -06:00
|
|
|
2016 09/19
|
|
|
|
|
2016-09-19 14:32:48 -06:00
|
|
|
All experiments with lambdify are on hold as it is throwing 'divide by zero' errors when no 0 exists in the data.
|
|
|
|
|
|
|
|
In the evaluation of this issue, I returned to the means by which I deal with 'divide by zero', independent of the
|
|
|
|
library used to process the trees. Originally, 'result' was replaced by the value 1. But a colleague suggested that I
|
|
|
|
might be tossing genetic material of value to future generations. So, now a tree which produces a 'divide by zero'
|
|
|
|
error ('zoo' in .subs) is retained, but given a fitness score of 0.
|
|
|
|
|
|
|
|
Method fx_eval_subs() is updated to assign an 'error' tag to any tree which exhibits 'divide by zero' behaviour.
|
|
|
|
|
|
|
|
Method fx_fitness_eval() is updated to bypass processing of a tree which has been tagged with an 'error' tag.
|
|
|
|
|
|
|
|
Methods fx_test_classify(), fx_test_regress(), and fx_test_match() are each updated to send the 'tree_id' to
|
|
|
|
fx_eval_subs() as is now required.
|
|
|
|
|
|
|
|
For now, it is not recommended that you use the lambdify function in fx_eval_subs(), unless you know how to fix it.
|
2016-09-19 01:13:59 -06:00
|
|
|
|
|
|
|
|
2016-09-16 15:53:14 -06:00
|
|
|
2016 09/16
|
|
|
|
|
|
|
|
With the 09/14 update I failed to upload the new coefficients.csv file to the files/ directory. While not yet engaged,
|
|
|
|
this will be the means by which the user can define the constants desired for the Karoo GP run. If you had run Karoo GP
|
|
|
|
v0.9.2.0 in the past 24 hours without this file, it would have complained. My apology.
|
|
|
|
|
|
|
|
Also, a bit of a roadmap for the 2nd half of 2016, into 2017
|
|
|
|
- validate the new (faster) sympy.lambdify and fully replace the current (slower) sympy.subs
|
|
|
|
- replace the row-by-row dictionaries with vectors for what should be a significant performance increase
|
|
|
|
- complete the introduction of constants in a manner more well defined than is currently supported
|
|
|
|
- investigate replacing pprocess with the multicore library
|
|
|
|
- introduce Theano or Tensor Flow for GPU support
|
|
|
|
|
|
|
|
I welcome any assistance with these, if anyone has experience and time.
|
|
|
|
|
|
|
|
|
2016-09-14 19:41:13 -06:00
|
|
|
2016 09/14 - version 0.9.2.0
|
|
|
|
|
|
|
|
In karoo_gp_base_class.py
|
2016-09-19 01:13:59 -06:00
|
|
|
- Removed redundant lines in the method 'fx_karoo_data_load()'
|
2016-09-14 19:41:13 -06:00
|
|
|
- Added support for the Sympy 'lambdify' function in 'fx_karoo_data_load' (see explanation below)
|
|
|
|
- Added a draft means of catching divide-by-zero errors in the new 'lambdify' function
|
|
|
|
- Discovered the prior 'fx_eval_subs' incorrected applied a value of 1 to the variable 'result' as a means to
|
|
|
|
replace the 'zoo' function for divide by zero errors. However, this could inadvertantly undermine the success of
|
|
|
|
Classification and Regression runs. My apology for not catching this sooner.
|
|
|
|
|
|
|
|
"While attending the CHEAPR 2016 workshop hosted by the Center for Cosmology and Astro-Particle Physics, The Ohio State
|
2016-09-19 01:13:59 -06:00
|
|
|
University, Michael Zevin of Northwestern University proposed that Karoo GP *should* be able to process trees far faster
|
|
|
|
than what we were seeing. I looked into the Sympy functions I was at that time using. Indeed, '.subs' is noted as easy
|
|
|
|
to use, but terribly slow as it relies upon an internal, Python mathematical library. I therefore replaced '.subs' with
|
2016-09-14 19:41:13 -06:00
|
|
|
'.lambdify' which calls upon the C-based Numpy maths library. It is slated to be 500x faster than '.subs', but I am
|
|
|
|
seeing only a 2x performance increase. Clearly, there are yet other barriers to remove.
|
|
|
|
|
|
|
|
In the new 'fx_eval_subs' method you will find both sympy.subs (active) and sympy.lambdify. While preliminary tests
|
|
|
|
worked well, I witnessed an erractic outcome which I yet need to reproduce and investigate. Feel free to comment the
|
|
|
|
.subs and uncomment the .lambdify sections and take it for a spin.
|
|
|
|
|
|
|
|
I believe there are 2 more steps to increase performance: removing the dictionaries which contain each row, such that
|
|
|
|
Karoo is working directly with the Numpy array again, and then processing the array as a vector instead. But this will
|
|
|
|
require substantial recoding.
|
|
|
|
|
|
|
|
I'll keep you informed ..." --kai
|
|
|
|
|
|
|
|
|
2016-08-10 00:09:13 -06:00
|
|
|
2016 08/08 - version 0.9.1.9
|
|
|
|
|
|
|
|
In karoo_gp_base_class.py
|
2016-09-19 01:13:59 -06:00
|
|
|
- Created a new method 'fx_eval_subs()' which conducts the SymPy subs function for both train and test data.
|
2016-08-10 00:09:13 -06:00
|
|
|
- Consolidated duplicate SymPy sub calls in both Train and Test methods, at Erik H. suggestion --thank you!
|
|
|
|
- Consolidated duplicate lines into single lines in both Train and Test methods.
|
|
|
|
- Fixed bug in which Ramped 50/50 trees would not print in Play mode; removed Ramped 50/50 option from Play mode as
|
|
|
|
it does not make sense for a single Tree
|
|
|
|
|
|
|
|
|
2016-07-24 23:02:12 -06:00
|
|
|
2016 07/24 - version 0.9.1.8b
|
|
|
|
|
|
|
|
Sorry. Switched all references to "LOGIC" back to "BOOLEAN". Don't ask.
|
|
|
|
|
|
|
|
|
2016-07-18 07:25:47 -06:00
|
|
|
2016 07/18 - version 0.9.1.8
|
|
|
|
|
|
|
|
In karoo_gp_base_class.py
|
|
|
|
- removed some programmer's comments from experiments resolved
|
|
|
|
- modified the screen output for the multi-class classifier such that it is more clear that p (a) is actually
|
|
|
|
predicted (actual) with the addition of the tag "label"
|
|
|
|
|
|
|
|
|
|
|
|
|
2016-08-10 00:09:13 -06:00
|
|
|
2016 07/15 - version 0.9.1.7
|
2016-07-15 00:02:28 -06:00
|
|
|
|
|
|
|
In karoo_gp_base_class.py
|
|
|
|
- Ramped Half/Half, as originally designed by John Koza is now implemented!
|
|
|
|
- this offers a greater diversity of trees in the initial population
|
|
|
|
- added *** Crossover *** to this method for debug display mode
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
2016 07/14 - version 0.9.1.6b
|
|
|
|
|
|
|
|
In karoo_gp_base_class.py
|
|
|
|
- fixed 'debug' display output for branch mutation
|
|
|
|
- fixed a few bugs in branch mutation that reduced evolutionary efficiency
|
|
|
|
- updated all ERROR messages to be uniform in output
|
|
|
|
- added (y/n) to "Are you certain you want to quit?" message --thanks Hunter!
|
|
|
|
|
|
|
|
In karoo_gp_main.py
|
|
|
|
- reset default evoluationary balance to .1/.1/.1/.7
|
|
|
|
|
|
|
|
|
|
|
|
|
2016-07-13 22:52:42 -06:00
|
|
|
2016 07/13 - version 0.9.1.6
|
|
|
|
|
|
|
|
In karoo_gp_base_class.py
|
|
|
|
- enabled auto counting of labels (right column) in loaded data (data_y)
|
|
|
|
- re-ordered presentation of screen modes, in order of decreasing feedback [i, g, m, s]
|
|
|
|
- added argparse to further enhance karoo_gp_server.py as a scriptable tool --FINALLY! :)
|
|
|
|
|
|
|
|
In karoo_gp_main.py
|
|
|
|
- removed user query for number of class labels
|
|
|
|
- re-ordered presentation of screen modes, in order of decreasing feedback [i, g, m, s]
|
|
|
|
|
|
|
|
In karoo_go_server.py
|
|
|
|
- re-ordered presentation of screen modes, in order of decreasing feedback [i, g, m, s]
|
|
|
|
- argparse to further enhance karoo_gp_server.py as a scriptable tool
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
2016 07/11 - version 0.9.1.5
|
|
|
|
|
|
|
|
In karoo_gp_base_class.py
|
|
|
|
- renamed kernel (fitness function) type from 'a' to 'r' for regression
|
|
|
|
- renamed all associated variable and methods to match switch from 'abs_value' to 'regression'
|
|
|
|
- reorganised validation methods to match order of fitness functions
|
|
|
|
- added [other] place holder to validation methods
|
|
|
|
|
|
|
|
In karoo_gp_main.py
|
|
|
|
- renamed kernel (fitness function) type from 'a' to 'r' for regression
|
|
|
|
- adjusted upper boundary for qty of generations from 1000 to 100
|
|
|
|
|
|
|
|
In karoo_go_server.py
|
|
|
|
- renamed kernel (fitness function) type from 'a' to 'r' for regression
|
|
|
|
|
|
|
|
In files/ data_ABS.csv and functions_ABS.csv were rename data_REGRESS.csv and functions_REGRESS.csv accordingly.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
2016 07/10-11 - version 0.9.1.4
|
|
|
|
|
|
|
|
In karoo_gp_base_class.py
|
|
|
|
- renamed variable 'tree_depth_max' to 'tree_depth_base'
|
|
|
|
- renamed variable 'tree_depth_adj' to 'tree_depth_max'
|
|
|
|
- renamed variable 'gp.pop_tree_depth_max' to 'gp.pop_tree_depth_base'
|
|
|
|
- enabled 2 children to be produced by each Crossover function
|
|
|
|
- extensive testing of Crossover in debug to validate process
|
|
|
|
- reduced # of crossover functions per run by 1/2
|
|
|
|
- improved on-screen output for all 4 genetic operators
|
|
|
|
- removed not-in-use 'accuracy' test
|
|
|
|
- added 'Arguments required:' to each method notes
|
|
|
|
- edited a number of method notes
|
|
|
|
|
|
|
|
In karoo_gp_main.py
|
|
|
|
- renamed variables (according to karoo_gp_base_class.py)
|
|
|
|
|
|
|
|
In karoo_go_server.py
|
|
|
|
- renamed variables (according to karoo_gp_base_class.py)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
2016 07/08-09 - version 0.9.1.3
|
|
|
|
|
|
|
|
In karoo_gp_base_class.py
|
|
|
|
- added CTRL-C catch to the (pause) menu; removes potential to accidentally kill a run when attempting to
|
|
|
|
copy/paste an on-screen function to research notes (use the mouse instead).
|
|
|
|
- rebuilt each (pause) menu function for improved exception handling
|
|
|
|
- added the new gp.tree_depth_adj user defined variable to branch mutation and crossover, enabling Trees to grow
|
|
|
|
beyond their original size which adds opportunity for more complex solutions, as well as the unavoidable bloat
|
|
|
|
- reduced complexity of a few lines in both branch mutation and crossover methods
|
|
|
|
- tested, tested, tested
|
|
|
|
|
|
|
|
In karoo_gp_main.py
|
|
|
|
- added CTRL-C catch to the at-launch user menu.
|
|
|
|
- renamed the method gp.fx_karoo_crossover_reproduction() to gp.fx_karoo_crossover()
|
|
|
|
- added user input for the new global variable gp.tree_depth_adj
|
|
|
|
|
|
|
|
In karoo_go_server.py
|
|
|
|
- added new gp.tree_depth_adj variable
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
2016 07/07 - version 0.9.1.2
|
|
|
|
|
|
|
|
In preparation for public launch of Karoo GP, a number of updates are complete or underway.
|
|
|
|
|
|
|
|
The Quick Start Tutorial is being fully revised. A number of corrections were made, but more importantly, all new
|
|
|
|
content has been added relevant to preparation of datasets and the use of the Karoo GP Tools accordingly. The genetic
|
|
|
|
operators descriptions now feature visuals and revised descriptions, as to many other sections.
|
|
|
|
|
|
|
|
In the karoo_gp/tools/ directory, all scripts have undergone updates, 2 of which now offer automated scaling and a
|
|
|
|
user interface that in the original versions were not present, as follows:
|
|
|
|
|
|
|
|
karoo_data_sort.py (formerly karoo_features_sort.py)
|
|
|
|
This script now engages the user with a query for the number of class labels and the number of data points (rows)
|
|
|
|
for the new, randomly generated subset of the parent dataset. This script is designed to be used prior to
|
|
|
|
karoo_normalise.py.
|
|
|
|
|
|
|
|
karoo_normalise.py
|
|
|
|
This script now auto-scales to any number of columns and rows (within the limit of your computer's capability),
|
|
|
|
and features a text-based user interface. This script is designed to be used following karoo_data_sort.py.
|
|
|
|
|
|
|
|
karoo_multiclassifier.py
|
|
|
|
This script functions as before, but with a minor bug fixed in which the final class was mislabeled.
|
|
|
|
|
|
|
|
karoo_iris_plot.py
|
|
|
|
This script functions as before, but with improved in-script documentation and cleaner code.
|
|
|
|
|
|
|
|
|
|
|
|
In development now are a number of updates and improvements to the base_class such that Karoo GP will more readily
|
|
|
|
conform to the GP standards, as follows:
|
|
|
|
|
|
|
|
1) Karoo GP currently produces only 1 offspring for each parent, where it should produce 2.
|
|
|
|
|
|
|
|
2) The tree generation method 'Ramped Half/Half', in its current state, produces a 50/50 split of Full and Grow
|
|
|
|
methods, not a graduated ramp through all depths.
|
|
|
|
|
|
|
|
3) Karoo GP currently engages a bloat inhibitor, that is, an upper limit on tree depth which is maintained through
|
|
|
|
all modes of mutation and crossover. This will become a user defined setting such that it can be adjusted, enabling
|
|
|
|
growth of trees beyond the original, user defined limit.
|
|
|
|
|
|
|
|
4) Karoo GP will be made to launch as a single, command-line function with all required parameters included, SciKit
|
|
|
|
Learn style.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
2015 12/23 - version 0.9.1.1
|
|
|
|
|
|
|
|
It was discovered that when loading external datasets, Karoo was yet extracting variables (terminals) from the data in
|
|
|
|
the files/ directory, according to the selected kernel.
|
|
|
|
|
|
|
|
This is fixed.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
2015 11/04 - version 0.9.1.0
|
|
|
|
|
|
|
|
Initial development of Karoo GP began in February 2015, on a Python-based evolutionary algorithm for an MSc research
|
|
|
|
project at the University of Cape Town (UCT) / African Institute for Mathematical Sciences (AIMS) and the Square
|
|
|
|
Kilometer Array (SKA). The myriad debug statements evolved into the user interface while the classic Machine Learning
|
|
|
|
test cases became the built-in example runs.
|
|
|
|
|
|
|
|
In the course of six months development, the code base grew to become a flexible, easy-to-use platform for Genetic
|
|
|
|
Programming.
|
|
|
|
|
|
|
|
Karoo GP has been thoroughly tested on a 40-core server at the Square Kilometer Array offices in Cape Town, South
|
|
|
|
Africa, where for one month it chewed through 10,000 rows of data for up to 50 hours without incident. It is proved
|
|
|
|
as a fully functional, multi-core workhorse.
|
|
|
|
|
|
|
|
With all development to date conducted locally, this version 0.9 marks the first release to github.
|
|
|
|
|
|
|
|
This initial github release is private, shared with select collaborators only. Please do not distribute any part of
|
|
|
|
the code until it is made public.
|
|
|
|
|
|
|
|
Kai Staats
|
|
|
|
|
|
|
|
www.kaistaats.com/research/
|
|
|
|
www.kaistaats.com/film/
|