652 lines
31 KiB
Plaintext
652 lines
31 KiB
Plaintext
2019 05/18
|
|
|
|
"My apology to the users of Karoo GP for the incredible gap between the most recent and this update. My research and
|
|
work have for the past year taken me in other directions. This update marks my return to an active effort in working on
|
|
Karoo, and GP applied to a growing research space. I appreciate your patience and understanding." --kai
|
|
|
|
NOTE: this is the *final* update to the Python 2.7 version of Karoo. All future development will be in Python 3.x.
|
|
|
|
And now, the updates:
|
|
|
|
Fixed the Test function to evaluate the Tree selected by the user.
|
|
|
|
Modified the menu option from 't'est to 'e'valulate from the Pause menu.
|
|
|
|
Added a user defined (m)in node count, (f)ull set of features, or (b)oth as the gene_pool regulator. For more about
|
|
the "full set of features" see 2018 04/19 and the new 'swim' variable.
|
|
|
|
|
|
2018 05/18
|
|
|
|
Fixed a bug which automically re-engaged the run after selecting a pause menu option if mid generation. Now, it again
|
|
requires ENTER before continuing.
|
|
|
|
|
|
2018 05/10
|
|
|
|
Returned full_path = os.path.realpath(__file__) (approx. line 342) to support bash scripting of automated runs. This
|
|
remains commented, but can be readily re-activated (remember to comment the next line down).
|
|
|
|
|
|
2018 04/22
|
|
|
|
1) Completed the isolation of the user interface from the base_class such that all mid-run parameter configurations and
|
|
population summaries are now conducted in menu.pause(), and any associatede executions (Tree validation) are conducted
|
|
fully in the base_class.
|
|
|
|
karoo_gp.py imports karoo_gp_base_class.py
|
|
karoo_gp_base_class.py imports karoo_gp_pause.py
|
|
|
|
The user's initial run configuration is conducted in karoo_gp.py. Those settings are passed into karoo_gp_base_class.py
|
|
where they are used to guide the run. At each 'pause' (generation, interactive, or debug display modes; or end of run)
|
|
the user is taken into menu.pause where additional queries and configurations are conducted. Those value changes are
|
|
returned to the base_class where they are executed. When a run is completed (Server or Desktop), the user is returned
|
|
to the original karoo_gp.py for salutations and run termination.
|
|
|
|
Finally, the Karoo GP Python pasta is no longer spaghetti!
|
|
|
|
2) Modified the function of 'cont' such that it either continues if you yet have additional generations to go, or
|
|
querie the user if gen_id = gen_max and the prior run is done.
|
|
|
|
3) Replaced 'generation_id' with 'gen_id' and 'generation_max' with 'gen_max'.
|
|
|
|
4) Conducted a full set of unit tests on every user interface query in menu.pause().
|
|
|
|
5) Cleaned up the code beneath the user interface (menu.pause), making the while-true loops consistent with those in
|
|
karoo_gp_pause.py.
|
|
|
|
6) Merged 'fx_karoo_continue' into 'fx_karoo_gp' such that a simple while loop maintains an active evolutionary process
|
|
until the either the Server script is complete, or the user has terminated the run. This allows for the simple addition
|
|
of generations to an existing run. A far better solution than what I had previously coded.
|
|
|
|
7) Discovered that the loading of seed population_s is far too delicate. Even the slightest change and it fails. I have
|
|
disabled the load function for now, until I have time to rebuild it.
|
|
|
|
8) Please note that all modules are now stored in modules/
|
|
|
|
|
|
2018 04/19
|
|
|
|
Added a new kind of gene_pool filter (environmental pressure) such that you can select partial or full inclusion of
|
|
the operands (features). While not yet incorporated into the user interface (desktop or server), the new 'swim' setting
|
|
enables you to build consecutive gene pools that have *only* Trees that include at least one instance of each feature.
|
|
This may be helpful for regression problems moreso than classification, as regression tends to fall to a minimum
|
|
feature set, depending upon the nature of the solution (fixed or rolling value). I desire to merge tree_depth_min
|
|
setting and this new 'swim' into one user defined system that controls the gene_pool.
|
|
|
|
Discovered 3 more functions that did not include a "Called by: ___" statement. Updated.
|
|
|
|
Fixed a long-standing issue of total failure if you enter 1 for the total number of generations. This setting, while
|
|
it may seem odd, allows Karoo GP to be used to create populations of stochastic multivariate expressions based upon the
|
|
given dataset for use in other applications. Simple fix. The population is written to the runs/ directory as with all
|
|
other modes of operation, including Play.
|
|
|
|
|
|
2018 04/12c
|
|
|
|
Updated the User Guide to match the Desktop and Serve merger, and modified a help print statement in the sys args of
|
|
the Server function.
|
|
|
|
|
|
2018 04/12b
|
|
|
|
Created the 'modules/' directory to hold karoo_base_class and in the future, more modules.
|
|
|
|
|
|
2018 04/12a
|
|
|
|
1) Merged 'Karoo GP Server' and 'Karoo GP Main' into a single 'Karoo GP'. So, just one user script now that provides the
|
|
user interface (TUI) or scripting (comnand line). There are now 3 ways to run Karoo:
|
|
|
|
a) $ run karoo_gp.py
|
|
b) $ run karoo_gp.py [path/][dataset.csv]
|
|
c) $ run karoo_gp.py -fil [path/][dataset.csv] -[arg] [val] -[arg] [val] ... etc.
|
|
|
|
Where [arg][val] are the sysargs defined in the body of karoo_gp.py and the User Guide. Both (a) and (b) will auto-
|
|
run the Desktop interface, providing a user query for each entry and holding the user within the script until quit;
|
|
(c) will skip the user query and go directly into execution, dropping the user back to the command line at the end of
|
|
the run (for scripting).
|
|
|
|
|
|
2) Merged the 'menu' ranges into the while loops, directly (to reduce line count).
|
|
|
|
3) Added the 2**(tree_depth_base +1) - 1 to the tree_depth_min calculation (could have done this a long time ago).
|
|
|
|
4) Auto-calc the number of trees entered into the Tournament (was previously hard coded to 7 assuming a pop 100).
|
|
|
|
5) Added [-evr, -evp, -evb, -evc] to the sysargs to support a change in the balance of the genetic operators mid-run.
|
|
|
|
6) Merged fx_evolve_branch_top_copy with fx_evolve_branch_body_copy as they were alway called sequencially and required
|
|
the same input values.
|
|
|
|
7) Merged fx_evolve_tree_renum into fx_eval_generation as both are used only once, as the former was but 2 lines of code.
|
|
|
|
8) Finally finished commenting all "Called by: ___" statements in each function header.
|
|
|
|
I've been productive!
|
|
|
|
|
|
2018 04/01
|
|
|
|
Two major updates, as follows:
|
|
|
|
1) I have returned to the basic structure of Karoo GP and worked to make it properly Pythonic. For those of you who
|
|
recognised the error in my ways, I appreciate your patience. What I have updated with this v1.1 release does not
|
|
improve performance nor stability (already rock solid), rather, it keeps properly trained developers from cringing
|
|
when they read my code.
|
|
|
|
In the prior versions, I used one of the loose functions of Python that allows the initiation of a variable outside of
|
|
the base_class. Meaning, I imported the base_class to karoo_gp_main.py and karoo_gp_server.py, called a dozen global
|
|
variables by "gp.___" and as such initiated them from the user scripts. Yes, it works. But it caused a certain
|
|
individual with whom I worked at the SKA to yell at me and shake her head (and I had just met her a few days prior).
|
|
I don't like being yelled at, so I fixed it.
|
|
|
|
Now, all user configurations are set as local variables and their values are passed to the base_class.
|
|
|
|
To make thing a bit easier to read and follow, I have renamed and reorganized several functions (methods) as follows:
|
|
|
|
a) Created a new category for data and archiving, moving/renaming all related functions to fx_data_.
|
|
|
|
b) Renamed all fx_gen_ to fx_init_ as this is the initial population.
|
|
|
|
c) Moved the four top-level functions used to construct the next generation of trees into their own fx_nextgen_
|
|
category: fx_karoo_reproduce, fx_nextgen_point_mutate, fx_nextgen_branch_mutate, fx_nextgen_crossover
|
|
|
|
The final breakdown is as follows:
|
|
fx_karoo_ Methods to Run Karoo GP
|
|
fx_data_ Methods to Load and Archive Data
|
|
fx_init_ Methods to Construct the 1st Generation
|
|
fx_eval_ Methods to Evaluate a Tree
|
|
fx_fitness_ Methods to Train and Test a Tree for Fitness
|
|
fx_nextgen_ Methods to Construct the next Generation
|
|
fx_evolve_ Methods to Evolve a Population
|
|
fx_display_ Methods to Visualize a Tree
|
|
|
|
That's it! --kai
|
|
|
|
|
|
2018 02/27
|
|
Updated the Python library versions and improved some explanation of Operators and Operands in the User Guide.
|
|
|
|
|
|
2017 10/26
|
|
An upgrade from Tensorflow 1.1 to 1.3 caused the Classify kernel test to break. Fixed by Iurii by replacing [] with ()
|
|
in the 'fx_fitness_eval' method.
|
|
|
|
|
|
2017 10/17
|
|
To be consistent with the anticipated Machine Learning vocabulary, replaced the term 'label' with 'pred_labels'
|
|
(predicted labels) in the TF graph methods.
|
|
|
|
|
|
2017 08/10
|
|
Iurii fixed a minor bug in which the MATCHING function (used only for demonstrations) would not find a match despite
|
|
the output being (apprently) correct. This was discovered to be due to the way in which TensorFlow was handling
|
|
floating points and precision, as follows:
|
|
|
|
[ -0.29999995 0.69999981 13.69999981 16.70000076 19.70000076
|
|
22.70000076 25.70000076 28.70000076 31.70000076 34.70000076]
|
|
|
|
[ -0.30000001 0.69999999 13.69999981 16.70000076 19.70000076
|
|
22.70000076 25.70000076 28.70000076 31.70000076 34.70000076]
|
|
|
|
As you can see, the values are close, but not equal and so a "match" was not resolved. Iurii used numpy.allclose.html
|
|
as a reference to resolve the situation.
|
|
|
|
I also modified the autosave to the runs/ directory such that if you are using an external dataset (quite likely), the
|
|
new directory (for each run) will be saved as [filename]-[date_time_stamp]/ The idea (thank you Marco) is to help keep
|
|
multiple, automated runs organized and more readily, visually inspected by name alone.
|
|
|
|
|
|
2017 07/21
|
|
|
|
In a rather embarrassing, live demo in which I asked for the audience to create the dataset for Karoo, I discovered a
|
|
bug in the MATCH kernel in which a negative value in the dataset would cause that row to be discarded from the fitness
|
|
function --FIXED.
|
|
|
|
I merged the 3 methods fx_fitness_train_classify, fx_fitness_train_regress, fx_fitness_train_match into fx_fitness_eval
|
|
in order to reduce the quantity of lines of code and simplify the workflow.
|
|
|
|
|
|
2017 07/03
|
|
|
|
I am pleased to announce that Karoo GP is now updated to include a full suite of mathematical operators. I thank the
|
|
expert code development of Iurii Milovanov. He was instrumental in bringing TensorFlow into Karoo last year, and has
|
|
now provided this important improvement.
|
|
|
|
Iurii has prpared an efficient and elegant solution for the addition of a full range of operators, adding support for
|
|
boolean operations (a and b or c), comparison ops (a > b <= c == d) and generally speaking any function available in
|
|
TensorFlow, as given here: https://www.tensorflow.org/api_guides/python/math_ops
|
|
|
|
Now there is a mapping in Karoo GP that connects an expression to the TF function:
|
|
|
|
OPERATOR EXAMPLE
|
|
add a + b
|
|
subtract a - b
|
|
multiply a * b
|
|
divide a / b
|
|
pow a ** 2
|
|
|
|
negative -a
|
|
logical_and a and b
|
|
logical_or a or b
|
|
logical_not not a
|
|
equal a == b
|
|
not_equal a != b
|
|
less a < b
|
|
less_equal a <= b
|
|
greater a > b
|
|
greater_equal a >= 1
|
|
|
|
abs abs(a)
|
|
sign sign(a)
|
|
square square(a)
|
|
sqrt sqrt(a)
|
|
pow pow(a, b)
|
|
log log(a)
|
|
log1p log1p(a)
|
|
cos cos(a)
|
|
sin sin(a)
|
|
tan tan(a)
|
|
acos acos(a)
|
|
asin asin(a)
|
|
atan atan(a)
|
|
|
|
Now when Karoo parses an expression like "asin(x + 10)" it first parses and transforms the args "x + 10" and passes
|
|
the output to “tf.asin” function. So now Karoo is able to evaluate incredibly complx mathematical expressions.
|
|
|
|
It is important to refer to karoo_gp/files/templates/operators_list.txt to learn how to enage operators_[KERNEL].csv
|
|
In its current form Karoo requires some operators, such as abs, cos, and sin to have a unique formating:
|
|
|
|
+ sin,2
|
|
- sin,2
|
|
* sin,2
|
|
/ sin,2
|
|
|
|
In a future version of Karoo, this will be managed automatically by the recursive function which extracts the
|
|
expression from a GP Tree.
|
|
|
|
|
|
2017 06/06
|
|
|
|
A number of changes applied in March and April. My apologies for the delayed updates to github.
|
|
|
|
A timer is added. This is helpful when conducting comparisons of run times, given various configurations and datasets.
|
|
If you go looking for this code, search for 'time.time' in _main.py for the Desktop interface, and in _base_class.py
|
|
for the Server interface.
|
|
|
|
Raised tree_depth_max to 100 in _main in order to essentially remove the tree depth ceiling, allowing for the multi-
|
|
generational increase in tree depth. While this re-introduces bloat, it does provide for a greater degree of
|
|
experimental freedom. A reminder that the _server has always allowed for tree_depth_max to be set to any level.
|
|
|
|
Changed the balance of operators (reproduction, mutation, crossover) from a percentage to a fixed quantity in both the
|
|
default settings and in the user defined alteration.
|
|
|
|
The User Guide is updated with some minor adjustments to wording.
|
|
|
|
I have removed tools/ from Karoo package, as these will be re-launched as their own set of packages and associated
|
|
github repository. I am working to give each a proper interface and more broad functionality outside of the
|
|
Karoo package. Stay tuned!
|
|
|
|
Please also note that the variable os.environ (line 28 in _base_class.py) can be set to [0, 1, 2] to trigger 3 levels
|
|
of on-screen, TensorFlow run-time error logging. You can experiment with this to determine what is right for you.
|
|
|
|
The much desired trig operators will be returned to Karoo GP v1.0.4. Log and boolean operators shortly thereafter. I
|
|
offer my apology for the delay, as I know this has inhibited some expanded uses of Karoo in the past few months.
|
|
|
|
|
|
2017 02/13
|
|
|
|
Returned 'cwd = os.getcwd()' to 'cwd = os.path.dirname(full_path)' in fx_karoo_data_load in order to sustain chron
|
|
compliance (sorry Marco). Then split end-of-run logging into 2 files: runtime_config.txt and test.txt. Looking to the
|
|
future in which the test log, in particular, might be stored in such a way as to be readily parsed by an external
|
|
script for automated comparison of multiple runs.
|
|
|
|
Ideas are welcomed!
|
|
|
|
|
|
2017 02/09
|
|
|
|
The User Guide has been updated to match the functions of Karoo GP v1.0.1.
|
|
|
|
|
|
2017 02/08
|
|
|
|
The parameters.txt file now captures the full run-time configuration of both karoo_gp_main.py and karoo_gp_server.py.
|
|
This file is written to a unique (datetime stamp) directory in karoo_gp/runs/ at the auto-termination of _server.py
|
|
and with the user executed 'q'uit of _main.py.
|
|
|
|
The final list of best fit evolved Trees and the test of the highest numbered (usually the highest performing) Tree is
|
|
also recorded, with the test automatically executed based upon the original kernel designation. This functionality
|
|
supports the fully hands-off execution of Karoo GP on a remote server, from a bash or Python script, for parallel or
|
|
multiple serial evolutionary runs.
|
|
|
|
Thank you Marco Cavaglia for guiding the addition of this functionality to Karoo GP!
|
|
|
|
|
|
2017 02/06
|
|
|
|
Graphics Processing Units (GPU) are now supported with the introduction of the Python library TensorFlow. The end
|
|
result is a staggering improvement in performance. With one comparison of a 10,000 data points (rows) x 9 features
|
|
(columns) dataset on a 40 core Intel Xeon motherboard versus a 2000 core Nvidia GPU card, the wall time was reduced
|
|
from 50 hours to less than 4 minutes. On CPU-only computers, the performance on a single core is as much as 10x
|
|
improved due to the vectorisation of the data and application of the C-based TensorFlow maths library.
|
|
|
|
To install TensorFlow, I recommend visiting https://www.tensorflow.org/get_started/ It is straight forward for Ubuntu,
|
|
but unfortunately can be rather challenging with OSX. Have patience. Review the forums. It's worth the effort.
|
|
|
|
I owe many thanks to the expertise of Iurii Milovanov, a contract developer whom I engaged for this effort. While the
|
|
number of lines of Karoo GP modified were initially less than a dozen (replacing the multi-core pprocess calls), I asked
|
|
Iurii to also rewrite the test functions. As such, both Training and Testing are now fully GPU enabled. Thank you!
|
|
|
|
A number of other changes have been integrated, including:
|
|
|
|
- Karoo GP is now developed against Python 2.7 as provided with Ubuntu Desktop 16.04.1.
|
|
|
|
- A number of Python methods have been deleted, added, modified and/or renamed. In particular, in the category
|
|
'fx_fitness_' If you have built your own code based on the Karoo methods, please review this section carefully.
|
|
|
|
- The user engaged 'bal'ance function (pause menu) has been rebuilt to anticipate exact quantities instead of
|
|
percentages, enabling the user to define precisely how many of each of the four genetic operators will be applied
|
|
with the construction of each subsequent population.
|
|
|
|
- Activation of the 'test' is now conducted with only the letter 't' and the option to engage a specific number of
|
|
'c'ores is removed. Therefore, the 't'imer mode is also removed, as this was a means to discover the optimal number
|
|
for multi-core processing which is now automated by TensorFlow.
|
|
|
|
- The libraries 'pprocess' and 'time' are no longer required nor imported.
|
|
|
|
- The population_* files (.csv) are now written into unique directories created inside of karoo_gp/runs/ with the
|
|
launch of each run. A .txt file is also written to each directory which captures the run-time configuration of
|
|
Karoo GP. This enables truly scriptable runs of Karoo.
|
|
|
|
- The Server interface to Karoo GP (karoo_gp_server.py) now terminates completely, kicking back to the command line.
|
|
This enables bash or chron launches of multiple sequential or parallel runs, enabling the exploration of multiple
|
|
runs with identical configuration, or that of varied configuration parameters.
|
|
|
|
Finally, Karoo GP is now a 1.0 release. I never know when to transition from beta to real, so please forgive me if I
|
|
jumped the gun. But with GPU support and the revised Server script, I have find Karoo GP to be a fully functional,
|
|
powerful machine learning tool. I hope you will agree --kai
|
|
|
|
|
|
2016 09/20
|
|
|
|
Fixed the genetic operator (b)alance function to work with large than 100 trees per population.
|
|
|
|
Introduced the pause for all runtime modes in the Desktop application, such that the user can apply configurations prior
|
|
to the run (eg: change the balance of the genetic operators or the number of engaged cores).
|
|
|
|
|
|
2016 09/19b
|
|
|
|
After another 2 hours of trouble shooting, I learned that sympy.subs throws the 'zoo' error for a divide-by-zero if
|
|
working with integers--and stalls (as I witnessed last year when developing Karoo), but if throws 'inf' or '-inf' if
|
|
working with floats and continues to process unencumbered. This means the 'zoo' trap has not been used for over a year.
|
|
It is now removed from the method fx_eval_subs().
|
|
|
|
|
|
2016 09/19
|
|
|
|
All experiments with lambdify are on hold as it is throwing 'divide by zero' errors when no 0 exists in the data.
|
|
|
|
In the evaluation of this issue, I returned to the means by which I deal with 'divide by zero', independent of the
|
|
library used to process the trees. Originally, 'result' was replaced by the value 1. But a colleague suggested that I
|
|
might be tossing genetic material of value to future generations. So, now a tree which produces a 'divide by zero'
|
|
error ('zoo' in .subs) is retained, but given a fitness score of 0.
|
|
|
|
Method fx_eval_subs() is updated to assign an 'error' tag to any tree which exhibits 'divide by zero' behaviour.
|
|
|
|
Method fx_fitness_eval() is updated to bypass processing of a tree which has been tagged with an 'error' tag.
|
|
|
|
Methods fx_test_classify(), fx_test_regress(), and fx_test_match() are each updated to send the 'tree_id' to
|
|
fx_eval_subs() as is now required.
|
|
|
|
For now, it is not recommended that you use the lambdify function in fx_eval_subs(), unless you know how to fix it.
|
|
|
|
|
|
2016 09/16
|
|
|
|
With the 09/14 update I failed to upload the new coefficients.csv file to the files/ directory. While not yet engaged,
|
|
this will be the means by which the user can define the constants desired for the Karoo GP run. If you had run Karoo GP
|
|
v0.9.2.0 in the past 24 hours without this file, it would have complained. My apology.
|
|
|
|
Also, a bit of a road map for the 2nd half of 2016, into 2017
|
|
- validate the new (faster) sympy.lambdify and fully replace the current (slower) sympy.subs
|
|
- replace the row-by-row dictionaries with vectors for what should be a significant performance increase
|
|
- complete the introduction of constants in a manner more well defined than is currently supported
|
|
- investigate replacing pprocess with the multi-core library
|
|
- introduce Theano or Tensor Flow for GPU support
|
|
|
|
I welcome any assistance with these, if anyone has experience and time.
|
|
|
|
|
|
2016 09/14 - version 0.9.2.0
|
|
|
|
In karoo_gp_base_class.py
|
|
- Removed redundant lines in the method 'fx_karoo_data_load()'
|
|
- Added support for the Sympy 'lambdify' function in 'fx_karoo_data_load' (see explanation below)
|
|
- Added a draft means of catching divide-by-zero errors in the new 'lambdify' function
|
|
- Discovered the prior 'fx_eval_subs' uncorrectly applied a value of 1 to the variable 'result' as a means to
|
|
replace the 'zoo' function for divide by zero errors. However, this could inadvertently undermine the success of
|
|
Classification and Regression runs. My apology for not catching this sooner.
|
|
|
|
"While attending the CHEAPR 2016 workshop hosted by the Center for Cosmology and Astro-Particle Physics, The Ohio State
|
|
University, Michael Zevin of Northwestern University proposed that Karoo GP *should* be able to process trees far faster
|
|
than what we were seeing. I looked into the Sympy functions I was at that time using. Indeed, '.subs' is noted as easy
|
|
to use, but terribly slow as it relies upon an internal, Python mathematical library. I therefore replaced '.subs' with
|
|
'.lambdify' which calls upon the C-based Numpy maths library. It is slated to be 500x faster than '.subs', but I am
|
|
seeing only a 2x performance increase. Clearly, there are yet other barriers to remove.
|
|
|
|
In the new 'fx_eval_subs' method you will find both sympy.subs (active) and sympy.lambdify. While preliminary tests
|
|
worked well, I witnessed an erratic outcome which I yet need to reproduce and investigate. Feel free to comment the
|
|
.subs and uncomment the .lambdify sections and take it for a spin.
|
|
|
|
I believe there are 2 more steps to increase performance: removing the dictionaries which contain each row, such that
|
|
Karoo is working directly with the Numpy array again, and then processing the array as a vector instead. But this will
|
|
require substantial recoding.
|
|
|
|
I'll keep you informed ..." --kai
|
|
|
|
|
|
2016 08/08 - version 0.9.1.9
|
|
|
|
In karoo_gp_base_class.py
|
|
- Created a new method 'fx_eval_subs()' which conducts the SymPy subs function for both train and test data.
|
|
- Consolidated duplicate SymPy sub calls in both Train and Test methods, at Erik H. suggestion --thank you!
|
|
- Consolidated duplicate lines into single lines in both Train and Test methods.
|
|
- Fixed bug in which Ramped 50/50 trees would not print in Play mode; removed Ramped 50/50 option from Play mode as
|
|
it does not make sense for a single Tree
|
|
|
|
|
|
2016 07/24 - version 0.9.1.8b
|
|
|
|
Sorry. Switched all references to "LOGIC" back to "BOOLEAN". Don't ask.
|
|
|
|
|
|
2016 07/18 - version 0.9.1.8
|
|
|
|
In karoo_gp_base_class.py
|
|
- removed some programmer's comments from experiments resolved
|
|
- modified the screen output for the multi-class classifier such that it is more clear that p (a) is actually
|
|
predicted (actual) with the addition of the tag "label"
|
|
|
|
|
|
|
|
2016 07/15 - version 0.9.1.7
|
|
|
|
In karoo_gp_base_class.py
|
|
- Ramped Half/Half, as originally designed by John Koza is now implemented!
|
|
- this offers a greater diversity of trees in the initial population
|
|
- added *** Crossover *** to this method for debug display mode
|
|
|
|
|
|
|
|
2016 07/14 - version 0.9.1.6b
|
|
|
|
In karoo_gp_base_class.py
|
|
- fixed 'debug' display output for branch mutation
|
|
- fixed a few bugs in branch mutation that reduced evolutionary efficiency
|
|
- updated all ERROR messages to be uniform in output
|
|
- added (y/n) to "Are you certain you want to quit?" message --thanks Hunter!
|
|
|
|
In karoo_gp_main.py
|
|
- reset default evolutionary balance to .1/.1/.1/.7
|
|
|
|
|
|
|
|
2016 07/13 - version 0.9.1.6
|
|
|
|
In karoo_gp_base_class.py
|
|
- enabled auto counting of labels (right column) in loaded data (data_y)
|
|
- re-ordered presentation of screen modes, in order of decreasing feedback [i, g, m, s]
|
|
- added argparse to further enhance karoo_gp_server.py as a scriptable tool --FINALLY! :)
|
|
|
|
In karoo_gp_main.py
|
|
- removed user query for number of class labels
|
|
- re-ordered presentation of screen modes, in order of decreasing feedback [i, g, m, s]
|
|
|
|
In karoo_go_server.py
|
|
- re-ordered presentation of screen modes, in order of decreasing feedback [i, g, m, s]
|
|
- argparse to further enhance karoo_gp_server.py as a scriptable tool
|
|
|
|
|
|
|
|
2016 07/11 - version 0.9.1.5
|
|
|
|
In karoo_gp_base_class.py
|
|
- renamed kernel (fitness function) type from 'a' to 'r' for regression
|
|
- renamed all associated variable and methods to match switch from 'abs_value' to 'regression'
|
|
- reorganised validation methods to match order of fitness functions
|
|
- added [other] place holder to validation methods
|
|
|
|
In karoo_gp_main.py
|
|
- renamed kernel (fitness function) type from 'a' to 'r' for regression
|
|
- adjusted upper boundary for qty of generations from 1000 to 100
|
|
|
|
In karoo_go_server.py
|
|
- renamed kernel (fitness function) type from 'a' to 'r' for regression
|
|
|
|
In files/ data_ABS.csv and functions_ABS.csv were rename data_REGRESS.csv and functions_REGRESS.csv accordingly.
|
|
|
|
|
|
|
|
2016 07/10-11 - version 0.9.1.4
|
|
|
|
In karoo_gp_base_class.py
|
|
- renamed variable 'tree_depth_max' to 'tree_depth_base'
|
|
- renamed variable 'tree_depth_adj' to 'tree_depth_max'
|
|
- renamed variable 'gp.pop_tree_depth_max' to 'gp.pop_tree_depth_base'
|
|
- enabled 2 children to be produced by each Crossover function
|
|
- extensive testing of Crossover in debug to validate process
|
|
- reduced # of crossover functions per run by 1/2
|
|
- improved on-screen output for all 4 genetic operators
|
|
- removed not-in-use 'accuracy' test
|
|
- added 'Arguments required:' to each method notes
|
|
- edited a number of method notes
|
|
|
|
In karoo_gp_main.py
|
|
- renamed variables (according to karoo_gp_base_class.py)
|
|
|
|
In karoo_go_server.py
|
|
- renamed variables (according to karoo_gp_base_class.py)
|
|
|
|
|
|
|
|
2016 07/08-09 - version 0.9.1.3
|
|
|
|
In karoo_gp_base_class.py
|
|
- added CTRL-C catch to the (pause) menu; removes potential to accidentally kill a run when attempting to
|
|
copy/paste an on-screen function to research notes (use the mouse instead).
|
|
- rebuilt each (pause) menu function for improved exception handling
|
|
- added the new gp.tree_depth_adj user defined variable to branch mutation and crossover, enabling Trees to grow
|
|
beyond their original size which adds opportunity for more complex solutions, as well as the unavoidable bloat
|
|
- reduced complexity of a few lines in both branch mutation and crossover methods
|
|
- tested, tested, tested
|
|
|
|
In karoo_gp_main.py
|
|
- added CTRL-C catch to the at-launch user menu.
|
|
- renamed the method gp.fx_karoo_crossover_reproduction() to gp.fx_karoo_crossover()
|
|
- added user input for the new global variable gp.tree_depth_adj
|
|
|
|
In karoo_go_server.py
|
|
- added new gp.tree_depth_adj variable
|
|
|
|
|
|
|
|
2016 07/07 - version 0.9.1.2
|
|
|
|
In preparation for public launch of Karoo GP, a number of updates are complete or underway.
|
|
|
|
The Quick Start Tutorial is being fully revised. A number of corrections were made, but more importantly, all new
|
|
content has been added relevant to preparation of datasets and the use of the Karoo GP Tools accordingly. The genetic
|
|
operators descriptions now feature visuals and revised descriptions, as to many other sections.
|
|
|
|
In the karoo_gp/tools/ directory, all scripts have undergone updates, 2 of which now offer automated scaling and a
|
|
user interface that in the original versions were not present, as follows:
|
|
|
|
karoo_data_sort.py (formerly karoo_features_sort.py)
|
|
This script now engages the user with a query for the number of class labels and the number of data points (rows)
|
|
for the new, randomly generated subset of the parent dataset. This script is designed to be used prior to
|
|
karoo_normalise.py.
|
|
|
|
karoo_normalise.py
|
|
This script now auto-scales to any number of columns and rows (within the limit of your computer's capability),
|
|
and features a text-based user interface. This script is designed to be used following karoo_data_sort.py.
|
|
|
|
karoo_multi-classifier.py
|
|
This script functions as before, but with a minor bug fixed in which the final class was mislabeled.
|
|
|
|
karoo_iris_plot.py
|
|
This script functions as before, but with improved in-script documentation and cleaner code.
|
|
|
|
|
|
In development now are a number of updates and improvements to the base_class such that Karoo GP will more readily
|
|
conform to the GP standards, as follows:
|
|
|
|
1) Karoo GP currently produces only 1 offspring for each parent, where it should produce 2.
|
|
|
|
2) The tree generation method 'Ramped Half/Half', in its current state, produces a 50/50 split of Full and Grow
|
|
methods, not a graduated ramp through all depths.
|
|
|
|
3) Karoo GP currently engages a bloat inhibitor, that is, an upper limit on tree depth which is maintained through
|
|
all modes of mutation and crossover. This will become a user defined setting such that it can be adjusted, enabling
|
|
growth of trees beyond the original, user defined limit.
|
|
|
|
4) Karoo GP will be made to launch as a single, command-line function with all required parameters included, SciKit
|
|
Learn style.
|
|
|
|
|
|
|
|
2015 12/23 - version 0.9.1.1
|
|
|
|
It was discovered that when loading external datasets, Karoo was yet extracting variables (terminals) from the data in
|
|
the files/ directory, according to the selected kernel.
|
|
|
|
This is fixed.
|
|
|
|
|
|
|
|
2015 11/04 - version 0.9.1.0
|
|
|
|
Initial development of Karoo GP began in February 2015, on a Python-based evolutionary algorithm for an MSc research
|
|
project at the University of Cape Town (UCT) / African Institute for Mathematical Sciences (AIMS) and the Square
|
|
Kilometer Array (SKA). The myriad debug statements evolved into the user interface while the classic Machine Learning
|
|
test cases became the built-in example runs.
|
|
|
|
In the course of six months development, the code base grew to become a flexible, easy-to-use platform for Genetic
|
|
Programming.
|
|
|
|
Karoo GP has been thoroughly tested on a 40-core server at the Square Kilometer Array offices in Cape Town, South
|
|
Africa, where for one month it chewed through 10,000 rows of data for up to 50 hours without incident. It is proved
|
|
as a fully functional, multi-core workhorse.
|
|
|
|
With all development to date conducted locally, this version 0.9 marks the first release to github.
|
|
|
|
This initial github release is private, shared with select collaborators only. Please do not distribute any part of
|
|
the code until it is made public.
|
|
|
|
Kai Staats
|
|
|
|
www.kaistaats.com/research/
|
|
www.kaistaats.com/film/
|