v1.0 with GPU support
parent
0fea36b252
commit
d8f460b43c
|
@ -1,3 +1,56 @@
|
|||
2017 02/06
|
||||
|
||||
Graphics Processing Units (GPUs) are now supported with the introduction of the Python library TensorFlow. The end
result is a staggering improvement in performance. In one comparison on a dataset of 10,000 data points (rows) x 9
features (columns), run on a 40 core Intel Xeon motherboard versus a 2000 core Nvidia GPU card, the wall time was
reduced from 50 hours to less than 4 minutes. On CPU-only computers, the performance on a single core is improved by as
much as 10x due to the vectorisation of the data and the application of the C-based TensorFlow maths library.
|
||||
|
||||
To install TensorFlow, I recommend visiting https://www.tensorflow.org/get_started/. It is straightforward for Ubuntu,
but unfortunately can be rather challenging with OSX. Have patience. Review the forums. It's worth the effort.
|
||||
|
||||
I owe many thanks to the expertise of Iurii Milovanov, a contract developer whom I engaged for this effort. While
fewer than a dozen lines of Karoo GP were initially modified (replacing the multi-core pprocess calls), I asked
Iurii to also rewrite the test functions. As such, both Training and Testing are now fully GPU enabled. Thank you!
|
||||
|
||||
A number of other changes have been integrated, including:
|
||||
|
||||
- Karoo GP is now developed against Python 2.7 as provided with Ubuntu Desktop 16.04.1.
|
||||
|
||||
- A number of Python methods have been deleted, added, modified and/or renamed, in particular those in the category
'fx_fitness_'. If you have built your own code based on the Karoo methods, please review this section carefully.
|
||||
|
||||
- The user-engaged 'bal'ance function (pause menu) has been rebuilt to take exact quantities instead of
percentages, enabling the user to define precisely how many of each of the four genetic operators will be applied
in the construction of each subsequent population.
|
||||
|
||||
- Activation of the 'test' is now conducted with only the letter 't' and the option to engage a specific number of
|
||||
'c'ores is removed. Therefore, the 't'imer mode is also removed, as this was a means to discover the optimal number
|
||||
for multi-core processing which is now automated by TensorFlow.
|
||||
|
||||
- The libraries 'pprocess' and 'time' are no longer required or imported.
|
||||
|
||||
- The population_* files (.csv) are now deposited into unique directories created with the launch of each run. A .txt
|
||||
file is also written to each directory which captures the run-time configuration of Karoo GP. This enables truly
|
||||
scriptable runs of Karoo.
|
||||
|
||||
- The Server interface to Karoo GP (karoo_gp_server.py) now terminates completely, kicking back to the command line.
This enables bash or cron launches of multiple sequential or parallel runs, allowing the exploration of multiple
runs with identical configuration, or with varied configuration parameters.
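
To illustrate the exact-quantity balance described in the list above, here is a minimal sketch. The helper name
check_balance and the operator labels are hypothetical and are not part of Karoo GP; the only facts taken from this
commit are that there are four genetic operators and that their quantities are now given as whole numbers. The
sketch also shows why exact quantities help: truncating ratios with int() can leave part of the population
unassigned.

```python
# Hypothetical helper for illustration only; not Karoo GP's own code.
OPERATORS = ('reproduction', 'point_mutation', 'branch_mutation', 'crossover')

def check_balance(quantities, tree_pop_max):
    """Validate exact per-operator quantities against the population size."""
    if len(quantities) != len(OPERATORS):
        raise ValueError('expected one quantity per genetic operator')
    if any(q < 0 or int(q) != q for q in quantities):
        raise ValueError('quantities must be non-negative whole numbers')
    if sum(quantities) != tree_pop_max:
        raise ValueError('quantities sum to %d, expected %d'
                         % (sum(quantities), tree_pop_max))
    return dict(zip(OPERATORS, (int(q) for q in quantities)))

# Exact quantities for a population of 100 Trees.
balance = check_balance([10, 10, 10, 70], tree_pop_max=100)

# Ratios truncated with int() for a population of 15 Trees:
# the four counts no longer add up to the population size.
truncated = [int(r * 15) for r in (0.1, 0.1, 0.1, 0.7)]
```

With a population of 15, the truncated counts cover only 13 Trees, which is the kind of mismatch that entering
exact quantities avoids.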
|
||||
|
||||
Finally, Karoo GP is now a 1.0 release. I never know when to transition from beta to real, so please forgive me if I
jumped the gun. But with GPU support and the revised Server script, I have found Karoo GP to be a fully functional,
powerful machine learning tool. I hope you will agree. --kai
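
The scripted Server runs mentioned in this entry could be driven from Python as well as from bash or cron. The
following is a sketch under stated assumptions: the flags (-ker, -typ, -bas, -pop, -gen, -fil) and the default data
file are those defined in karoo_gp_server.py in this commit, while the function names, the 'python' interpreter
name, and the chosen population sizes are hypothetical.

```python
import subprocess

def build_command(kernel='m', tree_type='r', depth_base=3, pop_max=100,
                  gen_max=10, filename='files/data_MATCH.csv',
                  interpreter='python', script='karoo_gp_server.py'):
    """Assemble one command line from the flags defined in karoo_gp_server.py."""
    return [interpreter, script,
            '-ker', kernel, '-typ', tree_type, '-bas', str(depth_base),
            '-pop', str(pop_max), '-gen', str(gen_max), '-fil', filename]

def run_sequential(commands, runner=subprocess.call):
    """Run each command in turn and collect the exit codes."""
    return [runner(cmd) for cmd in commands]

# Three classification runs that differ only in population size.
commands = [build_command(kernel='c', pop_max=pop) for pop in (100, 200, 400)]

# To launch for real (assumes karoo_gp_server.py is in the working directory):
# exit_codes = run_sequential(commands)
```

Because the Server script now returns to the command line when it finishes, each call completes before the next one
starts. Parallel runs would need subprocess.Popen or a shell-level equivalent instead.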
|
||||
|
||||
|
||||
2016 09/20
|
||||
|
||||
Fixed the genetic operator (b)alance function to work with more than 100 trees per population.
|
||||
|
||||
Introduced the pause for all runtime modes in the Desktop application, such that the user can apply configurations prior
|
||||
to the run (eg: change the balance of the genetic operators or the number of engaged cores).
|
||||
|
||||
|
||||
2016 09/19b
|
||||
|
||||
After another 2 hours of troubleshooting, I learned that sympy.subs throws the 'zoo' error for a divide-by-zero if
|
||||
|
@ -31,11 +84,11 @@ With the 09/14 update I failed to upload the new coefficients.csv file to the fi
|
|||
this will be the means by which the user can define the constants desired for the Karoo GP run. If you had run Karoo GP
|
||||
v0.9.2.0 in the past 24 hours without this file, it would have complained. My apology.
|
||||
|
||||
Also, a bit of a roadmap for the 2nd half of 2016, into 2017
|
||||
Also, a bit of a road map for the 2nd half of 2016, into 2017
|
||||
- validate the new (faster) sympy.lambdify and fully replace the current (slower) sympy.subs
|
||||
- replace the row-by-row dictionaries with vectors for what should be a significant performance increase
|
||||
- complete the introduction of constants in a manner more well defined than is currently supported
|
||||
- investigate replacing pprocess with the multicore library
|
||||
- investigate replacing pprocess with the multi-core library
|
||||
- introduce Theano or Tensor Flow for GPU support
|
||||
|
||||
I welcome any assistance with these, if anyone has experience and time.
|
||||
|
@ -47,8 +100,8 @@ In karoo_gp_base_class.py
|
|||
- Removed redundant lines in the method 'fx_karoo_data_load()'
|
||||
- Added support for the Sympy 'lambdify' function in 'fx_karoo_data_load' (see explanation below)
|
||||
- Added a draft means of catching divide-by-zero errors in the new 'lambdify' function
|
||||
- Discovered the prior 'fx_eval_subs' incorrected applied a value of 1 to the variable 'result' as a means to
|
||||
replace the 'zoo' function for divide by zero errors. However, this could inadvertantly undermine the success of
|
||||
- Discovered the prior 'fx_eval_subs' incorrectly applied a value of 1 to the variable 'result' as a means to
|
||||
replace the 'zoo' function for divide by zero errors. However, this could inadvertently undermine the success of
|
||||
Classification and Regression runs. My apology for not catching this sooner.
|
||||
|
||||
"While attending the CHEAPR 2016 workshop hosted by the Center for Cosmology and Astro-Particle Physics, The Ohio State
|
||||
|
@ -59,7 +112,7 @@ to use, but terribly slow as it relies upon an internal, Python mathematical lib
|
|||
seeing only a 2x performance increase. Clearly, there are yet other barriers to remove.
|
||||
|
||||
In the new 'fx_eval_subs' method you will find both sympy.subs (active) and sympy.lambdify. While preliminary tests
|
||||
worked well, I witnessed an erractic outcome which I yet need to reproduce and investigate. Feel free to comment the
|
||||
worked well, I witnessed an erratic outcome which I yet need to reproduce and investigate. Feel free to comment the
|
||||
.subs and uncomment the .lambdify sections and take it for a spin.
|
||||
|
||||
I believe there are 2 more steps to increase performance: removing the dictionaries which contain each row, such that
|
||||
|
@ -111,7 +164,7 @@ In karoo_gp_base_class.py
|
|||
- added (y/n) to "Are you certain you want to quit?" message --thanks Hunter!
|
||||
|
||||
In karoo_gp_main.py
|
||||
- reset default evoluationary balance to .1/.1/.1/.7
|
||||
- reset default evolutionary balance to .1/.1/.1/.7
|
||||
|
||||
|
||||
|
||||
|
@ -214,7 +267,7 @@ user interface that in the original versions were not present, as follows:
|
|||
This script now auto-scales to any number of columns and rows (within the limit of your computer's capability),
|
||||
and features a text-based user interface. This script is designed to be used following karoo_data_sort.py.
|
||||
|
||||
karoo_multiclassifier.py
|
||||
karoo_multi-classifier.py
|
||||
This script functions as before, but with a minor bug fixed in which the final class was mislabeled.
|
||||
|
||||
karoo_iris_plot.py
|
||||
|
|
File diff suppressed because it is too large
|
@ -1,8 +1,8 @@
|
|||
# Karoo GP Main (desktop)
|
||||
# Use Genetic Programming for Classification and Symbolic Regression
|
||||
# by Kai Staats, MSc UCT / AIMS; see LICENSE.md
|
||||
# Much thanks to Emmanuel Dufourq and Arun Kumar for their support, guidance, and free psychotherapy sessions
|
||||
# version 0.9.2.1
|
||||
# by Kai Staats, MSc; see LICENSE.md
|
||||
# Thanks to Emmanuel Dufourq and Arun Kumar for support during 2014-15 devel; TensorFlow support provided by Iurii Milovanov
|
||||
# version 1.0
|
||||
|
||||
'''
|
||||
A word to the newbie, expert, and brave--
|
||||
|
@ -34,6 +34,7 @@ If you include the path to an external dataset, it will auto-load at launch:
|
|||
import sys # sys.path.append('modules/') to add the directory 'modules' to the current path
|
||||
import karoo_gp_base_class; gp = karoo_gp_base_class.Base_GP()
|
||||
|
||||
|
||||
#++++++++++++++++++++++++++++++++++++++++++
|
||||
# User Defined Configuration |
|
||||
#++++++++++++++++++++++++++++++++++++++++++
|
||||
|
@ -50,10 +51,10 @@ gp.karoo_banner()
|
|||
|
||||
print ''
|
||||
|
||||
menu = ['b','r','c','m','p','']
|
||||
menu = ['c','r','m','p','']
|
||||
while True:
|
||||
try:
|
||||
gp.kernel = raw_input('\t Select (r)egression, (c)lassification, (m)atching, or (p)lay (default m): ')
|
||||
gp.kernel = raw_input('\t Select (c)lassification, (r)egression, (m)atching, or (p)lay (default m): ')
|
||||
if gp.kernel not in menu: raise ValueError()
|
||||
gp.kernel = gp.kernel or 'm'; break
|
||||
except ValueError: print '\t\033[32m Select from the options given. Try again ...\n\033[0;0m'
|
||||
|
@ -139,7 +140,7 @@ else: # if any other kernel is selected
|
|||
except ValueError: print '\t\033[32m Enter a number from 1 including 100. Try again ...\n\033[0;0m'
|
||||
except KeyboardInterrupt: sys.exit()
|
||||
|
||||
menu = ['i','g','m','s','db','t','']
|
||||
menu = ['i','g','m','s','db','']
|
||||
while True:
|
||||
try:
|
||||
gp.display = raw_input('\t Display (i)nteractive, (g)eneration, (m)inimal, or (s)ilent (default m): ')
|
||||
|
@ -150,13 +151,12 @@ else: # if any other kernel is selected
|
|||
|
||||
|
||||
# define the ratio between types of mutation, where all sum to 1.0; can be adjusted in 'i'nteractive mode
|
||||
gp.evolve_repro = int(0.1 * gp.tree_pop_max) # percentage of subsequent population to be generated through Reproduction
|
||||
gp.evolve_point = int(0.1 * gp.tree_pop_max) # percentage of subsequent population to be generated through Point Mutation
|
||||
gp.evolve_branch = int(0.1 * gp.tree_pop_max) # percentage of subsequent population to be generated through Branch Mutation
|
||||
gp.evolve_cross = int(0.7 * gp.tree_pop_max) # percentage of subsequent population to be generated through Crossover
|
||||
gp.evolve_repro = int(0.1 * gp.tree_pop_max) # quantity of a population generated through Reproduction
|
||||
gp.evolve_point = int(0.1 * gp.tree_pop_max) # quantity of a population generated through Point Mutation
|
||||
gp.evolve_branch = int(0.1 * gp.tree_pop_max) # quantity of a population generated through Branch Mutation
|
||||
gp.evolve_cross = int(0.7 * gp.tree_pop_max) # quantity of a population generated through Crossover
|
||||
|
||||
gp.tourn_size = 10 # qty of individuals entered into each tournament (standard 10); can be adjusted in 'i'nteractive mode
|
||||
gp.cores = 1 # replace '1' with 'int(gp.core_count)' to auto-set to max; can be adjusted in 'i'nteractive mode
|
||||
gp.precision = 4 # the number of floating points for the round function in 'fx_fitness_eval'; hard coded
|
||||
|
||||
|
||||
|
@ -182,8 +182,8 @@ gp.fx_karoo_construct(tree_type, tree_depth_base) # construct the first populati
|
|||
if gp.kernel != 'p': print '\n We have constructed a population of', gp.tree_pop_max,'Trees for Generation 1\n'
|
||||
|
||||
else: # EOL for Play mode
|
||||
gp.fx_eval_tree_print(gp.tree) # print the current Tree
|
||||
gp.fx_tree_archive(gp.population_a, 'a') # save this one Tree to disk
|
||||
gp.fx_display_tree(gp.tree) # print the current Tree
|
||||
gp.fx_archive_tree_write(gp.population_a, 'a') # save this one Tree to disk
|
||||
sys.exit()
|
||||
|
||||
|
||||
|
@ -206,11 +206,13 @@ if gp.display != 's':
|
|||
if gp.display == 'i': gp.fx_karoo_pause(0)
|
||||
|
||||
gp.fx_fitness_gym(gp.population_a) # 1) extract polynomial from each Tree; 2) evaluate fitness, store; 3) display
|
||||
gp.fx_tree_archive(gp.population_a, 'a') # save the first generation of Trees to disk
|
||||
gp.fx_archive_tree_write(gp.population_a, 'a') # save the first generation of Trees to disk
|
||||
|
||||
# no need to continue if only 1 generation or fewer than 10 Trees were designated by the user
|
||||
if gp.tree_pop_max < 10 or gp.generation_max == 1:
|
||||
gp.fx_karoo_eol(); sys.exit()
|
||||
gp.fx_archive_params_write('Desktop') # save run-time parameters to disk
|
||||
gp.fx_karoo_eol()
|
||||
sys.exit()
|
||||
|
||||
|
||||
#++++++++++++++++++++++++++++++++++++++++++
|
||||
|
@ -238,14 +240,14 @@ for gp.generation_id in range(2, gp.generation_max + 1): # loop through 'generat
|
|||
gp.fx_karoo_crossover() # method 4 - Crossover Reproduction
|
||||
gp.fx_eval_generation() # evaluate all Trees in a single generation
|
||||
|
||||
gp.population_a = gp.fx_evo_pop_copy(gp.population_b, ['GP Tree by Kai Staats, Generation ' + str(gp.generation_id)])
|
||||
gp.population_a = gp.fx_evolve_pop_copy(gp.population_b, ['GP Tree by Kai Staats, Generation ' + str(gp.generation_id)])
|
||||
|
||||
|
||||
#++++++++++++++++++++++++++++++++++++++++++
|
||||
# "End of line, man!" --CLU |
|
||||
#++++++++++++++++++++++++++++++++++++++++++
|
||||
|
||||
gp.fx_tree_archive(gp.population_b, 'f') # save the final generation of Trees to disk
|
||||
gp.fx_archive_tree_write(gp.population_b, 'f') # save the final generation of Trees to disk
|
||||
gp.fx_karoo_eol()
|
||||
|
||||
|
||||
|
|
|
@ -1,8 +1,8 @@
|
|||
# Karoo GP Server
|
||||
# Use Genetic Programming for Classification and Symbolic Regression
|
||||
# by Kai Staats, MSc UCT / AIMS; see LICENSE.md
|
||||
# Much thanks to Emmanuel Dufourq and Arun Kumar for their support, guidance, and free psychotherapy sessions
|
||||
# version 0.9.2.1
|
||||
# by Kai Staats, MSc; see LICENSE.md
|
||||
# Thanks to Emmanuel Dufourq and Arun Kumar for support during 2014-15 devel; TensorFlow support provided by Iurii Milovanov
|
||||
# version 1.0
|
||||
|
||||
'''
|
||||
A word to the newbie, expert, and brave--
|
||||
|
@ -14,7 +14,7 @@ of its intent and design.
|
|||
KAROO GP SERVER
|
||||
This is the Karoo GP server application. It can be internally scripted, fully command-line configured, or a combination
|
||||
of both. If this is your first time using Karoo GP, please run the desktop application karoo_gp_main.py first in order
|
||||
that you come to understand its full functionality.
|
||||
that you come to understand the full functionality of this particular Genetic Programming platform.
|
||||
|
||||
To launch Karoo GP server:
|
||||
|
||||
|
@ -52,18 +52,18 @@ import argparse
|
|||
import karoo_gp_base_class; gp = karoo_gp_base_class.Base_GP()
|
||||
|
||||
ap = argparse.ArgumentParser(description = 'Karoo GP Server')
|
||||
ap.add_argument('-ker', action = 'store', dest = 'kernel', default = 'm', help = '[r,c,m] fitness function: (r)egression, (c)lassification, or (m)atching')
|
||||
ap.add_argument('-ker', action = 'store', dest = 'kernel', default = 'm', help = '[c,r,m] fitness function: (r)egression, (c)lassification, or (m)atching')
|
||||
ap.add_argument('-typ', action = 'store', dest = 'type', default = 'r', help = '[f,g,r] Tree type: (f)ull, (g)row, or (r)amped half/half')
|
||||
ap.add_argument('-bas', action = 'store', dest = 'depth_base', default = 3, help = '[3...10] maximum Tree depth for the initial population')
|
||||
ap.add_argument('-max', action = 'store', dest = 'depth_max', default = 3, help = '[3...10] maximum Tree depth for the entire run')
|
||||
ap.add_argument('-min', action = 'store', dest = 'depth_min', default = 3, help = '[3...100] minimum number of nodes')
|
||||
ap.add_argument('-pop', action = 'store', dest = 'pop_max', default = 100, help = '[10...1000] maximum population')
|
||||
ap.add_argument('-gen', action = 'store', dest = 'gen_max', default = 10, help = '[1...100] number of generations')
|
||||
ap.add_argument('-fil', action = 'store', dest = 'filename', default = 'files/data_MATCH.csv', help = '/path/to_your/data.csv')
|
||||
ap.add_argument('-fil', action = 'store', dest = 'filename', default = 'files/data_MATCH.csv', help = '/path/to_your/[data].csv')
|
||||
|
||||
args = ap.parse_args()
|
||||
|
||||
# set the same parameters found in the Karoo GP desktop application, but potentially passed from the command line
|
||||
# pass the argparse defaults and/or user inputs to the required variables
|
||||
gp.kernel = str(args.kernel)
|
||||
tree_type = str(args.type)
|
||||
tree_depth_base = int(args.depth_base)
|
||||
|
@ -73,14 +73,13 @@ gp.tree_pop_max = int(args.pop_max)
|
|||
gp.generation_max = int(args.gen_max)
|
||||
filename = str(args.filename)
|
||||
|
||||
gp.display = 'n' # display mode is set to (s)ilent
|
||||
gp.evolve_repro = int(0.1 * gp.tree_pop_max) # percentage of subsequent population to be generated through Reproduction
|
||||
gp.evolve_point = int(0.1 * gp.tree_pop_max) # percentage of subsequent population to be generated through Point Mutation
|
||||
gp.evolve_branch = int(0.2 * gp.tree_pop_max) # percentage of subsequent population to be generated through Branch Mutation
|
||||
gp.evolve_cross = int(0.6 * gp.tree_pop_max) # percentage of subsequent population to be generated through Crossover
|
||||
gp.display = 's' # display mode is set to (s)ilent
|
||||
gp.evolve_repro = int(0.1 * gp.tree_pop_max) # quantity of a population generated through Reproduction
|
||||
gp.evolve_point = int(0.0 * gp.tree_pop_max) # quantity of a population generated through Point Mutation
|
||||
gp.evolve_branch = int(0.2 * gp.tree_pop_max) # quantity of a population generated through Branch Mutation
|
||||
gp.evolve_cross = int(0.7 * gp.tree_pop_max) # quantity of a population generated through Crossover
|
||||
|
||||
gp.tourn_size = 10 # qty of individuals entered into each tournament (standard 10); can be adjusted in 'i'nteractive mode
|
||||
gp.cores = 1 # replace '1' with 'int(gp.core_count)' to auto-set to max; can be adjusted in 'i'nteractive mode
|
||||
gp.precision = 4 # the number of floating points for the round function in 'fx_fitness_eval'; hard coded
|
||||
|
||||
# run Karoo GP
|
||||
|
|