Tuesday, June 25, 2013

Product enumeration example using Z3


copy-paste into:

and you'll get 8 products.

Joe: Presenting GALE

Hello, my name is Joe Krall, phd student here at WVU.  I'm currently away, near San Jose, California, on internship with the NASA Ames Research Center.  My work there entails software research with intelligent systems design focusing on air traffic management (specifically, about avoiding airspace collisions between aircraft.)

My Ph.D. work here at WVU has largely been centered around gaming and procedural content generation for the aim and purpose of keeping players in the game (and providing fresher content to keep them around longer.)  Procedural Content Generation can be the process of parameter tuning to some generative content algorithm.  Thus, we lead into the problem of optimizing "fun" by tuning these parameters.

It is not timely or efficient to poll thousands of players about their experience while playing a game which utilizes some procedurally generated content.  Players are often biased, or lie, or just don't care in their responses to surveys.  It is often difficult to substantiate a lot of respondents.  It has become a major necessity to be able to optimize in very few evaluations (< 50), as opposed to thousands.

For this purpose, I've worked and developed what I call GALE.  It is capable of being used for my intents of exploring the optimization of procedural content generation algorithms, and in general, we can optimize things in less than 20 evaluations.


GALE is a powerful tool for evolutionary algorithms that optimize the solutions to problems.  As a template, we consider problems to be simulations that have input parameters called Decisions, and output scores called Objectives.  Whereas traditional evolutionary algorithms may require the use of the simulation thousands of times before it learns the best decisions to optimize the objectives, GALE requires a mere fraction of these number of simulation objective evaluations.

Our principle focus is to optimize POM3, a revised simulation of the requirements engineering process. POM3 can be split into POM3A,B,C,D, where each subset problem represents different kinds of software projects (for instance, POM3A represents all projects while POM3B represents small but safety-critical projects.)


GALE stands for Geometric Active Learning Evolution.  It is based on techniques that use Response Surface Approximation methods to estimate objective score rather than evaluate them, without the use of a simulation that can be costly and timely to run.  GALE makes the use of a Response Surface technique called RRSL (Rapid Recursive Spectral Learning), which recursively clusters in a tree-like manner, a population of unevaluated individuals by projecting them all onto a 1-dimensional line, and then discarding any dominated clusters (by choosing east and west representatives of each cluster, and using those to evaluate via the simulation.)  The remaining non-dominated individuals of leaf clusters are mutated closer to "heaven" by examining each decision of each individual, and making the best choice of how to mutate it so that the individual is more optimized.

Illustration of GALE:


Plotting objective scores during evolution (50 runs).  Each plot marker indicates the mean objective score for that generation (not necessarily the final generation of the algorithm - so we can get a clear picture of how objective scores evolve.)  Vertical axes represent percentage of change for the better from initial, and horizontal axes are the number of evaluations.   Shown here is a plot for ZDT1, one of our 11 evaluated problems.

Average speedups for each algorithm and problem are show below.  (We average about 80x fewer than state of the art.)  Note: This is a slightly outdated result, due to some minor revising in the GALE algorithm.  RHO indicates how long the algorithm had to run before it converged on optimization.  The higher the RHO, the fewer the generations needed for convergence.  Since GALE typically has a higher RHO, it has fewer generations and also fewer evaluations.  (Also Note: IBD is our measure of quality on how good the optimization was - the lower the IBD, the better.)

Next:  Frequency of decisions as chosen by GALE (top) are much more precise than NSGAII (bottom).   GALE has the power to make very concretely defined decisions.