GE consistently discovered the target function when both the duplication and pruning operators were employed. Examination of potential solutions from the other runs showed they tended to fixate on operators such as Sin or Exp, as these generated a curve similar to that of the target function.
Space constraints prevent us from describing the nature of individuals generated by GE as it was constructing the correct solution. However, the system exploits the variable length genome by concentrating on the start of the string initially. Although the target function can be represented by eighteen distinct genes, there are several other possibilities which, while not perfectly fit, are considerably shorter. Individuals such as X^2 + X generate curves quite comparable to the target, using a mere five genes.
Typically, individuals of this form were discovered early in a run, and contained valuable gene sequences, particularly of the form X * X, which, if replicated, could subsequently be used to generate individuals with higher powers of X. GE is subject to problems of dependencies similar to those of GP [O'Reilly 97], i.e. the further from the root of a genome a gene is, the more likely its function is to be affected by other genes.
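The dependency effect can be illustrated with a minimal sketch of a genotype-to-phenotype mapping in the GE style, in which each gene selects a production for the leftmost non-terminal by taking the gene value modulo the number of available choices. The grammar, the function name map_genome, and the gene values below are assumptions made purely for illustration; they are not the grammar or encoding used in the experiments, and wrapping of the genome is deliberately left out.

```python
# Hypothetical toy grammar (an assumption, not the grammar from the experiments).
GRAMMAR = {
    "<expr>": [["<expr>", "<op>", "<expr>"], ["<var>"]],
    "<op>": [["+"], ["*"]],
    "<var>": [["X"]],
}


def map_genome(genome, start="<expr>", max_steps=100):
    """Map a list of integer genes to an expression string.

    The leftmost non-terminal is expanded at each step. A gene is consumed
    only when the non-terminal has more than one production; the production
    chosen is gene % number_of_choices. Returns (expression, genes_used),
    with expression set to None if the genome runs out before the mapping
    completes (no wrapping in this sketch).
    """
    symbols = [start]
    output = []
    used = 0
    steps = 0
    while symbols:
        symbol = symbols.pop(0)
        if symbol not in GRAMMAR:
            output.append(symbol)  # terminal: emit as-is
            continue
        choices = GRAMMAR[symbol]
        if len(choices) == 1:
            choice = choices[0]  # no decision to make, so no gene is read
        else:
            if used >= len(genome):
                return None, used  # incomplete individual
            choice = choices[genome[used] % len(choices)]
            used += 1
        symbols = list(choice) + symbols
        steps += 1
        if steps > max_steps:
            return None, used  # guard against runaway expansion
    return " ".join(output), used


if __name__ == "__main__":
    print(map_genome([0, 1, 1, 1]))           # a short individual
    print(map_genome([1, 1, 1, 1]))           # same tail, first gene changed
    print(map_genome([0, 0, 1, 1, 1, 0, 1]))  # a longer individual
```

Under this toy grammar, the genome [0, 1, 1, 1] maps to X * X, whereas changing only the first gene to give [1, 1, 1, 1] maps to X and leaves the remaining three genes unread. The same gene value also means different things at different positions: in the first genome, the value 1 selects a variable at the second gene and the operator * at the third. A gene late in the string therefore has no fixed meaning of its own, which is the sense in which its function depends on the genes before it.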
Biasing individuals toward a shorter length encouraged them to discover shorter, albeit less fit, expressions early in a run, and then to generate progressively longer genomes, and hence increasingly complex functions, in subsequent generations.