next up previous
Next: Example Individual Up: Grammatical Evolution : Evolving Previous: Genetic Algorithm for Developing

Grammatical Evolution

The GADs approach suffers from a number of drawbacks. In particular, as the number of productions grows, the chance of any particular production being chosen by a gene reduces. Paterson's suggestion to combat this was to simply increase the population size or the genome size. Another suggestion was to duplicate certain production rules in the BNF.

This results in a serious proliferation of introns, consider (using our own notation, not Patersons) the following chromosome using the productions from above.

1D 1A 2B 3B 3A .... 4A





Because of the initial rule that was used, there is only one possible choice rule to apply, and any other rule that occurs in the genome must be ignored. This can lead to a situation where the majority of genome consists of introns.

Instead of coding the transitions, our system codes a set of pseudo random numbers, which are used to decide which choice to take when a non terminal has one or more outcomes. A chromosome consists of a variable number of binary genes, each of which encodes an 8 bit number.


 
Table 2: The chromosome of an individual. Each gene represents a random number which can be used in the translation from genotype to phenotype.
220 203 17 3 109 215 104 30

Consider rule #1 from the previous example:

(1) <expr> ::= <expr> <op> <expr> | ( <expr> <op> <expr> ) | 
           <pre-op> ( <expr> ) | <var>

In this case, the non-terminal can produce one of four different results, our system takes the next available random number from the chromosome to decide which production to take. Each time a decision has to be made, another pseudo random number is read from the chromosome, and in this way, the system traverses the chromosome.

In natural biology, there is no direct mapping from gene to physical expression [Elseth 95]. When genes are expressed they generate proteins, which, either independantly or, more commonly in conjunction with other proteins, created by other genes, affect physical traits. We treat each transition as a protein, on their own, each transition cannot generate a physical trait. However, when other proteins are present, physical traits can be generated. Moreover, while a particular gene always generates the same protein, the physical results depend on the other proteins that are present immediately before and after.

Consider the individual in Table 2. The fourth gene generates the number 3, which, in our system, is analgous to the protein 3. Notice that this will be generated regardless of where the gene appears on the chromosome. However, it may have slightly different effects depending on what other proteins have been generated previously. The following section describes the mapping from genotype to phenotype in detail.

It is possible for individuals to run out of genes, and in this case there are two alternatives. The first is to declare the individual invalid and punish them with a suitably harsh fitness value; the alternative is to wrap the individual, and reuse the genes. This is quite an unusual approach in EAs, as it is entirely possible for certain genes to be used two or more times. Each time the gene is expressed it will always generate the same protein, but depending on the other proteins present, may have a different effect. The latter is the more biologically plausible approach, and often occurs in nature. What is crucial, however, is that each time a particular individual is mapped from its genotype to its phenotype, the same output is generated. This is because the same choices are made each time.

To complete the BNF definition for a C function, we need to include the following rules with the earlier definition:

<func> ::= <header>

<header> ::= float symb(float X) { <body> }

<body> ::= <declarations><code><return>

<declarations ::= float a;

<code> ::= a = <expr>;

<return> ::= return (a);

Notice that this function is limited to a single line of code. However, this is because of the nature of the problem, the system can easily generate functions which use several lines of code. Specifically, if the rule for code was modified to read:


  
Figure 1: The Grammatical Evolution System
\begin{figure}\psfig{file=geoutline.eps,width=4in}
\end{figure}

<code> ::= <line>;  | <line>; <code>

<line> ::= <var> = <expr>

then the system could generate functions of arbitrary length.



 
next up previous
Next: Example Individual Up: Grammatical Evolution : Evolving Previous: Genetic Algorithm for Developing
Red Hat Linux User
1998-10-02