
A brief history of PGSL

Dr. Benny Raphael

People always ask me how I came up with this strange algorithm with four nested cycles and funny ways to update a probability distribution function (PDF). After repeating the story many times, I decided to write it down, but I never found the time until the Christmas break of 2003. I realised that Christmas is the right time to write about the birth of an algorithm that might bring new light to the dark world of difficult optimisation problems. PGSL was born after a long journey through space-time: from Madras to Lausanne in space, and from 1991 to 2003 in time. I will briefly summarise this journey.


I was a teaching assistant at IIT Madras while doing my masters by research (MS) during 1990-1992. Dr. Rajeev asked me to prepare some student projects for the course Computer Methods in Civil Engineering. I prepared some good projects which the students were able to complete on time. But there was one project that failed miserably: tracing a non-linear curve in two dimensions. My idea was to generate the first point by solving the non-linear equation using the Newton-Raphson method, then trace the curve by applying increments. I already had code for the Newton-Raphson method in one variable and I thought that it could easily be extended to two dimensions. It did not work even for simple equations - it always ran into stability problems. My students were extremely angry with me for giving them such a difficult project. The fact that it is not easy to solve a two-dimensional equation using gradients puzzled me, but I did not have time to investigate the problem in more detail then. Nevertheless, it always remained at the back of my mind.

Dr. Rajeev was a lecturer at that time and was also doing his PhD under the guidance of Prof. C.S. Krishnamoorthy. His PhD was in genetic algorithms (GAs), and I first learnt about GAs through him. I could never believe that you can find the optimum by randomly cutting and pasting parts of solutions and randomly flipping bits. Dr. Rajeev's answer was that the convergence of the simple genetic algorithm had already been proved. I always wanted to experiment with GAs to find out how they really worked and to test whether they could be used to find solutions to non-linear equations. Again, I did not have time to do these experiments during my MS.


I started my PhD at the University of Strathclyde, Glasgow in 1992. Dr. Bimal Kumar was my supervisor. While I was still trying to define the topic of my research, I happened to see a paper written by Bimal Kumar and John Gero on the optimisation of a beam section using a voxel representation (I have only a vague memory of this paper now). My interest in optimisation was rekindled after reading it. I developed my own code for the Simple Genetic Algorithm (SGA) and first tried the non-linear equation solving problem. The result was disappointing: not only did it take too many evaluations, the accuracy was also very poor because of discretisation. I had not heard about real-coded GAs at that time. (I wonder if they even existed in 1992!)

My experiments with SGA helped me understand probabilistic algorithms better. I found that the probability of getting a better solution through crossover is very low (for non-linear continuous variables). Faster improvements were obtained by randomly changing the value of a single variable at a time - there is no need to combine two solutions to get a better solution! Using this technique I was able to get faster convergence. The procedure inevitably resulted in local minima; nevertheless, I was happy that it was much more reliable than the Newton-Raphson method. During my attempts to avoid local minima without slowing down convergence, I realised that the size of the search space was crucial. After many experiments, I developed an algorithm called Quantum Genetic Search (QGS). The idea was to reduce the size of the search space by restricting movements to discrete positions along each axis (similar to energy levels in quantum mechanics). QGS worked well for certain types of problems, but required a lot more work to become a general-purpose optimisation program. During this time, my research focus shifted to case-based reasoning and I never got time to continue experimenting with optimisation. In fact, I even lost the QGS code. QGS was left to die and decay on one of my computers at Glasgow. The only remains of QGS that have been found are a report I wrote during this period. (Bimal Kumar still believes that PGSL is the same as QGS. Actually, I did not realise the similarities between QGS and PGSL until he mentioned them to me at a conference in Iceland.)
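The single-variable improvement idea can be sketched in a few lines. This is an illustrative reconstruction, not my code from that period: the objective function, step sizes and iteration counts are all assumptions.

```python
import random

# Illustrative sketch: improve a solution by randomly perturbing one
# variable at a time, keeping the change only if the objective improves.
# No crossover - two solutions are never combined.

def objective(v):
    x, y = v
    return (x - 1.0) ** 2 + (y + 2.0) ** 2   # toy function, minimum at (1, -2)

v = [random.uniform(-5, 5), random.uniform(-5, 5)]
best = objective(v)
step = 1.0
for _ in range(20000):
    i = random.randrange(len(v))             # pick one variable at random
    trial = list(v)
    trial[i] += random.uniform(-step, step)  # perturb only that variable
    f = objective(trial)
    if f < best:                             # accept only improvements
        v, best = trial, f
    step = max(step * 0.9997, 1e-3)          # slowly shrink the move size
```

On a smooth single-basin function like this, the procedure converges quickly; on multi-modal functions it gets trapped in local minima, exactly as described above.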


I joined EPFL (Swiss Federal Institute of Technology, Lausanne) in 1997. Before that I was working at Infosys Technologies Limited, Bangalore. Prof. Ian Smith was brave enough to gamble on a little-known person from India. (Apart from a few papers from my PhD, I did not have a strong academic background.) Dr. Kristi Shea was another member of Prof. Smith's team; her specialisation was simulated annealing. Her work on optimisation tempted me to do some experiments in this area, but my other responsibilities prevented me from doing so.

In 1998, I was working on an industrial project sponsored by Karl Steiner Engineering that involved probabilistic analysis. I first implemented Monte-Carlo simulation for computing probabilities. I was not happy with the results because I was not getting "perfect triangles" when the theoretical distribution was a triangle (as when two rectangular distributions are added). So I developed my own algorithm for computing PDFs through numerical integration. This algorithm is very efficient when the dependency graph is a tree structure. However, when there are repeated variables the algorithm breaks down because its complexity is exponential in the number of repeated variables. What struck me was the robustness of Monte-Carlo simulation - you can always get an approximate solution. The solutions are reasonably accurate where probabilities are large; the difficulty is only in regions containing small probabilities. On the other hand, numerical integration is extremely accurate but breaks down completely beyond a few (repeated) variables.
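The "imperfect triangle" is easy to reproduce. A minimal sketch (mine, not the project code): the sum of two uniform ("rectangular") variables has a PDF that is theoretically an exact triangle, while a Monte-Carlo histogram only ever approximates it.

```python
import random

# Monte-Carlo estimate of the PDF of the sum of two uniform variables on
# [0, 1).  The theoretical PDF is an exact triangle peaking at 1.0; a
# finite-sample histogram is only an approximation, so the triangle never
# comes out "perfect".
N = 200_000
BINS = 20
counts = [0] * BINS
for _ in range(N):
    s = random.random() + random.random()    # sum lies in [0, 2)
    counts[int(s / 2.0 * BINS)] += 1

# normalise counts into density estimates (bin width = 2 / BINS)
density = [c / N * BINS / 2.0 for c in counts]
```

The estimated density rises towards the middle bins and falls towards the edges, but with random wiggles superimposed on the exact triangular shape.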

The success of Monte-Carlo simulation prompted me to think about applying it to optimisation. My initial idea was to generate a PDF representing good regions through sampling and then use this PDF to generate more solutions in those regions. I was convinced that this would work, but resisted the temptation to start the experiment. From previous experience, I knew how I could get sucked into it and spend weeks and months working on it.

In 1999, Yvan Robert-Nicoud joined the laboratory and started his PhD in the area of bridge diagnosis. I was supervising his work and we used to spend a lot of time discussing his research. At that time, he was trying to find the values of the parameters of the Lutrive bridge that produced predicted deflections matching measurements. One day, I saw something interesting on his computer. He had an Excel sheet in which the value of Young's modulus (E) varied in constant steps along one column, while another column held the corresponding predicted deflection. He had implemented a complete statically indeterminate structural analysis in a spreadsheet (including numerical integration)! He showed me that he could use this spreadsheet to find the value of E that produced the best match with measurements. I thought this was remarkable. It struck me that he was actually performing optimisation in a single variable. Now I could not wait any longer for my optimisation experiments. I told him that I would develop a better tool for this task, one with which he would be able to handle multiple variables.

So in 1999, the first version of PGSL was born. At that time, I called it PGS (Probabilistic Global Search). This version had only two cycles: sampling and probability updating. Its speed of convergence was not good, so I added another cycle, the focusing cycle, for faster convergence. For parameter estimation of the Lutrive bridge, this was enough. But when I tested it on highly non-linear functions, I found that it got trapped in local minima most of the time. So I introduced the subdomain cycle to restart the search. After much trial and error I had the near-perfect algorithm!! By now the algorithm had many small details that made it work well on a wide variety of problems: conditions to prevent premature convergence, mechanisms to avoid overly focused search in small regions, and so on.
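The two original cycles - sampling and probability updating - can be illustrated with a toy sketch in one variable. Everything here (the histogram representation, the update factor, the cycle counts, the objective function) is an assumption made for illustration; this is the core idea only, not the published four-cycle PGSL.

```python
import random

# Toy sketch of the sampling / probability-updating idea: maintain a
# histogram PDF over the search interval [0, 1), sample candidate points
# from it, and raise the probability of the sub-interval that contains the
# best point found so far.

def objective(x):
    return (x - 0.7) ** 2              # toy function, minimum at x = 0.7

INTERVALS = 20
weights = [1.0] * INTERVALS            # start with a uniform PDF

def sample(weights):
    # pick an interval proportionally to its weight, then a uniform point in it
    r = random.uniform(0, sum(weights))
    for i, w in enumerate(weights):
        r -= w
        if r <= 0:
            return random.uniform(i / INTERVALS, (i + 1) / INTERVALS)
    return random.random()             # floating-point fallback

best_x, best_f = None, float("inf")
for cycle in range(100):               # probability-updating cycle
    for _ in range(20):                # sampling cycle
        x = sample(weights)
        f = objective(x)
        if f < best_f:
            best_x, best_f = x, f
    # increase the probability of the interval containing the best point
    weights[min(int(best_x * INTERVALS), INTERVALS - 1)] *= 1.2
```

As the weights grow around the best interval, later samples concentrate there, which is exactly why a focusing mechanism and a restart (subdomain) mechanism became necessary in the full algorithm.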

In the beginning, PGS remained a secret - only Yvan knew about its existence. There were two reasons for this. My earlier algorithm for the numerical integration of probabilities had been over-hyped and in the end was not very useful. Also, I did not know whether similar algorithms already existed. I searched for a while and could not find any, so I decided that the best way to find out was to publish it. I wrote the first paper on PGSL with Prof. Smith. He suggested that a four-letter acronym is better than a three-letter one, so PGS was changed to PGSL - Probabilistic Global Search, Lausanne - to give credit to the place where it was developed. The first paper was published at the EGSEAI conference in Reykjavik, 2000. At that time I did not know about algorithms such as adaptive random search and controlled random search; in fact, I learnt about them only from the comments of anonymous reviewers. Now I know that PGSL belongs to a class of algorithms based on sampling using a PDF. The uniqueness of PGSL is in the procedure that updates the PDF in four nested cycles.

There have been many developments in PGSL during the last three years. The basics of the algorithm have remained the same, but small details have been changed so that it performs well on a wide range of problems. The number of parameters to be tuned has also been reduced from four to one. In PGSL version 3, the only parameter that needs to be specified is the maximum number of evaluations of the objective function; all other parameters are computed automatically. There is also a Java version of PGSL, called SPGSL (Simplified PGSL). The algorithm has been simplified (at the cost of reliability) and is better suited to smoother objective functions.

A footnote

I was eventually able to solve the non-linear equation problem using gradients. The trick is to use a combination of line search and gradients: compute the gradient, perform a line search in that direction, take the best point, compute the gradient at this point, and repeat until convergence. This procedure is very efficient and does not suffer from stability problems.
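This recipe can be sketched by recasting the system f1 = f2 = 0 as minimisation of g = f1^2 + f2^2. The example equations, the numerical gradient, and the crude sampled line search are my illustrative choices here, not the original implementation.

```python
# Solve f1 = f2 = 0 by minimising g = f1^2 + f2^2: compute the gradient,
# line-search along the negative gradient, move to the best point, repeat.

def g(x, y):
    f1 = x * x + y * y - 4.0   # circle of radius 2
    f2 = x - y                 # line x = y
    return f1 * f1 + f2 * f2   # zero exactly where both equations hold

def grad(x, y, h=1e-6):
    # central-difference approximation of the gradient of g
    return ((g(x + h, y) - g(x - h, y)) / (2 * h),
            (g(x, y + h) - g(x, y - h)) / (2 * h))

x, y = 3.0, 0.5                # arbitrary starting point
for _ in range(200):
    gx, gy = grad(x, y)
    # crude line search: sample step lengths along -gradient, keep the best
    best_t, best_val = 0.0, g(x, y)
    for k in range(40):
        t = 2.0 ** -k
        val = g(x - t * gx, y - t * gy)
        if val < best_val:
            best_t, best_val = t, val
    x, y = x - best_t * gx, y - best_t * gy

# x and y now approximately satisfy both equations: x = y on the circle
```

Because the line search only ever accepts a point that lowers g, the iteration cannot blow up the way plain Newton-Raphson steps can; that is the source of the stability mentioned above.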
