Particle Swarm Optimization
Particle Swarm Optimization (PSO) is a technique used to explore the search space of a given problem to find the settings or parameters required to maximize or minimize a particular objective. The technique, first described by James Kennedy and Russell C. Eberhart in 1995, originates from two separate concepts:
- The idea of swarm intelligence, based on observations of the swarming habits of certain kinds of animals (such as birds and fish).
- The field of evolutionary computation.
Optimization in General
Let me talk a bit about optimization before we dig into the details of PSO. Optimization is the mechanism by which one finds the maximum or minimum value of a function or process. It is used in fields such as physics, chemistry, economics, and engineering, where the goal is to maximize efficiency, production, or some other measure. Optimization can refer to either minimization or maximization.
Mathematically, a minimization task is defined as:
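minimize f(x) over x ∈ R^n, i.e. find x ∈ R^n such that f(x) ≤ f(y) for all y ∈ R^n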
In the above minimization task we are looking for x (the optimal solution), which could be a single value or a set of values (our parameters) that, when substituted into the objective function, minimizes it. In other words, any other value or set of values plugged into the objective function will always yield a value larger than or equal to the one produced by our optimal solution.
Similarly, a maximization task is defined as:
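maximize f(x) over x ∈ R^n, i.e. find x ∈ R^n such that f(x) ≥ f(y) for all y ∈ R^n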
The same goes for maximization, except that here our optimal solution maximizes the objective function.
There are a few terms we need to clarify before moving forward:
- Objective function (f) - The function we need to maximize or minimize. For a known (differentiable) function f, calculus can fairly easily provide us with the minima and maxima of f. However, in real-life optimization tasks, this objective function f is often not directly known. Instead, the objective function is a “black box” to which we apply parameters (the candidate solution) and receive an output value.
- Domain - The domain of an optimization problem is the region in which we look for the solution. The domain R^n of f is referred to as the search space (or parameter space). Each element of R^n is called a candidate solution (a possible solution) in the search space, with x being the optimal solution.
- Dimensions (number of parameters) - The value n denotes the number of dimensions of the search space, and thus the number of parameters involved in the optimization problem.
- Function Space - The function f is called the objective function, which maps the search space to the function space. Since a function has only one output, this function space is usually one-dimensional.
- Fitness Space - The function space is then mapped to the one-dimensional fitness space, providing a single fitness value for each set of parameters. This single fitness value determines the optimality of the set of parameters for the desired task. In most cases, the function space can be directly mapped to the fitness space. However, the distinction between function space and fitness space is important in cases such as multiobjective optimization tasks, which include several objective functions drawing input from the same parameter space.
- Fitness - As mentioned above, in real-life scenarios the objective function is most of the time a “black box” to which we apply our parameters (the candidate solution) and receive an output value. The result of this evaluation of a candidate solution becomes the solution’s fitness. The final goal of an optimization task is to find the parameters in the search space that maximize or minimize this fitness; a short sketch of such a black-box evaluation follows this list.
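As a quick illustration (my own example, not part of the original tutorial), here is what evaluating the fitness of a candidate solution against a black-box objective might look like in Python, with a simple quadratic standing in for the unknown function:

```python
# A stand-in "black-box" objective: in a real task we could only evaluate it,
# not inspect it (it might run a simulation or a physical experiment).
def objective(params):
    return sum(p ** 2 for p in params)

candidate = [1.5, -0.3]          # a candidate solution in a 2-dimensional parameter space
fitness = objective(candidate)   # the fitness of this candidate solution
print(fitness)                   # prints 2.34
```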
In some optimization tasks, called constrained optimization tasks, the elements in a candidate solution can be subject to certain constraints (such as being greater than or less than zero).
A simple example of function optimization can be seen in Figure 1. This figure shows a selected region of the function f, drawn as the curve in the diagram. This function maps from a one-dimensional parameter space (the set of real numbers R on the horizontal x-axis) to a one-dimensional function space (the set of real numbers R on the vertical y-axis). The x-axis represents the candidate solutions, and the y-axis represents the results of the objective function when applied to these candidate solutions. This type of diagram demonstrates what is called the fitness landscape of an optimization problem. The fitness landscape plots the n-dimensional parameter space against the one-dimensional fitness for each of these parameters.
Figure 1 also shows the presence of a local maximum in addition to the marked global maximum. A local maximum is a candidate solution with a higher objective-function value than any other candidate solution in a particular region of the search space. For example, if we choose the interval [0, 2.5] in Figure 1, the objective function has a local maximum located at approximately x = 1.05. Many optimization algorithms are only designed to find a local maximum, ignoring other local maxima and the global maximum. However, the PSO algorithm as described in this tutorial is intended to find the global maximum.
PSO Algorithm
The PSO algorithm works by simultaneously maintaining several candidate solutions in the search space. During each iteration of the algorithm, each candidate solution is evaluated by the objective function being optimized, determining the fitness of that solution. Each candidate solution can be thought of as a particle “flying” through the fitness landscape finding the maximum or minimum of the objective function.
Initially, the PSO algorithm chooses candidate solutions randomly within the search space. Figure 2 shows the initial state of a four-particle PSO algorithm seeking the global maximum in a one-dimensional search space. The search space is composed of all the possible solutions along the x-axis; the curve denotes the objective function. It should be noted that the PSO algorithm has no knowledge of the underlying objective function, and thus has no way of knowing if any of the candidate solutions are near to or far away from a local or global maximum. The PSO algorithm simply uses the objective function to evaluate its candidate solutions, and operates upon the resultant fitness values.
Components of PSO
Each particle maintains:
- Position (composed of the candidate solution)
- Evaluated fitness
- Velocity
Additionally, it remembers the best fitness value it has achieved thus far during the operation of the algorithm, referred to as the individual best fitness, and the candidate solution that achieved this fitness, referred to as the individual best position or individual best candidate solution. Finally, the PSO algorithm maintains the best fitness value achieved among all particles in the swarm, called the global best fitness, and the candidate solution that achieved this fitness, called the global best position or global best candidate solution.
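To make these components concrete, here is a minimal Python sketch of the state each particle carries; the field names are my own choice, not prescribed by the algorithm, and the defaults assume a minimization task:

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Particle:
    position: List[float]                        # the candidate solution
    velocity: List[float]                        # current velocity
    fitness: float = float("inf")                # fitness of the current position
    best_position: Optional[List[float]] = None  # individual best candidate solution
    best_fitness: float = float("inf")           # individual best fitness
```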
Steps in PSO Algorithm
The PSO algorithm consists of just three steps, which are repeated until some stopping condition is met:
1. Evaluate the fitness of each particle.
2. Update individual and global best fitnesses and positions.
3. Update velocity and position of each particle.
The first two steps are fairly trivial. Fitness evaluation is conducted by supplying the candidate solution to the objective function. Individual and global best fitnesses and positions are updated by comparing the newly evaluated fitnesses against the previous individual and global best fitnesses, and replacing the best fitnesses and positions as necessary.
The velocity and position update step is responsible for the optimization ability of the PSO algorithm. The velocity of each particle in the swarm is updated using the following equation:
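vi(t+1) = w·vi(t) + c1·r1·[x̂i(t) − xi(t)] + c2·r2·[g(t) − xi(t)]    (Equation 1)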
where i represents the index of the particle. Thus, vi(t) is the velocity of particle i at time t and xi(t) is the position of particle i at time t.
The parameters w, c1, and c2 are user-supplied coefficients. The values r1 and r2 are random values, regenerated for each velocity update. The value x̂i(t) is the individual best candidate solution for particle i at time t, and g(t) is the swarm’s global best candidate solution at time t.
Three Components of Equation 1
Each of the three terms in the velocity update equation has a different role in the PSO algorithm.
- The first term, w·vi(t), is the inertia component, responsible for keeping the particle moving in the same direction it was originally heading. The inertia coefficient w usually takes a value between 0.8 and 1.2, and can either dampen the particle’s inertia or accelerate the particle in its original direction. If you want the particles to converge quickly toward an optimum, keep w toward the lower end of that range; if you want to encourage exploration of the entire search space, keep it toward the higher end.
- The second term, c1r1[x̂i(t) − xi(t)], called the cognitive component, acts as the particle’s memory, causing it to tend to return to the regions of the search space in which it has experienced high individual fitness. The cognitive coefficient c1 is usually close to 2 and affects the size of the step the particle takes toward its individual best candidate solution x̂i(t).
- The third term, c2r2[g(t) − xi(t)], called the social component, causes the particle to move toward the best region the swarm has found so far. The social coefficient c2 is also typically close to 2 and affects the size of the step the particle takes toward the global best candidate solution g(t) the swarm has found up to that point.
In order to keep the particles from moving too far beyond the search space, we use a technique called velocity clamping to limit the maximum velocity of each particle. For a search space bounded by the range [−xmax, xmax], velocity clamping limits the velocity to the range [−vmax, vmax], where vmax = k × xmax. The value k represents a user-supplied velocity clamping factor, 0.1 ≤ k ≤ 1.0. In many optimization tasks, the search space is not centered around 0, and thus the range [−xmax, xmax] is not an adequate definition of the search space. In such cases, where the search space is bounded by [xmin, xmax], we instead define vmax = k × (xmax − xmin)/2.
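In code, velocity clamping amounts to a couple of lines. A small sketch, assuming the general case of a search space bounded by [xmin, xmax] and an example clamping factor of k = 0.5:

```python
def clamp_velocity(v, x_min, x_max, k=0.5):
    # k is the user-supplied velocity clamping factor, 0.1 <= k <= 1.0
    v_max = k * (x_max - x_min) / 2
    return max(-v_max, min(v_max, v))
```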
Once the velocity for each particle is calculated, each particle’s position is updated by applying the new velocity to the particle’s previous position:
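xi(t+1) = xi(t) + vi(t+1)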
This process is repeated until some stopping condition is met. Some common stopping conditions include: a preset number of iterations of the PSO algorithm, a number of iterations since the last update of the global best candidate solution, or a predefined target fitness value.
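Putting the three steps together, below is a minimal, self-contained Python sketch of the algorithm described above for a minimization task. The sphere objective, the swarm size, the coefficient values, and the fixed iteration count are all illustrative choices of mine, not values prescribed by the tutorial:

```python
import random

# Illustrative black-box objective: the sphere function, minimized at the origin.
def objective(x):
    return sum(xi ** 2 for xi in x)

def pso(dim=2, n_particles=20, iterations=100,
        x_min=-5.0, x_max=5.0, w=0.9, c1=2.0, c2=2.0, k=0.5):
    v_max = k * (x_max - x_min) / 2  # velocity clamping limit

    # Choose candidate solutions randomly within the search space.
    positions = [[random.uniform(x_min, x_max) for _ in range(dim)]
                 for _ in range(n_particles)]
    velocities = [[random.uniform(-v_max, v_max) for _ in range(dim)]
                  for _ in range(n_particles)]
    best_positions = [p[:] for p in positions]           # individual best positions
    best_fitnesses = [objective(p) for p in positions]   # individual best fitnesses
    g_index = min(range(n_particles), key=lambda i: best_fitnesses[i])
    g_position, g_fitness = best_positions[g_index][:], best_fitnesses[g_index]

    for _ in range(iterations):  # stopping condition: a preset number of iterations
        for i in range(n_particles):
            # Step 1: evaluate the fitness of each particle.
            fit = objective(positions[i])
            # Step 2: update individual and global best fitnesses and positions.
            if fit < best_fitnesses[i]:
                best_fitnesses[i], best_positions[i] = fit, positions[i][:]
            if fit < g_fitness:
                g_fitness, g_position = fit, positions[i][:]
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = random.random(), random.random()
                # Step 3: velocity update (Equation 1), clamping, then position update.
                velocities[i][d] = (w * velocities[i][d]
                                    + c1 * r1 * (best_positions[i][d] - positions[i][d])
                                    + c2 * r2 * (g_position[d] - positions[i][d]))
                velocities[i][d] = max(-v_max, min(v_max, velocities[i][d]))
                positions[i][d] += velocities[i][d]
    return g_position, g_fitness

best_x, best_f = pso()
print("best solution:", best_x, "fitness:", best_f)
```

Running this should print a solution close to the true minimum at the origin; adapting it to a real problem mostly means swapping in your own black-box objective and search-space bounds.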
PSO Variations
Apart from the canonical PSO algorithm described above, many variations of the PSO algorithm exist. For instance, the inertia weight coefficient was originally not part of the PSO algorithm, but was a later modification that became generally accepted. Additionally, some variations of PSO do not maintain a single global best shared by the whole swarm; instead, they use multiple global bests, each shared by a separate subpopulation of the particles.
Reference: http://www.cs.armstrong.edu/saad/csci8100/pso_tutorial.pdf