The GAGG algorithm clusters genes according to their expression profiles.
It is intended primarily for biologists, statisticians and bioinformaticians who have a basic working knowledge of the R software.
Statistical knowledge, in particular of principal component analysis, makes the graphics easier to interpret but is not indispensable, because the groups are generated in a self-organizing manner.
In the same way, default parameters are set for the genetic algorithm:
The Tpop and Ngene parameters, which correspond respectively to the population size and the number of generations, can be modified by the user. The higher these values, the greater the chance of converging to the optimal solution, but the longer the computation time.
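The trade-off between Tpop, Ngene and run time can be illustrated with a toy genetic algorithm. The sketch below is not the GAGG implementation: the encoding (one cluster label per gene), the cost function (within-cluster sum of squares), the operators and the names toy_ga, mutation_rate and seed are all assumptions chosen for illustration, and it is written in Python rather than R. It only shows that the number of cost evaluations grows with both parameters.

```python
import random


def within_cluster_ss(data, labels, k):
    """Sum of squared distances of each row to its cluster centroid."""
    total = 0.0
    for c in range(k):
        members = [row for row, lab in zip(data, labels) if lab == c]
        if not members:
            continue  # an empty cluster contributes nothing
        centroid = [sum(col) / len(members) for col in zip(*members)]
        total += sum((x - m) ** 2
                     for row in members
                     for x, m in zip(row, centroid))
    return total


def toy_ga(data, k, tpop, ngene, mutation_rate=0.1, seed=0):
    """Toy GA (hypothetical, not GAGG): tpop individuals, ngene generations."""
    if tpop < 2 or len(data) < 2:
        raise ValueError("need at least 2 individuals and 2 rows")
    rng = random.Random(seed)
    n = len(data)
    evaluations = 0

    def cost(labels):
        nonlocal evaluations
        evaluations += 1
        return within_cluster_ss(data, labels, k)

    # Initial population: random cluster labels for every gene.
    population = [[rng.randrange(k) for _ in range(n)] for _ in range(tpop)]
    scored = sorted((cost(ind), ind) for ind in population)
    history = [scored[0][0]]  # best cost after each generation

    for _ in range(ngene):
        # Truncation selection: the better half become parents.
        parents = [ind for _, ind in scored[:max(2, tpop // 2)]]
        children = [scored[0][1]]  # elitism: the best individual survives
        while len(children) < tpop:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)  # one-point crossover
            child = a[:cut] + b[cut:]
            # Mutation: occasionally reassign a gene to a random cluster.
            child = [rng.randrange(k) if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        scored = sorted((cost(ind), ind) for ind in children)
        history.append(scored[0][0])

    return scored[0][1], history, evaluations
```

In this sketch the cost function is called Tpop × (Ngene + 1) times, so doubling either parameter roughly doubles the run time, while elitism guarantees that the best cost never gets worse from one generation to the next.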
The algorithm handles single-color and two-color microarrays alike. Pre-processing of the data is left to the user, who can choose the normalization (quantile normalization, loess, lowess, etc.) and standardization techniques.
The data matrix must be supplied with genes in rows and experimental conditions in columns. If necessary, a pre-processing step will be added to the algorithm later.
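As an illustration of the kind of pre-processing left to the user, the sketch below applies quantile normalization across conditions (columns) and then standardizes each gene (row) of a genes-by-conditions matrix. It is an assumption-laden example, not part of GAGG: the function names are hypothetical, ties are broken by row order rather than averaged, standardizing by row is only one possible choice, and it uses plain Python rather than R.

```python
from statistics import mean, pstdev


def quantile_normalize(matrix):
    """Give every column (condition) the same empirical distribution.

    matrix: list of rows (genes), each a list of values (conditions).
    Ties are broken by row order, a simplification.
    """
    n_rows = len(matrix)
    columns = [list(col) for col in zip(*matrix)]
    # For each column, the row indices ordered by increasing value.
    orders = [sorted(range(n_rows), key=col.__getitem__) for col in columns]
    # Reference distribution: mean of the r-th smallest value of each column.
    reference = [mean(col[order[r]] for col, order in zip(columns, orders))
                 for r in range(n_rows)]
    normalized = [[0.0] * len(columns) for _ in range(n_rows)]
    for j, order in enumerate(orders):
        for rank, i in enumerate(order):
            normalized[i][j] = reference[rank]
    return normalized


def standardize_rows(matrix):
    """Centre each gene (row) to mean 0 and scale it to unit variance."""
    out = []
    for row in matrix:
        m, s = mean(row), pstdev(row)
        # A constant row has no variation; map it to zeros.
        out.append([(x - m) / s if s > 0 else 0.0 for x in row])
    return out


# Four genes (rows) measured under three conditions (columns).
expression = [
    [5.0, 4.0, 3.0],
    [2.0, 1.0, 4.0],
    [3.0, 3.0, 6.0],
    [4.0, 2.0, 8.0],
]
prepared = standardize_rows(quantile_normalize(expression))
```

After quantile normalization every column contains the same set of values, so differences between arrays in overall intensity no longer drive the clustering.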
The GAGG method gives good results for gene clustering, but it relies on a genetic algorithm that is computationally expensive, which implies a long execution time (several hours) depending on the Tpop and Ngene parameters. At start-up, a message asks the user how many components to compute (some information is given to help with this choice); most of the time two components are sufficient.
The source code may be downloaded.