An Empirical Validation of an Automated Genetic Software Effort Prediction Framework Using the ISBSG Dataset
Date
2016-04
Type
Conference contribution
Authors
Murillo Morera, Juan
Quesada López, Christian Ulises
Castro Herrera, Carlos
Jenkins Coronas, Marcelo
Abstract
The complexity of providing accurate software
effort prediction models is well known in the software industry. Several
prediction models have been proposed in the literature using different
techniques, with different results, in different contexts. Objectives: This
paper reports a benchmarking study using a genetic approach that automatically
generates and compares different learning schemes (preprocessing+attribute
selection+learning algorithms). The effectiveness of the
software development effort prediction models (using function points)
was validated using the ISBSG R12 dataset. Methods: Eight subsets
of projects were analyzed by running an M×N-fold cross-validation. We used
a genetic approach to automatically select the components of the learning
schemes, evaluate them, and report the learning scheme with the best performance.
Results: In total, 150 learning schemes were studied (2 data
preprocessors, 5 attribute selectors, and 15 modeling techniques). The
most common learning schemes were: Log+ForwardSelection+M5-Rules,
Log+BestFirst+M5-Rules, Log+LinearForwardSelection+SMOreg,
ForwardSelection+SMOreg, BackwardElimination+SMOreg,
LinearForwardSelection+SMOreg, and Log+BestFirst+SMOreg.
Conclusions: The results show that a different learning scheme should be
selected for each dataset. Our results support previous findings that the
setup applied in evaluations can completely reverse findings. A genetic
approach that automatically selects the best combination for a specific
dataset could improve the performance of software effort prediction models.
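As a rough illustration of the search space described in the abstract, the sketch below enumerates learning schemes as (preprocessor, attribute selector, learner) triples and generates index splits for an M×N-fold cross-validation, read here as M shuffled repetitions of N-fold cross-validation. This is not the authors' implementation. Only the component names that appear in the abstract are real; "None" (no preprocessing), "Selector5", and "Learner3" to "Learner15" are hypothetical placeholders added so that the counts match 2 × 5 × 15 = 150, and the function names are likewise assumptions.

```python
import itertools
import random

# Names taken from the abstract where available. "None" is an assumed label
# for "no preprocessing"; "Selector5" and "LearnerK" are hypothetical
# placeholders that only pad the lists to 2 x 5 x 15 components.
PREPROCESSORS = ["None", "Log"]
ATTRIBUTE_SELECTORS = [
    "ForwardSelection",
    "BackwardElimination",
    "BestFirst",
    "LinearForwardSelection",
    "Selector5",  # placeholder
]
LEARNERS = ["M5-Rules", "SMOreg"] + [f"Learner{i}" for i in range(3, 16)]


def learning_schemes():
    """Return every (preprocessor, selector, learner) combination."""
    return list(itertools.product(PREPROCESSORS, ATTRIBUTE_SELECTORS, LEARNERS))


def mxn_fold_splits(n_items, m, n, seed=0):
    """Yield (repeat, fold, train_idx, test_idx) for M repetitions of N-fold CV.

    Each repetition reshuffles the item indices, then assigns every n-th
    index (starting at the fold offset) to that fold's test set.
    """
    rng = random.Random(seed)
    for repeat in range(m):
        indices = list(range(n_items))
        rng.shuffle(indices)
        for fold in range(n):
            test = indices[fold::n]
            test_set = set(test)
            train = [i for i in indices if i not in test_set]
            yield repeat, fold, train, test


if __name__ == "__main__":
    schemes = learning_schemes()
    print(len(schemes), "learning schemes, e.g.", "+".join(schemes[0]))
    for repeat, fold, train, test in mxn_fold_splits(n_items=10, m=2, n=5):
        print(repeat, fold, sorted(test))
```

A genetic search in this setting would treat each triple as a chromosome and use the cross-validated prediction error as its fitness, so that only a fraction of the 150 combinations needs to be evaluated per dataset.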
Keywords
Effort prediction model, Learning schemes, Genetic approach, Experiment, ISBSG dataset, Function points