Abstract:
 

A software product line supports building and maintaining a family of similar software products in a reuse-based manner. Reuse helps reduce development and maintenance effort, shorten time-to-market, and improve overall software quality. To migrate from existing software product variants to a software product line, one has to understand their similarities and differences, which can be expressed in terms of the features they offer. In this dissertation, we tackle the problem of building a software product line from the source code of its product variants and from complementary artifacts, such as use case diagrams, when they are available. Our contributions focus on one of the main steps of this process: extracting and organizing a feature model in an automated manner. The first contribution of this thesis is a new approach to mining features from the object-oriented source code of a set of software variants. It relies on three techniques: Formal Concept Analysis, Latent Semantic Indexing, and the analysis of structural code dependencies. These techniques exploit the common and variable parts across software variants at the source code level. The second contribution consists in documenting each mined feature with a name and a description. It exploits both the source code of the feature and use cases, which contain the logical organization of external functionalities together with textual descriptions of these functionalities. Relational Concept Analysis complements the three techniques used previously, as it can group entities according to their relations. In the third contribution, we propose an automatic approach to organizing the mined and documented features into a feature model.
Features are organized in a tree that is tagged with operations and enriched with logical expressions highlighting mandatory features, optional features, and feature groups (AND, OR, and XOR groups), together with complementary textual constraints that express requirements or mutual exclusions. This model is built from a structure obtained with Formal Concept Analysis, in which variants are described by their features. To validate our approach, we applied it to three case studies: ArgoUML-SPL, Health complaint-SPL, and Mobile media software product variants. These case studies are already structured as software product lines. We consider several products from these examples as if they were software variants, apply our approach, and evaluate its efficiency by comparing our automatically extracted feature models with the initial models (those provided by the developers of each software product line).
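
To illustrate the idea of describing variants by their features and analyzing them with Formal Concept Analysis, the following is a minimal sketch only. The variant and feature names are hypothetical, and the brute-force enumeration of concepts is an assumption made for brevity; it is not the algorithm used in the thesis. A feature present in every variant is read here as a candidate mandatory feature, and any other feature as a candidate optional one.

```python
from itertools import combinations

# Hypothetical formal context: each variant is described by the features it offers.
CONTEXT = {
    "variant_1": {"core", "photo"},
    "variant_2": {"core", "photo", "music"},
    "variant_3": {"core", "music", "video"},
}


def common_features(variants, context):
    """Return the features shared by all given variants (the intent)."""
    if not variants:
        # By convention, the empty set of variants shares every feature.
        return set().union(*context.values())
    return set.intersection(*(context[v] for v in variants))


def variants_with(features, context):
    """Return the variants that offer all given features (the extent)."""
    return {v for v, offered in context.items() if features <= offered}


def formal_concepts(context):
    """Enumerate all formal concepts (extent, intent) by brute force.

    Exponential in the number of variants; acceptable only for tiny examples.
    """
    concepts = set()
    names = sorted(context)
    for size in range(len(names) + 1):
        for subset in combinations(names, size):
            intent = common_features(subset, context)
            extent = variants_with(intent, context)
            concepts.add((frozenset(extent), frozenset(intent)))
    return concepts


def mandatory_and_optional(context):
    """Split features into those present in every variant and the rest."""
    all_features = set().union(*context.values())
    mandatory = set.intersection(*context.values())
    return mandatory, all_features - mandatory


if __name__ == "__main__":
    for extent, intent in sorted(formal_concepts(CONTEXT), key=lambda c: -len(c[0])):
        print(sorted(extent), sorted(intent))
    print(mandatory_and_optional(CONTEXT))
```

In this toy context, the concept whose extent contains all variants carries the shared features, while concepts with smaller extents group features that appear together only in some variants; such groupings are the kind of information from which feature groups and constraints could then be derived.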