,hal_id,year,title,authors,abstract
0,hal-03897639,2023,{BURST: Benchmarking Uniform Random Sampling Techniques},"Acher, Mathieu and Perrouin, Gilles and Cordy, Maxime","BURST is a benchmarking platform for uniform random sampling (URS) techniques. Given: i) the description of a sampling space provided as a Boolean formula (DIMACS), and ii) a sampling budget (time and strength of uniformity), BURST evaluates ten samplers for scalability and uniformity. BURST measures scalability based on the time required to produce a sample, and uniformity based on the state-of-the-art and proven statistical test Barbarik. BURST is easily extendable to new samplers and offers: i) 128 feature models (for highly-configurable systems), ii) many other models mined from the artificial intelligence/satisfiability solving benchmarks. BURST envisions supporting URS assessment and design across multiple research communities."
1,hal-01522779,2017,{Teaching Software Product Lines: A Snapshot of Current Practices and Challenges},"Acher, Mathieu and Lopez-Herrejon, Roberto Erick and Rabiser, Rick","Software Product Line (SPL) engineering has emerged to provide the means to efficiently model, produce, and maintain multiple similar software variants, exploiting their common properties, and managing their variabilities (differences). With over two decades of existence, the community of SPL researchers and practitioners is thriving as can be attested by the extensive research output and the numerous successful industrial projects. Education has a key role to support the next generation of practitioners to build highly complex, variability-intensive systems. Yet, it is unclear how the concepts of variability and SPLs are taught, what are the possible missing gaps and difficulties faced, what are the benefits, or what is the material available. Also, it remains unclear whether scholars teach what is actually needed by industry. In this article we report on three initiatives we have conducted with scholars, educators, industry practitioners, and students to further understand the connection between SPLs and education, i.e., an online survey on teaching SPLs we performed with 35 scholars, another survey on learning SPLs we conducted with 25 students, as well as two workshops held at the International Software Product Line Conference in 2014 and 2015 with both researchers and industry practitioners participating. We build upon the two surveys and the workshops to derive recommendations for educators to continue improving the state of practice of teaching SPLs, aimed at both individual educators as well as the wider community."
2,hal-01088464,2014,{Consistency Checking in Early Software Product Line Specifications - The VCC Approach},"Alférez, Mauricio and Lopez-Herrejon, Roberto Erick and Moreira, Ana and Amaral, Vasco and Egyed, Alexander","Software Product Line Engineering (SPLE) is a successful paradigm to produce a family of products for a specific domain. A challenge in SPLE is to check that different models used in early SPL specification do not contain inconsistent information that may be propagated and generate inconsistent products that do not conform to their requirements. This challenge is difficult to address due to the high number of possible combinations of product features and model fragments specifying those features. Variability Consistency Checking (VCC) offers automatic means to address that challenge. VCC relates information inferred from the relationships between features and from base models related to those features. Validating if all the products in an SPL satisfy user-defined consistency constraints is based on searching for a satisfying assignment of each formula generated by VCC. We validated VCC and its supporting tool on two case studies from different application domains; the results were encouraging, as we did not observe significant performance penalties."
3,hal-01088537,2014,"{Evaluating scenario-based SPL requirements approaches: the case for modularity, stability and expressiveness}","Alférez, Mauricio and Bonifácio, Rodrigo and Teixeira, Leopoldo and Accioly, Paola and Kulesza, Uirá and Moreira, Ana and Araujo, Joao and Borba, Paulo","Software product lines (SPL) provide support for productivity gains through systematic reuse. Among the various quality attributes supporting these goals, modularity, stability and expressiveness of feature specifications, their composition and configuration knowledge emerge as strategic values in modern software development paradigms. This paper presents a metric-based evaluation aiming at assessing how well the chosen qualities are supported by scenario-based SPL requirements approaches. The selected approaches for this study span from type of notation (textual or graphical based), style to support variability (annotation or composition based), and specification expressiveness. They are compared using the metrics developed in a set of releases from an exemplar case study. Our major findings indicate that composition-based approaches have greater potential to support modularity and stability, and that quantification mechanisms simplify and increase expressiveness of configuration knowledge and composition specifications."
4,hal-01688247,2019,{Modeling Variability in the Video Domain: Language and Experience Report},"Alférez, Mauricio and Acher, Mathieu and Galindo, José A and Baudry, Benoit and Benavides, David","[Context] In an industrial project, we addressed the challenge of developing a software-based video generator such that consumers and providers of video processing algorithms can benchmark them on a wide range of video variants. [Objective] This article aims to report on our positive experience in modeling, controlling, and implementing software variability in the video domain. [Method] We describe how we have designed and developed a variability modeling language, called VM, resulting from the close collaboration with industrial partners during two years. We expose the specific requirements and advanced variability constructs we developed and used to characterize and derive variations of video sequences. [Results] The results of our experiments and industrial experience show that our solution is effective to model complex variability information and supports the synthesis of hundreds of realistic video variants. [Conclusions] From the software language perspective, we learned that basic variability mechanisms are useful but not enough; attributes and multi-features are of prior importance; meta-information and specific constructs are relevant for scalable and purposeful reasoning over variability models. From the video domain and software perspective, we report on the practical benefits of a variability approach. With more automation and control, practitioners can now envision benchmarking video algorithms over large, diverse, controlled, yet realistic datasets (videos that mimic real recorded videos) – something impossible at the beginning of the project."
5,hal-01089268,2015,{Multitier diversification in Web-based software applications},"Allier, Simon and Barais, Olivier and Baudry, Benoit and Bourcier, Johann and Daubert, Erwan and Fleurey, Franck and Monperrus, Martin and Song, Hui and Tricoire, Maxime","Internet applications are formed from reusable components all over the software stack. Both on the client and the server side, web-specific libraries and frameworks enable creative developers to wrap up rich applications within very short time. Reuse and modularity are key for liberating creativity and entrepreneurship in the Internet world. However, this bright world has a darker side. The problem is that they participate in creating a massive monoculture, in a way that has never been encountered before. We propose to counter this phenomenon by extending software diversification beyond the operating system level. The way to go is to diversify the different layers, up to the functional code. We have experimented with a realistic Internet application to demonstrate the feasibility of multi-tier diversification. This experiment highlights the challenges that are ahead of software engineers if they want to systematically break the applicative monoculture of Internet applications."
6,hal-02148791,2021,{Learning Software Configuration Spaces: A Systematic Literature Review},"Alves Pereira, Juliana and Martin, Hugo and Acher, Mathieu and Jézéquel, Jean-Marc and Botterweck, Goetz and Ventresque, Anthony","Most modern software systems (operating systems like Linux or Android, Web browsers like Firefox or Chrome, video encoders like ffmpeg, x264 or VLC, mobile and cloud applications, etc.) are highly configurable. Hundreds of configuration options, features, or plugins can be combined, each potentially with distinct functionality and effects on execution time, security, energy consumption, etc. Due to the combinatorial explosion and the cost of executing software, it is quickly impossible to exhaustively explore the whole configuration space. Hence, numerous works have investigated the idea of learning it from a small sample of configurations' measurements. The pattern ``sampling, measuring, learning"" has emerged in the literature, with several practical interests for both software developers and end-users of configurable systems. In this systematic literature review, we report on the different application objectives (e.g., performance prediction, configuration optimization, constraint mining), use-cases, targeted software systems, and application domains. We review the various strategies employed to gather a representative and cost-effective sample. We describe automated software techniques used to measure functional and non-functional properties of configurations. We classify machine learning algorithms and how they relate to the pursued application. Finally, we also describe how researchers evaluate the quality of the learning process. The findings from this systematic review show that the potential application objective is important; there are a vast number of case studies reported in the literature related to particular domains or software systems. Yet, the huge variant space of configurable systems is still challenging and calls to further investigate the synergies between artificial intelligence and software engineering."
7,hal-01083759,2015,{Formal Verification Techniques for Model Transformations: A Tridimensional Classification},"Amrani, Moussa and Combemale, Benoit and Lúcio, Levi and Selim, Gehan and Dingel, Jürgen and Le Traon, Yves and Vangheluwe, Hans and Cordy, James R.","In Model Driven Engineering (MDE), models are first-class citizens, and model transformation is MDE's ""heart and soul"". Since model transformations are executed for a family of (conforming) models, their validity becomes a crucial issue. This paper proposes to explore the question of the formal verification of model transformation properties through a tridimensional approach: the transformation involved, the properties of interest addressed, and the formal verification techniques used to establish the properties. This work is intended for a double audience. For newcomers, it provides a tutorial introduction to the field of formal verification of model transformations. For readers more familiar with formal methods and model transformations, it proposes a literature review (although not systematic) of the contributions of the field. Overall, this work allows to better understand the evolution, trends and current practice in the domain of model transformation verification. This work opens an interesting research line for building an engineering of model transformation verification guided by the notion of model transformation intent."
8,hal-00988164,2014,{Towards an Automation of the Mutation Analysis Dedicated to Model Transformation},"Aranega, Vincent and Mottu, Jean-Marie and Etien, Anne and Degueule, Thomas and Baudry, Benoit and Dekeyser, Jean-Luc","A major benefit of Model Driven Engineering (MDE) relies on the automatic generation of artefacts from high-level models through intermediary levels using model transformations. In such a process, the input must be well-designed and the model transformations should be trustworthy. Due to the specificities of models and transformations, classical software test techniques have to be adapted. Among these techniques, mutation analysis has been ported and a set of mutation operators has been defined. However, mutation analysis currently requires a considerable manual work and suffers from the test data set improvement activity. This activity is seen by testers as a difficult and time-consuming job, and reduces the benefits of the mutation analysis. This paper addresses the test data set improvement activity. Model transformation traceability in conjunction with a model of mutation operators, and a dedicated algorithm allow to automatically or semi-automatically produce test models that detect new faults. The proposed approach is validated and illustrated in a case study written in Kermeta."
9,hal-02942374,2020,{Live Modeling in the Context of State Machine Models and Code Generation},"Bagherzadeh, Mojtaba and Jahed, Karim and Combemale, Benoit and Dingel, Juergen","Live modeling has been recognized as an important technique to edit behavioral models while being executed and helps in better understanding the impact of a design choice. In the context of Model-driven Development (MDD), models can be executed by interpretation or by the translation of models into existing programming languages, often by code generation. This work is concerned with the support of live modeling in the context of state machine models when they are executed by code generation. To this end, we propose an approach that is completely independent of any live programming support offered by the target language. This independence is achieved with the help of a model transformation which equips the model with support for features which are required for live modeling. A subsequent code generation then produces a self-reflective program that allows changes to the model elements at runtime (through synchronization of design and runtime models). We have applied the approach in the context of UML-RT and created a prototype (Live-UMLRT) that provides a full set of services for live modeling of UML-RT state machines such as re-execution, adding/removing states and transitions, and adding/removing action code. We have evaluated the prototype on several use-cases. The evaluation shows that (1) generation of a self-reflective program and model instrumentation can be carried out with reasonable performance, and (2) our approach can apply model changes to the running execution faster than the standard approach that depends on the live programming support of the target language."
10,hal-01350442,2015,{Quand l’écologie inspire de nouvelles manières de fabriquer des logiciels},"Baudry, Benoit","If, instead of distributing millions of identical copies, software makers could distribute variants, and if those variants could adapt to their environment, could these ""software ecosystems"" not strengthen the resilience of all the services that accompany us every day?"
11,hal-01182103,2015,{The Multiple Facets of Software Diversity: Recent Developments in Year 2000 and Beyond},"Baudry, Benoit and Monperrus, Martin","Early experiments with software diversity in the mid 1970's investigated N-version programming and recovery blocks to increase the reliability of embedded systems. Four decades later, the literature about software diversity has expanded in multiple directions: goals (fault-tolerance, security, software engineering); means (managed or automated diversity) and analytical studies (quantification of diversity and its impact). Our paper contributes to the field of software diversity as the first paper that adopts an inclusive vision of the area, with an emphasis on the most recent advances in the field. This survey includes classical work about design and data diversity for fault tolerance, as well as the cybersecurity literature that investigates randomization at different system levels. It broadens this standard scope of diversity, to include the study and exploitation of natural diversity and the management of diverse software products. Our survey includes the most recent works, with an emphasis from 2000 to present. The targeted audience is researchers and practitioners in one of the surveyed fields, who miss the big picture of software diversity. Assembling the multiple facets of this fascinating topic sheds a new light on the field."
12,hal-01096969,2015,{Breathing Ontological Knowledge Into Feature Model Synthesis: An Empirical Study},"Bécan, Guillaume and Acher, Mathieu and Baudry, Benoit and Ben Nasr, Sana","Feature Models (FMs) are a popular formalism for modeling and reasoning about the configurations of a software product line. As the manual construction of an FM is time-consuming and error-prone, management operations have been developed for reverse engineering, merging, slicing, or refactoring FMs from a set of configurations/dependencies. Yet the synthesis of meaningless ontological relations in the FM – as defined by its feature hierarchy and feature groups – may arise and cause severe difficulties when reading, maintaining or exploiting it. Numerous synthesis techniques and tools have been proposed, but only a few consider both configuration and ontological semantics of an FM. There are also few empirical studies investigating ontological aspects when synthesizing FMs. In this article, we define a generic, ontologic-aware synthesis procedure that computes the likely siblings or parent candidates for a given feature. We develop six heuristics for clustering and weighting the logical, syntactical and semantical relationships between feature names. We then perform an empirical evaluation on hundreds of FMs, coming from the SPLOT repository and Wikipedia. We provide evidence that a fully automated synthesis (i.e., without any user intervention) is likely to produce FMs far from the ground truths. As the role of the user is crucial, we empirically analyze the strengths and weaknesses of heuristics for computing ranking lists and different kinds of clusters. We show that a hybrid approach mixing logical and ontological techniques outperforms state-of-the-art solutions. We believe our approach, environment, and empirical results support researchers and practitioners working on reverse engineering and management of FMs."
13,hal-01427218,2017,{Automated Extraction of Product Comparison Matrices From Informal Product Descriptions},"Ben Nasr, Sana and Bécan, Guillaume and Acher, Mathieu and Bosco, João Ferreira Filho and Sannier, Nicolas and Baudry, Benoit and Davril, Jean-Marc","Domain analysts, product managers, or customers aim to capture the important features and differences among a set of related products. A case-by-case reviewing of each product description is a laborious and time-consuming task that fails to deliver a condensed view of a family of products. In this article, we investigate the use of automated techniques for synthesizing a product comparison matrix (PCM) from a set of product descriptions written in natural language. We describe a tool-supported process, based on term recognition, information extraction, clustering, and similarities, capable of identifying and organizing features and values in a PCM – despite the informality and absence of structure in the textual descriptions of products. We evaluate our proposal against numerous categories of products mined from BestBuy. Our empirical results show that the synthesized PCMs exhibit numerous quantitative, comparable information that can potentially complement or even refine technical descriptions of products. The user study shows that our automatic approach is capable of extracting a significant portion of correct features and correct values. This approach has been implemented in MatrixMiner, a web environment with an interactive support for automatically synthesizing PCMs from informal product descriptions. MatrixMiner also maintains traceability with the original descriptions and the technical specifications for further refinement or maintenance by users."
14,hal-01863885,2018,{Distributing Relational Model Transformation on MapReduce},"Benelallam, Amine and Gómez, Abel and Tisi, Massimo and Cabot, Jordi","MDE has been successfully adopted in the production of software for several domains. As the models that need to be handled in MDE grow in scale, it becomes necessary to design scalable algorithms for model transformation (MT) as well as suitable frameworks for storing and retrieving models efficiently. One way to cope with scalability is to exploit the wide availability of distributed clusters in the Cloud for the parallel execution of MT. However, because of the dense interconnectivity of models and the complexity of transformation logic, the efficient use of these solutions in distributed model processing and persistence is not trivial. This paper exploits the high level of abstraction of an existing relational MT language, ATL, and the semantics of a distributed programming model, MapReduce, to build an ATL engine with implicitly distributed execution. The syntax of the language is not modified and no primitive for distribution is added. Efficient distribution of model elements is achieved thanks to a distributed persistence layer, specifically designed for relational MT. We demonstrate the effectiveness of our approach by making an implementation of our solution publicly available and using it to experimentally measure the speed-up of the transformation system while scaling to larger models and clusters."
15,hal-01120558,2015,{Assessing the Use of Slicing-based Visualizing Techniques on the Understanding of Large Metamodels},"Blouin, Arnaud and Moha, Naouel and Baudry, Benoit and Sahraoui, Houari and Jézéquel, Jean-Marc","Context. Metamodels are cornerstones of various metamodeling activities. Such activities consist of, for instance, transforming models into code or comparing metamodels. These activities thus require a good understanding of a metamodel and/or its parts. Current metamodel editing tools are based on standard interactive visualization features, such as physical zooms. Objective. However, as soon as metamodels become large, navigating through large metamodels becomes a tedious task that hinders their understanding. So, a real need to support metamodel comprehension appears. Method. In this work we promote the use of model slicing techniques to build interactive visualization tools for metamodels. Model slicing is a model comprehension technique inspired by program slicing. We show how the use of Kompren, a domain-specific language for defining model slicers, can ease the development of such interactive visualization features. Results. We specifically make four main contributions. First, the proposed interactive visualization techniques permit users to focus on metamodel elements of interest, which aims at improving the understandability. Second, these proposed techniques are developed based on model slicing, a model comprehension technique that involves extracting a subset of model elements of interest. Third, we develop a metamodel visualizer, called Explen, embedding the proposed interactive visualization techniques. Fourth, we conducted experiments showing that Explen significantly outperforms EcoreTools, in terms of time, correctness, and navigation effort, on metamodeling tasks. Conclusion. The results of the experiments, in favor of Explen, show that improving metamodel understanding can be done using slicing-based interactive navigation features."
16,hal-03231669,2021,{Interacto: A Modern User Interaction Processing Model},"Blouin, Arnaud and Jézéquel, Jean-Marc","Since most software systems provide their users with interactive features, building user interfaces (UI) is one of the core software engineering tasks. It consists in designing, implementing and testing ever more sophisticated and versatile ways for users to interact with software systems, and safely connecting these interactions with commands querying or modifying their state. However, most UI frameworks still rely on a low level model, the bare bone UI event processing model. This model was suitable for the rather simple UIs of the early 80's (menus, buttons, keyboards, mouse clicks), but now exhibits major software engineering flaws for modern, highly interactive UIs. These flaws include lack of separation of concerns, weak modularity and thus low reusability of code for advanced interactions, as well as low testability. To mitigate these flaws, we propose Interacto as a high level user interaction processing model. By reifying the concept of user interaction, Interacto makes it easy to design, implement and test modular and reusable advanced user interactions, and to connect them to commands with built-in undo/redo support. To demonstrate its applicability and generality, we briefly present two open source implementations of Interacto for Java/JavaFX and TypeScript/Angular. We evaluate the interest of Interacto (1) on a real-world case study, where it has been used since 2013, and (2) with a controlled experiment with 44 master students, comparing it with traditional UI frameworks."
17,hal-00746566,2015,{Kompren: Modeling and Generating Model Slicers},"Blouin, Arnaud and Combemale, Benoit and Baudry, Benoit and Beaudoux, Olivier",
18,hal-01499106,2018,{User Interface Design Smell: Automatic Detection and Refactoring of Blob Listeners},"Blouin, Arnaud and Lelli, Valéria and Baudry, Benoit and Coulon, Fabien","Context. User Interfaces (UIs) intensively rely on event-driven programming: widgets send UI events, which capture users' interactions, to dedicated objects called controllers. Controllers use several UI listeners that handle these events to produce UI commands. Objective. First, we reveal the presence of design smells in the code that describes and controls UIs. Second, we demonstrate that specific code analyses are necessary to analyze and refactor UI code, because of its coupling with the rest of the code. Method. We conducted an empirical study on four large Java Swing and SWT open-source software systems: Eclipse, JabRef, ArgouML, and FreeCol. We study to what extent the number of UI commands that a UI listener can produce has an impact on the change- and fault-proneness of the UI listener code. We develop a static code analysis for detecting UI commands in the code. Results. We identify a new type of design smell, called Blob listener, that characterizes UI listeners that can produce more than two UI commands. We propose a systematic static code analysis procedure that searches for Blob listeners, which we implement in InspectorGuidget. We conducted experiments on the four software systems for which we manually identified 53 instances of Blob listener. InspectorGuidget successfully detected 52 Blob listeners out of 53. The results exhibit a precision of 81.25 % and a recall of 98.11 %. We then developed a semi-automatic and behavior-preserving refactoring process to remove Blob listeners. 49.06 % of the 53 Blob listeners were automatically refactored. Patches for JabRef and FreeCol have been accepted and merged. Discussions with developers of the four software systems assess the relevance of the Blob listener. Conclusion. This work shows that UI code also suffers from design smells that have to be identified and characterized. We argue that studies have to be conducted to find other UI design smells and tools that analyze UI code must be developed."
19,hal-02422437,2019,{Leveraging metamorphic testing to automatically detect inconsistencies in code generator families},"Boussaa, Mohamed and Barais, Olivier and Sunyé, Gerson and Baudry, Benoit","Generative software development has paved the way for the creation of multiple code generators that serve as a basis for automatically generating code to different software and hardware platforms. In this context, the software quality becomes highly correlated to the quality of code generators used during software development. Eventual failures may result in a loss of confidence for the developers, who will unlikely continue to use these generators. It is then crucial to verify the correct behaviour of code generators in order to preserve software quality and reliability. In this paper, we leverage the metamorphic testing approach to automatically detect inconsistencies in code generators via so-called ""metamorphic relations"". We define the metamorphic relation (i.e., test oracle) as a comparison between the variations of performance and resource usage of test suites running on different versions of generated code. We rely on statistical methods to find the threshold value from which an unexpected variation is detected. We evaluate our approach by testing a family of code generators with respect to resource usage and performance metrics for five different target software platforms. The experimental results show that our approach is able to detect, among 95 executed test suites, 11 performance and 15 memory usage inconsistencies."
20,hal-01614377,2019,{Advanced and efficient execution trace management for executable domain-specific modeling languages},"Bousse, Erwan and Mayerhofer, Tanja and Combemale, Benoit and Baudry, Benoit","Executable Domain-Specific Modeling Languages (xDSMLs) enable the application of early dynamic verification and validation (V&V) techniques for behavioral models. At the core of such techniques, execution traces are used to represent the evolution of models during their execution. In order to construct execution traces for any xDSML, generic trace metamodels can be used. Yet, regarding trace manipulations, generic trace metamodels lack efficiency in time because of their sequential structure, efficiency in memory because they capture superfluous data, and usability because of their conceptual gap with the considered xDSML. Our contribution is a novel generative approach that defines a multidimensional and domain-specific trace metamodel enabling the construction and manipulation of execution traces for models conforming to a given xDSML. Efficiency in time is improved by providing a variety of navigation paths within traces, while usability and memory are improved by narrowing the scope of trace metamodels to fit the considered xDSML. We evaluated our approach by generating a trace metamodel for fUML and using it for semantic differencing, which is an important V&V technique in the realm of model evolution. Results show a significant performance improvement and simplification of the semantic differencing rules as compared to the usage of a generic trace metamodel."
21,hal-01662336,2018,{Omniscient Debugging for Executable DSLs},"Bousse, Erwan and Leroy, Dorian and Combemale, Benoit and Wimmer, Manuel and Baudry, Benoit","Omniscient debugging is a promising technique that relies on execution traces to enable free traversal of the states reached by a model (or program) during an execution. While a few General-Purpose Languages (GPLs) already have support for omniscient debugging, developing such a complex tool for any executable Domain Specific Language (DSL) remains a challenging and error-prone task. A generic solution must: support a wide range of executable DSLs independently of the metaprogramming approaches used for implementing their semantics; be efficient for good responsiveness. Our contribution relies on a generic omniscient debugger supported by efficient generic trace management facilities. To support a wide range of executable DSLs, the debugger provides a common set of debugging facilities, and is based on a pattern to define runtime services independently of metaprogramming approaches. Results show that our debugger can be used with various executable DSLs implemented with different metaprogramming approaches. As compared to a solution that copies the model at each step, it is on average six times more efficient in memory, and at least 2.2 times faster when exploring past execution states, while only slowing down the execution 1.6 times on average."
22,hal-02317864,2020,{Comparing and Classifying Model Transformation Reuse Approaches across Metamodels},"Bruel, Jean-Michel and Combemale, Benoit and Guerra, Esther M and Jézéquel, Jean-Marc and Kienzle, Jörg and de Lara, Juan and Mussbacher, Gunter and Syriani, Eugene and Vangheluwe, Hans","Model transformations are essential elements of Model-driven Engineering (MDE) solutions, as they enable the automatic manipulation of models. MDE promotes the creation of domain-specific metamodels, but without proper reuse mechanisms, model transformations need to be developed from scratch for each new metamodel. In this paper, our goal is to understand whether transformation reuse across metamodels is needed by the community, evaluate its current state, identify practical needs and propose promising lines for further research. For this purpose, we first report on a survey to understand the reuse approaches used currently in practice and the needs of the community. Then, we propose a classification of reuse techniques based on a feature model, and compare a sample of specific approaches – model types, concepts, a-posteriori typing, multilevel modeling, typing requirement models, facet-oriented modeling, mapping operators, constraint-based model types, and design patterns for model transformations – based on this feature model and a common example. We discuss strengths and weaknesses of each approach, provide a reading grid used to compare their features, compare with community needs, identify gaps in current transformation reuse approaches in relation to these needs and propose future research directions."
23,hal-04037520,2023,{On Understanding Context Modelling for Adaptive Authentication Systems},"Bumiller, Anne and Challita, Stéphanie and Combemale, Benoit and Barais, Olivier and Aillery, Nicolas and Le Lan, Gael","In many situations, it is of interest for authentication systems to adapt to context (e.g., when the user’s behavior differs from the previous behavior). Hence, representing the context with appropriate and well-designed models is crucial. We provide a comprehensive overview and analysis of research work on Context Modelling for Adaptive Authentication systems (CM4AA). To this end, we pursue three goals based on the Systematic Mapping Study (SMS) and Systematic Literature Review (SLR) research methodologies. We first present a SMS to structure the research area of CM4AA (goal 1). We complement the SMS with a SLR to gather and synthesise evidence about context information and its modelling for adaptive authentication systems (goal 2). From the knowledge gained from goal 2, we determine the desired properties of the context information model and its use for adaptive authentication systems (goal 3). Motivated to find out how to model context information for adaptive authentication, we provide a structured survey of the literature to date on CM4AA and a classification of existing proposals according to several analysis metrics. We demonstrate the ability of capturing a common set of contextual features that are relevant for adaptive authentication systems independent from the application domain. We emphasise that despite the possibility of a unified framework, no standard for CM4AA exists."
24,hal-01186015,2015,{An Analysis of Metamodeling Practices for MOF and OCL},"Cadavid, Juan and Combemale, Benoit and Baudry, Benoit","The definition of a metamodel that precisely captures domain knowledge for effective know-how capitalization is a challenging task. A major obstacle for domain experts who want to build a metamodel is that they must master two radically different languages: an object-oriented, MOF-compliant, modeling language to capture the domain structure and first order logic (the Object Constraint Language) for the definition of well-formedness rules. However, there are no guidelines to assist the conjunct usage of both paradigms, and few tools support it. Consequently, we observe that most metamodels have only an object-oriented domain structure, leading to inaccurate metamodels. In this paper, we perform the first empirical study, which analyzes the current state of practice in metamodels that actually use logical expressions to constrain the structure. We analyze 33 metamodels including 995 rules coming from industry, academia and the Object Management Group, to understand how metamodelers articulate both languages. We implement a set of metrics in the OCLMetrics tool to evaluate the complexity of both parts, as well as the coupling between both. We observe that all metamodels tend to have a small, core subset of concepts, which are constrained by most of the rules; in general, the rules are loosely coupled to the structure, and we identify the set of OCL constructs actually used in rules."
25,hal-03122452,2021,{Model-Based Cloud Resource Management with TOSCA and OCCI},"Challita, Stéphanie and Korte, Fabian and Erbel, Johannes and Zalila, Faiez and Grabowski, Jens and Merle, Philippe","With the advent of cloud computing, different cloud providers with heterogeneous cloud services (compute, storage, network, applications, etc.) and their related Application Programming Interfaces (APIs) have emerged. This heterogeneity complicates the implementation of an interoperable cloud system. Several standards have been proposed to address this challenge and provide a unified interface to cloud resources. The Open Cloud Computing Interface (OCCI) thereby focuses on the standardization of a common API for Infrastructure-as-a-Service (IaaS) providers while the Topology and Orchestration Specification for Cloud Applications (TOSCA) focuses on the standardization of a template language to enable the proper definition of the topology of cloud applications and their orchestrations on top of a cloud system. TOSCA thereby does not define how the application topologies are created on the cloud. Therefore, we analyse the conceptual similarities between the two approaches and we study how we can integrate them to obtain a complete standard-based approach to manage both Cloud Infrastructure and Cloud application layers. We propose an automated extensive mapping between the concepts of the two standards and we provide TOSCA Studio, a model-driven tool chain for TOSCA that conforms to OCCI. TOSCA Studio allows to graphically design cloud applications as well as to deploy and manage them at runtime using a fully model-driven cloud orchestrator based on the two standards. Our contribution is validated by successfully transforming and deploying three cloud applications: WordPress, Node Cellar and Multi-Tier."
26,hal-02612087,2021,{A Hitchhiker's Guide to Model-Driven Engineering for Data-Centric Systems},"Combemale, Benoit and Kienzle, Jörg and Mussbacher, Gunter and Ali, Hyacinth and Amyot, Daniel and Bagherzadeh, Mojtaba and Batot, Edouard and Bencomo, Nelly and Benni, Benjamin and Bruel, Jean-Michel and Cabot, Jordi and Cheng, Betty H C and Collet, Philippe and Engels, Gregor and Heinrich, Robert and Jézéquel, Jean-Marc and Koziolek, Anne and Mosser, Sébastien and Reussner, Ralf and Sahraoui, Houari and Saini, Rijul and Sallou, June and Stinckwich, Serge and Syriani, Eugene and Wimmer, Manuel",
27,hal-01803008,2018,{Concern-Oriented Language Development (COLD): Fostering Reuse in Language Engineering},"Combemale, Benoit and Kienzle, Jörg and Mussbacher, Gunter and Barais, Olivier and Bousse, Erwan and Cazzola, Walter and Collet, Philippe and Degueule, Thomas and Heinrich, Robert and Jézéquel, Jean-Marc and Leduc, Manuel and Mayerhofer, Tanja and Mosser, Sébastien and Schöttle, Matthias and Strittmatter, Misha and Wortmann, Andreas","Domain-Specific Languages (DSLs) bridge the gap between the problem space, in which stakeholders work, and the solution space, i.e., the concrete artifacts defining the target system. They are usually small and intuitive languages whose concepts and expressiveness fit a particular domain. DSLs recently found their application in an increasingly broad range of domains, e.g., cyber-physical systems, computational sciences and high performance computing. Despite recent advances, the development of DSLs is error-prone and requires substantial engineering efforts. Techniques to reuse from one DSL to another and to support customization to meet new requirements are thus particularly welcomed. Over the last decade, the Software Language Engineering (SLE) community has proposed various reuse techniques. However, all these techniques remain disparate and complicate the development of real-world DSLs involving different reuse scenarios. In this paper, we introduce the Concern-Oriented Language Development (COLD) approach, a new language development model that promotes modularity and reusability of language concerns. A language concern is a reusable piece of language that consists of usual language artifacts (e.g., abstract syntax, concrete syntax, semantics) and exhibits three specific interfaces that support (1) variability management, (2) customization to a specific context, and (3) proper usage of the reused artifact. The approach is supported by a conceptual model which introduces the required concepts to implement COLD. We also present concrete examples of some language concerns and the current state of their realization with metamodel-based and grammar-based language workbenches. We expect this work to provide insights into how to foster reuse in language specification and implementation, and how to support it in language workbenches."
28,hal-00994551,2014,{Globalizing Modeling Languages},"Combemale, Benoit and Deantoni, Julien and Baudry, Benoit and France, Robert B. and Jézéquel, Jean-Marc and Gray, Jeff","In the software and systems modeling community, research on domain-specific modeling languages (DSMLs) is focused on providing technologies for developing languages and tools that allow domain experts to develop system solutions efficiently. Unfortunately, the current lack of support for explicitly relating concepts expressed in different DSMLs makes it very difficult for software and system engineers to reason about information spread across models describing different system aspects. Supporting coordinated use of DSMLs leads to what we call the globalization of modeling languages. This article presents a research initiative that broadens the DSML research focus beyond the development of independent DSMLs to one that supports globalized DSMLs, that is, DSMLs that facilitate coordination of work across different domains of expertise."
29,hal-01611057,2017,{A WebRTC Extension to Allow Identity Negotiation at Runtime},"Corre, Kevin and Bécot, Simon and Barais, Olivier and Sunyé, Gerson","In this paper we describe our implementation of the WebRTC identity architecture. We adapt OpenID Connect servers to support WebRTC peer to peer authentication and detail the issues and solutions found in the process. We observe that although WebRTC allows for the exchange of identity assertion between peers, users lack feedback and control over the other party authentication. To allow identity negotiation during a WebRTC communication setup, we propose an extension to the Session Description Protocol. Our implementation demonstrates current limitations with respect to the current WebRTC specification."
30,hal-01611048,2017,{Why can’t users choose their identity providers on the web?},"Corre, Kevin and Barais, Olivier and Sunyé, Gerson and Frey, Vincent and Crom, Jean-Michel","Authentication delegation is a major function of the modern web. Identity Providers (IdP) acquired a central role by providing this function to other web services. By knowing which web services or web applications access its service, an IdP can violate the end-user privacy by discovering information that the user did not want to share with its IdP. For instance, WebRTC introduces a new field of usage as authentication delegation happens during the call session establishment, between two users. As a result, an IdP can easily discover that Bob has a meeting with Alice. A second issue that increases the privacy violation is the lack of choice for the end-user to select its own IdP. Indeed, on many web-applications, the end-user can only select between a subset of IdPs, in most cases Facebook or Google. In this paper, we analyze this phenomenon, in particular why the end-user cannot easily select its preferred IdP, though there exist standards in this field such as OpenID Connect and OAuth 2? To lead this analysis, we conduct three investigations. The first one is a field survey on OAuth 2 and OpenID Connect scope usage by web sites to understand if scopes requested by web-sites could allow for user defined IdPs. The second one tries to understand whether the problem comes from the OAuth 2 protocol or its implementations by IdP. The last one tries to understand if trust relations between websites and IdP could prevent the end user from selecting its own IdP. Finally, we sketch a possible architecture for web browser based identity management, and report on the implementation of a prototype."
31,hal-02290742,2019,{A Snowballing Literature Study on Test Amplification},"Danglot, Benjamin and Vera-Pérez, Oscar Luis and Yu, Zhongxing and Zaidman, Andy and Monperrus, Martin and Baudry, Benoit","The adoption of agile approaches has put an increased emphasis on testing, resulting in extensive test suites. These suites include a large number of tests, in which developers embed knowledge about meaningful input data and expected properties as oracles. This article surveys works that exploit this knowledge to enhance manually written tests with respect to an engineering goal (e.g., improve coverage or refine fault localization). While these works rely on various techniques and address various goals, we believe they form an emerging and coherent field of research, which we coin “test amplification”. We devised a first set of papers from DBLP, searching for all papers containing ""test"" and ""amplification"" in their title. We reviewed the 70 papers in this set and selected the 4 papers that fit the definition of test amplification. We use them as the seeds for our snowballing study, and systematically followed the citation graph. This study is the first that draws a comprehensive picture of the different engineering goals proposed in the literature for test amplification. We believe that this survey will help researchers and practitioners entering this new field to understand more quickly and more deeply the intuitions, concepts and techniques used for test amplification."
32,hal-03121735,2020,{An approach and benchmark to detect behavioral changes of commits in continuous integration},"Danglot, Benjamin and Monperrus, Martin and Rudametkin, Walter and Baudry, Benoit","When a developer pushes a change to an application’s codebase, a good practice is to have a test case specifying this behavioral change. Thanks to continuous integration (CI), the test is run on subsequent commits to check that they do not introduce a regression for that behavior. In this paper, we propose an approach that detects behavioral changes in commits. As input, it takes a program, its test suite, and a commit. Its output is a set of test methods that capture the behavioral difference between the pre-commit and post-commit versions of the program. We call our approach DCI (Detecting behavioral changes in CI). It works by generating variations of the existing test cases through (i) assertion amplification and (ii) a search-based exploration of the input space. We evaluate our approach on a curated set of 60 commits from 6 open source Java projects. To our knowledge, this is the first ever curated dataset of real-world behavioral changes. Our evaluation shows that DCI is able to generate test methods that detect behavioral changes. Our approach is fully automated and can be integrated into current development processes. The main limitations are that it targets unit tests and works on a relatively small fraction of commits. More specifically, DCI works on commits that have a unit test that already executes the modified code. In practice, from our benchmark projects, we found 15.29% of commits to meet the conditions required by DCI."
33,hal-01923575,2019,{Automatic Test Improvement with DSpot: a Study with Ten Mature Open-Source Projects},"Danglot, Benjamin and Vera-Pérez, Oscar Luis and Baudry, Benoit and Monperrus, Martin","In the literature, there is a rather clear segregation between manually written tests by developers and automatically generated ones. In this paper, we explore a third solution: to automatically improve existing test cases written by developers. We present the concept, design and implementation of a system called DSpot, that takes developer-written test cases as input (JUnit tests in Java) and synthesizes improved versions of them as output. Those test improvements are given back to developers as patches or pull requests, that can be directly integrated in the main branch of the test code base. We have evaluated DSpot in a deep, systematic manner over 40 real-world unit test classes from 10 notable and open-source software projects. We have amplified all test methods from those 40 unit test classes. In 26/40 cases, DSpot is able to automatically improve the test under study, by triggering new behaviors and adding new valuable assertions. Next, for ten projects under consideration, we have proposed a test improvement automatically synthesized by DSpot to the lead developers. In total, 13/19 proposed test improvements were accepted by the developers and merged into the main code base. This shows that DSpot is capable of automatically improving unit-tests in real-world, large scale Java software."
34,hal-01378523,2018,{Correctness Attraction: A Study of Stability of Software Behavior Under Runtime Perturbation},"Danglot, Benjamin and Preux, Philippe and Baudry, Benoit and Monperrus, Martin","Can the execution of a software be perturbed without breaking the correctness of the output? In this paper, we devise a novel protocol to answer this rarely investigated question. In an experimental study, we observe that many perturbations do not break the correctness in ten subject programs. We call this phenomenon ``correctness attraction''. The uniqueness of this protocol is that it considers a systematic exploration of the perturbation space as well as perfect oracles to determine the correctness of the output. To this extent, our findings on the stability of software under execution perturbations have a level of validity that has never been reported before in the scarce related work. A qualitative manual analysis enables us to set up the first taxonomy ever of the reasons behind correctness attraction."
35,hal-02013421,2021,{Investigating Machine Learning Algorithms for Modeling SSD I/O Performance for Container-based Virtualization},"Dartois, Jean-Emile and Boukhobza, Jalil and Knefati, Anas and Barais, Olivier","One of the cornerstones of the cloud provider business is to reduce hardware resources cost by maximizing their utilization. This is done through smartly sharing processor, memory, network and storage, while fully satisfying SLOs negotiated with customers. For the storage part, while SSDs are increasingly deployed in data centers mainly for their performance and energy efficiency, their internal mechanisms may cause a dramatic SLO violation. In effect, we measured that I/O interference may induce a 10x performance drop. We are building a framework based on autonomic computing which aims to achieve intelligent container placement on storage systems by preventing bad I/O interference scenarios. One prerequisite to such a framework is to design SSD performance models that take into account interactions between running processes/containers, the operating system and the SSD. These interactions are complex. In this paper, we investigate the use of machine learning for building such models in a container based Cloud environment. We have investigated five popular machine learning algorithms along with six different I/O intensive applications and benchmarks. We analyzed the prediction accuracy, the learning curve, the feature importance and the training time of the tested algorithms on four different SSD models. Beyond describing modeling component of our framework, this paper aims to provide insights for cloud providers to implement SLO compliant container placement algorithms on SSDs. Our machine learning-based framework succeeded in modeling I/O interference with a median Normalized Root-Mean-Square Error (NRMSE) of 2.5%."
36,hal-01367305,2016,{Safe Model Polymorphism for Flexible Modeling},"Degueule, Thomas and Combemale, Benoit and Blouin, Arnaud and Barais, Olivier and Jézéquel, Jean-Marc","Domain-Specific Languages (DSLs) are increasingly used by domain experts to handle various concerns in systems and software development. To support this trend, the Model-Driven Engineering (MDE) community has developed advanced techniques for designing new DSLs. However, the widespread use of independently developed, and constantly evolving DSLs is hampered by the rigidity imposed to the language users by the DSLs and their tooling, e.g., for manipulating a model through various similar DSLs or successive versions of a given DSL. In this paper, we propose a disciplined approach that leverages type groups' polymorphism to provide an advanced type system for manipulating models, in a polymorphic way, through different DSL interfaces. A DSL interface, aka. model type, specifies a set of features, or services, available on the model it types, and subtyping relations among these model types define the safe substitutions. This type system complements the Melange language workbench and is seamlessly integrated into the Eclipse Modeling Framework (EMF), hence providing structural interoperability and compatibility of models between EMF-based tools. We illustrate the validity and practicability of our approach by bridging safe interoperability between different semantic and syntactic variation points of a finite-state machine (FSM) language, as well as between successive versions of the Unified Modeling Language (UML)."
37,hal-01090552,2014,{Efficient Multi-Objective Optimisation of Service Compositions in Mobile Ad hoc Networks Using Lightweight Surrogate Models},"Efstathiou, Dionysios and Mcburney, Peter and Zschaler, Steffen and Bourcier, Johann","Infrastructure-less Mobile Ad hoc NETworks (MANETs) and Service-Oriented Architecture (SOA) enable the development of pervasive applications. Based on SOA, we can abstract devices' resources as software services which can be combined into value-added composite services providing complex functionalities while exhibiting specified QoS properties. Configuring compositions with optimal QoS is challenging due to dynamic network topologies and availability of resources. Existing approaches seek to optimise the selection of which services to participate in a centralised orchestration without considering the overhead for estimating their combined QoS. QoS metrics can be used as fitness functions to guide the search for optimal compositions. When composing services offered by diverse devices, there is no trivial relationship between the composition's QoS and its component services. Measuring the fitness values of a candidate composition could be done either by monitoring its actual invocation or simulating it. However, both approaches are too expensive to be used within an optimisation process. In this paper, we propose a surrogate-based multi-objective optimisation approach for exploring trade-off compositions. The evaluation results show that by replacing the expensive fitness functions with lightweight surrogate models, we can vastly accelerate the optimisation algorithm while producing trade-off solutions of high quality."
38,hal-03466396,2022,{Conceptualizing Digital Twins},"Eramo, Romina and Bordeleau, Francis and Combemale, Benoit and van den Brand, Mark and Wimmer, Manuel and Wortmann, Andreas","Digital Twins are an emerging concept which is gaining importance in several fields. It refers to a comprehensive software representation of an actual system, which includes structures, properties, conditions, behaviours, history and possible futures of that system through models and data to be continuously synchronized. Digital Twins can be built for different purposes, such as for the design, development, analysis, simulation, and operations of non-digital systems in order to understand, monitor, and/or optimize the actual system. To realize Digital Twins, data and models originated from diverse engineering disciplines have to be integrated, synchronized, and managed to leverage the benefits provided by software (digital) technologies. However, properly arranging the different models, data sources, and their relations to engineer Digital Twins is challenging. We, therefore, propose a conceptual modeling framework for Digital Twins that captures the combined usage of heterogeneous models and their respective evolving data for the twin's entire life-cycle."
39,hal-01026581,2014,{Generating Counterexamples of Model-based Software Product Lines},"Ferreira Filho, Joao Bosco and Barais, Olivier and Acher, Mathieu and Le Noir, Jérôme and Legay, Axel and Baudry, Benoit","In a Model-based Software Product Line (MSPL), the variability of the domain is characterized in a variability model and the core artifacts are base models conforming to a modeling language (also called metamodel). A realization model connects the features of the variability model to the base model elements, triggering operations over these elements based on a configuration. The design space of an MSPL is extremely complex to manage for the engineer, since the number of variants may be exponential and the derived product models have to be conforming to numerous well-formedness and business rules. In this paper, the objective is to provide a way to generate MSPLs, called counterexamples (also called anti-patterns), that can produce invalid product models despite a valid configuration in the variability model. We describe the foundations and motivate the usefulness of counterexamples (e.g., inference of guidelines or domain-specific rules to avoid earlier the specification of incorrect mappings; testing oracles for increasing the robustness of derivation engines given a modeling language). We provide a generic process, based on the Common Variability Language (CVL) to randomly search the space of MSPLs for a specific modelling language. We develop LineGen a tool on top of CVL and modeling technologies to support the methodology and the process. LineGen targets different scenarios and is flexible to work either with just a domain metamodel as input or also with pre-defined variability models and base models. We validate the effectiveness of this process for three formalisms at different scales (up to 247 metaclasses and 684 rules). We also apply the approach in the context of a real industrial scenario involving a large-scale metamodel."
40,hal-02471302,2020,{MALTESQUE 2019 Workshop Summary},"Fontana, Francesca Arcelli and Perrouin, Gilles and Ampatzoglou, Apostolos and Acher, Mathieu and Walter, Bartosz and Cordy, Maxime and Palomba, Fabio and Devroey, Xavier","Welcome to the third edition of the workshop on Machine Learning Techniques for Software Quality Evaluation (MaLTeSQuE 2019), held in Tallinn, Estonia, August 27th, 2019, co-located with ESEC / FSE 2019. This year MALTESQUE merged with the MASES (Machine Learning and Software Engineering in Symbiosis) workshop, co-located with the ASE 2018 conference. Ten papers from all over the world were submitted, seven of them were accepted. The program also featured a keynote by Lionel Briand on the use of machine learning to improve software testing."
41,hal-01204510,2014,{Supporting Distributed Product Configuration by Integrating Heterogeneous Variability Modeling Approaches},"Galindo Duarte, José Angel and Dhungana, Deepak and Rabiser, Rick and Benavides, David and Botterweck, Goetz and Grünbacher, Paul","In industrial settings products are developed by more than one organization. Software vendors and suppliers typically maintain their own product lines, which contribute to a larger (multi) product line or software ecosystem. It is unrealistic to assume that the participating organizations will agree on using a specific variability modeling technique—they will rather use different approaches and tools to manage the variability of their systems. Objective: We aim to support product configuration in software ecosystems based on several variability models with different semantics that have been created using different notations. Method: We present an integrative approach that provides a unified perspective to users configuring products in multi product line environments, regardless of the different modeling methods and tools used internally. We also present a technical infrastructure and a prototype implementation based on web services. Results: We show the feasibility of the approach and its implementation by using it with the three most widespread types of variability modeling approaches in the product line community, i.e., feature-based, OVM-style, and decision-oriented modeling. To demonstrate the feasibility and flexibility of our approach, we present an example derived from industrial experience in enterprise resource planning. We further applied the approach to support the configuration of privacy settings in the Android ecosystem based on multiple variability models. We also evaluated the performance of different model enactment strategies used in our approach. Conclusions: Tools and techniques allowing stakeholders to handle variability in a uniform manner can considerably foster the initiation and growth of software ecosystems from the perspective of software reuse and configuration."
42,hal-01354999,2016,{ScapeGoat: Spotting abnormal resource usage in component-based reconfigurable software systems},"Gonzalez-Herrera, Inti and Bourcier, Johann and Daubert, Erwan and Rudametkin, Walter and Barais, Olivier and Fouquet, François and Jézéquel, Jean-Marc and Baudry, Benoit","Modern component frameworks support continuous deployment and simultaneous execution of multiple software components on top of the same virtual machine. However , isolation between the various components is limited. A faulty version of any one of the software components can compromise the whole system by consuming all available resources. In this paper, we address the problem of efficiently identifying faulty software components running simultaneously in a single virtual machine. Current solutions that perform permanent and extensive monitoring to detect anomalies induce high overhead on the system, and can, by themselves, make the system unstable. In this paper we present an optimistic adaptive monitoring system to determine the faulty components of an application. Suspected components are finely analyzed by the monitoring system, but only when required. Unsuspected components are left untouched and execute normally. Thus, we perform localized just-in-time monitoring that decreases the accumulated overhead of the monitoring system. We evaluate our approach on two case studies against a state-of-the-art monitoring system and show that our technique correctly detects faulty components, while reducing overhead by an average of 93%."
43,hal-01829928,2018,"{Test them all, is it worth it? Assessing configuration sampling on the JHipster Web development stack}","Halin, Axel and Nuttinck, Alexandre and Acher, Mathieu and Devroey, Xavier and Perrouin, Gilles and Baudry, Benoit","Many approaches for testing configurable software systems start from the same assumption: it is impossible to test all configurations. This motivated the definition of variability-aware abstractions and sampling techniques to cope with large configuration spaces. Yet, there is no theoretical barrier that prevents the exhaustive testing of all configurations by simply enumerating them, if the effort required to do so remains acceptable. Not only this: we believe there is lots to be learned by systematically and exhaustively testing a configurable system. In this case study, we report on the first ever endeavour to test all possible configurations of an industry-strength, open source configurable software system, JHipster, a popular code generator for web applications. We built a testing scaffold for the 26,000+ configurations of JHipster using a cluster of 80 machines during 4 nights for a total of 4,376 hours (182 days) CPU time. We find that 35.70% configurations fail and we identify the feature interactions that cause the errors. We show that sampling strategies (like dissimilarity and 2-wise): (1) are more effective to find faults than the 12 default configurations used in the JHipster continuous integration; (2) can be too costly and exceed the available testing budget. We cross this quantitative analysis with the qualitative assessment of JHipster's lead developers."
44,hal-03921298,2022,{API beauty is in the eye of the clients: 2.2 million Maven dependencies reveal the spectrum of client–API usages},"Harrand, Nicolas and Benelallam, Amine and Soto-Valero, César and Bettega, François and Barais, Olivier and Baudry, Benoit","Hyrum’s law states a common observation in the software industry: “With a sufficient number of users of an API, it does not matter what you promise in the contract: all observable behaviors of your system will be depended on by somebody”. Meanwhile, recent research results seem to contradict this observation when they state that “for most APIs, there is a small number of features that are actually used”. In this work, we perform a large scale empirical study of client–API relationships in the Maven ecosystem, in order to investigate this seeming paradox between the observations in industry and the research literature. We study the 94 most popular libraries in Maven Central, as well as the 829,410 client artifacts that declare a dependency to these libraries and that are available in Maven Central, summing up to 2.2M dependencies. Our analysis indicates the existence of a wide spectrum of API usages: with enough clients, most API types end up being used at least once. Our second key observation is that, for all libraries, there is a small set of API types that are used by the vast majority of their clients. The practical consequences of this study are two-fold: (i) it is possible for API maintainers to find an essential part of their API on which they can focus their efforts; (ii) API developers should limit the public API elements to the set of features for which they are ready to have users."
45,hal-01352831,2016,{Optimal Minimisation of Pairwise-covering Test Configurations Using Constraint Programming},"Hervieu, Aymeric and Marijan, Dusica and Gotlieb, Arnaud and Baudry, Benoit","Context: Testing highly-configurable software systems is challenging due to a large number of test configurations that have to be carefully selected in order to reduce the testing effort as much as possible, while maintaining high software quality. Finding the smallest set of valid test configurations that ensure sufficient coverage of the system's feature interactions is thus the objective of validation engineers, especially when the execution of test configurations is costly or time-consuming. However, this problem is NP-hard in general and approximation algorithms have often been used to address it in practice. Objective: In this paper, we explore an alternative approach based on constraint programming that will allow engineers to increase the effectiveness of configuration testing while keeping the number of configurations as low as possible. Method: Our approach consists in using a (time-aware) minimisation algorithm based on constraint programming. Given the amount of time, our solution generates a minimised set of valid test configurations that ensure coverage of all pairs of feature values (a.k.a. pairwise coverage). The approach has been implemented in a tool called PACOGEN. Results: PACOGEN was evaluated on 224 feature models in comparison with the two existing tools that are based on a greedy algorithm. For 79% of 224 feature models, PACOGEN generated up to 60% fewer test configurations than the competitor tools. We further evaluated PACOGEN in the case study of large industrial highly-configurable video conferencing software with a feature model of 169 features, and found 60% fewer configurations compared with the manual approach followed by test engineers. The set of test configurations generated by PACOGEN decreased the time required by test engineers in manual test configuration by 85%, increasing the feature-pairs coverage at the same time. Conclusion: Extensive evaluation concluded that optimal minimisation of pairwise-covering test configurations is efficiently addressed using constraint programming techniques."
46,hal-01503821,2017,{Comment maîtriser la complexité des logiciels ?},"Jézéquel, Jean-Marc and Jongwane, Joanna","Software is everywhere in our daily lives, and this is not about to stop! In this episode of the Interstices podcast, Jean-Marc Jézéquel, a researcher in computer science, discusses the challenges surrounding this mathematical object that is invading the real world."
47,hal-00829839,2015,{Mashup of Meta-Languages and its Implementation in the Kermeta Language Workbench},"Jézéquel, Jean-Marc and Combemale, Benoit and Barais, Olivier and Monperrus, Martin and Fouquet, François","With the growing use of domain-specific languages (DSL) in industry, DSL design and implementation goes far beyond an activity for a few experts only and becomes a challenging task for thousands of software engineers. DSL implementation indeed requires engineers to care for various concerns, from abstract syntax, static semantics, behavioral semantics, to extra-functional issues such as run-time performance. This paper presents an approach that uses one meta-language per language implementation concern. We show that the usage and combination of those meta-languages is simple and intuitive enough to deserve the term ""mashup"". We evaluate the approach by completely implementing the non trivial fUML modeling language, a semantically sound and executable subset of the Unified Modeling Language (UML)."
48,hal-04064771,2023,{Uncertainty-aware Simulation of Adaptive Systems},"Jézéquel, Jean-Marc and Vallecillo, Antonio","Adaptive systems manage and regulate the behavior of devices or other systems using control loops to automatically adjust the value of some measured variables to equal the value of a desired set-point. These systems normally interact with physical parts or operate in physical environments, where uncertainty is unavoidable. Traditional approaches to manage that uncertainty use either robust control algorithms that consider bounded variations of the uncertain variables and worst-case scenarios, or adaptive control methods that estimate the parameters and change the control laws accordingly. In this paper we propose to include the sources of uncertainty in the system models as first-class entities using random variables, in order to simulate adaptive and control systems more faithfully, including not only the use of random variables to represent and operate with uncertain values, but also to represent decisions based on their comparisons. Two exemplar systems are used to illustrate and validate our proposal."
49,hal-02192486,2019,{An Empirical Study on the Impact of Inconsistency Feedback during Model and Code Co-changing.},"Kanakis, Georgios and Khelladi, Djamel Eddine and Fischer, Stefan and Tröls, Michael and Egyed, Alexander","Model and code co-changing is about the coordinated modification of models and code during evolution. Intermittent inconsistencies are a common occurrence during co-changing. A partial co-change is the period in which the developer changed, say, the model but has not yet propagated the change to the code. Inconsistency feedback can be provided to developers for helping them to complete partial co-changes. However, there is no evidence whether such inconsistency feedback is useful to developers. To investigate this problem, we conducted a controlled experiment with 36 subjects who were required to complete ten partially completed change tasks between models and code of two non-trivial systems. The tasks were of different levels of complexity depending on how many model diagrams they affected. All subjects had to work on all change tasks but sometimes with and sometimes without inconsistency feedback. We then measured differences between task effort and correctness. We found that when subjects were given inconsistency feedback during tasks, they were 268% more likely to complete the co-change correctly compared to when they were not given inconsistency feedback. We also found that when subjects were not given inconsistency feedback, they nearly always failed in completing co-change tasks with high complexity where the partially completed changes were spread across different diagrams in the model. These findings suggest that inconsistency feedback (i.e. detection and repair) should form an integral part of co-changing, regardless of whether the code or the model changes first. Furthermore, these findings suggest that merely having access to changes (as with the given partially completed changes) is insufficient for effective co-changing."
50,hal-01319576,2016,{Correct-by-construction model driven engineering composition operators},"Kezadri, Mounira and Pantel, Marc and Thirioux, Xavier and Combemale, Benoit","Model composition is a crucial activity in Model Driven Engineering both to reuse validated and verified model elements and to handle separately the various aspects in a complex system and then weave them while preserving their properties. Many research activities target this compositional validation and verification (V & V) strategy: allow the independent assessment of components and minimize the residual V & V activities at assembly time. However, there is a continuous and increasing need for the definition of new composition operators that allow the reconciliation of existing models to build new systems according to various requirements. These ones are usually built from scratch and must be systematically verified to assess that they preserve the properties of the assembled elements. This verification is usually tedious but is mandatory to avoid verifying the composite system for each use of the operators. Our work addresses these issues, we first target the use of proof assistants for specifying and verifying compositional verification frameworks relying on formal verification techniques instead of testing and proofreading. Then, using a divide and conquer approach, we focus on the development of elementary composition operators that are easy to verify and can be used to further define complex composition operators. In our approach, proofs for the complex operators are then obtained by assembling the proofs of the basic operators. To illustrate our proposal, we use the Coq proof assistant to formalize the language-independent elementary composition operators Union and Substitution and the proof that the conformance of models with respect to metamodels is preserved during composition. We show that more sophisticated composition operators that share parts of the implementation and have several properties in common (especially: aspect oriented modeling composition approach, invasive software composition, and package merge) can then be built from the basic ones, and that the proof of conformance preservation can also be built from the proofs of basic operators."
51,hal-01342351,2016,{Traceability analyses between features and assets in software product lines},"Khandu Narwane, Ganesh and Galindo Duarte, José Angel and Narayanan Krishna, Shankara and Benavides, David and Millo, Jean-Vivien and Ramesh, S","In a Software Product Line (SPL), the central notion of implementability provides the requisite connection between specifications and their implementations, leading to the definition of products. While it appears to be a simple extension of the traceability relation between components and features, it involves several subtle issues that were overlooked in the existing literature. In this paper, we have introduced a precise and formal definition of implementability over a fairly expressive traceability relation. The consequent definition of products in the given SPL naturally entails a set of useful analysis problems that are either refinements of known problems or are completely novel. We also propose a new approach to solve these analysis problems by encoding them as Quantified Boolean Formulae (QBF) and solving them through Quantified Satisfiability (QSAT) solvers. QBF can represent more complex analysis operations, which cannot be represented by using propositional formulae. The methodology scales much better than the SAT-based solutions hinted in the literature and was demonstrated through a tool called SPLAnE (SPL Analysis Engine) on a large set of SPL models."
52,hal-01949050,2019,{A Unifying Framework for Homogeneous Model Composition},"Kienzle, Jörg and Mussbacher, Gunter and Combemale, Benoit and Deantoni, Julien","The growing use of models for separating concerns in complex systems has led to a proliferation of model composition operators. These composition operators have traditionally been defined from scratch following various approaches differing in formality, level of detail, chosen paradigm, and styles. Due to the lack of proper foundations for defining model composition (concepts, abstractions, or frameworks), it is difficult to compare or reuse composition operators. In this paper, we stipulate the existence of a unifying framework that reduces all structural composition operators to structural merging, and all composition operators acting on discrete behaviors to event scheduling. We provide convincing evidence of this hypothesis by discussing how structural and behavioral homogeneous model composition operators (i.e., weavers) can be mapped onto this framework. Based on this discussion, we propose a conceptual model of the framework, and identify a set of research challenges, which, if addressed, lead to the realization of this framework to support rigorous and efficient engineering of model composition operators for homogeneous and eventually heterogeneous modeling languages."
53,hal-02146543,2020,{Towards Model-Driven Sustainability Evaluation},"Kienzle, Jörg and Mussbacher, Gunter and Combemale, Benoit and Bastin, Lucy and Bencomo, Nelly and Bruel, Jean-Michel and Becker, Christoph and Betz, Stefanie and Chitchyan, Ruzanna and Cheng, Betty and Klingert, Sonja and Paige, Richard and Penzenstadler, Birgit and Seyff, Norbert and Syriani, Eugene and Venters, Colin C","Sustainability has emerged as a concern of central relevance. As a wicked problem, it poses challenges to business-as-usual in many areas, including that of modeling. This article addresses a question at the intersection of model-driven engineering and sustainability research: ""How can we better support sustainability by bringing together model-driven engineering, data, visualization and self-adaptive systems, to facilitate engagement, exploration, and understanding of the effects that individual and organizational choices have on sustainability?"" We explore this question via an idealized vision of an evaluation environment that facilitates integration and mapping of models from multiple diverse sources, visual exploration, and evaluation of what-if scenarios, for stakeholders with divergent perspectives. The article identifies research challenges to be addressed to enable decision making to support sustainability and provides a map of sustainability modeling issues across disciplines."
54,hal-03029432,2021,{Consistent Change Propagation within Models},"Kretschmer, Roland and Khelladi, Djamel Eddine and Lopez-Herrejon, Roberto E and Egyed, Alexander","Developers change models with clear intentions, e.g., for refactoring, defects removal, or evolution. However, in doing so, developers are often unaware of the consequences of their changes. Changes to one part of a model may affect other parts of the same model and/or even other models, possibly created and maintained by other developers. The consequences are incomplete changes and with it inconsistencies within or across models. Extensive works exist on detecting and repairing inconsistencies. However, literature tends to focus on inconsistencies as errors in need of repairs rather than on incomplete changes in need of further propagation. Many changes are non-trivial and require a series of coordinated model changes. As developers start changing the model, intermittent inconsistencies arise with other parts of the model that developers have not yet changed. These inconsistencies are cues for incomplete change propagation. Resolving these inconsistencies should be done in a manner that is consistent with the original changes. We speak of consistent change propagation. This paper leverages classical inconsistency repair mechanisms to explore the vast search space of change propagation. Our approach not only suggests changes to repair a given inconsistency but also changes to repair inconsistencies caused by the aforementioned repair. In doing so, our approach follows the developer's intent where subsequent changes may not contradict or backtrack earlier changes. We argue that consistent change propagation is essential for effective model-driven engineering. Our approach and its tool implementation were empirically assessed on 18 case studies from industry, academia, and GitHub to demonstrate its feasibility and scalability. A comparison with two versioned models shows that our approach identifies actual repair sequences that developers had chosen. Furthermore, an experiment involving 22 participants shows that our change propagation approach meets the workflow of how developers handle changes by always computing the sequence of repairs resulting from the change propagation."
55,hal-03127118,2021,{Transforming Abstract to Concrete Repairs with a Generative Approach of Repair Values},"Kretschmer, Roland and Khelladi, Djamel Eddine and Egyed, Alexander","Software models often comprise interconnected diagrams, change continuously, and developers often fail to keep these diagrams consistent. Detecting inconsistencies quickly and efficiently is state of the art. However, repairing them is not trivial, because there are typically multiple model elements that need to be repaired, leading to an exponentially growing space of combinations of repair choices. Despite extensive research on consistency checking, existing approaches provide abstract repairs only (i.e., identifying the model element but failing to describe the change), which is not satisfactory. This paper presents a novel approach that provides concrete repair choices based on values from the inconsistent models. Thus, our approach first retrieves repair values from the model, turns them into repair choices, and groups them based on their effects. This grouping lets our approach explore the repair space in its entirety, providing quick example-like feedback for all possible repairs. Our approach and its tool implementation have been empirically assessed on 10 case studies from industry, academia, and GitHub to demonstrate its feasibility and scalability. A comparison with three versioned models shows that our approach identifies useful repair values that developers have chosen."
56,hal-01549042,2018,{Automatic Production of End User Documentation for DSLs},"Le Moulec, Gwendal and Blouin, Arnaud and Gouranton, Valérie and Arnaldi, Bruno","Domain-specific languages (DSL) are developed for answering specific problems by leveraging the expertise of domain stakeholders. The development of DSLs requires a significant software engineering effort: editors, code generators, etc., must be developed to make a DSL usable. Documenting a DSL is also a major and time-consuming task required to promote it and address its learning curve. Recent research work in software language engineering focuses on easing the development of DSLs. This work focuses on easing the production of documentation of textual DSLs. The API documentation domain has identified challenges that we adapted to DSL documentation. Based on these challenges, we propose a model-driven approach that relies on DSL artifacts to extract information required to build documentation. Our implementation, called Docywood, targets two platforms: Markdown documentation for static web sites and Xtext code fragments for live documentation while modeling. We used Docywood on two DSLs, namely ThingML and Target Platform Definition. Feedback from end users and language designers exhibits qualitative benefits of the proposal with regard to the DSL documentation challenges. End user experiments conducted on ThingML and Target Platform Definition show benefits on the correctness of the created models when using Docywood on ThingML."
57,hal-02395867,2020,{Automatic generation of Truffle-based interpreters for Domain-Specific Languages},"Leduc, Manuel and Jouneaux, Gwendal and Degueule, Thomas and Le Guernic, Gurvan and Barais, Olivier and Combemale, Benoit","Numerous language workbenches have been proposed over the past decade to ease the definition of Domain-Specific Languages (DSLs). Language workbenches enable language designers to specify DSLs using high-level metalanguages and to generate their implementation (e.g., parsers, interpreters) and tool support (e.g., editors, debuggers) automatically. However, little attention has been given to the performance of the resulting interpreters. In many domains where performance is key (e.g., scientific and high-performance computing), this forces language designer to hand-craft ad-hoc optimizations in the interpreter implementations, or to lose compatibility with tool support. In this paper, we propose to systematically exploit the domain-specific information of language specifications to derive optimized Truffle-based language interpreters executed over the GraalVM. We implement our approach on top of the Eclipse Modeling Framework (EMF) by complementing its existing compilation chain with Truffle-specific information, which drives the GraalVM to benefit from an optimized just-in-time compilation. A key benefit of our approach is that it leverages existing language specifications and does not require additional information from language designers who remain oblivious of Truffle's low-level intricacies and JIT optimizations in general while staying compatible with tool support. We evaluate our approach using a representative set of four DSLs and eight conforming programs. Compared to the standard interpreters generated by EMF running on the GraalVM, we observe an average speed-up of x1.14, ranging from x1.07 to x1.26. Although the benefits vary slightly from one DSL or program to another, we conclude that our approach yields substantial performance gains while remaining non-intrusive of EMF abstractions."
58,hal-02399166,2020,{The Software Language Extension Problem},"Leduc, Manuel and Degueule, Thomas and van Wyk, Eric and Combemale, Benoit",
59,hal-02296030,2019,"{Applying Model-Driven Engineering to High-Performance Computing: Experience Report, Lessons Learned, and Remaining Challenges}","Lelandais, Benoît and Oudot, Marie-Pierre and Combemale, Benoit",
60,hal-02565549,2020,{Behavioral interfaces for executable DSLs},"Leroy, Dorian and Bousse, Erwan and Wimmer, Manuel and Mayerhofer, Tanja and Combemale, Benoit and Schwinger, Wieland","Executable domain-specific languages (DSLs) enable the execution of behavioral models. While an execution is mostly driven by the model content (e.g., control structures), many use cases require interacting with the running model, such as simulating scenarios in an automated or interactive way, or coupling the model with other models of the system or environment. The management of these interactions is usually hardcoded into the semantics of the DSL, which prevents its reuse for other DSLs and the provision of generic interaction-centric tools (e.g., event injector). In this paper, we propose a metalanguage for complementing the definition of executable DSLs with explicit behavioral interfaces to enable external tools to interact with executed models in a unified way. We implemented the proposed metalanguage in the GEMOC Studio and show how behavioral interfaces enable the realization of tools that are generic and thus usable for different executable DSLs."
61,hal-03799289,2022,{On the role of computer languages in scientific computing},"Leroy, Dorian and Sallou, June and Bourcier, Johann and Combemale, Benoit","Scientific software systems are complex. Their engineering involves various stakeholders using specific computer languages for defining artifacts at different abstraction levels and for different purposes. In this paper, we review the overall process leading to the development of scientific software, and discuss the role of computer languages in the definition of the different artifacts. We then provide guidelines to make informed decisions when the time comes to choose the computer languages to use when developing scientific software."
62,hal-03109992,2020,{Runtime Monitoring for Executable DSLs},"Leroy, Dorian and Jeanjean, Pierre and Bousse, Erwan and Wimmer, Manuel and Combemale, Benoit","Runtime monitoring is a fundamental technique used throughout the lifecycle of a system for many purposes, such as debugging, testing, or live analytics. While runtime monitoring for general purpose programming languages has seen a great amount of research, developing such complex facilities for any executable Domain Specific Language (DSL) remains a challenging, reoccurring and error prone task. A generic solution must both support a wide range of executable DSLs (xDSLs) and induce as little execution time overhead as possible. Our contribution is a fully generic approach based on a temporal property language with a semantics tailored for runtime verification. Properties can be compiled to efficient runtime monitors that can be attached to any kind of executable discrete event model within an integrated development environment. Efficiency is bolstered using a novel combination of structural model queries and complex event processing. Our evaluation on 3 xDSLs shows that the approach is applicable with an execution time overhead of 121% (on executions shorter than 1s), to 79% (on executions shorter than 20s) making it suitable for model testing and debugging."
63,hal-03318348,2021,{When Scientific Software Meets Software Engineering},"Leroy, Dorian and Sallou, June and Bourcier, Johann and Combemale, Benoit","The development of scientific software relies on the collaboration of various stakeholders for the scientific computing and software engineering activities. Computer languages have an impact on both activities and related concerns, as well as on the engineering principles required to ensure the development of reliable scientific software. The more general-purpose the language is-with low-level, computing-related, system abstractions-the more flexibility it will provide, but also the more rigorous engineering principles and Validation & Verification (V&V) activities it will require from the language user. In this paper, we investigate the different levels of abstraction, linked to the diverse artifacts of the scientific software development process, a software language can propose, and the V&V facilities associated to the corresponding level of abstraction the language can provide to the user. We aim to raise awareness among scientists, engineers and language providers on their shared responsibility in developing reliable scientific software."
64,hal-03476464,2023,{Input Sensitivity on the Performance of Configurable Systems: An Empirical Study},"Lesoil, Luc and Acher, Mathieu and Blouin, Arnaud and Jézéquel, Jean-Marc","Widely used software systems such as video encoders are by necessity highly configurable, with hundreds or even thousands of options to choose from. Their users often have a hard time finding suitable values for these options (i.e., finding a proper configuration of the software system) to meet their goals for the tasks at hand, e.g., compress a video down to a certain size. One dimension of the problem is of course that performance depends on the input data: e.g., a video as input to an encoder like x264 or a file fed to a tool like xz . To achieve good performance, users should therefore take into account both dimensions of (1) software variability and (2) input data. This paper details a large study over 8 configurable systems that quantifies the existing interactions between input data and configurations of software systems. The results exhibit that (1) inputs fed to software systems can interact with their configuration options in non-monotonous ways, significantly impacting their performance properties (2) input sensitivity can challenge our knowledge of software variability and question the relevance of performance predictive models for a field deployment. Given the results of our study, we call researchers to address the problem of input sensitivity when tuning, predicting, understanding, and benchmarking configurable systems."
65,hal-03220449,2022,{Impact of Data Cleansing for Urban Bus Commercial Speed Prediction},"Lyan, Gauthier and Gross-Amblard, David and Jézéquel, Jean-Marc and Malinowski, Simon","Public Transportation Information Systems (PTIS) are widely used for public bus services amongst cities in the world. These systems gather information about trips, bus stops, bus speeds, ridership, etc. This massive data is an inviting source of information for machine learning predictive tools. However, it most often suffers from quality deficiencies, due to multiple data sets with multiple structures, to different infrastructures using incompatible technologies, to human errors or hardware failures. In this paper, we consider the impact of data cleansing on a classical machine-learning task: predicting urban bus commercial speed. We show that simple, transport specific business and quality rules can drastically enhance data quality, whereas more sophisticated rules may offer little improvements despite a high computational cost."
66,hal-03921928,2022,{Reasoning over Time into Models with DataTime},"Lyan, Gauthier and Jézéquel, Jean-Marc and Gross-Amblard, David and Lefeuvre, Romain and Combemale, Benoit","Models at runtime have been initially investigated for adaptive systems. Models are used as a reflective layer of the current state of the system to support the implementation of a feedback loop. More recently, models at runtime have also been identified as key for supporting the development of full-fledged digital twins. However, this use of models at runtime raises new challenges, such as the ability to seamlessly interact with the past, present and future states of the system. In this paper, we propose a framework called DataTime to implement models at runtime which capture the state of the system according to the dimensions of both time and space, here modeled as a directed graph where both nodes and edges bear local states (i.e., values of properties of interest). DataTime offers a unifying interface to query the past, present and future (predicted) states of the system. This unifying interface provides i) an optimized structure of the time series that capture the past states of the system, possibly evolving over time, ii) the ability to get the last available value provided by the system's sensors, and iii) a continuous micro-learning over graph edges of a predictive model to make it possible to query future states, either locally or more globally, thanks to a composition law. The framework has been developed and evaluated in the context of the Intelligent Public Transportation Systems of the city of Rennes (France). This experimentation has demonstrated how DataTime can be used for managing data from the past, the present and the future, and how it facilitates the development of digital twins."
67,hal-03358817,2022,{Transfer Learning Across Variants and Versions: The Case of Linux Kernel Size},"Martin, Hugo and Acher, Mathieu and Pereira, Juliana Alves and Lesoil, Luc and Jézéquel, Jean-Marc and Khelladi, Djamel Eddine","With large scale and complex configurable systems, it is hard for users to choose the right combination of options (i.e., configurations) in order to obtain the wanted trade-off between functionality and performance goals such as speed or size. Machine learning can help in relating these goals to the configurable system options, and thus, predict the effect of options on the outcome, typically after a costly training step. However, many configurable systems evolve at such a rapid pace that it is impractical to retrain a new model from scratch for each new version. In this paper, we propose a new method to enable transfer learning of binary size predictions among versions of the same configurable system. Taking the extreme case of the Linux kernel with its ≈ 14, 500 configuration options, we first investigate how binary size predictions of kernel size degrade over successive versions. We show that the direct reuse of an accurate prediction model from 2017 quickly becomes inaccurate when Linux evolves, up to a 32% mean error by August 2020. We thus propose a new approach for transfer evolution-aware model shifting (TEAMS). It leverages the structure of a configurable system to transfer an initial predictive model towards its future versions with a minimal amount of extra processing for each version. We show that TEAMS vastly outperforms state of the art approaches over the 3 years history of Linux kernels, from 4.13 to 5.8."
68,hal-01372702,2016,{Leveraging Software Product Lines Engineering in the Development of External DSLs: A Systematic Literature Review},"Méndez-Acuña, David and Galindo Duarte, José Angel and Degueule, Thomas and Combemale, Benoit and Baudry, Benoit","The use of domain-specific languages (DSLs) has become a successful technique in the development of complex systems. Consequently, nowadays we can find a large variety of DSLs for diverse purposes. However, not all these DSLs are completely different; many of them share certain commonalities coming from similar modeling patterns –such as state machines or petri nets– used for several purposes. In this scenario, the challenge for language designers is to take advantage of the commonalities existing among similar DSLs by reusing, as much as possible, formerly defined language constructs. The objective is to leverage previous engineering efforts to minimize implementation from scratch. To this end, recent research in software language engineering proposes the use of product line engineering, thus introducing the notion of language product lines. Nowadays, there are several approaches that result useful in the construction of language product lines. In this article, we report on an effort for organizing the literature on language product line engineering. More precisely, we propose a definition for the life-cycle of language product lines, and we use it to analyze the capabilities of current approaches. In addition, we provide a mapping between each approach and the technological space it supports."
69,hal-01524632,2017,{Reverse Engineering Language Product Lines from Existing DSL Variants},"Méndez-Acuña, David A and Galindo, José A and Combemale, Benoit and Blouin, Arnaud and Baudry, Benoit","The use of domain-specific languages (DSLs) has become a successful technique to develop complex systems. In this context, an emerging phenomenon is the existence of DSL variants, which are different versions of a DSL adapted to specific purposes but that still share commonalities. In such a case, the challenge for language designers is to reuse, as much as possible, previously defined language constructs to narrow implementation from scratch. To overcome this challenge, recent research in software language engineering introduced the notion of language product lines. Similarly to software product lines, language product lines are often built from a set of existing DSL variants. In this article, we propose a reverse-engineering technique to ease such a development scenario. Our approach receives a set of DSL variants which are used to automatically recover a language modular design and to synthesize the corresponding variability models. The validation is performed in a project involving industrial partners that required three different variants of a DSL for finite state machines. This validation shows that our approach is able to correctly identify commonalities and variability."
70,hal-02876536,2020,{Opportunities in Intelligent Modeling Assistance},"Mussbacher, Gunter and Combemale, Benoit and Kienzle, Jörg and Abrahão, Silvia and Ali, Hyacinth and Bencomo, Nelly and Búr, Márton and Burgueño, Loli and Engels, Gregor and Jeanjean, Pierre and Jézéquel, Jean-Marc and Kühn, Thomas and Mosser, Sébastien and Sahraoui, Houari and Syriani, Eugene and Varró, Dániel and Weyssow, Martin","Modeling is requiring increasingly larger efforts while becoming indispensable given the complexity of the problems we are solving. Modelers face high cognitive load to understand a multitude of complex abstractions and their relationships. There is an urgent need to better support tool builders to ultimately provide modelers with intelligent modeling assistance that learns from previous modeling experiences, automatically derives modeling knowledge, and provides context-aware assistance. However, current Intelligent Modeling Assistants (IMAs) lack adaptability and flexibility for tool builders, and do not facilitate understanding the differences and commonalities of IMAs for modelers. Such a patchwork of limited IMAs is a lost opportunity to provide modelers with better support for the creative and rigorous aspects of software engineering. In this expert voice, we present a conceptual reference framework (RF-IMA) and its properties to identify the foundations for intelligent modeling assistance. For tool builders, RF-IMA aims to help build IMAs more systematically. For modelers, RF-IMA aims to facilitate comprehension, comparison, and integration of IMAs, and ultimately to provide more intelligent support. We envision a momentum in the modeling community that leads to the implementation of RF-IMA and consequently future IMAs. We identify open challenges."
71,hal-03137877,2020,{Open Access: all you wanted to know and never dared to ask},"Pierantonio, Alfonso and Brand, Mark van Den and Combemale, Benoit","This editorial presents the various forms of open access, discusses their pros and cons from the perspective of the Journal of Object Technology and its editors-in-chief, and illustrates how JOT implements a platinum open access model. The regular reader will also notice that this editorial features a new template for the journal that will be used from now on."
72,hal-02408017,2019,{The JOT Journal: Towards a Rising Generation},"Pierantonio, Alfonso and van den Brand, Mark and Combemale, Benoit",
73,hal-01179369,2015,{Using Slicing to Improve the Performance of Model Invariant Checking},"Sun, Wuliang and Combemale, Benoit and France, Robert B. and Blouin, Arnaud and Baudry, Benoit and Ray, Indrakshi",
74,hal-00942576,2014,{Model-Based Testing of Global Properties on Large-Scale Distributed Systems},"Sunyé, Gerson and Cunha de Almeida, Eduardo and Le Traon, Yves and Baudry, Benoit and Jézéquel, Jean-Marc","Large-scale distributed systems are becoming commonplace with the large popularity of peer-to-peer and cloud computing. The increasing importance of these systems contrasts with the lack of integrated solutions to build trustworthy software. A key concern of any large-scale distributed system is the validation of global properties, which cannot be evaluated on a single node. Thus, it is necessary to gather data from distributed nodes and to aggregate these data into a global view. This turns out to be very challenging because of the system's dynamism that imposes very frequent changes in local values that affect global properties. This implies that the global view has to be frequently updated to ensure an accurate validation of global properties. In this paper, we present a model-based approach to define a dynamic oracle for checking global properties. Our objective is to abstract relevant aspects of such systems into models. These models are updated at runtime, by monitoring the corresponding distributed system. We conduct a real-scale experimental validation to evaluate the ability of our approach to check global properties. In this validation, we apply our approach to test two open-source implementations of distributed hash tables. The experiments are deployed on two clusters of 32 nodes. The experiments reveal an important defect on one implementation and show clear performance differences between the two implementations. The defect would not be detected without a global view of the system. Testing global properties on distributed software consists of gathering data from different nodes and building a global view of the system, where properties are validated. This process requires a distributed test architecture and tools for representing and validating global properties. Model-based techniques are an expressive means for building oracles that validate global properties on distributed systems."
75,hal-01596116,2017,{Cross-domain identity and discovery framework for web calling services},"Tariq Javed, Ibrahim and Copeland, Rebecca and Crespi, Noel and Emmelmann, Marc and Corici, Andreea Ancuta and Bouabdallah, Ahmed and Zhang, Tuo and El Jaouhari, Saad and Beierle, Felix and Göndör, Sebastian and Küpper, Axel and Corre, Kevin and Crom, Jean-Michel and Oberle, Frank and Friese, Ingo and Caldeira, Ana and Dias, Gil and Santos, Nuno and Chaves, Ricardo and Lopes Pereira, Ricardo","Cross-domain identity management remains a major challenge for potential WebRTC adopters. In order to provide a global web-based communication system, it is critical to locate the destination called party, map the identity to the user device, and provide mutual authentication for both caller and called party. In this paper, we present a novel identity management and user discovery framework that enables callers to search and locate users across service domains. The identity management is decoupled from the used calling service, allowing users to manage their profiles and credentials independently of the applications. The framework is designed to preserve privacy and exploit web technology to gain trust and contact list management."
76,hal-03045797,2020,{Empirical Assessment of Generating Adversarial Configurations for Software Product Lines},"Temple, Paul and Perrouin, Gilles and Acher, Mathieu and Biggio, Battista and Jézéquel, Jean-Marc and Roli, Fabio","Software product line (SPL) engineering allows the derivation of products tailored to stakeholders' needs through the setting of a large number of configuration options. Unfortunately, options and their interactions create a huge configuration space which is either intractable or too costly to explore exhaustively. Instead of covering all products, machine learning (ML) approximates the set of acceptable products (e.g., successful builds, passing tests) out of a training set (a sample of configurations). However, ML techniques can make prediction errors yielding non-acceptable products wasting time, energy and other resources. We apply adversarial machine learning techniques to the world of SPLs and craft new configurations faking to be acceptable configurations but that are not and vice-versa. This allows us to diagnose prediction errors and take appropriate actions. We develop two adversarial configuration generators on top of state-of-the-art attack algorithms and capable of synthesizing configurations that are both adversarial and conform to logical constraints. We empirically assess our generators within two case studies: an industrial video synthesizer (MOTIV) and an industry-strength, open-source Web-app configurator (JHipster). For the two cases, our attacks yield (up to) a 100% misclassification rate without sacrificing the logical validity of adversarial configurations. This work lays the foundations of a quality assurance framework for ML-based SPLs."
77,hal-02177158,2019,{Empirical Assessment of Multimorphic Testing},"Temple, Paul and Acher, Mathieu and Jézéquel, Jean-Marc","The performance of software systems (such as speed, memory usage, correct identification rate) tends to be an evermore important concern, often nowadays on par with functional correctness for critical systems. Systematically testing these performance concerns is however extremely difficult, in particular because there exists no theory underpinning the evaluation of a performance test suite, i.e., to tell the software developer whether such a test suite is ""good enough"" or even whether a test suite is better than another one. This paper proposes to apply Multimorphic testing and empirically assess the effectiveness of performance test suites of software systems coming from various domains. By analogy with mutation testing, our core idea is to leverage the typical configurability of these systems, and to check whether it makes any difference in the outcome of the tests: i.e., are some tests able to ""kill"" underperforming system configurations? More precisely, we propose a framework for defining and evaluating the coverage of a test suite with respect to a quantitative property of interest. Such properties can be the execution time, the memory usage or the success rate in tasks performed by a software system. This framework can be used to assess whether a new test case is worth adding to a test suite or to select an optimal test suite with respect to a property of interest. We evaluate several aspects of our proposal through 3 empirical studies carried out in different fields: object tracking in videos, object recognition in images, and code generators."
78,hal-01659137,2017,{Learning-Contextual Variability Models},"Temple, Paul and Acher, Mathieu and Jézéquel, Jean-Marc and Barais, Olivier","Modeling how contextual factors relate to a software system’s configuration space is usually a manual, error-prone task that depends highly on expert knowledge. Machine-learning techniques can automatically predict the acceptable software configurations for a given context. Such an approach executes and observes a sample of software configurations within a sample of contexts. It then learns what factors of each context will likely discard or activate some of the software’s features. This lets developers and product managers automatically extract the rules that specialize highly configurable systems for specific contexts."
79,hal-02951745,2020,{A framework for managing the imperfect modularity of variability implementations},"Tërnava, Xhevahire and Collet, Philippe","In many industrial settings, the common and varying features of related software-intensive systems, as their reusable units, are likely to be implemented by a combined set of traditional techniques. Features do not align perfectly well with the used language constructs, e.g., classes, thus hindering the management of implemented variability. Herein, we provide a detailed framework to capture, model, and trace this imperfectly modular variability in terms of variation points with variants. We describe an implementation of this framework, as a domain-specific language, and report on its application on four subject systems and usage for variability management, showing its feasibility."
80,hal-03593967,2022,{Identification and visualization of variability implementations in object-oriented variability-rich systems: a symmetry-based approach},"Tërnava, Xhevahire and Mortara, Johann and Collet, Philippe and Le Berre, Daniel","Most modern object-oriented software systems are variability-rich, despite that they may not be developed as product lines. Their variability is implemented by several traditional techniques in combination, such as inheritance, overloading, or design patterns. As domain features or variation points with variants are not a by-product of these techniques, variability in code assets of such systems is implicit, and hardly documented, hampering qualities such as understandability and maintainability. In this article, we present an approach for automatic identification and visualization of variability implementation places, that is, variation points with variants, in variability-rich systems. To uniformly identify them, we propose to rely on the existing symmetries in the different software constructs and patterns. We then propose to visualize them according to their density. By means of our realized toolchain implementing the approach, symfinder, we report on a threefold evaluation, (i) on the identified potential variability in sixteen large open-source systems and symfinder's scalability, (ii) on measuring symfinder's precision and robustness when mapping identified variability to domain features, and (iii) on its usage by a software architect. Results show that symfinder can indeed help in identifying and comprehending the variability of the targeted systems."
81,hal-02184520,2017,{Identification of LRP-1 as an endocytosis and recycling receptor for B1-integrin in thyroid cancer cells},"Theret, Louis and Jeanne, Albin and Langlois, Benoît and Hachet, Cathy and David, Marion and Khrestchatisky, Michel and Devy, Jérome and Emonard, Hervé and Almagro, Sébastien and Dedieu, Stéphane","LRP-1 is a large endocytic receptor mediating the clearance of various molecules from the extracellular matrix. LRP-1 was reported to control focal adhesion turnover to optimize the adhesion-deadhesion balance to support invasion. To better understand how LRP-1 coordinates the cell-extracellular matrix interface, we explored its ability to regulate cell surface integrins in thyroid carcinomas. Using an antibody approach, we demonstrated that β1-integrin levels were increased at the plasma membrane under LRP-1 silencing or upon RAP treatment, used as an LRP-1 antagonist. Our data revealed that LRP-1 binds with both inactive and active β1-integrin conformations and identified the extracellular ligand-binding domains II or IV of LRP-1 as sufficient to bind β1-integrin. Using a recombinant β1-integrin, we demonstrated that LRP-1 acts as a regulator of β1-integrin intracellular traffic. Moreover, RAP or LRP-1 blocking antibodies decreased the number of β1-integrin-containing endosomes by up to 36%. LRP-1 blockade did not significantly affect the levels of β1-integrin-containing lysosomes while decreasing localization of β1-integrin within Rab-11 positive vesicles. Overall, we identified an original molecular process in which LRP-1 acts as a main regulator of β1-integrin internalization and recycling in thyroid cancer cells."
82,hal-01867423,2018,{A Comprehensive Study of Pseudo-tested Methods},"Vera-Pérez, Oscar Luis and Danglot, Benjamin and Monperrus, Martin and Baudry, Benoit","Pseudo-tested methods are defined as follows: they are covered by the test suite, yet no test case fails when the method body is removed, i.e., when all the effects of this method are suppressed. This intriguing concept was coined in 2016, by Niedermayr and colleagues, who showed that such methods are systematically present, even in well-tested projects with high statement coverage. This work presents a novel analysis of pseudo-tested methods. First, we run a replication of Niedermayr's study with 28K+ methods, enhancing its external validity thanks to the use of new tools and new study subjects. Second, we perform a systematic characterization of these methods, both quantitatively and qualitatively with an extensive manual analysis of 101 pseudo-tested methods. The first part of the study confirms Niedermayr's results: pseudo-tested methods exist in all our subjects. Our in-depth characterization of pseudo-tested methods leads to two key insights: pseudo-tested methods are significantly less tested than the other methods; yet, for most of them, the developers would not pay the testing price to fix this situation. This calls for future work on targeted test generation to specify those pseudo-tested methods without spending developer time."
83,hal-02282028,2020,{Modeling Languages in Industry 4.0: An Extended Systematic Mapping Study},"Wortmann, Andreas and Barais, Olivier and Combemale, Benoit and Wimmer, Manuel","Industry 4.0 integrates cyber-physical systems with the Internet of Things to optimize the complete value-added chain. Successfully applying Industry 4.0 requires the cooperation of various stakeholders from different domains. Domain-specific modeling languages promise to facilitate their involvement through leveraging (domain-specific) models to primary development artifacts. We aim to assess the use of modeling in Industry 4.0 through the lens of modeling languages in a broad sense. Based on an extensive literature review, we updated our systematic mapping study on modeling languages and modeling techniques used in Industry 4.0 (Wortmann et al., Conference on model-driven engineering languages and systems (MODELS’17), IEEE, pp 281–291, 2017) to include publications until February 2018. Overall, the updated study considers 3344 candidate publications that were systematically investigated until 408 relevant publications were identified. Based on these, we developed an updated map of the research landscape on modeling languages and techniques for Industry 4.0. Research on modeling languages in Industry 4.0 focuses on contributing methods to solve the challenges of digital representation and integration. To this end, languages from systems engineering and knowledge representation are applied most often but rarely combined. There also is a gap between the communities researching and applying modeling languages for Industry 4.0 that originates from different perspectives on modeling and related standards. From the vantage point of modeling, Industry 4.0 is the combination of systems engineering, with cyber-physical systems, and knowledge engineering. Research currently is splintered along topics and communities and accelerating progress demands for multi-disciplinary, integrated research efforts."
84,hal-01309004,2016,{B-Refactoring: Automatic Test Code Refactoring to Improve Dynamic Analysis},"Xuan, Jifeng and Cornu, Benoit and Martinez, Matias and Baudry, Benoit and Seinturier, Lionel and Monperrus, Martin","Context: Developers design test suites to verify that software meets its expected behaviors. Many dynamic analysis techniques are performed on the exploitation of execution traces from test cases. In practice, one test case may imply various behaviors. However, the execution of a test case only yields one trace, which can hide the others. Objective: In this article, we propose a new technique of test code refactoring, called B-Refactoring. The idea behind B-Refactoring is to split a test case into small test fragments, which cover a simpler part of the control flow to provide better support for dynamic analysis. Method: For a given dynamic analysis technique, B-Refactoring monitors the execution of test cases and constructs small test cases without loss of the testability. We apply B-Refactoring to assist two existing analysis tasks: automatic repair of if-condition bugs and automatic analysis of exception contracts. Results: Experimental results show that B-Refactoring can effectively improve the execution traces of the test suite. Real-world bugs that could not be previously fixed with the original test suites are fixed after applying B-Refactoring; meanwhile, exception contracts are better verified via applying B-Refactoring to original test suites. Conclusions: We conclude that applying B-Refactoring improves the execution traces of test cases for dynamic analysis. This improvement can enhance existing dynamic analysis tasks."
85,hal-01382964,2016,{Online Genetic Improvement on the java virtual machine with ECSELR},"Yeboah-Antwi, Kwaku and Baudry, Benoit","Online Genetic Improvement embeds the ability to evolve and adapt inside a target software system enabling it to improve at runtime without any external dependencies or human intervention. We recently developed a general purpose tool enabling Online Genetic Improvement in software systems running on the java virtual machine. This tool, dubbed ECSELR, is embedded inside extant software systems at runtime, enabling such systems to self-improve and adapt autonomously online. We present this tool, describing its architecture and focusing on its design choices and possible uses."
86,hal-03927522,2022,"{MoDMaCAO: a model-driven framework for the design, validation and configuration management of cloud applications based on OCCI}","Zalila, Faiez and Korte, Fabian and Erbel, Johannes and Challita, Stéphanie and Grabowski, Jens and Merle, Philippe","To tackle the cloud-provider lock-in, the Open Grid Forum is developing the Open Cloud Computing Interface (OCCI), a standardized interface for managing any kind of cloud resources. Besides the OCCI Core model, which defines the basic modeling elements for cloud resources, further standardised extensions exist that reflect the requirements of different cloud service levels, such as infrastructure and platform elements. However, so far the OCCI platform extension is very coarse-grained and lacks supporting usecases and implementations. Especially, it does not define how the components of the application itself can be managed. In this paper, we discuss the features of MoDMaCAO, a model-driven framework that extends the OCCI platform extension. The users of the framework are able to design and validate cloud application topologies and subsequently deploy them on OCCI compliant clouds by using configuration management tools."
87,hal-03921704,2022,{A Generic Framework for Representing and Analysing Model Concurrency},"Zschaler, Steffen and Bousse, Erwan and Deantoni, Julien and Combemale, Benoit","Recent results in language engineering simplify the development of tool-supported executable domain-specific modelling languages (xDSMLs), including editing (e.g., completion and error checking) and execution analysis tools (e.g., debugging, monitoring and live modelling). However, such frameworks are currently limited to sequential execution traces, and cannot handle execution traces resulting from an execution semantics with a concurrency model supporting parallelism or interleaving. This prevents the development of concurrency analysis tools, like debuggers supporting the exploration of model executions resulting from different interleavings. In this paper, we present a generic framework to integrate execution semantics with either implicit or explicit concurrency models, to explore the possible execution traces of conforming models, and to define strategies for helping in the exploration of the possible executions. This framework is complemented with a protocol to interact with the resulting executions and hence to build advanced concurrency analysis tools. The approach has been implemented within the GEMOC Studio. We demonstrate how to integrate two representative concurrent meta-programming approaches (MoCCML/Java and Henshin), which use different paradigms and underlying foundations to define an xDSML's concurrency model. We also demonstrate the ability to define an advanced concurrent omniscient debugger with the proposed protocol. The paper, thus, contributes key abstractions and an associated protocol for integrating concurrent meta-programming approaches in a language workbench, and dynamically exploring the possible executions of a model in the modelling workbench."
88,hal-00913139,2014,{Reverse Engineering Web Configurators},"Abbasi, Ebrahim Khalil and Acher, Mathieu and Heymans, Patrick and Cleve, Anthony","A Web configurator offers a highly interactive environment to assist users in customising sales products through the selection of configuration options. Our previous empirical study revealed that a significant number of configurators are suboptimal in reliability, efficiency, and maintainability, opening avenues for re-engineering support and methodologies. This paper presents a tool-supported reverse-engineering process to semi-automatically extract configuration-specific data from a legacy Web configurator. The extracted and structured data is stored in formal models (e.g., variability models) and can be used in a forward-engineering process to generate a customized interface with an underlying reliable reasoning engine. Two major components are presented: (1) a Web Wrapper that extracts structured configuration-specific data from unstructured or semi-structured Web pages of a configurator, and (2) a Web Crawler that explores the configuration space (i.e., all objects representing configuration-specific data) and simulates users' configuration actions. We describe variability data extraction patterns, used on top of the Wrapper and the Crawler to extract configuration data. Experimental results on five existing Web configurators show that the specification of a few variability patterns enable the identification of hundreds of options."
89,hal-03882594,2023,{A Call for Removing Variability},"Acher, Mathieu and Lesoil, Luc and Randrianaina, Georges Aaron and Tërnava, Xhevahire and Zendra, Olivier","Software variability is largely accepted and explored in software engineering and seems to have become a norm and a must, if only in the context of product lines. Yet, the removal of superfluous or unneeded software artefacts and functionalities is an inevitable trend. It is frequently investigated in relation to software bloat. This paper is essentially a call to the community on software variability to devise methods and tools that will facilitate the removal of unneeded variability from software systems. The advantages are expected to be numerous in terms of functional and non-functional properties, such as maintainability (lower complexity), security (smaller attack surface), reliability, and performance (smaller binaries)."
90,hal-00916746,2014,{A Survey on Teaching of Software Product Lines},"Acher, Mathieu and Lopez-Herrejon, Roberto Erick and Rabiser, Rick","With around two decades of existence, the community of Software Product Line (SPL) researchers and practitioners is thriving as can be attested by the extensive research output and the numerous successful industrial projects. Education has a key role to support the next generation of engineers to build highly complex SPLs. Yet, it is unclear how SPLs are taught, what are the possible missing gaps and difficulties faced, what are the benefits, or what is the material available. In this paper, we carry out a survey with over 30 respondents with the purpose of capturing a snapshot of the state of teaching in our community. We report and discuss quantitative as well as qualitative results of the survey. We build upon them and sketch six concrete actions to continue improving the state of practice of SPL teaching."
91,hal-01018937,2014,{Customization and 3D Printing: A Challenging Playground for Software Product Lines},"Acher, Mathieu and Baudry, Benoit and Barais, Olivier and Jézéquel, Jean-Marc","3D printing is gaining more and more momentum to build customized products in a wide variety of fields. We conduct an exploratory study of Thingiverse, the most popular Website for sharing user-created 3D design files, in order to establish a possible connection with software product line (SPL) engineering. We report on the socio-technical aspects and current practices for modeling variability, implementing variability, configuring and deriving products, and reusing artefacts. We provide hints that SPL-alike techniques are practically used in 3D printing and thus relevant. Finally, we discuss why the customization in the 3D printing field represents a challenging playground for SPL engineering."
92,hal-03720273,2022,{Feature Subset Selection for Learning Huge Configuration Spaces: The case of Linux Kernel Size},"Acher, Mathieu and Martin, Hugo and Pereira, Juliana Alves and Lesoil, Luc and Blouin, Arnaud and Jézéquel, Jean-Marc and Khelladi, Djamel Eddine and Barais, Olivier","Linux kernels are used in a wide variety of appliances, many of them having strong requirements on the kernel size due to constraints such as limited memory or instant boot. With more than nine thousand configuration options to choose from, developers and users of Linux actually spend significant effort to document, understand, and eventually tune (combinations of) options for meeting a kernel size. In this paper, we describe a large-scale endeavour automating this task and predicting a given Linux kernel binary size out of unmeasured configurations. We first show experimentally that state-of-the-art solutions specifically made for configurable systems such as performance-influence models cannot cope with that number of options, suggesting that software product line techniques may need to be adapted to such huge configuration spaces. We then show that tree-based feature selection can learn a model achieving low prediction errors over a reduced set of options. The resulting model, trained on 95 854 kernel configurations, is fast to compute, simple to interpret and even outperforms the accuracy of learning without feature selection."
93,hal-04160693,2023,{Generative AI for Reengineering Variants into Software Product Lines: An Experience Report},"Acher, Mathieu and Martinez, Jabier","The migration and reengineering of existing variants into a software product line (SPL) is an error-prone and time-consuming activity. Many extractive approaches have been proposed, spanning different activities from feature identification and naming to the synthesis of reusable artefacts. In this paper, we explore how large language model (LLM)-based assistants can support domain analysts and developers. We revisit four illustrative cases of the literature where the challenge is to migrate variants written in different formalisms (UML class diagrams, Java, GraphML, statecharts). We systematically report on our experience with ChatGPT-4, describing our strategy to prompt LLMs and documenting positive aspects but also failures. We compare the use of LLMs with the state-of-the-art approach BUT4Reuse. While LLMs offer potential in assisting domain analysts and developers in transitioning software variants into SPLs, their intrinsic stochastic nature and restricted ability to manage large variants or complex structures necessitate a semiautomatic approach, complete with careful review, to counteract inaccuracies."
94,hal-02342130,2019,{Learning the Linux Kernel Configuration Space: Results and Challenges},"Acher, Mathieu",
95,hal-01061576,2014,{Metamorphic Domain-Specific Languages: A Journey Into the Shapes of a Language},"Acher, Mathieu and Combemale, Benoit and Collet, Philippe","External or internal domain-specific languages (DSLs) or (fluent) APIs? Whoever you are -- a developer or a user of a DSL -- you usually have to choose a side; you should not! What about metamorphic DSLs that change their shape according to your needs? Our 4-year journey of providing the ""right"" support (in the domain of feature modeling) led us to develop an external DSL, different shapes of an internal API, and maintain all these languages. A key insight is that there is no one-size-fits-all solution or no clear superiority of a solution compared to another. On the contrary, we found that it does make sense to continue the maintenance of an external and internal DSL. Based on our experience and on an analysis of the DSL engineering field, the vision that we foresee for the future of software languages is their ability to be self-adaptable to the most appropriate shape (including the corresponding integrated development environment) according to a particular usage or task. We call such a language, able to change from one shape to another, a metamorphic DSL."
96,hal-04153310,2023,{On Programming Variability with Large Language Model-based Assistant},"Acher, Mathieu and Duarte, José Galindo and Jézéquel, Jean-Marc","Programming variability is central to the design and implementation of software systems that can adapt to a variety of contexts and requirements, providing increased flexibility and customization. Managing the complexity that arises from having multiple features, variations, and possible configurations is known to be highly challenging for software developers. In this paper, we explore how large language model (LLM)-based assistants can support the programming of variability. We report on new approaches made possible with LLM-based assistants, like: features and variations can be implemented as prompts; augmentation of variability out of LLM-based domain knowledge; seamless implementation of variability in different kinds of artefacts, programming languages, and frameworks, at different binding times (compile-time or run-time). We are sharing our data (prompts, sessions, generated code, etc.) to support the assessment of the effectiveness and robustness of LLMs for variability-related tasks."
97,hal-01234342,2015,{Product Lines Can Jeopardize Their Trade Secrets},"Acher, Mathieu and Bécan, Guillaume and Combemale, Benoit and Baudry, Benoit and Jézéquel, Jean-Marc","What do you give for free to your competitor when you exhibit a product line? This paper addresses this question through several cases in which the discovery of trade secrets of a product line is possible and can lead to severe consequences. That is, we show that an outsider can understand the variability realization and gain either confidential business information or even some economical direct advantage. For instance, an attacker can identify hidden constraints and bypass the product line to get access to features or copyrighted data. This paper warns against possible naive modeling, implementation, and testing of variability leading to the existence of product lines that jeopardize their trade secrets. Our vision is that defensive methods and techniques should be developed to protect specifically variability – or at least further complicate the task of reverse engineering it."
98,hal-03528889,2022,{Reproducible Science and Deep Software Variability},"Acher, Mathieu","Biology, medicine, physics, astrophysics, chemistry: all these scientific domains need to process large amounts of data with more and more complex software systems. For achieving reproducible science, there are several challenges ahead involving multidisciplinary collaboration and socio-technical innovation with software at the center of the problem. Despite the availability of data and code, several studies report that the same data analyzed with different software can lead to different results. I am seeing this problem as a manifestation of deep software variability: many factors (operating system, third-party libraries, versions, workloads, compile-time options and flags, etc.) themselves subject to variability can alter the results, up to the point it can dramatically change the conclusions of some scientific studies. In this keynote, I argue that deep software variability is a threat and also an opportunity for reproducible science. I first outline some works about (deep) software variability, reporting on preliminary evidence of complex interactions between variability layers. I then link the ongoing works on variability modelling and deep software variability in the quest for reproducible science."
99,hal-02268373,2019,{Seventh international workshop on reverse variability engineering (REVE 2019)},"Acher, Mathieu and Ziadi, Tewfik and Lopez-Herrejon, Roberto E and Martinez, Jabier",
100,hal-00980126,2014,"{Software Diversity: Challenges to handle the imposed, Opportunities to harness the chosen}","Acher, Mathieu and Barais, Olivier and Baudry, Benoit and Blouin, Arnaud and Bourcier, Johann and Combemale, Benoit and Jézéquel, Jean-Marc and Plouzeau, Noël","Diversity emerges as a critical concern that spans all activities in software engineering (from design to verification, from deployment to runtime resilience) and appears in all sorts of domains, which rely on software intensive systems, from systems of systems to pervasive combinations of Internet of Things and Internet of Services. If these domains are apparently radically different, we envision a strong convergence of the scientific principles underpinning their construction and validation towards flexible and open yet dependable systems. In this paper, we discuss the software engineering challenges raised by these requirements for flexibility and openness, focusing on four dimensions of diversity: the diversity of functionalities required by the different customers; the diversity of languages used by the stakeholders involved in the construction of these systems; the diversity of runtime environments in which software has to run and adapt; the diversity of failures against which the system must be able to react. In particular, we want to emphasize the challenges for handling imposed diversity, as well as the opportunities to leverage chosen diversity. The main challenge is that software diversity imposes to integrate the fact that software must adapt to changes in the requirements and environment -- in all development phases and in unpredictable ways. Yet, exploiting and increasing software diversity is a great opportunity to allow the spontaneous exploration of alternative software solutions and proactively prepare for unforeseen changes. Concretely, we want to provide software engineers with the ability: to characterize an 'envelope' of possible variations; to compose 'envelopes' (to discover new macro envelopes in an opportunistic manner); to dynamically synthesize software inside a given envelop."
101,hal-01024990,2014,{SPLTea 2014: First International Workshop on Software Product Line Teaching},"Acher, Mathieu and Lopez-Herrejon, Roberto Erick and Rabiser, Rick","Education has a key role to play for disseminating the constantly growing body of Software Product Line (SPL) knowledge. Teaching SPLs is challenging and it is unclear how SPLs can be taught, what are the possible benefits, or what is the material available. This workshop aims to explore and explain the current status and ongoing work on teaching SPLs at universities, colleges, and in industry (e.g., by consultants). Participants will discuss gaps and difficulties faced when teaching SPLs, benefits to research and industry, different ways to teach SPL knowledge, common threads, interests, and problems. The overall goal is to strengthen the important aspect of teaching in the SPL community."
102,hal-01243213,2015,{SPLTea 2015: Second International Workshop on Software Product Line Teaching},"Acher, Mathieu and Lopez-Herrejon, Roberto Erick and Rabiser, Rick","Education has a key role to play for disseminating the constantly growing body of Software Product Line (SPL) knowledge. Teaching SPLs is challenging; it is unclear, for example, how SPLs can be taught and what is the material available. This workshop aims to explore and explain the current status and ongoing work on teaching SPLs at universities, colleges, and in industry (e.g., by consultants). This second edition will continue the effort made at SPLTea'14. In particular we seek to design and populate an open repository of resources dedicated to SPL teaching."
103,hal-01829933,2018,{Teaching Software Product Lines: A Snapshot of Current Practices and Challenges (Journal-First Abstract)},"Acher, Mathieu and Lopez-Herrejon, Roberto E and Rabiser, Rick","This extended abstract summarizes our article entitled ""Teaching Software Product Lines: A Snapshot of Current Practices and Challenges"" published in the ACM Transactions on Computing Education, vol. 18 in 2017 (http://doi.acm.org/10.1145/3088440). The article reports on three initiatives we have conducted with scholars, educators, industry practitioners, and students to understand the connection between software product lines and education and to derive recommendations for educators to continue improving the state of practice of teaching SPLs."
104,hal-01659161,2018,{VaryLaTeX: Learning Paper Variants That Meet Constraints},"Acher, Mathieu and Temple, Paul and Jézéquel, Jean-Marc and Galindo Duarte, José Ángel and Martinez, Jabier and Ziadi, Tewfik","How to submit a research paper, a technical report, a grant proposal, or a curriculum vitae that respect imposed constraints such as formatting instructions and page limits? It is a challenging task, especially when coping with time pressure. In this work, we present VaryLaTeX, a solution based on variability, constraint programming, and machine learning techniques for documents written in LaTeX to meet constraints and deliver on time. Users simply have to annotate LaTeX source files with variability information, e.g., (de)activating portions of text, tuning figures' sizes, or tweaking line spacing. Then, a fully automated procedure learns constraints among Boolean and numerical values for avoiding non-acceptable paper variants, and finally, users can further configure their papers (e.g., aesthetic considerations) or pick a (random) paper variant that meets constraints, e.g., page limits. We describe our implementation and report the results of two experiences with VaryLaTeX."
105,hal-01020933,2014,{ViViD: A Variability-Based Tool for Synthesizing Video Sequences},"Acher, Mathieu and Alférez, Mauricio and Galindo Duarte, José Angel and Romenteau, Pierre and Baudry, Benoit","We present ViViD, a variability-based tool to synthesize variants of video sequences. ViViD is developed and used in the context of an industrial project involving consumers and providers of video processing algorithms. The goal is to synthesize synthetic video variants with a wide range of characteristics to then test the algorithms. We describe the key components of ViViD: (1) a variability language and an environment to model what can vary within a video sequence; (2) a reasoning back-end to generate relevant testing configurations; (3) a video synthesizer in charge of producing variants of video sequences corresponding to configurations. We show how ViViD can synthesize realistic videos with different characteristics such as luminances, vehicles and persons that cover a diversity of testing scenarios."
106,hal-01090269,2014,{Towards a Model@runtime Middleware for Cyber Physical Systems},"Acosta Padilla, Francisco Javier and Weis, Frédéric and Bourcier, Johann","Cyber Physical Systems (CPS) or Internet of Things systems are typically formed by a myriad of many small interconnected devices. This underlying hardware infrastructure raises new challenges in the way we administrate the software layer of these systems. Indeed, the limited computing power and battery life of each node combined with the very distributed nature of these systems, greatly adds complexity to distributed software layer management. In this paper we propose a new middleware dedicated to CPS to enable the management of software deployment and the dynamic reconfiguration of these systems. Our middleware is inspired from the Component Based Systems and the model@runtime paradigm which has been adapted to the context of Cyber Physical Systems. We have conducted an initial evaluation on a typical Cyber Physical Systems hardware infrastructure which demonstrates the feasibility of providing a model@runtime middleware for these systems."
107,hal-01091541,2014,{An Architecture-Based Approach for Compute-Intensive Pervasive Systems in Dynamic Environments},"Al Ali, Rima and Gerostathopoulos, Ilias and Gonzalez-Herrera, Inti and Juan-Verdejo, Adrian and Kit, Michal and Surajbali, Bholanathsingh","Distributed systems have continued to evolve and we note two important trends: the dramatically increasing level of dynamism in contemporary distributed systems and the convergence of mobile computing with cloud computing. The end result is that it is very difficult to achieve the required level of scalability and dependability in a systematic way when considering pervasive systems that are software- and compute-intensive and whose functionality is typically augmented by static cloud infrastructure resources. This work discusses relevant challenges and requirements for integrating cloud computing with pervasive systems operating in dynamic environments. We present a set of requirements using a holistic case study and describe a reference architecture to address these requirements."
108,hal-02924645,2020,{A Language Agnostic Approach to Modeling Requirements: Specification and Verification},"Alidra, Abdelghani and Beugnard, Antoine and Godfroy, Hubert and Kimmel, Pierre and Le Guernic, Gurvan","Modeling is a complex and error prone activity which can result in ambiguous models containing omissions and inconsistencies. Many works have addressed the problem of checking models' consistency. However, most of these works express consistency requirements for a specific modeling language. On the contrary, we argue that in some contexts those requirements should be expressed independently from the modeling language of the models to be checked. We identify a set of modeling requirements in the context of embedded systems design that are expressed independently from any modeling language concrete syntax. We propose a dedicated semantic domain to support them and give a formal characterization of those requirements that is modeling language agnostic."
109,hal-01367344,2016,"{Lightning Talk:""I solemnly pledge"" A Manifesto for Personal Responsibility in the Engineering of Academic Software}","Allen, Alice and Aragon, Cecilia and Becker, Christoph and Carver, Jeffrey C. and Chis, Andrei and Combemale, Benoit and Croucher, Mike and Crowston, Kevin and Garijo, Daniel and Gehani, Ashish and Goble, Carole and Haines, Robert and Hirschfeld, Robert and Howison, James and Huff, Kathryn and Jay, Caroline and Katz, Daniel and Kirchner, Claude and Kuksenok, Kateryna and Lämmel, Ralf and Nierstrasz, Oscar and Turk, Matthew and van Nieuwpoort, Rob V. and Vaughn, Matthew and Vinju, Jurgen","Software is fundamental to academic research work, both as part of the method and as the result of research. In June 2016 25 people gathered at Schloss Dagstuhl for a week-long Perspectives Workshop and began to develop a manifesto which places emphasis on the scholarly value of academic software and on personal responsibility. Twenty pledges cover the recognition of academic software, the academic software process and the intellectual content of academic software. This is still work in progress. Through this lightning talk, we aim to get feedback and hone these further, as well as to inspire the WSSSPE audience to think about actions they can take themselves rather than actions they want others to take. We aim to publish a more fully developed Dagstuhl Manifesto by December 2016."
110,hal-02356290,2020,{Sampling Effect on Performance Prediction of Configurable Systems: A Case Study},"Alves Pereira, Juliana and Acher, Mathieu and Martin, Hugo and Jézéquel, Jean-Marc",
111,hal-01990767,2019,{Towards Learning-Aided Configuration in 3D Printing: Feasibility Study and Application to Defect Prediction},"Amand, Benoit and Cordy, Maxime and Heymans, Patrick and Acher, Mathieu and Temple, Paul and Jézéquel, Jean-Marc","Configurators rely on logical constraints over parameters to aid users and determine the validity of a configuration. However, for some domains, capturing such configuration knowledge is hard, if not infeasible. This is the case in the 3D printing industry, where parametric 3D object models contain the list of parameters and their value domains, but no explicit constraints. This calls for a complementary approach that learns what configurations are valid based on previous experiences. In this paper, we report on preliminary experiments showing the capability of state-of-the-art classification algorithms to assist the configuration process. While machine learning holds its promises when it comes to evaluation scores, an in-depth analysis reveals the opportunity to combine the classifiers with constraint solvers."
112,hal-02407932,2019,{Live-UMLRT: A Tool for Live Modeling of UML-RT Models},"Bagherzadeh, Mojtaba and Jahed, Karim and Combemale, Benoit and Dingel, Juergen",
113,hal-01609576,2017,{Language Engineering with the GEMOC Studio},"Barais, Olivier and Combemale, Benoit and Wortmann, Andreas","This tutorial provides a practical approach for developing and integrating various Domain-Specific (modeling) Languages (DSLs) used in the development of modern complex software-intensive systems, with the main objective to support abstraction and separation of concerns. The tutorial leverages the tooling provided by the GEMOC studio to present the various facilities offered by the Eclipse platform (incl., EMF/Ecore, Xtext, Sirius) and introduces the advanced features to extend a DSL with a well-defined execution semantics, possibly including formal concurrency constraints and coordination patterns. From such a specification, we demonstrate the ability of the studio to automatically support model execution, graphical animation, omniscient debugging, concurrency analysis and concurrent execution of heterogeneous models. The tutorial is composed of both lectures and hands-on sessions. Hands-on sessions allow participants to experiment on a concrete use case of an architecture description language used to coordinate heterogeneous behavioral and structural components."
114,hal-01427277,2016,{Towards microservices architecture to transcode videos in the large at low costs},"Barais, Olivier and Bourcier, Johann and Bromberg, Yérom-David and Dion, Christophe","The increasing popularity of videos over Internet, combined with the wide heterogeneity of various kinds of end users' devices, imposes strong requirements on the underlying infrastructure and computing resources to meet the users' expectations. In particular, designing an adequate transcoding workflow in the cloud to stream videos at large scale is: (i) costly, and (ii) complex. By inheriting key concepts from the software engineering domain, such as separation of concerns and microservice architecture style, we report on our experience of building both a low cost and efficient transcoding platform over an ad hoc computing cloud built around a rack of Raspberry Pis."
115,hal-00916281,2014,{DIVERSIFY: Ecology-inspired software evolution for diversity emergence},"Baudry, Benoit and Monperrus, Martin and Mony, Cendrine and Chauvel, Franck and Fleurey, Franck and Clarke, Siobhàn","DIVERSIFY is an EU funded project, which aims at favoring spontaneous diversification in software systems in order to increase their adaptive capacities. This objective is founded on three observations: software has to constantly evolve to face unpredictable changes in its requirements, execution environment or to respond to failure (bugs, attacks, etc.); the emergence and maintenance of high levels of diversity are essential to provide adaptive capacities to many forms of complex systems, ranging from ecological and biological systems to social and economical systems; diversity levels tend to be very low in software systems. DIVERSIFY explores how the biological evolutionary mechanisms, which sustain high levels of biodiversity in ecosystems (speciation, phenotypic plasticity and natural selection) can be translated in software evolution principles. In this work, we consider evolution as a driver for diversity as a means to increase resilience in software systems. In particular, we are inspired by bipartite ecological relationships to investigate the automatic diversification of the server side of a client-server architecture. This type of software diversity aims at mitigating the risks of software monoculture. The consortium gathers researchers from the software-intensive, distributed systems and the ecology areas in order to transfer ecological concepts and processes as software design principles."
116,hal-00938855,2014,{Tailored Source Code Transformations to Synthesize Computationally Diverse Program Variants},"Baudry, Benoit and Allier, Simon and Monperrus, Martin","The predictability of program execution provides attackers with a rich source of knowledge, which they can exploit to spy on or remotely control the program. Moving target defense addresses this issue by constantly switching between many diverse variants of a program, which reduces the certainty that an attacker can have about the program execution. The effectiveness of this approach relies on the availability of a large number of software variants that exhibit different executions. However, current approaches rely on the natural diversity provided by off-the-shelf components, which is very limited. In this paper, we explore the automatic synthesis of large sets of program variants, called sosies. Sosies provide the same expected functionality as the original program, while exhibiting different executions. They are said to be computationally diverse. This work addresses two objectives: comparing different transformations for increasing the likelihood of sosie synthesis (densifying the search space for sosies); demonstrating computation diversity in synthesized sosies. We synthesized 30184 sosies in total, for 9 large, real-world, open source applications. For all these programs we identified one type of program analysis that systematically increases the density of sosies; we measured computation diversity for sosies of 3 programs and found diversity in method calls or data in more than 40% of sosies. This is a step towards controlled massive unpredictability of software."
117,hal-01058440,2014,{Automating the Formalization of Product Comparison Matrices},"Bécan, Guillaume and Sannier, Nicolas and Acher, Mathieu and Barais, Olivier and Blouin, Arnaud and Baudry, Benoit","Product Comparison Matrices (PCMs) form a rich source of data for comparing a set of related and competing products over numerous features. Despite their apparent simplicity, PCMs contain heterogeneous, ambiguous, uncontrolled and partial information that hinders their efficient exploitation. In this paper, we formalize PCMs through model-based automated techniques and develop additional tooling to support the edition and re-engineering of PCMs. 20 participants used our editor to evaluate the PCM metamodel and automated transformations. The results over 75 PCMs from Wikipedia show that (1) a significant proportion of the formalization of PCMs can be automated - 93.11% of the 30061 cells are correctly formalized; (2) the rest of the formalization can be realized by using the editor and mapping cells to existing concepts of the metamodel. The automated approach opens avenues for engaging a community in the mining, re-engineering, edition, and exploitation of PCMs that now abound on the Internet."
118,hal-01104797,2015,{On the Variability Secrets of an Online Video Generator},"Bécan, Guillaume and Acher, Mathieu and Jézéquel, Jean-Marc and Menguy, Thomas","We relate an original experience concerning a popular online video service that offers to generate variants of a humorous video. To further the understanding of the generator, we have reverse engineered its general behavior, architecture, as well as its variation points and its configuration space. The reverse engineering also allows us to create a new generator and online configurator that proposes 18 variation points – instead of only 3 as in the original generator. We explain why and how we have collaborated and are collaborating with the original creators of the video generator. We also highlight how our reverse engineering work represents a threat to the original service and call for further investigating variability-aware security mechanisms."
119,hal-01178454,2015,{Synthesis of Attributed Feature Models From Product Descriptions},"Bécan, Guillaume and Behjati, Razieh and Gotlieb, Arnaud and Acher, Mathieu","Many real-world product lines are only represented as non-hierarchical collections of distinct products, described by their configuration values. As the manual preparation of feature models is a tedious and labour-intensive activity, some techniques have been proposed to automatically generate boolean feature models from product descriptions. However , none of these techniques is capable of synthesizing feature attributes and relations among attributes, despite the huge relevance of attributes for documenting software product lines. In this paper, we introduce for the first time an algorithmic and parametrizable approach for computing a legal and appropriate hierarchy of features, including feature groups, typed feature attributes, domain values and relations among these attributes. We have performed an empirical evaluation by using both randomized configuration matrices and real-world examples. The initial results of our evaluation show that our approach can scale up to matrices containing 2,000 attributed features, and 200,000 distinct configurations in a couple of minutes."
120,hal-01022912,2014,{WebFML: Synthesizing Feature Models Everywhere},"Bécan, Guillaume and Ben Nasr, Sana and Acher, Mathieu and Baudry, Benoit","Feature Models (FMs) are the de-facto standard for documenting, model checking, and reasoning about the configurations of a software system. This paper introduces WebFML, a comprehensive environment for synthesizing FMs from various kinds of artefacts (e.g. propositional formula, dependency graph, FMs or product comparison matrices). A key feature of WebFML is an interactive support (through ranking lists, clusters, and logical heuristics) for choosing a sound and meaningful hierarchy. WebFML opens avenues for numerous practical applications (e.g., merging multiple product lines, slicing a configuration process, reverse engineering configurable systems)."
121,hal-03342678,2021,{From Monolithic to Microservice Architecture: The Case of Extensible and Domain-Specific IDEs},"Belafia, Romain and Jeanjean, Pierre and Barais, Olivier and Le Guernic, Gurvan and Combemale, Benoit","Integrated Development Environments (IDEs) are evolving towards cloud-native applications with the aim to relocate the language services provided by an IDE on distant servers. Existing research works focus on the overall migration process to handle more efficiently their specific requirements. However, the microservicization of legacy monolithic applications is still highly dependent on the specific properties of the application of interest. In this paper, we report our experiment on the microservicization process of the Cloud-Based graphical modeling workbench Sirius Web. We aim to identify the technical challenges related to applications with similar properties, and provide insights for practitioners to migrate their similar applications towards microservices. We discuss the main lessons learned and identify the underlying challenges to be further addressed by the community."
122,hal-01234338,2015,{MatrixMiner: A Red Pill to Architect Informal Product Descriptions in the Matrix},"Ben Nasr, Sana and Bécan, Guillaume and Acher, Mathieu and Ferreira Filho, Joao Bosco and Baudry, Benoit and Sannier, Nicolas and Davril, Jean-Marc","Domain analysts, product managers, or customers aim to capture the important features and differences among a set of related products. A case-by-case reviewing of each product description is a laborious and time-consuming task that fails to deliver a condensed view of a product line. This paper introduces MatrixMiner: a tool for automatically synthesizing product comparison matrices (PCMs) from a set of product descriptions written in natural language. MatrixMiner is capable of identifying and organizing features and values in a PCM – despite the informality and absence of structure in the textual descriptions of products. Our empirical results of products mined from BestBuy show that the synthesized PCMs exhibit numerous quantitative, comparable information. Users can exploit MatrixMiner to visualize the matrix through a Web editor and review, refine, or complement the cell values thanks to the traceability with the original product descriptions and technical specifications."
123,hal-01019537,2014,{Moving Toward Product Line Engineering in a Nuclear Industry Consortium},"Ben Nasr, Sana and Sannier, Nicolas and Acher, Mathieu and Baudry, Benoit","Nuclear power plants are some of the most sophisticated and complex energy systems ever designed. These systems perform safety critical functions and must conform to national safety institutions and international regulations. In many cases, regulatory documents provide very high level and ambiguous requirements that leave a large margin for interpretation. As the French nuclear industry is now seeking to spread its activities outside France, it is but necessary to master the ins and the outs of the variability between countries' safety cultures and regulations. This sets both an industrial and a scientific challenge to introduce and propose a product line engineering approach to an unaware industry whose safety culture is made of interpretations, specificities, and exceptions. This paper presents our current work within the French R&D project CONNEXION, while introducing variability modeling to the French nuclear industry. In particular, we discuss the background, the quest for the best variability paradigm, the practical modeling of requirements variability as well as the mapping between variable requirements and variable architecture elements."
124,hal-01580554,2017,{Raising Time Awareness in Model-Driven Engineering},"Benelallam, Amine and Hartmann, Thomas and Mouline, Ludovic and Fouquet, Francois and Bourcier, Johann and Barais, Olivier and Le Traon, Yves","The conviction that big data analytics is a key for the success of modern businesses is growing deeper, and the mobilisation of companies into adopting it becomes increasingly important. Big data integration projects enable companies to capture their relevant data, to efficiently store it, turn it into domain knowledge, and finally monetize it. In this context, historical data, also called temporal data, is becoming increasingly available and delivers means to analyse the history of applications, discover temporal patterns, and predict future trends. Despite the fact that most data that today's applications are dealing with is inherently temporal, current approaches, methodologies, and environments for developing these applications don't provide sufficient support for handling time. We envision that Model-Driven Engineering (MDE) would be an appropriate ecosystem for a seamless and orthogonal integration of time into domain modelling and processing. In this paper, we investigate the state-of-the-art in MDE techniques and tools in order to identify the missing bricks for raising time-awareness in MDE and outline research directions in this emerging domain."
125,hal-02080243,2019,{The Maven Dependency Graph: a Temporal Graph-based Representation of Maven Central},"Benelallam, Amine and Harrand, Nicolas and Soto-Valero, César and Baudry, Benoit and Barais, Olivier","The Maven Central Repository provides an extraordinary source of data to understand complex architecture and evolution phenomena among Java applications. As of September 6, 2018, this repository includes 2.8M artifacts (compiled piece of code implemented in a JVM-based language), each of which is characterized with metadata such as exact version, date of upload and list of dependencies towards other artifacts. Today, one who wants to analyze the complete ecosystem of Maven artifacts and their dependencies faces two key challenges: (i) this is a huge data set; and (ii) dependency relationships among artifacts are not modeled explicitly and cannot be queried. In this paper, we present the Maven Dependency Graph. This open source data set provides two contributions: a snapshot of the whole Maven Central taken on September 6, 2018, stored in a graph database in which we explicitly model all dependencies; an open source infrastructure to query this huge dataset."
126,hal-01085400,2014,"{""May the fork be with you"": novel metrics to analyze collaboration on GitHub}","Biazzini, Marco and Baudry, Benoit","Multi–repository software projects are becoming more and more popular, thanks to web–based facilities such as GitHub. Code and process metrics generally assume a single reposi-tory must be analyzed, in order to measure the characteris-tics of a codebase. Thus they are not apt to measure how much relevant information is hosted in multiple repositories contributing to the same codebase. Nor can they feature the characteristics of such a distributed development pro-cess. We present a set of novel metrics, based on an original classification of commits, conceived to capture some interest-ing aspects of a multi–repository development process. We also describe an efficient way to build a data structure that allows to compute these metrics on a set of Git reposito-ries. Interesting outcomes, obtained by applying our metrics on a large sample of projects hosted on GitHub, show the usefulness of our contribution."
127,hal-01063789,2014,{On Analyzing the Topology of Commit Histories in Decentralized Version Control Systems},"Biazzini, Marco and Monperrus, Martin and Baudry, Benoit","Empirical analysis of software repositories usually deals with linear histories derived from centralized versioning systems. Decentralized version control systems allow a much richer structure of commit histories, which presents features that are typical of complex graph models. In this paper we bring some evidence of how the very structure of these commit histories carries relevant information about the distributed development process. By means of a novel data structure that we formally define, we analyze the topological characteristics of commit graphs of a sample of git projects. Our findings point out the existence of common recurrent structural patterns which identically occur in different projects and can be considered building blocks of distributed collaborative development."
128,hal-03259896,2021,{Programmez vos IHM avec Interacto: une démonstration},"Blouin, Arnaud and Jézéquel, Jean-Marc",
129,hal-01056217,2014,{Slicing-based Techniques for Visualizing Large Metamodels},"Blouin, Arnaud and Moha, Naouel and Baudry, Benoit and Sahraoui, Houari","In model-driven engineering, a model describes an aspect of a system. A model conforms to a metamodel that defines the concepts and relationships of a given domain. Metamodels are thus cornerstones of various meta-modeling activities that require a good understanding of the metamodels or parts of them. Current metamodel editing tools are based on standard visualization and navigation features, such as physical zooms. However, as soon as metamodels become larger, navigating through large metamodels becomes a tedious task that hinders their understanding. In this work, we promote the use of model slicing techniques to build visualization techniques dedicated to metamodels. We propose an approach based on model slicing, inspired from program slicing, to build interactive visualization techniques dedicated to metamodels. These techniques permit users to focus on metamodel elements of interest, which aims at improving the understandability. This approach is implemented in a metamodel visualizer, called Explen."
130,hal-01958443,2018,{Tool-Support of Socio-Technical Coordination in the Context of Heterogeneous Modeling: A Research Statement and Associated Roadmap},"Bordeleau, Francis and Combemale, Benoit and Eramo, Romina and Brand, Mark van Den and Wimmer, Manuel","The growing complexity of everyday life systems (and devices) over the last decades has forced the industry to use and investigate different development techniques to manage the many different aspects of the systems. In this context, the use of model-driven engineering (MDE) has emerged and is now common practice for many engineering disciplines. However, this comes with important challenges. A main set of challenges relates to the fact that different modeling techniques, languages, and tools are required to deal with the different system aspects, and that support is required to ensure consistency and coherence between the different models. This paper identifies a number of challenges and paints a roadmap on how tooling can support a multi-model integrated way of working."
131,hal-02946949,2020,{Towards Model-Driven Digital Twin Engineering: Current Opportunities and Future Challenges},"Bordeleau, Francis and Combemale, Benoit and Eramo, Romina and van den Brand, Mark and Wimmer, Manuel","Digital Twins have emerged since the beginning of this millennium to better support the management of systems based on (real-time) data collected in different parts of the operating systems. Digital Twins have been successfully used in many application domains, and thus, are considered as an important aspect of Model-Based Systems Engineering (MBSE). However, their development, maintenance, and evolution still face major challenges, in particular: (i) the management of heterogeneous models from different disciplines, (ii) the bi-directional synchronization of digital twins and the actual systems, and (iii) the support for collaborative development throughout the complete life-cycle. In the last decades, the Model-Driven Engineering (MDE) community has investigated these challenges in the context of software systems. Now the question arises, which results may be applicable for digital twin engineering as well. In this paper, we identify various MDE techniques and technologies which may contribute to tackle the three mentioned digital twin challenges as well as outline a set of open MDE research challenges that need to be addressed in order to move towards a digital twin engineering discipline."
132,hal-01186019,2015,{CLARITY: Open-Sourcing the Model-Based Systems Engineering Solution Capella},"Boudjennah, Christophe and Combemale, Benoit and Exertier, Daniel and Lacrampe, Stéphane and Peraldi-Frati, Marie-Agnès","The central concern of the CLARITY project is to support open innovation for Model-Based Systems Engineering by structuring an ecosystem of major actors of the domain and providing an open-source environment, Capella, and its underlying methodology, Arcadia. In this experience report, we relate the Capella story from the initial development by Thales to the open-sourcing strategy. The rationale for adopting the open-source model is detailed with commercial, competitive and strategic arguments. The global roadmap of Capella is then presented. One year after the start of the CLARITY project, initial feedback is shared."
133,hal-01090641,2014,{Conversations with my Washing Machine: An in-the-wild Study of Demand Shifting with Self-generated Energy},"Bourgeois, Jacky and van Der Linden, Janet and Kortuem, Gerd and Price, Blaine A. and Rimmer, Christopher","Domestic microgeneration is the onsite generation of low- and zero-carbon heat and electricity by private households to meet their own needs. In this paper we explore how an everyday household routine – that of doing laundry – can be augmented by digital technologies to help households with photovoltaic solar energy generation to make better use of self-generated energy. This paper presents an 8-month in-the-wild study that involved 18 UK households in longitudinal energy data collection, prototype deployment and participatory data analysis. Through a series of technology interventions mixing energy feedback, proactive suggestions and direct control the study uncovered opportunities, potential rewards and barriers for families to shift energy consuming household activities and highlights how digital technology can act as mediator between household laundry routines and energy demand-shifting behaviors. Finally, the study provides insights into how a ""smart"" energy-aware washing machine shapes organization of domestic life and how people ""communicate"" with their washing machine."
134,hal-01090643,2014,{Using Participatory Data Analysis to Understand Social Constraints and Opportunities of Electricity Demand-Shifting},"Bourgeois, Jacky and van Der Linden, Janet and Kortuem, Gerd and Rimmer, Christopher","This paper reports results from an empirical study that explored the potential of interactive electricity Demand-Shifting – a particular form of behavior change where electricity consumption is shifted towards times of the day when production is at its highest – in the context of residential solar electricity generation. We conducted an in-the-wild user study with eighteen households over a period of six months, combining electricity data from smart meters and smart plugs with in-depth interviews to explore laundry routines and washing machine usage for each household. The study highlights that, although washing machines can be considered to be ""shiftable appliances"", the specific ways in which each household goes about shifting varies considerably. Furthermore, there is a clear need for ICTs to support people by analyzing current context and future plans to automate electricity demand-shifting and synchronization of both laundry and other appliances. Such support will help them maximize the self-consumption of micro-generated electricity while balancing this with environmental concerns, financial benefits and day-to-day issues relating to convenience."
135,hal-01121228,2015,{A Novelty Search Approach for Automatic Test Data Generation},"Boussaa, Mohamed and Barais, Olivier and Sunyé, Gerson and Baudry, Benoit","In search-based structural testing, metaheuristic search techniques have been frequently used to automate the test data generation. In Genetic Algorithms (GAs) for example, test data are rewarded on the basis of an objective function that represents generally the number of statements or branches covered. However, owing to the wide diversity of possible test data values, it is hard to find the set of test data that can satisfy a specific coverage criterion. In this paper, we introduce the use of Novelty Search (NS) algorithm to the test data generation problem based on statement-covered criteria. We believe that such approach to test data generation is attractive because it allows the exploration of the huge space of test data within the input domain. In this approach, we seek to explore the search space without regard to any objectives. In fact, instead of having a fitness-based selection, we select test cases based on a novelty score showing how different they are compared to all other solutions evaluated so far."
136,hal-01248177,2015,{A Novelty Search-based Test Data Generator for Object-oriented Programs},"Boussaa, Mohamed and Barais, Olivier and Sunyé, Gerson and Baudry, Benoit","In search-based structural testing, meta-heuristic search techniques have been frequently used to automate test data generation. In this paper, we introduce the use of novelty search algorithm to the test data generation problem based on statement-covered criterion. In this approach, we seek to explore the search space by considering diversity as the unique objective function to be optimized. In fact, instead of having a fitness-based selection, we select test cases based on a novelty score showing how different they are compared to all other solutions evaluated so far."
137,hal-01356849,2016,{Automatic Non-functional Testing of Code Generators Families},"Boussaa, Mohamed and Barais, Olivier and Baudry, Benoit and Sunyé, Gerson","The intensive use of generative programming techniques provides an elegant engineering solution to deal with the heterogeneity of platforms and technological stacks. The use of domain-specific languages for example, leads to the creation of numerous code generators that automatically translate high-level system specifications into multi-target executable code. Producing correct and efficient code generators is complex and error-prone. Although software designers provide generally high-level test suites to verify the functional outcome of generated code, it remains challenging and tedious to verify the behavior of produced code in terms of non-functional properties. This paper describes a practical approach based on a runtime monitoring infrastructure to automatically detect potentially inefficient code generators. This infrastructure, based on system containers as execution platforms, allows code-generator developers to evaluate the generated code performance. We evaluate our approach by analyzing the performance of Haxe, a popular high-level programming language that involves a set of cross-platform code generators. Experimental results show that our approach is able to detect some performance inconsistencies that reveal real issues in Haxe code generators."
138,hal-01344835,2016,{NOTICE: A Framework for Non-functional Testing of Compilers},"Boussaa, Mohamed and Barais, Olivier and Baudry, Benoit and Sunyé, Gerson","Generally, compiler users apply different optimizations to generate efficient code with respect to non-functional properties such as energy consumption, execution time, etc. However, due to the huge number of optimizations provided by modern compilers, finding the best optimization sequence for a specific objective and a given program is more and more challenging. This paper proposes NOTICE, a component-based framework for non-functional testing of compilers through the monitoring of generated code in a controlled sand-boxing environment. We evaluate the effectiveness of our approach by verifying the optimizations performed by the GCC compiler. Our experimental results show that our approach is able to auto-tune compilers according to user requirements and construct optimizations that yield better performance than standard optimization levels. We also demonstrate that NOTICE can be used to automatically construct optimization levels that represent optimal trade-offs between multiple non-functional properties such as execution time and resource usage requirements."
139,hal-01154225,2015,{A Generative Approach to Define Rich Domain-Specific Trace Metamodels},"Bousse, Erwan and Mayerhofer, Tanja and Combemale, Benoit and Baudry, Benoit","Executable Domain-Specific Modeling Languages (xDSMLs) open many possibilities for performing early verification and validation (V&V) of systems. Dynamic V&V approaches rely on execution traces, which represent the evolution of models during their execution. In order to construct traces, generic trace metamodels can be used. Yet, regarding trace manipulations, they lack both efficiency because of their sequential structure, and usability because of their gap to the xDSML. Our contribution is a generative approach that defines a rich and domain-specific trace metamodel enabling the construction of execution traces for models conforming to a given xDSML. Efficiency is increased by providing a variety of navigation paths within traces, while usability is improved by narrowing the concepts of the trace metamodel to fit the considered xDSML. We evaluated our approach by generating a trace metamodel for fUML and using it for semantic differencing, which is an important V&V activity in the realm of model evolution. Results show a significant performance improvement and simplification of the semantic differencing rules as compared to the usage of a generic trace metamodel."
140,hal-01355391,2016,{Execution Framework of the GEMOC Studio (Tool Demo)},"Bousse, Erwan and Degueule, Thomas and Vojtisek, Didier and Mayerhofer, Tanja and Deantoni, Julien and Combemale, Benoit","The development and evolution of an advanced modeling environment for a Domain-Specific Modeling Language (DSML) is a tedious task, which becomes recurrent with the increasing number of DSMLs involved in the development and management of complex software-intensive systems. Recent efforts in language workbenches result in advanced frameworks that automatically provide syntactic tooling such as advanced editors. However, defining the execution semantics of languages and their tooling remains mostly hand crafted. Similarly to editors that share code completion or syntax highlighting, the development of advanced debuggers, animators, and other execution analysis tools shares common facilities, which should be reused among various DSMLs. In this tool demonstration paper, we present the execution framework offered by the GEMOC studio, an Eclipse-based language and modeling workbench. The framework provides a generic interface to plug in different execution engines associated to their specific metalanguages used to define the discrete-event operational semantics of DSMLs. It also integrates generic runtime services that are shared among the approaches used to implement the execution semantics, such as graphical animation or omniscient debugging."
141,hal-01023681,2014,{Scalable Armies of Model Clones through Data Sharing},"Bousse, Erwan and Combemale, Benoit and Baudry, Benoit","Cloning a model is usually done by duplicating all its runtime objects into a new model. This approach leads to memory consumption problems for operations that create and manipulate large quantities of clones (e.g., design space exploration). We propose an original approach that exploits the fact that operations rarely modify a whole model. Given a set of immutable properties, our cloning approach determines the objects and fields that can be shared between the runtime representations of a model and its clones. Our generic cloning algorithm is parameterized with three strategies that establish a trade-off between memory savings and the ease of clone manipulation. We implemented the strategies within the Eclipse Modeling Framework (EMF) and evaluated memory footprints and computation overheads with 100 randomly generated metamodels and models. Results show a positive correlation between the proportion of shareable properties and memory savings, while the worst median overhead is 9.5% when manipulating the clones."
142,hal-01182517,2015,{Supporting Efficient and Advanced Omniscient Debugging for xDSMLs},"Bousse, Erwan and Corley, Jonathan and Combemale, Benoit and Gray, Jeff and Baudry, Benoit","Omniscient debugging is a promising technique that relies on execution traces to enable free traversal of the states reached by a system during an execution. While some General-Purpose Languages (GPLs) already have support for omniscient debugging, developing such a complex tool for any executable Domain-Specific Modeling Language (xDSML) remains a challenging and error prone task. A solution to this problem is to define a generic omniscient debugger for all xDSMLs. However, generically supporting any xDSML both compromises the efficiency and the usability of such an approach. Our contribution relies on a partly generic omniscient debugger supported by generated domain-specific trace management facilities. Being domain-specific, these facilities are tuned to the considered xDSML for better efficiency. Usability is strengthened by providing multidimensional omniscient debugging. Results show that our approach is on average 3.0 times more efficient in memory and 5.03 times more efficient in time when compared to a generic solution that copies the model at each step."
143,hal-01061740,2014,{Towards Scalable Multidimensional Execution Traces for xDSMLs},"Bousse, Erwan and Combemale, Benoit and Baudry, Benoit","Executable Domain Specific Modeling Languages (xDSML) open many possibilities in terms of early verification and validation (V&V) of systems, including the use of dynamic V&V approaches. Such approaches rely on the notion of execution trace, i.e. the evolution of a system during a run. To benefit from dynamic V&V approaches, it is therefore necessary to characterize what is the structure of the execution traces of a given xDSML. Our goal is to provide an approach to design trace metamodels for xDSMLs. We identify seven problems that must be considered when modeling execution traces, including concurrency, modularity, and scalability. Then we present our envisioned approach to design scalable multidimensional trace metamodels for xDSMLs. Our work in progress relies on the dimensions of a trace (i.e. subsets of mutable elements of the traced model) to provide an original structure that faces the identified problems, along with a trace API to manipulate them."
144,hal-03805561,2022,{Semi-automatic ladderisation: improving code security through rewriting and dependent types},"Brown, Christopher and Barwell, Adam and Marquer, Yoann and Zendra, Olivier and Richmond, Tania and Gu, Chen","Cyber attacks become more and more prevalent every day. One type of cyber attack is known as a side channel attack, where attackers exploit information leakage from the physical execution of a program, e.g. timing or power leakage, to uncover secret information, such as encryption keys or other sensitive data. There have been various attempts at addressing the problem of preventing side-channel attacks, often relying on various measures to decrease the discernibility of several code variants or code paths. Most techniques require a high-degree of expertise by the developer, who often employs ad hoc, hand-crafted code-patching in an attempt to make it more secure. In this paper, we take a different approach: building on the idea of ladderisation, inspired by Montgomery Ladders. We present a semi-automatic tool-supported technique, aimed at the non-specialised developer, which refactors (a class of) C programs into functionally (and even algorithmically) equivalent counterparts with improved security properties. Our approach provides refactorings that transform the source code into its ladderised equivalent, driven by an underlying verified rewrite system, based on dependent types. Our rewrite system automatically finds rewritings of selected C expressions, facilitating the production of their equivalent ladderised counterparts for a subset of C. We demonstrate our approach on a number of representative examples from the cryptographic domain, showing increased security."
145,hal-01141393,2015,{MDE in Practice for Computational Science},"Bruel, Jean-Michel and Combemale, Benoit and Ober, Ileana and Raynal, Hélène","The complex problems that computational science addresses are more and more benefiting from the progress of computing facilities (simulators, libraries, accessible languages, etc.). Nevertheless, the current solutions call for several improvements. Among those, we address in this paper the need to leverage knowledge and expertise by focusing on the application of Domain-Specific Modeling Languages. In this vision paper we illustrate, through concrete experiments, how the latest DSML research helps bring the problem and implementation spaces closer together."
146,hal-01910113,2018,{Model Transformation Reuse across Metamodels - A classification and comparison of approaches},"Bruel, Jean-Michel and Combemale, Benoit and Guerra, Esther and Jézéquel, Jean-Marc and Kienzle, Jörg and de Lara, Juan and Mussbacher, Gunter and Syriani, Eugene and Vangheluwe, Hans","Model transformations (MTs) are essential elements of model-driven engineering (MDE) solutions. MDE promotes the creation of domain-specific metamodels, but without proper reuse mechanisms, MTs need to be developed from scratch for each new metamodel. In this paper, we classify reuse approaches for MTs across different metamodels and compare a sample of specific approaches (model types, concepts, a-posteriori typing, multilevel modeling, and design patterns for MTs) with the help of a feature model developed for this purpose, as well as a common example. We discuss strengths and weaknesses of each approach, provide a reading grid used to compare their features, and identify gaps in current reuse approaches."
147,hal-03729080,2022,{A Context-Driven Modelling Framework for Dynamic Authentication Decisions},"Bumiller, Anne and Barais, Olivier and Challita, Stéphanie and Combemale, Benoit and Aillery, Nicolas and Le Lan, Gael","Nowadays, many mechanisms exist to perform authentication, such as text passwords and biometrics. However, reasoning about their relevance (e.g., the appropriateness for security and usability) regarding the contextual situation is challenging for authentication system designers. In this paper, we present a Context-driven Modelling Framework for dynamic Authentication decisions (COFRA), where the context information specifies the relevance of authentication mechanisms. COFRA is based on a precise metamodel that reveals framework abstractions and a set of constraints that specify their meaning. Therefore, it provides a language to determine the relevant authentication mechanisms (characterized by properties that ensure their appropriateness) in a given context. The framework supports the adaptive authentication system designers in the complex trade-off analysis between context information, risks and authentication mechanisms, according to usability, deployability, security, and privacy. We validate the proposed framework through case studies and extensive exchanges with authentication and modelling experts. We show that model instances describing real-world use cases and authentication approaches proposed in the literature can be instantiated validly according to our metamodel. This validation highlights the necessity, sufficiency, and soundness of our framework."
148,hal-03789500,2022,{Towards a Better Understanding of Impersonation Risks},"Bumiller, Anne and Barais, Olivier and Aillery, Nicolas and Le Lan, Gael","In many situations, it is of interest for authentication systems to adapt to context (e.g., when the user's behavior differs from the previous behavior). Hence, during authentication events, it is common to use contextually available features to calculate an impersonation risk score. This paper proposes an explainability model that can be used for authentication decisions and, in particular, to explain the impersonation risks that arise during suspicious authentication events (e.g., at unusual times or locations). The model applies Shapley values to understand the context behind the risks. Through a case study on 30,000 real world authentication events, we show that risky and nonrisky authentication events can be grouped according to similar contextual features, which can explain the risk of impersonation differently and specifically for each authentication event. Hence, explainability models can effectively improve our understanding of impersonation risks. The risky authentication events can be classified according to attack types. The contextual explanations of the impersonation risk can help authentication policymakers and regulators who attempt to provide the right authentication mechanisms, to understand the suspiciousness of an authentication event and the attack type, and hence to choose the suitable authentication mechanism."
149,hal-02114296,2019,{Comparison Matrices of Semantic RESTful APIs Technologies},"Cheron, Antoine and Bourcier, Johann and Barais, Olivier and Michel, Antoine","Semantic RESTful APIs combine the power of the REST architectural style, the Semantic Web and Linked Data. They picture a world in which Web APIs are easier to browse and more meaningful for humans while also being machine-interpretable, turning them into platforms that developers and companies can build on. We counted 36 technologies that target building such APIs. As there is no one-size-fits-all technology, they have to be combined. This makes selecting the appropriate set of technologies for a specific context a difficult task for architects and developers. So, how can the selection of such a set of technologies be eased? In this paper we propose three comparison matrices of Semantic RESTful APIs enabling technologies. They are based on the analysis of the differences and commonalities between existing technologies. They intend to help developers and architects make an informed decision on the technologies to use. They also highlight the limitations of state-of-the-art technologies from which open challenges are derived."
150,hal-03933973,2022,{A community-sourced view on engineering digital twins: A Report from the EDT.Community},"Cleophas, Loek and Godfrey, Thomas and Khelladi, Djamel Eddine and Lehner, Daniel and Combemale, Benoit and van den Brand, Mark and Vierhauser, Michael and Wimmer, Manuel and Zschaler, Steffen","Digital Twins are an important concept, enabling what-if scenario exploration, predictive maintenance, and other approaches. They help in saving time and physical resources when developing and evolving systems, whether natural or engineered. However, constructing and maintaining digital twins is a challenging engineering task -- and, to date, there is a lack of understanding of the engineering techniques and methodologies required. To address these challenges, we created EDT.Community, a programme of seminars on the engineering of digital twins hosting digital twins experts from academia and industry. In this paper, we report on the main topics of discussion from the first year of the programme. We contribute by providing (1) a common understanding of open challenges in research and practice of the engineering of digital twins, and (2) an entry point to researchers who aim to close gaps in the current state of the art."
151,hal-01152342,2015,{A Solution to the TTC'15 Model Execution Case Using the GEMOC Studio},"Combemale, Benoit and Deantoni, Julien and Barais, Olivier and Blouin, Arnaud and Bousse, Erwan and Brun, Cédric and Degueule, Thomas and Vojtisek, Didier","We present in this paper a complete solution to the Model Execution case of the Transformation Tool Contest 2015 using the GEMOC Studio. The solution proposes an implementation of the most complete version (variant 3) of the UML Activity Diagram language. The implementation uses different technologies integrated into the GEMOC Studio for implementing the various concerns of the language: Kermeta is used to modularly implement the operational semantics and to weave it into the provided metamodel, Melange is used to build the overall language runtime seamlessly integrated to EMF, Sirius Animator is used to develop a graphical animator, the GEMOC execution engine is used to execute the conforming models according to the operational semantics and to build a rich and efficient execution trace that can be manipulated through a powerful timeline, which provides common facilities such as trace visualization and step-by-step execution (incl. breakpoints, step forward, and step backward). Finally, MoCCML is used to provide an alternative implementation with an explicit and formal concurrency model for activity diagrams supported by a solver and analysis tools. We evaluate our implementation with regard to the evaluation criteria provided in the case description and provide evidence of the correctness, understandability, conciseness and performance of our solution."
152,hal-01258358,2016,{A Tool-Supported Approach for Concurrent Execution of Heterogeneous Models},"Combemale, Benoit and Brun, Cédric and Champeau, Joël and Crégut, Xavier and Deantoni, Julien and Le Noir, Jérome","In the software and systems modeling community, research on domain-specific modeling languages (DSMLs) is focused on providing technologies for developing languages and tools that allow domain experts to develop system solutions efficiently. Unfortunately, the current lack of support for explicitly relating concepts expressed in different DSMLs makes it very difficult for software and system engineers to reason about information spread across models describing different system aspects [4]. As a particular challenge, we investigate in this paper relationships between, possibly heterogeneous, behavioral models to support their concurrent execution. This is achieved by following a modular executable metamodeling approach for behavioral semantics understanding, reuse, variability and composability [5]. This approach supports an explicit model of concurrency (MoCC) [6] and domain-specific actions (DSA) [10] with a well-defined protocol between them (incl., mapping, feedback and callback) reified through explicit domain-specific events (DSE) [12]. The protocol is then used to infer a relevant behavioral language interface for specifying coordination patterns to be applied on conforming executable models [17]. All the tooling of the approach is gathered in the GEMOC studio, and outlined in the next section. Currently, the approach is being experimented on a systems engineering language provided by Thales, named Capella (cf. https://www.polarsys.org/capella/). The goal and current state of the case study are exposed in this paper."
153,hal-02285737,2019,{Bringing Intelligence to Sociotechnical IoT Systems: Modeling Opportunities and Challenges},"Combemale, Benoit",
154,hal-01517387,2016,{Modeling for Sustainability},"Combemale, Benoit and Cheng, Betty H.C. and Moreira, Ana and Bruel, Jean-Michel and Gray, Jeff","Various disciplines use models for different purposes. An engineering model, including a software engineering model, is often developed to guide the construction of a non-existent system. A scientific model is created to better understand a natural phenomenon (i.e., an already existing system). An engineering model may incorporate scientific models to build a system. Sustainability is an area that requires both types of models. Both engineering and scientific models have been used to support sustainability, but largely independently of one another. Due to the inherent complex nature of sustainability that must delicately balance trade-offs between social, environmental, and economic concerns, modeling challenges abound for both the scientific and engineering disciplines. This paper offers a vision that promotes a unique approach that combines engineering and scientific models to enable broader engagement of society for addressing sustainability concerns. We introduce a research roadmap to support this vision that emphasizes the socio-technical benefits of modeling."
155,hal-02964806,2020,{Modular and Distributed IDE},"Coulon, Fabien and Auvolat, Alex and Combemale, Benoit and Bromberg, Yérom-David and Taïani, François and Barais, Olivier and Plouzeau, Noël","Integrated Development Environments (IDEs) are indispensable companions to programming languages. They are increasingly turning towards Web-based infrastructure. The rise of a protocol such as the Language Server Protocol (LSP) that standardizes the separation between a language-agnostic IDE, and a language server that provides all language services (e.g., auto completion, compiler...) has allowed the emergence of high quality generic Web components to build the IDE part that runs in the browser. However, all language services require different computing capacities and response times to guarantee a user-friendly experience within the IDE. The monolithic distribution of all language services prevents leveraging the available execution platforms (e.g., local platform, application server, cloud). In contrast with the current approaches that provide IDEs in the form of a monolithic client-server architecture, we explore in this paper the modularization of all language services to support their individual deployment and dynamic adaptation within an IDE. We evaluate the performance impact of the distribution of the language services across the available execution platforms on four EMF-based languages, and demonstrate the benefit of a custom distribution."
156,hal-01889155,2018,{Shape-Diverse DSLs: Languages without Borders (Vision Paper)},"Coulon, Fabien and Degueule, Thomas and van Der Storm, Tijs and Combemale, Benoit","Domain-Specific Languages (DSLs) manifest themselves in remarkably diverse shapes, ranging from internal DSLs embedded as a mere fluent API within a programming language, to external DSLs with dedicated syntax and tool support. Although different shapes have different pros and cons, combining them for a single language is problematic: language designers usually commit to a particular shape early in the design process, and it is hard to reconsider this choice later. In this new ideas paper, we envision a language engineering approach enabling (i) language users to manipulate language constructs in the most appropriate shape according to the task at hand, and (ii) language designers to combine the strengths of different technologies for a single DSL. We report on early experiments and lessons learned building Prism, our prototype approach to this problem. We illustrate its applicability in the engineering of a simple shape-diverse DSL implemented conjointly in Rascal, EMF, and Java. We hope that our initial contribution will raise the awareness of the community and encourage future research."
157,hal-02179453,2019,{Cuckoo: Opportunistic MapReduce on Ephemeral and Heterogeneous Cloud Resources},"Dartois, Jean-Emile and Ribeiro, Heverson B. and Boukhobza, Jalil and Barais, Olivier",
158,hal-02362257,2019,{Leveraging cloud unused resources for Big data application while achieving SLA},"Dartois, Jean-Emile and Meriau, Ivan and Handaoui, Mohamed and Boukhobza, Jalil and Barais, Olivier",
159,hal-02303153,2019,{Tracking Application Fingerprint in a Trustless Cloud Environment for Sabotage Detection},"Dartois, Jean-Emile and Boukhobza, Jalil and Francoise, Vincent and Barais, Olivier",
160,hal-01898438,2018,{Using Quantile Regression for Reclaiming Unused Cloud Resources while achieving SLA},"Dartois, Jean-Emile and Knefati, Anas and Boukhobza, Jalil and Barais, Olivier","Although Cloud computing techniques have reduced the total cost of ownership thanks to virtualization, the average usage of resources (e.g., CPU, RAM, Network, I/O) remains low. To address such issue, one may sell unused resources. Such a solution requires the Cloud provider to determine the resources available and estimate their future use to provide availability guarantees. This paper proposes a technique that uses machine learning algorithms (Random Forest, Gradient Boosting Decision Tree, and Long Short Term Memory) to forecast 24 hours of available resources at the host level. Our technique relies on the use of quantile regression to provide a flexible trade-off between the potential amount of resources to reclaim and the risk of SLA violations. In addition, several metrics (e.g., CPU, RAM, disk, network) were predicted to provide exhaustive availability guarantees. Our methodology was evaluated by relying on four in production data center traces and our results show that quantile regression is relevant to reclaim unused resources. Our approach may increase the amount of savings up to 20% compared to traditional approaches."
161,hal-01243571,2015,{On Breaking The Curse of Dimensionality in Reverse Engineering Feature Models},"Davril, Jean-Marc and Heymans, Patrick and Bécan, Guillaume and Acher, Mathieu","Feature models have become one of the most widely used formalism for representing the variability among the products of a product line. The design of a feature model from a set of existing products can help stakeholders communicate on the commonalities and differences between the products, facilitate the adoption of mass customization strategies, or support the definition of the solution space of a product configurator (i.e. the sets of products that will be and will not be offered to the targeted customers). As the manual construction of feature models proves to be a time-consuming and error prone task, researchers have proposed various approaches for automatically deriving feature models from available product data. Existing reverse engineering techniques mostly rely on data mining algorithms that search for frequently occurring patterns between the features of the available product configurations. However, when the number of features is too large, the sparsity among the configurations can reduce the quality of the extracted model. In this paper, we discuss motivations for the development of dimensionality reduction techniques for product lines in order to support the extraction of feature models in the case of high-dimensional product spaces. We use a real world dataset to illustrate the problems arising with high dimensionality and present four research questions to address them."
162,hal-01243006,2015,{Using fuzzy modeling for consistent definitions of product qualities in requirements},"Davril, Jean-Marc and Cordy, Maxime and Heymans, Patrick and Acher, Mathieu","Companies increasingly rely on product differentiation and personalization strategies to provide their customers with an expansive catalog, and tools to assist them in finding the product meeting their needs. These tools include product search facilities, recommender systems, and product configurators. They typically represent a product as a set of features, which refer to a large number of technical specifications (e.g. size, weight, battery life). However, customers usually communicate and reason about products in terms of their qualities (e.g. ease-of-use, portability, ergonomics). In this paper, we tackle the problem of formalizing product qualities in the requirements of product-centred applications. Our goal is to extract product qualities from their technical features, so that customers can better perceive and evaluate the proposed products. To this end, we design a procedure for identifying segments of textual product documentation related to specific product qualities, and propose an approach based on fuzzy modeling to represent product qualities on top of technical specifications. Preliminary experiments we carried out on a catalog of cameras tend to show that fuzzy modeling is an appropriate formalism for representing product qualities. We also illustrate how modeled qualities can support the design of product configurators that are centered on the customers' needs."
163,hal-01087442,2015,{Towards a Meta-Language for the Concurrency Concern in DSLs},"Deantoni, Julien and Issa Diallo, Papa and Teodorov, Ciprian and Champeau, Joël and Combemale, Benoit","Concurrency is of primary interest in the development of complex software-intensive systems, as well as the deployment on modern platforms. Furthermore, Domain-Specific Languages (DSLs) are increasingly used in industrial processes to separate and abstract the various concerns of complex systems. However, reifying the definition of the DSL concurrency remains a challenge. This not only prevents leveraging the concurrency concern of a particular domain or platform, but it also hinders: a) the development of a complete understanding of the DSL semantics; b) the effectiveness of concurrency-aware analysis techniques; c) the analysis of the deployment on parallel architectures. In this paper, we present MoCCML, a dedicated meta-language for formally specifying the concurrency concern within the definition of a DSL. The concurrency constraints can reflect the knowledge in a particular domain, but also the constraints of a particular platform. MoCCML comes with a complete language workbench to help a DSL designer in the definition of the concurrency directly within the concepts of the DSL itself, and a generic workbench to simulate and analyze any model conforming to this DSL. MoCCML is illustrated on the definition of a lightweight extension of SDF (Synchronous Data Flow)."
164,hal-01197038,2015,{Melange: A Meta-language for Modular and Reusable Development of DSLs},"Degueule, Thomas and Combemale, Benoit and Blouin, Arnaud and Barais, Olivier and Jézéquel, Jean-Marc","Domain-Specific Languages (DSLs) are now developed for a wide variety of domains to address specific concerns in the development of complex systems. When engineering new DSLs, it is likely that previous efforts spent on the development of other languages could be leveraged, especially when their domains overlap. However, legacy DSLs may not fit exactly the end user requirements and thus require further extension, restriction, or specialization. While current language workbenches provide import mechanisms, they usually lack an explicit support for such customizations of imported artifacts. In this paper, we propose an approach for building DSLs by safely assembling and customizing legacy DSLs artifacts. This approach is based on typing relations that provide a reasoning layer for manipulating DSLs while ensuring type safety. On top of this reasoning layer, we provide an algebra of operators for extending, restricting, and assembling separate DSL artifacts. We implemented the typing relations and algebra into the Melange metalanguage. We illustrate Melange through the modular definition of an executable modeling language for the Internet Of Things domain. We show how it eases the definition of new DSLs by maximizing the reuse of legacy artifacts without introducing issues in terms of performance, technical ecosystem compatibility, or generated code volume."
165,hal-01197039,2015,{Reusing Legacy DSLs with Melange},"Degueule, Thomas and Combemale, Benoit and Blouin, Arnaud and Barais, Olivier","The proliferation of independently-developed and constantly-evolving domain-specific languages (DSLs) in many domains raises new challenges for the software language engineering community. Instead of starting the definition of new DSLs from scratch, language designers would benefit from the reuse of previously defined DSLs. While the support for engineering isolated DSLs is getting more and more mature, there is still little support in language workbenches for importing, assembling, and customizing legacy languages to form new ones. Melange is a new language workbench where new DSLs are built by assembling pieces of syntax and semantics. These pieces can be imported and subsequently extended, restricted, or customized to fit specific requirements. The demonstration will introduce the audience to the main features of Melange through the definition of an executable DSL for the design and execution of Internet of Things systems. Specifically, we will show how such a language can be obtained from the assembly of other popular languages while maintaining the compatibility with their tools and transformations."
166,hal-01242180,2015,{Tooling Support for Variability and Architectural Patterns in Systems Engineering},"Degueule, Thomas and Ferreira Filho, Joao Bosco and Barais, Olivier and Acher, Mathieu and Le Noir, Jérôme and Madelénat, Sébastien and Gailliard, Grégory and Burlot, Godefroy and Constant, Olivier","In systems engineering, the deployment of software components is error-prone since numerous safety and security rules have to be preserved. Furthermore, many deployments on different heterogeneous platforms are possible. In this paper we present a technological solution to assist industrial practitioners in producing a safe and secure solution out of numerous architectural variants. First, we introduce a pattern technology that provides correct-by-construction deployment models through the reuse of modeling artifacts organized in a catalog. Second, we develop a variability solution, connected to the pattern technology and based on an extension of the common variability language, for supporting the synthesis of model-based architectural variants. This paper describes a live demonstration of an industrial effort seeking to bridge the gap between variability modeling and model-based systems engineering practices. We illustrate the tooling support with an industrial case study (a secure radio platform)."
167,hal-01002994,2014,{Vers une approche centrée humain pour la définition de langages de modélisation graphiques},"Dupuy-Chessa, Sophie and Combemale, Benoit and Gervais, Marie-Pierre and Nodenot, Thierry and Le Pallec, Xavier and Wouters, Laurent","With the increasing complexity of information systems (ubiquitous systems, open enterprises, etc.), many new modeling languages are being proposed. Faced with this proliferation of domain-specific languages, their quality becomes an important concern for information systems modeling. This article addresses this problem by suggesting a user-centered approach for the creation of languages. Our proposals build on our earlier work from various projects, but form a coherent whole that allows us to show the interest and feasibility of the approach."
168,hal-00969072,2014,{When Model Driven Engineering meets Virtual Reality: Feedback from Application to the Collaviz Framework},"Duval, Thierry and Blouin, Arnaud and Jézéquel, Jean-Marc","Despite the increasing use of 3D Collaborative Virtual Environments (3D CVE), their development is still a cumbersome task. The various concerns to consider (distributed system, 3D graphics, etc.) complexify the development as well as the evolution of CVEs. Software engineering recently proposed methods and tools to ease the development process of complex software systems. Among them, Model-Driven Engineering (MDE) considers models as first-class entities. A model is an abstraction of a specific aspect of the system under study for a specific purpose. MDE thus breaks down a complex system into as many models for different purposes, such as: generating code from models; building domain specific programming/modeling languages (DSL); generating tools such as graphical or textual editors. In this paper we leverage MDE for developing 3D CVEs. We show how the Collaviz framework benefited from a DSL we built. The benefits are multiple: 3D CVE designers can focus on the behavior of their virtual objects without bothering with distributed and graphics features; configuring the content of 3D CVEs and their deployment on various software and hardware platforms can be automated through code generation. We detail the development process we propose and the experiments we conducted on Collaviz."
169,hal-00983064,2014,{Surrogate-Assisted Optimisation of Composite Applications in Mobile Ad hoc Networks},"Efstathiou, Dionysios and Mcburney, Peter and Zschaler, Steffen and Bourcier, Johann","Infrastructure-less mobile ad hoc networks enable the development of collaborative pervasive applications. Within such dynamic networks, collaboration between devices can be realised through service-orientation by abstracting device resources as services. Recently, a framework for QoS-aware service composition has been introduced which takes into account a spectrum of orchestration patterns, and enables compositions of a better QoS than traditional centralised orchestration approaches. In this paper, we focus on the automated exploration of trade-off compositions within the search space defined by this flexible composition model. For the studied problem, the evaluation of the fitness functions guiding the search process is computationally expensive because it either involves a high-fidelity simulation or actually requires calling the composite service. To overcome this limitation, we have developed efficient surrogate models for estimating the QoS metrics of a candidate solution during the search. Our experimental results show that the use of surrogates can produce solutions with good convergence and diversity properties at a much lower computational effort."
170,hal-01090246,2014,{Optimizing Multi-objective Evolutionary Algorithms to Enable Quality-Aware Software Provisioning},"El Kateb, Donia and Fouquet, François and Bourcier, Johann and Le Traon, Yves","Elasticity [19] is a key feature for cloud infrastructures to continuously align allocated computational resources to evolving hosted software needs. This is often achieved by relaxing quality criteria, for instance security or privacy [8] because quality criteria are often conflicting with performance. As an example, software replication could improve scalability and uptime while decreasing privacy by creating more potential leakage points. The conciliation of these conflicting objectives has to be achieved by exhibiting trade-offs. Multi-Objective Evolutionary Algorithms (MOEAs) have shown to be suitable candidates to find these trade-offs and have been even applied for cloud architecture optimizations [21]. Still though, their runtime efficiency limits the widespread adoption of such algorithms in cloud engines, and thus the consideration of quality criteria in clouds. Indeed MOEAs produce many dead-born solutions because of the Darwinian inspired natural selection, which results in resource wastage. To tackle MOEAs efficiency issues, we apply a process similar to modern biology. We choose specific artificial mutations by anticipating the optimization effect on the solutions instead of relying on the randomness of natural selection. This paper introduces the Sputnik algorithm, which leverages the past history of actions to enhance optimization processes such as cloud elasticity engines. We integrate Sputnik in a cloud elasticity engine, dealing with performance and quality criteria, and demonstrate significant performance improvement, meeting the runtime requirements of cloud optimization."
171,hal-01087323,2014,{Sound Merging and Differencing for Class Diagrams},"Fahrenberg, Uli and Acher, Mathieu and Legay, Axel and Wąsowski, Andrzej","Class diagrams are among the most popular modeling languages in industrial use. In a model-driven development process, class diagrams evolve, so it is important to be able to assess differences between revisions, as well as to propagate differences using suitable merge operations. Existing differencing and merging methods are mainly syntactic, concentrating on edit operations applied to model elements, or they are based on sampling: enumerating some examples of instances which characterize the difference between two diagrams. This paper presents the first known (to the best of our knowledge) automatic model merging and differencing operators supported by a formal semantic theory guaranteeing that they are semantically sound. All instances of the merge of a model and its difference with another model are automatically instances of the second model. The differences we synthesize are represented using class diagram notation (not edits, or instances), which allows creation of a simple yet flexible algebra for diffing and merging. It also allows presenting changes comprehensively, in a notation already known to users."
172,hal-01163423,2015,{Assessing Product Line Derivation Operators Applied to Java Source Code: An Empirical Study},"Ferreira Filho, João Bosco and Allier, Simon and Barais, Olivier and Acher, Mathieu and Baudry, Benoit","Product Derivation is a key activity in Software Product Line Engineering. During this process, derivation operators modify or create core assets (e.g., model elements, source code instructions, components) by adding, removing or substituting them according to a given configuration. The result is a derived product that generally needs to conform to a programming or modeling language. Some operators lead to invalid products when applied to certain assets, some others do not; knowing this in advance can help to better use them. However, this is challenging, especially if we consider assets expressed in extensive and complex languages such as Java. In this paper, we empirically answer the following question: which product line operators, applied to which program elements, can synthesize variants of programs that are incorrect, correct or perhaps even conforming to test suites? We implement source code transformations, based on the derivation operators of the Common Variability Language. We automatically synthesize more than 370,000 program variants from a set of 8 real large Java projects (up to 85,000 lines of code), obtaining an extensive panorama of the sanity of the operations."
173,hal-01116694,2015,{Challenges on Software Unbundling: Growing and Letting Go},"Ferreira Filho, João Bosco and Acher, Mathieu and Barais, Olivier","Unbundling is a phenomenon that consists of dividing an existing software artifact into smaller ones. For example, mobile applications from well-known companies are being divided into simpler and more focused new ones. Despite its current importance, little is known or studied about unbundling or about how it relates to existing software engineering approaches, such as modularization. Consequently, recent cases point out that it has been performed unsystematically and arbitrarily. In this paper, our main goal is to present this novel and relevant concept and its challenges in the light of software engineering, exemplifying it with recent cases. We relate unbundling to standard software modularization, presenting the new motivations behind it, the resulting problems, and drawing perspectives for future support in the area."
174,hal-01159131,2015,{Emergent Robustness in Software Systems through Decentralized Adaptation: an Ecologically-Inspired ALife Approach},"Fleurey, Franck and Baudry, Benoit and Gauzens, Benoit and Elie, André and Yeboah-Antwi, Kwaku","The ecosystem of web applications faces a critical paradox: on one hand, the Internet is a constantly evolving and unpredictable computing platform, on the other hand, the software services that run on top of it hardly have the ability to adapt to the evolution of this platform. Among the software services, we distinguish between service providers that provide micro services and service consumers that aggregate several micro services to deliver macro services to customers. Providers and consumers must handle uncertainty: providers cannot know in advance what consumers need; consumers rely on third-parties that can disappear at any time. Our proposal analogizes the software consumer / provider network to a bi-partite ecological graph. This analogy provides the foundations for the design of EVOSERV, an individual-based ALife simulator used to experiment with decentralized adaptation strategies for providers and consumers. The initial model of a software network is tuned according to observations gathered from real-world software networks. The key insights about our experiments are that, 1) we can successfully model software systems as an ALife system, and 2) we succeed in making a global property emerge from local decisions: when consumers and providers adapt with local decision strategies, the global robustness of the network increases. We show that these results hold with different initial situations, different scales and different topological constraints on the network."
175,hal-01096941,2014,{Designing and Evolving Distributed Architecture using Kevoree},"Fouquet, François and Nain, Grégory and Daubert, Erwan and Bourcier, Johann and Barais, Olivier and Plouzeau, Noel and Morin, Brice","Modern software applications are distributed and often operate in dynamic contexts, where requirements, assumptions about the environment, and usage profiles continuously change. These changes are difficult to predict and to anticipate at design time. The running software system should thus be able to react on its own, by dynamically adapting its behavior, in order to sustain a required quality of service. A key challenge is to provide the system with the necessary flexibility to perform self-adaptation, without compromising dependability. Models@Runtime is an emerging paradigm aiming at transferring traditional modeling activities (focusing on quality, verification, and so on) performed by humans, to the running system. In this trend, Kevoree provides a models@runtime platform to design heterogeneous, distributed and adaptive applications based on the component based software engineering paradigm. At the end of this tutorial, applicants will be able to develop and assemble new components and communication channels to design complex self-adaptable distributed architectures by reusing existing pieces of code."
176,hal-01003148,2014,{A Variability-Based Testing Approach for Synthesizing Video Sequences},"Galindo Duarte, José Angel and Alférez, Mauricio and Acher, Mathieu and Baudry, Benoit and Benavides, David","A key problem when developing video processing software is the difficulty to test different input combinations. In this paper, we present VANE, a variability-based testing approach to derive video sequence variants. The ideas of VANE are i) to encode in a variability model what can vary within a video sequence; ii) to exploit the variability model to generate testable configurations; iii) to synthesize variants of video sequences corresponding to configurations. VANE computes T-wise covering sets while optimizing a function over attributes. Also, we present a preliminary validation of the scalability and practicality of VANE in the context of an industrial project involving the test of video processing algorithms."
177,hal-01334851,2016,{Exploiting the Enumeration of All Feature Model Configurations},"Galindo Duarte, José Angel and Acher, Mathieu and Tirado, Juan Manuel and Vidal, Cristian and Baudry, Benoit and Benavides, David","Feature models are widely used to encode the configurations of a software product line in terms of mandatory, optional and exclusive features as well as propositional constraints over the features. Numerous computationally expensive procedures have been developed to model check, test, configure, debug, or compute relevant information of feature models. In this paper we explore the possible improvement of relying on the enumeration of all configurations when performing automated analysis operations. The key idea is to pre-compile configurations so that reasoning operations (queries and transformations) can then be performed in polytime. We tackle the challenge of how to scale the existing enumeration techniques. We show that the use of distributed computing techniques might offer practical solutions to previously unsolvable problems and opens new perspectives for the automated analysis of software product lines."
178,hal-02282591,2019,{A Collaborative Strategy for mitigating Tracking through Browser Fingerprinting},"Gómez-Boix, Alejandro and Frey, Davide and Bromberg, Yérom-David and Baudry, Benoit",
179,hal-01611101,2017,{Fingerprinting mobile devices: A short analysis},"Gómez-Boix, Alejandro and Laperdrix, Pierre and Baudry, Benoit","The variety of possibilities for customizing web browsers, combined with the capacity of a remote server to query information without requesting any permission, has created a new kind of privacy issue called browser fingerprinting. Previous studies demonstrated the effectiveness of fingerprinting to uniquely identify browsers running on laptops and desktops. Meanwhile, the majority of browsing activities have switched from laptops to mobile devices. Yet, there is currently a lack of systematic analysis of browser fingerprinting on mobile devices. In this paper, we describe browser fingerprinting technologies and summarize previous observations on laptops. Then, we present preliminary observations on mobile devices, highlighting differences with previous results, and challenges for future work."
180,hal-01718234,2018,{Hiding in the Crowd: an Analysis of the Effectiveness of Browser Fingerprinting at Large Scale},"Gómez-Boix, Alejandro and Laperdrix, Pierre and Baudry, Benoit","Browser fingerprinting is a stateless technique, which consists in collecting a wide range of data about a device through browser APIs. Past studies have demonstrated that modern devices present so much diversity that fingerprints can be exploited to identify and track users online. With this work, we want to evaluate if browser fingerprinting is still effective at uniquely identifying a large group of users when analyzing millions of fingerprints over a few months. We analyze 2,067,942 browser fingerprints collected from one of the top 15 French websites. The observations made on this novel dataset shed a new light on the ever-growing browser fingerprinting domain. The key insight is that the percentage of unique fingerprints in this dataset is much lower than what was reported in the past: only 33.6% of fingerprints are unique, as opposed to over 80% in previous studies. We show that non-unique fingerprints tend to be fragile. If some features of the fingerprint change, it is very probable that the fingerprint will become unique. We also confirm that the current evolution of web technologies significantly benefits users' privacy, as the removal of plugins substantially brings down the rate of unique desktop machines."
181,hal-01090565,2014,{Designing resource-aware distributed systems based on system-level containers},"Gonzalez-Herrera, Inti and Bourcier, Johann and Barais, Olivier and Fouquet, François","Resource management is critical for application domains where components share their execution environments but belong to different stakeholders, such as smart homes or cloud systems. Yet, current middleware and application containers often hide system-level details needed for dynamic resource management. In particular, they tend to hide resource usage by offering automatic management of these resources (e.g., CPU, memory and I/O). In contrast, system-level containers, such as Linux Containers (LXC), allow fine-grain resource management. However, they lack knowledge about the application's structure and its requirements in order to provide fine-tuned resource management. In this tutorial, we will expose Squirrel: a new middleware which aims at combining the benefits of component-based software engineering, to design flexible and modular applications, and of system-level containers, to manage resources. Squirrel follows an approach where developers specify contracts on components and connections to describe the expected behavior of their application regarding resource consumption. These high-level contracts are then used to automatically configure the system-level containers which will host the running applications. At the end of this tutorial, participants will be able to design applications and contracts using Squirrel and run their applications inside system-level containers to ensure a correct behavior of their application regarding resource consumption."
182,hal-00983045,2014,{Scapegoat: an Adaptive monitoring framework for Component-based systems},"Gonzalez-Herrera, Inti and Bourcier, Johann and Daubert, Erwan and Rudametkin, Walter and Barais, Olivier and Fouquet, François and Jézéquel, Jean-Marc","Modern component frameworks support continuous deployment and simultaneous execution of multiple software components on top of the same virtual machine. However, isolation between the various components is limited. A faulty version of any one of the software components can compromise the whole system by consuming all available resources. In this paper, we address the problem of efficiently identifying faulty software components running simultaneously in a single virtual machine. Current solutions that perform permanent and extensive monitoring to detect anomalies induce high overhead on the system, and can, by themselves, make the system unstable. In this paper we present an optimistic adaptive monitoring system to determine the faulty components of an application. Suspected components are finely instrumented for deeper analysis by the monitoring system, but only when required. Unsuspected components are left untouched and execute normally. Thus, we perform localized just-in-time monitoring that decreases the accumulated overhead of the monitoring system. We evaluate our approach against a state-of-the-art monitoring system and show that our technique correctly detects faulty components, while reducing overhead by an average of 80%."
183,hal-01355000,2016,{Squirrel: Architecture Driven Resource Management},"Gonzalez-Herrera, Inti and Bourcier, Johann and Rudametkin, Walter and Barais, Olivier and Fouquet, Francois","Resource management is critical to guarantee Quality of Service when various stakeholders share the execution environment , such as cloud or mobile environments. In this context, providing management techniques compatible with standard practices, such as component models, is essential. Resource management is often realized through monitoring or process isolation (using virtual machines or system containers). These techniques (i) impose varying levels of overhead depending on the managed resource, and (ii) are applied at different abstraction levels, such as processes, threads or objects. Thus, mapping components to system-level abstractions in the presence of resource management requirements can lead to sub-optimal systems. We propose Squirrel, an approach to tune component deployment and resource management in order to reduce management overhead. At run-time, Squirrel uses an architectural model annotated with resource requirements to guide the mapping of components to system abstractions, providing different resource management capabilities and overhead. We present an implementation of Squirrel, using a Java component framework, and a set of experiments to validate its feasibility and overhead. We show that choosing the right component-to-system mappings at deployment-time reduces performance penalty and/or volatile main memory use."
184,hal-01468084,2017,{Yo Variability! JHipster: A Playground for Web-Apps Analyses},"Halin, Axel and Nuttinck, Alexandre and Acher, Mathieu and Devroey, Xavier and Perrouin, Gilles and Heymans, Patrick","Though variability is everywhere, there has always been a shortage of publicly available cases for assessing variability-aware tools and techniques as well as support for teaching variability-related concepts. Historical software product lines contain industrial secrets their owners do not want to disclose to a wide audience. The open source community contributed to large-scale cases such as Eclipse, Linux kernels, or web-based plugin systems (Drupal, WordPress). To assess the accuracy of sampling and prediction approaches (bugs, performance), a case where all products can be enumerated is desirable. As configuration issues do not lie within only one place but are scattered across technologies and assets, a case exposing such diversity is an additional asset. To this end, we present in this paper our efforts in building an explicit product line on top of JHipster, an industrial open-source Web-app configurator that is both manageable in terms of configurations (~ 163,000) and diverse in terms of technologies used. We present our efforts in building a variability-aware chain on top of JHipster's configurator and lessons learned using it as a teaching case at the University of Rennes. We also sketch the diversity of analyses that can be performed with our infrastructure as well as early issues found using it. Our long term goal is both to support students and researchers studying variability analysis and JHipster developers in the maintenance and evolution of their tools."
185,hal-01158925,2014,{Cost in the Cloud: Rationalization and Research Trails},"Hamadache, Kahina and Simko, Viliam and Dautov, Rustem and Gonidis, Fotis and Zerva, Paraskevi and Paez Anaya, Ivan Dario and Polyviou, Ariana","Cloud Computing provides simplicity to its consumers by saving them the efforts to deal with their own infrastructure, environments or software. This simplicity relies on the shifting of problems from the client to the provider, introducing new paradigms (virtualization, scalability, flexibility, pay-per-use, etc.). This simplicity comes with a price for the consumer that may, or may not, accurately reflect the costs of the provider. In this paper we propose to identify the different points, in the Cloud Computing architecture, where the costs are generated, how their reduction/optimisation is considered, and finally we point out which of these key points need to be further investigated, according to their foreseeable efficiency."
186,hal-02989286,2020,{ReLeaSER: A Reinforcement Learning Strategy for Optimizing Utilization Of Ephemeral Cloud Resources},"Handaoui, Mohamed and Dartois, Jean-Emile and Boukhobza, Jalil and Barais, Olivier and d'Orazio, Laurent","Cloud data center capacities are over-provisioned to handle demand peaks and hardware failures, which leads to low resource utilization. One way to improve resource utilization and thus reduce the total cost of ownership is to offer unused resources (referred to as ephemeral resources) at a lower price. However, reselling resources needs to meet the expectations of its customers in terms of Quality of Service. The goal is thus to maximize the amount of reclaimed resources while avoiding SLA penalties. To achieve that, cloud providers have to estimate their future utilization to provide availability guarantees. The prediction should consider a safety margin for resources to react to unpredictable workloads. The challenge is to find the safety margin that provides the best trade-off between the amount of resources to reclaim and the risk of SLA violations. Most state-of-the-art solutions consider a fixed safety margin for all types of metrics (e.g., CPU, RAM). However, a unique fixed margin does not consider various workload variations over time, which may lead to SLA violations or/and poor utilization. In order to tackle these challenges, we propose ReLeaSER, a Reinforcement Learning strategy for optimizing the ephemeral resources' utilization in the cloud. ReLeaSER dynamically tunes the safety margin at the host-level for each resource metric. The strategy learns from past prediction errors (that caused SLA violations). Our solution significantly reduces the SLA violation penalties, by 2.7x on average and up to 3.4x. It also considerably improves the cloud providers' potential savings, by 27.6% on average and up to 43.6%."
187,hal-02497029,2020,{Salamander: a Holistic Scheduling of MapReduce Jobs on Ephemeral Cloud Resources},"Handaoui, Mohamed and Dartois, Jean-Emile and Lemarchand, Laurent and Boukhobza, Jalil","Most cloud data centers are over-provisioned and underutilized, primarily to handle peak loads and sudden failures. This has motivated many researchers to reclaim the unused resources, which are by nature ephemeral, to run data-intensive applications at a lower cost. Hadoop MapReduce is one of those applications. However, it was designed on the assumption that resources are available as long as users pay for the service. In order to make it possible for Hadoop to run on unused (ephemeral) resources, we have designed a heterogeneity and volatility-aware holistic scheduler consisting of three different components: (1) a MapReduce task and job scheduler that relies on a global vision of resource utilization predictions, (2) a scheduler-based data placement strategy that improves the data locality, and (3) a reactive QoS controller that ensures customers' service-level agreement (SLA) and minimizes interference between co-located workloads. Our framework makes it possible to take advantage of ephemeral resources efficiently. Indeed, for a given set of jobs, it reduces the overall execution time by up to 47.6% and by 18.7% on average as compared to state-of-the-art strategies."
188,hal-01097020,2014,{A Native Versioning Concept to Support Historized Models at Runtime},"Hartmann, Thomas and Fouquet, François and Nain, Gregory and Morin, Brice and Klein, Jacques and Barais, Olivier and Le Traon, Yves","Models@run.time provides semantically rich reflection layers enabling intelligent systems to reason about themselves and their surrounding context. Most reasoning processes require not only to explore the current state, but also the past history to take sustainable decisions, e.g. to avoid oscillating between states. Models@run.time and model-driven engineering in general lack native mechanisms to efficiently support the notion of history, and current approaches usually generate redundant data when versioning models, which reasoners need to navigate. Because of this limitation, models fail in providing suitable and sustainable abstractions to deal with domains relying on history-aware reasoning. This paper tackles this issue by considering history as a native concept for modeling foundations. Integrated, in conjunction with lazy load/storage techniques, into the Kevoree Modeling Framework, we demonstrate, on a smart grid case study, that this mechanism enables sustainable reasoning about massive historized models."
189,hal-02129131,2019,{Discovering Indicators for Classifying Wikipedia Articles in a Domain: A Case Study on Software Languages},"Heinz, Marcel and Lämmel, Ralf and Acher, Mathieu","Wikipedia is a rich source of information across many knowledge domains. Yet, recovering articles relevant to a specific domain is a difficult problem since such articles may be rare and tend to cover multiple topics. Furthermore, Wikipedia's categories provide an ambiguous classification of articles as they relate to all topics and thus are of limited use. In this paper, we develop a new methodology to isolate Wikipedia's articles that describe a specific topic within the scope of relevant categories; the methodology uses supervised machine learning to retrieve a decision tree classifier based on articles' features (URL patterns, summary text, infoboxes, links from list articles). In a case study, we retrieve 3000+ articles that describe software (computer) languages. Available fragments of ground truths serve as an essential part of the training set to detect relevant articles. The results of the classification are thoroughly evaluated through a survey, in which 31 domain experts participated."
190,hal-04077663,2023,{Polyglot AST: Towards Enabling Polyglot Code Analysis},"Houdaille, Philémon and Khelladi, Djamel Eddine and Briend, Romain and Jongeling, Robbert and Combemale, Benoit","Today, a plethora of programming languages exists, each better suited for a particular concern. For example, Python is suited for data analysis but not web development, whereas JavaScript is the inverse. As software complexity grows and needs to address multiple concerns, different programming languages are often used in combination, despite the burden of bridging them (e.g., using Java Native Interface). Polyglot programming emerged as a solution allowing the seamless mixing of multiple programming languages. GraalVM and PolyNote are examples of runtimes allowing polyglot programming. However, there is a striking lack of support at design time for building and analyzing polyglot code. To the best of our knowledge, there is no uniform language-agnostic way of reasoning over multiple languages to provide seamless code analysis, since each language comes with its own form of Abstract Syntax Trees (AST). In this paper, we present an approach to build a uniform yet polyglot AST over polyglot code, so that it is easier to perform global analysis. We first motivate this challenge and identify the main requirements for building a polyglot AST. We then propose a proof of concept implementation of our solutions on GraalVM's polyglot API. On top of the polyglot AST, we demonstrate the ability to implement several polyglot-specific analysis services, namely auto-completion, consistency checking, type inference, and rename refactoring. Our evaluation on three polyglot projects taken from GitHub, and involving JavaScript and Python code, shows that we can build a polyglot AST without significant overhead. We also demonstrate the usefulness of the polyglot analysis services through the provided automation, as well as their scalability."
191,hal-01382971,2016,{libmask: Protecting Browser JIT Engines from the Devil in the Constants},"Jangda, Abhinav and Mishra, Mohit and Baudry, Benoit","JavaScript (JS) engines are virtual machines that execute JavaScript code. These engines find frequent application in web browsers like Google Chrome, Mozilla Firefox, Microsoft Internet Explorer and Apple Safari. Since the purpose of a JS engine is to produce executable code, it cannot be run in a non-executable environment, and is susceptible to attacks like Just-in-Time (JIT) Spraying, which embed return-oriented programming (ROP) gadgets in arithmetic or logical instructions as immediate offsets. This paper introduces libmask, a JIT compiler extension to prevent JIT-spraying attacks as an effective alternative to XOR-based constant blinding. libmask transforms constants into global variables and marks the memory area for these global variables as read only. Hence, any constant is referred to by a memory address, making exploitation of arithmetic and logical instructions more difficult. Further, these memory addresses are randomized to further harden the security. The scheme has been implemented and evaluated as a library extension to the Google V8 scripting engine, with optimizations that contain the performance overhead and make libmask a feasible approach. We demonstrate that libmask masks all the constants in JITed code, and effectively raises the bar for JIT-spray and JIT-ROP attacks. The average overhead incurred upon memory is less than 300 kilobytes, while in most benchmarks the memory overhead is less than 10 KB. The average performance overhead observed with these optimizations is 5.31%. Further, this new approach shows a modest performance improvement over the currently deployed constant blinding technique in Google V8."
192,hal-02307953,2019,{From DSL Specification to Interactive Computer Programming Environment},"Jeanjean, Pierre and Combemale, Benoit and Barais, Olivier",
193,hal-03107122,2021,{IDE as Code: Reifying Language Protocols as First-Class Citizens},"Jeanjean, Pierre and Combemale, Benoit and Barais, Olivier","To cope with the ever-growing number of programming languages, manufacturers of Integrated Development Environments (IDE) have recently defined protocols as a way to use and share multiple language services (e.g., auto-completion, type checker, language runtime) in language-agnostic environments (i.e., the user interface provided by the IDE): the most notable are the Language Server Protocol (LSP) for textual editors, and the Debug Adapter Protocol (DAP) for debugging facilities. These protocols rely on a proper specification of the services that are commonly found in the tool support of general-purpose languages, and define a fixed set of capabilities to offer in the IDE. However, new languages appear regularly offering unique constructs (e.g., Domain-Specific Languages), and supported by dedicated services to be offered as new capabilities in IDEs. This trend leads to the multiplication of new protocols, hard to combine and possibly incompatible (e.g., overlap, different technological stacks). Beyond the proposition of specific protocols, the goal of this paper is to stress the importance of being able to specify language protocols and to offer IDEs that can be configured with such protocol specifications. We present our vision by discussing the main concepts for the specification of language protocols, and an approach that can make use of these specifications in order to deploy an IDE as a set of coordinated, individually deployed, language capabilities (e.g., microservice choreography). IDEs went from directly supporting languages to supporting protocols, and we envision in this paper the next step: IDE as Code, where language protocols are created or inferred on demand and serve as the support of an adaptation loop in charge of the (re)configuration of the IDE."
194,hal-01098012,2014,{Domain Specific Languages: From Craft to Engineering},"Jézéquel, Jean-Marc","The engineering of systems involves many different stakeholders, each with their own domain of expertise. Hence more and more organizations are developing an ever growing number of Domain Specific Languages (DSLs) to allow domain experts to express solutions directly in terms of relevant domain concepts. This new trend raises new challenges about designing not just one DSL but many of them, evolving a set of DSLs and coordinating the use of multiple DSLs. In this talk we explore various dimensions of these challenges, and outline a possible research roadmap for addressing them. We detail one of these challenges, which is the safe reuse of model transformations. Indeed both DSL definition and tooling (e.g., checkers, document or code generators, model transformations) require significant development efforts, for a limited audience (by definition), because the current state of the art of Model Driven Engineering still makes it hard to reuse and evolve these definitions and tooling across several DSLs, even when these DSLs are conceptually very close to one another. We outline a new extension to the Kermeta language that leverages Family Polymorphism to allow model polymorphism, inheritance among DSLs, as well as evolution and interoperability of DSLs."
195,hal-03788437,2022,{From feature models to feature toggles in practice},"Jézéquel, Jean-Marc and Kienzle, Jörg and Acher, Mathieu","Feature Toggles (often also referred to as Feature Flags) are a powerful technique, providing an alternative to maintaining multiple feature branches in source code. A condition within the code enables or disables a feature at runtime, hence providing a kind of runtime variability resolution. Several works have already identified the proximity of this concept with the notion of Feature found in Software Product Lines. In this paper, we propose to go one step further in unifying these concepts to provide a seamless transition between design time and runtime variability resolutions. We propose to model all the variability using a feature model. Then this feature model can be partially resolved at design time (yielding an incomplete product derivation), the unresolved variability being used to generate feature toggles that can be enabled/disabled at runtime. We first demonstrate these ideas on the toy example of the Expression Product Line, and then show how it can scale to build a configurable authentication system, where a partially resolved feature model can interface with popular feature toggle frameworks such as Togglz."
196,hal-01097975,2014,{Safely Reusing Model Transformations through Family Polymorphism},"Jézéquel, Jean-Marc","The engineering of systems involves many different stakeholders, each with their own domain of expertise. Hence more and more organizations are adopting Domain Specific Languages (DSLs) to allow domain experts to express solutions directly in terms of relevant domain concepts. This new trend raises new challenges about designing DSLs, evolving a set of DSLs and coordinating the use of multiple DSLs. In this talk we explore various dimensions of these challenges, and outline a possible research roadmap for addressing them. We detail one of these challenges, which is the safe reuse of model transformations. Indeed both DSL definition and tooling (e.g., checkers, document or code generators, model transformations) require significant development efforts, for a limited audience (by definition), because the current state of the art of Model Driven Engineering still makes it hard to reuse and evolve these definitions and tooling across several DSLs, even when these DSLs are conceptually very close to one another. We outline a new extension to the Kermeta language that leverages Family Polymorphism to allow model polymorphism, inheritance among DSLs, as well as evolution and interoperability of DSLs."
197,hal-01024166,2014,{When Systems Engineering Meets Software Language Engineering},"Jézéquel, Jean-Marc and Mendez, David and Degueule, Thomas and Combemale, Benoit and Barais, Olivier","The engineering of systems involves many different stakeholders, each with their own domain of expertise. Hence more and more organizations are adopting Domain Specific Languages (DSLs) to allow domain experts to express solutions directly in terms of relevant domain concepts. This new trend raises new challenges about designing DSLs, evolving a set of DSLs and coordinating the use of multiple DSLs for both DSL designers and DSL users. This paper explores various dimensions of these challenges, and outlines a possible research roadmap for addressing them. The message of this paper is also to claim that if language engineering techniques to design any single (disposable) language are mature, the language engineering community needs to fundamentally change its view on software language design. We need to take the next step and adopt the perspective that a software language is, fundamentally, software too and thus the result of a composition of design decisions. These design decisions should be represented as first-class entities in the software languages workbench and it should be possible, during the language lifecycle, to add, remove and change language design decisions with limited effort to go from continuous design to continuous meta-design."
198,hal-01179294,2015,{Towards Functional Model Transformations with OCL},"Jouault, Frédéric and Beaudoux, Olivier and Brun, Matthias and Clavreul, Mickaël and Savaton, Guillaume","Several model transformation approaches such as QVT and ATL use OCL as expression language for its model-querying capabilities. However, they need to add specific and incompatible syntactic constructs for pattern matching as well as model element creation and mutation. In this paper, we present an exploratory approach to enable the expression of whole model transformations in OCL. This approach leverages some OCL extensions proposed for inclusion in the upcoming OCL 2.5: pattern matching and shadow objects. It also relies on a specific execution layer to enable traceability and side effects on models. With model transformations as OCL functions, it becomes possible to use a single, standard, well-known, functional, and formalized model querying language to perform tasks traditionally assigned to model transformation languages. Thus, functional techniques such as function composition and higher-order become directly applicable to model transformations."
199,hal-03355253,2021,{SEALS: A framework for building Self-Adaptive Virtual Machines},"Jouneaux, Gwendal and Barais, Olivier and Combemale, Benoit and Mussbacher, Gunter","Over recent years, self-adaptation has become a major concern for software systems that evolve in changing environments. While expert developers may choose a manual implementation when self-adaptation is the primary concern, self-adaptation should be abstracted for non-expert developers or when it is a secondary concern. We present SEALS, a framework for building self-adaptive virtual machines for domain-specific languages. This framework provides first-class entities for the language engineer to promote domain-specific feedback loops in the definition of the DSL operational semantics. In particular, the framework supports the definition of (i) the abstract syntax and the semantics of the language as well as the correctness envelope defining the acceptable semantics for a domain concept, (ii) the feedback loop and associated trade-off reasoning, and (iii) the adaptations and the predictive model of their impact on the trade-off. We use this framework to build three languages with self-adaptive virtual machines and discuss the relevance of the abstractions, effectiveness of correctness envelopes, and compare their code size and performance results to their manually implemented counterparts. We show that the framework provides suitable abstractions for the implementation of self-adaptive operational semantics while introducing little performance overhead compared to a manual implementation."
200,hal-03318816,2021,{Towards Self-Adaptable Languages},"Jouneaux, Gwendal and Barais, Olivier and Combemale, Benoit and Mussbacher, Gunter","Over recent years, self-adaptation has become a concern for many software systems that have to operate in complex and changing environments. At the core of self-adaptation, there is a feedback loop and associated trade-off reasoning to decide on the best course of action. However, existing software languages do not abstract the development and execution of such feedback loops for self-adaptable systems. Developers have to fall back to ad-hoc solutions to implement self-adaptable systems, often with wide-ranging design implications (e.g., explicit MAPE-K loop). Furthermore, existing software languages do not capitalize on monitored usage data of a language and its modeling environment. This hinders the continuous and automatic evolution of a software language based on feedback loops from the modeling environment and runtime software system. To address the aforementioned issues, this paper introduces the concept of Self-Adaptable Language (SAL) to abstract the feedback loops at both system and language levels. We propose L-MODA (Language, Models, and Data) as a conceptual reference framework that characterizes the possible feedback loops abstracted into a SAL. To demonstrate SALs, we present emerging results on the abstraction of the system feedback loop into the language semantics. We report on the concept of Self-Adaptable Virtual Machines as an example of semantic adaptation in a language interpreter and present a roadmap for SALs."
201,hal-02192482,2019,{Supporting A Flexible Grouping Mechanism for Collaborating Engineering Teams},"Kanakis, Georgios and Fischer, Stefan and Khelladi, Djamel Eddine and Egyed, Alexander","Most engineering tools do not provide much support for collaborating teams, and today's engineering knowledge repositories lack flexibility and are limited. Engineering teams have different needs and their team members have different preferences on how and when to collaborate. These needs may depend on the individual work style, the role an engineer has, and the tasks they have to perform within the collaborating group. However, individual collaboration is insufficient and engineers need to collaborate in groups. This work presents a collaboration framework for collaborating groups capable of providing synchronous and asynchronous modes of collaboration. Additionally, our approach enables engineers to mix these collaboration modes to meet the preferences of individual group members. We evaluate the scalability of this framework using four large, real-life collaboration projects. These projects were found on GitHub and were under active development at the time of evaluation. We have tested our approach by creating groups of different sizes for each project. The results showed that our approach scales to support every case for the groups created. Additionally, we scouted the literature and discovered studies that support the usefulness of groups with different collaboration styles."
202,hal-04126496,2023,{Towards Leveraging Tests to Identify Impacts of Metamodel and Code Co-evolution},"Kebaili, Zohra Kaouter and Khelladi, Djamel Eddine and Acher, Mathieu and Barais, Olivier","Models play a significant role in Model-Driven Engineering (MDE) and metamodels are commonly transformed into code. Developers intensively rely on the generated code to build language services and tooling, such as editors and views, which are also tested to ensure their behavior. The metamodel evolution between releases updates the generated code, and this may impact the developers' additional, client code. Accordingly, the impacted code must be co-evolved too, but there is no guarantee of preserving its behavior correctness. This paper envisions an automatic approach for ensuring code co-evolution correctness. It first aims to trace the tests impacted by the metamodel evolution before and after the code co-evolution, and then compares them to analyze the behavior of the code. A preliminary evaluation on two Eclipse projects, implementations of OCL and Modisco, showed that we can successfully trace the impacted tests automatically, selecting 738 and 412 tests before and after co-evolution respectively, based on 303 metamodel changes. By running these impacted tests, we observed both behaviorally correct and incorrect code co-evolutions."
203,hal-01024067,2014,{A formal framework to prove the correctness of model driven engineering composition operators},"Kezadri, Mounira and Pantel, Marc and Combemale, Benoit and Thirioux, Xavier","Current trends in system engineering combine modeling, composition and verification technologies in order to harness their ever growing complexity. Each composition operator dedicated to a different modeling concern should be proven to be property preserving at assembly time. These proofs are usually burdensome with repetitive aspects. Our work targets the factorisation of these aspects, relying on primitive generic composition operators used to express more sophisticated language-specific ones. These operators are defined for languages expressed with OMG MOF metamodeling technologies. The proofs are done with the Coq proof assistant, relying on the Coq4MDE framework defined previously. These basic operators, Union and Substitution, are illustrated using the MOF Package Merge as composition operator and the preservation of model conformance as the verified property."
204,hal-03252267,2014,{Correct-by-construction model composition: Application to the Invasive Software Composition method},"Kezadri, Mounira and Pantel, Marc and Combemale, Benoit and Thirioux, Xavier","Composition technologies improve reuse in the development of large-scale complex systems. Safety critical systems require intensive validation and verification activities. These activities should be compositional in order to reduce the amount of residual verification activities that must be conducted on the composite in addition to the ones conducted on each component. In order to ensure the correctness of compositional verification and assess the minimality of the residual verification, the contribution proposes to use formal specification and verification at the composition operator level. A first experiment was conducted in [15] using proof assistants to formalize the generic composition technology ISC and prove that type checking was compositional. This contribution extends our early work to handle full model conformance and study the mandatory residual verification. It shows that ISC operators are not fully compositional with respect to conformance and provides the minimal preconditions on the operators mandatory to ensure compositional conformance. The appropriate operators from ISC (especially bind) have been implemented in the COQ4MDE framework that provides a full implementation of MOF in the COQ proof assistant. Expected properties, respectively residual verifications, are expressed as post-, respectively pre-, conditions for the composition operators. The correctness of the compositional verification is proven in COQ."
205,hal-03029429,2020,{Co-Evolving Code with Evolving Metamodels},"Khelladi, Djamel Eddine and Combemale, Benoit and Acher, Mathieu and Barais, Olivier and Jézéquel, Jean-Marc","Metamodels play a significant role to describe and analyze the relations between domain concepts. They are also a cornerstone to build a software language (SL) for a domain and its associated tooling. Metamodel definition generally drives code generation of a core API. The latter is further enriched by developers with additional code implementing advanced functionalities, e.g., checkers, recommenders, etc. When an SL is evolved to the next version, the metamodels are evolved as well before regenerating the core API code. As a result, the developers' added code, both in the core API and the SL tooling, may be impacted and thus may need to be co-evolved accordingly. Many approaches support the co-evolution of various artifacts when metamodels evolve. However, none supports the co-evolution of code. This paper fills this gap. We propose a semi-automatic co-evolution approach based on change propagation. The premise is that knowledge of the metamodel evolution changes can be propagated by means of resolutions to drive the code co-evolution. Our approach leverages the abstraction level of metamodels, where a given metamodel element often has different usages in the code. It supports alternative co-evolutions to meet different developers' needs. Our work is evaluated on three Eclipse SL implementations, namely OCL, Modisco, and Papyrus, over several evolved versions of metamodels and code. In response to five different evolved metamodels, we co-evolved 976 impacts over 18 projects. A comparison of our co-evolved code with the versioned ones shows the usefulness of our approach. Our approach was able to reach a weighted average of 87.4% and 88.9% respectively of precision and recall while supporting useful alternative co-evolutions that developers have manually performed."
206,hal-02326034,2019,{Detecting and Exploring Side Effects when Repairing Model Inconsistencies},"Khelladi, Djamel Eddine and Kretschmer, Roland and Egyed, Alexander",
207,hal-03029426,2020,{On the Power of Abstraction: a Model-Driven Co-evolution Approach of Software Code},"Khelladi, Djamel Eddine and Combemale, Benoit and Acher, Mathieu and Barais, Olivier","Model-driven software engineering fosters abstraction through the use of models and then automation by transforming them into various artefacts, in particular to code, for example: 1) from architectural models to code, 2) from metamodels to API code (with EMF in Eclipse), 3) from entity models to front-end and back-end code in Web stack applications (with JHipster), etc. In all these examples, the generated code is usually enriched by developers with additional code implementing advanced functionalities (e.g., checkers, recommenders, etc.) to build a full coherent system. When the system must evolve, so do the models, from which the code is regenerated. As a result, the developers' enriched code may be impacted and thus needs to co-evolve accordingly. Many approaches support the co-evolution of various artifacts, but not the co-evolution of code. This paper sheds light on this issue and envisions to fill this gap. We formulate the hypothesis that the code co-evolution can be driven by the model changes by means of change propagation. To investigate this hypothesis, we implemented a prototype for the case of metamodels and their accompanying code in EMF Eclipse. As a preliminary evaluation, we considered the case of the OCL Pivot metamodel evolution and its code co-evolution in two projects from version 3.2.2 to 3.4.4. Preliminary results confirm our hypothesis that model-driven evolution changes can effectively drive the code co-evolution. On 562 impacts caused in two projects' code by 221 metamodel changes, our approach was able to reach an average of 89% and 92.5% respectively of precision and recall."
208,hal-03770004,2022,{Global Decision Making Over Deep Variability in Feedback-Driven Software Development},"Kienzle, Jörg and Combemale, Benoit and Mussbacher, Gunter and Alam, Omar and Bordeleau, Francis and Burgueño, Loli and Engels, Gregor and Galasso, Jessie and Jézéquel, Jean-Marc and Kemme, Bettina and Mosser, Sébastien and Sahraoui, Houari and Schiedermeier, Maximilian and Syriani, Eugene","To succeed with the development of modern software, organizations must have the agility to adapt faster to constantly evolving environments to deliver more reliable and optimized solutions that can be adapted to the needs and environments of their stakeholders including users, customers, business, development, and IT. However, stakeholders do not have sufficient automated support for global decision making, considering the increasing variability of the solution space, the frequent lack of explicit representation of its associated variability and decision points, and the uncertainty of the impact of decisions on stakeholders and the solution space. This leads to an ad-hoc decision making process that is slow, error-prone, and often favors local knowledge over global, organization-wide objectives. The Multi-Plane Models and Data (MP-MODA) framework explicitly represents and manages variability, impacts, and decision points. It enables automation and tool support in aid of a multi-criteria decision making process involving different stakeholders within a feedback-driven software development process where feedback cycles aim to reduce uncertainty. We present the conceptual structure of the framework, discuss its potential benefits, and enumerate key challenges related to tool supported automation and analysis within MP-MODA."
209,hal-01287720,2016,{VCU: The Three Dimensions of Reuse},"Kienzle, Jörg and Mussbacher, Gunter and Alam, Omar and Schöttle, Matthias and Belloir, Nicolas and Collet, Philippe and Combemale, Benoit and Deantoni, Julien and Klein, Jacques and Rumpe, Bernhard","Reuse, enabled by modularity and interfaces, is one of the most important concepts in software engineering. This is evidenced by an increasingly large number of reusable artifacts, ranging from small units such as classes to larger, more sophisticated units such as components, services, frameworks, software product lines, and concerns. This paper presents evidence that a canonical set of reuse interfaces has emerged over time: the variation, customization, and usage interfaces (VCU). A reusable artifact that provides all three interfaces reaches the highest potential of reuse, as it explicitly exposes how the artifact can be manipulated during the reuse process along these three dimensions. We demonstrate the wide applicability of the VCU interfaces along two axes: across abstraction layers of a system specification and across existing reuse techniques. The former is shown with the help of a comprehensive case study including reusable requirements, software, and hardware models for the authorization domain. The latter is shown with a discussion on how the VCU interfaces relate to existing reuse techniques."
210,hal-01241673,2016,{A Formal Modeling and Analysis Framework for Software Product Line of Preemptive Real-Time Systems},"Kim, Jin Hyun and Legay, Axel and Traonouez, Louis-Marie and Acher, Mathieu and Kang, Sungwon","This paper presents a formal analysis framework to analyze a family of platform products w.r.t. real-time properties. First, we propose an extension of the widely-used feature model, called Property Feature Model (PFM), that distinguishes features and properties explicitly. Second, we present formal behavioral models of components of a real-time scheduling unit such that all real-time scheduling units implied by a PFM are automatically composed to be analyzed against the properties given by the PFM. We apply our approach to the verification of the schedulability of a family of scheduling units using the symbolic and statistical model checkers of Uppaal."
211,hal-03921373,2022,{Risk Explorer for Software Supply Chains},"Ladisa, Piergiorgio and Plate, Henrik and Martinez, Matias and Barais, Olivier and Ponta, Serena Elisa","Supply chain attacks on open-source projects aim at injecting and spreading malicious code such that it is executed by direct and indirect downstream users. Recent work systematized the knowledge about such attacks and proposed a taxonomy in the form of an attack tree. We propose a visualization tool called Risk Explorer for Software Supply Chains, which allows inspecting the taxonomy of attack vectors, their descriptions, references to real-world incidents and other literature, as well as information about associated safeguards. Being open-source itself, the community can easily reference new attacks, accommodate for entirely new attack vectors or reflect the development of new safeguards."
212,hal-03921362,2022,{Towards the Detection of Malicious Java Packages},"Ladisa, Piergiorgio and Plate, Henrik and Martinez, Matias and Barais, Olivier and Ponta, Serena Elisa","Open-source software supply chain attacks aim at infecting downstream users by poisoning open-source packages. The common way of consuming such artifacts is through package repositories and the development of vetting strategies to detect such attacks is ongoing research. Despite its popularity, the Java ecosystem is the least explored one in the context of supply chain attacks. In this paper, we present simple-yet-effective indicators of malicious behavior that can be observed statically through the analysis of Java bytecode. Then we evaluate how such indicators and their combinations perform when detecting malicious code injections. We do so by injecting three malicious payloads taken from real-world examples into the Top-10 most popular Java libraries from libraries.io. We found that the analysis of strings in the constant pool and of sensitive APIs in the bytecode instructions aids in the task of detecting malicious Java packages by significantly reducing the amount of information to review, thus also making manual triage possible."
213,hal-01285470,2016,{Beauty and the Beast: Diverting modern web browsers to build unique browser fingerprints},"Laperdrix, Pierre and Rudametkin, Walter and Baudry, Benoit","Worldwide, the number of people and the time spent browsing the web keep increasing. Accordingly, the technologies to enrich the user experience are evolving at an amazing pace. Many of these evolutions provide for a more interactive web (e.g., boom of JavaScript libraries, weekly innovations in HTML5), a more available web (e.g., explosion of mobile devices), a more secure web (e.g., Flash is disappearing, NPAPI plugins are being deprecated), and a more private web (e.g., increased legislation against cookies, huge success of extensions such as Ghostery and AdBlock). Nevertheless, modern browser technologies, which provide the beauty and power of the web, also provide a darker side, a rich ecosystem of exploitable data that can be used to build unique browser fingerprints. Our work explores the validity of browser fingerprinting in today's environment. Over the past year, we have collected 118,934 fingerprints composed of 17 attributes gathered thanks to the most recent web technologies. We show that innovations in HTML5 provide access to highly discriminating attributes, notably with the use of the Canvas API which relies on multiple layers of the user's system. In addition, we show that browser fingerprinting is as effective on mobile devices as it is on desktops and laptops, albeit for radically different reasons due to their more constrained hardware and software environments. We also evaluate how browser fingerprinting could stop being a threat to user privacy if some technological evolutions continue (e.g., disappearance of plugins) or are embraced by browser vendors (e.g., standard HTTP headers)."
214,hal-01527580,2017,{FPRandom: Randomizing core browser objects to break advanced device fingerprinting techniques},"Laperdrix, Pierre and Baudry, Benoit and Mishra, Vikas","The rich programming interfaces (APIs) provided by web browsers can be diverted to collect a browser fingerprint. A small number of queries on these interfaces are sufficient to build a fingerprint that is statistically unique and very stable over time. Consequently, the fingerprint can be used to track users. Our work aims at mitigating the risk of browser fingerprinting for users' privacy by 'breaking' the stability of a fingerprint over time. We add randomness in the computation of selected browser functions, in order to have them deliver slightly different answers for each browsing session. Randomization is possible thanks to the following properties of browser implementations: (i) some functions have a nondeterministic specification, but a deterministic implementation; (ii) multimedia functions can be slightly altered without deteriorating the user's perception. We present FPRandom, a modified version of Firefox that adds randomness to mitigate the most recent fingerprinting algorithms, namely canvas fingerprinting, AudioContext fingerprinting and the unmasking of browsers through the order of JavaScript properties. We evaluate the effectiveness of FPRandom by testing it against known fingerprinting tests. We also conduct a user study and evaluate the performance overhead of randomization to determine the impact on the user experience."
215,hal-01121108,2015,{Mitigating browser fingerprint tracking: multi-level reconfiguration and diversification},"Laperdrix, Pierre and Rudametkin, Walter and Baudry, Benoit","The diversity of software components (e.g., browsers, plugins, fonts) is a wonderful opportunity for users to customize their platforms. Yet, massive customization creates a privacy issue: browsers are slightly different from one another, allowing third parties to collect unique and stable fingerprints to track users. Although software diversity appears to be the source of this privacy issue, we claim that this same diversity, combined with automatic reconfiguration, provides the essential ingredients to constantly change browsing platforms. Constant change acts as a moving target defense strategy against fingerprint tracking by breaking one essential property: stability over time. We leverage virtualization and modular architectures to automatically assemble and reconfigure software components at multiple levels. We operate on operating systems, browsers, fonts and plugins. This work is the first application of software reconfiguration to build a moving target defense against browser fingerprint tracking. The main objective is to automatically modify the fingerprint a platform exhibits. We have developed a prototype called Blink to experiment with the effectiveness of our approach at randomizing fingerprints. We have assembled and reconfigured thousands of platforms, and we observe that all of them exhibit different fingerprints, and that commercial fingerprinting solutions are not able to detect that the different platforms actually correspond to a single user."
216,hal-01222999,2015,{Coping with Semantic Variation Points in Domain-Specific Modeling Languages},"Latombe, Florent and Crégut, Xavier and Deantoni, Julien and Pantel, Marc and Combemale, Benoit","Even if they exhibit differences, many Domain-Specific Modeling Languages (DSMLs) share elements from their concepts, notations and semantics. StateCharts is a well known family of DSMLs that share many concepts but exhibit notational differences and many execution semantics variants (called Semantic Variation Points – SVPs –). For instance, when two conflicting transitions in a state machine are enabled by the same event occurrence, which transition is fired depends on the language variant (Harel original StateCharts, UML, Rhapsody, etc.) supported by the execution tool. Tools usually provide only one implementation of SVPs. This complicates communication both between tools and end-users, and hinders the co-existence of multiple variants. More generally, Language Workbenches dedicated to the specification and implementation of eXecutable Domain-Specific Modeling Languages (xDSMLs) often do not offer the tools and facilities to manage these SVPs, making it a time-consuming and troublesome activity. In this paper, we describe a modularized approach to the operational execution semantics of xDSMLs and show how it allows us to manage SVPs. We illustrate this proposal on StateCharts."
217,hal-01185911,2015,{Weaving Concurrency in eXecutable Domain-Specific Modeling Languages},"Latombe, Florent and Crégut, Xavier and Combemale, Benoit and Deantoni, Julien and Pantel, Marc","The emergence of modern concurrent systems (e.g., Cyber-Physical Systems or the Internet of Things) and highly-parallel platforms (e.g., many-core, GPGPU pipelines, and distributed platforms) calls for Domain-Specific Modeling Languages (DSMLs) where concurrency is of paramount importance. Such DSMLs are intended to propose constructs with rich concurrency semantics, which allow system designers to precisely define and analyze system behaviors. However, specifying and implementing the execution semantics of such DSMLs can be a difficult, costly and error-prone task. Most of the time the concurrency model remains implicit and ad-hoc, embedded in the underlying execution environment. The lack of an explicit concurrency model prevents: the precise definition, the variation and the complete understanding of the semantics of the DSML, the effective usage of concurrency-aware analysis techniques, and the exploitation of the concurrency model during the system refinement (e.g., during its allocation on a specific platform). In this paper, we introduce a concurrent executable metamodeling approach, which supports a modular definition of the execution semantics, including the concurrency model, the semantic rules, and a well-defined and expressive communication protocol between them. Our approach comes with a dedicated metalanguage to specify the communication protocol, and with an execution environment to simulate executable models. We illustrate and validate our approach with an implementation of fUML, and discuss the modularity and applicability of our approach."
218,hal-03764541,2022,{HyperAST: Enabling Efficient Analysis of Software Histories at Scale},"Le Dilavrec, Quentin and Khelladi, Djamel Eddine and Blouin, Arnaud and Jézéquel, Jean-Marc","Abstract Syntax Trees (ASTs) are widely used beyond compilers in many tools that measure and improve code quality, such as code analysis, bug detection, mining code metrics, refactoring. With the advent of fast software evolution and multistage releases, the temporal analysis of an AST history is becoming useful to understand and maintain code. However, jointly analyzing thousands of versions of ASTs independently faces scalability issues, mostly combinatorial, both in terms of memory and CPU usage. In this paper, we propose a novel type of AST, called HyperAST, that enables efficient temporal code analysis on a given software history by: 1/ leveraging code redundancy through space (between code elements) and time (between versions); 2/ reusing intermediate computation results. We show how the HyperAST can be built incrementally on a set of commits to capture all multiple ASTs at once in an optimized way. We evaluated the HyperAST on a curated list of large software projects. Compared to Spoon, a state-of-the-art technique, we observed that the HyperAST outperforms it with an order-of-magnitude difference from ×6 up to ×8076 in CPU construction time and from ×12 up to ×1159 in memory footprint. While the HyperAST requires up to 2 h 22 min and 7.2 GB for the biggest project, Spoon requires up to 93 h and 31 min and 2.2 TB. The gains in construction time varied from 83.4 % to 99.99 % and the gains in memory footprint varied from 91.8 % to 99.9 %. We further compared the task of finding references of declarations with the HyperAST and Spoon. We observed on average 90 % precision and 97 % recall without a significant difference in search time."
219,hal-03340174,2021,{Untangling Spaghetti of Evolutions in Software Histories to Identify Code and Test Co-evolutions},"Le Dilavrec, Quentin and Khelladi, Djamel Eddine and Blouin, Arnaud and Jézéquel, Jean-Marc","Version Control Systems are key elements of modern software development. They provide the history of software systems, serialized as lists of commits. Practitioners may rely on this history to understand and study the evolutions of software systems, including the co-evolution amongst strongly coupled development artifacts such as production code and their tests. However, a precise identification of code and test co-evolutions requires practitioners to manually untangle spaghetti of evolutions. In this paper, we propose an automated approach for detecting co-evolutions between code and test, independently of the commit history. The approach creates a sound knowledge base of code and test co-evolutions that practitioners can use for various purposes in their projects. We conducted an empirical study on a curated set of 45 open-source systems having Git histories. Our approach exhibits a precision of 100 % and an underestimated recall of 37.5 % in detecting the code and test co-evolutions. Our approach also spotted different kinds of code and test co-evolutions, including some of those researchers manually identified in previous work."
220,hal-02398053,2019,{Experience Report on the Development of a Specialized Multi-view Multi-stakeholder Model-Based Engineering Framework},"Le Guernic, Gurvan",
221,hal-01401849,2016,{Industrial Experience Report on the Formal Specification of a Packet Filtering Language Using the K Framework},"Le Guernic, Gurvan and Combemale, Benoit and Galindo Duarte, José Angel","Many project-specific languages, including in particular filtering languages, are defined using non-formal specifications written in natural languages. This leads to ambiguities and errors in the specification of those languages. This paper reports on an industrial experiment on using a tool-supported language specification framework (K) for the formal specification of the syntax and semantics of a filtering language having a complexity similar to those of real-life projects. This experimentation aims at estimating, in a specific industrial setting, the difficulty and benefits of formally specifying a packet filtering language using a tool-supported formal approach."
222,hal-01613873,2017,{AGENT: Automatic Generation of Experimental Protocol Runtime},"Le Moulec, Gwendal and Argelaguet Sanz, Ferran and Gouranton, Valérie and Blouin, Arnaud and Arnaldi, Bruno","Due to the nature of Virtual Reality (VR) research, conducting experiments in order to validate the researcher's hypotheses is a must. However, the development of such experiments is a tedious and time-consuming task. In this work, we propose to make this task easier, more intuitive and faster with a method able to describe and generate the most tedious components of VR experiments. The main objective is to let experiment designers focus on their core tasks: designing, conducting, and reporting experiments. To that end, we propose the use of Domain-Specific Languages (DSLs) to ease the description and generation of VR experiments. An analysis of published VR experiments is used to identify the main properties that characterize VR experiments. This allowed us to design AGENT (Automatic Generation of ExperimeNtal proTocol runtime), a DSL for specifying and generating experimental protocol runtimes. We demonstrated the feasibility of our approach by using AGENT on two experiments published in the VRST'16 proceedings."
223,hal-01374140,2016,{A Decision-making Process for Exploring Architectural Variants in Systems Engineering},"Le Noir, Jérome and Madelénat, Sébastien and Labreuche, Christophe and Constant, Olivier and Gailliard, Grégory and Acher, Mathieu and Barais, Olivier","In systems engineering, practitioners shall explore numerous architectural alternatives until choosing the most adequate variant. The decision-making process is most of the time a manual, time-consuming, and error-prone activity. The exploration and justification of architectural solutions is ad-hoc and mainly consists in a series of tries and errors on the modeling assets. In this paper, we report on an industrial case study in which we apply variability modeling techniques to automate the assessment and comparison of several candidate architectures (variants). We first describe how we can use a model-based approach such as the Common Variability Language (CVL) to specify the architectural variability. We show that the selection of an architectural variant is a multi-criteria decision problem in which there are numerous interactions (veto, favor, complementary) between criteria. We present a tooled process for exploring architectural variants integrating both CVL and the MYRIAD method for assessing and comparing variants based on an explicit preference model coming from the elicitation of stakeholders' concerns. This solution allows understanding differences among variants and their satisfactions with respect to criteria. Beyond improving the automation of variant selection, the results of this experiment highlight that the approach improves rationality in the assessment and provides decision arguments when selecting the preferred variants."
224,hal-01890446,2018,{Modular Language Composition for the Masses},"Leduc, Manuel and Degueule, Thomas and Combemale, Benoit","The goal of modular language development is to enable the definition of new languages as assemblies of pre-existing ones. Recent approaches in this area are plentiful but usually suffer from two main problems: either they do not support modular language composition both at the specification and implementation levels, or they require advanced knowledge of specific paradigms which hampers wide adoption in the industry. In this paper, we introduce a non-intrusive approach to modular development of language concerns with well-defined interfaces that can be composed modularly at the specification and implementation levels. We present an implementation of our approach atop the Eclipse Modeling Framework, namely Alex, an object-oriented metalanguage for semantics definition and language composition. We evaluate Alex in the development of a new DSL for IoT systems modeling resulting from the composition of three independently defined languages (UML activity diagrams, Lua, and the OMG Interface Description Language). We evaluate the effort required to implement and compose these languages using Alex with regard to similar approaches from the literature."
225,hal-01568169,2017,{Revisiting Visitors for Modular Extension of Executable DSMLs},"Leduc, Manuel and Degueule, Thomas and Combemale, Benoit and van Der Storm, Tijs and Barais, Olivier","Executable Domain-Specific Modeling Languages (xDSMLs) are typically defined by metamodels that specify their abstract syntax, and model interpreters or compilers that define their execution semantics. To face the proliferation of xDSMLs in many domains, it is important to provide language engineering facilities for opportunistic reuse, extension, and customization of existing xDSMLs to ease the definition of new ones. Current approaches to language reuse either require to anticipate reuse, make use of advanced features that are not widely available in programming languages, or are not directly applicable to metamodel-based xDSMLs. In this paper, we propose a new language implementation pattern, named Revisitor, that enables independent extensibility of the syntax and semantics of metamodel-based xDSMLs with incremental compilation and without anticipation. We seamlessly implement our approach alongside the compilation chain of the Eclipse Modeling Framework, thereby demonstrating that it is directly and broadly applicable in various modeling environments. We show how it can be employed to incrementally extend both the syntax and semantics of the fUML language without requiring anticipation or re-compilation of existing code, and with acceptable performance penalty compared to classical handmade visitors."
226,hal-01910139,2018,{Fostering metamodels and grammars within a dedicated environment for HPC: the NabLab environment (tool demo)},"Lelandais, Benoît and Oudot, Marie-Pierre and Combemale, Benoit","Advanced and mature language workbenches have been proposed in the past decades to develop Domain-Specific Languages (DSL) and rich associated environments. They all come in various flavors, mostly depending on the underlying technological space (e.g., grammarware or modelware). However, when the time comes to start a new DSL project, it often comes with the choice of a unique technological space which later bounds the possible expected features. In this tool paper, we introduce NabLab, a full-fledged industrial environment for scientific computing and High Performance Computing (HPC), involving several metamodels and grammars. Beyond the description of an industrial experience of the development and use of tool-supported DSLs, we report in this paper our lessons learned, and demonstrate the benefits from usefully combining metamodels and grammars in an integrated environment."
227,hal-01308625,2016,{Automatic Detection of GUI Design Smells: The Case of Blob Listener},"Lelli, Valéria and Blouin, Arnaud and Baudry, Benoit and Coulon, Fabien and Beaudoux, Olivier","Graphical User Interfaces (GUIs) intensively rely on event-driven programming: widgets send GUI events, which capture users' interactions, to dedicated objects called controllers. Controllers implement several GUI listeners that handle these events to produce GUI commands. In this work, we conducted an empirical study on 13 large Java Swing open-source software systems. We study to what extent the number of GUI commands that a GUI listener can produce has an impact on the change- and fault-proneness of the GUI listener code. We identify a new type of design smell, called Blob listener, that characterizes GUI listeners that can produce more than two GUI commands. We show that 21 % of the analyzed GUI controllers are Blob listeners. We propose a systematic static code analysis procedure that searches for Blob listeners, which we implement in InspectorGuidget. We conducted experiments on six software systems for which we manually identified 37 instances of Blob listener. InspectorGuidget successfully detected 36 Blob listeners out of 37. The results exhibit a precision of 97.37 % and a recall of 97.59 %. Finally, we propose coding practices to avoid the use of Blob listeners."
228,hal-01114724,2015,{Classifying and Qualifying GUI Defects},"Lelli, Valéria and Blouin, Arnaud and Baudry, Benoit","Graphical user interfaces (GUIs) are integral parts of software systems that require interactions from their users. Software testers have paid special attention to GUI testing in the last decade, and have devised techniques that are effective in finding several kinds of GUI errors. However, the introduction of new types of interactions in GUIs (e.g., direct manipulation) presents new kinds of errors that are not targeted by current testing techniques. We believe that to advance GUI testing, the community needs a comprehensive and high level GUI fault model, which incorporates all types of interactions. The work detailed in this paper establishes 4 contributions: 1) A GUI fault model designed to identify and classify GUI faults. 2) An empirical analysis for assessing the relevance of the proposed fault model against failures found in real GUIs. 3) An empirical assessment of two GUI testing tools (i.e. GUITAR and Jubula) against those failures. 4) GUI mutants we've developed according to our fault model. These mutants are freely available and can be reused by developers for benchmarking their GUI testing tools."
229,hal-01123647,2015,{On Model-Based Testing Advanced GUIs},"Lelli, Valéria and Blouin, Arnaud and Baudry, Benoit and Coulon, Fabien","Graphical User Interface (GUI) design is currently shifting from designing GUIs composed of standard widgets to designing GUIs relying on more natural interactions and ad hoc widgets. This shift is meant to support the advent of GUIs providing users with more adapted and natural interactions, and the support of new input devices such as multi-touch screens. Standard widgets (e.g. buttons) are more and more replaced by ad hoc ones (e.g. the drawing area of graphical editors), and interactions are shifting from mono-event (e.g. button pressures) to multi-event interactions (e.g. multi-touch and gesture-based interactions). As a consequence, the current GUI model-based testing approaches, which target event-based systems, show their limits when applied to test such new advanced GUIs. The work introduced in this paper establishes three contributions: a precise analysis of the reasons of these current limits; a proposition to tackle the identified limits by leveraging the Malai GUI specification language and by proposing the concept of interaction-action-flow graph; feedback from two use cases, an industrial project and an open-source application, where the proposed approach has been applied."
230,hal-01651801,2017,{Create and Play your Pac-Man Game with the GEMOC Studio (Tool Demonstration)},"Leroy, Dorian and Bousse, Erwan and Wimmer, Manuel and Combemale, Benoit and Schwinger, Wieland","Executable Domain-Specific Languages (DSLs) are used for defining the behaviors of systems. The operational semantics of such DSLs may define how conforming models react to stimuli from their environment. This commonly requires adapting the semantics to define both the possible domain-level stimuli, and their handling during the execution. However, manually adapting the semantics for such cross-cutting concern is a complex and error-prone task. In this paper, we present an approach and a tool addressing this problem by augmenting the operational semantics for handling stimuli, and by automatically generating a complete behavioral language interface from this augmentation. At runtime, this interface can receive stimuli sent to models, and can safely handle them by interrupting the execution flow. This tool has been developed for the GEMOC Studio, a language and modeling workbench for executable DSLs. We demonstrate how it can be used to implement a Pac-Man DSL enabling the creation and execution of Pac-Man games."
231,hal-03358061,2021,{Monilogging for Executable Domain-Specific Languages},"Leroy, Dorian and Lelandais, Benoît and Oudot, Marie-Pierre and Combemale, Benoit","Runtime monitoring and logging are fundamental techniques for analyzing and supervising the behavior of computer programs. However, supporting these techniques for a given language induces significant development costs that can hold language engineers back from providing adequate logging and monitoring tooling for new domain-specific modeling languages. Moreover, runtime monitoring and logging are generally considered as two different techniques: they are thus implemented separately which makes users prone to overlooking their potentially beneficial mutual interactions. We propose a language-agnostic, unifying framework for runtime monitoring and logging and demonstrate how it can be used to define loggers, runtime monitors and combinations of the two, aka. moniloggers. We provide an implementation of the framework that can be used with Java-based executable languages, and evaluate it on 2 implementations of the NabLab interpreter, leveraging in turn the instrumentation facilities offered by Truffle, and those offered by AspectJ."
232,hal-01803031,2018,{Trace Comprehension Operators for Executable DSLs},"Leroy, Dorian and Bousse, Erwan and Megna, Anaël and Combemale, Benoit and Wimmer, Manuel","Recent approaches contribute facilities to breathe life into metamodels, thus making behavioral models directly executable. Such facilities are particularly helpful to better utilize a model over the time dimension, e.g., for early validation and verification. However, when even a small change is made to the model, to the language definition (e.g., semantic variation points), or to the external stimuli of an execution scenario, it remains difficult for a designer to grasp the impact of such a change on the resulting execution trace. This prevents accessible trade-off analysis and design-space exploration on behavioral models. In this paper, we propose a set of formally defined operators for analyzing execution traces. The operators include dynamic trace filtering, trace comparison with diff computation and visualization, and graph-based view extraction to analyze cycles. The operators are applied and validated on a demonstrative example that highlights their usefulness for the comprehension of specific aspects of the underlying traces."
233,hal-03624309,2022,{Beware of the Interactions of Variability Layers When Reasoning about Evolution of MongoDB},"Lesoil, Luc and Acher, Mathieu and Blouin, Arnaud and Jézéquel, Jean-Marc","With commits and releases, hundreds of tests are run on varying conditions (e.g., over different hardware and workloads) that can help to understand evolution and ensure non-regression of software performance. We hypothesize that performance is not only sensitive to evolution of software, but also to different variability layers of its execution environment, spanning the hardware, the operating system, the build, or the workload processed by the software. Leveraging the MongoDB dataset, our results show that changes in hardware and workload can drastically impact performance evolution and thus should be taken into account when reasoning about evolution. An open problem resulting from this study is how to manage the variability layers in order to efficiently test the performance evolution of a software."
234,hal-03084276,2021,{Deep Software Variability: Towards Handling Cross-Layer Configuration},"Lesoil, Luc and Acher, Mathieu and Blouin, Arnaud and Jézéquel, Jean-Marc","Configuring software is a powerful means to reach functional and performance goals of a system. However, many layers (hardware, operating system, input data, etc.), themselves subject to variability, can alter performances of software configurations. For instance, configurations' options of the x264 video encoder may have very different effects on x264's encoding time when used with different input videos, depending on the hardware on which it is executed. In this vision paper, we coin the term deep software variability to refer to the interaction of all external layers modifying the behavior or non-functional properties of a software. Deep software variability challenges practitioners and researchers: the combinatorial explosion of possible executing environments complicates the understanding, the configuration, the maintenance, the debug, and the test of configurable systems. There are also opportunities: harnessing all variability layers (and not only the software layer) can lead to more efficient systems and configuration knowledge that truly generalizes to any usage and context."
235,hal-03286127,2021,{The Interplay of Compile-time and Run-time Options for Performance Prediction},"Lesoil, Luc and Acher, Mathieu and Tërnava, Xhevahire and Blouin, Arnaud and Jézéquel, Jean-Marc","Many software projects are configurable through compile-time options (e.g., using ./configure) and also through run-time options (e.g., command-line parameters, fed to the software at execution time). Several works have shown how to predict the effect of run-time options on performance. However it is yet to be studied how these prediction models behave when the software is built with different compile-time options. For instance, is the best run-time configuration always the best w.r.t. the chosen compilation options? In this paper, we investigate the effect of compile-time options on the performance distributions of 4 software systems. There are cases where the compiler layer effect is linear, which is an opportunity to generalize performance models or to tune and measure runtime performance at lower cost. We also prove there can exist an interplay by exhibiting a case where compile-time options significantly alter the performance distributions of a configurable system."
236,hal-03514984,2022,{Transferring Performance between Distinct Configurable Systems : A Case Study},"Lesoil, Luc and Martin, Hugo and Acher, Mathieu and Blouin, Arnaud and Jézéquel, Jean-Marc","Many research studies predict the performance of configurable software using machine learning techniques, thus requiring large amounts of data. Transfer learning aims to reduce the amount of data needed to train these models and has been successfully applied on different executing environments (hardware) or software versions. In this paper we investigate for the first time the idea of applying transfer learning between distinct configurable systems. We design a study involving two video encoders (namely x264 and x265) coming from different code bases. Our results are encouraging since transfer learning outperforms traditional learning for two performance properties (out of three). We discuss the open challenges to overcome for a more general application."
237,hal-03355162,2021,"{DataTime: a Framework to smoothly Integrate Past, Present and Future into Models}","Lyan, Gauthier and Jézéquel, Jean-Marc and Gross-Amblard, David and Combemale, Benoit","Models at runtime have been initially investigated for adaptive systems. Models are used as a reflective layer of the current state of the system to support the implementation of a feedback loop. More recently, models at runtime have also been identified as key for supporting the development of full-fledged digital twins. However, this use of models at runtime raises new challenges, such as the ability to seamlessly interact with the past, present and future states of the system. In this paper, we propose a framework called DataTime to implement models at runtime which capture the state of the system according to the dimensions of both time and space, here modeled as a directed graph where both nodes and edges bear local states (i.e., values of properties of interest). DataTime provides a unifying interface to query the past, present and future (predicted) states of the system. This unifying interface provides i) an optimized structure of the time series that capture the past states of the system, possibly evolving over time, ii) the ability to get the last available value provided by the system’s sensors, and iii) a continuous micro-learning over graph edges of a predictive model to make it possible to query future states, either locally or more globally, thanks to a composition law. The framework has been developed and evaluated in the context of the Intelligent Public Transportation Systems of the city of Rennes (France). This experimentation has demonstrated how DataTime can deprecate the use of heterogeneous tools for managing data from the past, the present and the future, and facilitate the development of digital twins."
238,hal-03356199,2021,{On the Quality of Compositional Prediction for Prospective Analytics on Graphs},"Lyan, Gauthier and Gross-Amblard, David and Jézéquel, Jean-Marc","Recently, micro-learning has been successfully applied to various scenarios, such as graph optimization (e.g. power grid management). In these approaches, ad-hoc models of local data are built instead of one large model on the overall data set. Micro-learning is typically useful for incremental, what-if/prospective scenarios, where one has to perform step-by-step decisions based on local properties. A common feature of these applications is that the predicted properties (such as speed of a bus line) are compositions of smaller parts (e.g. the speed on each bus inter-stations along the line). But little is known about the quality of such predictions when generalized at a larger scale. In this paper we propose a generic technique that embeds machine-learning for graph-based compositional prediction, that allows 1) the prediction of the behaviour of composite objects, based on the predictions of their sub-parts and appropriate composition rules, and 2) the production of rich prospective analytics scenarios, where new objects never observed before can be predicted based on their simpler parts. We show that the quality of such predictions compete with macro-learning ones, while enabling prospective scenarios. We assess our work on a real size, operational bus network data set."
239,hal-03335263,2021,{A comparison of performance specialization learning for configurable systems},"Martin, Hugo and Acher, Mathieu and Pereira, Juliana Alves and Jézéquel, Jean-Marc","The specialization of the configuration space of a software system has been considered for targeting specific configuration profiles, usages, deployment scenarios, or hardware settings. The challenge is to find constraints among options' values that only retain configurations meeting a performance objective. Since the exponential nature of configurable systems makes a manual specialization unpractical, several approaches have considered its automation using machine learning, i.e., measuring a sample of configurations and then learning what options' values should be constrained. Even focusing on learning techniques based on decision trees for their built-in explainability, there is still a wide range of possible approaches that need to be evaluated, i.e., how accurate is the specialization with regard to sampling size, performance thresholds, and kinds of configurable systems. In this paper, we compare six learning techniques: three variants of decision trees (including a novel algorithm) with and without the use of model-based feature selection. We first perform a study on 8 configurable systems considered in previous related works and show that the accuracy reaches more than 90% and that feature selection can improve the results in the majority of cases. We then perform a study on the Linux kernel and show that these techniques perform as well as on the other systems. Overall, our results show that there is no one-size-fits-all learning variant (though high accuracy can be achieved): we present guidelines and discuss tradeoffs."
240,hal-02287459,2019,{Machine Learning and Configurable Systems: A Gentle Introduction},"Martin, Hugo and Pereira, Juliana Alves and Acher, Mathieu and Temple, Paul","The goal of this tutorial is to give an introduction to how machine learning can be used to support activities related to the engineering of configurable systems and software product lines. To the best of our knowledge, this is the first practical tutorial in this trending field. The tutorial is based on a systematic literature review and includes practical tasks (specialization, performance prediction) on real-world systems (VaryLaTeX, x264)."
241,hal-01720519,2018,{Towards Estimating and Predicting User Perception on Software Product Variants},"Martinez, Jabier and Sottet, Jean-Sébastien and García Frey, Alfonso and Bissyandé, Tegawendé and Ziadi, Tewfik and Klein, Jacques and Temple, Paul and Acher, Mathieu and Le Traon, Yves","Estimating and predicting user subjective perceptions on software products is a challenging, yet increasingly important, endeavour. As an extreme case study, we consider the problem of exploring computer-generated art object combinations that will please the maximum number of people. Since it is not feasible to gather feedbacks for all art products because of a combinatorial explosion of possible configurations as well as resource and time limitations, the challenging objective is to rank and identify optimal art product variants that can be generated based on their average likability. We present the use of Software Product Line (SPL) techniques for gathering and leveraging user feedbacks within the boundaries of a variability model. Our approach is developed in two phases: 1) the creation of a data set using a genetic algorithm and real feedback and 2) the application of a data mining technique on this data set to create a ranking enriched with confidence metrics. We perform a case study of a real-world computer-generated art system. The results of our approach on the arts domain reveal interesting directions for the analysis of user-specific qualities of SPLs."
242,hal-04084097,2023,{Caught in the Game: On the History and Evolution of Web Browser Gaming},"Mehanna, Naif and Rudametkin, Walter","Web browsers have come a long way since their inception, evolving from a simple means of displaying text documents over the network to complex software stacks with advanced graphics and network capabilities. As personal computers grew in popularity, developers jumped at the opportunity to deploy cross-platform games with centralized management and a low barrier to entry. Simply going to the right address is now enough to start a game. From text-based to GPU-powered 3D games, browser gaming has evolved to become a strong alternative to traditional console and mobile-based gaming, targeting both casual and advanced gamers. Browser technology has also evolved to accommodate more demanding applications, sometimes even supplanting functions typically left to the operating system. Today, websites display rich, computationally intensive, hardware-accelerated graphics, allowing developers to build ever-more impressive applications and games. In this paper, we present the evolution of browser gaming and the technologies that enabled it, from the release of the first text-based games in the early 1990s to current open-world and game-engine-powered browser games. We discuss the societal impact of browser gaming and how it has allowed a new target audience to access digital gaming. Finally, we review the potential future evolution of the browser gaming industry."
243,hal-01284822,2016,{Puzzle: A tool for analyzing and extracting specification clones in DSLs},"Méndez-Acuña, David and Galindo Duarte, José Angel and Combemale, Benoit and Blouin, Arnaud and Baudry, Benoit","The use of domain-specific languages (DSLs) is a successful technique in the development of complex systems. Indeed, the construction of new DSLs addressing the particular needs of software projects has become a recurrent activity. In this context, the phenomenon of specification cloning has started to appear. Language designers often copy&paste some parts of the specification from legacy DSLs to ""reuse"" formerly defined language constructs. As is well known, this type of practice introduces problems such as bug propagation, thus increasing maintenance costs. In this paper, we present Puzzle, a tool that uses static analysis to facilitate the detection of specification clones in DSLs implemented under the executable metamodeling paradigm. Puzzle also enables the extraction of specification clones as reusable language modules that can be later used to build up new DSLs."
244,hal-02982981,2018,{Reverse engineering language product lines from existing DSL variants},"Méndez-Acuña, David and Galindo, José and Combemale, Benoit and Blouin, Arnaud and Baudry, Benoit",
245,hal-01284816,2016,{Reverse-engineering reusable language modules from legacy domain-specific languages},"Méndez-Acuña, David and Galindo Duarte, José Angel and Combemale, Benoit and Blouin, Arnaud and Baudry, Benoit and Le Guernic, Gurvan","The use of domain-specific languages (DSLs) has become a successful technique in the development of complex systems. Nevertheless, the construction of this type of languages is time-consuming and requires highly-specialized knowledge and skills. An emerging practice to facilitate this task is to enable reuse through the definition of language modules which can be later put together to build up new DSLs. Still, the identification and definition of language modules are complex and error-prone activities, thus hindering the reuse exploitation when developing DSLs. In this paper, we propose a computer-aided approach to i) identify potential reuse in a set of legacy DSLs; and ii) capitalize such potential reuse by extracting a set of reusable language modules with well-defined interfaces that facilitate their assembly. We validate our approach by using realistic DSLs coming from industrial case studies and obtained from public GitHub repositories."
246,hal-01077834,2014,{Variability Management in Domain-Specific Languages},"Méndez-Acuña, David","Domain-specific languages (DSLs) have demonstrated their capability to reduce the gap between the problem domain and the technical decisions during the software development process. However, building a DSL is not an easy task because it requires specialized knowledge and skills. Moreover, the challenge becomes even more complex in the context of multi-domain companies where several domains coexist across the business units and, consequently, there is a need of dealing not only with isolated DSLs but also with families of DSLs. To deal with this complexity, the research community has been working on the definition of approaches that use the ideas of Software Product Lines Engineering (SPLE) for building and maintaining families of DSLs. In this paper, we present a PhD thesis that aims to contribute to this effort. In particular, we explain the challenges that need to be addressed during the process of going from a family of DSLs to a software language line. Then, we briefly discuss the state of the art, and finally we introduce a research plan."
247,hal-01188800,2015,{A Precise Metamodel for Open Cloud Computing Interface},"Merle, Philippe and Barais, Olivier and Parpaillon, Jean and Plouzeau, Noël and Tata, Samir","Open Cloud Computing Interface (OCCI) proposes one of the first widely accepted, community-based, open standards for managing any kinds of cloud resources. But as it is specified in natural language, OCCI is imprecise, ambiguous, incomplete, and needs a precise definition of its core concepts. Indeed, the OCCI Core Model has conceptual drawbacks: an imprecise semantics of its type classification system, a nonextensible data type system for OCCI attributes, a vague and limited extension concept and the absence of a configuration concept. To tackle these issues, this paper proposes a precise metamodel for OCCI. This metamodel defines rigorously the static semantics of the OCCI core concepts, of a precise type classification system, of an extensible data type system, and of both extension and configuration concepts. This metamodel is based on the Eclipse Modeling Framework (EMF), its structure is encoded with Ecore and its static semantics is rigorously defined with Object Constraint Language (OCL). As a consequence, this metamodel provides a concrete language to precisely define and exchange OCCI models. The validation of our metamodel is done on the first worldwide dataset of OCCI extensions already published in the literature, and addressing inter-cloud networking, infrastructure, platform, application, service management, cloud monitoring, and autonomic computing domains, respectively. This validation highlights simplicity, consistency, correctness, completeness, and usefulness of the proposed metamodel."
248,hal-02990897,2020,{An Asset-Based Assistance for Secure by Design},"Messe, Nan and Belloir, Nicolas and Chiprianov, Vanea and El-Hachem, Jamal and Fleurquin, Régis and Sadou, Salah","With the growing numbers of security attacks causing more and more serious damages in software systems, security cannot be added as an afterthought in software development. It has to be built in from the early development phases such as requirement and design. The role responsible for designing a software system is termed an ""architect"", knowledgeable about the system architecture design, but not always well-trained in security. Moreover, involving other security experts into the system design is not always possible due to time-to-market and budget constraints. To address these challenges, we propose to define an asset-based security assistance in this paper, to help architects design secure systems even if these architects have limited knowledge in security. This assistance helps to alert about threats, and to integrate the security controls over vulnerable parts of the system into the architecture model. The central concept enabling this assistance is that of asset. We apply our proposal on a telemonitoring case study to show that automating such an assistance is feasible."
249,hal-02990919,2020,{Asset-Oriented Threat Modeling},"Messe, Nan and Chiprianov, Vanea and Belloir, Nicolas and El-Hachem, Jamal and Fleurquin, Régis and Sadou, Salah","Threat modeling is recognized as one of the most important activities in software security. It helps to address security issues in software development. Several threat modeling processes are widely used in the industry such as the one of Microsoft SDL. In threat modeling, it is essential to first identify assets before enumerating threats, in order to diagnose the threat targets and spot the protection mechanisms. Asset identification and threat enumeration are collaborative activities involving many actors such as security experts and software architects. These activities are traditionally carried out in brainstorming sessions. Due to the lack of guidance, the lack of a sufficiently formalized process, the high dependence on actors' knowledge, and the variety of actors' background, these actors often have difficulties collaborating with each other. Brainstorming sessions are thus often conducted sub-optimally and require significant effort. To address this problem, we aim at structuring the asset identification phase by proposing a systematic asset identification process, which is based on a reference model. This process structures and identifies relevant assets, facilitating the threat enumeration during brainstorming. We illustrate the proposed process with a case study and show the usefulness of our process in supporting threat enumeration and improving existing threat modeling processes such as the Microsoft SDL one."
250,hal-02948867,2019,{Development of Secure System of Systems Needing a Rapid Deployment},"Messe, Nan and Belloir, Nicolas and Chiprianov, Vanea and Cherfa, Imane and Fleurquin, Régis and Sadou, Salah","In certain cases, such as secure humanitarian corridors in a conflict zone, a special type of SoS, needing a rapid deployment, has to be developed. Because of the tense time constraint, usually only a domain expert is responsible for this development. However, many such SoSs also have to take into account the security aspect. How to help a domain expert integrate the security aspect into the rapid development of an SoS? In this proposal paper, we present an approach and a tool suite that help the domain expert tag business assets using security properties, which are then used to identify vulnerabilities and to propose possible security control mechanisms. We illustrate our proposal on a case study."
251,hal-01248160,2015,"{Polymer: A Model-Driven Approach for Simpler, Safer, and Evolutive Multi-Objective Optimization Development}","Moawad, Assaad and Hartmann, Thomas and Fouquet, François and Nain, Grégory and Klein, Jacques and Bourcier, Johann","Multi-Objective Evolutionary Algorithms (MOEAs) have been successfully used to optimize various domains such as finance, science, engineering, logistics and software engineering. Nevertheless, MOEAs are still very complex to apply and require detailed knowledge about problem encoding and mutation operators to obtain an effective implementation. Software engineering paradigms such as domain-driven design aim to tackle this complexity by allowing domain experts to focus on domain logic over technical details. Similarly, in order to handle MOEA complexity, we propose an approach, using model-driven software engineering (MDE) techniques, to define fitness functions and mutation operators without MOEA encoding knowledge. Integrated into an open source modelling framework, our approach can significantly simplify development and maintenance of multi-objective optimizations. By leveraging modeling methods, our approach allows reusable optimizations and seamlessly connects MOEA and MDE paradigms. We evaluate our approach on a cloud case study and show its suitability in terms of i) complexity to implement an MOO problem, ii) complexity to adapt (maintain) this implementation caused by changes in the domain model and/or optimization goals, and iii) show that the efficiency and effectiveness of our approach remains comparable to ad-hoc implementations."
252,hal-01356104,2016,{A Generative Middleware for Heterogeneous and Distributed Services},"Morin, Brice and Fleurey, Franck and Husa, Knut Eilif and Barais, Olivier","Modern software-based services increasingly rely on a highly heterogeneous and dynamic interconnection of platforms and devices offering a wide diversity of capabilities ranging from cloud server with virtually unlimited resources down to micro-controllers with only a few KB of RAM. This paper motivates the fact that no single software framework or software engineering approach is suited to span across this range, and proposes an approach which leverages the latest advances in model-driven engineering, generative techniques and models@runtime in order to tame this tremendous heterogeneity. This paper presents a set of languages dedicated to the integration, deployment and continuous operation of existing libraries and components already available and implemented in various languages. The proposed approach is validated on an industrial case study in the eHealth domain, implemented by an industrial partner that provides a qualitative evaluation of the approach. This case study involves a large number of sensors, devices and gateways based on Raspberry Pi, Intel Edison and Arduino."
253,hal-03284626,2021,{Extending the Identification of Object-Oriented Variability Implementations using Usage Relationships},"Mortara, Johann and Tërnava, Xhevahire and Collet, Philippe and Dery-Pinna, Anne-Marie","Many variability-rich object-oriented systems rely on multiple traditional techniques (inheritance, patterns) to implement their variability in a single codebase. These variability implementation places are neither explicit nor documented, hampering their detection and variability comprehension. Based on the identification of symmetry property in seven implementation techniques, a first approach was proposed with symfinder to automatically identify and display the variability of a system in a graph-based visualization structured by inheritance. However, composition, or more generally the usage relationship, is extensively used to implement the variability in object-oriented systems, and without this information, comprehending the large amount of variability identified by symfinder is not trivial. In this paper, we present symfinder-2, an extension of the former approach that incorporates the usage relationships to better identify potential variability implementations. We provide two ways to mark classes as entry points, user-defined and automatic, so that the visualization is filtered and enables users to have a better focus when they identify variability. We also report on the evaluation of this extension to ten open-source Java-based systems."
254,hal-01228715,2015,{Discovering Model Transformation Pre-conditions using Automatically Generated Test Models},"Mottu, Jean-Marie and Sen, Sagar and Cadavid, Juan and Baudry, Benoit","Specifying a model transformation is challenging as it must be able to give a meaningful output for any input model in a possibly infinite modeling domain. Transformation preconditions constrain the input domain by rejecting input models that are not meant to be transformed by a model transformation. This paper presents a systematic approach to discover such preconditions when it is hard for a human developer to foresee complex graphs of objects that are not meant to be transformed. The approach is based on systematically generating a finite number of test models using our tool, PRAMANA to first cover the input domain based on input domain partitioning. Tracing a transformation's execution reveals why some preconditions are missing. Using a benchmark transformation from simplified UML class diagram models to RDBMS models we discover new preconditions that were not initially specified."
255,hal-01862964,2018,{A Temporal Model for Interactive Diagnosis of Adaptive Systems},"Mouline, Ludovic and Benelallam, Amine and Fouquet, François and Bourcier, Johann and Barais, Olivier","The evolving complexity of adaptive systems impairs our ability to deliver anomaly-free solutions. Fixing these systems requires a deep understanding of the reasons behind the decisions which led to faulty or suboptimal system states. Developers thus need diagnosis support that traces system states to the previous circumstances –targeted requirements, input context– that had resulted in these decisions. However, the lack of efficient temporal representation limits the tracing ability of current approaches. To tackle this problem, we describe a novel temporal data model to represent, store and query decisions as well as their relationship with the knowledge (context, requirements, and actions). We validate our approach through a use case based on the smart grid at Luxembourg."
256,hal-01723451,2018,{Enabling Temporal-Aware Contexts for Adaptative Distributed Systems},"Mouline, Ludovic and Benelallam, Amine and Hartmann, Thomas and Fouquet, François and Bourcier, Johann and Morin, Brice and Barais, Olivier","Distributed adaptive systems are composed of federated entities offering remote inspection and reconfiguration abilities. This is often realized using a MAPE-K loop, which constantly evaluates system and environmental parameters and derives corrective actions if necessary. The OpenStack Watcher project uses such a loop to implement resource optimization services for multi-tenant clouds. To ensure a timely reaction in the event of failures, the MAPE-K loop is executed with a high frequency. A major drawback of such reactivity is that many actions, e.g., the migration of containers in the cloud, take more time to be effective and their effects to be measurable than the MAPE-K loop execution frequency. Unfinished actions as well as their expected effects over time are not taken into consideration in MAPE-K loop processes, leading upcoming analysis phases to potentially take sub-optimal actions. In this paper, we propose an extended context representation for the MAPE-K loop that integrates the history of planned actions as well as their expected effects over time into the context representations. This information can then be used during the upcoming analysis and planning phases to compare measured and expected context metrics. We demonstrate on a cloud elasticity manager case study that such temporal action-aware context leads to improved reasoners while still being highly scalable."
257,hal-01609796,2017,{Weaving Rules into Models@run.time for Embedded Smart Systems},"Mouline, Ludovic and Hartmann, Thomas and Fouquet, François and Le Traon, Yves and Bourcier, Johann and Barais, Olivier","Smart systems are characterised by their ability to analyse measured data in live and to react to changes according to expert rules. Therefore, such systems exploit appropriate data models together with actions, triggered by domain-related conditions. The challenge at hand is that smart systems usually need to process thousands of updates to detect which rules need to be triggered, often even on restricted hardware like a Raspberry Pi. Although various approaches have been investigated to efficiently check conditions on data models, they either assume that the data model fits into main memory or rely on high latency persistence storage systems that severely damage the reactivity of smart systems. To tackle this challenge, we propose a novel composition process, which weaves executable rules into a data model with lazy loading abilities. We quantitatively show, on a smart building case study, that our approach can handle, at low latency, big sets of rules on top of large-scale data models on restricted hardware."
258,hal-01156595,2014,{The relevance of model-driven engineering thirty years from now},"Mussbacher, Gunter and Amyot, Daniel and Breu, Ruth and Bruel, Jean-Michel and Cheng, Betty and Collet, Philippe and Combemale, Benoit and France, Robert B. and Heldal, Rogardt and Hill, James and Kienzle, Jörg and Schöttle, Matthias and Steimann, Friedrich and Stikkolorum, Dave and Whittle, Jon","Although model-driven engineering (MDE) is now an established approach for developing complex software systems, it has not been universally adopted by the software industry. In order to better understand the reasons for this, as well as to identify future opportunities for MDE, we carried out a week-long design thinking experiment with 15 MDE experts. Participants were facilitated to identify the biggest problems with current MDE technologies, to identify grand challenges for society in the near future, and to identify ways that MDE could help to address these challenges. The outcome is a reflection of the current strengths of MDE, an outlook of the most pressing challenges for society at large over the next three decades, and an analysis of key future MDE research opportunities."
259,hal-02925142,2020,{Towards an Assessment Grid for Intelligent Modeling Assistance},"Mussbacher, Gunter and Combemale, Benoit and Abrahão, Silvia and Bencomo, Nelly and Burgueño, Loli and Engels, Gregor and Kienzle, Jörg and Kühn, Thomas and Mosser, Sébastien and Sahraoui, Houari and Weyssow, Martin","The ever-growing complexity of systems, the growing number of stakeholders, and the corresponding continuous emergence of new domain-specific modeling abstractions has led to significantly higher cognitive load on modelers. There is an urgent need to provide modelers with better, more Intelligent Modeling Assistants (IMAs). An important factor to consider is the ability to assess and compare, to learn from existing and inform future IMAs, while potentially combining them. Recently, a conceptual Reference Framework for Intelligent Modeling Assistance (RF-IMA) was proposed. RF-IMA defines the main required components and high-level properties of IMAs. In this paper, we present a detailed, level-wise definition for the properties of RF-IMA to enable a better understanding, comparison, and selection of existing and future IMAs. The proposed levels are a first step towards a comprehensive assessment grid for intelligent modeling assistance. For an initial validation of the proposed levels, we assess the existing landscape of intelligent modeling assistance and three future scenarios of intelligent modeling assistance against these levels."
260,hal-01611324,2017,{Modélisation et Évaluation de la Sécurité des IHM},"Ndiaye, Youssou and Aillery, Nicolas and Barais, Olivier and Blouin, Arnaud and Bouabdallah, Ahmed","Human-Computer Interactions (HCI) are part of the Information System and allow users to interact with the system. HCI allow the manipulation of sensitive data such as users' credentials. A lack of security or a dysfunction of such HCI can lead to security flaws. Software testing techniques can help to find problems in HCI, but they face several limits. This thesis, entitled ""Modeling and Assessing the HCI Security"", aims at providing new methods and tools to software engineers in order to improve the security of HCI."
261,hal-02087663,2019,{Requirements for preventing logic flaws in the authentication procedure of web applications},"Ndiaye, Youssou and Barais, Olivier and Blouin, Arnaud and Bouabdallah, Ahmed and Aillery, Nicolas","Ensuring security is one of the most daunting challenges that web applications are facing nowadays. Authentication and authorization are two main security fields that web applications must consider to be protected against unauthorized accesses. Various approaches that detect well-known vulnerabilities and flaws exist. However, these approaches mainly focus on detecting input validation flaws. Another kind of flaw that affects web applications is logic flaws, but they lack consideration. This paper proposes an approach that helps to consider logic flaws in the context of web applications. The goal of the proposal is to strengthen the authentication procedure of web applications and thus enforce the security early in the design phase. We conducted an empirical study in nine well-known web-based applications to demonstrate that logic flaws may put at risk the authentication procedure. The results showed that logic flaws may be either caused by security issues or usability issues. To overcome such flaws, we provide ten relevant requirements that should be followed in the design of an authentication procedure."
262,hal-03149211,2021,{SE-PAC: A Self-Evolving PAcker Classifier against rapid packers evolution},"Noureddine, Lamine and Heuser, Annelie and Puodzius, Cassius and Zendra, Olivier","Packers are widespread tools used by malware authors to hinder static malware detection and analysis. Identifying the packer used to pack a malware is essential to properly unpack and analyze the malware, be it manually or automatically. While many well-known packers are used, there is a growing trend for new custom packers that make malware analysis and detection harder. Research works have been very effective in identifying known packers or their variants, with signature-based, supervised machine learning or similarity-based techniques. However, identifying new packer classes remains an open problem. This paper presents a self-evolving packer classifier that provides an effective, incremental, and robust solution to cope with the rapid evolution of packers. We propose a composite pairwise distance metric combining different types of packer features. We derive an incremental clustering approach able to identify both (variants of) known packer classes and new ones, as well as to update clusters automatically and efficiently. Our system thus continuously enhances, integrates, adapts and evolves packer knowledge. Moreover, to optimize post clustering packer processing costs, we introduce a new post clustering strategy for selecting small subsets of relevant samples from the clusters. Our approach effectiveness and time-resilience are assessed with: 1) a real-world malware feed dataset composed of 16k packed binaries, comprising 29 unique packers, and 2) a synthetic dataset composed of 19k manually crafted packed binaries, comprising 31 unique packers (including custom ones)."
263,hal-01243146,2015,{Enhancing Cloud Energy Models for Optimizing Datacenters Efficiency},"Outin, Edouard and Dartois, Jean-Emile and Barais, Olivier and Pazat, Jean-Louis","Due to high electricity consumption in the Cloud datacenters, providers aim at maximizing energy efficiency through VM consolidation, accurate resource allocation or adjusting VM usage. More generally, the provider attempts to optimize resource utilization. However, while minimizing expenses, the Cloud operator still needs to conform to SLA constraints negotiated with customers (such as latency, downtime, affinity, placement, response time or duplication). Consequently, optimizing a Cloud configuration is a multi-objective problem. As a nontrivial multi-objective optimization problem, there does not exist a single solution that simultaneously optimizes each objective. There exists a (possibly infinite) number of Pareto optimal solutions. Evolutionary algorithms are popular approaches for generating Pareto optimal solutions to a multi-objective optimization problem. Most of these solutions use a fitness function to assess the quality of the candidates. However, regarding the energy consumption estimation, the fitness function can be approximative and lead to some imprecisions compared to the real observed data. This paper presents a system that uses a genetic algorithm to optimize Cloud energy consumption and machine learning techniques to improve the fitness function regarding a real distributed cluster of servers. We have carried out experiments on the OpenStack platform to validate our solution. This experimentation shows that the machine learning produces an accurate energy model, predicting precise values for the simulation."
264,hal-01356099,2016,{Seeking for the Optimal Energy Modelisation Accuracy to Allow Efficient Datacenter Optimizations},"Outin, Edouard and Dartois, Jean-Emile and Barais, Olivier and Pazat, Jean-Louis","As cloud computing is being more and more used, datacenters play a large role in the overall energy consumption. We propose to tackle this problem, by continuously and autonomously optimizing the cloud datacenters energy efficiency. To this end, modeling the energy consumption for these infrastructures is crucial to drive the optimization process, anticipate the effects of aggressive optimization policies, and to determine precisely the gains brought with the planned optimization. Yet, it is very complex to model with accuracy the energy consumption of a physical device as it depends on several factors. Do we need a detailed and fine-grained energy model to perform good optimizations in the datacenter? Or is a simple and naive energy model good enough to propose viable energy-efficient optimizations? Through experiments, our results show that we don't get energy savings compared to classical bin-packing strategies but there are some gains in using precise modeling: better utilization of the network and the VM migration processes."
265,hal-01243158,2014,{Using Models@Runtime to embed an Energetic Cloud Simulator in a MAPE-K Loop},"Outin, Edouard and Pazat, Jean-Louis and Barais, Olivier","With the democratization of the Cloud and its ever more intensive use, datacenters have become notable energy consumers. In this context, autonomic computing aims to provide an adequate and sustainable answer by offering autonomous adaptation and optimization mechanisms. The problem addressed in this work is the following: how can we build a desynchronizable reflection model for a distributed Cloud system that allows simulation engines to be plugged in at runtime?"
266,hal-00983046,2014,{A Prediction-Driven Adaptation Approach for Self-Adaptive Sensor Networks},"Paez Anaya, Ivan Dario and Simko, Viliam and Bourcier, Johann and Plouzeau, Noël and Jézéquel, Jean-Marc","Engineering self-adaptive software in unpredictable environments such as pervasive systems, where network availability, remaining battery power and environmental conditions may vary over the lifetime of the system, is a very challenging task. Many current software engineering approaches leverage run-time architectural models to ease the design of the autonomic control loop of these self-adaptive systems. While these approaches perform well in reacting to various evolutions of the runtime environment, implementations based on reactive paradigms have a limited ability to anticipate problems, leading to transient unavailability of the system, useless costly adaptations, or resource waste. In this paper, we follow a proactive self-adaptation approach that aims at overcoming the limitations of reactive approaches. Based on predictive analysis of internal and external context information, our approach regulates new architecture reconfigurations and deploys them using models at runtime. We have evaluated our approach on a case study where we combined hourly temperature readings provided by the National Climatic Data Center (NCDC) with fire reports from the Moderate Resolution Imaging Spectroradiometer (MODIS) and simulated the behavior of multiple systems. The results confirm that our proactive approach outperforms a typical reactive system in scenarios with seasonal behavior."
267,hal-01188826,2015,{OCCIware - A Formal and Tooled Framework for Managing Everything as a Service},"Parpaillon, Jean and Merle, Philippe and Barais, Olivier and Dutoo, Marc and Paraiso, Fawaz","The OCCIware project aims at building a comprehensive, coherent, yet modular model-driven toolchain for managing any kind of cloud resources, especially Data Center as a Service, Deployment as a Service, Big Data as a Service, and Linked Open Data as a Service. Leveraging the Open Cloud Computing Interface (OCCI) and its core model, the OCCIware toolchain applies a model-driven engineering approach based on a formal model of cloud resources and systems. This approach allows for better modularity and a clear separation between functional concerns (cloud resources) and non-functional concerns (security, scalability, reliability, etc.). The project brings together ten French partners (academics, SMEs, associations) and is supervised by a Strategic Orientation Committee of eleven top industrial and academic experts. The OCCIware project has been selected by the French Ministry of Industry and is funded by the French Banque Publique d'Investissement (BPI)."
268,hal-03020125,2020,{Machine Learning and Configurable Systems: A Gentle Introduction},"Pereira, Juliana Alves and Martin, Hugo and Temple, Paul and Acher, Mathieu","The goal of this tutorial is to give a gentle introduction to how machine learning can be used to support software product line configuration. This is our second practical tutorial in this trending field. The tutorial is based on a systematic literature review and includes practical tasks (specialization, performance and bug prediction) on real-world systems (Linux, VaryLaTeX, x264). The material is designed for academics and practitioners with basic knowledge in software product lines and machine learning."
269,hal-01427165,2016,{A Complexity Tale: Web Configurators},"Perrouin, Gilles and Acher, Mathieu and Davril, Jean-Marc and Legay, Axel and Heymans, Patrick","Online configurators are basically everywhere. From physical goods (cars, clothes) to services (cloud solutions, insurances, etc.), such configurators have pervaded many areas of everyday life, in order to provide customers with products tailored to their needs. As configurators are sometimes the only interfaces between product suppliers and consumers, much care has been devoted to their HCI aspects, aiming at offering an enjoyable buying experience. However, at the backend, the management of numerous and complex configuration options results from an ad-hoc process rather than a systematic variability-aware engineering approach. We present our experience in analysing web configurators and formalising configuration options in terms of feature models or product configuration matrices. We also consider behavioural issues and perspectives on their architectural design."
270,hal-01406507,2016,{Featured model types: Towards Systematic Reuse in Modelling Language Engineering},"Perrouin, Gilles and Amrani, Moussa and Acher, Mathieu and Combemale, Benoit and Legay, Axel and Schobbens, Pierre-Yves","By analogy with software product reuse, the ability to reuse (meta)models and model transformations is key to achieve better quality and productivity. To this end, various opportunistic reuse techniques have been developed, such as higher-order transformations, metamodel adaptation, and model types. However, in contrast to software product development that has moved to systematic reuse by adopting (model-driven) software product lines, we are not quite there yet for modelling languages, missing economies of scope and automation opportunities. Our vision is to transpose the product line paradigm at the metamodel level, where reusable assets are formed by metamodel and transformation fragments and ""products"" are reusable language building blocks (model types). We introduce featured model types to concisely model variability amongst metamodelling elements, enabling configuration, automated analysis, and derivation of tailored model types."
271,hal-01243021,2015,{ATSyRa: An Integrated Environment for Synthesizing Attack Trees},"Pinchinat, Sophie and Acher, Mathieu and Vojtisek, Didier","Attack trees are widely considered in the field of security for the analysis of risks (or threats) against electronics, computer control, or physical systems. A major barrier is that attack trees can become very complex and thus hard to specify. This paper presents ATSyRA, a tooling environment to automatically synthesize attack trees of a system under study. ATSyRA provides advanced editors to specify high-level descriptions of a system, high-level actions to structure the tree, and ways to interactively refine the synthesis. We illustrate how users can specify a military building, abstract and organize attacks, and eventually obtain a readable attack tree."
272,hal-01064645,2014,{Towards Synthesis of Attack Trees for Supporting Computer-Aided Risk Analysis},"Pinchinat, Sophie and Acher, Mathieu and Vojtisek, Didier","Attack trees are widely used in the field of defense for the analysis of risks (or threats) against electronic systems, computer control systems or physical systems. Based on the analysis of attack trees, practitioners can define actions to take in order to reduce or eliminate risks. A major barrier to supporting computer-aided risk analysis is that attack trees can become very complex and thus hard to specify. This paper is a first step towards a methodology, formal foundations as well as automated techniques to synthesize attack trees from a high-level description of a system. Attacks are expressed as a succession of elementary actions and high-level actions can be used to abstract and organize attacks into exploitable attack trees. We describe our tooling support and identify open challenges for supporting the analysis of risks."
273,hal-01545557,2017,{Efficient and Complete FD-Solving for Extended Array Constraints},"Plazar, Quentin and Acher, Mathieu and Bardin, Sébastien and Gotlieb, Arnaud","Array constraints are essential for handling data structures in automated reasoning and software verification. Unfortunately, the use of a typical finite domain (FD) solver based on local consistency-based filtering has strong limitations when constraints on indexes are combined with constraints on array elements and size. This paper proposes an efficient and complete FD-solving technique for extended constraints over (possibly unbounded) arrays. We describe a simple but particularly powerful transformation for building an equisatisfiable formula that can be efficiently solved using standard FD reasoning over arrays, even in the unbounded case. Experiments show that the proposed solver significantly outperforms FD solvers, and successfully competes with the best SMT-solvers."
274,hal-01991857,2019,{Uniform Sampling of SAT Solutions for Configurable Systems: Are We There Yet?},"Plazar, Quentin and Acher, Mathieu and Perrouin, Gilles and Devroey, Xavier and Cordy, Maxime","Uniform or near-uniform generation of solutions for large satisfiability formulas is a problem of theoretical and practical interest for the testing community. Recent works proposed two algorithms (namely UniGen and QuickSampler) for reaching a good compromise between execution time and uniformity guarantees, with empirical evidence on SAT benchmarks. In the context of highly-configurable software systems (e.g., Linux), it is unclear whether UniGen and QuickSampler can scale and sample uniform software configurations. In this paper, we perform a thorough experiment on 128 real-world feature models. We find that UniGen is unable to produce SAT solutions out of such feature models. Furthermore, we show that QuickSampler does not generate uniform samples and that some features are either never part of the sample or too frequently present. Finally, using a case study, we characterize the impacts of these results on the ability to find bugs in a configurable system. Overall, our results suggest that we are not there: more research is needed to explore the cost-effectiveness of uniform sampling when testing large configurable systems."
275,hal-03379913,2021,{ChaT: Evaluation of Reconfigurable Distributed Network Systems Using Metamorphic Testing},"Pranata, Alif Akbar and Barais, Olivier and Bourcier, Johann and Noirie, Ludovic","Detecting faults in distributed network systems is challenging because of their complexity, but this is required to evaluate and improve their reliability. This paper proposes ChaT, a testing and evaluation methodology under system reconfigurations and perturbations for distributed network systems, to evaluate QoS reliability by discriminating safe and failure-prone behaviors from different testing scenarios. Motivated by the metamorphic testing technique, which removes the burden of defining software oracles, we propose metamorphic relationships that correlate system inputs and outputs to find patterns in executions. Classification techniques based on machine learning (principal component analysis and support vector machine) are used to identify system states and validate the proposed metamorphic relationships. These metamorphic relationships are also used to support anomaly detection. We verify this with several anomaly detection techniques (isolation forest, one-class SVM, local outlier factor, and robust covariance) that categorize experiments as belonging to either safe or failure-prone states. We apply ChaT to a video streaming application use case. The simulation results show the effectiveness of ChaT in achieving our goals: identifying execution classes and detecting failure-prone experiments based on metamorphic relationships with high statistical scores."
276,hal-03137874,2020,{Misconfiguration Discovery with Principal Component Analysis for Cloud-Native Services},"Pranata, Alif Akbar and Barais, Olivier and Bourcier, Johann and Noirie, Ludovic","Cloud applications and services have significantly increased the importance of system and service configuration activities. These activities include updating (i) these services, (ii) their dependencies on third parties, (iii) their configurations, (iv) the configuration of the execution environment, and (v) network configurations. The high frequency of updates results in significant configuration complexity that can lead to failures or performance drops. To mitigate these risks, service providers extensively rely on testing techniques, such as metamorphic testing, to detect these failures before moving to production. However, the development and maintenance of these tests are costly, especially the oracle, which must determine whether a system’s performance remains within acceptable boundaries. This paper explores the use of a learning method called Principal Component Analysis (PCA) to learn about acceptable performance metrics on cloud-native services and identify a metamorphic relationship between the nominal service behavior and the value of these metrics. We investigate the following research question: Is it possible to combine the metamorphic testing technique with learning methods on service monitoring data to detect error-prone reconfigurations before moving to production? We remove the developers’ burden to define a specific oracle in detecting these configuration issues. For validation, we applied this proposal on a distributed media streaming application whose authentication was managed by an external identity and access management service. This application illustrates both the heterogeneity of the technologies used to build this type of service and its large configuration space. Our proposal demonstrated the ability to identify error-prone reconfigurations using PCA."
277,hal-03328395,2021,{Accurate and Robust Malware Analysis through Similarity of External Calls Dependency Graphs (ECDG)},"Puodzius, Cassius and Zendra, Olivier and Heuser, Annelie and Noureddine, Lamine","Malware is a primary concern in cybersecurity, being one of the attacker's favorite cyberweapons. Over time, malware evolves not only in complexity but also in diversity and quantity. Malware analysis automation is thus crucial. In this paper we present ECDGs, a shorter call graph representation, and a new similarity function that is accurate and robust. Toward this goal, we revisit some principles of malware analysis research to define basic primitives and an evaluation paradigm aimed at the setup of more reliable experiments. Our benchmark shows that our similarity function is very efficient in practice, achieving speedup rates of 3.30x and 354.11x w.r.t. radiff2 for the standard and the cache-enhanced implementations, respectively. Our evaluations generate clusters that produce almost unerring results (homogeneity score of 0.983 for the accuracy phase) and marginal information loss for a highly polluted dataset (NMI score of 0.974 between initial and final clusters of the robustness phase). Overall, ECDGs and our similarity function enable autonomous frameworks for malware search and clustering that can assist human-based analysis or improve classification models for malware analysis."
278,hal-03615777,2022,{Incremental Build of Linux Kernel Configurations},"Randrianaina, Georges Aaron","Building software is a crucial task to compile, test, and deploy software systems while continuously ensuring quality. The Linux Kernel is the most configurable and complex system with more than 15,000 features. To speed up the building of such a large configuration set, and in contrast to the common workflow relying on only building clean configurations, we propose to incrementally build them. Initial results do not provide any optimal order to incrementally build configurations due to a high distance between them. However, we show it is possible to control the configurations generation process: reusing commonality can save up to 66% of build time compared to only clean builds."
279,hal-03547219,2022,{On the Benefits and Limits of Incremental Build of Software Configurations: An Exploratory Study},"Randrianaina, Georges Aaron and Tërnava, Xhevahire and Khelladi, Djamel Eddine and Acher, Mathieu","Software projects use build systems to automate the compilation, testing, and continuous deployment of their software products. As software becomes increasingly configurable, the build of multiple configurations is a pressing need, but expensive and challenging to implement. The current state of practice is to build independently (a.k.a., clean build) a software for a subset of configurations. While incremental build has been studied for software evolution and relatively small changes of the source code, it has surprisingly not been considered for software configurations. In this exploratory study, we examine the benefits and limits of building software configurations incrementally, rather than always building them cleanly. By using five real-life configurable systems as subjects, we explore whether incremental build works, outperforms a sequence of clean builds, is correct w.r.t. clean build, and can be used to find an optimal ordering for building configurations. Our results show that incremental build is feasible 100% of the time in four subjects and 78% of the time in one subject. On average, 88.5% of the configurations could be built faster with incremental build, while also finding several alternative faster incremental builds. However, only 60% of faster incremental builds are correct. Still, when combining those correct incremental builds with clean builds, we could always find an optimal order that is faster than just a collection of clean builds, with a gain of up to 11.76%."
280,hal-03558479,2022,{Towards Incremental Build of Software Configurations},"Randrianaina, Georges Aaron and Khelladi, Djamel Eddine and Zendra, Olivier and Acher, Mathieu","Building software is a crucial task to compile, test, and deploy software systems while continuously ensuring quality. As software is more and more configurable, building multiple configurations is a pressing need, yet costly and challenging to instrument. The common practice is to independently build (a.k.a., clean build) a software for a subset of configurations. While incremental build has been considered for software evolution and relatively small modifications of the source code, it has surprisingly not been considered for software configurations. In this vision paper, we formulate the hypothesis that incremental build can reduce the cost of exploring the configuration space of software systems. We detail how we apply incremental build for two real-world application scenarios and conduct a preliminary evaluation on two case studies, namely x264 and the Linux Kernel. For x264, we found that one can incrementally build configurations in an order such that overall build time is reduced. Nevertheless, we could not find any optimal order with the Linux Kernel, due to a high distance between random configurations. Therefore, we show it is possible to control the process of generating configurations: we could reuse commonality and gain up to 66% of build time compared to only clean builds."
281,hal-02019695,2015,{Une démarche pour l'assistance à l'utilisation des patrons de sécurité},"Regainia, Loukmen and Bouhours, Cédric and Salva, Sébastien","Application security is critical and essential for the preservation of personal data, and it must therefore be taken into account from the earliest phases of an application's life cycle. One way to do so is to take advantage of security patterns, which offer guidelines for the development of a secure, high-quality application. Nevertheless, choosing the right pattern and using it to address a security problem remain difficult for a developer who is not an expert in their use. In this paper we propose an approach to assist developers in checking whether a UML model composed of security patterns raises vulnerabilities. Our approach is based on a list of security patterns and, for each pattern, on a list of ""generic"" vulnerability properties. Thus, starting from a UML model composed of security patterns, our approach aims to check whether a security pattern can guarantee the absence of vulnerabilities in an application. The developer can thereby find out whether the UML model contains flaws that will end up in the implementation despite the use of security patterns. Moreover, our approach can also show that the UML model is badly designed or that one or more security patterns were damaged when they were used."
282,hal-01026148,2014,{Using Path-Dependent Types to Build Type Safe JavaScript Foreign Function Interfaces},"Richard-Foy, Julien and Barais, Olivier and Jézéquel, Jean-Marc","The popularity of statically typed programming languages compiling to JavaScript shows that there exists a fringe of the programmer population interested in leveraging the benefits of static typing to write Web applications. To be of any use, these languages need to statically expose the Web browser's dynamically typed native API, which seems to be a contradiction in terms. Indeed, we observe that existing statically typed languages compiling to JavaScript expose the browser API in ways that either are not type safe, or, when they are, typically overconstrain the programmers. This article presents new ways to encode the challenging parts of the Web browser API in static type systems such that both type safety and expressive power are preserved. Our first encoding relies on type parameters and can be implemented in most mainstream languages but drags phantom types up to the usage sites. The second encoding does not suffer from this inconvenience but requires the support of dependent types in the language."
283,hal-02315327,2019,"{Benefits of Energy Management Systems on local energy efficiency, an agricultural case study}","Rio, Alexandre and Maurel, Yoann and Bugni, Yoran and Barais, Olivier",
284,hal-01913169,2018,{Efficient use of local energy: An activity oriented modeling to guide Demand Side Management},"Rio, Alexandre and Maurel, Yoann and Barais, Olivier and Bugni, Yoran","Self-consumption of renewable energies is defined as electricity that is produced from renewable energy sources, not injected into the distribution or transmission grid or instantaneously withdrawn from the grid, and consumed by the owner of the power production unit or by associates directly contracted to the producer. Designing solutions in favor of self-consumption for small industries or city districts is challenging. It consists in designing an energy production system made of solar panels, wind turbines, and batteries that fits the annual weather prediction and the industrial or human activity. In this context, this paper reports on the context of this business domain, its challenges, and the application of modeling that leads to a solution. Through this article, we highlight the essentials of a domain-specific modeling language designed to let domain experts run their own simulations, we compare it with the practices that already exist in such a company, and we discuss the benefits and the limits of the use of modeling in such a context."
285,hal-01342353,2016,{Hacia el uso de sistemas de recomendación en sistemas de alta variabilidad},"Rodas, Jorge L. and Olivares, Javier and Galindo Duarte, José Angel and Benavides, David","High-variability systems are software systems that describe a very large number of configurations. There are high-variability systems that represent thousands of products. Managing the variability present in these systems is costly and often complicated. Recently, industry has seen notable growth of recommender systems in many areas, such as e-commerce and online advertising, among others. A recommender system is a software agent that makes predictions over a series of products so that they better fit the needs or tastes of a user. In this prospective paper, we propose merging these two fields of engineering to improve different facets of the management of high-variability systems."
286,hal-01204507,2015,{Towards testing variability intensive systems using user reviews},"Rodas, Jorge L. and Mendez Acuna, David and Galindo Duarte, José Angel and Benavides, David and Cardenas, Jessenia","Variability intensive systems are software systems that describe a large set of diverse and different configurations that share some characteristics. This high number of configurations makes testing such systems an expensive and error-prone task. For example, in the Android ecosystem we can find up to 24 different valid configurations, thus making it impossible to test an application on all of them. To alleviate this problem, previous research suggests the selection of a subset of test cases that maximize the chances of finding errors while maximizing the diversity of configurations. Concretely, the proposals focus on the prioritization and selection of tests, so only relevant configurations are tested according to some criterion. In this paper, we envision the use of user reports to prioritize and select meaningful tests. To do this, we explore the use of recommender systems as a possible improvement to the selection of test cases in variability-intensive systems."
287,hal-02407868,2019,{Approximate Loop Unrolling},"Rodriguez-Cancio, Marcelino and Combemale, Benoit and Baudry, Benoit",
288,hal-01343818,2016,{Automatic Microbenchmark Generation to Prevent Dead Code Elimination and Constant Folding},"Rodriguez-Cancio, Marcelino and Combemale, Benoit and Baudry, Benoit","Microbenchmarking consists of evaluating, in isolation, the performance of small code segments that play a critical role in large applications. The accuracy of a microbenchmark depends on two critical tasks: wrap the code segment into a payload that faithfully recreates the execution conditions that occur in the large application; build a scaffold that runs the payload a large number of times to get a statistical estimate of the execution time. While recent frameworks such as the Java Microbenchmark Harness (JMH) take care of the scaffold challenge, developers have very limited support to build a correct payload. In this work, we focus on the automatic generation of payloads, starting from a code segment selected in a large application. In particular, we aim at preventing two of the most common mistakes made in microbenchmarks: dead code elimination and constant folding. Since a microbenchmark is such a small program, if not designed carefully, it will be ""over-optimized"" by the JIT and result in distorted time measures. Our technique hence automatically extracts the segment into a compilable payload and generates additional code to prevent the risks of ""over-optimization"". The whole approach is embedded in a tool called AutoJMH, which generates payloads for JMH scaffolds. We validate the capabilities of AutoJMH, showing that the tool is able to process a large percentage of segments in real programs. We also show that AutoJMH can match the quality of payloads handwritten by performance experts and outperform those written by professional Java developers without experience in microbenchmarking."
289,hal-04108237,2023,"{The TeamPlay Project: Analysing and Optimising Time, Energy, and Security for Cyber-Physical Systems}","Rouxel, Benjamin and Brown, Christopher and Ebeid, Emad and Eder, Kerstin and Falk, Heiko and Grelck, Clemens and Holst, Jesper and Jadhav, Shashank and Marquer, Yoann and Alejandro, Marcos Martinez De and Nikov, Kris and Sahafi, Ali and Lundquist, Ulrik Pagh Schultz and Seewald, Adam and Vassalos, Vangelis and Wegener, Simon and Zendra, Olivier","Non-functional properties, such as energy, time, and security (ETS), are becoming increasingly important in Cyber-Physical Systems (CPS) programming. This article describes TeamPlay, a research project funded under the EU Horizon 2020 programme between January 2018 and June 2021. TeamPlay aimed to provide the system designer with a toolchain for developing embedded applications where ETS properties are first-class citizens, allowing the developer to reflect directly on energy, time and security properties at the source code level. In this paper we give an overview of the TeamPlay methodology, introduce the challenges and solutions of our approach and summarise the results achieved. Overall, applying our TeamPlay methodology led to improvements of up to 18% in performance and 52% in energy usage over traditional approaches."
290,hal-02545875,2020,{Loop Aggregation for Approximate Scientific Computing},"Sallou, June and Gauvain, Alexandre and Bourcier, Johann and Combemale, Benoit and de Dreuzy, Jean-Raynald","Trading off some accuracy for better performances in scientific computing is an appealing approach to ease the exploration of various alternatives on complex simulation models. Existing approaches involve the application of either time-consuming model reduction techniques or resource-demanding statistical approaches. Such requirements prevent any opportunistic model exploration, e.g., exploring various scenarios on environmental models. This limits the ability to analyse new models for scientists, to support trade-off analysis for decision-makers and to empower the general public towards informed environmental intelligence. In this paper, we present a new approximate computing technique, a.k.a. loop aggregation, which consists in automatically reducing the main loop of a simulation model by aggregating the corresponding spatial or temporal data. We apply this approximate scientific computing approach on a geophysical model of a hydraulic simulation with various input data. The experimentation demonstrates the ability to drastically decrease the simulation time while preserving acceptable results with a minimal set-up. We obtain a median speed-up of 95.13% and up to 99.78% across all the 23 case studies."
291,hal-01002099,2014,{Deriving Usage Model Variants for Model-based Testing: An Industrial Case Study},"Samih, Hamza and Acher, Mathieu and Bogusch, Ralf and Le Guen, Hélène and Baudry, Benoit","The strong cost pressure of the market and the safety issues faced by the aerospace industry affect development. Suppliers are forced to continuously optimize their life-cycle processes to facilitate the development of variants for different customers and shorten time to market. Additionally, industrial safety standards like RTCA/DO-178C require high efforts for testing single products. A suitably organized test process for Product Lines (PL) can meet these standards. In this paper, we propose an approach that adopts Model-based Testing (MBT) for PL. Usage models, a widely used MBT formalism that provides automatic test case generation capabilities, are equipped with variability information such that usage model variants can be derived for a given set of features. The approach is integrated in the professional MBT tool MaTeLo. We report on our experience gained from an industrial case study in the aerospace domain."
292,hal-01025159,2014,{MPLM -- MaTeLo Product Line Manager},"Samih, Hamza and Bogusch, Ralf","The diversity of requirements elicited from different customers leads to the development of many variants. Furthermore, compliance with safety standards as mandated for safety-critical systems requires high test efforts for each variant. Model-based testing aims to reduce test efforts by automatically generating test cases from test models. In this paper, we introduce variability management to usage models, a widely used model-based testing formalism. We present an approach that allows to derive usage model variants from a desired set of features and thus generate test cases for each variant. The approach is integrated in the industrial model-based testing tool chain MaTeLo and exemplified using an industrial case study from the aerospace domain."
293,hal-00927312,2014,{Comparing or Configuring Products: Are We Getting the Right Ones?},"Sannier, Nicolas and Bécan, Guillaume and Acher, Mathieu and Ben Nasr, Sana and Baudry, Benoit","Product comparators and configurators aim to assist customers in choosing a product that meets their expectations. While comparators present similarities and differences between competing products, configurators propose an assisted environment to gradually choose and customize products. The two systems have pros and cons and are inherently different. But both share the same variability information background and operate over a set of (possible) products, typically represented through product comparison matrices (PCMs). A key issue is that current PCMs have no clear semantics, making their analysis and transformations imprecise and hard. In this paper, we sketch a research plan for generating dedicated comparators or configurators from PCMs. The core of our vision is the use of formal variability models to encode PCMs and enable their further exploitation by developers of comparators or configurators. We elaborate on five research questions and describe the expected outputs of the research."
294,hal-00982065,2014,{INCREMENT: A Mixed MDE-IR Approach for Regulatory Requirements Modeling and Analysis},"Sannier, Nicolas and Baudry, Benoit","[Context and motivation] Regulatory requirements for Nuclear instrumentation and control (I&C) systems are first-class requirements. They are written by national safety entities and are completed through a large documentation set of national recommendation guides and national/international standards. [Question/Problem] I&C systems important to safety must comply with all of these requirements. The global knowledge of this domain is scattered through these different documents and not formalized. The organization and traceability relationships within this domain are mainly implicit. As a consequence, such long-lasting nuclear I&C projects set important challenges in terms of tacit expertise capitalization and domain analysis. [Principal ideas/results] To tackle this domain formalization issue, we propose a dual Model-driven Engineering (MDE) and Information Retrieval (IR) approach to address the nuclear regulatory requirements domain definition, and assisted traceability based on the acquired requirements model. [Contributions] In this paper, we present the Connexion metamodel that provides a canvas for the definition and capitalization of the nuclear regulatory requirements domain. We also present a hybrid MDE/IR-based approach, named INCREMENT, for acquiring, modeling and analyzing these regulatory requirements. This approach is supported by a tool that is developed in the context of the CONNEXION project, which gathers major French nuclear I&C industrial actors."
295,hal-01763369,2018,{Detection and Analysis of Behavioral T-patterns in Debugging Activities},"Soto-Valero, César and Bourcier, Johann and Baudry, Benoit","A growing body of research in empirical software engineering applies recurrent patterns analysis in order to make sense of the developers' behavior during their interactions with IDEs. However, the exploration of hidden real-time structures of programming behavior remains a challenging task. In this paper, we investigate the presence of temporal behavioral patterns (T-patterns) in debugging activities using the THEME software. Our preliminary exploratory results show that debugging activities are strongly correlated with code editing, file handling, window interactions and other general types of programming activities. The validation of our T-patterns detection approach demonstrates that debugging activities are performed on the basis of repetitive and well-organized behavioral events. Furthermore, we identify a large set of T-patterns that associate debugging activities with build success, which corroborates the positive impact of debugging practices on software development."
296,hal-02080248,2019,{The Emergence of Software Diversity in Maven Central},"Soto-Valero, César and Benelallam, Amine and Harrand, Nicolas and Barais, Olivier and Baudry, Benoit","Maven artifacts are immutable: an artifact that is uploaded on Maven Central cannot be removed nor modified. The only way for developers to upgrade their library is to release a new version. Consequently, Maven Central accumulates all the versions of all the libraries that are published there, and applications that declare a dependency towards a library can pick any version. In this work, we hypothesize that the immutability of Maven artifacts and the ability to choose any version naturally support the emergence of software diversity within Maven Central. We analyze 1,487,956 artifacts that represent all the versions of 73,653 libraries. We observe that more than 30% of libraries have multiple versions that are actively used by latest artifacts. In the case of popular libraries, more than 50% of their versions are used. We also observe that more than 17% of libraries have several versions that are significantly more used than the other versions. Our results indicate that the immutability of artifacts in Maven Central does support a sustained level of diversity among versions of libraries in the repository."
297,hal-01141395,2015,{Towards the use of slicing techniques for an efficient invariant checking},"Sun, Wuliang and Combemale, Benoit and France, Robert B.","In Model Driven Development (MDD), invariant checking involves determining whether a model is consistent with invariants defined in a metamodel. Such checking can improve developers' understanding of modeled aspects of complex systems and uncover structural errors in design models during the early stages of software development. General-purpose rigorous analysis tools that check invariants are likely to perform the analysis over the entire meta-model and model. Their scalability thus becomes an issue (e.g., the time used for checking can be up to several hours) with very large metamodels and models (e.g., more than 500,000 elements). In this paper we introduce model slicing within the invariant checking process , and use a slicing technique to reduce the size of checking inputs to improve the scalability of existing invariant checking tools. The evaluation we performed provides evidence that model slicing can significantly reduce the time to perform the invariant checking while preserving the checking results."
298,hal-01795144,2017,{Modeling business motivation and underlying processes for RAMI 4.0-aligned cyber-physical production systems},"Suri, Kunal and Cadavid, Juan and Alférez, Mauricio and Dhouib, Saadia and Tucci-Piergiovanni, Sara",
299,hal-01730163,2018,{Multimorphic Testing},"Temple, Paul and Acher, Mathieu and Jézéquel, Jean-Marc","The functional correctness of a software application is, of course, a prime concern, but other issues such as its execution time, precision, or energy consumption might also be important in some contexts. Systematically testing these quantitative properties is still extremely difficult, in particular, because there exists no method to tell the developer whether such a test set is ""good enough"" or even whether a test set is better than another one. This paper proposes a new method, called Multimorphic testing, to assess the relative effectiveness of a test suite for revealing performance variations of a software system. By analogy with mutation testing, our core idea is to vary software parameters, and to check whether it makes any difference on the outcome of the tests: i.e. are some tests able to ""kill"" bad morphs (configurations)? Our method can be used to evaluate the quality of a test suite with respect to a quantitative property of interest, such as execution time or computation accuracy."
300,hal-02287616,2019,{Towards Quality Assurance of Software Product Lines with Adversarial Configurations},"Temple, Paul and Acher, Mathieu and Perrouin, Gilles and Biggio, Battista and Jézéquel, Jean-Marc and Roli, Fabio","Software product line (SPL) engineers put a lot of effort to ensure that, through the setting of a large number of possible configuration options, products are acceptable and well-tailored to customers’ needs. Unfortunately, options and their mutual interactions create a huge configuration space which is intractable to exhaustively explore. Instead of testing all products, machine learning is increasingly employed to approximate the set of acceptable products out of a small training sample of configurations. Machine learning (ML) techniques can refine a software product line through learned constraints and a priori prevent non-acceptable products to be derived. In this paper, we use adversarial ML techniques to generate adversarial configurations fooling ML classifiers and pinpoint incorrect classifications of products (videos) derived from an industrial video generator. Our attacks yield (up to) a 100% misclassification rate and a drop in accuracy of 5%. We discuss the implications these results have on SPL quality assurance."
301,hal-01323446,2016,{Using Machine Learning to Infer Constraints for Product Lines},"Temple, Paul and Galindo Duarte, José Angel and Acher, Mathieu and Jézéquel, Jean-Marc","Variability intensive systems may include several thousand features allowing for an enormous number of possible configurations, including wrong ones (e.g. the derived product does not compile). For years, engineers have been using constraints to a priori restrict the space of possible configurations, i.e. to exclude configurations that would violate these constraints. The challenge is to find the set of constraints that would be both precise (allow all correct configurations) and complete (never allow a wrong configuration with respect to some oracle). In this paper, we propose the use of a machine learning approach to infer such product-line constraints from an oracle that is able to assess whether a given product is correct. We propose to randomly generate products from the product line, keeping for each of them its resolution model. Then we classify these products according to the oracle, and use their resolution models to infer cross-tree constraints over the product line. We validate our approach on a product-line video generator, using a simple computer vision algorithm as an oracle. We show that an interesting set of cross-tree constraints can be generated, with reasonable precision and recall."
302,hal-03527250,2022,{On the Interaction of Feature Toggles},"Tërnava, Xhevahire and Lesoil, Luc and Randrianaina, Georges Aaron and Khelladi, Djamel Eddine and Acher, Mathieu","Feature toggling is a technique for enabling branching-in-code. It is increasingly used during continuous deployment to incrementally test and integrate new features before their release. In principle, feature toggles tend to be light, that is, they are defined as simple Boolean flags and used in conditional statements to condition the activation of some software features. However, there is a lack of knowledge on whether and how they may interact with each other, in which case their enabling and testing become complex. We argue that finding the interactions of feature toggles is valuable for developers to know which of them should be enabled at the same time, which are impacted by a removed toggle, and to avoid their misconfigurations. In this work, we mine feature toggles and their interactions in five open-source projects. We then analyse how they are realized and whether they tend to be multiplied over time. Our results show that 7% of feature toggles interact with each other, 33% of them interact with another code expression, and their interactions tend to increase over time (22%, on average). Further, their interactions are expressed by simple logical operators (i.e., and and or) and nested if statements. We propose to model them into a Feature Toggle Model, and believe that our results are helpful towards robust management approaches of feature toggles."
303,hal-03627246,2022,{Scratching the Surface of ./configure: Learning the Effects of Compile-Time Options on Binary Size and Gadgets},"Tërnava, Xhevahire and Acher, Mathieu and Lesoil, Luc and Blouin, Arnaud and Jézéquel, Jean-Marc","Numerous software systems are configurable through compile-time options and the widely used ./configure. However, the combined effects of these options on a binary's non-functional properties (size and attack surface) are often not documented, or not well understood, even by experts. Our goal is to provide automated support for exploring and comprehending the configuration space (a.k.a., surface) of compile-time options using statistical learning techniques. In this paper, we perform an empirical study on four C-based configurable systems. We measure the variation of binary size and attack surface (by quantifying the number of code reuse gadgets) in over 400 compile-time configurations of a subject system. We then apply statistical learning techniques on top of our build infrastructure to identify how compile-time options relate to non-functional properties. Our results show that, by changing the default configuration, the system's binary size and gadgets vary greatly (roughly −79% to 244% and −77% to 30%, respectively). Then, we found that the most influential options can be accurately identified with a small training set, while their relative importance varies across size and attack surface for the same system. Practitioners can use our approach and artifacts to explore the effects of compile-time options in order to take informed decisions when configuring a system with ./configure."
304,hal-03916459,2023,{Specialization of Run-time Configuration Space at Compile-time: An Exploratory Study},"Tërnava, Xhevahire and Acher, Mathieu and Combemale, Benoit","Numerous software systems are highly configurable through runtime options (e.g., command-line parameters). Users can tune some of the options to meet various functional and non-functional requirements such as footprint, security, or execution time. However, some options are never set for a given system instance, and their values remain the same whatever the use cases of the system. Herein, we design a controlled experiment in which the system's run-time configuration space can be specialized at compile-time and combinations of options can be removed on demand. We perform an in-depth study of the well-known x264 video encoder and quantify the effects of its specialization on its non-functional properties, namely binary size, attack surface, and performance, while ensuring its validity. Our exploratory study suggests that the configurable specialization of a system has statistically significant benefits on most of its analysed non-functional properties, and these benefits depend on the number of debloated options. While our empirical results and insights show the importance of removing code related to unused run-time options to improve software systems, an open challenge is to further automate the specialization process."
305,hal-02968938,2020,{A principled approach to REPL interpreters},"Thomas van Binsbergen, L. and Merino, Mauricio Verano and Jeanjean, Pierre and van Der Storm, Tijs and Combemale, Benoit and Barais, Olivier","Read-eval-print-loops (REPLs) allow programmers to test out snippets of code, explore APIs, or even incrementally construct code, and get immediate feedback on their actions. However, even though many languages provide a REPL, the relation between the language as is and what is accepted at the REPL prompt is not always well-defined. Furthermore, implementing a REPL for new languages, such as DSLs, may incur significant language engineering cost. In this paper we survey the domain of REPLs and investigate the (formal) principles underlying REPLs. We identify and define the class of sequential languages, which admit a sound REPL implementation based on a definitional interpreter, and present design guidelines for extending existing language implementations to support REPL-style interfaces (including computational notebooks). The obtained REPLs can then be generically turned into an exploring interpreter, to allow exploration of the user's interaction. The approach is illustrated using three case studies, based on MiniJava, QL (a DSL for questionnaires), and eFLINT (a DSL for normative rules). We expect sequential languages, and the consequent design principles, to be stepping stones towards a better understanding of the essence of REPLs."
306,hal-01354997,2016,{KevoreeJS: Enabling Dynamic Software Reconfigurations in the Browser},"Tricoire, Maxime and Barais, Olivier and Leduc, Manuel and Bourcier, Johann and Fouquet, François and Nain, Grégory and Mouline, Ludovic and Sunyé, Gerson and Morin, Brice","The architecture of classic productivity software is moving from traditional desktop-based software to a client-server architecture hosted in the Cloud. In this context, web browsers behave as application containers that allow users to access a variety of Cloud-based applications and services, such as IDEs, Word processors, Music Collection Managers, etc. As a result, a significant part of this software runs in the browser and accesses remote services. A lesson learned from development frameworks used in distributed applications is the success of the pluggable architecture pattern as a core architectural concept, i.e., a software architecture that promotes the use of pluggable modules that can be plugged in dynamically. Following this trend, this paper discusses the main challenges to create a component-based platform supporting the development of dynamically adaptable single web page applications. This paper also presents an approach called KevoreeJS based on models@runtime to control the browser as a component platform, which addresses some of these challenges. We validate this work by presenting the design of a dashboard for a sensor-based system and highlighting the capacity of KevoreeJS to dynamically choose the placement of code on the server or client side and how KevoreeJS can be used to dynamically install or remove running components."
307,hal-01023864,2014,{Automating Variability Model Inference for Component-Based Language Implementations},"Vacchi, Edoardo and Cazzola, Walter and Combemale, Benoit and Acher, Mathieu","Recently, domain-specific language development has again become a topic of interest, as a means to help design solutions to domain-specific problems. Componentized language frameworks, coupled with variability modeling, have the potential to bring language development to the masses, by simplifying the configuration of a new language from an existing set of reusable components. However, designing variability models for this purpose requires not only a good understanding of these frameworks and the way components interact, but also an adequate familiarity with the problem domain. In this paper we propose an approach to automatically infer a relevant variability model from a collection of already implemented language components, given a structured, but general representation of the domain. We describe techniques to assist users in achieving a better understanding of the relationships between language components, and find out which languages can be derived from them with respect to the given domain."
308,hal-03921387,2022,{A Language-Parametric Approach to Exploratory Programming Environments},"van Binsbergen, L. Thomas and Frölich, Damian and Verano Merino, Mauricio and Lai, Joey and Jeanjean, Pierre and van der Storm, Tijs and Combemale, Benoit and Barais, Olivier","Exploratory programming is a software development style in which code is a medium for prototyping ideas and solutions, and in which even the end-goal can evolve over time. Exploratory programming is valuable in various contexts such as programming education, data science, and end-user programming. However, there is a lack of appropriate tooling and language design principles to support exploratory programming. This paper presents a host-language- and object-language-independent protocol for exploratory programming akin to the Language Server Protocol. The protocol serves as a basis to develop novel (or extend existing) programming environments for exploratory programming such as computational notebooks and command-line REPLs. An architecture is presented on top of which prototype environments can be developed with relative ease, because existing (language) components can be reused. Our prototypes demonstrate that the proposed protocol is sufficiently expressive to support exploratory programming scenarios as encountered in literature within the software engineering, human-computer interaction and data science domains."
309,hal-01018938,2014,{Towards Managing Variability in the Safety Design of an Automotive Hall Effect Sensor},"van Landuyt, Dimitri and Op de Beeck, Steven and Hovsepyan, Aram and Michiels, Sam and Joosen, Wouter and Meynckens, Sven and de Jong, Gjalt and Barais, Olivier and Acher, Mathieu","This paper discusses the merits and challenges of adopting software product line engineering (SPLE) as the main development process for an automotive Hall Effect sensor. This versatile component is integrated into a number of automotive applications with varying safety requirements (e.g., windshield wipers and brake pedals). This paper provides a detailed explanation as to why the process of safety assessment and verification of the Hall Effect sensor is currently cumbersome and repetitive: it must be repeated entirely for every automotive application in which the sensor is to be used. In addition, no support is given to the engineer to select and configure the appropriate safety solutions and to explain the safety implications of his decisions. To address these problems, we present a tailored SPLE-based approach that combines model-driven development with advanced model composition techniques for applying and reasoning about specific safety solutions. In addition, we provide insights about how this approach can reduce the overall complexity, improve reusability, and facilitate safety assessment of the Hall Effect sensor."
310,hal-01182773,2015,{A Behavioral Coordination Operator Language (BCOoL)},"Vara Larsen, Matias Ezequiel and Deantoni, Julien and Combemale, Benoit and Mallet, Frédéric","The design of complex systems involves various, possibly heterogeneous, structural and behavioral models. In model-driven engineering, the coordination of behavioral models to produce a single integrated model is necessary to provide support for validation and verification. Indeed, it allows system designers to understand and validate the global and emerging behavior of the system. However, the manual coordination of models is tedious and error-prone, and current approaches to automate the coordination are bound to a fixed set of coordination patterns. In this paper, we propose a Behavioral Coordination Operator Language (B-COOL) to reify coordination patterns between specific domains by using coordination operators between the Domain-Specific Modeling Languages used in these domains. Those operators are then used to automate the coordination of models conforming to these languages. We illustrate the use of B-COOL with the definition of coordination operators between timed finite state machines and activity diagrams."
311,hal-01198744,2015,{A Model-Driven Based Environment for Automatic Model Coordination},"Vara Larsen, Matias Ezequiel and Deantoni, Julien and Combemale, Benoit and Mallet, Frédéric","We present the integration of the Behavioral Coordination Operator Language (B-COOL) into the GEMOC Studio. B-COOL enables the system designer to automate the coordination of models by specifying Operators between Domain-Specific Modeling Languages. In this demonstration, we present how B-COOL is used to coordinate the heterogeneous models of a video surveillance system. For this purpose, we define operators between timed finite state machines and activity diagrams. These operators are used to generate an explicit model of coordination that can be executed and verified. This demonstration comes as a support for the paper accepted at the main conference."
312,hal-01652021,2018,{FP-STALKER: Tracking Browser Fingerprint Evolutions},"Vastel, Antoine and Laperdrix, Pierre and Rudametkin, Walter and Rouvoy, Romain","Browser fingerprinting has emerged as a technique to track users without their consent. Unlike cookies, fingerprinting is a stateless technique that does not store any information on devices, but instead exploits unique combinations of attributes handed over freely by browsers. The uniqueness of fingerprints allows them to be used for identification. However, browser fingerprints change over time and the effectiveness of tracking users over longer durations has not been properly addressed. In this paper, we show that browser fingerprints tend to change frequently—from every few hours to days—due to, for example, software updates or configuration changes. Yet, despite these frequent changes, we show that browser fingerprints can still be linked, thus enabling long-term tracking. FP-STALKER is an approach to link browser fingerprint evolutions. It compares fingerprints to determine if they originate from the same browser. We created two variants of FP-STALKER, a rule-based variant that is faster, and a hybrid variant that exploits machine learning to boost accuracy. To evaluate FP-STALKER, we conduct an empirical study using 98,598 fingerprints we collected from 1,905 distinct browser instances. We compare our algorithm with the state of the art and show that, on average, we can track browsers for 54.48 days, and 26 % of browsers can be tracked for more than 100 days."
313,hal-01870976,2018,{Descartes: a PITest engine to detect pseudo-tested methods - Tool Demonstration},"Vera-Pérez, Oscar Luis and Monperrus, Martin and Baudry, Benoit","Descartes is a tool that implements extreme mutation operators and aims at finding pseudo-tested methods in Java projects. It leverages the efficient transformation and runtime features of PITest. The demonstration compares Descartes with Gregor, the default mutation engine provided by PITest, in a set of real open source projects. It considers the execution time, number of mutants created and the relationship between the mutation scores produced by both engines. It provides some insights on the main features exposed by Descartes."
314,hal-03936218,2022,{Data-Centric Green AI An Exploratory Empirical Study},"Verdecchia, Roberto and Cruz, Luis and Sallou, June and Lin, Michelle and Wickenden, James and Hotellier, Estelle","With the growing availability of large-scale datasets, and the popularization of affordable storage and computational capabilities, the energy consumed by AI is becoming a growing concern. To address this issue, in recent years, studies have focused on demonstrating how AI energy efficiency can be improved by tuning the model training strategy. Nevertheless, how modifications applied to datasets can impact the energy consumption of AI is still an open question. To fill this gap, in this exploratory study, we evaluate if datacentric approaches can be utilized to improve AI energy efficiency. To achieve our goal, we conduct an empirical experiment, executed by considering 6 different AI algorithms, a dataset comprising 5,574 data points, and two dataset modifications (number of data points and number of features). Our results show evidence that, by exclusively conducting modifications on datasets, energy consumption can be drastically reduced (up to 92.16%), often at the cost of a negligible or even absent accuracy decline. As additional introductory results, we demonstrate how, by exclusively changing the algorithm used, energy savings up to two orders of magnitude can be achieved. In conclusion, this exploratory investigation empirically demonstrates the importance of applying data-centric techniques to improve AI energy efficiency. Our results call for a research agenda that focuses on data-centric techniques, to further enable and democratize Green AI."
315,hal-03632376,2022,{Data-Centric Green AI: An Exploratory Empirical Study},"Verdecchia, Roberto and Cruz, Luís and Sallou, June and Lin, Michelle and Wickenden, James and Hotellier, Estelle","With the growing availability of large-scale datasets, and the popularization of affordable storage and computational capabilities, the energy consumed by AI is becoming a growing concern. To address this issue, in recent years, studies have focused on demonstrating how AI energy efficiency can be improved by tuning the model training strategy. Nevertheless, how modifications applied to datasets can impact the energy consumption of AI is still an open question. To fill this gap, in this exploratory study, we evaluate if datacentric approaches can be utilized to improve AI energy efficiency. To achieve our goal, we conduct an empirical experiment, executed by considering 6 different AI algorithms, a dataset comprising 5,574 data points, and two dataset modifications (number of data points and number of features). Our results show evidence that, by exclusively conducting modifications on datasets, energy consumption can be drastically reduced (up to 92.16%), often at the cost of a negligible or even absent accuracy decline. As additional introductory results, we demonstrate how, by exclusively changing the algorithm used, energy savings up to two orders of magnitude can be achieved. In conclusion, this exploratory investigation empirically demonstrates the importance of applying data-centric techniques to improve AI energy efficiency. Our results call for a research agenda that focuses on data-centric techniques, to further enable and democratize Green AI."
316,hal-01342354,2016,{El uso de modelos de características con atributos para pruebas en sistemas de alta variabilidad: primeros pasos},"Vinueza, Mariuxi and Rodas, Jorge L. and Galindo Duarte, José Angel and Benavides, David","Attributed feature models represent all the products of a product line together with additional information. In the literature we find models representing thousands of distinct products. Selecting these products for testing is a challenge that is being studied in the literature, and some of these proposals use attributed feature models to select this subset of products. However, there is no guide on how to use attributes for test-case selection in different scenarios. With the aim of feeding such a guide, we propose to search the literature for ways to characterize the models used by other researchers, in order to help model attributes in feature models for testing purposes. A product line makes it possible to develop a set of distinct products that share part of their functionality. Feature models are used to represent the set of software products in terms of features and relationships [6]. The automated analysis of feature models aims to extract information from feature models using automated mechanisms [2], for example to validate the correctness of the model or to identify the set of features that it comprises. FAMA [3] or FaMiLiar [1], among others, are tools that implement and integrate different solutions for the automated analysis of feature models. There are proposals to represent quality information within feature models. Nowadays, this type of model with additional information is referred to as extended, advanced, or attributed feature models [2,4,5]. An extended feature model contains extra information about the features; the purpose of this extension is to add measurable (quantitative) information about the features through the introduction of attributes, as well as the inclusion of more complex constraints between the features and their attributes."
317,hal-03284857,2017,{A Systematic Mapping Study on Modeling for Industry 4.0},"Wortmann, Andreas and Combemale, Benoit and Barais, Olivier","Industry 4.0 is a vision of manufacturing in which smart, interconnected production systems optimize the complete value-added chain to reduce cost and time-to-market. At the core of Industry 4.0 is the smart factory of the future, whose successful deployment requires solving challenges from many domains. Model-based systems engineering (MBSE) is a key enabler for such complex systems of systems as can be seen by the increased number of related publications in key conferences and journals. This paper aims to characterize the state of the art of MBSE for the smart factory through a systematic mapping study on this topic. Adopting a detailed search strategy, 1466 papers were initially identified. Of these, 222 papers were selected and categorized using a particular classification scheme. Hence, we present the concerns addressed by the modeling community for Industry 4.0, how these are investigated, where these are published, and by whom. The resulting research landscape can help to understand, guide, and compare research in this field. In particular, this paper identifies the Industry 4.0 challenges addressed by the modeling community, but also the challenges that seem to be less investigated."
318,hal-03921309,2022,{RISCLESS: A Reinforcement Learning Strategy to Guarantee SLA on Cloud Ephemeral and Stable Resources},"Yalles, Sidahmed and Handaoui, Mohamed and Dartois, Jean-Emile and Barais, Olivier and d'Orazio, Laurent and Boukhobza, Jalil","In this paper, we propose RISCLESS, a Reinforcement Learning strategy to exploit unused Cloud resources. Our approach consists in using a small proportion of stable on-demand resources alongside the ephemeral ones in order to guarantee customers’ SLAs and reduce the overall costs. The approach decides when and how much stable resources to allocate in order to fulfill customers’ demands. RISCLESS improved the Cloud Providers (CPs)’ profits by an average of 15.9% compared to past strategies. It also reduced the SLA violation time by 36.7% while increasing the amount of used ephemeral resources by 19.5%."
319,hal-01159118,2015,{Embedding Adaptivity in Software Systems using the ECSELR framework},"Yeboah-Antwi, Kwaku and Baudry, Benoit","ECSELR is an ecologically-inspired approach to software evolution that enables environmentally driven evolution at runtime in extant software systems without relying on any offline components or management. ECSELR embeds adaptation and evolution inside the target software system, enabling the system to transform itself via darwinian evolutionary mechanisms and adapt in a self-contained manner. This allows the software system to benefit autonomously from the useful emergent byproducts of evolution like adaptivity and biodiversity, avoiding the problems involved in engineering and maintaining such properties. ECSELR enables software systems to address changing environments at runtime, ensuring benefits like mitigation of attacks and memory optimization among others, while avoiding time-consuming and costly maintenance and downtime. ECSELR differs from existing work in that: 1) adaptation is embedded in the target system, 2) evolution and adaptation happen online (i.e., in situ at runtime), and 3) ECSELR is able to embed adaptation inside systems that have already been started and are in the midst of execution. We demonstrate the use of ECSELR and present results on using the ECSELR framework to slim a software system."
320,hal-01010140,2014,{Vision-based Absolute Localization for Unmanned Aerial Vehicles},"Yol, Aurelien and Delabarre, Bertrand and Dame, Amaury and Dartois, Jean-Emile and Marchand, Eric","This paper presents a method for localizing an Unmanned Aerial Vehicle (UAV) using georeferenced aerial images. Easily maneuverable and more and more affordable, UAVs have become a real center of interest. In the last few years, their utilization has significantly increased. Today, they are used for multiple tasks such as navigation, transportation or vigilance. Nevertheless, the success of these tasks could not be possible without a highly accurate localization which can, unfortunately, often be laborious. Here we provide a multiple usage localization algorithm based on vision only. However, a major drawback with vision-based algorithms is the lack of robustness. Most of the approaches are sensitive to scene variations (like season or environment changes) due to the fact that they use the Sum of Squared Differences (SSD). To prevent that, we choose to use the Mutual Information (MI) which is very robust toward local and global scene variations. However, dense approaches are often related to drift disadvantages. Here, we solve this problem by using georeferenced images. The localization algorithm has been implemented and experimental results are presented demonstrating the localization of a hexarotor UAV fitted with a downward looking camera during real flight tests."
321,hal-01098109,2014,{Architectures logicielles et variabilité multiple},"Acher, Mathieu and Collet, Philippe and Lahire, Philippe","When building software product lines, variability management is a crucial activity. A very large number of software variants must be produced, most often by relying on extensible architectures. In this chapter, we present several applications of a modular variability management tool (FAMILIAR) to different forms of architecture (component-, service- and plugin-based) and at different stages of the software life cycle. We discuss the lessons learned from these studies, as well as some open problems related to variability and software architectures."
322,hal-01098107,2014,{Software Architectures and Multiple Variability},"Acher, Mathieu and Collet, Philippe and Lahire, Philippe","During the construction of software product lines, variability management is a crucial activity. A large number of software variants must be produced, in most cases, by using extensible architectures. In this chapter, we present the various applications of a set of modular variability management tools (FAMILIAR) for different forms of architecture (component-, service- and plug-in-based), and at different stages of the software life cycle. We discuss the lessons learnt from these studies and present guidelines for resolving recurring problems linked to multiple variability and to software architecture."
323,hal-02054122,2014,{A Reference Architecture and Roadmap for Models@run.time Systems},"Assmann, Uwe and Gotz, Sebastian and Jézéquel, Jean-Marc and Morin, Brice and Trapp, Mario","The key property of models@run.time systems is their use and provision of manageable reflection, which is characterized to be tractable and predictable and by this overcomes the limitation of reflective systems working on code, which face the problem of undecidability due to Turing-completeness. To achieve tractability, they abstract from certain aspects of their code, maintaining runtime models of themselves, which form the basis for reflection. In these systems, models form abstractions that neglect unnecessary details from the code, details which are not pertinent to the current purpose of reflection. Thus, models@run.time systems are a new class of reflective systems, which are characterized by their tractability, due to abstraction, and their ability to predict certain aspects of their own behavior for the future. This chapter outlines a reference architecture for models@run.time systems with the appropriate abstraction and reflection components and gives a roadmap comprised of short- and long-term research challenges for the area. Additionally, an overview of enabling and enabled technologies is provided. The chapter is concluded with a discussion of several application fields and use cases."
324,hal-03043187,2020,{Profilage de navigateurs : état de l’art et contre-mesures},"Baudry, Benoit and Bromberg, Yérom-David and Frey, Davide and Gómez-Boix, Alejandro and Laperdrix, Pierre and Taïani, François",
325,hal-03865929,2021,{Living with Uncertainty in Model-Based Development},"Bernardi, Simona and Famelis, Michalis and Jézéquel, Jean-Marc and Mirandola, Raffaela and Palacin, Diego Perez and Polack, Fiona and Trubiani, Catia",
326,hal-01427560,2016,{Software Unbundling: Challenges and Perspectives},"Bosco, João Ferreira Filho and Acher, Mathieu and Barais, Olivier","Unbundling is a phenomenon that consists of dividing an existing software artifact into smaller ones. It can happen for different reasons, one of them is the fact that applications tend to grow in functionalities and sometimes this can negatively influence the user experience. For example, mobile applications from well-known companies are being divided into simpler and more focused new ones. Despite its current importance, little is known or studied about unbundling or about how it relates to existing software engineering approaches, such as modularization. Consequently, recent cases point out that it has been performed unsystematically and arbitrarily. In this article, our main goal is to present this novel and relevant concept and its underlying challenges in the light of software engineering, also exemplifying it with recent cases. We relate unbundling to standard software modularization, presenting the new motivations behind it, the resulting problems, and drawing perspectives for future support in the area."
327,hal-01224283,2015,{Globalized Domain Specific Language Engineering},"Bryant, Barrett and Jézéquel, Jean-Marc and Laemmel, Ralf and Mernik, Marjan and Schindler, Martin and Steinmann, Friedrich and Tolvanen, Juha-Pekka and Vallecillo, Antonio and Voelter, Markus","This chapter is dedicated to discussing the engineering aspects involved in the integration of modeling languages, as an essential part of the globalization process. It covers the foundations of language integration, the definition of the relationships between the languages to be integrated, and the various dimensions of language and tool integration. Language variants, evolution, refactoring and retirement are also discussed, as key issues involved in the globalization of modeling languages. Today's software development is characterized by a large degree of improvisation: different languages and language-based tools are used to create different artifacts that act on the same (physical or logical) system. Such multiplicity is not a problem in itself; the problem is rather the common situation in which those languages and tools are insufficiently integrated. The languages and tools nevertheless involve interactions at the system level, and these interactions are hard to understand without good integration. In fact, the actual interactions may disagree with the intended interactions. Any analysis or verification is going to be hard. Excessive testing may be needed and still fail to be conclusive. We may just hope for the engineers collaborating on the same system to limit themselves to a set of languages implemented with a single language workbench. Version changes of languages and tools would be semantics preserving (both backwards and forward compatible) and therefore not require any changes to the current artifacts."
328,hal-01233660,2015,{Motivating Use Cases for the Globalization of DSLs},"Cheng, Betty H.C. and Degueule, Thomas and Atkinson, Colin and Clarke, Siobhan and Frank, Ulrich and Mosterman, Pieter J. and Sztipanovits, Janos","The development of complex software-intensive systems involves many stakeholders who contribute their expertise on specific aspects of the system under construction. Domain-specific languages (DSLs) are typically used by stakeholders to express their knowledge of the system using dedicated tools and abstractions. In this chapter, we explore different scenarios that lead to the globalization of DSLs through two motivating case studies – a command and control wind tunnel and a smart emergency response system – and outline the concrete engineering challenges they raise. Finally, we list some of the general research challenges related to the globalization of DSMLs and discuss some promising approaches for addressing them."
329,hal-01224105,2015,{On the Globalization of Domain-Specific Languages},"Cheng, Betty H.C. and Combemale, Benoit and France, Robert B. and Jézéquel, Jean-Marc and Rumpe, Bernhard","In the software engineering community, research on domain-specific languages (DSLs) is focused on providing technologies for designing languages and tools that enable domain experts to develop system solutions efficiently. Unfortunately, the current lack of support to explicitly relate concepts expressed in different DSLs makes it difficult for software and system engineers to reason about information distributed across models or programs describing different system aspects, at different levels of abstraction. Supporting the coordinated use of DSLs is what we call the globalization of DSLs. In this chapter, we introduce a grand challenge of the globalization of DSLs, and we present a few motivating scenarios for such a grand challenge."
330,hal-01224116,2015,{Conceptual Model of the Globalization for Domain-Specific Languages},"Clark, Tony and van den Brand, Mark and Combemale, Benoit and Rumpe, Bernhard","Domain Specific Languages (DSL) have received some prominence recently. Designing a DSL and all its tools is still cumbersome and lots of work. Engineering of DSLs is still in its infancy, not even the terms have been coined and agreed on. In particular, globalization and all its consequences need to be precisely defined and discussed. This chapter provides a definition of the relevant terms and relates them, such that a conceptual model emerges. The authors think that this clarification of terms and their meaning will foster the field of efficient DSL definition and evolution in the future. Towards a Conceptual Model of Globalization: Software Engineering, unlike other engineering disciplines, such as Civil, Chemical or Material, deals with constructing precise descriptions of highly complex systems, where each new application contains structure and behaviour that is essentially unique. In essence, each new application is a novel theory of structure and execution, and requires a way of expressing this meta-information [2]. Traditionally, General Purpose Languages (GPLs) have been used to encode the theories in executable, but implicit forms (e.g., libraries). However, recent advances in language engineering technologies have made it possible to develop Domain Specific Languages (DSLs), each of which is more suited to encoding theories relating to specific application domains [4]. Modern applications tend to be large, heterogeneous and distributed, involving the use of many different languages including mixtures of GPLs and DSLs. Given that an application consists of many different subsystems written in different languages, there is a requirement to ensure that the languages and therefore the subsystems work together effectively and share the same concepts (theories). Subsystems written in DSLs are attractive because the languages can provide better support for the specific application domains; however, they tend to be less mature than their GPL counterparts and therefore there is an interesting research challenge: how to achieve language globalization [1]."
331,hal-02407886,2019,{Towards a Model-Based DevOps for Cyber-Physical Systems},"Combemale, Benoit and Wimmer, Manuel",
332,hal-04113319,2023,{Is privacy possible in a digital world?},"Coppens, Bart and Zendra, Olivier","While privacy used to be a concern only for a limited number of people, in recent years awareness of it has been growing. This has been for a number of reasons, including the implementation of the General Data Protection Regulation (GDPR), the growing impact of data leaks, and data logging by governments and enterprises. In addition, there have been high-profile examples demonstrating how new legislation can turn seemingly innocuous private data against its owners. At the same time, the majority of us are consciously or unknowingly transmitting an increasing amount of private data to the cloud, which increases the likelihood of it being leaked or abused. To try and reconcile these two opposing directions, both consumers and businesses should enhance their knowledge of privacy issues and their use of privacy-enhancing technologies, and enterprises should include privacy by design into their IT development processes."
333,hal-03362809,2021,"{Privacy: whether you're aware of it or not, it does matter!}","Coppens, Bart and Zendra, Olivier","While privacy used to be a concern of only a limited number of people, in recent years awareness of it has been growing. This has been for a number of reasons including the enactment of the GDPR, the growing impact of data leaks, data logging by governments and companies, and even the recent discussions about COVID-19 contact tracing. At the same time, most of us are knowingly or unknowingly sending more and more private data to the cloud, which increases the risk of it being leaked or abused in some way. In order to try and reconcile these two opposing directions, consumers and companies alike should increase their usage of privacy-enhancing technologies, and businesses should integrate privacy by design into their development."
334,hal-01424909,2017,{On Language Interfaces},"Degueule, Thomas and Combemale, Benoit and Jézéquel, Jean-Marc","Complex systems are developed by teams of experts from multiple domains, who can be liberated from becoming programming experts through domain-specific languages (DSLs). The implementation of the different concerns of DSLs (including syntaxes and semantics) is now well-established and supported by various language workbenches. However, the various services associated to a DSL (e.g., editors, model checker, debugger or composition operators) are still directly based on its implementation. Moreover, while most of the services crosscut the different DSL concerns, they only require specific information on each. Consequently, this prevents the reuse of services among related DSLs, and increases the complexity of service implementation. Leveraging the time-honored concept of interface in software engineering, we discuss the benefits of language interfaces in the context of software language engineering. In particular, we elaborate on particular usages that address current challenges in language development."
335,hal-04080311,2023,{Modeling: From CASE Tools to SLE and Machine Learning},"Jézéquel, Jean-Marc","Finding better ways to handle software complexity (both inherent and accidental) is the holy grail for a significant part of the software engineering community, and especially for the Model Driven Engineering (MDE) one. To that purpose, plenty of techniques have been proposed, leading to a succession of trends in model-based software development paradigms over the last decades. While these trends seem to pop out from nowhere, we claim in this article that most of them actually stem from trying to get a better grasp on the variability of software. We revisit the history of MDE trying to identify the main aspect of variability they wanted to address when they were introduced. We conclude by discussing the variability challenges of our time, including the variability of data leading to machine learning of models."
336,hal-02881632,2019,{Morellian Analysis for Browsers: Making Web Authentication Stronger with Canvas Fingerprinting},"Laperdrix, Pierre and Avoine, Gildas and Baudry, Benoit and Nikiforakis, Nick","In this paper, we present the first fingerprinting-based authentication scheme that is not vulnerable to trivial replay attacks. Our proposed canvas-based fingerprinting technique utilizes one key characteristic: it is parameterized by a challenge, generated on the server side. We perform an in-depth analysis of all parameters that can be used to generate canvas challenges, and we show that it is possible to generate unique, unpredictable, and highly diverse canvas-generated images each time a user logs onto a service. With the analysis of images collected from more than 1.1 million devices in a real-world large-scale experiment, we evaluate our proposed scheme against a large set of attack scenarios and conclude that canvas fingerprinting is a suitable mechanism for stronger authentication on the web."
337,hal-03921905,2023,{Machine Learning for Feature Constraints Discovery},"Martin, Hugo and Temple, Paul and Acher, Mathieu and Pereira, Juliana Alves and Jézéquel, Jean-Marc",
338,hal-04157804,2023,{Pull Requests Integration Process Optimization: An Empirical Study},"Olmedo, Agustín and Arévalo, Gabriela and Cassol, Ignacio and Perez, Quentin and Urtado, Christelle and Vauttier, Sylvain","Pull-based Development (PbD) is widely used in collaborative development to integrate changes into a project codebase. In this model, contributions are notified through Pull Request (PR) submissions. Project administrators are responsible for reviewing and integrating PRs. In the integration process, conflicts occur when PRs are concurrently opened on a given target branch and propose different modifications for the same code part. In a previous work, we proposed an approach, called IP Optimizer, to improve the Integration Process Efficiency (IPE) by prioritizing PRs. In this work, we conduct an empirical study on 260 open-source projects hosted by GitHub that use PRs intensively in order to quantify the frequency of conflicts in software projects and analyze how much the integration process can be improved. Our results indicate that regarding the frequency of conflicts in software projects, half of the projects have a moderate and high number of pairwise conflicts and half have a low number of pairwise conflicts or none. Furthermore, on average 18.82% of the time windows have conflicts. On the other hand, regarding how much the integration process can be improved, IP Optimizer improves the IPE in 94.16% of the time windows and the average improvement percentage is 146.15%. In addition, it improves the number of conflict resolutions in 67.16% of the time windows and the average improvement percentage is 134.28%."
339,hal-02054220,2017,{Perpetual Assurances for Self-Adaptive Systems},"Weyns, Danny and Bencomo, Nelly and Calinescu, Radu and Cámara, Javier and Ghezzi, Carlo and Grassi, Vincenzo and Grunske, Lars and Inverardi, Paola and Jézéquel, Jean-Marc and Malek, Sam and Mirandola, Raffaela and Mori, Marco and Tamburrelli, Giordano","Providing assurances for self-adaptive systems is challenging. A primary underlying problem is uncertainty that may stem from a variety of different sources, ranging from incomplete knowledge to sensor noise and uncertain behavior of humans in the loop. Providing assurances that the self-adaptive system complies with its requirements calls for an enduring process spanning the whole lifetime of the system. In this process, humans and the system jointly derive and integrate new evidence and arguments, which we coined perpetual assurances for self-adaptive systems. In this paper, we provide a background framework and the foundation for perpetual assurances for self-adaptive systems. We elaborate on the concrete challenges of offering perpetual assurances, requirements for solutions, realization techniques and mechanisms to make solutions suitable. We also present benchmark criteria to compare solutions. We then present a concrete exemplar that researchers can use to assess and compare approaches for perpetual assurances for self-adaptation."
340,hal-03362808,2021,{Cybersecurity must come to IT systems now},"Zendra, Olivier and Coppens, Bart","After decades of apparently low-intensity cyber attacks, during which security was not really thought of on most IT systems, recent years have brought a flurry of well-organized, larger-scale attacks that have caused billions of Euros of damage. This was made possible by the plethora of IT systems that have been produced with no or low security, a trend that has further increased with the rise of ubiquitous computing, with smartphones, IoT and smart-* being everywhere with extremely low control. However, although the current situation in IT systems can still be considered as critical and very much working in favour of cyber attackers, there are definitely paths to massive but doable technical improvements that can lead us to a much more secure and sovereign IT ecosystem, along with strong business opportunities in Europe."
341,hal-04113296,2023,"{From cybercrime to cyberwarfare, nobody can overlook cybersecurity any more}","Zendra, Olivier and Coppens, Bart","After decades of apparently low-intensity cyberattacks, during which security was not really thought of in most information technology (IT) systems, recent years, including those marked by the COVID-19 pandemic, have brought a flurry of well-organized, larger-scale attacks that have caused billions of euros of damage. Such attacks have been made possible by the plethora of IT systems that have been produced with no or low security, a trend that has further increased with the rise of ubiquitous computing, with smartphones, the internet of things (IoT) and smart-* spreading everywhere with extremely low control. However, although the current situation in IT systems can still be considered as critical and very much working in favour of cyber attackers, there are paths to massive but achievable technical improvements that can lead us to a much more secure and sovereign IT ecosystem, along with strong business opportunities in Europe. Furthermore, when it comes to cyberwarfare, improved cybersecurity is now the first line of defence, as the conflict in Ukraine has shown."
342,hal-03362810,2021,{Taming the IT systems complexity hydra},"Zendra, Olivier and de Bosschere, Koen","Although most people remain unaware of its presence in the background, the ever-increasing complexity of IT, with its multiple sources, has been an ongoing issue for quite some time. It can even be qualified as a crisis, in both hardware and software. Indeed, this complexity has reached the point where systems are no longer fully understandable by human beings, which raises the question of how we can continue being in full control of their functioning. It is of course a matter of cost for the IT industry. But a number of incidents caused by bugs or a misunderstanding of some part of an IT system have already occurred. With an IT world that is permanently connected on a worldwide scale, the risk of damage caused by the lack of control of IT systems is both real and growing, with errors and malevolent attacks the most likely culprits. Taming the IT complexity hydra is thus more necessary than ever. Fortunately, various solutions can be proposed to tackle the various heads of the hydra (i.e. the various aspects of complexity); these are solutions based on existing methodologies, tools and resources or extensions thereof."
343,hal-04113336,2023,{THE RACE FOR CYBERSECURITY},"Zendra, Olivier and Coppens, Bart","After decades of digitalization spreading into every area of our lives, with very little attention given to the aspects linked to cybersecurity, information technology (IT) had essentially become an “open bar” for cybercriminals. For a few years, with a marked degradation during the peak of the COVID-19 pandemic, the news has been rife with reports of privacy breaches and cyberattacks (mainly ransomware) on companies and institutions, especially local governments and hospitals. In addition, cyberwarfare has been making the news too, especially in relation to the conflict in Ukraine. Thus, the era of blissful ignorance and naiveté has ended. Although the wake-up call was abrupt, knowledge of these issues has expanded, and governments and to some extent businesses have taken first moves to enhance the cybersecurity frontline. However, cybersecurity is a highly competitive race between nations, between defenders and attackers, with enormous stakes. The pervasiveness of IT provides a broad attack surface, and attacks can be economically devastating, but they can also have tangible or even lethal repercussions on the physical world. Despite several highly acclaimed advancements (e.g. the General Data Protection Regulation, GDPR), the EU still has a great deal of work to do in this regard, particularly to maintain its sovereignty and become a leader in the global competition. Cybersecurity is indeed a matter of both economic leadership and national sovereignty. This chapter contains two contributions. • “From cybercrime to cyberwarfare, nobody can overlook cybersecurity any more”: This article describes the current state of IT system cybersecurity, showing how vulnerable systems are to the numerous dangers and challenges posed by cybercrime and cyberwarfare. It goes on to present a few concrete ways to remedy the issue, whether by technical, legal, sociological, or political means. Indeed, although the EU has weaknesses, linked to its extremely high reliance on IT systems, it also has the potential to become a world leader in cybersecurity, owing to both its strong technical culture and its regulatory capabilities. • “Is privacy possible in a digital world?”: Over the last few years, privacy has become a hot topic. However, this is in large part due to the fact that ever more data is being collected, not only by governments, but also by companies. It is often unclear for which purposes this data ends up being used; worse, it can even be leaked to third parties by attackers. Furthermore, even if this collected data would appear not to be sensitive in and of itself, sometimes sensitive information can be deduced from it. In this article, we present a summary of some of the ways in which data is gathered; how additional information can be inferred from it and how this is problematic; and how we can try to protect our privacy."
344,tel-03521806,2021,"{Modelling, Reverse Engineering, and Learning Software Variability}","Acher, Mathieu","The society expects software to deliver the right functionality, in a short amount of time and with fewer resources, in every possible circumstance whatever are the hardware, the operating systems, the compilers, or the data fed as input. For fitting such a diversity of needs, it is common that software comes in many variants and is highly configurable through configuration options, runtime parameters, conditional compilation directives, menu preferences, configuration files, plugins, etc. As there is no one-size-fits-all solution, software variability (""the ability of a software system or artifact to be efficiently extended, changed, customized or configured for use in a particular context"") has been studied for the last two decades and is a discipline of its own. Though highly desirable, software variability also introduces an enormous complexity due to the combinatorial explosion of possible variants. For example, the Linux kernel has 15000+ options and most of them can have 3 values: ""yes"", ""no"", or ""module"". Variability is challenging for maintaining, verifying, and configuring software systems (Web applications, Web browsers, video tools, etc.). It is also a source of opportunities to better understand a domain, create reusable artefacts, deploy performance-wise optimal systems, or find specialized solutions to many kinds of problems. In many scenarios, a model of variability is either beneficial or mandatory to explore, observe, and reason about the space of possible variants. For instance, without a variability model, it is impossible to establish a sampling strategy that would satisfy the constraints among options and meet coverage or testing criteria. I address a central question in this HDR manuscript: How to model software variability? I detail several contributions related to modelling, reverse engineering, and learning software variability. I first contribute to support the persons in charge of manually specifying feature models, the de facto standard for modeling variability. I develop an algebra together with a language for supporting the composition, decomposition, diff, refactoring, and reasoning of feature models. I further establish the syntactic and semantic relationships between feature models and product comparison matrices, a large class of tabular data. I then empirically investigate how these feature models can be used to test in the large configurable systems with different sampling strategies. Along this effort, I report on the attempts and lessons learned when defining the ""right"" variability language. From a reverse engineering perspective, I contribute to synthesize variability information into models and from various kinds of artefacts. I develop foundations and methods for reverse engineering feature models from satisfiability formulae, product comparison matrices, dependencies files and architectural information, and from Web configurators. I also report on the degree of automation and show that the involvement of developers and domain experts is beneficial to obtain high-quality models. Thirdly, I contribute to learning constraints and non-functional properties (performance) of a variability-intensive system. I describe a systematic process ""sampling, measuring, learning"" that aims to enforce or augment a variability model, capturing variability knowledge that domain experts can hardly express. I show that supervised, statistical machine learning can be used to synthesize rules or build prediction models in an accurate and interpretable way. This process can even be applied to huge configuration spaces, such as the Linux kernel one. Despite a wide applicability and observed benefits, I show that each individual line of contributions has limitations. I defend the following answer: a supervised, iterative process (1) based on the combination of reverse engineering, modelling, and learning techniques; (2) capable of integrating multiple variability information (e.g., expert knowledge, legacy artefacts, dynamic observations). Finally, this work opens different perspectives related to so-called deep software variability, security, smart build of configurations, and (threats to) science."
345,tel-01096952,2014,"{Utilisation de la modélisation à l’exécution : objectif, challenges et bénéfices}","Barais, Olivier","Traditional software development, generally founded on the closed-world assumption of a known and stable boundary between the system and its environment, is no longer tenable. By contrast, the notion of so-called open and eternal systems has become the norm for most computing systems. These software systems are characterized by their need to offer adaptation capabilities that allow them to react to changes in their environment continuously and without service interruption. One of the important challenges for the software engineering community is to identify and progressively remove the limits tied to the closed-world assumption. Starting from this open-world assumption, this habilitation presents the benefits brought by erasing the boundary between the design phase and the execution phase of software, by proposing to use modeling techniques not only during the design of the system but also during the execution of so-called open systems. To this end, the first part of this habilitation synthesizes the foundations of an approach enabling the use of modeling techniques at runtime, focusing mainly on the viewpoint of the software architect. We then present the expected benefits by showing how advanced approaches to software composition, verification, or variability management can help to understand and master the configuration and reconfiguration space of a so-called open system. We then summarize the main challenges related to the use of modeling techniques at runtime, in particular for distributed and heterogeneous systems. To emphasize in this summary document the importance of experimentation in my scientific approach, a second part presents a non-exhaustive but representative summary of various experiments conducted within the PhD theses that I supervise or have co-supervised, in order to show the relevance of models-at-runtime approaches and of the associated model composition operators. These experiments confront this proposal with different application domains (mobile computing, Internet of Things, Cloud Computing) and push the idea of using models at runtime to its limits by examining its relevance for each studied domain with respect to its own constraints. A final part concludes this manuscript with a few figures substantiating the scientific output and proposes a set of research perspectives associated with this work."
346,tel-02354530,2019,{Contribution to the Engineering of User Interfaces},"Blouin, Arnaud","User interfaces (UI) pervade our daily lives. To do office tasks, to pilot an airliner, to write programs, UIs are the tangible vectors that enable users to interact with software systems. The development of UIs involves multiple roles. Designers and ergonomists are in charge of the design and evaluation of UIs from a strict human factor viewpoint. They use concepts and theories established by the Human-Computer Interaction (HCI) community. Software engineers develop, validate, and maintain UIs using software engineering techniques. UI engineering is an interdisciplinary field that cross-cuts these two roles and their underlying domains, HCI and software engineering. This thesis details the contributions I developed during the last decade in the field of UI engineering. I develop this thesis through two research axes. In the first axis I detail new UI engineering concepts. These concepts follow the same leitmotiv: engineering UIs is a specific problem that requires specific abstractions. These abstractions thus go beyond the classical object-oriented abstractions provided by the current programming languages to provide abstractions that focus on UI concerns. In the second axis I focus on domain-specific languages (DSL). DSLs are specific user interfaces that stand between domain experts and their engineering problems. As for any UI, engineering DSLs must make use of HCI concepts at different levels. I then propose three research perspectives that highlight the importance of working on UI engineering to support the increasing interactivity of software systems."
347,tel-01238817,2015,{Towards Language-Oriented Modeling},"Combemale, Benoit","In this habilitation à diriger des recherches (HDR), I review a decade of research work in the fields of Model-Driven Engineering (MDE) and Software Language Engineering (SLE). I propose contributions to support a language-oriented modeling, with the particular focus on enabling early validation & verification (V&V) of software-intensive systems. I first present foundational concepts and engineering facilities which help to capture the core domain knowledge into the various heterogeneous concerns of DSMLs (aka. metamodeling in the small), with a particular focus on executable DSMLs to automate the development of dynamic V&V tools. Then, I propose structural and behavioral DSML interfaces, and associated composition operators to reuse and integrate multiple DSMLs (aka. metamodeling in the large). In these research activities, I explore various breakthroughs in terms of modularity and reusability of DSMLs. I also propose an original approach which bridges the gap between the concurrency theory and the algorithm theory, to integrate a formal concurrency model into the execution semantics of DSMLs. All the contributions have been implemented in software platforms (the language workbench Melange and the GEMOC studio) and experienced in real-world case studies to assess their validity. In this context, I also founded the GEMOC initiative, an attempt to federate the community on the grand challenge of the globalization of modeling languages."
348,hal-04152637,2023,{Reproducible Science and Software Engineering},"Acher, Mathieu and Temple, Paul and Barais, Olivier","One of the main promises of software is that a result obtained from an experiment (e.g. a simulation) can be reproduced with a high degree of concordance. The quest for reproducibility has an impact on different scientific fields, takes different forms and requires making all the data and code available so that calculations can be run again with identical results. In this course, we will first review terminology (e.g., reproducibility vs. replication vs. repeatability), basic tools (e.g. versioning systems, build systems, package managers, notebooks), and techniques (e.g. automated testing, continuous integration and deployment, configuration management) related to reproducible science. We will then program a relatively simple feature in different variants and demonstrate that many factors (including programming languages, library versions, compilers, variable types, randomness management, etc.) can have an impact on the final result. Through this exercise, we aim to present and discuss software techniques, methodologies and tools that developers or scientists can use to address and mitigate reproducibility issues, hopefully leading to more robust and general results."
349,hal-01247090,2015,{Le fingerprinting : une nouvelle technique de traçage},"Laperdrix, Pierre and Baudry, Benoit","Browser fingerprinting refers to the collection, by a browser, of a number of pieces of information about a user's device in order to build a fingerprint. Numerous studies have shown that this fingerprint is unique in the vast majority of cases and evolves very slowly. It can thus be used to track users without leaving any trace on the device."
350,hal-04163509,2023,{Analyser efficacement de grands historiques de code avec HyperAST : une démonstration},"Le Dilavrec, Quentin and Khelladi, Djamel Eddine and Blouin, Arnaud and Jézéquel, Jean-Marc","This demonstration presents the HyperAST, an efficient approach to analyzing code histories that builds on the redundancy of code across time and space, as well as on the possibility of partial code analyses. Currently, code histories are analyzed in batch mode: each version is processed independently of the others to compute a set of metrics; at the end of the analysis, these metrics are used to observe the evolution of the code base over time. Our approach processes the code history at a finer grain, at the AST level, and shares identical elements within and across versions. This demonstration aims to explain this principle through three scenarios."
351,hal-01355374,2016,{Engineering Modeling Languages},"Combemale, Benoit and France, Robert and Jézéquel, Jean-Marc and Rumpe, Bernhard and Steel, Jim R.H. and Vojtisek, Didier","Written by foremost experts in the field, Engineering Modeling Languages provides end-to-end coverage of the engineering of modeling languages to turn domain knowledge into tools. The book provides a definition of different kinds of modeling languages, their instrumentation with tools such as editors, interpreters and generators, the integration of multiple modeling languages to achieve a system view, and the validation of both models and tools. Industrial case studies, across a range of application domains, are included to attest to the benefits offered by the different techniques. The book also includes a variety of simple worked examples that introduce the techniques to the novice user. The book is structured in two main parts. The first part is organized around a flow that introduces readers to Model Driven Engineering (MDE) concepts and technologies in a pragmatic manner. It starts with definitions of modeling and MDE, and then moves into a deeper discussion of how to express the knowledge of particular domains using modeling languages to ease the development of systems in the domains. The second part of the book presents examples of applications of the model-driven approach to different types of software systems. In addition to illustrating the unification power of models in different software domains, this part demonstrates applicability from different starting points (language, business knowledge, standard, etc.) and focuses on different software engineering activities such as Requirement Engineering, Analysis, Design, Implementation, and V&V. Each chapter concludes with a small set of exercises to help the reader reflect on what was learned or to dig further into the examples. Many examples of models and code snippets are presented throughout the book, and a supplemental website features all of the models and programs (and their associated tooling) discussed in the book."
352,hal-01224096,2015,{Globalizing Domain-Specific Languages},"Combemale, Benoit and Cheng, Betty H.C. and France, Robert B. and Jézéquel, Jean-Marc and Rumpe, Bernhard",
353,hal-03359519,2021,{HiPEAC Vision 2021},"Duranton, Marc and de Bosschere, Koen and Coppens, Bart and Gamrat, Christian and Hoberg, Thomas and Munk, Harm and Roderick, Catherine and Vardanega, Tullio and Zendra, Olivier","Our world is evolving very rapidly, both from the technological point of view – with impressive advances in artificial intelligence and new hardware challenging longstanding PC hardware traditions, for example – and as a result of unexpected events. The year 2020 was quite exceptional, an annus horribilis, according to some. It is hard to disagree with this statement, but every dark cloud has a silver lining. 2020 was also the year that accelerated digital transformation beyond what could have been imagined in 2019. Vaccine development happened faster than would ever have been conceivable a year ago, digital payment became the norm for many people and e-commerce and online sales threatened brick and mortar shops. Employees were encouraged to work from home – with its advantages and disadvantages, videoconferencing became the de facto way to interact with both family and colleagues, schools were forced to experiment with distance learning. The list goes on. After living for over a year in an online world, most people will not return completely to the “old normal”. They will go for a combination of the “old normal” and things they discovered and experimented with in the circumstances forced upon us by COVID-19; they might keep their home office on some days, and be in the workplace on other days. Higher education will certainly also continue to offer online teaching. The rapidly evolving digital world has also had an impact on the HiPEAC Vision: updating it every two years no longer seems quite in keeping with the speed of the evolution of computing systems. Therefore, we decided to move from producing a large roadmap document every other year, to an agile, rapidly evolving electronic magazine-like set of articles. The HiPEAC Vision 2021 has two main parts: 1) A set of recommendations for the HiPEAC community. It will also introduce the second part of the Vision and will be updated periodically, or on particular occasions. 2) A set of “articles”, like in magazines, that will be regularly updated, the purpose of which is to support the set of recommendations, or to introduce new topics or situations. This will guarantee that the HiPEAC Vision keeps evolving and remains up to date. These articles are intended to be self-sufficient and can be read independently. They are grouped into four themes or dimensions: technical, business, societal and European. New articles will be added over the course of the years (and outdated ones might be removed). A further element of this new approach to the Vision is that the editorial board asked and will ask various authors to contribute to those articles. This adds heterogeneity and a diversity of points of view that could be helpful for better analysis of the computing systems landscape as well as improve the quality of the recommendations."
354,hal-04023794,2023,{HiPEAC Vision 2023},"Duranton, Marc and Bosschere, Koen De and Coppens, Bart and Gamrat, Christian and Gray, Madeleine and Hoberg, Thomas and Munk, Harm and Robinson, Charles and Vardanega, Tullio and Zendra, Olivier","The focus of the HiPEAC Vision 2023 is that we are in races, both against time and with the rest of the world. Technology never stands still. The last few years have once again seen rapid, profound changes across the world, both from the technological point of view – with impressive advances in artificial intelligence – and from the geopolitical point of view, where technology is increasingly seen as a strategic asset. Different world regions are competing for leadership in several areas. Competition between the United States (US) and China in the technology and artificial intelligence (AI) domains is particularly fierce, and it is becoming more intense. This creates a threat to Europe, but at the same time an opportunity. The recent change of ownership and leadership at Twitter is also a wake-up call for Europe. Many of the essential services the European society depends on run on platforms that are not controlled by Europe. This creates vulnerability in the event of conflict, comparable to European dependency on Russian energy. These are just the evolutions of the last year. Change is taking place so fast that it is also having an impact on the HiPEAC Vision: updating it every two years is no longer sufficient to keep up with the speed of the evolution of computing systems. Therefore, from now on, there will be a HiPEAC Vision every year. The speed of the evolution has also inspired the editorial board to present the challenges of our community as six leadership races: for the “next web”, for AI, for innovative hardware solutions, for cybersecurity, for digital sovereignty, and for sustainability solutions. Structurally, the HiPEAC Vision 2023 has two parts: - A set of recommendations for the HiPEAC community at large. - A set of articles written by experts and grouped into six chapters, each describing one “global leadership race”. We encourage you to read the complete document, but to clearly understand the recommendations, the reader should read at least the following section “The drivers of the HiPEAC Vision 2023” and the “Key recommendations of the HiPEAC Vision 2023”. The HiPEAC Vision will be available on the HiPEAC website, with easy access to each article, and to the consolidated version. The consolidated version of the Vision will also be printed and distributed, as with previous editions."
355,hal-01657866,2017,{Verifying the configuration of Virtualized Network Functions in Software Defined Networks},"Pelay, Johan and Guillemin, Fabice and Barais, Olivier","The deployment of modular virtual network functions (VNFs) in software defined infrastructures (SDI) enables cloud and network providers to deploy integrated network services across different resource domains. It leads to a large interleaving between network configuration through software defined network controllers and VNF deployment within this network. Most of the configuration management tools and network orchestrators used to deploy VNFs lack an abstraction to express Assume-Guarantee contracts between the VNF and the SDN configuration. Consequently, VNF deployment can be inconsistent with network configurations. To tackle this challenge, in this paper, we present an approach to check the consistency between the VNF description, described as a set of structural models and flow-chart models, and a proposed deployment on a real SDN infrastructure with its own configuration manager. We illustrate our approach on a virtualized Evolved Packet Core function."
356,hal-01077788,2014,{Variability Management in Domain-Specific Languages},"Méndez-Acuña, David and Combemale, Benoit and Baudry, Benoit","Domain-specific languages (DSLs) allow domain experts to express solutions directly in terms of relevant domain concepts and, for example, use generative mechanisms to transform DSL specifications into software artifacts (e.g. code, configuration files or documentation), thus abstracting away from the complexity of the rest of the system and the intricacies of its implementation. As a result, the construction of DSLs is becoming a recurrent activity during the development of software intensive systems. However, the construction of DSLs is a challenging task due to the specialized knowledge it requires; in order to successfully perform such an activity, an engineer must possess not only solid modeling skills but also the technical expertise for defining specific artifacts such as grammars, metamodels, compilers, and interpreters, among others. The situation becomes even more challenging in the context of multi-domain companies (e.g. Thales) where several domains coexist across the business units and, consequently, there is a need to deal with families of DSLs. A family of DSLs is a set of DSLs that share some commonalities and differ by some variability that, in turn, is materialized in variation points along three different dimensions: functional, syntactical and semantical. Functional variation points refer to the capability of creating DSLs that include only a subset of the constructs of the whole language, so it is possible to create stakeholder-specific DSLs while keeping them as simple as possible. Syntactic variation points refer to different representations of the same concept (e.g., graphical and textual representations). Finally, semantic variation points refer to different interpretations of the same concept by two members of the family (e.g., the concept fork in state machines can be interpreted as a concurrency point where all the output transitions are dispatched simultaneously, or simply as a bifurcation point where the output transitions are dispatched sequentially). Recent research works have demonstrated the benefits of using software product line engineering (SPLE) in the construction of families of DSLs. All of these works agree on the need for a modularization approach that enables the decomposition of a DSL into independent modules, and a variability management mechanism for effectively dealing with the differences and commonalities among the DSLs of the family. The research summarized in this document aims to contribute to this study. Concretely speaking, we work on formalizing the alignment between the modularization approach and the variability management mechanism, taking into account the three dimensions of variability. Our preliminary results suggest the need to define language interfaces for addressing each variability realization technique in the particular case of DSLs. In addition, a strategy should be conceived for modeling the multi-dimensional variability existing in families of DSLs in a way that facilitates the configuration and derivation of DSLs according to the specific needs of the users. As part of the validation process, we are applying all these ideas to a real-world industrial case study in the context of Thales Group."
357,hal-01218571,2015,{Model-Driven Multi-Cloud Resource Management},"Paraïso, Fawaz and Parpaillon, Jean and Merle, Philippe",This poster presents a novel model-driven approach for managing resources of multiple clouds.
358,hal-02317918,2018,{Demand Side Management: A model driven approach to promote energy self-consumption},"Rio, Alexandre and Maurel, Yoann and Barais, Olivier and Bugni, Yoran",
359,hal-03613422,2022,{Journal First: Interacto: A Modern User Interaction Processing Model},"Blouin, Arnaud and Jézéquel, Jean-Marc",
360,hal-01074602,2014,{GEMOC 2014 2nd International Workshop on The Globalization of Modeling Languages},"Combemale, Benoit and Deantoni, Julien and France, Robert",Proceedings of the 2nd International Workshop on The Globalization of Modeling Languages
361,hal-01242558,2015,{Joint Proceedings of the 3rd International Workshop on the Globalization Of Modeling Languages and the 9th International Workshop on Multi-Paradigm Modeling},"Combemale, Benoit and Deantoni, Julien and Gray, Jeff",
362,hal-01110914,2014,{Software Language Engineering},"Combemale, Benoit and Pearce, David and Barais, Olivier and Vinju, Jurgen",
363,hal-01628959,2016,"{Fourth International Workshop on Reverse Variability Engineering (REVE 2016), associated with SPLC}","Lopez-Herrejon, Roberto Erick and Ziadi, Tewfik and Martinez, Jabier and Acher, Mathieu",
364,hal-01665572,2017,{Report on the 1 st International Workshop on Debugging in Model-Driven Engineering (MDEbug'17)},"van Mierlo, Simon and Bousse, Erwan and Vangheluwe, Hans and Wimmer, Manuel and Verbrugge, Clark and Gogolla, Martin and Tichy, Matthias and Blouin, Arnaud","System developers spend a significant part of their time debugging systems (i.e., locating and fixing the cause of failures observed through verification and validation (V&V)). While V&V techniques are commonly used in model-driven engineering, locating and fixing the cause of a failure in a modelled system is most often still a manual task without tool-support. Although debugging techniques are well-established for programming languages, only a few debugging techniques and tools for models have been proposed. Debugging models faces various challenges: handling a wide variety of models and modelling languages; adapting debugging techniques initially proposed for programming languages; tailoring debugging approaches for the domain expert using the abstractions of the considered language. The aim of the first edition of the MDEbug workshop was to bring together researchers wanting to contribute to the emerging field of debugging in model-driven engineering by discussing new ideas and compiling a research agenda. This paper summarizes the workshop's discussion session and distils a list of challenges that should be addressed in future research."
365,hal-01307091,2016,{Large-scale Analysis of Chess Games with Chess Engines: A Preliminary Report},"Acher, Mathieu and Esnault, François","The strength of chess engines together with the availability of numerous chess games have attracted the attention of chess players, data scientists, and researchers during the last decades. State-of-the-art engines now provide an authoritative judgement that can be used in many applications like cheating detection, intrinsic ratings computation, skill assessment, or the study of human decision-making. A key issue for the research community is to gather a large dataset of chess games together with the judgement of chess engines. Unfortunately the analysis of each move takes a lot of time. In this paper, we report our effort to analyse almost 5 million chess games with a computing grid. During summer 2015, we processed 270 million unique played positions using the Stockfish engine at a quite high depth (20). We populated a database of more than one terabyte of chess evaluations, representing an estimated time of 50 years of computation on a single machine. Our effort is a first step towards the replication of research results, the supply of open data and procedures for exploring new directions, and the investigation of software engineering/scalability issues when computing billions of moves."
366,hal-02147012,2019,{Learning From Thousands of Build Failures of Linux Kernel Configurations},"Acher, Mathieu and Martin, Hugo and Alves Pereira, Juliana and Blouin, Arnaud and Eddine Khelladi, Djamel and Jézéquel, Jean-Marc","The Linux kernel offers more than ten thousand configuration options that can be combined to build an almost infinite number of kernel variants. Developers and contributors spend significant effort and computational resources to continuously track and hopefully fix configurations that lead to build failures. In this experience paper, we report on our endeavor to develop an infrastructure, called TuxML, able to build any kernel configuration and learn what could explain or even prevent configurations' failures. Our results over 95,000+ configurations show that TuxML can accurately cluster 3,600+ failures, automatically trace the responsible configuration options, and learn by itself to avoid unnecessary and costly builds. Our large qualitative and quantitative analysis reveals insights about Linux itself (e.g., we only found 16 configuration bugs) and the difficulty to engineer a build infrastructure for configurable systems (e.g., a false positive failure may mask true configuration bugs)."
367,hal-02314830,2019,{Learning Very Large Configuration Spaces: What Matters for Linux Kernel Sizes},"Acher, Mathieu and Martin, Hugo and Pereira, Juliana Alves and Blouin, Arnaud and Jézéquel, Jean-Marc and Khelladi, Djamel Eddine and Lesoil, Luc and Barais, Olivier","Linux kernels are used in a wide variety of appliances, many of them having strong requirements on the kernel size due to constraints such as limited memory or instant boot. With more than ten thousand configuration options to choose from, obtaining a suitable trade-off between kernel size and functionality is an extremely hard problem. Developers, contributors, and users actually spend significant effort to document, understand, and eventually tune (combinations of) options for meeting a kernel size. In this paper, we investigate how machine learning can help explain what matters for predicting a given Linux kernel size. Unveiling what matters in such a very large configuration space is challenging for two reasons: (1) whatever the time we spend on it, we can only build and measure a tiny fraction of possible kernel configurations; (2) the prediction model should be both accurate and interpretable. We compare different machine learning algorithms and demonstrate the benefits of specific feature encoding and selection methods to learn an accurate model that is fast to compute and simple to interpret. Our results are validated over 95,854 kernel configurations and show that we can achieve low prediction errors over a reduced set of options. We also show that we can extract interpretable information for refining documentation and experts' knowledge of Linux, or even assigning more sensible default values to options."
368,hal-01023159,2014,{Modeling Variability in the Video Domain: Language and Experience Report},"Alférez, Mauricio and Galindo Duarte, José Angel and Acher, Mathieu and Baudry, Benoit","This paper reports on a new domain-specific variability modeling language, called VM, resulting from close collaboration with industrial partners in the video domain. We expose the requirements and advanced variability constructs required to characterize and realize variations of physical properties of a video (such as objects' speed or scene illumination). The results of our experiments and industrial experience show that VM is effective for modeling complex variability information and can be exploited to synthesize video variants. We concluded that basic variability mechanisms are useful but not enough, attributes and multi-features are of prime importance, and meta-information is relevant for efficient variability analysis. In addition, we questioned the existence of a one-size-fits-all variability modeling solution applicable in any industry. Yet, some common needs for modeling variability are becoming apparent, such as support for attributes and multi-features."
369,hal-01067782,2014,{The Multiple Facets of Software Diversity: Recent Developments in Year 2000 and Beyond},"Baudry, Benoit and Monperrus, Martin","Early experiments with software diversity in the mid-1970s investigated N-version programming and recovery blocks to increase the reliability of embedded systems. Four decades later, the literature about software diversity has expanded in multiple directions: goals (fault-tolerance, security, software engineering); means (managed or automated diversity) and analytical studies (quantification of diversity and its impact). Our paper contributes to the field of software diversity as the first paper that adopts an inclusive vision of the area, with an emphasis on the most recent advances in the field. This survey includes classical work about design and data diversity for fault tolerance, as well as the cybersecurity literature that investigates randomization at different system levels. It broadens this standard scope of diversity, to include the study and exploitation of natural diversity and the management of diverse software products. Our survey includes the most recent works, with an emphasis on the period from 2000 to the present. The targeted audience is researchers and practitioners in one of the surveyed fields, who miss the big picture of software diversity. Assembling the multiple facets of this fascinating topic sheds a new light on the field."
370,hal-01116663,2015,{Synthesis of Attributed Feature Models From Product Descriptions: Foundations},"Bécan, Guillaume and Behjati, Razieh and Gotlieb, Arnaud and Acher, Mathieu","Feature modeling is a widely used formalism to characterize a set of products (also called configurations). As manual elaboration is a long and arduous task, numerous techniques have been proposed to reverse engineer feature models from various kinds of artefacts. But none of them synthesize feature attributes (or constraints over attributes) despite the practical relevance of attributes for documenting the different values across a range of products. In this report, we develop an algorithm for synthesizing attributed feature models given a set of product descriptions. We present sound, complete, and parametrizable techniques for computing all possible hierarchies, feature groups, placements of feature attributes, domain values, and constraints. We perform a complexity analysis w.r.t. the number of features, attributes, configurations, and domain size. We also evaluate the scalability of our synthesis procedure using randomized configuration matrices. This report is a first step that aims to describe the foundations for synthesizing attributed feature models."
371,hal-01060601,2014,{Operational Semantics of the Model of Concurrency and Communication Language},"Deantoni, Julien and Diallo, Papa Issa and Champeau, Joël and Combemale, Benoit and Teodorov, Ciprian","In the GEMOC project, MoCCML is dedicated to defining the MoC associated with the DSMLs. The purpose of this document is to define the operational semantics of the MoCCML language and also to define the first steps of an approach to provide an exhaustive exploration of MoCCML models. This document presents the operational semantics of the MoCCML language. The chapter is divided into several sections that present the grammar rules of the language and the operational rules, mainly defined on mathematical grounds and with Plotkin's structural operational semantics rules. The document presents the elements that are relevant to describe the evolution of a MoCCML model, as well as a draft of how exhaustive exploration is to be realized in a context using MoCCML models."
372,hal-00996764,2014,{Kevoree Modeling Framework (KMF): Efficient modeling techniques for runtime use},"Fouquet, François and Nain, Grégory and Morin, Brice and Daubert, Erwan and Barais, Olivier and Plouzeau, Noël and Jézéquel, Jean-Marc","The creation of Domain-Specific Languages (DSLs) counts as one of the main goals in the field of Model-Driven Software Engineering (MDSE). The main purpose of these DSLs is to facilitate the manipulation of domain specific concepts, by providing developers with specific tools for their domain of expertise. A natural approach to create DSLs is to reuse existing modeling standards and tools. In this area, the Eclipse Modeling Framework (EMF) has rapidly become the de facto standard in MDSE for building Domain-Specific Languages (DSLs) and tools based on generative techniques. However, the use of EMF-generated tools in domains like the Internet of Things (IoT), Cloud Computing or Models@Runtime runs into several limitations. In this paper, we identify several properties the generated tools must comply with to be usable in domains other than desktop-based software systems. We then challenge EMF on these properties and describe our approach to overcome the limitations. Our approach, implemented in the Kevoree Modeling Framework (KMF), is finally evaluated according to the identified properties and compared to EMF."
373,hal-02398051,2019,{Experience Report on the Development of a Specialized Multi-view Multi-stakeholder Model-Based Engineering Framework (extended version)},"Le Guernic, Gurvan",
374,hal-01385541,2016,{Experience Report on the Formal Specification of a Packet Filtering Language Using the K Framework},"Le Guernic, Gurvan and Galindo Duarte, José Angel","Many project-specific languages, including in particular filtering languages, are defined using non-formal specifications written in natural languages. This leads to ambiguities and errors in the specification of those languages. This paper reports on an experiment on using a tool-supported language specification framework (K) for the formal specification of the syntax and semantics of a filtering language having a complexity similar to that of real-life projects. In the context of this experimentation, the costs and benefits of formally specifying a language using a tool-supported framework in general (as well as the expressivity and ease of use of the K framework in particular) are evaluated."
375,hal-01095501,2014,{Analysis and Exploitation of Natural Software Diversity: The Case of API Usages},"Mendez, Diego and Baudry, Benoit and Monperrus, Martin","In this paper, we study how object-oriented classes are used across thousands of software packages. We concentrate on ""usage diversity"", defined as the different statically observable combinations of methods called on the same object. We present empirical evidence that there is significant usage diversity for many classes. For instance, we observe in our dataset that Java's String is used in 2460 different ways. Beyond those empirical observations, we show that we can use this API usage diversity to reason on the core design of object-oriented classes. We think that our pieces of evidence on API usage diversity shake up some established ideas on the nature of software and how to engineer it. Hence, we discuss those empirical results in the general context of software engineering: what are the reasons behind this diversity? What are the implications of this diversity?"
376,hal-01467299,2017,{Learning-Based Performance Specialization of Configurable Systems},"Temple, Paul and Acher, Mathieu and Jézéquel, Jean-Marc and Noel-Baron, Léo and Galindo, José A","A large scale configurable system typically offers thousands of options or parameters to let the engineers customize it for specific needs. Among the resulting many billions of possible configurations, relating option and parameter values to desired performance is then a daunting task relying on deep know-how of the internals of the configurable system. In this paper, we propose a staged configuration process to narrow the space of possible configurations to a good approximation of those satisfying the wanted high-level customer requirements. Based on an oracle (e.g. a runtime test) that tells us whether a given configuration meets the requirements (e.g. speed or memory footprint), we leverage machine learning to retrofit the acquired knowledge into a variability model of the system that can be used to automatically specialize the configurable system. We validate our approach on a set of well-known configurable software systems. Our results show that, for many different kinds of objectives and performance qualities, the approach has interesting accuracy, precision and recall after a learning stage based on a relatively small number of random samples."
377,hal-01514421,2017,{A Systematic Mapping Study on Modeling for Industry 4.0},"Wortmann, Andreas and Combemale, Benoit and Barais, Olivier","Industry 4.0 is a vision of interconnected manufacturing in which smart, interconnected production systems optimize the complete value-added chain to reduce cost and time-to-market. At the core of Industry 4.0 is the smart factory of the future, whose successful deployment requires solving challenges from many domains. Model-based systems engineering (MBSE) is a key enabler for such complex systems of systems, as can be seen from the increased number of related publications in key conferences and journals. This paper aims to characterize the state of the art of MBSE for the smart factory through a systematic mapping study on this topic. Adopting a detailed search strategy, 1466 papers were initially identified. Of these, 222 papers were selected and categorized using a particular classification scheme. Hence we present the concerns addressed by the modeling community for Industry 4.0, how these are investigated, where these are published, and by whom. The resulting research landscape can help to understand, guide, and compare research in this field. In particular, this paper identifies the Industry 4.0 challenges addressed by the modeling community, but also the challenges that seem to be less investigated."
378,hal-03652738,2022,{RISCLESS: A Reinforcement Learning Strategy to Exploit Unused Cloud Resources},"Yalles, Sidahmed and Handaoui, Mohamed and Dartois, Jean-Emile and Barais, Olivier and d'Orazio, Laurent and Boukhobza, Jalil","One of the main objectives of Cloud Providers (CP) is to guarantee the Service-Level Agreement (SLA) of customers while reducing operating costs. To achieve this goal, CPs have built large-scale datacenters. This leads, however, to underutilized resources and an increase in costs. A way to improve the utilization of resources is to reclaim the unused parts and resell them at a lower price. Providing SLA guarantees to customers on reclaimed resources is a challenge due to their high volatility. Some state-of-the-art solutions consider keeping a proportion of resources free to absorb sudden variations in workloads. Others consider stable resources on top of the volatile ones to fill in for the lost resources. However, these strategies either reduce the amount of reclaimable resources or operate on less volatile ones such as Amazon Spot instances. In this paper, we propose RISCLESS, a Reinforcement Learning strategy to exploit unused Cloud resources. Our approach consists of using a small proportion of stable on-demand resources alongside the ephemeral ones in order to guarantee customers' SLAs and reduce the overall costs. The approach decides when and how many stable resources to allocate in order to fulfill customers' demands. RISCLESS improved the CPs' profits by an average of 15.9% compared to state-of-the-art strategies. It also reduced the SLA violation time by an average of 36.7% while increasing the amount of used ephemeral resources by 19.5% on average."
379,tel-01426219,2016,{Self-adaptation for Internet of things applications},"Acosta Padilla, Francisco Javier","The Internet of Things (IoT) is gradually covering every aspect of our lives. As these systems become more pervasive, the need to manage this complex infrastructure comes with several challenges. Indeed, plenty of small interconnected devices now provide more than a single service in several aspects of our everyday life, and they need to be adapted to new contexts without interrupting such services. However, this new computing system differs from classical Internet systems mainly in the type, physical size and access of the nodes. Thus, typical methods to manage the distributed software layer of large distributed systems cannot be employed in this context. This is due to the very different capacities in computing power and network connectivity, which are very constrained for IoT devices. Moreover, the complexity that was previously managed by experts in several fields, such as embedded systems and Wireless Sensor Networks (WSN), is now increased by the larger quantity and heterogeneity of the nodes' software and hardware. Therefore, we need efficient methods to manage the software layer of these systems, taking into account the very limited resources. This underlying hardware infrastructure raises new challenges in the way we administrate the software layer of these systems. These challenges can be divided into: intra-node, where we face the limited memory and CPU of IoT nodes in order to manage the software layer; and inter-node, where a new way to distribute updates is needed, due to the different network topology and the energy cost for battery-powered devices. Indeed, the limited computing power and battery life of each node, combined with the very distributed nature of these systems, greatly add complexity to the management of the distributed software layer. Software reconfiguration of nodes in the Internet of Things is a major concern for various application fields. In particular, distributing the code of updated or new software features to their final node destination in order to adapt the system to new requirements has a huge impact on energy consumption. Most current algorithms for disseminating code over the air (OTA) are meant to disseminate a complete firmware in small chunks and are often implemented at the network layer, thus ignoring all guiding information from the application layer. First contribution: a models@runtime engine able to represent a running IoT application on resource-constrained nodes. We transformed the Kevoree meta-model into C code to meet the specific memory constraints of an IoT device, and proposed modelling tools to manipulate a model@runtime. Second contribution: component decoupling of an IoT system as well as an efficient component distribution algorithm. Decoupling the components of an application in the context of the IoT facilitates its representation on the model@runtime, while providing a way to easily change its behaviour by adding/removing components and changing their parameters. In addition, a mechanism to distribute such components using a new algorithm, called Calpulli, is proposed."
380,tel-01416129,2016,{Metamodels and feature models : complementary approaches to formalize product comparison matrices},"Bécan, Guillaume","Product Comparison Matrices (PCMs) abound on the Web. They provide a simple representation of the characteristics of a set of products. However, the lack of formalization and the large diversity of PCMs challenge the development of software for processing these matrices. In this thesis, we develop two complementary approaches for the formalization of PCMs. The first one consists in a precise description of the structure and semantics of PCMs in the form of a metamodel. We also propose an automated transformation from PCMs to PCM models conformant to the metamodel. The second one consists in synthesizing attributed feature models from a class of PCMs. With our contributions, we propose a generic and extensible approach for the formalization and exploitation of PCMs."
381,tel-01388392,2016,{Mining and Modeling Variability from Natural Language Documents: Two Case Studies},"Ben Nasr, Sana","Domain analysis is the process of analyzing a family of products to identify their common and variable features. This process is generally carried out by experts on the basis of existing informal documentation. When performed manually, this activity is both time-consuming and error-prone. In this thesis, our general contribution is to address mining and modeling variability from informal documentation. We adopt Natural Language Processing (NLP) and data mining techniques to identify features, commonalities, differences and feature dependencies among related products. We investigate the applicability of this idea by instantiating it in two different contexts: (1) reverse engineering Feature Models (FMs) from regulatory requirements in the nuclear domain and (2) synthesizing Product Comparison Matrices (PCMs) from informal product descriptions. In the first case study, we adopt NLP and data mining techniques based on semantic analysis, requirements clustering and association rules to assist experts when constructing feature models from these regulations. In the second case study, our proposed approach relies on contrastive analysis technology to mine domain-specific terms from text, information extraction, term clustering and information clustering. The main lesson learnt from the two case studies is that the exploitability and the extraction of variability knowledge depend on the context, the nature of variability and the nature of text."
382,tel-01385022,2016,{Interactive Demand-Shifting in the Context of Domestic Micro-Generation},"Bourgeois, Jacky","The combination of ubiquitous computing and emerging energy technologies is radically changing the home energy landscape. Domestic micro-generation, dominated by solar photovoltaic, is increasing at a rapid pace. This represents an opportunity for creating and altering energy behaviours. However, these transformations generate new challenges that we call the domestic energy gap: domestic electricity consumption and micro-generation are out of sync. Micro-generation is mainly uncontrollable production relying on weather while domestic energy consumption tends to happen mostly during the evening. This thesis focuses on understanding and supporting new domestic practices in the context of domestic solar electricity generation, looking at ‘Demand-Shifting’. Specifically, we look at how digital tools can leverage Demand-Shifting practices in the context of domestic micro-generation. Relying on a mixed-method approach, we provide a qualitative and quantitative answer with the collaboration of 38 participating households in several field studies, including two spanning more than eight months. Through a deep investigation of laundry and electric mobility routines in the context of domestic micro-generation, we observed a natural engagement in Demand-Shifting, which appeared to be a complex and time-consuming task for participants and was not visible when we analysed their quantitative data. We revealed this complexity through Participatory Data Analyses, a method we designed to analyse the data in collaboration with the participating householders. This provided us with a comprehensive view of the relationship between domestic micro-generation and daily routines. Finally, we highlight the need for timely and contextual support through the deployment of interventions in-the-wild. Building on discussions of our findings in light of the literature, we propose a conceptual framework to support domestic interactive Demand-Shifting."
383,tel-01598821,2017,{Automatic non-functional testing and tuning of configurable generators},"Boussaa, Mohamed","Generative software development has paved the way for the creation of multiple generators (code generators and compilers) that serve as a basis for automatically producing code for a broad range of software and hardware platforms. With full automatic code generation, users are able to rapidly synthesize software artifacts for various software platforms. In addition, they can easily customize the generated code for the target hardware platform since modern generators (i.e., C compilers) have become highly configurable, offering numerous configuration options that the user can apply. Consequently, the quality of generated software becomes highly correlated with the configuration settings as well as with the generator itself. In this context, it is crucial to verify the correct behavior of generators. Numerous approaches have been proposed to verify the functional outcome of generated code but few of them evaluate the non-functional properties of automatically generated code, namely the performance and resource usage properties. This thesis addresses three problems: (1) Non-functional testing of generators: We benefit from the existence of multiple code generators with comparable functionality (i.e., code generator families) to automatically test the generated code. We leverage the metamorphic testing approach to detect non-functional inconsistencies in code generator families by defining metamorphic relations as test oracles. We define the metamorphic relation as a comparison between the variations of performance and resource usage of code generated from the same code generator family. We evaluate our approach by analyzing the performance of HAXE, a popular code generator family. Experimental results show that our approach is able to automatically detect several inconsistencies that reveal real issues in this family of code generators. (2) Generator auto-tuning: We exploit the recent advances in search-based software engineering in order to provide an effective approach to tune generators (i.e., through optimizations) according to the user's non-functional requirements (i.e., performance and resource usage). We also demonstrate that our approach can be used to automatically construct optimization levels that represent optimal trade-offs between multiple non-functional properties such as execution time and resource usage requirements. We evaluate our approach by verifying the optimizations performed by the GCC compiler. Our experimental results show that our approach is able to auto-tune compilers and construct optimizations that yield better performance results than standard optimization levels. (3) Handling the diversity of software and hardware platforms in software testing: Running tests and evaluating the resource usage in heterogeneous environments is tedious. To handle this problem, we benefit from the recent advances in lightweight system virtualization, in particular container-based virtualization, in order to offer effective support for automatically deploying, executing, and monitoring code in heterogeneous environments, and for collecting non-functional metrics (e.g., memory and CPU consumption). This testing infrastructure serves as a basis for evaluating the experiments conducted in the first two contributions."
384,tel-01238005,2015,{Execution trace management to support dynamic V\&V for executable DSMLs},"Bousse, Erwan","Dynamic verification and validation (V&V) techniques are required to ensure the correctness of executable models. Most of these techniques rely on the concept of execution trace, which is a sequence containing information about an execution. Therefore, to enable dynamic V&V of executable models conforming to any executable domain-specific modeling language (xDSML), it is crucial to provide efficient facilities to construct and manipulate all kinds of execution traces. To that effect, we first propose a scalable model cloning approach to conveniently construct generic execution traces using model clones. Using a random metamodel generator, we show that this approach is scalable in memory with little manipulation overhead. We then present a generative approach to define multidimensional and domain-specific execution trace metamodels, which consists in creating the execution trace data structure specific to an xDSML. Thereby, execution traces of models conforming to this xDSML can be efficiently captured and manipulated in a domain-specific way. We apply this approach to two existing dynamic V&V techniques, namely semantic differencing and omniscient debugging. We show that such a generated execution trace metamodel provides good usability and scalability for dynamic early V&V support for any xDSML. Our work has been implemented and integrated within the GEMOC Studio, which is a language and modeling workbench resulting from the eponymous international initiative."
385,tel-03589271,2021,"{Conception, maintenance et évolution non-cassante des API REST}","Cheron, Antoine","The World Wide Web is mainly composed of two types of application components: applications and services. Applications, whether they are mobile or Web applications (i.e. intended to be used from a browser), have in common that they are essentially fill-in-the-blank templates that communicate with services to customize the application for each user. It is therefore the service that owns and manages the data. To make this communication possible, services offer APIs following the REST architecture. The management of the life cycle of a REST API is then a central element of the development of systems on the Web. The first step in this life cycle is the definition of the requirements of an API (functionality and software properties). Then, the technologies that will allow it to be designed, implemented and documented are chosen. It is then implemented, documented and put online. From then on, applications can use it. Then follows a phase of maintenance and evolution of the API, in which bugs are fixed and functionalities evolve to adapt to changes in its users' expectations. In this thesis, we review the methods and technologies that accompany the developer during this life cycle. We identify two open challenges. First, there are many technologies for creating and documenting an API. Choosing the most relevant technologies for a project is a difficult task. As a first contribution of this thesis, we establish criteria to compare these technologies. Then, we use these criteria to compare existing technologies and propose three comparison matrices. Finally, to simplify this selection, we have developed an open-source wizard available on the Web, which guides the developer in this choice. The second challenge we have identified is related to the maintenance and evolution of REST APIs. The existing literature does not allow a REST API to evolve freely, without the risk of breaking the applications that use it (its clients). The second contribution of this work is a new approach to the co-evolution of REST APIs and their clients. We have identified that, by following 7 rules governing the documentation of the API and the data it returns to its clients, it is possible to create Web user interfaces capable of adapting to the majority of REST API evolutions without producing bugs, breaking the clients, or even requiring the modification of their source code."
386,tel-01943728,2018,{User controlled trust and security level of Web real-time communications},"Corre, Kevin","In this thesis, we propose three main contributions. In our first contribution we study the WebRTC identity architecture and more particularly its integration with existing authentication delegation protocols. This integration has not been studied yet. To fill this gap, we implement components of the WebRTC identity architecture and comment on the issues encountered in the process. In order to answer RQ1, we then study this specification from a privacy perspective and identify new privacy considerations related to the central position of identity providers. On the Web, the norm is the silo architecture, of which users are captive. This is even more true of authentication delegation systems where most of the time it is not possible to freely choose an identity provider. In order to answer RQ3, we conduct a survey on the top 500 websites according to Alexa.com to identify the reasons why users cannot choose their identity provider. Our results show that while the choice of an identity provider is possible in theory, the lack of implementation of existing standards by websites and identity providers prevents users from making this choice. In our second contribution, we aim at giving more control to users. To this end, and in order to answer RQ2, we extend the WebRTC specification to allow identity parameter negotiation. We present a prototype implementation of our proposition to validate it. It reveals some limits due to the WebRTC API, in particular preventing feedback on the other peer's authentication strength. We then propose a web API allowing users to choose their identity provider in order to authenticate on a third-party website, answering RQ2. Our API reuses components of the WebRTC identity architecture in a client-server authentication scenario. Again, we validate our proposition by presenting a prototype implementation of our API based on a Firefox extension. Finally, in our third contribution, we look back on RQ1 and propose a trust and security model of a WebRTC session. Our proposed model integrates in a single metric the security parameters used in the session establishment, the encryption parameters for the media streams, and trust in the actors of the communication setup as defined by the user. The objective of our model is to help non-expert users better understand the security of their WebRTC session. To validate our approach, we conduct a preliminary study on the comprehension of our model by non-expert users. This study is based on a web survey inviting users to interact with a dynamic implementation of our model."
387,tel-03854875,2022,{Vers un environnement de développement intégré flexible},"Coulon, Fabien","Digitization is leading domain experts to use development environments, software that is increasingly turning to Web and Cloud technologies. This raises the questions of how to take advantage of the diversity of execution platforms by distributing the heterogeneous services of these environments to the places adapted to their needs, and how to combine the specific strengths of several technological stacks to implement these services. The contributions of this thesis are a generative approach to distribute the services of a development environment and a mechanism to synchronize services implemented in different technological stacks."
388,tel-03009816,2020,{Leveraging Cloud unused heterogeneous resources for applications with SLA guarantees},"Dartois, Jean-Emile","Efficiently managing Cloud resources and reducing costs are major concerns for Cloud providers, for both economic and ecological reasons. However, it has been observed that the average usage of resources remains low, between 25% and 35% for the CPU. One way to improve Cloud data center resource utilization, and thus reduce the total cost of ownership, is to reclaim unused Cloud resources. However, reselling resources requires meeting customers' expectations in terms of quality of service. The goal of this thesis is to leverage unused Cloud resources for applications with SLA guarantees. To achieve that, this thesis proposes four contributions. The first one focuses on estimating real system capacity by considering SSD interference. The second aims at estimating future use to provide availability guarantees. Then, a third contribution demonstrates the possibility of leveraging unused Cloud resources for big data without interfering with the co-located workloads. Finally, the last contribution aims at preventing malicious infrastructure owners from sabotaging the computation."
389,tel-01427009,2016,{Composition and Interoperability for External Domain-Specific Language Engineering},"Degueule, Thomas","Following the principles of Model-Driven Engineering and Language-Oriented Programming, Domain-Specific Languages (DSLs) are now developed in numerous domains to address specific concerns in the development of complex systems. However, despite many advances in Software Language Engineering, DSLs and their tooling still suffer from substantial development costs which hamper their successful adoption in the industry. We identify two main challenges to be addressed. First, the proliferation of independently developed and constantly evolving DSLs raises the problem of interoperability between similar languages and environments. Language users must be given the flexibility to open and manipulate their models using different variants and versions of various environments and services to foster collaboration in the development of complex systems. Second, since DSLs and their environments suffer from high development costs, tools and methods must be provided to assist language designers and mitigate development costs. In this thesis, we address these challenges through three interconnected contributions. First, we propose the notion of language interface. Using language interfaces, one can vary or evolve the implementation of a DSL while retaining the compatibility with the services and environments defined on its interface. Then, we present a mechanism, named model polymorphism, for manipulating models through different language interfaces. Model polymorphism opens up the possibility to safely manipulate models using different modeling environments and services. Finally, we propose a meta-language that enables language designers to reuse legacy DSLs, compose them, extend them, and customize them to meet new requirements. This approach relies on language interfaces to provide a reasoning layer for ensuring the structural correctness of composed DSLs and their safe manipulation. We implement all our contributions in a new language workbench named Melange. Melange supports the modular definition of DSLs, and the interoperability of their environments. Melange is seamlessly integrated with the de facto standard Eclipse Modeling Framework (EMF) and provides model polymorphism to any EMF-based tool of the Eclipse modeling ecosystem. Using Melange, we show how to reuse tools and services over various language families (four versions of the Unified Modeling Language, and a family of statechart languages), and how to flexibly manipulate their models. We also show how Melange eases the development of new DSLs by designing a new modeling language for Internet of Things systems as an assembly of various independently developed languages. Different perspectives directly stem from the contributions presented in this thesis. In particular, we discuss how our contributions constitute a first step towards component-based language engineering and viewpoints engineering."
390,tel-01488300,2016,{Composition et interopérabilité pour l'ingénierie des langages dédiés externes},"Degueule, Thomas","The development and evolution of Domain-Specific Languages (DSLs) is becoming a recurrent activity in the development of complex software-intensive systems. However, despite many advances in Software Language Engineering (SLE), DSLs and their tooling still suffer from substantial development costs which hamper their successful adoption in the industry. We identify two main challenges to be addressed. First, the proliferation of independently developed and constantly evolving DSLs raises the problem of interoperability between similar languages and environments. Second, since DSLs and their environments suffer from high development costs, tools and methods must be provided to assist language designers and mitigate development costs. To address these challenges, we first propose the notion of language interface. Using language interfaces, one can vary or evolve the implementation of a DSL while retaining the compatibility with the services and environments defined on its interface. Then, we present a mechanism, named model polymorphism, for manipulating models through different language interfaces. Finally, we propose a meta-language that enables language designers to reuse legacy DSLs, compose them, extend them, and customize them to meet new requirements. We implement all our contributions in a new language workbench named Melange that supports the modular definition of DSLs and the interoperability of their tooling. We evaluate the ability of Melange to solve challenging SLE scenarios."
391,tel-01127500,2014,{Leveraging model-based product lines for systems engineering},"Filho, João Bosco Ferreira","Systems Engineering is a complex and expensive activity in several kinds of companies; it forces stakeholders to deal with massive pieces of software and their integration with several hardware components. To ease the development of such systems, engineers adopt a divide and conquer approach: each concern of the system is engineered separately, with several domain specific languages (DSLs) and stakeholders. The current practice for making DSLs is to rely on Model-Driven Engineering (MDE). On the other hand, systems engineering companies also need to construct slightly different versions/variants of a same system; these variants share commonalities and variabilities that can be managed using a Software Product Line (SPL) approach. A promising approach is to ally MDE with SPL – Model-based SPLs (MSPL) – in such a way that the products of the SPL are expressed as models conforming to a metamodel and well-formedness rules. The Common Variability Language (CVL) has recently emerged as an effort to standardize and promote MSPLs. Engineering an MSPL is extremely complex for an engineer: the number of possible products is exponential; the derived product models have to conform to numerous well-formedness and business rules; and the realization model that connects a variability model and a set of design models can be very expressive, especially in the case of CVL. Managing variability models and design models is a non-trivial activity. Connecting both parts and therefore managing all the models is a daunting and error-prone task. Added to these challenges, we have the multiple different modeling languages of systems engineering. Each time a new modeling language is used for developing an MSPL, the realization layer should be revised accordingly. The objective of this thesis is to assist the engineering of MSPLs in the systems engineering field, considering the need to support it as early as possible and without compromising the existing development process. To achieve this, we provide a systematic and automated process, based on CVL, to randomly search the space of MSPLs for a given language, generating counterexamples that can serve as antipatterns. We then provide ways to specialize CVL's realization layer (and derivation engine) based on the knowledge acquired from the counterexamples. We validate our approach with four modeling languages, one of them acquired from industry; the approach generates counterexamples efficiently, and we could make initial progress in increasing the safety of the MSPL mechanisms for those languages by implementing antipattern detection rules. Besides, we also analyse big Java programs, assessing the adequacy of CVL to deal with complex languages; it is also a first step towards assessing the counterexamples qualitatively. Finally, we provide a methodology to define the processes and roles to leverage MSPL engineering in an organization."
392,tel-01187958,2015,"{Evolution, testing and configuration of variability intensive systems}","Galindo Duarte, José Ángel","The large number of configurations that a feature model can encode makes the manual analysis of feature models an error-prone and costly task. Computer-aided mechanisms have thus appeared as a solution to extract useful information from feature models. This process of extracting information from feature models is known as ''Automated Analysis of Feature Models'', which has been one of the main areas of research in recent years, with more than thirty analysis operations proposed. In this dissertation we looked for different tendencies in the automated analysis field and found several research opportunities. Driven by real-world scenarios such as the smartphone and video-surveillance domains, we contributed by applying, adapting or extending automated analysis operations to the evolution, testing and configuration of variability intensive systems."
393,tel-01308660,2015,{Supporting resource awareness in managed runtime environment},"Gonzalez Herrera, Inti Yulien","Software systems are more pervasive than ever nowadays. Occasionally, applications run on top of resource-constrained devices where efficient resource management is required; hence, they must be capable of coping with such limitations. However, applications require support from the run-time environment to properly deal with resource limitations. This thesis addresses the problem of supporting resource-aware programming in execution environments. In particular, it aims at offering efficient support for collecting data about the consumption of computational resources (e.g., CPU, memory), as well as efficient mechanisms to reserve resources for specific applications. In existing solutions we find two important drawbacks. First, they impose performance overhead on the execution of applications. Second, creating resource management tools for these abstractions is still a daunting task. The outcomes of this thesis are three contributions: an optimistic resource monitoring framework that reduces the cost of collecting resource consumption data; a methodology to select components' bindings at deployment time in order to perform resource reservation; a language to build customized memory profilers that can be used both during applications' development, and also in a production environment."
394,tel-01246035,2015,{Supporting resource-awareness in managed runtime environments},"Gonzalez-Herrera, Inti","Software systems are more pervasive than ever nowadays. Occasionally, applications run on top of resource-constrained devices where efficient resource management is required; hence, they must be capable of coping with such limitations. However, applications require support from the runtime environment to properly deal with resource limitations. This thesis addresses the problem of supporting resource-aware programming in execution environments. In particular, it aims at offering efficient support for collecting data about the consumption of computational resources (e.g., CPU, memory), as well as efficient mechanisms to reserve resources for specific applications. In existing solutions we find two important drawbacks. First, they impose performance overhead on the execution of applications. Second, creating resource management tools for these abstractions is still a daunting task. The outcomes of this thesis are three contributions: an optimistic resource monitoring framework that reduces the cost of collecting resource consumption data; a methodology to select components’ bindings at deployment time in order to perform resource reservation; a language to build customized memory profilers that can be used both during applications’ development, and also in a production environment."
395,tel-03881947,2022,{IDE as Code : reifying language protocols as first-class citizens},"Jeanjean, Pierre","The use of modern and complex programming languages requires dedicated development environments to support programmers. Integrated Development Environments (IDEs) are the most used environments today, providing all the necessary tools to use the languages they target efficiently. Providing a complete IDE for a specific language is however very costly, which leads maintainers to focus on only a few languages for each IDE, splitting their users. In order to stay relevant, small languages such as domain-specific languages (DSLs) need to be properly integrated in the environments of their users, which requires vast resources and cannot scale to the fragmentation of IDEs. In this thesis, we explore the idea of deploying development environments customized to the needs of their users, and leveraging language tools completely separated from any specific IDE. We start by considering language protocols, such as LSP, and designing a modular and extensible alternative that is more in line with the specificities of DSLs. Then, we focus on REPL interpreters, interactive language interpreters which offer great value for education and exploration but are not easily derived from existing language engineering techniques. We provide a formal approach to define REPLs, as well as a generative approach, and discuss their integration in development environments. Finally, we unify the semantics of several execution tools (execution engines, REPLs and debuggers) through the specification of the common interface between them."
396,tel-01127612,2014,{Rare event simulation for statistical model checking},"Jegourel, Cyrille","In this thesis, we consider two problems that statistical model checking must cope with. The first problem concerns heterogeneous systems, which naturally introduce complexity and non-determinism into the analysis. The second problem concerns rare properties, which are difficult to observe and therefore to quantify. Regarding the first point, we present original contributions for the formalism of composite systems in the BIP language. We propose SBIP, a stochastic extension, and define its semantics. SBIP allows recourse to the stochastic abstraction of components and eliminates non-determinism. This double effect has the advantage of reducing the size of the initial system by replacing it with a system whose semantics is purely stochastic, a necessary requirement for standard statistical model checking algorithms to be applicable. The second part of this thesis is devoted to the verification of rare properties in statistical model checking. We present a state-of-the-art algorithm for models described by a set of guarded commands. Lastly, we motivate the use of importance splitting for statistical model checking and set up an optimal splitting algorithm. Both methods pursue a common goal: reducing the variance of the estimator and the number of simulations. Nevertheless, they are fundamentally different, the first tackling the problem through the model and the second through the properties."
397,tel-01729126,2017,{Browser Fingerprinting : Exploring Device Diversity to Augment Authentication and Build Client-Side Countermeasures},"Laperdrix, Pierre","Users are presented with an ever-increasing number of choices to connect to the Internet. From desktops, laptops, tablets and smartphones, anyone can find the perfect device that suits his or her needs while factoring in mobility, size or processing power. Browser fingerprinting became a reality thanks to the software and hardware diversity that composes every single one of our modern devices. By collecting device-specific information with a simple script running in the browser, a server can fully or partially identify a device on the web and follow it wherever it goes. This technique presents strong privacy implications as it does not require the use of stateful identifiers like cookies that can be removed or managed by the user. In this thesis, we provide the following contributions: an analysis of 118,934 genuine fingerprints to understand the current state of browser fingerprinting, two countermeasures called Blink and FPRandom, and a complete protocol based on canvas fingerprinting to augment authentication on the web. Browser fingerprinting is still in its early days. As the web is in constant evolution and as browser vendors keep pushing the limits of what we can do online, the contours of this technique are continually changing. With this dissertation, we shine a light on its inner workings and its challenges, along with a new perspective on how it can reinforce account security."
398,tel-01959918,2018,{Model driven synthesis of virtual reality applications},"Le Moulec, Gwendal","Development practices in Virtual Reality (VR) are not optimized: for example, each company uses its own methods. The goal of this PhD thesis is to automate the development and evaluation of VR software with the use of Model-Driven Engineering (MDE) techniques. The existing approaches in VR do not take advantage of software commonalities. This lack of reuse and abstraction is a known problem in MDE, which proposes the Software Product Line (SPL) concept to automate the production of software belonging to the same family by reusing common components. However, this approach is not adapted to software based on a scenario, like in VR. We propose two frameworks that respectively address the gaps in MDE and VR: SOSPL (scenario-oriented software product line) and VRSPL (VR SPL). SOSPL is based on a scenario model that handles a software variability model (feature model, FM). Each scenario step matches a configuration of the FM. VRSPL is based on SOSPL. The scenario manages virtual object manipulation, the objects being generated automatically from a model. We implemented these frameworks inside tools that have been tried on examples and evaluated by their target users. The results promote the use of these frameworks for producing scenario-based software."
399,tel-02418676,2019,{On modularity and performance of External Domain-Specific Language implementations},"Leduc, Manuel","Model-Driven Engineering (MDE) aims at supporting Domain Experts when developing complex systems, by separating concerns through the use of models. Models are representations of specific aspects of a system and are defined using relevant abstractions. Such abstractions are defined using Domain-Specific Languages (DSLs). DSLs are created with specialized Integrated Development Environments (IDEs), called language workbenches. Language workbenches assist the engineering of languages by offering useful language abstractions. First, these abstractions have the benefit of providing the relevant level of abstraction for the specification of languages. Second, they address the concerns inherent to software development, such as modularity or testability. However, the benefits of these abstractions can be lost when translated to language service implementations (e.g., editors, interpreters, debuggers). This has many drawbacks, especially in terms of reusability and performance. Domain Experts are subject to these limitations and are forced to reason in terms of the low-level intricacies of language service implementations. These problems can be alleviated by exploiting the information provided by the abstractions available in the DSL specifications. To do so, we propose two new implementation patterns supporting reuse and performance, and a systematic translation of language specifications to these patterns. The first implementation pattern tackles language reuse and is called REVISITOR. The REVISITOR pattern allows the safe and modular reuse of languages, both syntactically and semantically, while relying solely on mainstream object-oriented programming concepts. The second implementation pattern addresses language performance, which is improved by introducing language-specific optimizations. These optimizations are introduced automatically, without being intrusive to the usual language development methods. We seamlessly implement our approaches on top of the Eclipse IDE ecosystem by using two metalanguages: Ecore and ALE. Ecore supports the specification of abstract syntaxes in the form of metamodels. Ecore is provided by the de facto industrial standard Eclipse Modeling Framework (EMF). ALE supports the modular specification of language semantics on top of Ecore metamodels. We provide two compilers for these metalanguages. They support respectively the compilation of language specifications to the REVISITOR pattern and the automatic introduction of performance-specific optimizations in DSL interpreter implementations. We evaluate the benefits of our approaches by implementing a varied selection of heterogeneous DSLs. Our contributions make possible the implementation of reusable or efficient DSLs for Domain Experts — while currently separated, future work aims to integrate them. In practice, our approach is both non-intrusive to the usual methodology of language engineering and based on automated code generation. Consequently, our approach is directly applicable to industrial contexts and can be integrated with legacy DSL artifacts."
400,tel-02972666,2019,{On modularity and performances of external domain-specific language implementations},"Leduc, Manuel","Model-Driven Engineering (MDE) aims at supporting Domain Experts when developing complex systems, by separating concerns through the use of models. Models are representations of specific aspects of a system and are defined using relevant abstractions. Such abstractions are defined using Domain-Specific Languages (DSLs). DSLs are created with specialized Integrated Development Environments (IDEs), called language workbenches. Language workbenches assist the engineering of languages by offering useful language abstractions. First, these abstractions have the benefit of providing the relevant level of abstraction for the specification of languages. Second, they address the concerns inherent to software development, such as modularity or testability. However, the benefits of these abstractions can be lost when translated to language service implementations (e.g., editors, interpreters, debuggers). This has many drawbacks, especially in terms of reusability and performance. Domain Experts are subject to these limitations and are forced to reason in terms of the low-level intricacies of language service implementations. These problems can be alleviated by exploiting the information provided by the abstractions available in the DSL specifications. To do so, we propose two new implementation patterns supporting reuse and performance, and a systematic translation of language specifications to these patterns. The first implementation pattern tackles language reuse and is called REVISITOR. The REVISITOR pattern allows the safe and modular reuse of languages, both syntactically and semantically, while relying solely on mainstream object-oriented programming concepts. The second implementation pattern addresses language performance, which is improved by introducing language-specific optimizations. These optimizations are introduced automatically, without being intrusive to the usual language development methods. We seamlessly implement our approaches on top of the Eclipse IDE ecosystem by using two metalanguages: Ecore and ALE. Ecore supports the specification of abstract syntaxes in the form of metamodels. Ecore is provided by the de facto industrial standard Eclipse Modeling Framework (EMF). ALE supports the modular specification of language semantics on top of Ecore metamodels. We provide two compilers for these metalanguages. They support respectively the compilation of language specifications to the REVISITOR pattern and the automatic introduction of performance-specific optimizations in DSL interpreter implementations. We evaluate the benefits of our approaches by implementing a varied selection of heterogeneous DSLs. Our contributions make possible the implementation of reusable or efficient DSLs for Domain Experts — while currently separated, future work aims to integrate them. In practice, our approach is both non-intrusive to the usual methodology of language engineering and based on automated code generation. Consequently, our approach is directly applicable to industrial contexts and can be integrated with legacy DSL artifacts."
401,tel-01232388,2015,{Testing and maintenance of graphical user interfaces},"Lelli Leitao, Valeria","The software engineering community pays special attention to the quality and the reliability of software systems. Software testing techniques have been developed to find errors in code. Software quality criteria and measurement techniques have also been assessed to detect error-prone code. In this thesis, we argue that the same attention has to be paid to the quality and reliability of GUIs, from a software engineering point of view. We specifically make two contributions on this topic. First, GUIs can be affected by errors stemming from development mistakes. The first contribution of this thesis is a fault model that identifies and classifies GUI faults. We show that GUI faults are diverse and require different testing techniques to be detected. Second, like any code artifact, GUI code should be analyzed statically to detect implementation defects and design smells. As for the second contribution, we focus on design smells that can affect GUIs specifically. We identify and characterize a new type of design smell, called Blob listener. It occurs when a GUI listener, which gathers events to process and transform into commands, can produce more than one command. We propose a systematic static code analysis procedure that searches for Blob listeners, which we implement in a tool called InspectorGuidget. The experiments we conducted exhibit positive results regarding the ability of InspectorGuidget to detect Blob listeners. To counteract the use of Blob listeners, we propose good coding practices regarding the development of GUI listeners."
402,tel-04055573,2023,{Deep Software Variability for Resilient Performance Models of Configurable Systems},"Lesoil, Luc","Context. Software systems are heavily configurable, in the sense that users can adapt them according to their needs thanks to configurations. But not all configurations are equal, and some of them will clearly be more efficient than others in terms of performance. For human beings, it is quite complex to handle all the possible configurations of a system and to choose one of them to reach a performance goal. Research has shown that machine learning can bridge this gap and predict the performance of a software system based on its configuration. Problem. These techniques do not include the execution environment as part of the training data, while it could interact with the different configuration options and change their related performance distribution. In short, our machine learning models are too simple and will not be useful or applicable for end-users. Contributions. In this thesis, we first propose the term deep variability to refer to the existing interactions between the environment and the configurations of a software system, altering its performance distribution. We then empirically demonstrate the existence of deep variability and propose a few solutions to tame the related issues. Finally, we prove that machine learning models can be adapted to be robust by design to deep variability."
403,hal-03355775,2021,{Urban mobility : Leveraging machine learning and data masses for the building of simulators},"Lyan, Gauthier","The so-called data era we have entered is accompanied by an explosion of data, both in variety and quantity. Public transportation is a data-intensive field, and related information systems are often supported by old technologies that struggle to keep up as the amount of data continually increases. This poses two problems. First, the massive data generated by the transportation network must be qualified and enriched with external data sources in order to be used for decision making. Second, in order to limit the number of tools and the complexity of maintenance, it is desirable to integrate data governance with decision support tools to allow non-expert operators to manipulate this data. Through four contributions leading to the proposal of a technical framework that integrates the past, present and future into a traditional information system containing a priori models, this thesis argues that the integration of various highly qualified datasets from the real world into a single spatio-temporal model provides a qualitative, efficient and low-cost means of analysis, prediction and strategic decision support for bus networks, while deprecating the use of non-integrated, multi-tool data management systems."
404,tel-03520672,2021,{Urban mobility : leveraging machine learning and data masses for the building of simulators},"Lyan, Gauthier","The so-called data era we have entered is accompanied by an explosion of data, both in variety and quantity. Public transportation is a data-intensive field, and related information systems are often supported by old technologies that struggle to keep up as the amount of data continually increases. This poses two problems. First, the massive data generated by the transportation network must be qualified and enriched with external data sources in order to be used for decision making. Second, in order to limit the number of tools and the complexity of maintenance, it is desirable to integrate data governance with decision support tools to allow non-expert operators to manipulate this data. Through four contributions leading to the proposal of a technical framework that integrates the past, present and future into a traditional information system containing a priori models, this thesis argues that the integration of various highly qualified datasets from the real world into a single spatio-temporal model provides a qualitative, efficient and low-cost means of analysis, prediction and strategic decision support for bus networks, while deprecating the use of non-integrated, multi-tool data management systems."
405,tel-03698474,2021,{Machine learning for performance modelling on colossal software configuration spaces},"Martin, Hugo","Almost all of today's software systems are configurable. With the help of options, it is possible to modify the behavior of these systems, to add or remove certain capabilities to improve their performance or to adapt them to different situations. Each of these options is linked to certain parts of the code, and ensuring that these parts work well together, or that they cannot be used together, is one of the challenges during the development and the usage of these software products, known as Software Product Lines (SPL). While this may seem relatively simple with a few options, some software assembles thousands of options spread over millions of lines of code, making the task much more complex. Over the past decade, researchers have begun to use machine learning techniques to address the problems of Software Product Lines. One of the key problems is the prediction of different properties of software, such as the speed of execution of a task, which can vary greatly depending on the configuration of the software used. Measuring the properties for each configuration can be costly and complex, or even impossible in the most extreme cases. The creation of a model that predicts the properties of the system, with the help of measurements on only a small part of the possible configurations, is a task in which machine learning excels. Different solutions have been developed, but they have only been validated in cases where the number of options is quite small. However, a large proportion of SPLs have hundreds or even thousands of options. Without testing machine learning solutions on systems with so many options, it is impossible to know if these solutions are suitable for such cases. The first contribution of this thesis is the application of machine learning algorithms on a Software Product Line at a scale never before achieved. Using Linux and its 15,000 options, it was possible to determine that linear algorithms, but also those specialized for SPLs, are not able to work properly at this scale. Only tree-based algorithms, as well as neural networks, were able to provide a fairly accurate model with reasonable resources in terms of time and memory. The second contribution is the Feature Ranking List, a list of options ranked by the importance of their impact on a target software property, generated by an improved feature selection based on decision trees. We evaluated its effects on Linux kernel binary size prediction models under the same conditions as the first contribution. The desired and best known effect of feature selection in general is a major speed-up in learning time as well as a significant improvement in accuracy for most of the previously considered algorithms. The third contribution is the improvement of automated performance specialization and its evaluation on different SPLs including Linux. Performance specialization is a process that consists of adding constraints to an SPL in order to meet a certain performance threshold defined by the user, to help them when configuring the software. The results show that it is possible to obtain a sufficiently accurate set of rules, even on Linux."
406,tel-03675149,2021,{Machine Learning for Performance Modelling on Colossal Software Configuration Spaces},"Martin, Hugo","Variability is the blessing and the curse of today's software development. On one hand, it allows for fast and cheap development, while offering efficient customization to precisely meet the needs of a user. On the other hand, the increase in complexity of the systems due to the sheer amount of possible configurations makes it hard or even impossible for users to correctly utilize them, for developers to properly test them, or for experts to precisely grasp their functioning. Machine Learning is a research domain that grew in accessibility and variety of usages over the last decades. It attracted interest from researchers from the Software Engineering domain for its ability to handle the complexity of Software Product Lines on problems they were tackling, such as performance prediction or optimization. However, all studies presenting learning-based solutions in the SPL domain failed to explore the scalability of their techniques on systems with colossal configuration spaces (>1000 options). In this thesis, we focus on the Linux Kernel. With more than 15,000 options, it is very representative of the complexity of systems with colossal configuration spaces. We first apply various learning techniques to predict the kernel binary size, and report that most of the techniques fail to produce accurate results. In particular, performance-influence models, a learning technique tailored for SPL problems, do not even work on such a large dataset. Among the tested techniques, only Tree-based algorithms and Neural Networks are able to produce an accurate model in an acceptable time. To mitigate the problems created by colossal configuration spaces on learning techniques, we propose a feature selection technique leveraging Random Forest, enhanced toward better stability. We show that by using the feature selection, the training time can be greatly reduced, and the accuracy can be improved. This Tree-based feature selection technique is also completely automated and does not rely on prior knowledge of the system. Performance specialization is a technique that constrains the configuration space of a software system to meet a given performance criterion. It is possible to automate the specialization process by leveraging Decision Trees. While only Decision Tree Classifiers have been used for this task, we explore the usage of Decision Tree Regressors, as well as a novel hybrid approach. We test and compare the different approaches on a wide range of systems, as well as on Linux to ensure the scalability on colossal configuration spaces. In most cases, including Linux, we report at least 90% accuracy, with each approach having its own particular strengths compared to the others. Lastly, we also leverage the Tree-based feature selection, whose most notable effect is the reduction of the training time of Decision Trees on Linux, dropping from one minute to a second or less. The last contribution explores the sustainability of a performance model across versions of a configurable system. We reused the model trained on the 4.13 version of Linux from our first contribution, and measured its accuracy on six later versions up to 5.8, spanning over three years. We show that a model is quickly outdated and unusable as is. To preserve the accuracy of the model over versions, we use transfer learning with the help of Tree-based algorithms to maintain it at a reduced cost. We tackle the problem of heterogeneity of the configuration space, which evolves with each version. We show that the transfer approach allows for an acceptable accuracy at low cost, and vastly outperforms a learning-from-scratch approach using the same budget. Overall, this thesis focuses on the problems of systems with colossal configuration spaces such as Linux, and shows that Tree-based algorithms are a valid solution, versatile enough to answer a wide range of problems, and accurate enough to be considered."
407,tel-01427187,2016,{Leveraging software product lines engineering in the construction of domain specific languages},"Méndez Acuña, David Fernando","The use of domain-specific languages (DSLs) has become a successful technique in the development of complex systems because it furnishes benefits such as abstraction, separation of concerns, and improvement of productivity. Nowadays, we can find a large variety of DSLs providing support in various domains. However, the construction of these languages is an expensive task. Language designers must invest a significant amount of time and effort in the definition of formal specifications and tooling for the DSLs that tackle the requirements of their companies. The construction of DSLs becomes even more challenging in multi-domain companies that provide several products. In this context, DSLs often have to be adapted to diverse application scenarios, so language development projects address the construction of several variants of the same DSL. At this point, language designers face the challenge of building all the required variants by reusing, as much as possible, the commonalities existing among them. The objective is to leverage previous engineering efforts to minimize implementation from scratch. As an alternative to deal with such a challenge, recent research in software language engineering has proposed the use of product line engineering techniques to facilitate the construction of DSL variants. This led to the notion of language product lines, i.e., software product lines where the products are languages. Similarly to software product lines, language product lines can be built through two different approaches: top-down and bottom-up. In the top-down approach, a language product line is designed and implemented through a domain analysis process. In the bottom-up approach, the language product line is built up from a set of existing DSL variants through reverse-engineering techniques. In this thesis, we provide support for the construction of language product lines according to the two approaches mentioned before. On one hand, we propose facilities in terms of language modularization and variability management to support the top-down approach. Those facilities are accompanied by methodological insights intended to guide the domain analysis process. On the other hand, we introduce a reverse-engineering technique to support the bottom-up approach. This technique includes a mechanism to automatically recover a modular language design for the language product line, as well as a strategy to synthesize a variability model that can later be used to configure concrete DSL variants. The ideas presented in this thesis are implemented in a well-engineered language workbench. This implementation facilitates the validation of our contributions in three case studies. The first case study is dedicated to validating our language modularization approach, which, as we will explain later in this document, is the backbone of any approach supporting language product lines. The second and third case studies are intended to validate our contributions on top-down and bottom-up language product lines respectively."
408,tel-02527656,2019,{Towards a modelling framework with temporal and uncertain data for adaptive systems},"Mouline, Ludovic","Self-Adaptive Systems (SAS) optimise their behaviours or configurations at runtime in response to a modification of their environments or their behaviours. These systems therefore need a deep understanding of the ongoing situation which enables reasoning tasks for adaptation operations. Using the model-driven engineering (MDE) methodology, one can abstract this situation. However, information concerning the system is not always known with absolute confidence. Moreover, in such systems, the monitoring frequency may differ from the delay for reconfiguration actions to have measurable effects. These characteristics come with a global challenge for software engineers: how to represent uncertain knowledge that can be efficiently queried, and how to represent ongoing actions in order to improve adaptation processes? To tackle this challenge, this thesis defends the need for a unified modelling framework which includes, besides all traditional elements, time and uncertainty as first-class concepts. Therefore, a developer will be able to abstract information related to the adaptation process, the environment as well as the system itself. Towards this vision, we present two evaluated contributions: a temporal context model and a language for uncertain data. The temporal context model allows abstracting past, ongoing and future actions with their impacts and context. The language, named Ain’tea, integrates data uncertainty as a first-class citizen."
409,tel-02921435,2019,{Modeling and assessing the security of the authentication procedure},"Ndiaye, Youssou","Software applications are ubiquitous in our daily life. One application may run on different platforms (e.g., Web, Mobile, Gaming Console) while being accessible simultaneously. Although this heterogeneity improves usability from the end-user’s perspective, it makes the applications more complex to maintain when they evolve, especially when it comes to verifying the end-user’s identity and authenticity. Approved standards and protocols provide means to ensure the end-user’s identity. Nevertheless, they fail to consider the risks introduced by factors of this heterogeneous context. Some of these factors are: the execution environment, usability choices, design errors and the end-user’s behaviour. This thesis provides tools and approaches that allow designers to improve the security design of their applications while considering elements from the real-life context. This approach implies identifying the main assets to protect, the risks and the threats. Our approach targets Web/Mobile applications; we mainly focus on the end-user authentication procedure, since it is vital in order to avoid unauthorized access to the legitimate user’s resources. This authentication procedure, while leveraging approved authentication schemes, considerably lacks formal specification during the design phase. First, we investigate the impact of the heterogeneous elements on the authentication procedure. Consequently, we identify the relevant flaws, which we characterize and then define as logic flaws. To overcome these flaws, we provide a set of requirements that aim to tackle them during the design phase. Second, to overcome the lack of formal specification of the authentication procedure during the design phase, we provide a Domain-Specific Language (DSL). This dedicated language implements the abstractions of a risk assessment framework that we provide. The DSL allows extending existing authentication schemes while considering real-life contexts. Then, from a given specification, it provides the result of the risk assessment of the identified logic attacks."
410,tel-03781104,2021,{Packing detection and classification relying on machine learning to stop malware propagation},"Noureddine, Lamine","In this thesis, we propose solutions for effective, efficient, and robust packing detection and classification to be practical parts of the malware analysis chain of an antivirus. Our solutions bring two contributions to the literature. In the first one, we introduce a study which aims at better understanding the impact of ground truth generation, machine learning algorithm selection, and feature selection on the effectiveness, efficiency, and robustness of supervised machine-learning-based packing detection and classification systems. In the second one, we propose, design, and implement SE-PAC, a new Self-Evolving PAcker Classifier framework that relies on incremental clustering in a semi-supervised fashion, in order to cope with the fast-paced evolution of packers over time. For both contributions, we conduct realistic experiments showing promising results in terms of effectiveness, efficiency, and robustness for packing detection and classification."
411,tel-01251551,2015,{Integrating predictive analysis in self-adaptive pervasive systems},"Paez Anaya, Ivan Dario","In this thesis we proposed a proactive self-adaptation approach that integrates predictive analysis into two phases of the software process. At design time, we propose a predictive modeling process, which includes the following activities: defining goals, collecting data, selecting the model structure, preparing data, building candidate predictive models, training, testing and cross-validating the candidate models, and selecting the ''best'' models based on a measure of model goodness. At runtime, we consume the predictions from the selected predictive models using the running system's actual data. Depending on the input data and the time allowed for learning algorithms, we argue that the software system can foresee possible future input variables of the system and adapt proactively in order to accomplish medium- and long-term goals and requirements."
412,tel-03935152,2022,{Data-Driven Malware Classification Assisted by Machine Learning Methods},"Puodzius, Cassius","Historically, malware (MW) analysis has heavily resorted to human savvy for manual signature creation to detect and classify MW. This procedure is very costly and time consuming, thus unable to cope with the modern cyber threat scenario. The solution is to widely automate MW analysis. Toward this goal, MW classification allows optimizing the handling of large MW corpora by identifying resemblances across similar instances. Consequently, MW classification figures as a key activity related to MW analysis, which is paramount in the operation of computer security as a whole. This thesis addresses the problem of MW classification taking an approach in which human intervention is spared as much as possible. Furthermore, we steer clear of subjectivity inherent to human analysis by designing MW classification solely on data directly extracted from MW analysis, thus taking a data-driven approach. Our objective is to improve the automation of malware analysis and to combine it with machine learning methods that are able to autonomously spot and reveal unwitting commonalities within data. We phased our work in three stages. Initially we focused on improving MW analysis and its automation, studying new ways of leveraging symbolic execution in MW analysis and developing a distributed framework to scale up our computational power. Then we concentrated on the representation of MW behavior, with painstaking attention to its accuracy and robustness. Finally, we fixed attention on MW clustering, devising a methodology that has no restriction in the combination of syntactical and behavioral features and remains scalable in practice. As for our main contributions, we revamp the use of symbolic execution for MW analysis with special attention to the optimal use of SMT solver tactics and hyperparameter settings; we conceive a new evaluation paradigm for MW analysis systems; we formulate a compact graph representation of behavior, along with a corresponding function for pairwise similarity computation, which is accurate and robust; and we elaborate a new MW clustering strategy based on ensemble clustering that is flexible with respect to the combination of syntactical and behavioral features."
413,tel-01087372,2014,{Web applications engineering: reduce the complexity without losing control},"Richard-Foy, Julien","Thanks to information technologies, some tasks or information processes can be automated, thus saving a significant amount of money. The web platform brings numerous such digital tools. These are hosted on web servers that centralize information and coordinate users, who can use the tools from several kinds of devices (desktop computer, laptop, smartphone, etc.) through a web browser, without installing anything. Nevertheless, developing such web applications is challenging. The difficulty mainly comes from the distance between client and server devices. First, the physical distance between these machines requires them to be networked. This raises several issues. How to manage latency? How to provide a good quality of service even when the network is down? How to choose on which side (client or server) to execute a computation? How to free developers from addressing these problems without hiding the distributed nature of web applications, so that they can still benefit from their advantages? Second, the execution environment is different between clients and servers. Indeed, on the client side the program is executed within a web browser whose API provides means of reacting to user actions and of updating the page. On the server side, the program is executed on a web server that processes HTTP requests. Some aspects of web applications can be shared between client and server sides (e.g. content display, form validation, navigation, or even some business computations). However, the APIs and environments are different between clients and servers, so how to share the same code while keeping the same execution performance as with native APIs? How to keep the opportunity to leverage the specificities of a given platform? This work aims at shortening this distance while keeping the opportunity to leverage it, that is, while giving developers as much expressive power as possible. Our first contribution consists of an architecture pattern to build interactive and collaborative web applications handling on-line and off-line modes. Our pattern captures the client-server synchronization concern, thus giving developers a simpler programming model. Our second contribution shows how to use a delayed evaluation mechanism to build high-level abstractions that can be shared between client and server sides and that generate efficient code leveraging the specificities of each platform. We observed that the size of the code written using our abstractions is similar to code that uses high-level libraries, and 35% to 50% smaller than low-level code, while execution performance is similar to low-level code and 39% to 972% faster than high-level code. Our third contribution makes it easier to use the web browser’s API from a statically typed language. Indeed, this API is designed to be used with the dynamically typed language JavaScript, and some functions are hard to encode in a static type system. Current solutions either lose type information, requiring users to perform unsafe typecasts, or reduce the expressive power. We show two ways to encode challenging web browser functions in a static type system by leveraging parameterized types and dependent types. Our approach is type-safe and keeps the same expressive power as the native API."
414,tel-01061129,2014,{Variability management and automation of software development processes},"Rouillé, Emmanuelle","Many tools have been developed in order to manage the complexity of software and of software development projects. However, using these tools is the source of manual recurrent tasks that are error-prone and time-consuming. Automating these tasks improves productivity. But the difficulties are i) to determine when the automation of a manual recurrent task must be used, and ii) to create automations that are reusable across their different use cases. We propose a tool-supported approach that drives the reuse of the automations of manual recurrent tasks by software processes. A software process defines the sequence of steps to perform in order to realize a software engineering project. This approach consists of capitalizing on a set of software processes and of reusing processes from this set according to projects' requirements and independently of the formalism used to define the processes. The automations of the manual recurrent tasks are bound to the processes' steps they automate. This binding makes it possible to know which automations to reuse for a specific project and when to reuse these automations during the project. This binding also makes explicit the different use cases of each automation. We use this information to create automations that are reusable across their different use cases. We applied this tool-supported approach to a family of Java development processes coming from industry, as well as to a family of processes consisting of designing and implementing a modeling language."
415,tel-03854849,2022,{On reliability and flexibility of scientific software in environmental science : towards a systematic approach to support decision-making},"Sallou, June","Scientific software takes centre stage in supporting decision-making related to tackling environmental issues, thanks to simulation. However, its complexity makes its execution time-consuming or resource-demanding, which is not compatible with the context of interactive decision-making. The main goal of the thesis is to tailor the scientific models, and thus scientific software, to make them relevant and efficient to use in such a context. We first demystify the scientific models and their complementarity to engineering models to understand how they interact. We present the MODA framework that defines the integration of the different types and roles that models can take in a sociotechnical system. Furthermore, we highlight the need to tailor the descriptive scientific models to make them take a predictive role and improve the prescriptive model of the decision-making process. We then investigate the specificity of the scientific models in terms of development cycle and validation process. We describe a reasoned approach for the development of reliable scientific software that allows the validity envelope of this type of software to be clearly characterized. Finally, we propose a systematic approach to tailoring the scientific models to support decision-making by trading off accuracy for flexibility. We adapt an approximate computing technique for scientific models. We evaluate it on a hydro-geological model used to assess the risk of flooding in coastal areas. Our results show a significant speed-up with a minimal set-up. We also propose a trade-off optimisation approach to generalise the tailoring of scientific models for decision-making."
416,tel-01092342,2014,{An approach of combining model-based testing with product family management},"Samih, Hamza","Software product line engineering is an approach that supports developing products in families. These products are described by common and variable features. Currently, the validation activity is disjoint from the product line development process. The effort and resources provided in the test campaigns for each product can be optimized in the context of product lines. Model-based testing is a technique for automatically generating a suite of test cases from requirements. In this thesis report, we present an approach to test a software product line with model-based testing. This technique is based on an algorithm that establishes the relationship between the variability model released with OVM and the test model, using traceability of functional requirements present in both formalisms. Our contribution is an algorithm that automatically extracts a product test model. It is illustrated with a real industrial case of automotive dashboards and experimented with by an industrial partner from the aeronautics domain in the context of the MBAT European project."
417,tel-01256926,2015,{Computability Abstractions for Fault-tolerant Asynchronous Distributed Computing},"Stainer, Julien","This thesis studies computability in systems composed of multiple computers exchanging messages or sharing memory. The considered models take into account the possible failure of some of these computers, as well as variations in time and heterogeneity of their execution speeds. The presented results essentially consider agreement problems, systems prone to partitioning and failure detectors. The document establishes relations between known iterated models and the concept of failure detector and presents a hierarchy of agreement problems spanning from k-set agreement to s-simultaneous consensus. It also introduces a new universal construction based on s-simultaneous consensus objects and a family of iterated models allowing several processes to run in isolation."
418,tel-02125962,2018,{Investigate the matrix : leveraging variability to specialize software and test suites},"Temple, Paul","Nowadays, software has to be efficient, fast to execute, etc. It can be configured to adapt to specific needs. Each configuration leads to a different system and usually it is hard to generate them all. Thus, the exhaustive evaluation of their performance is impossible. Furthermore, several executions of systems, under different conditions, are needed to properly evaluate performance. Two dimensions emerge from this description of performance testing: the selection of system configurations, allowing the generation of associated systems that meet expressed needs, and the selection of test cases, allowing the observation of the performance of systems under different conditions. We propose to represent those two dimensions as a (performance) matrix: one dimension represents selected systems that can be observed while the other dimension represents the set of test cases that will be executed on each of these systems. Each cell is the execution of a program variant against a test. The contributions of this thesis are as follows: First, we leverage Machine Learning techniques in order to specialize a Software Product Line (in this case a video generator), helping in selecting a configuration that is likely to meet requirements. End users must be able to express their requirements such that it results in a binary decision problem (i.e., configurations that are acceptable and those that are not). Machine Learning techniques are then used to retrieve partial configurations that specialize a Software Product Line to guide end users and reduce the configuration space. In the end, this work aims at diminishing the first dimension of the matrix, which deals with systems and programs. Second, we propose a new method assessing the ability of test suites to reveal significant performance differences of a set of configurations tackling the same task. This method can be used to assess whether a new test case is worth adding to a test suite or to select an optimal test set with respect to a property of interest. In the end, it might help structure the execution of tests. For instance, it can create an order of execution resulting in using fewer test cases than are present in the second dimension of the matrix. We evaluated our approach on several systems from different domains such as OpenCV or Haxe."
419,tel-02459572,2019,{Dynamic program analysis for suggesting test improvements to developers},"Vera-Pérez, Oscar Luis","Automated testing is at the core of modern software development. Yet developers struggle when it comes to the evaluation of the quality of their test cases and how to improve them. The main goal of this thesis is precisely that: to generate concrete suggestions that developers can follow to improve their test suite. We propose the use of extreme mutation, or extreme transformations, as an alternative way to discover testing issues. Extreme transformations are a form of mutation testing that removes the entire logic of a method instead of making a small syntactic change in the code. As its traditional counterpart, it challenges the test suite with a transformed variant of the program to see if the test cases can detect the change. In this thesis we assess the relevance of the testing issues that extreme transformations can spot. We also propose a dynamic infection-propagation analysis to automatically derive concrete test improvement suggestions from undetected extreme transformations. Our results are validated through interaction with actual developers, and we report the industrial adoption of parts of our results."
420,hal-01198384,2015,{Automatic Software Diversity in the Light of Test Suites},"Baudry, Benoit and Allier, Simon and Rodriguez-Cancio, Marcelino and Monperrus, Martin","A few works address the challenge of automating software diversification, and they all share one core idea: using automated test suites to drive diversification. However, there is a lack of solid understanding of how test suites, programs and transformations interact with one another in this process. We explore this intricate interplay in the context of a specific diversification technique called ""sosiefication"". Sosiefication generates sosie programs, i.e., variants of a program in which some statements are deleted, added or replaced but which still pass the test suite of the original program. Our investigation of the influence of test suites on sosiefication exploits the following observation: test suites cover the different regions of programs in very unequal ways. Hence, we hypothesize that sosie synthesis has different performance on a statement that is covered by one hundred test cases and on a statement that is covered by a single test case. We synthesize 24583 sosies on 6 popular open-source Java programs. Our results show that there are two dimensions for diversification. The first one lies in the specification: the more test cases cover a statement, the more difficult it is to synthesize sosies. Yet, to our surprise, we are also able to synthesize sosies on highly tested statements (up to 600 test cases), which indicates an intrinsic property of the programs we study. The second dimension is in the code: we manually explore dozens of sosies and characterize new types of forgiving code regions that are prone to diversification."
421,hal-01162219,2015,{DSpot: Test Amplification for Automatic Assessment of Computational Diversity},"Baudry, Benoit and Allier, Simon and Rodriguez-Cancio, Marcelino and Monperrus, Martin","In this work, we characterize a new form of software diversity: the existence of a set of variants that (i) all share the same API, (ii) all behave the same according to an input-output based specification and (iii) exhibit observable differences when they run outside the specified input space. We quantify computational diversity as the dissimilarity between execution traces on inputs that are outside the specified domain. Our technique relies on test amplification. We propose source code transformations on test cases to explore the input domain and systematically sense the observation domain. We run our experiments on 472 variants of 7 classes from open-source, large and thoroughly tested Java classes. Our test amplification multiplies by ten the number of input points in the test suite and is effective at detecting software diversity."
422,hal-01336940,2016,{Interoperability and Composition of DSLs with Melange},"Degueule, Thomas","Domain-Specific Languages (DSLs) are now developed for a wide variety of domains to address specific concerns in the development of complex systems. However, DSLs and their tooling still suffer from substantial development costs which hamper their successful adoption in the industry. For over a decade, researchers and practitioners have developed language workbenches with the promise to ease the development of DSLs. Despite many advances, there is still little support for advanced scenarios such as language evolution, composition, and interoperability. In this paper, we present a modular approach for assembling DSLs from other ones and seamlessly evolving them, while ensuring the reuse of associated tools through subsequent versions or across similar DSLs. We introduce the conceptual foundations of our approach, its implementation in the Melange language workbench, and summarize its benefits on various case studies."
423,hal-00994541,2014,{Leveraging Family Polymorphism in MDE},"Degueule, Thomas and Combemale, Benoit and Barais, Olivier and Blouin, Arnaud and Jézéquel, Jean-Marc","While Domain-Specific Modeling Languages (DSMLs) are increasingly used in industry, both their definition and tooling (e.g., checkers, document or code generators, model transformations) still require significant development efforts that must be balanced with their limited number of users (by definition). Unfortunately, the current Model-Driven Engineering (MDE) technologies strongly rely on the conformance relation that binds a model to the unique DSML used to create it. Consequently, while most of the tools would be reusable for a family of close DSMLs, in practice this is not possible. In this paper, we propose to abstract the overly restrictive conformance relation with a typing relation that allows a model to be manipulated through different DSMLs, and we uncover the need for model-oriented type systems. We introduce K3SLE, a new modeling framework built on top of the Eclipse Modeling Framework (EMF) that leverages family polymorphism to support the typing relation. Based on the typing relation, it natively provides model polymorphism, language inheritance and DSML evolution and interoperability. We demonstrate its use on representative use cases."
424,hal-01025283,2014,{The K3 Model-Based Language Workbench},"Degueule, Thomas and Barais, Olivier and Blouin, Arnaud and Combemale, Benoit","We introduce K3, a model-based language workbench that eases the engineering of domain-specific languages. K3 features state-of-the-art facilities that increase modularity and reusability of software language artifacts to decrease their development costs. Aspect-oriented and executable metamodeling are supported through the K3AL action language. K3SLE provides software language engineering facilities such as model polymorphism and language inheritance, supported by a model-oriented typing system. We present the main components of K3, their integration into Eclipse, and the main research questions they tackle. Finally, we present the plan of the tool demonstration."
425,hal-01138017,2015,{Towards Language Interfaces for DSLs Integration},"Degueule, Thomas","Developing software-intensive systems involves many stakeholders who bring their expertise on specific concerns of the developed system. Model-Driven Engineering (MDE) proposes to address each concern separately with a dedicated Domain-Specific (possibly modeling) Language (DSL) closely tied to the needs of each stakeholder [4]. With DSLs, models are expressed in terms of problem-level abstractions. Associated tools are then used to semi-automatically transform the models into concrete artifacts. However, the definition of a DSL and its tooling (e.g., checkers, editors, generators, model transformations) still requires significant development efforts for, by definition, a limited audience. DSLs evolve as the concepts in a domain and the expert understanding of the domain evolve. A simple example is the addition, refinement or removal of features from a DSL, with possibly the intent to ensure the compatibility between the subsequent versions. Additionally, the current practice in industry has led to widespread use of small independently developed DSLs, leading to challenges related to the sharing of languages and corresponding tools [6]. For example, the core concepts of an action language can be shared by all DSLs that encompass the expression of actions. Finally, while more and more DSLs are developed in various domains, recurrent paradigms are observed (e.g., state-transition, classifiers) with their own syntactic and semantic variation points reflecting the domain specificities (e.g., family of finite-state machines). Given the DSL development costs, redefining from scratch a new ecosystem of tools for each variant of a DSL is not scalable. Instead, one would like to leverage the commonalities of these languages to enable reuse of existing tools. An underlying challenge is the modular definition of languages, i.e., the possibility to define either self-contained or incomplete language components (in terms of syntax and semantics) that could be recomposed afterwards for the definition of new DSLs. To support modularity, DSL designers should be able to define proper provided and required interfaces for each language component, together with composition operators. To improve modularity and abstraction capabilities in software language engineering and support the aforementioned scenarios, we advocate the definition of explicit language interfaces on top of language implementations. Language interfaces make it possible to abstract some of the intrinsic complexity carried in the implementation of languages, by exposing meaningful information concerning an aspect of a language (e.g., syntactic constructs) and for a specific purpose (e.g., composition, reuse or coordination) in an appropriate formalism. In this regard, language interfaces can be thought of as a reasoning layer on top of language implementations. The definition of language interfaces relies on proper formalisms for expressing different kinds of interfaces and binding relations between language implementations and interfaces. Using language interfaces, one can vary or evolve the implementation of a language while preserving tools and analyses defined over its interface. Language interfaces also facilitate the modular definition of languages by enabling the description of required and provided interfaces of a language (or language component). Syntactic or semantic composition operators can then be defined upon these interfaces. 
Language interfaces may be crafted manually or automatically inferred from an implementation. Model types [5] are an illustration of this kind of interface. Model types are interfaces on the abstract syntax of a language (defined by a metamodel). Models are linked to model types by a typing relation. Most importantly, model types are linked one to another by subtyping relations, providing model polymorphism, i.e., the ability to manipulate a model through different interfaces. Model polymorphism enables the definition of generic tools that can be applied to any model matching the interface on which they are defined, regardless of the concrete implementation of their language. Model types can also be used to filter the information exposed from the abstract syntax of a language. In doing so, they can define language viewpoints by extracting the appropriate information on a system for one specific development task of a stakeholder. Model types are supported by a model-oriented type system that leverages family polymorphism [3] and structural typing to abstract the conformance relation standing between models and metamodels with a typing relation between models and model types. We incorporated these concepts into Melange [2], a new language for DSL designers and users. Melange is a language-based, model-oriented programming language in which DSL designers can manipulate language definitions with high-level operators (e.g., inheritance, composition, slicing) and express their relations through the definition of metamodels, language interfaces, and transformations. Melange provides DSL users with an action language where models are first-class typed citizens and embeds a model-oriented type system that natively provides model polymorphism through model typing. We applied Melange to two industrial use cases to maximize the reuse of DSL ecosystems: managing syntactic and semantic variation points in a family of FSM languages; providing an executable extension of Capella [1], a large-scale system engineering modeling language."
426,hal-04132604,2023,{Fingerprinting and Building Large Reproducible Datasets},"Lefeuvre, Romain and Galasso, Jessie and Combemale, Benoit and Sahraoui, Houari and Zacchiroli, Stefano","Obtaining a relevant dataset is central to conducting empirical studies in software engineering. However, in the context of mining software repositories, the lack of appropriate tooling for large-scale mining tasks hinders the creation of new datasets. Moreover, limitations related to data sources that change over time (e.g., code bases) and the lack of documentation of extraction processes make it difficult to reproduce datasets over time. This threatens the quality and reproducibility of empirical studies. In this paper, we propose a tool-supported approach facilitating the creation of large tailored datasets while ensuring their reproducibility. We leveraged all the sources feeding the Software Heritage append-only archive, which are accessible through a unified programming interface, to outline a reproducible and generic extraction process. We propose a way to define a unique fingerprint to characterize a dataset which, when provided to the extraction process, ensures that the same dataset will be extracted. We demonstrate the feasibility of our approach by implementing a prototype. We show how it can help reduce the limitations researchers face when creating or reproducing datasets."
427,hal-03793085,2022,{The Indiscernibility Methodology: quantifying information leakage from side-channels with no prior knowledge},"Marquer, Yoann and Zendra, Olivier and Heuser, Annelie","Cyber security threats are important and growing issues in computing systems nowadays. Among them are the side-channel attacks, made possible by information leaking from computing systems through nonfunctional properties like execution time, consumed energy, power profiles, etc. These attacks are especially difficult to protect from, since they rely on physical measurements not usually envisioned when designing the functional properties of a program. Furthermore, countermeasures are usually dedicated to protect a particular program against a particular attack, lacking universality. To help fight these threats, we propose in this paper the Indiscernibility Methodology, a novel methodology to quantify with no prior knowledge the information leaked from programs, thus providing the developer with valuable security metrics, derived either from topology or from information theory. Our original approach considers the code to be analyzed as a completely black box, only the public inputs and leakages being observed. It can be applied to various types of side-channel leakages: time, energy, power, EM, etc. In this paper, we first present our Indiscernibility Methodology, including channels of information and our threat model. We then detail the computation of our novel metrics, with strong formal foundations based both on topological security (with distances defined between secret-dependent observations) and on information theory (quantifying the remaining secret information after observation by the attacker). Then we demonstrate the applicability of our approach by providing experimental results for both time and power leakages, studying average-case, worst-case, and indiscernible information metrics."
428,hal-01634288,2017,{The Emerging Field of Test Amplification: A Survey},"Monperrus, Martin and Danglot, Benjamin and Vera-Perez, Oscar Luis and Yu, Zhongxing and Baudry, Benoit","Context: The increasing adoption of test-driven development results in software projects with strong test suites. These suites include a large number of test cases, in which developers embed knowledge about meaningful input data and expected properties in the form of oracles. Objective: This article surveys various works that aim at exploiting this knowledge in order to enhance these manually written tests with respect to an engineering goal (e.g., improve coverage of changes or increase the accuracy of fault localization). While these works rely on various techniques and address various goals, we believe they form an emerging and coherent field of research, which we call ""test amplification"". Method: We devised a first set of papers based on our knowledge of the literature (we have been working in software testing for years). Then, we systematically followed the citation graph. Results: This survey is the first that draws a comprehensive picture of the different engineering goals proposed in the literature for test amplification. In particular, we note that the goal of test amplification goes far beyond maximizing coverage only. Conclusion: We believe that this survey will help researchers and practitioners entering this new field to understand more quickly and more deeply the intuitions, concepts and techniques used for test amplification."
429,hal-04130361,2023,{PyroBuildS: Enabling Efficient Exploration of Linux Configuration Space with Incremental Build},"Randrianaina, Georges Aaron and Khelladi, Djamel Eddine and Zendra, Olivier and Acher, Mathieu","Software engineers are acutely aware that the build of software is an essential but resource-intensive step in any software development process. This is especially true when building large systems or highly configurable systems whose vast number of configuration options results in a space explosion in the number of versions that should ideally be built and evaluated. Linux is precisely one such large and highly configurable system with thousands of options that can be combined. A previous study showed the benefit of incremental build, however only on small-sized configurable software systems, unlike Linux. Although incremental compilation for post-commit is used in Linux, we show in this paper that the build of large numbers of random Linux configurations does not benefit from incremental build. Thus, we introduce and detail PyroBuildS, our new approach to efficiently explore, with incremental builds, the very large configuration space of Linux. Very much like fireworks, PyroBuildS starts from several base configurations (""rockets"") and generates mutated configurations (""sparks"") derived from each of the base ones. This enables exploring the configuration space with an efficient incremental build of the mutants, while keeping a good amount of diversity. We show on a total of 2520 builds that our PyroBuildS approach does trigger synergies with the caching capabilities of Make, hence significantly decreasing build time with gains up to 85%, while having a diversity of 33% of options and 15 out of 17 subsystems. Overall, individual contributors and continuous integration services can leverage PyroBuildS to efficiently augment their configuration builds, or reduce the cost of building numerous configurations."
430,hal-01401828,2016,{Approximate Loop Unrolling},"Rodriguez-Cancio, Marcelino and Combemale, Benoit and Baudry, Benoit","We introduce Approximate Unrolling, a loop optimization that reduces execution time and energy consumption, exploiting the existence of code regions that can endure some degree of approximation while still producing acceptable results. This work focuses on a specific kind of forgiving region: counted loops that map a given function over the elements of an array. Approximate Unrolling transforms loops in a similar way Loop Unrolling does. However, unlike its exact counterpart, our optimization does not unroll loops by adding exact copies of the loop's body. Instead, it adds interpolations. We describe our experimental implementation of Approximate Unrolling in the Server (C2) Compiler of the OpenJDK HotSpot JVM. The choice to implement our technique directly in the compiler reduced Phase Order problems and transformation overhead. It also proved that our technique could actually improve the performance of a production-ready compiler. Using our modified version of the compiler, we perform several experiments showing that Approximate Unrolling is able to reduce execution time and energy consumption of the generated code by a factor of 50% with minimal accuracy losses."
431,hal-01162220,2015,{Dynamic Analysis can be Improved with Automatic Test Suite Refactoring},"Xuan, Jifeng and Cornu, Benoit and Martinez, Matias and Baudry, Benoit and Seinturier, Lionel and Monperrus, Martin","Context: Developers design test suites to automatically verify that software meets its expected behaviors. Many dynamic analysis techniques rely on the exploitation of execution traces from test cases. However, in practice, there is only one trace that results from the execution of one manually-written test case. Objective: In this paper, we propose a new technique of test suite refactoring, called B-Refactoring. The idea behind B-Refactoring is to split a test case into small test fragments, which cover a simpler part of the control flow to provide better support for dynamic analysis. Method: For a given dynamic analysis technique, our test suite refactoring approach monitors the execution of test cases and identifies small test cases without loss of testing ability. We apply B-Refactoring to assist two existing analysis tasks: automatic repair of if-statement bugs and automatic analysis of exception contracts. Results: Experimental results show that test suite refactoring can effectively simplify the execution traces of the test suite. Three real-world bugs that could previously not be fixed with the original test suite are fixed after applying B-Refactoring; meanwhile, exception contracts are better verified by applying B-Refactoring to the original test suites. Conclusions: We conclude that applying B-Refactoring can effectively improve the purity of test cases. Existing dynamic analysis tasks can be enhanced by test suite refactoring."