UNIVERSIDAD DE COSTA RICA SISTEMA DE ESTUDIOS DE POSGRADO MINING SOFTWARE REPOSITORIES TO AUTOMATICALLY MEASURE DEVELOPER CODE CONTRIBUTIONS Tesis sometida a la consideración de la Comisión del Programa de Estudios de Posgrado en Computación e Informática para optar por el grado y título de Maestría Académica en Computación e Informática SIVANA HAMER Ciudad Universitaria Rodrigo Facio, Costa Rica 2023 Dedication To all who have helped me. Acknowledgements Throughout this thesis, I have received insurmountable support from many. Luck has been on my side, everything considered. I thank anyone who has helped me directly and indirectly throughout my journey. Though unusual, I would like to thank myself. Despite adversity, I have persevered with constant dedication. Like anyone, there is a lot that can still be improved on, yet I am happy with what I have achieved and excited for what is to come. Hopefully, I can learn to be more kind and patient with myself, which has been an insurmountable personal quest. I am immensely grateful to my father, mother, and sister, in no particular order, for their constant motivation and support. Without them, I would not be the person I am today or have the opportunity to do what I love. My dogs, Poker and Cherry, also deserve special thanks as they have emotionally supported me through my journey by being so cute, adorable, and lovable. I am also extremely grateful to Dr. Christian Quesada-López for all his invaluable advice, continuous guidance, and unyielding support. I am extremely thankful that he convinced me to consider research as a career that I adore. He has truly shaped my career by being such an excellent role model. Through thick and thin, he has been there for me and guided me throughout my process. I also especially thank Dr. Marcelo Jenkins for his professional support of my career. His invaluable experience has been immensely useful. I also thank Dr. Allan Berrocal and Dr.
Alexandra Martínez for their help during this thesis. I would also like to thank anyone who has helped me improve professionally as a researcher. I want to especially thank Dr. Bogdan Vasilescu and his lab, the Socio-Technical Research Using Data Excavation Lab (STRUDEL), for being so kind and motivational to me during my visit to Carnegie Mellon University (CMU). This thesis is part of the research project No. 834-C1-011 “Procedimiento automatizado de medición de contribuciones a partir de repositorios de proyectos de desarrollo de software” of the Universidad de Costa Rica (UCR). This work was supported by the Centro de Investigaciones en Tecnologías de la Información y Comunicación (CITIC), Sistema de Estudios de Posgrado (SEP), and Escuela de Ciencias de la Computación e Informática (ECCI). Table of Contents Dedication . . . . . . . . . . . . . . . . ii Acknowledgements . . . . . . . . . . . . . . . . iii Approval Sheet . . . . . . . . . . . . . . . . v Table of Contents . . . . . . . . . . . . . . . . vi Resumen . . . . . . . . . . . . . . . . ix Abstract . . . . . . . . . . . . . . . . x List of Tables . . . . . . . . . . . . . . . . xi List of Figures . . . . . . . . . . . . . . . . xii List of Acronyms . . . . . . . . . . . . . . . . xiv 1 Introduction 1 1.1 Objectives . . . . . . . . . . . . . . . . 4 1.2 Methodology . . . . . . . . . . . . . . . . 5 1.3 Contributions . . . . . . . . . . . . . . . . 8 1.4 Document structure . . . . . . . . . . . . . . . . 9 2 Background 11 2.1 Continuous software engineering . . . . . . . . . . . . . . . .
11 2.2 Software metrology and metrics . . . . . . . . . . . . . . . . 16 2.2.1 Measurement context model . . . . . . . . . . . . . . . . 16 2.2.2 Goal Question Metric . . . . . . . . . . . . . . . . 18 2.2.3 Classifying software measures . . . . . . . . . . . . . . . . 20 2.3 Mining software repositories . . . . . . . . . . . . . . . . 20 2.3.1 Metrics from repositories . . . . . . . . . . . . . . . . 23 2.3.2 Software traceability . . . . . . . . . . . . . . . . 23 2.4 Design science . . . . . . . . . . . . . . . . 25 2.5 Empirical software engineering . . . . . . . . . . . . . . . . 27 2.5.1 Systematic mapping studies . . . . . . . . . . . . . . . . 28 2.5.2 Case studies . . . . . . . . . . . . . . . . 29 2.5.3 Controlled experiments . . . . . . . . . . . . . . . . 31 2.5.4 Surveys . . . . . . . . . . . . . . . . 31 3 Characterizing developer contribution research in software engineering 34 3.1 Study design . . . . . . . . . . . . . . . . 35 3.2 Results and discussion . . . . . . . . . . . . . . . . 39 3.3 Summary . . . . . . . . . . . . . . . . 42 4 Developing the automated code contribution measurement procedure 43 4.1 Measurement procedure design . . . . . . . . . . . . . . . . 44 4.2 Tool implementation . . . . . . . . . . . . . . . . 47 4.3 Summary . . . . . . . . . . . . . . . . 52 5 Evaluating the effectiveness of the procedure 53 5.1 Characterizing the process with contributions . . . . . . . . . . . . . . . . 55 5.2 Recovering the code traces automatically . . . . . . . . . . . . . . . . 57 5.3 Measuring the contribution quality . . . . . . . . . . . . . . . . 59 5.4 Integrating the procedure in software engineering projects . . . . . . . . . . . . . . . . 60 5.5 Classifying the value of contributions . . .
. . . . . . . . . . . . . 62 5.6 Summary . . . . . . . . . . . . . . . . 64 6 Discussion and conclusions 65 6.1 Summary . . . . . . . . . . . . . . . . 65 6.2 Discussion and future work . . . . . . . . . . . . . . . . 68 Bibliography 76 Appendix A How have we researched developers’ contributions in software engineering? A systematic mapping study 97 Appendix B Measuring students’ contributions in software development projects using Git metrics 142 Appendix C Using git metrics to measure students’ and teams’ contributions in software development projects 153 Appendix D Automatically recovering students’ missing trace links between commits and user stories 183 Appendix E Measuring students’ source code quality in software development projects through commit-impact analysis 198 Appendix F Students projects’ source code changes impact on software quality through static analysis 209 Appendix G Students’ perceptions of integrating contribution measurement tools in software engineering projects 221 Appendix H Development perceptions and behaviors of continuously measuring software contributions 232 Appendix I Classifying the value of code contributions: An exploratory study 245 Resumen Las personas desarrolladoras contribuyen a los proyectos en una variedad de formas y actividades diferentes. La evaluación de las contribuciones puede ayudar a los procesos, productos, desarrolladores y proyectos de software en la educación, investigación, industria y proyectos de software abierto. Los procedimientos actuales típicamente extraen medidas de repositorios de software. Se necesitan procedimientos y herramientas de medición para capturar mejor la naturaleza compleja y multidimensional de las contribuciones objetivamente, ayudando en la adopción.
Por lo tanto, el objetivo de esta tesis es desarrollar un procedimiento automatizado para medir las contribuciones de código de las personas desarrolladoras mediante la minería de repositorios de software. Para lograr esto, seguimos las guías de la ciencia del diseño para desarrollar la herramienta de procedimiento de medición por medio de tres ciclos principales. Primero, se realizó un mapeo sistemático de 166 estudios para caracterizar cómo se han investigado las contribuciones de las personas desarrolladoras en la ingeniería de software. Segundo, se propuso e implementó un procedimiento automatizado de tres fases que extrae datos de repositorios que miden seis dimensiones de las contribuciones de las personas desarrolladoras. Finalmente, la efectividad del procedimiento fue evaluada en ocho estudios empíricos. Analizamos 13 proyectos distintos de ingeniería de software educativo, con un total de 246 estudiantes desarrolladores. A lo largo de nuestras evaluaciones empíricas, encontramos evidencia de la efectividad, aceptación, aplicabilidad y utilidad del enfoque. La investigación puede aprovechar el procedimiento automatizado y los conocimientos adquiridos para trabajos futuros. Abstract Developers contribute to projects in a variety of ways and through different activities. Assessment of contributions can help education, research, industry, and open-source software processes, products, developers, and projects. Current procedures typically mine software repositories for measures. Measurement procedures and tools are needed to more objectively capture the complex and multi-dimensional nature of contributions, aiding in their adoption. Therefore, the objective of this thesis is to develop an automated procedure to measure developer code contributions by mining software repositories. To achieve this, we followed design science guidelines to develop the measurement procedure and tool through the following three main cycles.
First, a systematic mapping study of 166 studies was conducted to characterize how developer contributions have been researched in software engineering. Second, an automated three-phase procedure was proposed and implemented that mines data from repositories measuring six dimensions of developer contributions. Finally, the procedure’s effectiveness was evaluated in eight empirical studies. We analyzed 13 distinct educational software engineering projects, totaling 246 student developers. Throughout our empirical evaluations, we found evidence of the effectiveness, acceptance, applicability, and utility of the approach. Research can take advantage of the automated procedure and insights gained in future work. Keywords: software contributions, automated measurement procedure, continuous software engineering, software measures, mining software repositories, software engineering education, empirical software engineering List of Tables 2.1 The seven wastes . . . . . . . . . . . . . . . . 14 3.1 Mapping study research questions with their motivation . . . . . . . . 36 3.2 The inclusion (I) and exclusion (E) criteria . . . . . . . . . . . . 37 3.3 Data extraction fields with their dimensions . . . . . . . . . . . . 38 3.4 Study quality assessment criteria . . . . . . . . . . . . . . . . 39 4.1 The procedure contribution dimensions with measures . . . . . . . . 46 4.2 The current measures visualized by the tool . . . . . . . . . . . . 52 5.1 Summary of the approaches of the evaluations . . . . . . . . . . . . 54 List of Figures 1.1 Research subject . . . . . . . . . . . . . . . . 5 1.2 Research design science framework . . . . . . . . . . . . . . . . 6 1.3 Design science process . . . . . . . . . . . . . . . . 7 1.4 Research methodology summary . . . . . . . . . . . . . . . . 8 1.5 Detailed contributions of the work . . . . . . . . . . . . . . . .
10 2.1 The “Stairway to Heaven” model . . . . . . . . . . . . . . . . 13 2.2 Continuous* model . . . . . . . . . . . . . . . . 15 2.3 Measurement context model . . . . . . . . . . . . . . . . 17 2.4 GQM+ strategies model . . . . . . . . . . . . . . . . 19 2.5 Mining software repository . . . . . . . . . . . . . . . . 23 2.6 Relationship between trace artifacts and trace links . . . . . . . . 24 2.7 Traceability process model . . . . . . . . . . . . . . . . 25 2.8 Summary of the design science approach . . . . . . . . . . . . 26 2.9 Systematic mapping process . . . . . . . . . . . . . . . . 28 2.10 Case study process . . . . . . . . . . . . . . . . 32 2.11 Experiment process . . . . . . . . . . . . . . . . 32 3.1 Mapping study process . . . . . . . . . . . . . . . . 35 4.1 Automated measurement procedure for developer contribution . . . . 45 4.2 Tool gathering code contributions with their relationship to user stories from software repositories . . . . . . . . . . . . . . . . 48 4.3 The tool’s main user interfaces . . . . . . . . . . . . . . . . 51 5.1 General characteristics of the empirical studies . . . . . . . . . . . . 55 5.2 Methodology of the process evaluation . . . . . . . . . . . . . . . . 56 5.3 Methodology of the traceability evaluation . . . . . . . . . . . . 58 5.4 Methodology of the quality evaluation . . . . . . . . . . . . . . . . 60 5.5 Methodology of the integration evaluation . . . . . . . . . . . . 61 5.6 Methodology of the value evaluation . . . . . . . . . . . . . . . .
63 List of Acronyms API: Application Programming Interface CSE: Continuous Software Engineering DC: Design Cycles EC: Empirical Cycles GQM: Goal Question Metric ITS: Issue Tracking Systems KQ: Knowledge Question MSR: Mining Software Repositories OSS: Open Source Software RQ: Research Question SE: Software Engineering SO: Specific Objective VCS: Version Control Systems VSBE: Value-Based Software Engineering Chapter 1 Introduction The contribution of developers to projects is a central notion of software engineering [1]. A contribution is defined as the act of giving or supplying something [2]. Therefore, software contributions are defined as any action on the software project performed by anyone involved in the software engineering process. While constructing software, developers participate in diverse technical and non-technical tasks; hence, contributions are also varied.
Different types of contribution have been investigated, including code development and review [1, 3–12], bug reporting and fixing [1, 10, 11, 13–15], communication through messages [1, 12], and models [16]. Contributions are assessed to help software engineering products, people, processes, and projects [1, 3, 10, 13, 14]. Assessment is defined here as the evaluation, calculation, or valuation of a contribution. Works commonly assess contributions by measuring data mined from software repositories [10], the artifacts produced and archived during the software development cycle [17]. Measurement in software engineering provides valuable information helping quantitative decision-making and aiding in understanding, controlling, and improving software products, processes, resources, and projects [18,19]. Assessing software contributions provides benefits for research, industry, open-source projects, and education: • In software engineering research, creating and understanding phenomena through constructs, “things” that are indirectly measured, is core to knowledge acquisition [20]. For example, size, coupling, and cohesion are software constructs [21]. Along the same line, contributions are another software engineering construct. For example, when we determine project health, we can utilize software contribution measures that serve as indicators representing the concept [22]. As such, improved operationalizations of our measured indicators can generate more empirical evidence of software engineering practices. Hence, we can better understand software engineering phenomena. • In software projects, there are rising demands for rapidness, frequency, fluidity, adaptability, and customer-centricity, with trends such as agile processes and continuous software engineering emerging with increased momentum and prominence [23–25]. Continuous improvement through measurement is a crucial orthogonal aspect for software projects [26].
As such, contribution assessments based on measurement can also be used to help monitor development, plan future projects, identify risks, recognize developers, improve behaviors, increase efficiency, and make informed business decisions [1,3,10,13,14]. Management can thus use contribution data to improve current projects through increased insight and future projects through improved planning. • For developers, ensuring their involvement in developing software is needed for the sustainability of projects. Specifically, the sustainability of open-source projects is vital as most participants are volunteers [27–29]. As such, papers have extensively studied which factors help or interfere with the health of projects [30–33]. Research has also focused on determining measures that detect the health of communities and ecosystems [34, 35]. Less work has focused on creating interventions that help community health, usually through tooling [36]. As recognition is a success and motivational factor for developers [37–39], adequately recognizing all developer contributions could help the health of software projects. At the same time, developers can benefit from tracking their progress and gaining additional information to improve their skills. • In software engineering education, there are gaps between industry needs for newcomer developers and what is taught in university classrooms [40, 41]. Automated approaches can help train future professionals by providing valuable feedback on their contribution to improve continuously. At the same time, contribution assessment approaches can be utilized by educators to aid in grade assignment [42, 43], which is still considered challenging for instructors [44]. Additionally, educators can use quantitative information to aid in teaching by determining learning challenges. This way, discussion opportunities are created and refined within the classroom to further comprehension of software practices.
Based on previous course data, improvement opportunities can also be found. Though the assessment of contributions is beneficial, there are still several challenges with the effectiveness and usability of assessment approaches. • Software contributions are diverse. Assessment approaches need to consider many aspects, including both technical and non-technical contributions [10, 45]. At the same time, research consensus on the definition of contributions has not yet been achieved [1, 10, 11]. Even recently, studies have mentioned that works do not explicitly detail the types of contributions considered [46]. This is further exacerbated by the difficulty of objectively, fairly, and accurately discerning individual contributions in team projects [43]. • When assessing contributions, an in-depth characterization is needed. Software is too complex to be quantified by a single metric [47]. Approaches not only need to quantify size, which may itself be difficult to measure, but must also consider other aspects such as quality [6, 48], complexity [49], or value [5]. Hence, multiple dimensions must be considered when assessing software contributions. Otherwise, assessment models may be unrepresentative, rendering certain contributions to projects invisible [50]. • Assessment approaches need to be easy to integrate for adopters. Software measures require a considerable upfront investment of time [51]. Software tools help developers in producing software [52]. Examples include integrated development environments that aid in the development of code [53, 54], version control systems that help manage software versions [55, 56], bots that automate software tasks [57–59], and artificial intelligence models that suggest code [60, 61]. Thus, specialized tools can help adopters benefit from objective contribution assessment insights while reducing adoption costs. Yet, few such tools have been implemented in research and integrated into projects.
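To make the idea of mining repositories for contribution measures concrete, the sketch below aggregates per-developer size measures from `git log --numstat` style output. This is an illustrative fragment, not the thesis tool: the `author:` line prefix comes from a custom `--pretty` format, and the measure names (`commits`, `added`, `deleted`) are assumptions for the example.

```python
from collections import defaultdict

def contribution_measures(log_text):
    """Aggregate per-author measures from `git log --numstat` style output.

    Expects each commit to start with an `author:<name>` line (as produced by
    `git log --pretty=format:'author:%an' --numstat`), followed by numstat
    rows of the form `<added>\t<deleted>\t<path>`.
    """
    measures = defaultdict(lambda: {"commits": 0, "added": 0, "deleted": 0})
    author = None
    for line in log_text.splitlines():
        if line.startswith("author:"):
            author = line[len("author:"):].strip()
            measures[author]["commits"] += 1
        elif line.strip() and author is not None:
            added, deleted, _path = line.split("\t")
            # Binary files appear as '-' in numstat; count them as zero lines.
            measures[author]["added"] += int(added) if added != "-" else 0
            measures[author]["deleted"] += int(deleted) if deleted != "-" else 0
    return dict(measures)

# Hardcoded sample output standing in for a real `git log` invocation.
sample = (
    "author:alice\n"
    "10\t2\tsrc/app.py\n"
    "3\t0\tREADME.md\n"
    "\n"
    "author:bob\n"
    "-\t-\tlogo.png\n"
    "5\t5\tsrc/app.py\n"
)
print(contribution_measures(sample))
```

In practice, such an aggregation would run over log text collected with `subprocess`, and capturing further contribution dimensions (e.g., quality, traceability, or value) requires additional analyses beyond line counts.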
Due to the previous challenges, automated measurement procedures of software contributions can be created. The procedure mines different measures from software repositories to account for and acknowledge the diverse and multi-dimensional nature of software contributions, thus aiding software engineering projects, people, products, and processes. 1.1 Objectives This thesis, therefore, aims to achieve the following: Main objective. Develop an automated procedure to measure developer code contributions by mining software repositories. Hence, the research question of the work is: How can developer code contributions be measured by mining software repositories? To achieve our research goal and question, three specific objectives were proposed. Specific objective 1. Characterize software engineering research of developer contributions. The motivation of this first specific objective (SO1) is to systematically aggregate software engineering research of developer contributions. We characterize the contribution types, research topics, research designs, measurements, assessment approaches, contexts, threats to validity, and challenges. Therefore, the findings can help empirically standardize terms, consolidate current findings, and determine gaps in developer contribution research. Specific objective 2. Design an automated procedure to measure developer contributions by mining software repositories. The second specific objective (SO2) develops an automated procedure that measures code contributions from software repositories. This includes the design of the procedure and the implementation of the tool. The measurement procedure characterizes six dimensions of developer contributions in a three-step process. Meanwhile, our implementation for code contributions has three main phases: setup, extraction, and usage of information. The creation of tools helps with technology transfer [62]. Specific objective 3.
Evaluate the effectiveness of the automated procedure to measure developer code contributions by mining software repositories. Finally, the third specific objective (SO3) evaluates the effectiveness of the measurement procedure. Hence, we determine whether the tool achieved the desired results and is perceived as relevant. To achieve this, we conducted five different types of empirical studies: characterizing the process, recovering traces, measuring quality, integrating the procedure, and classifying the value. Researching the measurement procedure helps formalize and generate more evidence of engineering practices. Insights from the evaluations provide improvement opportunities for contribution assessment approaches. 1.2 Methodology The methodology of this work followed the guidelines of design science in information systems and software engineering, studying artifacts within a context [63]. Fig. 1.1 shows our research subject and its interactions. The research artifact studied was the proposed automated procedure to measure developer code contributions by mining software repositories, while the context studied was software engineering projects developed by students. Validating a solution in academia can also serve as the first step for empirical technological transfer to industry [62]. Figure 1.1: Research subject The design science framework is shown in Fig. 1.2. The social context of this thesis includes the stakeholders, who may affect the project or be affected by it, and the sponsors. Therefore, the stakeholders are instructors and students. The design science context corresponds to the design problems and knowledge questions of the automated procedure to measure developer code contributions.
Finally, the knowledge context is the existing theories, specifications, and designs. In our case, these include the fields of continuous software engineering, software metrology, mining software repositories, empirical software engineering, and software contributions. The background of continuous software engineering, software metrology, mining software repositories, and empirical software engineering is explained in Chapter 2. Meanwhile, the research of developer contributions is characterized in Chapter 3. Figure 1.2: Research design science framework In design science, design cycles solve engineering design problems, while empirical cycles answer knowledge questions. The design cycles (DC) and empirical cycles (EC) of our work are shown in Fig. 1.3. 1. The main design cycle of the thesis, named DC1, focuses on the design of the automated procedure to measure developer code contributions artifact. To achieve this, first, the research problem was defined (T1). 2. Then, the empirical cycle, called EC1, characterizes the software engineering research of developer contributions through a mapping study. In this cycle, the goals for the mapping study were defined (T2), the protocol designed (T3) and validated (T4), the study conducted (T5), and the results analyzed (T6). This cycle was achieved by carrying out a systematic mapping study [64,65]. 3. After EC1, the main design cycle DC1 was continued.
The treatment was designed (T7), inspired by the measurement context model [66], validated (T8), and implemented (T9) using iterative development [23,24]. 4. Finally, the automated procedure developed in DC1 was evaluated in the empirical cycle, called EC2, through empirical studies [62]. In this cycle, for each study the goals were defined (T10), the protocol designed (T11) and validated (T12), the study executed (T13), and the results analyzed (T14). 5. The development of DC1 and the evaluation of EC2 were repeated in multiple iterations. Figure 1.3: Design science process Technical research questions (RQ) for the design cycles and knowledge questions (KQ) for the empirical cycles were also defined. RQs focus on creating artifacts to achieve a goal. Meanwhile, KQs gather knowledge about the world. Based on our objectives, the KQ and RQ are: KQ1: How are developer contributions researched in software engineering?
RQ2: How to design an automated procedure to measure developer contributions by mining software repositories?

KQ3: What is the effectiveness of the automated procedure to measure developer code contributions by mining software repositories?

KQ1 contributes to SO1, RQ2 satisfies SO2, and KQ3 responds to SO3. The relationships between the thesis objectives, cycles, questions, methods, and products are shown in Fig. 1.4. The detailed methodology of each cycle is covered in the respective chapters of this work. Their location is provided in Section 1.4.

[Figure 1.4: Research methodology summary — columns: Objectives, Cycle, Questions, Methods, Products. SO1 (characterize software engineering research of developer contributions) is addressed by EC1 and KQ1 through a mapping study (Kitchenham & Charters 2007; Petersen 2008), producing a mapping study characterizing software engineering research of developer contributions. SO2 (design an automated procedure to measure developer contributions by mining software repositories) is addressed by DC1 and RQ2 through the measurement context model and iterative development (Abran 2010; Sommerville 2016; Pressman 2010), producing the design and implementation of the automated procedure to measure developer contributions. SO3 (evaluate the effectiveness of the automated procedure to measure developer code contributions by mining software repositories) is addressed by EC2 and KQ3 through empirical studies (Wohlin et al. 2012), producing empirical evaluations of the automated procedure that measures developer code contributions.]

1.3 Contributions

This thesis, therefore, makes the following contributions.

• We characterized software engineering studies of developer contributions, summarizing the works to standardize terms, consolidate findings, and determine research gaps.

• We developed an automated procedure that measures multiple dimensions of developer contributions by mining software repositories, implementing the measurement tool for code contributions, aiding software engineering researchers and adopters of continuous measurement.

• We evaluated the automated measurement procedure in empirical studies, determining the effectiveness, acceptance, applicability, and utility of the approach.

A representation of the contributions with their relationship to the research questions is shown in Fig. 1.5.

1.4 Document structure

The structure of this research thesis is as follows:

• Chapter 2 describes the background of continuous software engineering, metrology and metrics, mining software repositories, design science, and empirical methodologies.

• Chapter 3 presents the systematic mapping study that systematically characterizes software engineering research of developer contributions.

• Chapter 4 explains the developed automated measurement procedure and tool that automatically measures developer code contributions.

• Chapter 5 presents the empirical evaluation of the effectiveness of the procedure and tool, focused on five topics: process characterization, traceability recovery, quality measurement, perception integration, and value classification.

• Chapter 6 summarizes the main findings, contributions, and future work of the thesis.

[Figure 1.5: Detailed contributions of the work — the goal (develop an automated procedure to measure developer code contributions by mining software repositories) is addressed through KQ1, RQ2, and KQ3, yielding general contributions GC1 (characterization of software engineering studies of developer contributions; products: types of contributions, methodological approaches, measures, topics, contexts, challenges, threats to validity), GC2 (automated procedure measuring multiple dimensions of developer contributions by mining software repositories, with a measurement tool for code contributions; products: GQM model, measurement procedure, tool), and GC3 (empirical evaluations determining effectiveness, acceptance, applicability, and utility; products: process, traceability, product, integrations, value).]

Chapter 2

Background

This chapter presents the background. Section 2.1 describes the field of continuous software engineering, detailing trends that motivate and situate our work within the field. Section 2.2 provides foundations of software metrology and metrics, used for the design of the automated measurement procedure. Section 2.3 details the mining software repositories field and its general processes, utilized for the extraction of data. Section 2.4 explains design science, the methodology of this thesis. Finally, Section 2.5 explains the empirical software engineering methodologies utilized throughout the empirical studies.
2.1 Continuous software engineering

Software engineering is founded on software processes: activities, actions, and tasks leading to the development of software. Though the applicable activities are adaptable for each project, a set of activities is always included: communication, planning, modeling, construction, and deployment [23]. High-level, abstract descriptions of software processes are represented in software process models [24].

Software engineering processes are constantly evolving to respond to the industry's needs of dealing with market changes and providing more accurate customer solutions [26]. These business demands have led to a paradigm shift in software development, learning from the customer's software usage after delivery and deployment [67]. The software process evolution from traditional development towards continuous development is represented in the “Stairway to Heaven” model, shown in Fig. 2.1. The model's five steps are described below [67,68].

Traditional development. Companies start with traditional development. Traditional software models are based on the waterfall model, the first published software process model. The model presents sequential, separate steps of requirement generation, analysis, design, coding, testing, and operations. It is an example of a plan-driven model, where planning and scheduling are done before starting implementation. However, this model has several limitations in accommodating change, managing uncertainty, and providing fast, workable software versions [23,24,67,69].

R&D Organization All Agile. To tackle many of the challenges of traditional processes, companies adopt agile practices [68]. These software processes interleave activities in feedback cycles to produce software incrementally and rapidly for customers. Adopting agile practices improves the software's fluidity, adaptability, and rapidness [23,24,70].
Different agile software development processes have been proposed, such as Extreme Programming (XP) [71] and Scrum [72]. Nonetheless, project management and system verification still follow traditional development approaches [68].

Continuous integration. As agile benefits materialize, verification gets involved in continuous practices [68]. Continuous integration is a development practice in which developers frequently integrate their work. Each integration is automatically built and verified with tests [73]. This leads to development teams producing software faster and reducing bugs [74].

Continuous deployment. Once continuous integration is incorporated into the process, project management gets involved in the agile development cycle and requests faster development cycles. Continuous deployment is adopted, constantly and automatically pushing out code changes to production. This allows for continuous customer feedback and eliminates the waste of products that are not valuable to the customer [67,68,74].

R&D as an Innovation System. Finally, deployed functionality is treated as an experiment. Customer feedback is analyzed, with techniques such as A/B testing, and used to determine customer needs. The delivered functionality is treated as a starting point that is further tuned [67,68].

[Figure 2.1: The “Stairway to Heaven” model [68] — five steps: Traditional Development, R&D Organization All Agile, Continuous Integration, Continuous Deployment, R&D as an Innovation System.]

Continuous software engineering (CSE) is a process in which software is developed continuously and incrementally, allowing for continuous learning and improvement [75]. Lean software engineering is a relevant and useful lens to understand CSE [25]. Lean software development (a.k.a. lean software engineering) seeks to create value for the customer as rapidly as possible [76]. Value can mean many different things and even has its own research field, denominated Value-Based Software Engineering (VBSE) [77].
VBSE is concerned with incorporating value into development projects to create products that are more useful for stakeholders [78]. In agile, value is providing the customer what they want and require [77]. In lean software engineering, value is providing customers with their unmet needs to delight them, and it is achieved only when the customer receives the product [76,79]. Finally, in VBSE, value is “relative utility, worth, or importance” [78].

Lean is characterized by its principles: broadly applicable ideas and insights. These principles are described below [76].

Eliminate waste. Waste is anything that does not add value to the customer. Development should spend time only on what adds customer value. Eliminating waste requires iteratively learning to see the waste, uncovering its source, and eliminating it. Seven types of waste are defined in manufacturing and can be translated to software development, as shown in Table 2.1. For example, one waste in software engineering is motion: the motion within the process when focus needs to be re-established and artifacts are reassigned.

Table 2.1: The seven wastes [76]

  The seven wastes in manufacturing | The seven wastes of software development
  Inventory                         | Partially done work
  Extra processing                  | Extra process
  Overproduction                    | Extra features
  Transportation                    | Task switching
  Waiting                           | Waiting
  Motion                            | Motion
  Defects                           | Defects

Amplify learning. Compared to manufacturing, software development is a creative process where learning is expected and should be amplified. While developing, perfecting the solution requires trial and error. This leads to a process where quality is focused on solving a problem in an easy-to-use and cost-effective manner; variability is expected, as different solutions are provided to each unique customer; and iterations are desirable, as they are the most effective way to generate knowledge in ill-defined problems.

Decide as late as possible. Options should be kept open for as long as possible.
This provides insurance against uncertainty: decisions are made when more knowledge is available and outcomes are easier to predict. This leads to the need to build the capacity for change into the system.

Deliver as fast as possible. Rapid delivery provides customers with what they need now, providing a competitive advantage. Speed makes it possible to delay decisions until more is known, obtain reliable feedback, and increase learning.

Empower the team. Frontline workers equipped with expertise and guided by a leader are better positioned than anyone to make technical and process decisions. As decisions are made late and delivery is rapid, management by a central authority is not possible. Therefore, workers have decision-making responsibility and design authority for satisfying customer needs.

Build integrity in. Systems have to focus, from day one, on fulfilling customers' needs, being cohesive as a whole, and maintaining usefulness over time. To achieve this, excellent information flows between customers and developers must be adopted.

See the whole. Improving the software process requires considering the whole process from end to end. Optimizing only parts of the process makes sub-optimization likely to occur.

The Continuous* model, shown in Fig. 2.2, displays the end-to-end, holistic process of continuous software engineering [25]. The process is divided into three main activities: business, development, and operations. The business focuses on adaptable planning and budgeting based on the business environment. Development concentrates on frequent and rapid aspects of software creation and management, such as integration, delivery, and verification. Finally, operations monitor system use to detect problems as early as possible. The foundation of these activities is continuous improvement, and continuous innovation and experimentation. They focus on small and big changes, respectively, to improve processes based on data-driven decision-making.
[Figure 2.2: Continuous* model [25] — Business strategy (continuous planning, continuous budgeting), Development (continuous integration, continuous deployment, continuous delivery, continuous verification/testing, continuous compliance, continuous evolution), and Operations (continuous use, continuous trust, continuous runtime monitoring), connected through BizDev and DevOps and founded on continuous improving and continuous experimentation and innovation.]

A critical, orthogonal aspect of the evolution of software processes is organizational performance metrics [26]. In this work, we therefore define continuous evaluation or measurement as continuously collecting software metrics. We consider continuous evaluation as part of the continuous improvement, and continuous innovation and experimentation, activities.

2.2 Software metrology and metrics

Metrology is the science of measurement, including its theoretical and practical aspects [80]. As software engineering is the “application of a systematic, disciplined, quantifiable approach to the development, operation, and maintenance of software; that is, the application of engineering to software...” [81], measurement is fundamental to the discipline [66]. Furthermore, measurement is part of the foundational engineering topics in the Software Engineering Body of Knowledge (SWEBOK); thus it is part of the generally accepted knowledge in software engineering [82].

There are three vital terms in metrology: measurement, measure, and metric.

Measurement. The process where numbers or symbols are assigned to attributes of real-world entities based on clearly defined rules [19].

Measure. The number or symbol assigned to the entity to characterize an attribute [19].

Metric. A quantitative measure relating to the degree to which an object possesses a given attribute [81].

Software measures exist for different aspects such as code [83] and quality [84–86]. Software measures are also commonly utilized to assess software engineering contributions [10].
Notably, many works count commits as contributions [50]. To gather measurement results, a process must be defined. This process is denominated the measurement context model and is described in Sub-section 2.2.1. An approach to selecting the measures must also be defined; specifically, in this project, the goal-oriented approach Goal Question Metric is used, described in Sub-section 2.2.2. Finally, different types of software measures exist; a classification and description of specific software measures is presented in Sub-section 2.2.3.

2.2.1 Measurement context model

The process to gather and exploit measurement results is defined by measurement context models. They comprise three steps, shown in Fig. 2.3: design of the measurement method, application of the measurement method, and exploitation of the measurement results.

[Figure 2.3: Measurement context model [66] — Step 1, design of the measurement method: determination of the objectives; characterization of the concept to be measured (entity and attribute); design or selection of the meta-model (measurable construct: relationships across entity and attribute); definition of the numerical assignment rules. Step 2, application of the measurement method: software documentation gathering, construction of the software model, application of the assignment rules, measurement results, audit. Step 3, exploitation of the measurement results (examples): quality model, budgeting model, productivity model, estimation model, estimation process.]

A measurement method is a general sequence of logical operations used to obtain a measurement. Meanwhile, a measurement procedure is a specific set of operations to obtain particular measurements according to a given method [66]. Each step of the measurement context model is detailed below.

Design the measurement method. A measurement method must be designed or selected from previous methods.
To design the measurement method, four sub-steps must be followed. First, the measurement objectives are defined based on what we want to measure, the measurement point of view, and the intended users. Then, the measured concept is characterized by defining the measured entity, the measurable attributes, and the detailed empirical definition of the attributes. Concurrently, the representation of the entities and attributes, with their relationships, is abstractly described in a meta-model. The meta-model must also describe how to recognize the measured attributes. Finally, the numeric assignment rules are defined with their measurement unit.

Apply the measurement method. In this step, the measurement method is applied to a specific context (i.e., as a measurement procedure). To apply the measurement procedure, five sub-steps are carried out. Firstly, information about the software is gathered. Secondly, the software model is built according to the proposed meta-model. Thirdly, the numeric assignment rules are applied. Fourthly, the measurement results are documented with details such as the measurement unit, measurement process, and measurers. Finally, the results are verified and audited to ascertain their correctness.

Exploitation of the results. Measurement results are exploited for both quantitative and qualitative analysis using models, such as evaluation, budgeting, and estimation models.

2.2.2 Goal Question Metric

An approach to determine what to measure is Goal Question Metric (GQM) [19]. It is a goal-oriented method that assumes that purposeful measures for organizations must be defined from project and organizational goals, that the collected data must be traceable to those goals, and that a framework must be provided to interpret the data [62,87]. The result of the approach is the specification of a measurement system targeting specific issues, and rules to interpret the measurement data.
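Such a GQM specification can be illustrated with a small, hypothetical example. The goal, questions, and metric names below are invented for illustration only and are not the GQM model of this thesis:

```python
# A minimal sketch of a GQM specification: one measurement goal,
# refined into questions, each answered by one or more metrics.
# All names here are hypothetical examples.
gqm_model = {
    "goal": "Characterize developer code contributions "
            "from the project manager's point of view",
    "questions": {
        "How much code does each developer contribute?": [
            "commits per developer",
            "lines of code added per developer",
        ],
        "How is contribution distributed across the team?": [
            "share of total commits per developer",
        ],
    },
}

# Every metric stays traceable to the goal through its question,
# which is the property GQM relies on when interpreting data.
for question, metrics in gqm_model["questions"].items():
    for metric in metrics:
        print(f"{metric!r} answers {question!r}")
```

The nesting makes the goal-to-metric traceability explicit: interpreting any metric value requires walking back up through its question to the goal.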
The methodology has three main components: goals, questions, and metrics [62,87].

Conceptual level (Goal). Goals are defined for objects, for multiple purposes, from distinct focuses, from various points of view, relative to the environment [88].

Operational level (Question). Questions characterize objects to assess a goal based on characterization models [62,87]. Questions bridge the subjectivity of goals with quantitative measurements [87].

Quantitative level (Metric). Objective or subjective data is associated with questions to answer them quantitatively [62,87].

GQM+ Strategies extends the GQM methodology, adding the capability of aligning the measurement program with business goals and strategies, software goals, and measurement goals [88]. The GQM+ Strategies approach is composed of the following elements [88,89]:

Business goals. Organizational goals to achieve strategic objectives. Goals comprise the activity performed to achieve the goal, its focus, the object under consideration, the quantified magnitude, the timeframe in which it must be achieved, the scope, constraints, and relationships with other goals.

[Figure 2.4: GQM+ Strategies model [89] — a Goal+Strategies element (goal, strategy, context/assumption) is realized by a set of lower-level elements and made measurable through a GQM graph (a GQM goal made measurable through questions and metrics, with an interpretation model), which measures the achievement of the goal.]

Context factors. Organizational environment variables that affect the used models and data.

Assumptions. Estimated unknowns that may affect data interpretation.

Strategies. Possible approaches to achieve goals, refined by activities.

Lower-level goals. Set of lower-level goals inherited from the strategy of upper-level goals.

Interpretation models. Models that help interpret data to determine if goals have been achieved.

Goal+Strategies element. A goal with its strategies, activities, and assumptions.

GQM graph.
A GQM goal with the corresponding questions, metrics, and interpretation models. It is associated with a Goal+Strategies element.

The model of GQM+ Strategies is shown in Fig. 2.4. There are multiple goal levels, allowing for strategies at each level. Goals may be realized through strategies that may also define other goals. Context information influences the definition of goals and strategies. At every level, GQM plans are defined by measurement goals, questions, metrics, and interpretation models to measure the achievement of the respective goal and strategy [88,89].

2.2.3 Classifying software measures

All measurement methods must define the software entities and attributes they describe. As such, measures can be classified according to software entities into process, product, and resource measures. Furthermore, within each entity, attributes can be further categorized as internal or external. Internal attributes can be measured by examining the entity itself, while external attributes are measured in terms of the behavior of the entity in its environment. Below, software process, product, and resource measures are detailed, respectively [19].

Process. These measures are related to software activities and are usually associated with time. Process attributes include cost, controllability, observability, and stability. Only a limited number of internal aspects can be measured, such as the duration, effort, or number of incidents of a specified type. Popular process metrics are the number of changes, time, and effort.

Product. These measures focus on analyzing the attributes of software artifacts. Some attributes measured include size, integrity, usability, portability, testability, complexity, and maintainability. Common product metrics include Lines of Code (LOC), Function Points (FP) [90], Cyclomatic Complexity [91], defects, and the number of features.

Resources.
These measures examine entities required in the software process, such as personnel, materials, tools, and methods. They can help determine the magnitude, cost, and quality of resources. Some resource measures are the number of developers, the hardware requirements, and the cost of personnel.

2.3 Mining software repositories

Software configuration items started out being physically stored as paper documents, so managing the information was difficult, error-prone, time-consuming, and complex. Consequently, repositories were used as centers for the accumulation of knowledge. Repositories started out as people, which was challenging; nowadays, databases are used as software repositories [23].

Software repositories are artifacts produced and archived while developing software [17]. Mining software repositories (MSR) is the field that analyzes and cross-links interesting and actionable information from software repositories about software products and projects [92]. The field was consolidated in 2004 with the organization of the workshop on Mining Software Repositories at the International Conference on Software Engineering (ICSE) [93]. Software repositories can be classified as: historical repositories, recording the evolution of software artifacts; run-time repositories, recording the execution and usage of applications; and code repositories, containing the source code of developed applications [92]. Examples of historical repositories include source control repositories (i.e., source code management systems), bug repositories (i.e., issue tracking systems), and archived project communications [92,94].

Source code management (SCM). These repositories record and maintain changes to source code artifacts [94]. They are also known as version control systems (VCS), revision control systems (RCS), software configuration management, or source code control [95]. Examples of this type of repository are Git [56], Subversion [96], and CVS [97].
Issue tracking systems (ITS). These repositories track bugs, features, and inquiries from their creation to their final state. Bugs represent defects, features are new functionality or enhancements, and inquiries are questions or technical support for customers [98]. They are also known as requirement tracking systems and bug tracking systems (BTS) [17,98]. Examples include Bugzilla (https://www.bugzilla.org/) and Jira (https://www.atlassian.com/software/jira).

Archived project communications (APC). These repositories track discussions about any aspect of software development [92]. Examples include mailing lists, bulletin boards, question and answer forums, and microblogs storing discussions [99–101].

There is a wide spectrum of techniques and purposes used in MSR [17].

i. Metadata analysis gathers the metadata stored in software repositories using methodologies such as regular expressions, heuristics, and common-sequence matching.

ii. Static source code analysis extracts facts from versions of a software system using techniques for parsing, processing, and extracting facts from source code.

iii. Source code differencing and analysis focuses on the changes between versions of source code, using methods to express both syntactic and semantic changes.

iv. Software metrics quantify aspects of software products, projects, and processes, such as size, effort, cost, functionality, quality, and complexity.

v. Visualizations use information-visualization techniques to represent data, amplifying cognition.

vi. Clone-detection methods find source code with similar textual, structural, and semantic compositions, applying text-based and token-based techniques or code abstractions like abstract syntax trees and program dependency graphs.

vii. Frequent-pattern mining discovers trends, patterns, and rules utilizing metadata, source code data, and difference data with techniques such as itemset and sequential-pattern mining.

viii.
Information retrieval is a methodology used to classify and cluster textual units by various similarity concepts based on metadata.

ix. Classification with supervised learning is based on machine learning techniques that use metadata and historical data to acquire intricate knowledge to improve tasks.

x. Social network analysis considers techniques to derive and measure invisible relationships between social entities.

A typical MSR process has the following four steps, shown in Fig. 2.5: data extraction, data modeling, synthesis, and analysis [100]. First, the raw data is extracted from the software repositories. Then, the data may be preprocessed before it is used. Next, the data can be synthesized by applying data mining or learning techniques. Finally, the data is analyzed and interpreted.

[Figure 2.5: Mining software repository process [100] — data extraction, data modeling, synthesis, analysis.]

Metrics, described in Section 2.2, are one of the many types of data that can be extracted. Further details of some metrics mined from repositories are presented in Sub-section 2.3.1. Furthermore, to model the information mined from repositories, data must be linked; a full description of traceability is given in Sub-section 2.3.2.

2.3.1 Metrics from repositories

The metrics mined from software repositories are very diverse. Studies focus on quality, developers, activities, and other categories. Quality metrics focus on topics such as defects, vulnerabilities, bugs, evolution, anti-patterns, and merges [48,102–107]. Some measures used in this category are bounties [105], code smells [48], and bug density [105]. Developer-focused studies have examined contributions, activity, and productivity [1,108,109]. Developer studies have used metrics such as code owned [109], inequality indexes [108], and developer contribution [1].
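As a minimal sketch of how one such developer metric could be mined, the example below counts commits per author from version control log data. The log snippet is fabricated for illustration; in practice it might be produced by a command such as `git log --pretty=format:%ae`, which prints one author email per commit:

```python
from collections import Counter

# Fabricated stand-in for the output of `git log --pretty=format:%ae`:
# one author email per line, one line per commit.
log_output = """\
alice@example.com
bob@example.com
alice@example.com
alice@example.com
carol@example.com"""

# Count commits per developer: a simple contribution measure.
commits_per_developer = Counter(log_output.splitlines())

for author, count in commits_per_developer.most_common():
    print(author, count)
```

Real studies refine such raw counts, for example by merging author aliases and filtering bot accounts, before treating them as contribution measures.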
Some other metric topics include energy consumption [110], semantic similarities [111], and the number of daily stars [112].

Furthermore, these metrics are extracted from a variety of software repository types. Version control systems, such as Git, CVS, and Subversion, are widely used software repositories [1,48,102,104–106,108–113]. Issue tracking systems are also used, including Jira, Bugzilla, and Google Code [1,102–105,108]. Finally, other repositories have also been used to extract measures, including release blogs, vulnerability databases, mailing lists, and wikis [1,105,111].

2.3.2 Software traceability

Even though it is typical for MSR studies to focus on only one repository, using and linking data between repositories can improve the quality of the data and provide a more complete view to practitioners [92]. The activity of establishing links between and within software artifacts is called traceability [114]. Traceability can focus on tracing requirement artifacts (i.e., requirement traceability), software engineering artifacts (i.e., software traceability), and software artifacts with system-level components (i.e., system traceability) [115].

[Figure 2.6: Relationship between trace artifacts and trace links [116] — a trace link connects a source artifact to a target artifact, with a primary trace link direction and a reverse trace link direction.]

The building blocks of traceability are the trace artifacts and the trace links. Trace artifacts are the artifacts being traced in a project and can be classified as either source artifacts or target artifacts. Source artifacts are the origin of the trace, while target artifacts are the destination of the trace. Trace links are directional associations between a pair of artifacts. The direction of the association is called the primary trace link direction, and the opposite direction is denominated the reverse trace link direction. As trace links can be associated in both directions, they are bidirectional.
The relationship between the trace artifacts and the trace links is shown in Fig. 2.6. Based on these definitions, a trace can either mean the triplet of source artifact, trace link, and target artifact, or the act of following a trace link. Furthermore, traces can be atomic or chained. Atomic traces have only one source artifact and one target artifact, while a chained trace is a chain of linked traces. Traceability is the potential of establishing and using traces [116].

The traceability process model, shown in Fig. 2.7, abstractly defines the activities to establish and use traceability. These activities are traceability strategy, creation, use, and maintenance. First, the traceability strategy is planned and managed: stakeholder and system traceability requirements are determined, designed, and implemented. Then, traceability links between artifacts are created either manually, automatically, or semi-automatically. The links can be created in real time (trace capture) or later (trace recovery). Updating and creating traceability links to keep the traceability information current is performed in the traceability maintenance step. The maintenance can be either continuous, immediately after changes to the artifacts, or on-demand, when it is requested. Finally, these traceability links are used both short-term and long-term. Some uses are requirement validation, impact analysis, verification, validation, and change management [116,117].

[Figure 2.7: Traceability process model [117] — planning and managing the traceability strategy leads to trace creation, maintenance, and use, with feedback loops between the activities, from the point traceability is required until the project is archived and traces are retired.]

The activities required to create or use traces are called tracing. Tracing can be: manual, established by a human; automatic, established using tools or techniques; or semi-automatic, established using both manual and automatic tracing [116]. Manual trace links are frequently incomplete, inaccurate, and untrustworthy [118–120]. Even with the ubiquity of software repositories, developers often forget or fail to link the artifacts [118]. Hence, automatic approaches to creating and maintaining software links have been proposed and developed, including heuristics, information retrieval, machine learning, and artificial intelligence [121].

2.4 Design science

Design science is a methodology that studies artifacts within a context. The artifacts interact with the context to improve something in the context. The full approach of design science is shown in Fig. 2.8 [63].

There are two main parts of design science: design and investigation. Design problems require a change in the real world, so one of many possible solutions is designed to achieve a goal. Such problems can also be represented as technical research problems or technical research questions. Knowledge questions ask about the world as it is, assuming there is only one answer to the question. Furthermore, the problem context of an artifact can be extended to contain the social and knowledge context of the artifact.

[Figure 2.8: Summary of the design science approach [63] — Part I, the framework for design science, distinguishes design problems (improve a problem context by (re)designing an artifact that satisfies some requirements, in order to help stakeholders achieve some goals) from knowledge questions (descriptive questions: what, when, where, who, how many, how often; explanatory questions: why, causes, mechanisms, reasons). Part II, the design cycle, comprises problem investigation, treatment design, and treatment validation. Part III covers theories: conceptual frameworks and theoretical generalizations, which improve our capability to describe, explain, predict, and design. Part IV, the empirical cycle, comprises problem analysis, research setup design, inference design (descriptive, statistical, abductive, and analogical inferences), validation of inferences against the research setup, research execution, and data analysis.]

The design science problem iterates over two problem-solving cycles: the design cycle and the empirical cycle. Design problems are treated by the design cycle, and knowledge questions can be answered in the empirical cycle. The design cycle iterates over problem investigation, treatment design, and treatment validation and is part of the engineering cycle, but it is restricted to the first three phases of the engineering cycle. The design cycle's outcome is a validated artifact design, not an implementation. The empirical cycle analyzes the problem, designs the research and inference, validates them, executes the research, and analyzes the data. This thesis utilizes different empirical software engineering methodologies, explained in Section 2.5.
2.5 Empirical software engineering

As software engineering is a human-centered discipline, empiricism allows us to gather evidence of socio-technical phenomena by observing real-world projects [62,122,123]. There exists a wide variety of research methods, such as laboratory experiments, surveys, and field studies, with distinct research objectives and focuses [124,125]. Empirical standards for the software engineering field have also been developed to improve the transparency and quality of peer-reviewed studies [126].

Research can be classified depending on the source of information from which the data was generated. Primary studies gather information from primary data sources [62], that is, by observing the studied phenomena directly; examples of such approaches include experiments and case studies. Meanwhile, when the source of information is other research works, the study is considered a secondary study [64]; examples include systematic literature reviews and mapping studies. Finally, if the source of information is other secondary studies, the work is considered a tertiary study [127]. In software engineering, tertiary studies tend to follow the same guidelines as secondary studies. Still, such approaches face their own threats to validity, such as double counting [128].

Studies can also gather quantitative or qualitative data [62]. Quantitative data is a numerical representation of phenomena, for example, counting the number of commits created by developers; this information is thus usually analyzed with statistical techniques. Meanwhile, qualitative data is non-numerical information; for example, in a survey, we can gather the perceptions of developers utilizing a tool. Such information is analyzed with qualitative research synthesis approaches such as narrative synthesis, thematic synthesis, and grounded theory [129].
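The quantitative example above, counting commits per developer, can be sketched as follows; the log text is a hypothetical stand-in for the output of `git log --format=%an` on a real repository:

```python
from collections import Counter

# Hypothetical output of `git log --format=%an`: one author name per commit.
log_output = """\
Alice
Bob
Alice
Alice
Carol
Bob
"""

# Count commits per author across the whole history.
commits_per_author = Counter(log_output.splitlines())
print(commits_per_author.most_common())
# [('Alice', 3), ('Bob', 2), ('Carol', 1)]
```

In practice, author names would need identity merging (the same developer may commit under several names or emails), one of the data-accuracy threats discussed later in Chapter 3.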
This research uses two principal empirical software engineering methodologies for the empirical cycles. First, we utilize mapping studies, a type of secondary research, to gather information about the research area (Section 2.5.1) in Chapter 3. Furthermore, case studies (Section 2.5.2) and controlled experiments (Section 2.5.3) are utilized in our evaluations of Chapter 5. Surveys are also used to gather data for our evaluations (Section 2.5.4).

2.5.1 Systematic mapping studies

Systematic mapping is a type of secondary study (i.e., a study that analyzes previous research) that provides an overview of a broad topic area, identifying and classifying all the research in that area [130], thus focusing on answering more general questions [131]. The main goal of mapping studies is to provide an overview of the research area by identifying the quantity and type of research in a field. The mapping study process, shown in Fig. 2.9, has five steps, each with an outcome [65].

Figure 2.9: Systematic mapping process [65]

Definition of research questions. First, the research questions are defined, establishing the research scope. Often, mapping studies ask questions related to frequencies over time or publication forums.

Conduct search. In this step, primary studies are gathered. Strategies to search for primary studies include snowballing, manual search, and database search. Additionally, strategies to develop the search include defining the terms based on the PICO (Population, Intervention, Comparison, and Outcome) model, deriving keywords from known papers, and consulting librarians or experts. The search can be evaluated using a test set of known papers, expert evaluation, checking authors' websites, and utilizing test-retest [132].
It is a good strategy to utilize a quasi-gold standard of manually selected papers to tune the performance of the search string [133].

Screening of papers. Papers are included or excluded based on criteria to remove studies that do not answer the research questions. Strategies for deciding whether to include or exclude papers comprise decision rules, resolving disagreements among multiple researchers, and identifying objective criteria so the evaluation stays objective [132].

Keywording using abstracts. The classification scheme is defined to map the studies. A technique to create the classification scheme is keywording, which has two main steps. First, keywords and concepts are gathered by reviewers reading abstracts, though introductions and conclusions may also be used. Then, the keywords are combined to gather a high-level understanding of the research and help identify representative categories. Keywording is one of many ways to analyze data from literature reviews or mapping studies [129].

Data extraction and mapping process. Finally, the data is extracted and classified based on the scheme, though the scheme can further evolve during the extraction. This produces the results of the systematic mapping study.

2.5.2 Case studies

Case studies are a research methodology that studies contemporary phenomena that are hard to study in isolation. The case study process, shown in Fig. 2.10, has five major steps [134,135].

Case study design. First, the reasons for studying the case are defined. Based on this, the objective (what we expect to achieve) is established and further refined into research questions, propositions, and hypotheses. Furthermore, the case and the units of analysis are selected. A case is any contemporary software engineering phenomenon in its real-life setting (e.g., software projects, individuals, processes, or technologies). The case can then be further composed of subunits (i.e., units of analysis).
Both the case and units of analysis are selected intentionally to find typical, critical, revelatory, or unique information. To refine the context, the theoretical frame of reference or related work is defined. The general decisions on how the data is collected are also defined in the design, taking into account the data source, quantity, and type. Finally, where the data will be collected is selected, ensuring there is enough coverage to enhance the validity and reliability of the findings.

Preparation for data collection. Which data will be collected, and how, is defined. Data can be collected directly from the source with the interaction of the researcher (first degree), indirectly where the researcher does not interact with the source (second degree), or independently from artifacts that are already available (third degree). Furthermore, using multiple data sources is advantageous to limit the effects of analyzing only one data source (i.e., triangulation). Methods to collect data include interviews, focus groups, observations, archival data, and metrics.

Collecting evidence. In this step, the previously defined procedures and protocols are executed on the studied case, gathering the results.

Analysis of collected data. With the information gathered, the data is analyzed to understand what happened and to seek patterns within the data. Data can be analyzed both qualitatively and quantitatively. Qualitative data analysis uses techniques such as hypothesis generation and hypothesis confirmation on non-numerical data. Meanwhile, quantitative data analysis focuses on working with numbers, including techniques such as descriptive statistics, correlation analysis, predictive models, and hypothesis testing.

Reporting. Lastly, the findings are reported, tailoring the report to the audience.
In the case of research, the case study report should introduce the work, describe related work, detail the case study design, state the results with their analysis, and present the conclusions.

2.5.3 Controlled experiments

Experiments or controlled experiments in software engineering are empirical studies in which variables are varied as part of the studied setting based on randomized treatments [62]. Quasi-experiments are similar to experiments, yet they do not fully randomize treatments. Experiments are usually done in laboratory-type settings to better control the treatments and variables. There are five main steps in the experimental process, presented in Fig. 2.11.

Scoping. In the first activity, the goal, objectives, and hypothesis of the experiment must be defined clearly.

Planning. Based on the goal of the study, the foundations of the experiment design are defined. This includes defining the context, variables, subjects, and instrumentation. The previously defined hypothesis is refined. Additionally, depending on the objective and resources, the design type is chosen between an experiment and a quasi-experiment. Finally, threats to validity must be analyzed and then mitigated, reduced, or reported.

Operation. Based on the design, the experiment is put into operation. The subjects are prepared if needed, the design is executed, and the data is collected while ensuring its validity.

Analysis & interpretation. With the experimental results, we can now analyze and interpret the data. The same data analysis techniques as in case studies can be used, though experiments tend to gather more quantitative data.

Presentation & package. Finally, the experimental results are reported. This step is similar to the last step of case studies.

2.5.4 Surveys

Surveys are a research technique to gather information from or about people [62,136].
Surveys can be used as a primary source of information for a study or to supplement other empirical software engineering strategies. Surveys intend to understand phenomena based on a sample of a population in order to generalize findings. There are two main types of data collection for surveys: questionnaires and interviews. In the first type, questionnaires, respondents answer physical or digital questions from forms, which are given back as evidence. In the second type, interviews, questions are asked by interviewers and answered by respondents; the answers are recorded to be later transcribed for analysis.

Figure 2.10: Case study process adapted from [134,135]

Figure 2.11: Experiment process [62]

There are three different types of surveys: descriptive, explanatory, and exploratory. Descriptive surveys focus on understanding the distribution of the population to enable assertions. Explanatory surveys focus on exploring certain claims in populations, for example, understanding why certain phenomena become present. Lastly, exploratory surveys are used before the main study execution to test the instruments and refine the study design.

Surveys, as with any empirical methodology, need to be designed [137]. Though survey instruments can be reutilized, this is rare in software engineering [138]. Hence, survey instruments must be valid (i.e., measure what they set out to measure) and reliable (i.e., produce reproducible data) [139]. Sampling considerations to gather representative subsets of a population are vital while conducting surveys [140]. Surveys can be analyzed with techniques similar to those of case studies and experiments, though the type of analysis depends on the data [141].
Some common pitfalls and considerations exist, such as validating the correctness or completeness of the survey, partitioning data into segments based on demographics, and transforming scales.

Chapter 3
Characterizing developer contribution research in software engineering

This chapter presents a summary of the design and results of the systematic mapping study that characterizes how software development contributions are researched in software engineering. The specific objective tackled is:

SO1. Characterize software engineering research of developer contributions.

The work thus provided a synthesis of the state of the art. To achieve this aim, we conducted a systematic mapping study [64,65] that characterizes developer contribution research in software engineering. This work therefore complements prior primary works that characterize the types of contributions using practitioner guidelines and information. Understanding the literature can help standardize terms, consolidate findings, and identify gaps for future work. At the same time, the approaches and measures helped inspire the tool design (Chapter 4). Furthermore, the empirical design approaches and challenges motivated the empirical evaluations (Chapter 5).

The work of this chapter is based on the following paper, published as part of this thesis [142]. In Section 3.1 we present the design of the study. Then, in Section 3.2, a synthesis of the main results is provided. Finally, in Section 3.3 we summarize and discuss the main findings. The full work is in Appendix A.

3.1 Study design

An overview of our approach is shown in Fig. 3.1. The process had four main steps: definition of the research questions, conducting the search, screening papers, and data extraction and analysis.

Figure 3.1: Mapping study process

We started the process by defining eight research questions. They are shown with their motivation in Table 3.1.
Thus, we classify the contribution types, research topics, research design practices, measurement constructs, assessment approaches, contexts under study, threats to validity, and challenges.

Table 3.1: Mapping study research questions with their motivation

RQ1. What types of developer contributions have been investigated in software engineering studies? (Motivation: to construct a classification of the types of contributions in software engineering research.)
RQ2. What topics are addressed in software contribution works? (Motivation: to discover the areas in which developer contributions have been researched.)
RQ3. What are the research design practices in developer contribution studies? (Motivation: to discover the design of the studies.)
RQ4. What measures are detailed in developer contribution research? (Motivation: to collect which measures studies have proposed and employed.)
RQ5. What are the assessment approaches presented in developer contribution research? (Motivation: to find the assessment approaches used to assess developer contributions.)
RQ6. What contexts are investigated in developer contribution studies in software engineering? (Motivation: to assemble the research settings of the works, summarizing the usage scenarios of the research.)
RQ7. What threats to validity are described in the software engineering literature? (Motivation: to gather the reported threats to validity of the works to serve as a checklist in the design of future studies.)
RQ8. What contribution assessment challenges are reported by the researchers? (Motivation: to assemble the prevalent challenges of assessing developer contributions indicated by the literature.)

Then, we searched digital libraries to gather the set of potentially relevant papers to achieve our goal and answer our questions. This required specifying two main components: digital databases and search strings. We gathered the potentially relevant papers utilizing three digital databases: Scopus, IEEE Xplore, and Web of Science. They were chosen as they have been used previously in software engineering secondary studies and provide deterministic results; thus, the selected databases are in line with previous work [143]. For the query construction, we utilized a set of 11 control papers that acted as a quasi-gold standard to validate our search string [133]. The resulting search query was: (“software engineering” OR “software development” OR “software system” OR “software project”) AND contribution AND (assess* OR evaluat* OR measur* OR examin*) AND (developer* OR team* OR student*). We added terms that specified the subject of the papers (e.g., contribution), delimited the context (e.g., software engineering), and were adaptable to different terms (e.g., student, team), inspired by common words found in our control studies. The final query hence retrieved all the control papers, achieving a sensitivity of 100%. This surpassed the 80% threshold, so the query had acceptable performance [133]. The search resulted in 1,112 potentially relevant papers; after removing duplicates, 828 distinct potentially relevant studies remained.

Papers were then screened to synthesize only the studies relevant to our work. First, we collaboratively and iteratively defined a set of inclusion and exclusion criteria as a basis for our screening, shown in Table 3.2. Then, works were screened in a two-phase process. In the first phase, the two first authors of the study screened works based on the title, abstract, and keywords, resulting in the selection of 287 studies; in case of any doubt about the relevance of a work, it was included to be checked in the full-text screening. Subsequently, a full-text screening was conducted, selecting 166 relevant papers. The final precision of the string was 20%, hence the performance of the string was within acceptable ranges as indicated by software engineering guidelines [133].
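The search-performance figures reported above can be reproduced with a small sketch (the function names are ours):

```python
def sensitivity(control_found, control_total):
    """Share of the quasi-gold-standard papers retrieved by the search string."""
    return control_found / control_total

def precision(relevant, retrieved):
    """Share of retrieved papers that survived screening as relevant."""
    return relevant / retrieved

# Figures from this study: all 11 control papers were retrieved,
# and 166 of the 828 distinct retrieved papers were relevant.
print(f"sensitivity: {sensitivity(11, 11):.0%}")  # sensitivity: 100%
print(f"precision:   {precision(166, 828):.0%}")  # precision:   20%
```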
Validation of the screening showed high inter-coder agreement (90.1%), with acceptable inter-rater performance (Cohen's kappa of k = 0.794 with p < 0.001, and Krippendorff's alpha of α = 0.793).

Table 3.2: The inclusion (I) and exclusion (E) criteria

E1 (Exclusion): Not written in English.
E2 (Exclusion): Unrelated to software engineering.
E3 (Exclusion): Without full text available.
E4 (Exclusion): Not a primary study.
I1 (Inclusion): Studies software developer contributions.
I2 (Inclusion): Assesses software developer contributions.
I3 (Inclusion): Investigates software engineering projects.

Finally, the extraction and analysis phase was carried out. First, data extraction fields, shown in Table 3.3, were defined to answer each research question. The number of themes, codes, and occurrences of the codes is shown for each data extraction item. Then, the selected relevant papers' information was synthesized through thematic analysis [129,144], also inspired by the keywording approach [65]. We constructed the categories collaboratively with an iterative integrated approach that combined deductive codes from previous works [17,20,21,23,24,62,65,82,92,99,100,145–153] and inductive codes that emerged from the data. We therefore extracted the data, created codes, translated codes into classifications, and, when appropriate, created themes for the data. Works could be classified into multiple categories for the same data extraction item; for example, a study could consider contributions to both the code and the communication in a project. Additionally, we gathered general information from the studies and other data from online sources such as GitHub or the tools themselves. Lastly, for each study, quality assessment criteria were evaluated to quantify the level of detail of the reports, describing whether they had enough information to answer our research questions. Scores ranged between 0 and 10. The criteria are shown in Table 3.4.
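The chance-corrected agreement statistic used to validate the screening above, Cohen's kappa for two raters, can be computed with a minimal sketch (the function name and example labels are ours):

```python
def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa: chance-corrected agreement between two raters."""
    n = len(labels_a)
    # Observed agreement: share of items both raters labeled identically.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement under independence of the raters' label distributions.
    categories = set(labels_a) | set(labels_b)
    expected = sum(
        (labels_a.count(c) / n) * (labels_b.count(c) / n) for c in categories
    )
    return (observed - expected) / (1 - expected)

# Hypothetical include (1) / exclude (0) decisions by two screeners.
rater_a = [1, 1, 0, 0, 1, 0]
rater_b = [1, 1, 0, 0, 0, 0]
print(round(cohens_kappa(rater_a, rater_b), 3))  # prints 0.667
```

A kappa of 0 means agreement no better than chance and 1 means perfect agreement; the study's value of 0.794 is commonly read as substantial agreement.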
Table 3.3: Data extraction fields with their dimensions

Question | Data extraction field | Themes | Codes | Occurrences
RQ1 | Type of contribution | 4 | 11 | 210
RQ1 | Development activity | 3 | 9 | 245
RQ2 | Topics | 4 | 35 | 232
RQ3 | Research contribution | - | 5 | 252
RQ3 | Research method | - | 4 | 195
RQ3 | Research type | - | 5 | 168
RQ3 | Analysis type | - | 2 | 177
RQ3 | Analysis techniques | - | 5 | 327
RQ4 | Measures | 12 | 295 | 1391
RQ5 | Extraction techniques | 2 | 9 | 292
RQ5 | Software repositories | 9 | 26 | 197
RQ5 | Software artifacts | 9 | 23 | 470
RQ5 | Tools | - | 79 | 133
RQ5 | Datasets | - | 53 | 84
RQ6 | Context | - | 6 | 185
RQ6 | Project | - | 9127 | 11838
RQ7 | Threats to validity | 3 + 4 | 64 | 943
RQ8 | Challenges | 3 | 23 | 342
Total | | 53 | 9781 | 17681
Total without projects | | 53 | 654 | 5843

Table 3.4: Study quality assessment criteria

Q1: Does the study explicitly mention the goal? (1 = mentions the goal; 0 = there are no goals)
Q2: Does the study apply the assessment? (2 = applied in at least two projects; 1 = applied in one project; 0 = not applied in any project)
Q3: Does the study describe how the contributions are assessed? (2 = describes the assessment explicitly; 1 = describes the assessment implicitly; 0 = no description of the assessment)
Q4: Does the study describe how the contribution assessment data was extracted? (2 = describes the data extraction explicitly; 1 = describes the data extraction implicitly; 0 = the data extraction is not described)
Q5: Does the study analyze the assessed contribution data? (1 = the data is analyzed; 0 = the data is not analyzed)
Q6: Does the study explicitly describe the threats to validity? (2 = describes threats to validity explicitly; 1 = describes threats to validity implicitly; 0 = there are no described threats to validity)

3.2 Results and discussion

As for the results, the oldest publication was from 1989 and the most recent was from 2021 (the year in which the query was executed). Since 2008, at least one study assessing developer contributions has been published every year.
Additionally, more works were published as conference papers (122 studies) than as journal articles (44 studies). Though we found 105 distinct publication venues, only 26 venues had at least two studies. The average quality score of the studies was 7.2; hence, the studies were, in general, comprehensive reports. For the following results, it is important to note that studies could be classified into multiple categories. For example, studies can consider multiple types of contribution or utilize multiple approaches.

Starting with the types of contributions (RQ1), we found four main types studied in the literature. First, contributions can be related to the development of the software products (130 studies), such as code contributions (111 studies), issues (16 studies), and tasks (2 studies). Another common type of contribution investigated is involvement within the community (23 studies), contributing through communication (23 studies) and providing attention (1 study). Contributions that support software projects were rare (6 studies); these focused on documentation for the project (4 studies) and administration (3 studies). Furthermore, some publications were ambiguous in their definition of a contribution (26 studies). A total of 11 different types of contributions were identified. The most studied types of contributions were related to the code (111 studies); the least investigated were related to design (2 studies) and attention (1 study).

The topics of the works were focused on four themes (RQ2). The most studied area aimed to comprehend software development phenomena (107 studies), for example, understanding contribution inequality between team members (4 studies). These comprehension works focused on people (53 studies), artifacts (50 studies), or systems (28 studies). This was followed by works that focused on training and teaching students (34 studies).
Then, another topic of study focused on constructing models and artifacts (29 studies). Finally, the least studied topic was proposing contribution assessment approaches (22 studies).

For the research design (RQ3), we classified the research contribution, research method, research type, analysis type, and analysis techniques. The most common research contribution was models (91 studies), and the most common research type was validation research (87 studies). Studies mostly analyze quantitative data (105 studies), and statistical analysis techniques are very common (131 studies).

We identified 297 distinct measures mentioned in 161 works, which we classified into 12 construct types (RQ4). The most utilized type of measure relates to repository activities (120 studies); in particular, the number of commits has been the most used measure (54 studies). This is followed by demographic information (73 studies), for example, the number of contributors to a project (23 studies). Meanwhile, performance (64 studies) and quality (57 studies) metrics were also frequent. The least common measured constructs are those related to significance (14 studies) and purpose (6 studies). Additionally, we also classified the data type (quantitative or qualitative) and extraction type (objective or subjective) of each mentioned measure. Almost all measured constructs relate to quantitative, objective data.

Regarding the assessment approaches (RQ5), we gathered the extraction techniques, software repositories, software artifacts, tools, and datasets. Publications have mostly focused on utilizing mining software repository techniques (131 studies), notably by mining the metadata of the projects (121 studies), for example, gathering the number of lines of code changed in a commit. Code repositories are the most prominent type of repository used (110 studies); as such, code artifacts are also frequently used (124 studies).
Still, human perception is also widely used to assess contributions (45 studies), for example, through experts, peers, or self-evaluation of the contributions. We found no tools or datasets specifically proposed for contribution assessment that were used in more than three works.

As for the context (RQ6), more than half of the works were investigated within the open-source software domain (109 studies). Student education (40 studies) and practitioner-industry (20 studies) projects were also mentioned as domains. We additionally gathered the specific open-source projects mentioned and mined popularity and activity information from their repositories. We found that Java (39 studies) is the most investigated language and that Apache owns the most projects (26 studies). Additionally, few studied projects had fewer than 100 commits (7 studies).

We also gathered the threats to validity reported in the studies (RQ7). Threats to validity were found in most works (144 studies). These were classified as internal (118 studies), construct (117 studies), external (98 studies), or conclusion (73 studies) threats to validity. We found 64 different threats to validity in the studies. The most mentioned threat was generalization across selection or settings (83 studies). This was followed by the threat of mono or unbalanced operationalization (78 studies), where there may be issues due to not correctly representing a construct; notably, some studies mention this threat in their contribution assessment. Another widely mentioned threat was related to inaccurate data (48 studies), for example, issues with user identities or manipulated commits.

Finally, regarding the challenges of assessing contributions (RQ8), 23 were mentioned across 72 papers and grouped into three categories. The most mentioned challenge relates to how usable the approaches are (57 studies).
Examples of such pain points include the personalization of approaches (28 studies) and techniques being simple to use (28 studies). The effectiveness of the works was also commonly mentioned (47 studies); the most reported issue within this category relates to how comprehensive the approaches are (37 studies). The least mentioned theme was the dependability of the results (45 studies); its most mentioned pain points relate to the fairness of the results (24 studies) and encouragement for adopters (24 studies).

3.3 Summary

In this chapter, we characterized the research on developer contributions in software engineering with a mapping study. The results show that the notion of a contribution in software engineering is broad. Yet, we need to provide more usable, effective, and dependable assessments of contributions. More focus should be given to considering more than code contribution activity measures. Work is also needed to support the reusability and replicability of results through tools and datasets. Another opportunity is to evaluate the approaches with adopters.

The findings indicate that software contribution measurement tools are needed, as assessment approaches need to be usable, effective, and dependable. Despite the large amount of work, fewer than 30 studies proposed tools, and none have been extensively utilized in research. Additionally, we have to consider more than just the number of commits as a software engineering contribution, including its quality and value. For the code, we can also consider the relationship of these contributions with others through traces. Finally, implementing and evaluating the approaches in practice is also needed.

In this thesis, we help tackle these issues by proposing a measurement procedure model that considers multiple dimensions of a software contribution. With such procedures, we can improve the comprehensiveness of the operationalization of software contributions.
Additionally, we implemented the procedure in a tool, aiding adoption. The classifications and insights gained from this study inspired both the measurement procedure design (Chapter 4) and the empirical evaluations (Chapter 5).

Chapter 4
Developing the automated code contribution measurement procedure

In this chapter, we detail the design and development of an automated software measurement procedure to collect developer code contributions by mining software repositories. The following specific objective is addressed:

SO2. Design an automated procedure to measure developer contributions by mining software repositories.

This measurement procedure helped automatically measure developer contributions. To achieve this aim, we designed a measurement procedure inspired by the measurement context model [66]. This model clarifies the distinct steps for designing measurement methods, specifically for software engineering. Additionally, we instantiated the procedure by developing a tool for code contributions following iterative development [23,24]. This chapter is based on the paper published as part of this thesis [154]. We detail the measurement procedure design in Section 4.1. The development of the tool is described in Section 4.2. We finalize by summarizing the contributions of this chapter in Section 4.3. The full work is in Appendix G.

4.1 Measurement procedure design

Our measurement procedure, shown in Fig. 4.1, has three main steps: design, application, and exploitation. In the following, we describe each step.

The first phase of the measurement procedure is the design, where the measurement method is proposed. First, we defined the goals and measures of the measurement procedure with a generic Goal Question Metric model [19]. The goal of the measurement procedure is to automatically measure developer contributions by mining software repositories.
Based on this, we proposed six different questions based on different dimensions of the contributions with four types of questio