People, Processes, and Products: Case Studies in Open-Source Software Using Complex Networks

Persistent Link:
http://hdl.handle.net/10150/217072
Title:
People, Processes, and Products: Case Studies in Open-Source Software Using Complex Networks
Author:
Ma, Jian James
Issue Date:
2011
Publisher:
The University of Arizona.
Rights:
Copyright © is held by the author. Digital access to this material is made possible by the University Libraries, University of Arizona. Further transmission, reproduction or presentation (such as public display or performance) of protected items is prohibited except with permission of the author.
Abstract:
Open-source software has become increasingly popular. Many startup companies and small business owners adopt open-source software packages to meet their daily office computing needs or to build their IT infrastructure. Unlike proprietary software systems, open-source software systems usually have a loosely organized developer collaboration structure. Developers work on their "assignments" on a voluntary basis, and many never physically meet their "co-workers." This distinctive collaboration pattern leads to a distinctive software development process, and hence to a distinctive structure of software products. These characteristics of open-source software motivate this dissertation. Our research follows the framework of the four key elements of software engineering: Project, People, Process, and Product (Jacobson, Booch et al. 1999). This dissertation studies three of the four P's: People, Process, and Product. Because many open-source software packages are large and highly complex, the traditional analysis methods and measures of software engineering cannot be readily applied to them. We therefore adopt complex network theory to analyze open-source software packages, the software development process, and the collaboration among software developers. We aim to discover common characteristics shared by different open-source software packages and to provide a possible explanation of how those products are developed. Specifically, we represent real-world entities, such as open-source software source code or developer collaborations, as networks of interconnected vertices. We then apply the topological metrics established in complex network theory to analyze those networks. We also propose our own random network growth model to illustrate open-source software development processes.
Our research results can potentially be used by software practitioners who are interested in developing high-quality software products and reducing risk in the development process. Chapter 1 introduces the dissertation's structure and research scope, namely the study of open-source software with complex networks, and presents the details of the 4-P framework. Chapter 2 analyzes five C-language open-source software packages using function dependency networks, and calculates the topological measures of the dependency networks extracted from their source code. Chapter 3 analyzes the collaborative relationships among open-source software developers. We extract developers' co-working data from two software bug-fixing data sets. Again using complex network theory, we identify a number of topological characteristics of the software developer networks, such as the scale-free property. We also observe topological differences between the bug side and the developer side of the extracted bipartite networks. Chapter 4 compares two widely adopted clustering coefficient definitions, one proposed by Watts and Strogatz and the other by Newman. The analytical similarities and differences between the two definitions guide the design of the random network growth model presented in Chapter 5. Chapter 5 characterizes the open-source software development process. We propose a two-phase network growth model that describes how software source code units interconnect as the software grows. A case study uses the same five open-source software packages analyzed in Chapter 2. The empirical results show that our model provides a possible explanation of how open-source software products are developed.
Chapter 6 concludes the dissertation and highlights possible directions for future research.
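The two clustering coefficient definitions that the abstract attributes to Chapter 4 can be illustrated on a toy graph. The sketch below is not taken from the dissertation. It assumes the commonly cited forms of the two definitions: the Watts and Strogatz coefficient as the mean of per-vertex local clustering, and the Newman coefficient as the fraction of connected neighbor pairs, summed over all vertices, that are closed by an edge. The edge list, the function names, and the convention that a vertex with fewer than two neighbors contributes a local value of 0 are all assumptions made for illustration.

```python
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map from (u, v) pairs (no self-loops assumed)."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def closed_and_total_pairs(adj, v):
    """Return (neighbor pairs of v joined by an edge, all neighbor pairs of v)."""
    neighbors = sorted(adj[v])
    total = len(neighbors) * (len(neighbors) - 1) // 2
    closed = sum(1 for a, b in combinations(neighbors, 2) if b in adj[a])
    return closed, total


def watts_strogatz_clustering(adj):
    """Average of per-vertex ratios; vertices with < 2 neighbors count as 0 (assumed convention)."""
    local = []
    for v in adj:
        closed, total = closed_and_total_pairs(adj, v)
        local.append(closed / total if total else 0.0)
    return sum(local) / len(local)


def newman_clustering(adj):
    """Ratio of summed closed pairs to summed pairs, i.e. 3 * triangles / connected triples."""
    closed_sum = 0
    total_sum = 0
    for v in adj:
        closed, total = closed_and_total_pairs(adj, v)
        closed_sum += closed
        total_sum += total
    return closed_sum / total_sum if total_sum else 0.0


# Hypothetical toy graph: one triangle (a, b, c) plus two pendant vertices on the hub a.
EDGES = [("a", "b"), ("a", "c"), ("b", "c"), ("a", "d"), ("a", "e")]
ADJ = build_adjacency(EDGES)

if __name__ == "__main__":
    print("Watts-Strogatz:", watts_strogatz_clustering(ADJ))
    print("Newman:", newman_clustering(ADJ))
```

On this toy graph the averaged definition gives 13/30 and the ratio-of-sums definition gives 3/8. Low-degree vertices carry the same weight as the hub in the average, while the hub dominates the ratio of sums, which is one way the two measures can diverge on the same network.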
Type:
text; Electronic Dissertation
Keywords:
Modeling; Open Source; Random Network; Software; Management Information Systems; Complex Network; Information Systems
Degree Name:
Ph.D.
Degree Level:
doctoral
Degree Program:
Graduate College; Management Information Systems
Degree Grantor:
University of Arizona
Advisor:
Zeng, Daniel

Committee Members:
Pingry, David E.; Zhang, Zhu; Zeng, Daniel