Using randomly chosen subjects in software engineering research
Introduction
Many types of software engineering research require software subjects that a researcher can use during an experimental evaluation of a technique. For instance, if you develop a new way to test software, then you need to demonstrate that it can effectively find defects in real-world programs. So, where will you find these programs? And, how can you select the programs so as to minimize the risk of compromising your experiment?
One way to pick subject programs is to download them from an archive like the Software-Artifact Infrastructure Repository (SIR). However, there is growing concern in the software engineering community that selecting programs from SIR may not always lead to a useful evaluation of a method.
While it is possible to run experiments with programs that you download from sites like GitHub, there are challenges associated with this approach as well. For instance, researchers may be inclined, consciously or not, to download and use programs on which their technique is likely to perform well, which is a threat to the validity of their experimental results.
Solutions
This sounds like a thorny problem! Are there any viable solutions? Well, yes!
If you happen to conduct research that focuses on Java programs, then you can use the SF100, a corpus of 100 Java projects randomly sampled from SourceForge, to evaluate your new approach. What if your research does not focus on Java? Thankfully, in certain domains, “natural” solutions have recently emerged. For instance, if, like me, you develop and evaluate ways to test web sites, then you can use the Discuvver site to randomly select a web site for use in your experimental study. Click a button and you have a subject!
In a recent paper that introduces a technique for testing mobile-ready web pages, my colleagues and I used Discuvver to randomly pick the web sites for the experiments that we report (Walsh, Kapfhammer, and McMinn 2017).
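If no random-selection service exists for your domain, one way to approximate the same idea is to draw subjects at random from a list of candidates that you fix before looking at any results. Here is a minimal Python sketch of that idea. It is not the procedure from our paper, and the function name pick_subjects, the example.org site names, and the seed value are all hypothetical placeholders.

```python
import random


def pick_subjects(candidates, k, seed):
    """Draw k subjects uniformly at random, without replacement."""
    # A dedicated generator with a recorded seed makes the draw repeatable.
    rng = random.Random(seed)
    # Sorting first means the result depends only on the seed and on which
    # candidates are present, not on the order in which they were listed.
    return rng.sample(sorted(candidates), k)


# Hypothetical candidate pool; in practice this would be the full list of
# programs or web sites that meet your inclusion criteria.
candidates = [
    "alpha.example.org",
    "bravo.example.org",
    "charlie.example.org",
    "delta.example.org",
    "echo.example.org",
    "foxtrot.example.org",
]

subjects = pick_subjects(candidates, k=3, seed=2017)
print(subjects)
```

If you report the seed and the candidate list alongside your results, then other researchers can repeat the draw and confirm that you did not hand-pick the subjects.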
If you are a software engineering researcher, I am interested in learning how you pick the subjects that you use to experimentally evaluate your methods. I would also appreciate your feedback on our approach to testing mobile-ready web pages. So, please contact me to share your insights!