Stack Overflow (SO) is a popular Q&A forum for software developers, providing a large number of copyable code snippets. While GitHub is an independent code collaboration platform, developers often reuse SO code in their GitHub projects. In this paper, we investigate how often GitHub developers reuse code snippets from SO, as well as which concepts they are more likely to reference in their code. To accomplish our goal, we mine the SOTorrent dataset, which links code snippets in SO posts to software projects hosted on GitHub. We then study the characteristics of GitHub projects that reference SO posts and which popular SO discussions appear in GitHub projects. Our results demonstrate that, on average, developers make 45 references to SO posts in their projects, with the highest number of references being made in JavaScript code. We also found that 79% of the SO posts with code snippets that are referenced in GitHub code change over time (at least once), raising code maintainability and reliability concerns.
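
As a rough illustration of what counting SO references per project can look like, the following minimal Python sketch scans source text for Stack Overflow links and averages the counts across projects. It is not the authors' pipeline: the study relies on the links already provided by SOTorrent, whereas the regular expression, the function names (`extract_so_post_ids`, `references_per_project`), and the sample data here are all hypothetical and chosen only for illustration.

```python
import re
from collections import defaultdict
from statistics import mean

# Assumed URL shapes for SO links: /questions/<id>, /q/<id>, /a/<id>.
# Ids after /a/ are answer ids; the others are question ids.
SO_LINK = re.compile(
    r"https?://(?:www\.)?stackoverflow\.com/(?:questions|q|a)/(\d+)",
    re.IGNORECASE,
)


def extract_so_post_ids(source_text):
    """Return the numeric post ids of all SO links found in the text."""
    return [int(post_id) for post_id in SO_LINK.findall(source_text)]


def references_per_project(files):
    """Count SO references per project.

    `files` is an iterable of (project_name, source_text) pairs.
    """
    counts = defaultdict(int)
    for project, text in files:
        counts[project] += len(extract_so_post_ids(text))
    return dict(counts)


# Hypothetical sample data, not taken from the study.
sample_files = [
    ("alice/app", "// see https://stackoverflow.com/questions/11227809/why\nvar x = 1;"),
    ("alice/app", "# adapted from http://stackoverflow.com/a/231767"),
    ("bob/lib", "/* https://stackoverflow.com/q/503093 */"),
]

counts = references_per_project(sample_files)
average = mean(counts.values())
print(counts, average)
```

Grouping the same counts by file language instead of by project would give a per-language breakdown comparable in spirit to the JavaScript finding reported above.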

Code evolution, Code snippets, GHTorrent, GitHub, Open-source, SOTorrent, StackOverflow, Tags
16th IEEE/ACM International Conference on Mining Software Repositories, MSR 2019
School of Computer Science

Manes, S.S. (Saraj Singh), & Baysal, O. (2019). How often and what StackOverflow posts do developers reference in their GitHub projects? In IEEE International Working Conference on Mining Software Repositories (pp. 235–239). doi:10.1109/MSR.2019.00047