Developer support forums are becoming more popular than ever. Crowdsourced knowledge is an essential resource for many developers, yet it can raise concerns about the quality of the shared content. Most existing research efforts address the quality of answers posted by Q&A community members. In this paper, we explore the quality of questions and propose a method of predicting the score of questions on Stack Overflow based on sixteen factors related to questions' format, content, and the interactions that occur in the post. We performed an extensive investigation to understand the relationship between the factors and the scores of questions. The multiple regression analysis shows that the length of the question's code, the accepted answer's score, the number of tags, and the counts of views, comments, and answers are statistically significantly associated with the scores of questions. Our findings can offer insights to community-based Q&A sites for improving the content of the shared knowledge.
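The modeling approach described above can be illustrated with a minimal sketch: an ordinary least squares fit of question scores on a handful of hypothetical Stack Overflow features (code length, tag count, view count). The feature names and toy data are assumptions for illustration only, not the paper's actual sixteen factors or dataset, and the solver below (normal equations via Gaussian elimination) is a generic OLS implementation, not the authors' tooling.

```python
# Minimal multiple-regression sketch (illustrative only, not the paper's
# actual model or data): fit question scores on a few hypothetical
# features by solving the normal equations (X^T X) beta = X^T y.

def transpose(m):
    return [list(row) for row in zip(*m)]

def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]

def solve(a, b):
    # Gaussian elimination with partial pivoting; solves A x = b.
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def fit_ols(features, scores):
    # Prepend an intercept column, then solve the normal equations.
    X = [[1.0] + list(row) for row in features]
    Xt = transpose(X)
    XtX = matmul(Xt, X)
    Xty = [sum(x * y for x, y in zip(col, scores)) for col in Xt]
    return solve(XtX, Xty)

# Toy rows: [code_length, n_tags, views] -> question score (made-up values).
features = [[120, 2, 500], [30, 4, 1500], [200, 1, 300], [80, 3, 900], [150, 5, 2000]]
scores = [3, 8, 1, 5, 12]
beta = fit_ols(features, scores)  # [intercept, b_code_len, b_tags, b_views]
predicted = beta[0] + sum(b * f for b, f in zip(beta[1:], features[1]))
```

The fitted coefficients' signs and significance tests are what an analysis like the paper's would then inspect; a production analysis would use a statistics package (e.g. R or statsmodels) rather than hand-rolled linear algebra.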

Additional Metadata
Keywords Content quality, Crowdsourced knowledge, Prediction model, Questions, Regression analysis
Persistent URL dx.doi.org/10.1145/2897659.2897661
Conference 3rd International Workshop on CrowdSourcing in Software Engineering, CSI-SE 2016
Citation
Alharthi, H. (Haifa), Outioua, D. (Djedjiga), & Baysal, O. (2016). Predicting questions' scores on stack overflow. Presented at the 3rd International Workshop on CrowdSourcing in Software Engineering, CSI-SE 2016. doi:10.1145/2897659.2897661