Abstract
 

Issues are highly prevalent on GitHub because of the growing scale of its software repositories. They are submitted to the issue tracking system for several reasons: reporting a bug, asking a question, or other maintenance activities. Popular repositories on GitHub receive a large number of issues daily. Assigning similar issues individually to different developers for validation and fixing slows the fixing process and introduces inconsistencies, because independent developers fix them asynchronously. In contrast, grouping similar issues into clusters and assigning each cluster to a single appropriate developer or team speeds up the fixing process. This paper proposes a machine-learning-based approach that supports issue management on GitHub by grouping similar issues together. To validate it, the proposed approach was applied to 13 software components from different large repositories. The findings show that the approach identifies clusters of similar issues with promising results under the evaluation measures widely used in this area: Precision, Recall, and F-measure.