E-Predictor: An Approach for Early Prediction of Pull Request Acceptance

2025-02-03 Mon
Keywords: pull request, prediction model, GitHub, open source projects, code review, development efficiency, BERT
Research Breakthrough on Early Prediction of Pull Request Acceptance
In recent years, open-source software (OSS) development has gradually become one of the mainstream software development models, heavily relying on collaboration among developers. The Pull Request (PR) mechanism, widely applied in distributed software development, improves collaboration efficiency. On open-source platforms like GitHub, PRs allow developers to submit code change requests that project maintainers (integrators) then review to decide whether to merge the code into the main branch. However, with the rise in OSS activity, the number of PRs has grown exponentially, significantly increasing the workload of integrators and delaying PR processing times. Effectively managing and predicting PR acceptance has become a hot topic among researchers and developers.
Against this backdrop, Kexing Chen, Lingfeng Bao, Xing Hu from the State Key Laboratory of Blockchain and Data Security at Zhejiang University, and Xin Xia from Huawei’s Software Engineering Application Technology Lab, together with Xiaohu Yang, published a research paper titled “e-predictor: An Approach for Early Prediction of Pull Request Acceptance”. The paper was published in the 2025 issue of Science China Information Sciences (Vol. 68, Iss. 5, DOI: https://doi.org/10.1007/s11432-022-3953-4). This paper systematically introduces a new prediction method named “e-predictor” designed to predict whether a PR will be merged at the time it is created, aiming to reduce the workload of integrators and provide quick feedback.
Research Background and Motivation for Innovation
While the PR-centric development model improves collaboration efficiency in software development, it also introduces significant challenges. On GitHub, for example, over 170 million PRs were merged in 2021. However, the exponential growth in PRs has posed a tremendous review burden on integrators. According to the literature, the average time from PR creation to closure is 37 days, which hinders projects’ timely progress. Current studies attempt to mitigate this issue by building prediction models to pre-screen PRs likely to be accepted and reduce review workload. However, most of the existing models rely on dynamic information, such as comments and discussions, after PR creation. Despite good predictive performance, such models fail to alleviate integrators’ initial workload as they depend on post-creation information.
To address this limitation, the study proposes “e-predictor,” a model that utilizes early-stage features and deep semantic features at the PR creation stage to make predictions. This allows integrators to assess PR priority and estimate workload earlier.
Research Workflow
The research team implemented a systematic workflow to develop and validate e-predictor, dividing the effort into several stages.
Data Collection and Preprocessing
The research team collected PR data from 49 of the most popular open-source projects on GitHub, totaling 475,192 PRs. To ensure the representativeness of these projects, they adopted strict filtering criteria, excluding tutorial-type projects, forked branches, and projects only passively mirrored on GitHub as backups. They then used multiple criteria to determine whether a PR was merged, including:
- Checking if the PR’s “merged at” field was non-null;
- Confirming whether the PR’s commits were included in the main branch;
- Matching the last three comments in the PR with keywords indicating a merge action (e.g., “merged” or “committed”);
- Verifying whether a PR was referenced in commits from the repository.
PRs that did not meet any of the above criteria were regarded as rejected PRs.
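The four checks above can be read as a short chain of fallbacks. The following is a minimal sketch under stated assumptions, not the study's implementation: the `PullRequest` record and its field names are hypothetical, the keyword list contains only the two examples quoted in the text, "commits included in the main branch" is interpreted as "at least one commit appears there", and a reference is assumed to look like `#<number>` in a commit message.

```python
import re
from dataclasses import dataclass, field
from typing import Optional

# Only the two example keywords from the text; the full list is not given there.
MERGE_KEYWORDS = re.compile(r"\b(merged|committed)\b", re.IGNORECASE)


@dataclass
class PullRequest:
    """Hypothetical PR record; field names are illustrative only."""
    number: int
    merged_at: Optional[str] = None
    commit_shas: list = field(default_factory=list)
    comments: list = field(default_factory=list)


def is_merged(pr, main_branch_shas, repo_commit_messages):
    """Return True if any of the four merge criteria holds, else False (rejected)."""
    # 1. The "merged at" field is non-null.
    if pr.merged_at is not None:
        return True
    # 2. A commit of the PR is in the main branch (assumption: one is enough).
    if any(sha in main_branch_shas for sha in pr.commit_shas):
        return True
    # 3. One of the last three comments contains a merge keyword.
    if any(MERGE_KEYWORDS.search(text) for text in pr.comments[-3:]):
        return True
    # 4. A repository commit message references the PR (assumed "#<number>" form).
    reference = re.compile(rf"#{pr.number}\b")
    return any(reference.search(msg) for msg in repo_commit_messages)
```

Ordering the checks from the most direct signal to the most indirect one keeps the cheap, unambiguous case first and leaves the text-matching heuristics as fallbacks.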
The PR description underwent text preprocessing (removing special characters and other noise), and only source code changes (added and deleted lines) were retained for the code changes.
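Keeping only the added and deleted lines of a change can be sketched as below. This is an illustration that assumes the change is available as a git-style unified diff (with `diff --git` file headers and `@@` hunk headers); the function name `changed_lines` is hypothetical, and how the study actually obtained and filtered its diffs is not described here.

```python
def changed_lines(git_diff):
    """Split a git-style unified diff into (added, deleted) source lines.

    Context lines, file headers, and hunk headers are dropped.
    """
    added, deleted = [], []
    in_hunk = False
    for line in git_diff.splitlines():
        if line.startswith("diff --git"):
            in_hunk = False          # a new file header block begins
        elif line.startswith("@@"):
            in_hunk = True           # hunk body follows
        elif in_hunk and line.startswith("+"):
            added.append(line[1:])
        elif in_hunk and line.startswith("-"):
            deleted.append(line[1:])
    return added, deleted
```

Tracking whether the parser is inside a hunk avoids mistaking the `---` and `+++` file header lines for deleted or added code.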
Feature Extraction
e-predictor extracts two main types of features: handcrafted statistical features and deep semantic features. The handcrafted features are divided into three dimensions:
- Contributor Profile Features: Include the contributor’s historical activity prior to the PR submission, such as the number of owned projects, commit counts, submitted PRs, issue participation, and whether the user is a bot account.
- PR-specific Features: Include PR description length, number of commits, lines of code changed, and whether files include tests or documentation.
- Project Profile Features: Include the project’s overall status at the time of PR submission, such as the number of PRs submitted or the number of comments on issues in the past month.
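As an illustration of the PR-specific dimension, the sketch below computes the features named above from data available at creation time. The function name, the input layout, the choice to measure description length in words, and the path heuristics for spotting tests and documentation are all assumptions made for this example; the study's exact definitions are not given here.

```python
def pr_specific_features(description, commits, files):
    """Compute illustrative PR-specific features.

    files: list of (path, lines_added, lines_deleted) tuples (hypothetical layout).
    """
    paths = [path.lower() for path, _, _ in files]
    return {
        # Assumption: length is counted in whitespace-separated words.
        "description_length": len(description.split()),
        "num_commits": len(commits),
        "lines_changed": sum(added + deleted for _, added, deleted in files),
        # Assumption: a path containing "test" marks a test file.
        "has_test": any("test" in p for p in paths),
        # Assumption: Markdown/reST files or a docs/ directory mark documentation.
        "has_doc": any(
            p.endswith((".md", ".rst")) or "/docs/" in "/" + p for p in paths
        ),
    }


features = pr_specific_features(
    "Fix crash on empty input",
    ["c1", "c2"],
    [("src/app.py", 10, 2), ("tests/test_app.py", 5, 0)],
)
```

Every value here depends only on what the contributor submits, which is what makes such features usable before any review activity exists.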
For better semantic understanding, the research team employed pre-trained models for deep semantic feature extraction:

- PR descriptions: RoBERTa was used to encode the text into a 768-dimensional feature vector.

- Code changes: CodeBERT was employed to encode source code changes into a 768-dimensional feature vector.
The two feature vectors were then reduced and fused into a 30-dimensional deep semantic vector.
Prediction Model
To build the prediction model, the team selected the XGBoost classifier, which is widely utilized in various prediction tasks due to its strong performance in handling large-scale, high-dimensional data. e-predictor ultimately combines 76 features (handcrafted and deep semantic features) to train the XGBoost model.
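The feature counts imply a simple input layout: if the 76 features consist of the handcrafted features plus the 30-dimensional deep semantic vector, then 46 of them are handcrafted. That figure is inferred from the numbers in this article rather than quoted from the paper. A minimal sketch of assembling one classifier input row under that assumption (the names below are hypothetical):

```python
DEEP_DIM = 30                              # fused semantic vector, as stated above
TOTAL_DIM = 76                             # total features, as stated above
HANDCRAFTED_DIM = TOTAL_DIM - DEEP_DIM     # 46, inferred rather than reported


def build_feature_row(handcrafted, deep_semantic):
    """Concatenate handcrafted and deep semantic features into one input row."""
    if len(handcrafted) != HANDCRAFTED_DIM:
        raise ValueError(f"expected {HANDCRAFTED_DIM} handcrafted features")
    if len(deep_semantic) != DEEP_DIM:
        raise ValueError(f"expected {DEEP_DIM} deep semantic features")
    return [float(x) for x in handcrafted] + [float(x) for x in deep_semantic]
```

Rows built this way, one per PR, would form the training matrix passed to a gradient-boosted tree classifier such as XGBoost.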
Research Results
Performance Evaluation
The team evaluated e-predictor’s performance using two validation strategies: 10-fold cross-validation and time-aware validation. Metrics such as F1-score, AUC (area under the curve), precision, and recall were used.
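Time-aware validation differs from ordinary cross-validation in that the model is only ever tested on PRs created after the ones it was trained on. The exact protocol used in the study is not described here; the sketch below shows one common form, assumed for illustration, in which PRs are sorted by creation time, cut into chronological chunks, and each chunk is tested using all earlier chunks as training data. The function name and parameters are hypothetical.

```python
def time_aware_splits(created_at, n_folds=10):
    """Yield (train_idx, test_idx) pairs in which no training PR is newer
    than any test PR.

    created_at: list of sortable creation timestamps, one per PR.
    """
    order = sorted(range(len(created_at)), key=lambda i: created_at[i])
    n_chunks = n_folds + 1                 # the first chunk is train-only
    size = len(order) // n_chunks
    if size == 0:
        raise ValueError("not enough PRs for the requested number of folds")
    for k in range(1, n_chunks):
        train = order[: k * size]
        # The last fold absorbs any remainder.
        end = (k + 1) * size if k < n_folds else len(order)
        test = order[k * size : end]
        yield train, test
```

Because random 10-fold splits let a model train on PRs that are newer than its test PRs, a chronological split like this gives a more conservative estimate of how the model would behave in deployment.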
Results demonstrated outstanding performance by e-predictor in predicting whether a PR would be merged:
- In 10-fold cross-validation, e-predictor achieved an F1@Merge score of 90.1% and an AUC of 85.4%, significantly outperforming baseline models.
- In time-aware validation, its AUC reached 81.6%, again outperforming baseline approaches.
However, because the dataset is imbalanced (approximately a 7:3 merge-to-rejection ratio), e-predictor was weaker at predicting rejected PRs, with an F1@Reject of 60.5% in 10-fold cross-validation. Its precision on rejected PRs was nonetheless an acceptable 75.1%, meaning that roughly three out of four PRs it flagged as rejections were in fact rejected.
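The two reject-class figures also pin down a third one. Assuming F1@Reject is the usual harmonic mean of precision and recall on the rejected class, and that both reported numbers come from the same 10-fold setting, the recall can be recovered by rearranging F1 = 2PR / (P + R). The value computed below is derived from the article's figures and is not reported in the text.

```python
def recall_from_f1_and_precision(f1, precision):
    """Solve F1 = 2PR / (P + R) for R, giving R = F1 * P / (2P - F1)."""
    denominator = 2 * precision - f1
    if denominator <= 0:
        raise ValueError("no valid recall exists for these inputs")
    return f1 * precision / denominator


implied_recall = recall_from_f1_and_precision(0.605, 0.751)
print(f"{implied_recall:.3f}")  # prints 0.507
```

Under these assumptions, about half of the truly rejected PRs are caught, so the weaker F1@Reject is driven by recall rather than by precision.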
Feature Importance Analysis
Further analysis of the handcrafted features showed that PR description quality (e.g., level of detail) and the contributor’s past experience within the target project are key factors in PR acceptance. Two findings stand out:
1. Clear and high-quality PR descriptions enhance acceptance chances.
2. Prior merge experience within the same project significantly improves the likelihood of subsequent PR acceptance.
Significance and Value
The study’s significance lies in its proposal of a practical and efficient PR prediction method. e-predictor provides decision-making support at the time of PR creation through early-stage and deep semantic analysis, significantly reducing integrator workload. Moreover, this system offers contributors quick feedback, allowing immediate refinement of code or descriptions. Crucially, the research exemplifies how deep learning techniques can complement traditional feature engineering to solve real-world software engineering challenges.
This work not only enhances PR management efficiency in the OSS community but also lays foundational data and methodologies for studying collaboration patterns in OSS projects.
Conclusion and Future Prospects
e-predictor presents an effective solution for automating PR processing, with considerable academic and practical value. While limitations remain (e.g., data imbalance issues), future expansions, including incorporating additional features and further optimizing algorithms, can improve the tool’s applicability and precision. The research team also plans to broaden the dataset scope and develop more generalizable and robust models to support a wider range of open-source projects.