
Research Article
A Code Completion Approach Combining Pointer Network and Transformer-XL Network
@INPROCEEDINGS{10.1007/978-3-031-54521-4_17,
  author={Xiangping Zhang and Jianxun Liu and Teng Long and Haize Hu},
  title={A Code Completion Approach Combining Pointer Network and Transformer-XL Network},
  proceedings={Collaborative Computing: Networking, Applications and Worksharing. 19th EAI International Conference, CollaborateCom 2023, Corfu Island, Greece, October 4-6, 2023, Proceedings, Part I},
  proceedings_a={COLLABORATECOM},
  year={2024},
  month={2},
  keywords={Code Completion, Transformer-XL, Pointer Network, Out-of-Vocabulary},
  doi={10.1007/978-3-031-54521-4_17}
}
Xiangping Zhang
Jianxun Liu
Teng Long
Haize Hu
Year: 2024
COLLABORATECOM
Springer
DOI: 10.1007/978-3-031-54521-4_17
Abstract
Code completion is an integral component of modern integrated development environments: it both facilitates the software development process and enhances the quality of software products. By learning the probability distribution over code tokens from large-scale code corpora, deep learning methods have significantly improved the accuracy of token recommendations. However, the effectiveness of deep-learning-based code completion is hindered by information loss. To alleviate this problem, we propose a code language model that combines a pointer network with the Transformer-XL network. The proposed model takes as input the original code fragment and its corresponding abstract syntax tree, and uses Transformer-XL as the base model for capturing long-term dependencies. Furthermore, we integrate a pointer network as a local component to predict out-of-vocabulary words. The proposed method is evaluated on the real-world PY150 and JS150 datasets. The comparative experimental results demonstrate that our model improves the accuracy of token-level code completion.
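The abstract does not spell out how the pointer component is combined with the language model, so the following is only a minimal sketch of one common pointer-mixture scheme, not the authors' formulation. It assumes a scalar gate in [0, 1] that weights a softmax over the fixed vocabulary against a softmax over positions in the preceding context. The names `pointer_mixture`, `predict`, and `gate`, and the numeric scores, are hypothetical and chosen purely for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def pointer_mixture(vocab_logits, pointer_scores, gate):
    """Blend a vocabulary distribution with a copy distribution.

    Assumption (not from the paper): `gate` is the weight given to the
    vocabulary distribution; 1 - gate goes to copying from the context.
    The result has len(vocab_logits) + len(pointer_scores) entries.
    """
    if not 0.0 <= gate <= 1.0:
        raise ValueError("gate must lie in [0, 1]")
    p_vocab = [gate * p for p in softmax(vocab_logits)]
    p_copy = [(1.0 - gate) * p for p in softmax(pointer_scores)]
    return p_vocab + p_copy


def predict(vocab, context_tokens, vocab_logits, pointer_scores, gate):
    """Return the most probable next token, from the vocabulary or the context."""
    dist = pointer_mixture(vocab_logits, pointer_scores, gate)
    best = max(range(len(dist)), key=dist.__getitem__)
    if best < len(vocab):
        return vocab[best]
    # Indices past the vocabulary point back into the context window.
    return context_tokens[best - len(vocab)]


# Hypothetical example: "load_cfg" is out of vocabulary but appears in context.
vocab = ["<unk>", "self", "return"]
context_tokens = ["def", "load_cfg", "(", "path"]
token = predict(vocab, context_tokens,
                vocab_logits=[2.0, 0.5, 0.1],
                pointer_scores=[0.1, 3.0, 0.0, 0.2],
                gate=0.3)
print(token)
```

With a low gate the model favours copying, so an identifier that is missing from the vocabulary can still be recommended if it occurs earlier in the context. With a gate close to 1 the same inputs fall back to the vocabulary prediction.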