
Electronic data: PDF document, 1.38 MB


Learning to Represent Patches

Research output: Working paper › Preprint

Published

Standard

Learning to Represent Patches. / Tang, Xunzhu; Tian, Haoye; Chen, Zhenghan et al. 2023.


Harvard

Tang, X, Tian, H, Chen, Z, Pian, W, Ezzini, S, Kader Kabore, A, Habib, A, Klein, J & Bissyande, TF 2023 'Learning to Represent Patches'. <http://adsabs.harvard.edu/abs/2023arXiv230816586T>

APA

Tang, X., Tian, H., Chen, Z., Pian, W., Ezzini, S., Kader Kabore, A., Habib, A., Klein, J., & Bissyande, T. F. (2023). Learning to Represent Patches. http://adsabs.harvard.edu/abs/2023arXiv230816586T

Vancouver

Tang X, Tian H, Chen Z, Pian W, Ezzini S, Kader Kabore A et al. Learning to Represent Patches. 2023 Aug 1.

Author

Tang, Xunzhu ; Tian, Haoye ; Chen, Zhenghan et al. / Learning to Represent Patches. 2023.

Bibtex

@techreport{8df30737bf4d4d7491964cc4ea25bde9,
title = "Learning to Represent Patches",
abstract = "Patch representation is crucial in automating various software engineering tasks, like determining patch accuracy or summarizing code changes. While recent research has employed deep learning for patch representation, focusing on token sequences or Abstract Syntax Trees (ASTs), such approaches often miss the change's semantic intent and the context of modified lines. To bridge this gap, we introduce a novel method, Patcherizer. It delves into the intentions of context and structure, merging the surrounding code context with two innovative representations. These capture the intention in code changes and the intention in AST structural modifications pre- and post-patch. This holistic representation aptly captures a patch's underlying intentions. Patcherizer employs graph convolutional neural networks for structural intention graph representation and transformers for intention sequence representation. We evaluated the versatility of Patcherizer's embeddings in three areas: (1) Patch description generation, (2) Patch accuracy prediction, and (3) Patch intention identification. Our experiments demonstrate the representation's efficacy across all tasks, outperforming state-of-the-art methods. For example, in patch description generation, Patcherizer excels, showing an average boost of 19.39% in BLEU, 8.71% in ROUGE-L, and 34.03% in METEOR scores.",
keywords = "Computer Science - Software Engineering",
author = "Xunzhu Tang and Haoye Tian and Zhenghan Chen and Weiguo Pian and Saad Ezzini and {Kader Kabore}, Abdoul and Andrew Habib and Jacques Klein and Bissyande, {Tegawende F.}",
year = "2023",
month = aug,
day = "1",
language = "English",
type = "WorkingPaper",
}

RIS

TY - UNPB

T1 - Learning to Represent Patches

AU - Tang, Xunzhu

AU - Tian, Haoye

AU - Chen, Zhenghan

AU - Pian, Weiguo

AU - Ezzini, Saad

AU - Kader Kabore, Abdoul

AU - Habib, Andrew

AU - Klein, Jacques

AU - Bissyande, Tegawende F.

PY - 2023/8/1

Y1 - 2023/8/1

N2 - Patch representation is crucial in automating various software engineering tasks, like determining patch accuracy or summarizing code changes. While recent research has employed deep learning for patch representation, focusing on token sequences or Abstract Syntax Trees (ASTs), such approaches often miss the change's semantic intent and the context of modified lines. To bridge this gap, we introduce a novel method, Patcherizer. It delves into the intentions of context and structure, merging the surrounding code context with two innovative representations. These capture the intention in code changes and the intention in AST structural modifications pre- and post-patch. This holistic representation aptly captures a patch's underlying intentions. Patcherizer employs graph convolutional neural networks for structural intention graph representation and transformers for intention sequence representation. We evaluated the versatility of Patcherizer's embeddings in three areas: (1) Patch description generation, (2) Patch accuracy prediction, and (3) Patch intention identification. Our experiments demonstrate the representation's efficacy across all tasks, outperforming state-of-the-art methods. For example, in patch description generation, Patcherizer excels, showing an average boost of 19.39% in BLEU, 8.71% in ROUGE-L, and 34.03% in METEOR scores.

AB - Patch representation is crucial in automating various software engineering tasks, like determining patch accuracy or summarizing code changes. While recent research has employed deep learning for patch representation, focusing on token sequences or Abstract Syntax Trees (ASTs), such approaches often miss the change's semantic intent and the context of modified lines. To bridge this gap, we introduce a novel method, Patcherizer. It delves into the intentions of context and structure, merging the surrounding code context with two innovative representations. These capture the intention in code changes and the intention in AST structural modifications pre- and post-patch. This holistic representation aptly captures a patch's underlying intentions. Patcherizer employs graph convolutional neural networks for structural intention graph representation and transformers for intention sequence representation. We evaluated the versatility of Patcherizer's embeddings in three areas: (1) Patch description generation, (2) Patch accuracy prediction, and (3) Patch intention identification. Our experiments demonstrate the representation's efficacy across all tasks, outperforming state-of-the-art methods. For example, in patch description generation, Patcherizer excels, showing an average boost of 19.39% in BLEU, 8.71% in ROUGE-L, and 34.03% in METEOR scores.

KW - Computer Science - Software Engineering

M3 - Preprint

BT - Learning to Represent Patches

ER -
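The abstract describes a dual representation: a transformer branch over the patch's token sequence and a graph convolutional branch over its AST-level structural changes, fused into one patch embedding. Below is a minimal, dependency-free sketch of that general idea only, not the authors' Patcherizer implementation: the hash-based toy embeddings, mean-pooling in place of a transformer, one round of neighbour averaging in place of a GCN, and all function names (`sequence_intention`, `graph_intention`, `patch_embedding`) are illustrative assumptions.

```python
# Sketch of a dual sequence + graph patch representation (illustrative only).
import hashlib

DIM = 4  # toy embedding width

def embed_token(tok):
    """Deterministic toy embedding: hash the token into DIM floats in [0, 1)."""
    h = hashlib.sha256(tok.encode()).digest()
    return [b / 255.0 for b in h[:DIM]]

def mean_vectors(vecs):
    """Element-wise mean of a non-empty list of DIM-dimensional vectors."""
    n = len(vecs)
    return [sum(v[i] for v in vecs) / n for i in range(DIM)]

def sequence_intention(tokens):
    """Sequence branch (stand-in for the transformer): mean-pool token embeddings."""
    return mean_vectors([embed_token(t) for t in tokens])

def graph_intention(node_tokens, edges):
    """Graph branch (stand-in for the GCN): one round of neighbour averaging
    over an AST-like graph, then mean-pool node states into a graph vector."""
    h = [embed_token(t) for t in node_tokens]
    neigh = {i: [i] for i in range(len(node_tokens))}  # self-loops
    for u, v in edges:                                 # undirected edges
        neigh[u].append(v)
        neigh[v].append(u)
    h = [mean_vectors([h[j] for j in neigh[i]]) for i in range(len(node_tokens))]
    return mean_vectors(h)

def patch_embedding(diff_tokens, ast_nodes, ast_edges):
    """Fuse both intentions by concatenation into a 2*DIM patch vector."""
    return sequence_intention(diff_tokens) + graph_intention(ast_nodes, ast_edges)

# Toy patch: "- return a" replaced by "+ return a + b"
tokens = ["-", "return", "a", "+", "return", "a", "+", "b"]
nodes = ["Return", "BinOp", "a", "b"]   # tiny post-patch AST fragment
edges = [(0, 1), (1, 2), (1, 3)]
vec = patch_embedding(tokens, nodes, edges)
```

The paper's own pipeline learns these embeddings end-to-end and feeds them to task-specific heads (description generation, accuracy prediction, intention identification); the sketch only shows the fusion shape, yielding a deterministic 2*DIM vector for any patch.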