[1] |
KAMIYA T, KUSUMOTO S, INOUE K. CCFinder: a multilinguistic token-based code clone detection system for large scale source code[J]. IEEE Transactions on Software Engineering,2002,28(7): 654-670.
|
[2] |
BELLON S, KOSCHKE R, ANTONIOL G, et al. Comparison and evaluation of clone detection tools[J].IEEE Transactions on Software Engineering,2007,33(9): 577-591.
|
[3] |
LIU C, CHEN C, HAN J, et al. GPLAG: detection of software plagiarism by program dependence graph analysis[C]//Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.Philadelphia, PA, USA, 2006.
|
[4] |
COSMA G, JOY M. Towards a definition of source-code plagiarism[J]. IEEE Transactions on Education,2008,51(2): 195-200.
|
[5] |
COSMA G, JOY M. An approach to source-code plagiarism detection and investigation using latent semantic analysis[J]. IEEE Transactions on Computers,2012,61(3): 379-394.
|
[6] |
MENS K, LOZANO A. Source Code-Based Recommendation Systems: Recommendation Systems in Software Engineering[M]. Springer, 2014: 93-130.
|
[7] |
MCMILLAN C, POSHYVANYK D, GRECHANIK M, et al. Portfolio: searching for relevant functions and their usages in millions of lines of code[J]. ACM Transactions on Software Engineering and Methodology,2013,22(4): 1-30. DOI: 10.1145/2522920.2522930.
|
[8] |
RAGKHITWETSAGUL C, KRINKE J, CLARK D. A comparison of code similarity analysers[J]. Empirical Software Engineering,2017,23: 2464-2517.
|
[9] |
ROY C K, CORDY J R. NICAD: accurate detection of near-miss intentional clones using flexible pretty-printing and code normalization[C]// Proceedings of IEEE International Conference on Program Comprehension.2008: 172-181.
|
[10] |
BAXTER I D, YAHIN A, MOURA L, et al. Clone detection using abstract syntax trees[C]//Proceedings of the Conference on Reverse Engineering.Benevento, Italy, 2006: 368-377.
|
[11] |
CHAE D K, HA J, KIM S W, et al. Software plagiarism detection: a graph-based approach[C]//Proceedings of the 22nd ACM International Conference on Information & Knowledge Management.ACM, 2013: 1577-1580.
|
[12] |
HINDLE A, BARR E T, SU Z. On the naturalness of software[C]//2012 34th International Conference on Software Engineering (ICSE).Zurich, Switzerland, 2012: 837-847.
|
[13] |
KARAIVANOV S, RAYCHEV V, VECHEV M T. Phrase-based statistical translation of programming languages[C]//Proceedings of the 2014 ACM International Symposium on New Ideas, New Paradigms, and Reflections on Programming & Software.Portland, Oregon, USA, 2014: 173-184.
|
[14] |
RAYCHEV V, VECHEV M, YAHAV E. Code completion with statistical language models[C]// Proceedings of the 35th ACM Sigplan Conference on Programming Language Design and Implementation.Edinburgh, United Kingdom, 2014: 419-428.
|
[15] |
NGUYEN A T, NGUYEN T T, NGUYEN T N. Divide-and-conquer approach for multi-phase statistical migration for source code(T)[C]// Proceedings of the IEEE/ACM International Conference on Automated Software Engineering.Lincoln, NE, USA, 2016: 585-596.
|
[16] |
张峰逸, 彭鑫, 陈驰, 等. 基于深度学习的代码分析研究综述[J]. 计算机应用与软件, 2018,35(6): 9-17.(ZHANG Fengyi, PENG Xin, CHEN Chi, et al. Research on code analysis based on deep learning[J]. Computer Applications and Software, 2018,35(6): 9-17.(in Chinese))
|
[17] |
陈秋远, 李善平, 鄢萌, 等. 代码克隆检测研究进展[J]. 软件学报, 2019,30(4): 962-980.(CHEN Qiuyuan, LI Shanping, YAN Meng, et al. Code clone detection: a literature review[J]. Journal of Software,2019,30(4): 962-980.(in Chinese))
|
[18] |
TUFANO M, WATSON C, BAVOTA G, et al. Deep learning similarities from different representations of source code[C]// Proceedings of the 15th International Conference on Mining Software Repositories.New York, USA, 2018: 542-553.
|
[19] |
HELLENDOORN V J, DEVANBU P. Are deep neural networks the best choice for modeling source code?[C]//Proceedings of the 11th Joint Meeting on Foundations of Software Engineering.Paderborn, Germany, 2017: 763-773.
|
[20] |
HALSTEAD M H. Elements of Software Science[M]. New York: Elsevier North-Holland, 1977.
|
[21] |
KOMONDOOR R, HORWITZ S. Using slicing to identify duplication in source code[C]// Proceedings of International Symposium on Static Analysis.Berlin, Heidelberg, 2001.
|
[22] |
ARROYO-FERNÁNDEZ I, MÉNDEZ-CRUZ C F, SIERRA G, et al. Unsupervised sentence representations as word information series: revisiting TF-IDF[J]. Computer Speech & Language,2019,56: 107-129.
|
[23] |
何绪飞, 艾剑良, 宋智桃. 多元数据融合在无人机结构-健康监测中的应用[J]. 应用数学和力学, 2018,39(4): 395-402.(HE Xufei, AI Jianliang, SONG Zhitao. Multi-source data fusion for health monitoring of unmanned aerial vehicle structures[J]. Applied Mathematics and Mechanics,2018,39(4): 395-402.(in Chinese))
|
[24] |
NGUYEN A T, NGUYEN T D, PHAN H D, et al. A deep neural network language model with contexts for source code[C]// Proceedings of IEEE International Conference on Software Analysis.Campobasso, Italy, 2018: 323-334.
|
[25] |
OTTENSTEIN K J. An algorithmic approach to the detection and prevention of plagiarism[J]. ACM SIGCSE Bulletin,1976,8(4): 30-41.
|
[26] |
WHITE M, TUFANO M, VENDOME C, et al. Deep learning code fragments for code clone detection[C]//Proceedings of the 31st IEEE/ACM International Conference on Automated Software Engineering (ASE).Singapore, 2016: 87-98.
|
[27] |
LAM A N, NGUYEN A T, NGUYEN H A, et al. Combining deep learning with information retrieval to localize buggy files for bug reports[C]// Proceedings of 2015 30th IEEE/ACM International Conference on Automated Software Engineering (ASE).Lincoln, NE, USA, 2015: 476-481.
|
[28] |
HUO X, THUNG F, LI M. Deep transfer bug localization[J]. IEEE Transactions on Software Engineering,2019. DOI: 10.1109/TSE.2019.2920771.
|
[29] |
MOU L, LI G, JIN Z, et al. TBCNN: a tree-based convolutional neural network for programming language processing[M]. arXiv e-print, 2014.
|
[30] |
WHITE M, TUFANO M, MARTÍNEZ M, et al. Sorting and transforming program repair ingredients via deep learning code similarities[C]//Proceedings of 2019 IEEE 26th International Conference on Software Analysis, Evolution and Reengineering (SANER).Hangzhou, China, 2019: 479-490.
|
[31] |
MIKOLOV T, SUTSKEVER I, CHEN K, et al. Distributed representations of words and phrases and their compositionality[J]. Advances in Neural Information Processing Systems,2013,26: 3111-3119.
|
[32] |
YE X, SHEN H, MA X, et al. From word embeddings to document similarities for improved information retrieval in software engineering[C]//Proceedings of IEEE/ACM International Conference on Software Engineering.2016.
|
[33] |
NGUYEN T D, NGUYEN A T, PHAN H D, et al. Exploring API embedding for API usages and applications[C]// Proceedings of IEEE/ACM International Conference on Software Engineering.Buenos Aires, Argentina, 2017: 438-449.
|
[34] |
CHEN C, XING Z, WANG X. Unsupervised software-specific morphological forms inference from informal discussions[C]// Proceedings of IEEE/ACM International Conference on Software Engineering.Buenos Aires, Argentina, 2017: 450-461.
|
[35] |
PENG H, MOU L, LI G, et al. Building program vector representations for deep learning[C]//Proceedings of International Conference on Knowledge Science.2015: 547-553.
|