| [1] | KAMIYA T, KUSUMOTO S, INOUE K. CCFinder: a multilinguistic token-based code clone detection system for large scale source code[J]. IEEE Transactions on Software Engineering,2002,28(7): 654-670. | 
		
				| [2] | BELLON S, KOSCHKE R, ANTONIOL G, et al. Comparison and evaluation of clone detection tools[J].IEEE Transactions on Software Engineering,2007,33(9): 577-591. | 
		
				| [3] | LIU C, CHEN C, HAN J,et al. GPLAG: detection of software plagiarism by program dependence graph analysis[C]//Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.Philadelphia, PA, USA, 2006. | 
		
				| [4] | COSMA G, JOY M. Towards a definition of source-code plagiarism[J].   IEEE Transactions on Education,2008,51(2): 195-200. | 
		
				| [5] | COSMA G, JOY M. An approach to source-code plagiarism detection and investigation using latent semantic analysis[J].  IEEE Transactions on Computers,2012,61(3): 379-394. | 
		
				| [6] | MENS K, LOZANO A.  Source Code-Based Recommendation Systems: Recommendation Systems in Software Engineering[M]. Springer, 2014: 93-130. | 
		
				| [7] | MCMILLAN C, POSHYVANYK D, GRECHANIK M,et al. Portfolio: searching for relevant functions and their usages in millions of lines of code[J].  ACM Transactions on Software Engineering and Methodology,2013,22(4): 1-30. DOI:  10.1145/2522920.2522930. | 
		
				| [8] | RAGKHITWETSAGUL C, KRINKE J, CLARK D. A comparison of code similarity analysers[J].  Empirical Software Engineering,2017,23: 2464-2517. | 
		
				| [9] | ROY C K, CORDY J R. NICAD: accurate detection of near-miss intentional clones using flexible pretty-printing and code normalization[C]// Proceedings of IEEE International Conference on Program Comprehension.2008: 172-181. | 
		
				| [10] | BAXTER I D, YAHIN A, MOURA L, et al. Clone detection using abstract syntax trees[C]//Proceedings of the Conference on Reverse Engineering.Benevento, Italy, 2006: 368-377. | 
		
				| [11] | CHAE D K, HA J, KIM S W,et al. Software plagiarism detection: a graph-based approach[C]//Proceedings of the 22nd ACM International Conference on Information & Knowledge Management.ACM, 2013: 1577-1580. | 
		
				| [12] | HINDLE A, BARR E T, SU Z. On the naturalness of software[C]//2012 34th International Conference on Software Engineering (ICSE).Zurich, Switzerland, 2012: 837-847. | 
		
				| [13] | KARAIVANOV S, RAYCHEV V, VECHEV M T. Phrase-based statistical translation of programming languages[C]//Proceedings of the 2014 ACM International Symposium on New Ideas, New Paradigms, and Reflections on Programming & Software.Portland, Oregon, USA, 2014: 173-184. | 
		
				| [14] | RAYCHEV V, VECHEV M, YAHAV E. Code completion with statistical language models[C]// Proceedings of the 35th ACM Sigplan Conference on Programming Language Design and Implementation.Edinburgh, United Kingdom, 2014: 419-428. | 
		
				| [15] | NGUYEN A T, NGUYEN T T, NGUYEN T N. Divide-and-conquer approach for multi-phase statistical migration for source code(T)[C]// Proceedings of the IEEE/ACM International Conference on Automated Software Engineering.Lincoln, NE, USA, 2016: 585-596. | 
		
				| [16] | 张峰逸, 彭鑫, 陈驰, 等. 基于深度学习的代码分析研究综述[J]. 计算机应用与软件, 2018,35(6): 9-17.(ZHANG Fengyi, PENG Xin, CHEN Chi, et al. Research on code analysis based on deep learning[J].  Computer Applications and Software, 2018,35(6): 9-17.(in Chinese)) | 
		
				| [17] | 陈秋远, 李善平, 鄢萌, 等. 代码克隆检测研究进展[J]. 软件学报, 2019,30(4): 962-980.(CHEN Qiuyuan, LI Shanping, YAN Meng, et al. Code clone detection: a literature review[J].  Journal of Software,2019,30(4): 962-980.(in Chinese)) | 
		
				| [18] | TUFANO M, WATSON C, GABRIELE B, et al. Deep learning similarities from different representations of source code[C]// Proceedings of the 15th International Conference on Mining Software Repositories.New York, USA, 2018: 542-553. | 
		
				| [19] | HELLENDOORN V J, DEVANBU P. Are deep neural networks the best choice for modeling source code?[C]//Proceedings of the 11th Joint Meeting on Foundations of Software Engineering.Paderborn, Germany, 2017: 763-773. | 
		
				| [20] | HALSTEAD M H.  Elements of Software Science[M]. New York: Elsevier North-Holland, 1977. | 
		
				| [21] | KOMONDOOR R, HORWITZ S. Using slicing to identify duplication in source code[C]// Proceedings of International Symposium on Static Analysis.Berlin, Heidelberg, 2001. | 
		
				| [22] | ARROYO-FERNÁNDEZ I, MÉNDEZ-CRUZ C F, SIERRA G, et al. Unsupervised sentence representations as word information series: revisiting TF-IDF[J].  Computer Speech & Language,2019,56: 107-129. | 
		
				| [23] | 何绪飞, 艾剑良, 宋智桃. 多元数据融合在无人机结构-健康监测中的应用[J]. 应用数学和力学, 2018,39(4): 395-402.(HE Xufei, AI Jianliang, SONG Zhitao. Multi-source data fusion for health monitoring of unmanned aerial vehicle structures[J].  Applied Mathematics and Mechanics,2018,39(4): 395-402.(in Chinese)) | 
		
				| [24] | NGUYEN A T, NGUYEN T D, PHAN H D,et al. A deep neural network language model with contexts for source code[C]// Proceedings of IEEE International Conference on Software Analysis.Campobasso, Italy, 2018: 323-334. | 
		
				| [25] | OTTENSTEIN K J. An algorithmic approach to the detection and prevention of plagiarism[J].  ACM SIGCSE Bulletin,1976,8(4): 30-41. | 
		
				| [26] | WHITE M, TUFANO M, VENDOME C,et al. Deep learning code fragments for code clone detection[C]//Proceedings of the 31st IEEE/ACM International Conference on Automated Software Engineering (ASE).Singapore, 2016: 87-98. | 
		
				| [27] | LAM A N, NGUYEN A T, NGUYEN H A,et al. Combining deep learning with information retrieval to localize buggy files for bug reports[C]// Proceedings of 2015 30th IEEE/ACM International Conference on Automated Software Engineering(ASE).Lincoln, NE, USA, 2015: 476-481. | 
		
				| [28] | HUO X, THUNG F, LI M. Deep transfer bug localization[J].  IEEE Transactions on Software Engineering,2019. DOI:  10.1109/TSE.2019.2920771. | 
		
				| [29] | MOU L, LI G, JIN Z, et al.  TBCNN: a Tree-Based Convolutional Neural Network for Programming Language Processing[M]. arXiv e-print, 2014. | 
		
				| [30] | WHITE M, TUFANO M, MARTÍNEZ M,et al. Sorting and transforming program repair ingredients via deep learning code similarities[C]//Proceedings of 2019 IEEE 26th International Conference on Software Analysis, Evolution and Reengineering (SANER).Hangzhou, China, 2019: 479-490. | 
		
				| [31] | MIKOLOV T, SUTSKEVER I, CHEN K, et al. Distributed representations of words and phrases and their compositionality[J].  Advances in Neural Information Processing Systems,2013,26: 3111-3119. | 
		
				| [32] | YE X, SHEN H, MA X, et al. From word embeddings to document similarities for improved information retrieval in software engineering[C]//Proceedings of IEEE/ACM International Conference on Software Engineering.2016. | 
		
				| [33] | NGUYEN T D, NGUYEN A T, PHAN H D, et al. Exploring API embedding for API usages and applications[C]// Proceedings of IEEE/ACM International Conference on Software Engineering.Buenos Aires, Argentina, 2017: 438-449. | 
		
				| [34] | CHEN C, XING Z, WANG X. Unsupervised software-specific morphological forms inference from informal discussions[C]// Proceedings of IEEE/ACM International Conference on Software Engineering.Buenos Aires, Argentina, 2017: 450-461. | 
		
				| [35] | PENG H, MOU L, LI G, et al. Building program vector representations for deep learning[C]//Proceedings of International Conference on Knowledge Science.2015: 547-553. |