Please use this identifier to cite or link to this item: https://dspace.uzhnu.edu.ua/jspui/handle/lib/42600
Title: Development of the combined method of identification of near duplicates in electronic scientific works
Authors: Лізунов, Петро Петрович
Білощицький, Андрій Олександрович
Кучанський, Олександр Юрійович
Андрашко, Юрій Васильович
Білощицька, Світлана
Сербін, Олег
Keywords: near-duplicate, electronic scientific paper, antiplagiarism system, locally sensitive hashing
Issue date: 2021
Publisher: Eastern-European Journal of Enterprise Technologies
Citation: Lizunov P., Biloshchytskyi A., Kuchansky A., Andrashko Y., Biloshchytska S., Serbin O. Development of the combined method of identification of near duplicates in electronic scientific works. Eastern-European Journal of Enterprise Technologies. 2021. Vol. 4/4 (112). P. 57–63. DOI: https://doi.org/10.15587/1729-4061.2021.238318
Abstract: The methods for identification of near-duplicates in electronic scientific papers, which include the content of the same type, for example, text data, mathematical formulas, numerical data, etc., were described. For text data, the method of locally sensitive hashing with the finding of the Hamming distance between the elements of indices of electronic scientific papers was formalized. If the Hamming distance exceeds a fixed numerical threshold, a scientific paper contains a near-duplicate. For numerical data, sub-sequences for each scientific work are formed and the proximity between the papers is determined as the Euclidean distance between the vectors consisting of the numbers of these sub-sequences. To compare mathematical formulas, the method for comparing the sample of formulas is used and the names of variables are compared. To identify near-duplicates in graphic information, two directions are separated: finding key points in the image and applying locally sensitive hashing for individual pixels of the image. Since scientific papers often include such objects as schemes and diagrams, captions to them are examined separately using the methods for comparing text information. The combined method for identification of near-duplicates in electronic scientific papers, which combines the methods for identification of near-duplicates of various types of data, was proposed. To implement the combined method for the identification of near-duplicates in electronic scientific papers, an information-analytical system that processes scientific materials depending on the content type was devised. This makes it possible to qualitatively identify near-duplicates and as widely as possible identify possible abuses and plagiarism in electronic scientific papers: scientific articles, dissertations, monographs, conference materials, etc.
Type: Text
Publication type: Article
URI: https://dspace.uzhnu.edu.ua/jspui/handle/lib/42600
ISSN: 1729-3774
Appears in collections: Наукові публікації кафедри cистемного аналізу та теорії оптимізації
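
The abstract names two distance computations: a Hamming distance between hash-based index elements for text, and a Euclidean distance between vectors of numbers for numerical data. The block below is a minimal sketch of both, not the authors' implementation. It assumes SimHash over word shingles as one common locality-sensitive scheme (the abstract does not say which scheme is used); the function names `simhash`, `hamming` and `euclidean`, the shingle size, the 64-bit fingerprint width and the use of MD5 as the underlying hash are all illustrative assumptions.

```python
import hashlib
import math


def simhash(text: str, bits: int = 64, shingle: int = 3) -> int:
    """Return a SimHash fingerprint of `text` (assumed scheme, not from the paper).

    `bits` must be a multiple of 8 and at most 128, because the per-shingle
    hash is taken from the first bits // 8 bytes of an MD5 digest.
    """
    words = text.lower().split()
    # Word n-grams; a text shorter than one shingle yields a single gram.
    grams = [" ".join(words[i:i + shingle])
             for i in range(max(1, len(words) - shingle + 1))]
    weights = [0] * bits
    for gram in grams:
        digest = hashlib.md5(gram.encode("utf-8")).digest()
        h = int.from_bytes(digest[:bits // 8], "big")
        # Each shingle votes +1 or -1 on every bit position.
        for b in range(bits):
            weights[b] += 1 if (h >> b) & 1 else -1
    # A fingerprint bit is set where the positive votes win.
    return sum(1 << b for b in range(bits) if weights[b] > 0)


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


def euclidean(u, v) -> float:
    """Euclidean distance between two equal-length numeric vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


if __name__ == "__main__":
    a = "the combined method identifies near duplicates in scientific papers"
    b = "the combined method identifies near duplicates in scientific articles"
    print("Hamming distance:", hamming(simhash(a), simhash(b)))
    print("Euclidean distance:", euclidean([1.0, 2.5, 4.0], [1.0, 2.0, 4.5]))
```

The sketch only computes the distances. How each distance is compared against a fixed threshold, and how sub-sequences of numbers are extracted from a paper, follow the definitions in the article itself and are left to the caller here.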

Files in this item:
File: 238318-Article Text-549440-2-10-20210901.pdf
Size: 184.97 kB
Format: Adobe PDF


All items in the electronic archive are protected by copyright, with all rights reserved.