Implementation of search robot's function to collect information in scientometric systems
DOI:
https://doi.org/10.36097/rsan.v1i32.1002
Keywords:
search robot, spider, crawler, bot, parser, robot, Hirsch index, Scopus, Python
Abstract
The World Wide Web is developing rapidly, and the problem of automatically collecting and analyzing information published on various web resources grows more urgent every day. In the 1990s, the World Wide Web was a huge body of poorly structured information that was difficult for a person to search. It was then that the first automated agents began to appear, easing the task of finding the necessary information on the web. The core of such systems is a search robot: a software package that navigates web resources and collects information for a database. At Kazan (Volga Region) Federal University (KFU), a monthly rating of academic staff is compiled from data that employees enter in their personal accounts in the Electronic University system. There is now a need to move away from having KFU staff fill in their Hirsch index manually, both to avoid incorrect data entry and to spare the Prospective Development Center from validating the entered information. This required a search robot that automatically collects the Hirsch indices of KFU employees from the Scopus system. This article discusses the search robot: what it is, how it works, and how to write a program that collects information. It considers the possible types of search robots and the whole process of their work, as well as the Scopus scientometric system and the Hirsch index as a scientometric indicator, including its purpose and calculation. The implementation uses the Python programming language, and the article reviews tools for making HTTP requests and processing HTML pages.
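As a brief illustration of the Hirsch index calculation mentioned in the abstract, the sketch below applies the standard definition: an author has index h if h of their papers have at least h citations each. The function name `h_index` and the sample citation counts are assumptions made for this example and are not taken from the article's implementation.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    # Rank papers from most cited to least cited.
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank  # the paper at this rank still has enough citations
        else:
            break  # every later paper has even fewer citations
    return h


# Hypothetical citation counts for one author (illustrative data only).
sample = [10, 8, 5, 4, 3]
print(h_index(sample))  # prints 4
```

In the setting described above, the index would be read from Scopus rather than recomputed, so a function like this could at most serve as a sanity check on collected values.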