Herald of Advanced Information Technology
OPTIMIZATION OF ANALYSIS AND MINIMIZATION OF INFORMATION LOSSES IN TEXT MINING
Information is one of the most important resources in today's business environment. It is difficult for any company to succeed without sufficient information about its customers, employees and other key stakeholders. Every day, companies receive structured and unstructured text from a variety of sources, such as survey results, tweets, call center notes, online customer reviews, recorded interactions, emails and other documents. These sources provide raw text that is difficult to interpret without the right text analysis tool. Text analytics can be done manually, but the manual process is inefficient. Traditional systems rely on keywords and cannot read and understand the language in emails, tweets, web pages and text documents. For this reason, companies use text analysis software to analyze large amounts of text data. The software helps users retrieve textual information and act on it. Manual annotation is currently the most common approach, which can be attributed to the high quality of the annotation and its “meaningfulness”. The typical disadvantages of manual annotation and manual text analysis systems are high material costs and inherently low speed. Therefore, this article explores methods for effectively annotating reviews of various products from the largest marketplace in Ukraine. The following tasks are to be solved: analyze modern approaches to data analysis and processing; study the basic algorithms for data analysis and processing; build a program that collects the data, designing its architecture for efficient use on the basis of the latest technologies; clean the data using techniques that minimize information loss; analyze the collected data using data analysis and processing approaches; and draw conclusions from the results of all of the above. There are many varieties of these tasks, as well as methods for solving them.
This again confirms the importance and relevance of the chosen topic. The purpose of the study is to identify the methods and means by which information losses can be minimized when analyzing and processing textual data. The object of the study is the process of minimizing information losses in the analysis and processing of textual data. In the course of the study, recent research on the analysis and processing of textual information was reviewed, and methods of textual information processing and Data Mining algorithms were analyzed.
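The cleaning and annotation tasks listed in the abstract can be illustrated with a minimal sketch. It is not the authors' program: the stop-word list, the function names (`clean`, `split_sentences`, `annotate`) and the choice of a TF-IDF sentence score are all assumptions made for illustration. The idea is that cleaning removes only a few function words and keeps negations and numbers, so that little information is lost, and the annotation is then formed from the highest-scoring sentences of a review.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately small stop list. Negations such as "not" are
# kept, because dropping them would change the meaning of a review.
STOP_WORDS = {"the", "a", "an", "is", "it", "and", "of", "to", "this"}


def clean(text):
    """Lowercase and tokenize, keeping digits and apostrophes."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def split_sentences(text):
    """Split on whitespace that follows '.', '!' or '?'."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def annotate(review, n=1):
    """Return the n highest-scoring sentences, in their original order."""
    sentences = split_sentences(review)
    tokenized = [clean(s) for s in sentences]
    total = len(sentences)
    # Document frequency: number of sentences in which each term occurs.
    df = Counter(term for tokens in tokenized for term in set(tokens))

    def score(tokens):
        # Length-normalized TF-IDF: rarer terms contribute more.
        if not tokens:
            return 0.0
        tf = Counter(tokens)
        return sum(
            (count / len(tokens)) * (math.log(total / df[term]) + 1.0)
            for term, count in tf.items()
        )

    ranked = sorted(range(total), key=lambda i: score(tokenized[i]), reverse=True)
    keep = sorted(ranked[:n])
    return [sentences[i] for i in keep]


if __name__ == "__main__":
    review = (
        "The phone is good. "
        "The battery is not good and drains in 3 hours. "
        "It is a phone."
    )
    print(annotate(review, n=2))
```

A production pipeline would more likely use an established tokenizer and a trained summarization model; the sketch only shows how conservative cleaning and extractive scoring fit together.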
Anna S. Kolomiiets, Cand. of Economic Sciences (email@example.com)
Olha O. Mezentseva, Cand. of Economic Sciences (firstname.lastname@example.org)
text analysis; annotation; text mining; software; algorithm; text data; natural language
Received after revision 16.02.2020
Vol. 3 № 1, 2020
[ © Odessa National Polytechnic University, 2018.]