Performance Comparison of BERT, ALBERT and RoBERTa for Sentiment Analysis in Critical Pilot Communication Prior to Aviation Accident.
Keywords:
Sentiment Analysis, RoBERTa, Natural Language Processing
Abstract
Purpose: This research aims to analyze the emotional states of pilots during critical situations before aviation accidents by applying sentiment analysis to pilot communication. It addresses the challenge of identifying emotional cues, stress levels, and urgency in pilot dialogues, crucial for aviation safety. The study evaluates and compares the performance of transformer models—BERT, ALBERT, and RoBERTa—in analyzing these emotional factors, contributing to enhanced situational awareness and safety.
Methods: The primary data source was Cockpit Voice Recorder (CVR) data, which captures real-time pilot communication. BERT, ALBERT, and RoBERTa were employed for sentiment analysis and trained on domain-specific data to detect emotional distress and stress in critical situations. Model performance was evaluated using precision, recall, and F1 score, reported alongside per-class support, to assess how accurately each model identified emotional cues and classified different emotional states.
Results: All models performed well with an accuracy of around 80%. While BERT excelled in detecting negative stressed sentiment, it struggled with neutral sentiment. RoBERTa outperformed both BERT and ALBERT by 5-10% in identifying negative stressed conversations, with higher precision and recall.
Conclusions: RoBERTa is the most effective model for sentiment analysis of pilot conversations, particularly in detecting stress and urgency, crucial for aviation safety. It was more stable and better at handling emotional variations. Further improvements in fine-tuning or exploring data augmentation could enhance its accuracy.
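As an illustration of the evaluation metrics named under Methods, the following is a minimal sketch of how per-class precision, recall, F1 score, and support can be computed from gold and predicted labels. It is not the authors' evaluation code: the label strings, the function names `per_class_report` and `accuracy`, and the toy data are all hypothetical and chosen only for demonstration.

```python
# Minimal sketch of per-class precision, recall, F1 and support.
# Assumption: label names and toy data below are hypothetical,
# not taken from the study's dataset.

NEG = "negative_stressed"
NEU = "neutral"


def per_class_report(y_true, y_pred, labels):
    """Return {label: {"precision", "recall", "f1", "support"}}."""
    report = {}
    pairs = list(zip(y_true, y_pred))
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        # Guard against division by zero when a class is never predicted
        # or never present in the gold labels.
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[label] = {
            "precision": precision,
            "recall": recall,
            "f1": f1,
            "support": tp + fn,  # number of gold examples of this class
        }
    return report


def accuracy(y_true, y_pred):
    """Fraction of examples whose predicted label equals the gold label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true) if y_true else 0.0


# Toy example: six utterances with gold and predicted labels.
y_true = [NEG, NEG, NEG, NEG, NEU, NEU]
y_pred = [NEG, NEG, NEG, NEU, NEG, NEU]

for label, scores in per_class_report(y_true, y_pred, [NEG, NEU]).items():
    print(label, scores)
print("accuracy", accuracy(y_true, y_pred))
```

Reporting support next to each score shows how many gold examples stand behind it, which matters when one class (for example, neutral utterances) is much rarer than another.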
License
Copyright (c) 2025 Prof. Dr. Yus M. Cholily, M.Si, Very Sugiarto, S.Pd., M.Kom, Alvionitha Sari Agstriningtyas, M.Tr.T (Author)

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution 4.0 International License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.


