Performance Comparison of BERT, ALBERT and RoBERTa for Sentiment Analysis in Critical Pilot Communication Prior to Aviation Accident.

Authors

  • Yus M. Cholily Universitas Muhammadiyah Malang Author
  • Very Sugiarto State Polytechnic of Malang Author
  • Alvionitha Sari Agstriningtyas State Polytechnic of Malang Author https://orcid.org/0009-0006-7388-1293

Keywords:

Sentiment Analysis, RoBERTa, Natural Language Processing

Abstract

Purpose: This research analyzes the emotional states of pilots in critical situations preceding aviation accidents by applying sentiment analysis to pilot communication. It addresses the challenge of identifying emotional cues, stress levels, and urgency in pilot dialogue, which is crucial for aviation safety. The study evaluates and compares three transformer models (BERT, ALBERT, and RoBERTa) on this task, with the aim of improving situational awareness and safety.

Methods: The primary data source was the Cockpit Voice Recorder (CVR), which captures real-time pilot communication. BERT, ALBERT, and RoBERTa were trained on domain-specific data to detect emotional distress and stress in critical situations. Performance was evaluated with per-class precision, recall, and F1 score, reported together with support (the number of samples in each class), to assess how well each model identifies emotional cues and classifies different emotional states.
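To make the evaluation metrics concrete, the following is a minimal sketch of how per-class precision, recall, F1 score, and support can be computed from true and predicted labels. It is not the authors' evaluation code: the function name `per_class_report`, the label names `stressed` and `neutral`, and the toy label lists are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def per_class_report(y_true, y_pred):
    """Return {label: (precision, recall, f1, support)} for every label seen."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1      # correct prediction for this class
        else:
            fp[pred] += 1       # predicted class was wrong
            fn[truth] += 1      # true class was missed
    report = {}
    for label in labels:
        predicted = tp[label] + fp[label]   # times the model predicted this label
        support = tp[label] + fn[label]     # true samples of this label
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / support if support else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[label] = (precision, recall, f1, support)
    return report


# Hypothetical toy data: 4 "stressed" and 4 "neutral" utterances.
y_true = ["stressed"] * 4 + ["neutral"] * 4
y_pred = ["stressed", "stressed", "stressed", "neutral",
          "neutral", "neutral", "stressed", "stressed"]

report = per_class_report(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

for label, (precision, recall, f1, support) in report.items():
    print(f"{label:10s} P={precision:.2f} R={recall:.2f} F1={f1:.2f} n={support}")
print(f"accuracy={accuracy:.3f}")
```

Reporting these values per class, rather than accuracy alone, shows whether a model does well on one sentiment class while struggling on another.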

Results: All three models performed well, with accuracy of around 80%. BERT detected negative (stressed) sentiment well but struggled with neutral sentiment. RoBERTa outperformed both BERT and ALBERT by 5-10% in identifying negative (stressed) conversations, with higher precision and recall.

Conclusions: RoBERTa is the most effective model for sentiment analysis of pilot conversations, particularly in detecting stress and urgency, which are crucial for aviation safety. It was also more stable and handled emotional variation better than the other models. Further fine-tuning or data augmentation could improve its accuracy.

Published

2025-05-23

How to Cite

Performance Comparison of BERT, ALBERT and RoBERTa for Sentiment Analysis in Critical Pilot Communication Prior to Aviation Accident. (2025). International Journal of Artificial Intelligence and Information Technology (IJAIIT), 1(1), 19-32. https://publish.umam.edu.my/index.php/ijaiit/article/view/57