Model Klasifikasi Berbasis Multiclass Classification dengan Kombinasi Indobert Embedding dan Long Short-Term Memory untuk Tweet Berbahasa Indonesia

Published: Nov 11, 2022

Abstract:

Purpose: This research aims to improve on the performance of text classification models from previous studies by combining the pre-trained IndoBERT model with a Long Short-Term Memory (LSTM) architecture to classify Indonesian-language tweets into several categories.

Method: This research used a multiclass text classification model that combines pre-trained IndoBERT embeddings with a Long Short-Term Memory (LSTM) network. The dataset was collected by crawling the Twitter API. The proposed model was then compared with a Word2Vec-LSTM model and a fine-tuned IndoBERT model.
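The architecture described above can be sketched in Keras as follows. This is a minimal illustration, not the authors' code: the indolem/indobert-base-uncased checkpoint, 128 LSTM units, the dropout rate, the maximum sequence length, and the number of classes are all assumptions, since the abstract does not specify them.

    # Minimal sketch of an IndoBERT-embedding + LSTM classifier (not the
    # authors' exact code). Checkpoint name, LSTM size, dropout rate, and
    # number of classes are assumptions for illustration only.
    import tensorflow as tf
    from transformers import AutoTokenizer, TFAutoModel

    CHECKPOINT = "indolem/indobert-base-uncased"  # assumed IndoBERT checkpoint
    MAX_LEN = 128        # assumed maximum tweet length in tokens
    NUM_CLASSES = 5      # assumed number of tweet categories

    tokenizer = AutoTokenizer.from_pretrained(CHECKPOINT)
    # Add from_pt=True here if the checkpoint only publishes PyTorch weights.
    bert = TFAutoModel.from_pretrained(CHECKPOINT)

    input_ids = tf.keras.Input(shape=(MAX_LEN,), dtype=tf.int32, name="input_ids")
    attention_mask = tf.keras.Input(shape=(MAX_LEN,), dtype=tf.int32, name="attention_mask")

    # IndoBERT yields one contextual embedding per token; the whole sequence
    # (not just the [CLS] vector) is fed into the LSTM layer.
    embeddings = bert(input_ids, attention_mask=attention_mask).last_hidden_state

    x = tf.keras.layers.LSTM(128)(embeddings)
    x = tf.keras.layers.Dropout(0.2)(x)
    outputs = tf.keras.layers.Dense(NUM_CLASSES, activation="softmax")(x)

    model = tf.keras.Model([input_ids, attention_mask], outputs)
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=2e-5),  # lr reported in the abstract
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )

    # Example: tokenizing a tweet into the two model inputs.
    enc = tokenizer(
        ["contoh tweet berbahasa Indonesia"],
        padding="max_length", truncation=True, max_length=MAX_LEN,
        return_tensors="tf",
    )
    probs = model([enc["input_ids"], enc["attention_mask"]])

Feeding the full token-embedding sequence into the LSTM, rather than fine-tuning IndoBERT's own classifier head, is what distinguishes this setup from the fine-tuned IndoBERT baseline it is compared against.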

Result: The IndoBERT-LSTM model with the best hyperparameter combination (batch size of 16, learning rate of 2e-5, and average pooling) achieved an F1-score of 98.90% on the unmodified dataset (a 0.70 percentage-point increase over the Word2Vec-LSTM model and 0.40 over the fine-tuned IndoBERT model) and 92.83% on the modified dataset (a 4.51 percentage-point increase over the Word2Vec-LSTM model and 0.69 over the fine-tuned IndoBERT model). However, the improvement over the fine-tuned IndoBERT model is not substantial, and the Word2Vec-LSTM model has a much faster total training time.
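For concreteness, below is one plausible reading of the best scenario's "average pooling" together with the F1 evaluation: the LSTM returns its full output sequence, which is averaged over time before the softmax layer, and the F1-score is computed with scikit-learn. The weighted F1 average, the 768-dimensional embedding size, and the 128 LSTM units are assumptions; the abstract does not state them.

    # Hedged sketch of the reported best scenario. "Average pooling" is read
    # here as averaging the LSTM's per-token outputs before classification;
    # the weighted F1 average is likewise an assumption.
    import numpy as np
    import tensorflow as tf
    from sklearn.metrics import f1_score

    def build_head(num_classes: int) -> tf.keras.Model:
        """LSTM + average-pooling head over IndoBERT token embeddings (768-dim)."""
        token_embeddings = tf.keras.Input(shape=(None, 768), name="token_embeddings")
        # return_sequences=True keeps one output per token so they can be pooled.
        x = tf.keras.layers.LSTM(128, return_sequences=True)(token_embeddings)
        # The "average pooling" step (masking of padded positions omitted for brevity).
        x = tf.keras.layers.GlobalAveragePooling1D()(x)
        outputs = tf.keras.layers.Dense(num_classes, activation="softmax")(x)
        model = tf.keras.Model(token_embeddings, outputs)
        model.compile(
            optimizer=tf.keras.optimizers.Adam(learning_rate=2e-5),  # from the abstract
            loss="sparse_categorical_crossentropy",
        )
        return model

    def f1_percent(model: tf.keras.Model, inputs, labels) -> float:
        """Weighted F1-score in percent, matching how the abstract reports it."""
        preds = np.argmax(model.predict(inputs), axis=-1)
        return 100.0 * f1_score(labels, preds, average="weighted")

    # Training would use the reported best batch size:
    # model.fit(train_embeddings, train_labels, batch_size=16, epochs=...)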

Keywords:
1. Text Classification
2. Indonesian Tweets
3. IndoBERT
4. Long Short-Term Memory
Authors:
1. Thariq Iskandar Zulkarnain Maulana Putra
2. Suprapto Suprapto
3. Arif Farhan Bukhori
How to Cite
Putra, T. I. Z. M., Suprapto, S., & Bukhori, A. F. (2022). Model Klasifikasi Berbasis Multiclass Classification dengan Kombinasi Indobert Embedding dan Long Short-Term Memory untuk Tweet Berbahasa Indonesia. Jurnal Ilmu Siber Dan Teknologi Digital, 1(1), 1–28. https://doi.org/10.35912/jisted.v1i1.1509
