A Multiclass Classification Model Combining IndoBERT Embeddings and Long Short-Term Memory for Indonesian-Language Tweets

Published: Nov 11, 2022

Abstract:

Purpose: This research aims to improve on the performance of text classification models from previous studies by combining the pre-trained IndoBERT model with a Long Short-Term Memory (LSTM) architecture to classify Indonesian-language tweets into several categories.

Method: This research used multiclass text classification, combining the pre-trained IndoBERT model with a Long Short-Term Memory (LSTM) network. The dataset was collected by crawling the Twitter API. The resulting model was then compared with a Word2Vec-LSTM model and a fine-tuned IndoBERT model.
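
A minimal sketch of this architecture, assuming the Hugging Face Transformers and TensorFlow libraries cited in the references, is shown below. This is not the authors' published code: the checkpoint name (indolem/indobert-base-uncased, from Koto et al., 2020), the sequence length, LSTM size, dropout rate, and class count are all illustrative assumptions.

import tensorflow as tf
from transformers import AutoTokenizer, TFAutoModel

MODEL_NAME = "indolem/indobert-base-uncased"  # IndoBERT checkpoint (Koto et al., 2020)
NUM_CLASSES = 5    # assumption: number of tweet categories in the dataset
MAX_LEN = 128      # assumption: maximum tweet length in tokens

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
bert = TFAutoModel.from_pretrained(MODEL_NAME)
bert.trainable = False  # use IndoBERT as a fixed contextual-embedding layer

input_ids = tf.keras.Input(shape=(MAX_LEN,), dtype=tf.int32, name="input_ids")
attention_mask = tf.keras.Input(shape=(MAX_LEN,), dtype=tf.int32, name="attention_mask")

# IndoBERT yields one 768-dimensional contextual embedding per token.
embeddings = bert(input_ids, attention_mask=attention_mask).last_hidden_state
x = tf.keras.layers.LSTM(128, return_sequences=True)(embeddings)  # size is an assumption
x = tf.keras.layers.GlobalAveragePooling1D()(x)                   # the "average pooling" scenario
x = tf.keras.layers.Dropout(0.2)(x)                               # rate is an assumption
outputs = tf.keras.layers.Dense(NUM_CLASSES, activation="softmax")(x)

model = tf.keras.Model(inputs=[input_ids, attention_mask], outputs=outputs)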

Result: The IndoBERT-LSTM model with the best hyperparameter combination (batch size of 16, learning rate of 2e-5, and average pooling) achieved an F1-score of 98.90% on the unmodified dataset (a 0.70% increase over the Word2Vec-LSTM model and 0.40% over the fine-tuned IndoBERT model) and 92.83% on the modified dataset (a 4.51% increase over the Word2Vec-LSTM model and 0.69% over the fine-tuned IndoBERT model). However, the improvement over the fine-tuned IndoBERT model is not significant, and the Word2Vec-LSTM model has a much faster total training time.
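
Continuing the sketch above, the reported best configuration maps onto a straightforward Keras training setup along these lines. Only the batch size (16) and learning rate (2e-5) come from the paper; the loss choice, epoch count, and the commented data-preparation lines are assumptions.

model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=2e-5),  # best learning rate reported
    loss="sparse_categorical_crossentropy",                  # multiclass loss for integer labels
    metrics=["accuracy"],
)

# Hypothetical usage: `tweets` is a list of preprocessed tweet strings and
# `labels` an integer class array; neither is specified in the abstract.
# enc = tokenizer(tweets, padding="max_length", truncation=True,
#                 max_length=MAX_LEN, return_tensors="tf")
# model.fit([enc["input_ids"], enc["attention_mask"]], labels,
#           batch_size=16, epochs=3, validation_split=0.1)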

Keywords:
1. Text Classification
2. Indonesian Tweets
3. IndoBERT
4. Long Short-Term Memory
Authors:
1. Thariq Iskandar Zulkarnain Maulana Putra
2. Suprapto Suprapto
3. Arif Farhan Bukhori
How to Cite
Putra, T. I. Z. M., Suprapto, S., & Bukhori, A. F. (2022). Model Klasifikasi Berbasis Multiclass Classification dengan Kombinasi Indobert Embedding dan Long Short-Term Memory untuk Tweet Berbahasa Indonesia. Jurnal Ilmu Siber Dan Teknologi Digital, 1(1), 1–28. https://doi.org/10.35912/jisted.v1i1.1509

References

    Alammar, J. (2018a, June 27). The Illustrated Transformer – Jay Alammar – Visualizing machine learning one concept at a time. https://jalammar.github.io/illustrated-transformer/

    Alammar, J. (2018b, December 3). The Illustrated BERT, ELMo, and co. (How NLP Cracked Transfer Learning) – Jay Alammar – Visualizing machine learning one concept at a time. http://jalammar.github.io/illustrated-bert/

    Alwehaibi, A., Bikdash, M., Albogmi, M., & Roy, K. (2021). A study of the performance of embedding methods for Arabic short-text sentiment analysis using deep learning approaches. Journal of King Saud University-Computer and Information Sciences.

    Aydoğan, M., & Karci, A. (2020). Improving the accuracy using pre-trained word embeddings on deep neural networks for Turkish text classification. Physica A: Statistical Mechanics and Its Applications, 541, 123288. https://doi.org/10.1016/j.physa.2019.123288

    Ayo, F. E., Folorunso, O., Ibharalu, F. T., & Osinuga, I. A. (2020). Machine learning techniques for hate speech classification of twitter data: State-of-The-Art, future challenges and research directions. Computer Science Review, 38, 100311. https://doi.org/10.1016/j.cosrev.2020.100311

    Brownlee, J. (2021, January 18). How to Choose an Activation Function for Deep Learning. https://machinelearningmastery.com/choose-an-activation-function-for-deep-learning/

    Cai, R., Qin, B., Chen, Y., Zhang, L., Yang, R., Chen, S., & Wang, W. (2020). Sentiment analysis about investors and consumers in energy market based on BERT-BILSTM. IEEE Access, 8, 171408–171415. https://doi.org/10.1109/ACCESS.2020.3024750

    Chauhan, N. S. (2021, August 2). Loss Functions in Neural Networks. https://www.theaidream.com/post/loss-functions-in-neural-networks

    Chaumond, J., Delangue, C., & Wolf, T. (2016). huggingface (Hugging Face). https://huggingface.co/huggingface

    Cournapeau, D. (2007). scikit-learn: machine learning in Python—scikit-learn 1.1.1 documentation. https://scikit-learn.org/stable/#

    Devlin, J., Chang, M.-W., Lee, K., & Toutanova, K. (2018). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. http://arxiv.org/abs/1810.04805

    Digmi, I. (2018, January 25). Memahami Epoch Batch Size Dan Iteration - JournalToday. https://imam.digmi.id/post/memahami-epoch-batch-size-dan-iteration/

    Géron, A. (2017). Hands-on machine learning with Scikit-Learn and TensorFlow: concepts, tools, and techniques to build intelligent systems. O’Reilly Media, Inc.

    Google Brain Team. (2015, November 9). TensorFlow. https://www.tensorflow.org/

    Goyal, A., Gupta, V., & Kumar, M. (2021). A deep learning-based bilingual Hindi and Punjabi named entity recognition system using enhanced word embeddings. Knowledge-Based Systems, 107601. https://doi.org/10.1016/j.knosys.2021.107601

    Gupta, V., & Lehal, G. S. (2009). A Survey of Text Mining Techniques and Applications. Journal of Emerging Technologies in Web Intelligence, 1(1), 60–76.

    Hilmiaji, N., Lhaksmana, K. M., & Purbolaksono, M. D. (2021). Identifying Emotion on Indonesian Tweets using Convolutional Neural Networks. Jurnal RESTI (Rekayasa Sistem Dan Teknologi Informasi), 5(3), 584–593. https://doi.org/10.29207/RESTI.V5I3.3137

    Hochreiter, S., & Schmidhuber, J. (1997). Long Short-Term Memory. Neural Computation, 9(8), 1735–1780. https://doi.org/10.1162/NECO.1997.9.8.1735

    Keras Team. (2015, March 27). Dropout layer. https://keras.io/api/layers/regularization_layers/dropout/

    Koto, F., Rahimi, A., Lau, J. H., & Baldwin, T. (2020). IndoLEM and IndoBERT: A Benchmark Dataset and Pre-trained Language Model for Indonesian NLP. Proceedings of the 28th International Conference on Computational Linguistics, 757–770. https://doi.org/10.18653/v1/2020.coling-main.66

    Kowsari, K., Meimandi, K. J., Heidarysafa, M., Mendu, S., Barnes, L., & Brown, D. (2019). Text Classification Algorithms: A Survey. Information, 10(4), 150. https://doi.org/10.3390/INFO10040150

    Mikolov, T., Chen, K., Corrado, G., & Dean, J. (2013). Efficient Estimation of Word Representations in Vector Space. 1st International Conference on Learning Representations, ICLR 2013 - Workshop Track Proceedings. https://arxiv.org/abs/1301.3781v3

    Muhammad, P. F., Kusumaningrum, R., & Wibowo, A. (2021). Sentiment Analysis Using Word2vec And Long Short-Term Memory (LSTM) For Indonesian Hotel Reviews. Procedia Computer Science, 179, 728–735. https://doi.org/10.1016/J.PROCS.2021.01.061

    Nguyen, Q. T., Nguyen, T. L., Luong, N. H., & Ngo, Q. H. (2020). Fine-Tuning BERT for Sentiment Analysis of Vietnamese Reviews. Proceedings – 2020 7th NAFOSTED Conference on Information and Computer Science, NICS 2020, 302–307. https://doi.org/10.1109/NICS51282.2020.9335899

    Pahwa, B., Kasliwal, N., & Taruna, S. (2018). Sentiment Analysis – Strategy for Text Pre-Processing. International Journal of Computer Applications, 180(34). https://doi.org/10.5120/ijca2018916865

    Putra, J. W. G. (2020). Pengenalan Pembelajaran Mesin dan Deep Learning.

    Rahman, D. (2019). deryrahman/word2vec-bahasa-indonesia: Word2Vec untuk bahasa Indonesia dari korpus Wikipedia. https://github.com/deryrahman/word2vec-bahasa-indonesia

    Ramadhan, N. G. (2021). Indonesian Online News Topics Classification using Word2Vec and K-Nearest Neighbor. Jurnal RESTI (Rekayasa Sistem Dan Teknologi Informasi), 5(6), 1083–1089. https://doi.org/10.29207/RESTI.V5I6.3547

    Rao, A., & Spasojevic, N. (2016). Actionable and Political Text Classification using Word Embeddings and LSTM. https://arxiv.org/abs/1607.02501v2

    Robbani, H. A. (2018, September 24). GitHub - har07/PySastrawi: Indonesian stemmer. Python port of PHP Sastrawi project. PySastrawi. https://github.com/har07/PySastrawi

    Sharma, A. K., Chaurasia, S., & Srivastava, D. K. (2020). Sentimental Short Sentences Classification by Using CNN Deep Learning Model with Fine Tuned Word2Vec. Procedia Computer Science, 167, 1139–1147. https://doi.org/10.1016/J.PROCS.2020.03.416

    Sun, Z., Zemel, R., & Xu, Y. (2021). A computational framework for slang generation. Transactions of the Association for Computational Linguistics, 9, 462–478. https://doi.org/10.1162/TACL_A_00378/1921784/TACL_A_00378.PDF

    Sutanto, T. (2020). nlptm-01. Tau-Data Indonesia. https://tau-data.id/d/nlptm-01.html

    Uysal, A. K., & Gunal, S. (2014). The impact of preprocessing on text classification. Information Processing & Management, 50(1), 104–112. https://doi.org/10.1016/J.IPM.2013.08.006

    Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł., & Polosukhin, I. (2017). Attention is All you Need. In I. Guyon, U. von Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, & R. Garnett (Eds.), Advances in Neural Information Processing Systems (Vol. 30). Curran Associates, Inc. https://proceedings.neurips.cc/paper/2017/file/3f5ee243547dee91fbd053c1c4a845aa-Paper.pdf

    Wang, Z., Huang, Z., & Gao, J. (2020). Chinese Text Classification Method Based on BERT Word Embedding. ACM International Conference Proceeding Series, 66–71. https://doi.org/10.1145/3395260.3395273

    Wu, Y., Schuster, M., Chen, Z., Le, Q. V., Norouzi, M., Macherey, W., Krikun, M., Cao, Y., Gao, Q., Macherey, K., Klingner, J., Shah, A., Johnson, M., Liu, X., Kaiser, Ł., Gouws, S., Kato, Y., Kudo, T., Kazawa, H., … Dean, J. (2016). Google’s Neural Machine Translation System: Bridging the Gap between Human and Machine Translation. https://arxiv.org/abs/1609.08144v2
