Classification of Natural Disaster Reports from Social Media using K-Means SMOTE and Multinomial Naïve Bayes

  • Nor Indrani, Lambung Mangkurat University
  • Mohammad Reza Faisal, Universitas Lambung Mangkurat
  • Irwan Budiman, Lambung Mangkurat University
  • Dwi Kartini, Lambung Mangkurat University
  • Friska Abadi, Lambung Mangkurat University
  • Septyan Eka Prastya, Teknologi Informasi Department, Universitas Sari Mulia
  • Mera Kartika Delimayanti, Department of Computer and Informatics Engineering, Politeknik Negeri Jakarta

Disasters can occur anytime and anywhere. Floods and forest fires are two types of disaster that occur in Indonesia, and South Kalimantan Province experiences both frequently. The flood and forest fire datasets used in previous research are imbalanced, and imbalanced data can make it harder for a classification method to perform well. Oversampling is a data-level sampling approach to the imbalance problem; SMOTE is one oversampling technique, and K-Means SMOTE is a modification of SMOTE. Multinomial Naïve Bayes is a Naïve Bayes model that is often used in text classification and performs well on that task. On the flood disaster data, K-Means SMOTE with Multinomial Naïve Bayes yielded an F1 score of 66.04%; on the forest fire disaster data, the same combination produced an F1 score of 66.31%.
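
The pipeline described in the abstract (oversample the minority class, train Multinomial Naïve Bayes, report an F1 score) can be illustrated with a minimal, standard-library sketch. This is not the authors' implementation. The oversampler below keeps only the SMOTE-style interpolation between two minority samples; it picks partners at random and omits both the nearest-neighbour search of SMOTE and the clustering step of K-Means SMOTE. All function names, the four-term vocabulary, and the toy count vectors are hypothetical and chosen only for illustration.

```python
import math
import random
from collections import Counter


def oversample_minority(X, y, minority, rng):
    """Balance a two-class set with SMOTE-style synthetic minority rows.

    Simplification (assumption): partners are random minority rows, not
    nearest neighbours, and no K-Means clustering is performed.
    Requires at least two minority rows.
    """
    minority_rows = [x for x, label in zip(X, y) if label == minority]
    n_missing = (len(X) - len(minority_rows)) - len(minority_rows)
    X_new, y_new = list(X), list(y)
    for _ in range(max(0, n_missing)):
        a, b = rng.sample(minority_rows, 2)
        gap = rng.random()  # position along the segment from a to b
        X_new.append([ai + gap * (bi - ai) for ai, bi in zip(a, b)])
        y_new.append(minority)
    return X_new, y_new


def train_mnb(X, y, alpha=1.0):
    """Fit Multinomial Naive Bayes with additive (Laplace) smoothing."""
    n_features = len(X[0])
    class_counts = Counter(y)
    model = {}
    for c in class_counts:
        totals = [0.0] * n_features
        for x, label in zip(X, y):
            if label == c:
                totals = [t + v for t, v in zip(totals, x)]
        denom = sum(totals) + alpha * n_features
        log_prior = math.log(class_counts[c] / len(y))
        log_likelihood = [math.log((t + alpha) / denom) for t in totals]
        model[c] = (log_prior, log_likelihood)
    return model


def predict_mnb(model, x):
    """Return the class with the highest log posterior for count vector x."""
    scores = {
        c: log_prior + sum(v * ll for v, ll in zip(x, log_likelihood))
        for c, (log_prior, log_likelihood) in model.items()
    }
    return max(scores, key=scores.get)


def f1_score(y_true, y_pred, positive):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical term counts over the vocabulary [flood, water, traffic, music];
# label 1 marks a disaster report (minority), label 0 anything else.
X = [[2, 1, 0, 0], [1, 2, 0, 0],
     [0, 0, 2, 1], [0, 0, 1, 2], [0, 1, 1, 1], [0, 0, 2, 2]]
y = [1, 1, 0, 0, 0, 0]

X_bal, y_bal = oversample_minority(X, y, minority=1, rng=random.Random(0))
model = train_mnb(X_bal, y_bal)
predictions = [predict_mnb(model, x) for x in X]
print("F1 on the toy data:", f1_score(y, predictions, positive=1))
```

The toy score is computed on the same rows used for training, so it only shows how the pieces connect; it says nothing about the 66.04% and 66.31% results reported above. In a real evaluation the oversampling would be applied to the training split only, so that synthetic rows never leak into the test data.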

Intelligent System and Computer Vision