Term Weighting Based Indexing Class and Indexing Short Document for Indonesian Thesis Title Classification

  • Ana Tsalitsatun Ni'mah Universitas Trunojoyo Madura
  • Fahmi Syuhada Universitas Qamarul Huda Badaruddin Bagu, West Nusa Tenggara
DOI: https://doi.org/10.29303/jcosine.v6i2.471

Abstract

Document classification has become a relatively easy task because recent methods achieve strong results. Document classification with the TF-IDF-ICF term weighting method has been widely studied, but that research generally uses long documents. When TF-IDF term weighting is applied to a short text document such as a thesis title, the classification results are not optimal. This is because IDF assigns a low weight to words that appear in many documents, and ICF assigns a low weight to words that appear frequently in a class, whereas such words should carry a high weight because they form the core of a short text document. This study therefore proposes a term weighting method based on class indexing and short document indexing, namely TF-IDF-ICF-IDSF. The method is evaluated by comparing two classifiers, Naïve Bayes and SVM. The dataset consists of thesis titles written by Informatics Education students at Trunojoyo Madura University. The test results show that classification with TF-IDF-ICF-IDSF term weighting outperforms the other term weighting methods, achieving 91% Precision, 93% Recall, 86% F1-Score, and 84% Accuracy with SVM.
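To make the baseline weighting concrete, the following is a minimal Python sketch of TF-IDF-ICF only. It assumes one commonly used formulation, weight = tf × (1 + log(N/df)) × (1 + log(C/cf)), which may differ from the exact formulas used in the paper. The IDSF factor is deliberately left out because the abstract does not define it. The function name `tf_idf_icf`, the sample titles, and the class labels are hypothetical and serve only as illustration.

```python
import math
from collections import Counter


def tf_idf_icf(docs, labels):
    """Return one {term: weight} dict per document.

    docs   : list of token lists (one list per short document)
    labels : class label of each document, same order as docs
    Assumed formula (not taken from the paper):
        tf * (1 + log(N / df)) * (1 + log(C / cf))
    """
    n_docs = len(docs)
    classes = set(labels)
    n_classes = len(classes)

    df = Counter()                          # documents containing each term
    class_terms = {c: set() for c in classes}
    for tokens, label in zip(docs, labels):
        unique = set(tokens)
        df.update(unique)
        class_terms[label].update(unique)

    cf = Counter()                          # classes containing each term
    for terms in class_terms.values():
        cf.update(terms)

    weights = []
    for tokens in docs:
        tf = Counter(tokens)                # raw term frequency
        weights.append({
            t: tf[t]
            * (1 + math.log(n_docs / df[t]))
            * (1 + math.log(n_classes / cf[t]))
            for t in tf
        })
    return weights


# Hypothetical tokenized thesis titles and class labels.
titles = [
    ["sistem", "klasifikasi", "judul"],
    ["klasifikasi", "teks", "svm"],
    ["media", "pembelajaran", "sistem"],
]
labels = ["ai", "ai", "edu"]
weights = tf_idf_icf(titles, labels)
```

In this toy data, "sistem" occurs in both classes, so its ICF factor is 1 and it receives a lower weight than "klasifikasi", which occurs in only one class. The resulting dictionaries could then be turned into feature vectors for a classifier such as Naïve Bayes or SVM.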

Published
2022-12-21
Section
Intelligent System and Computer Vision