Date Received: 24-02-2022 / Date Accepted: 20-12-2022
Today, most mobile device users are regularly bothered by large numbers of scam and advertising messages in fields such as entertainment, shopping, finance, and real estate. A single SMS message can belong to one or more of these message types at the same time, so single-label classification methods are inappropriate for this task. In this study, we summarize multi-label classification techniques, collect a dataset of 2,000 Vietnamese SMS messages (SMSVN), and improve the accuracy of multi-label classification methods by applying preprocessing techniques to normalize and clean the data. We also test several well-known classifiers on this dataset. The results show that, after preprocessing, most of the multi-label classification techniques achieved higher accuracy and lower classification error. The Classifier Chains technique with a Naïve Bayes model proved suitable for the Vietnamese SMS classification problem.
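To make the Classifier Chains idea concrete, the following is a minimal sketch, not the implementation used in this study. It assumes a small multinomial Naïve Bayes with Laplace smoothing as the base model, an arbitrary label order, and a handful of hypothetical English toy messages that are not taken from SMSVN. The names `BinaryNB`, `ClassifierChain`, and the `__label__` token prefix are illustrative only. Each link in the chain is a binary classifier for one label, and it sees the labels decided earlier in the chain as extra features.

```python
import math
from collections import Counter


class BinaryNB:
    """Multinomial Naive Bayes for one yes/no label, with Laplace smoothing."""

    def fit(self, docs, y):
        self.vocab = {t for d in docs for t in d}
        self.prior, self.counts, self.totals = {}, {}, {}
        for c in (0, 1):
            n_c = sum(1 for v in y if v == c)
            # Smoothed class prior, so an empty class never gives log(0).
            self.prior[c] = math.log((n_c + 1) / (len(y) + 2))
            self.counts[c] = Counter(
                t for d, v in zip(docs, y) if v == c for t in d
            )
            self.totals[c] = sum(self.counts[c].values())
        return self

    def predict(self, doc):
        v = len(self.vocab)
        scores = {}
        for c in (0, 1):
            s = self.prior[c]
            for t in doc:
                if t in self.vocab:  # tokens unseen in training are ignored
                    s += math.log((self.counts[c][t] + 1) / (self.totals[c] + v))
            scores[c] = s
        return 1 if scores[1] > scores[0] else 0


class ClassifierChain:
    """One binary model per label; earlier labels become extra features."""

    def __init__(self, labels):
        self.labels = list(labels)  # chain order (an assumption of this sketch)

    def fit(self, docs, label_sets):
        self.models = []
        for i, label in enumerate(self.labels):
            # Training uses the true values of the labels earlier in the chain.
            augmented = [
                d + [f"__label__{p}" for p in self.labels[:i] if p in ys]
                for d, ys in zip(docs, label_sets)
            ]
            target = [1 if label in ys else 0 for ys in label_sets]
            self.models.append(BinaryNB().fit(augmented, target))
        return self

    def predict(self, doc):
        predicted = []
        for label, model in zip(self.labels, self.models):
            # Prediction uses the labels predicted so far in the chain.
            if model.predict(doc + [f"__label__{p}" for p in predicted]):
                predicted.append(label)
        return set(predicted)


# Hypothetical toy data (English, invented for illustration; not SMSVN).
train = [
    ("win free prize now", {"scam"}),
    ("sale discount shopping today", {"advertising", "shopping"}),
    ("loan low interest offer", {"advertising", "finance"}),
    ("free prize claim money", {"scam"}),
    ("shopping sale big discount", {"advertising", "shopping"}),
    ("bank loan offer today", {"advertising", "finance"}),
]
docs = [text.split() for text, _ in train]
label_sets = [labels for _, labels in train]

chain = ClassifierChain(["advertising", "scam", "shopping", "finance"])
chain.fit(docs, label_sets)

print(sorted(chain.predict("big sale discount".split())))  # ['advertising', 'shopping']
print(sorted(chain.predict("free prize money".split())))   # ['scam']
```

The first toy message receives two labels at once, which a single-label classifier could not express. In practice the chain order and the preprocessing applied before tokenization would both affect the result.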