Date Received: 25-11-2024
Date Accepted: 31-07-2025
Date Published: 31-07-2025
DOI: https://doi.org/10.31817/tckhnnvn.2025.23.7.06
Comparison of some Feature Extraction and Classification Methods for Bee Sound Data
Keywords
MFCC, Chroma, Wavelet, Random Forest, XGBoost, Beehive sound recognition
Abstract
In this study, we tested and evaluated several feature extraction methods for bee sound data. Features were extracted from the raw data using the MFCC, Chroma, and Wavelet techniques. The important features were then selected using Random Forest (RF), Extra Trees, and XGBoost, and passed to machine learning algorithms such as SVM, Random Forest, and XGBoost to solve the bee recognition problem. The experimental results show that, with MFCC features, the Random Forest and XGBoost models recognize bee sounds with an accuracy of over 99.9%.
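The pipeline described in the abstract (feature extraction, tree-based feature selection, then classification) can be illustrated with a minimal sketch. It is not the authors' implementation: the clips are synthetic stand-ins for hive recordings, the MFCC computation is a simplified NumPy version, and the helper names (mel_filterbank, mfcc_mean, make_clip) and all parameter values are assumptions chosen only for illustration. Random Forest from scikit-learn is used for both selection and classification; Extra Trees, XGBoost, or SVM could be substituted at the same points.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.model_selection import train_test_split


def mel_filterbank(n_filters, n_fft, sr):
    """Triangular mel-spaced filters, shape (n_filters, n_fft // 2 + 1)."""
    def hz_to_mel(f):
        return 2595.0 * np.log10(1.0 + f / 700.0)

    def mel_to_hz(m):
        return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sr).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):      # rising edge of the triangle
            fbank[i - 1, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):     # falling edge of the triangle
            fbank[i - 1, k] = (right - k) / max(right - center, 1)
    return fbank


def mfcc_mean(signal, sr, n_fft=512, hop=256, n_filters=20, n_coeffs=13):
    """Simplified MFCCs averaged over frames; returns n_coeffs values."""
    n_frames = 1 + (len(signal) - n_fft) // hop
    idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(n_fft)
    power = np.abs(np.fft.rfft(frames, n=n_fft, axis=1)) ** 2
    log_mel = np.log(power @ mel_filterbank(n_filters, n_fft, sr).T + 1e-10)
    # DCT-II basis applied to the log mel energies
    n = np.arange(n_filters)
    basis = np.cos(np.pi * np.outer(np.arange(n_coeffs), 2 * n + 1) / (2 * n_filters))
    return (log_mel @ basis.T).mean(axis=0)


def make_clip(is_bee, rng, sr=8000, seconds=0.5):
    """Synthetic clip (assumption): noise, plus a low hum for the 'bee' class."""
    t = np.arange(int(sr * seconds)) / sr
    noise = 0.5 * rng.standard_normal(t.size)
    if not is_bee:
        return noise
    f0 = rng.uniform(200.0, 300.0)
    hum = np.sin(2 * np.pi * f0 * t) + 0.5 * np.sin(2 * np.pi * 2 * f0 * t)
    return hum + noise


rng = np.random.default_rng(0)
labels = np.array([i % 2 for i in range(200)])
X = np.array([mfcc_mean(make_clip(bool(y), rng), sr=8000) for y in labels])

X_train, X_test, y_train, y_test = train_test_split(
    X, labels, test_size=0.25, random_state=0, stratify=labels
)

# Step 1: keep the features whose Random Forest importance is at least the median.
selector = SelectFromModel(
    RandomForestClassifier(n_estimators=100, random_state=0), threshold="median"
)
selector.fit(X_train, y_train)
n_selected = int(selector.get_support().sum())

# Step 2: train the classifier on the selected features only.
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(selector.transform(X_train), y_train)
accuracy = clf.score(selector.transform(X_test), y_test)

print(f"selected {n_selected} of {X.shape[1]} MFCC features, test accuracy {accuracy:.3f}")
```

The accuracy printed here reflects only how easy the synthetic task is and says nothing about the results reported in the study. Fitting the selector on the training split alone keeps the held-out clips from influencing which features are kept.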