Measuring classifier performance using polynomial synthetic attributes and attribute selection with MrMR

Authors

  • Author

Keywords:

Classifier, MrMR, Performance metrics, Synthetic attributes

Abstract

Complexities in the data, such as class imbalance, outliers, class overlap, and high dimensionality, impair the performance of predictive models. One way to address them is to create synthetic attributes and add them to the data. This article compares the behavior of six classifiers, measured by the F1-score, when polynomial synthetic attributes are added. The experiments aim to verify whether adding synthetic attributes yields better performance than using the original attributes alone.
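
As a rough illustration of the two ingredients named in the abstract, the sketch below builds polynomial synthetic attributes from one data row and computes the F1-score for a set of predictions. It is not the authors' implementation: the function names, the default degree of 2, and the binary positive-class convention are assumptions made for this example only.

```python
from itertools import combinations_with_replacement
from math import prod


def polynomial_attributes(row, degree=2):
    """Return the original attributes followed by every monomial of
    degree 2..degree built from them (hypothetical helper)."""
    expanded = list(row)
    for d in range(2, degree + 1):
        for combo in combinations_with_replacement(row, d):
            expanded.append(prod(combo))
    return expanded


def f1_score(y_true, y_pred, positive=1):
    """F1 = 2 * precision * recall / (precision + recall),
    computed for the class labelled `positive` (assumed binary setting)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# A row with attributes (2, 3) gains the synthetic attributes 2*2, 2*3, 3*3.
print(polynomial_attributes([2, 3]))
# One true positive, one false positive, one false negative.
print(f1_score([1, 1, 0, 0], [1, 0, 1, 0]))
```

With these inputs the expanded row is [2, 3, 4, 6, 9] and the F1-score is 0.5. A comparison like the one described would train each classifier once on the original rows and once on the expanded rows, then compare the two F1-scores.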

Published

2024-10-06