Measuring classifier performance using polynomial synthetic attributes and attribute selection with MrMR
Keywords:
Classifier, MrMR, Performance metrics, Synthetic attributes
Abstract
Complexities in the data impair the performance of predictive models. Among the most common are class imbalance, outliers, class overlap, and high dimensionality. One way to mitigate their effect is to create synthetic attributes and add them to the data. This article compares the behavior, measured with the F1-score, of six classifiers when polynomial synthetic attributes are added. The experiments aim to verify whether the synthetic attributes yield better performance than the original attributes alone.
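As an illustration of the idea in the abstract, the following minimal sketch expands each record with polynomial synthetic attributes and compares the F1-score of one classifier before and after the expansion. The abstract does not state the polynomial degree, the six classifiers, or the datasets, so everything here is an assumption: degree 2, a nearest-centroid classifier as a stand-in, a hand-built toy dataset, and the hypothetical helper names `polynomial_attributes`, `f1_score`, and `nearest_centroid_predict`. MrMR selection is not covered.

```python
from itertools import combinations_with_replacement


def polynomial_attributes(row, degree=2):
    """Return the original attributes followed by all monomials of degree 2..degree."""
    expanded = list(row)
    for d in range(2, degree + 1):
        for idx in combinations_with_replacement(range(len(row)), d):
            term = 1.0
            for i in idx:
                term *= row[i]
            expanded.append(term)
    return expanded


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def nearest_centroid_predict(X_train, y_train, X_test):
    """Stand-in classifier (an assumption): assign the label of the closest class mean."""
    labels = sorted(set(y_train))
    centroids = {}
    for label in labels:
        rows = [x for x, y in zip(X_train, y_train) if y == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]

    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    # Ties are resolved in favor of the smallest label.
    return [min(labels, key=lambda lab: sq_dist(x, centroids[lab])) for x in X_test]


# Toy data (an assumption): class 1 when x1 * x2 > 0, so both class means coincide.
X = [[1, 1], [-1, -1], [2, 2], [-2, -2], [1, -1], [-1, 1], [2, -2], [-2, 2]]
y = [1, 1, 1, 1, 0, 0, 0, 0]

# Scores are computed on the training data only to keep the sketch short;
# a real comparison would use held-out data or cross-validation.
f1_original = f1_score(y, nearest_centroid_predict(X, y, X))

X_poly = [polynomial_attributes(row) for row in X]
f1_expanded = f1_score(y, nearest_centroid_predict(X_poly, y, X_poly))

print("F1 with original attributes:", f1_original)
print("F1 with polynomial attributes:", f1_expanded)
```

On this hand-built data the product term x1 * x2 separates the classes, which is why the expanded representation helps. Whether real datasets benefit in the same way is the question the experiments address, and this sketch makes no claim about their results.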