SciELO - Scientific Electronic Library Online

 

Revista Virtual de la Sociedad Paraguaya de Medicina Interna

On-line version ISSN 2312-3893

Abstract

GUEVARA TIRADO, Alberto. Application of supervised learning algorithms to identify sociodemographic factors associated with depressive symptoms in Peruvian adults: CHAID model, 2022. Rev. virtual Soc. Parag. Med. Int. [online]. 2024, vol.11, n.1, e11122412.  Epub May 01, 2024. ISSN 2312-3893.  https://doi.org/10.18004/rvspmi/2312-3893/2024.e11122412.

Introduction:

Depressive symptoms are highly prevalent in the Peruvian population. A decision tree algorithm could help identify groups that are especially vulnerable to depressive symptoms.

Objective:

To identify the groups especially vulnerable to depressive symptoms according to sociodemographic factors, using a machine learning decision tree algorithm.

Material and methods:

An observational, descriptive, retrospective, cross-sectional design was applied. Data came from the National Demographic and Health Survey. The study population comprised 32,062 adults. The dependent variable was the presence of depressive symptoms; the explanatory variables were age group, mother tongue, ethnic group, educational level, age of onset of alcohol consumption, alcohol consumption, marital status and sex. A decision tree was built with the Chi-squared Automatic Interaction Detection (CHAID) algorithm.
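The core step of a CHAID tree, choosing the predictor whose chi-squared test of independence against the outcome gives the smallest p-value, can be sketched as follows. This is a minimal illustration, not the study's implementation: the counts in `TOY_COUNTS`, the field names and the helper functions are hypothetical, and the category merging and Bonferroni adjustment that full CHAID applies are omitted.

```python
import math
from collections import Counter


def chi2_sf(x, dof):
    """p-value (upper tail) of the chi-squared distribution, standard library only."""
    z = x / 2.0
    if dof % 2 == 0:
        # Even dof: exp(-z) * sum_{k < dof/2} z^k / k!
        return math.exp(-z) * sum(z ** k / math.factorial(k) for k in range(dof // 2))
    # Odd dof: erfc(sqrt(z)) + exp(-z) * sum_{j=1..(dof-1)/2} z^(j-1/2) / Gamma(j+1/2)
    tail = sum(z ** (j - 0.5) / math.gamma(j + 0.5) for j in range(1, (dof - 1) // 2 + 1))
    return math.erfc(math.sqrt(z)) + math.exp(-z) * tail


def chi2_independence(xs, ys):
    """Pearson chi-squared test of independence between two categorical lists."""
    n = len(xs)
    joint, rows, cols = Counter(zip(xs, ys)), Counter(xs), Counter(ys)
    dof = (len(rows) - 1) * (len(cols) - 1)
    if dof == 0:
        # A predictor or outcome with a single category cannot be tested.
        return 0.0, 0, 1.0
    stat = 0.0
    for r in rows:
        for c in cols:
            expected = rows[r] * cols[c] / n
            stat += (joint[(r, c)] - expected) ** 2 / expected
    return stat, dof, chi2_sf(stat, dof)


def best_split(records, predictors, target):
    """Return (predictor, statistic, dof, p) for the predictor with the smallest p-value."""
    ys = [rec[target] for rec in records]
    results = []
    for name in predictors:
        stat, dof, p = chi2_independence([rec[name] for rec in records], ys)
        results.append((name, stat, dof, p))
    return min(results, key=lambda item: item[3])


# Hypothetical toy counts (sex, education, symptoms, n); NOT data from the survey.
TOY_COUNTS = [
    ("female", "secondary_or_less", "yes", 30),
    ("female", "secondary_or_less", "no", 20),
    ("female", "higher", "yes", 20),
    ("female", "higher", "no", 30),
    ("male", "secondary_or_less", "yes", 15),
    ("male", "secondary_or_less", "no", 35),
    ("male", "higher", "yes", 10),
    ("male", "higher", "no", 40),
]
records = [
    {"sex": s, "education": e, "symptoms": y}
    for s, e, y, n in TOY_COUNTS
    for _ in range(n)
]

winner = best_split(records, ["sex", "education"], "symptoms")
print("split on:", winner[0], "chi2 = %.3f, dof = %d, p = %.5f" % winner[1:])
```

In a full tree the same selection would be repeated inside each resulting subgroup until no predictor reaches the chosen significance level, which is how numbered nodes such as those reported in an analysis of this kind arise.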

Results:

The variables retained as significant by the algorithm were sex, type of mother tongue, marital status, age group and educational level attained; the model correctly classified 75.80% of the cases of depressive symptoms. In the overall tree, the nodes mainly associated with the presence of depressive symptoms were node 2 (female sex), node 6 (adults aged 39 years or older) and node 13 (education up to secondary school). In the tree for women, the nodes mainly associated with depressive symptoms were node 2 (adults aged 39 years or older), node 5 (education up to secondary school) and node 13 (native mother tongue). In the tree for men, they were node 2 (native mother tongue), node 6 (adults aged 39 years or older) and node 11 (education up to secondary school).

Conclusions:

The main sociodemographic group associated with depressive symptoms was women aged 39 years or older whose education did not go beyond secondary school. Machine learning algorithms are useful for creating screening tools for populations vulnerable to depressive symptoms.

Keywords: machine learning; depression; epidemiology; mental health; Peru.
