Machine Learning in the Analysis of Social Problems: The Case of Global Human Trafficking
The British University in Dubai (BUiD)
Machine learning has been key to significant discoveries in information technology across myriad disciplines, yet it has received a mixed reception in the social sciences. This study applies machine learning methods to a real dataset on human trafficking, a serious contemporary social problem. The experiments use the Counter-Trafficking Data Collaborative (CTDC) dataset, an initiative of the International Organization for Migration (IOM). Exploration of the dataset revealed that 61% of the data were missing, which further motivated the use of multiple imputation using chained equations (MICE) rather than single imputation or deletion. Agglomerative hierarchical clustering with Gower's distance was used to discover patterns in the categorical data and was compared with fuzzy k-modes clustering. The results show that MICE was effective to a degree in handling the missing data, and that agglomerative hierarchical clustering identified distinct, describable clusters in each of the three time periods into which the imputed dataset was segmented.
human trafficking, machine learning, pattern mining, multiple imputation using chained equations (MICE), agglomerative hierarchical clustering
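As an illustration of the clustering approach named in the abstract, the following is a minimal sketch and not the study's implementation. It assumes fully observed (already imputed), purely categorical records, in which case Gower's distance reduces to the fraction of attributes on which two records differ. The records, the attribute values, the helper name `gower_categorical`, the average-linkage choice, and the two-cluster cut are all hypothetical choices made for the example; none of them come from the CTDC dataset or from the paper.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform


def gower_categorical(records):
    """Pairwise Gower distance for complete, purely categorical records.

    With only categorical attributes and no missing values, the distance
    between two records is the share of attributes on which they differ.
    (Hypothetical helper written for this sketch.)
    """
    n = len(records)
    p = len(records[0])
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            mismatches = sum(a != b for a, b in zip(records[i], records[j]))
            dist[i, j] = dist[j, i] = mismatches / p
    return dist


# Toy records (gender, age group, exploitation type); invented for illustration.
records = [
    ("female", "adult", "sexual"),
    ("female", "adult", "sexual"),
    ("female", "minor", "sexual"),
    ("male", "adult", "labour"),
    ("male", "adult", "labour"),
    ("male", "minor", "labour"),
]

D = gower_categorical(records)

# Agglomerative clustering on the precomputed distances.
# Average linkage is an assumption; the abstract does not name a linkage.
Z = linkage(squareform(D, checks=False), method="average")

# Cut the dendrogram into two clusters (an arbitrary choice for the toy data).
labels = fcluster(Z, t=2, criterion="maxclust")
print(labels)
```

On this toy input the first three records fall into one cluster and the last three into the other, since records within each group differ on at most one attribute while records across groups differ on at least two.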