Impact of Data Quality Types on Computational Time in Data Source Selection Using Ant Colony Optimization


Nor Amalina Mohd Sabri
Abd Samad Hasan Basari
Nurul Akmar Emran

Abstract

Data quality varies dramatically from source to source, even within the same domain. As a result, data source selection has emerged as a crucial step in information integration, one that demands efficient and scalable approaches capable of handling massive data volumes while ensuring the quality of results. When the Ant Colony Optimization (ACO) algorithm is adapted to the data source selection problem, its computational time may be inconsistent if the data sources vary in quality, which can make selecting the required data sources time-consuming. How much computational time is needed depends on the quality type of the data. Hence, this article examines the impact of the quality type of data on computational time in solving the data source selection problem. The methodology consists of five steps: (1) collect the data set, (2) import the data sources into the data source selection model, (3) implement the ACO algorithm, (4) obtain the computational time, and (5) compare the results. The experiment shows that the low-quality data set incurs a higher computational time than the high-quality data set, which achieves the minimum computational time and is 3.38% faster. These results show that the quality type of data affects the computational time of the ACO algorithm, and they highlight the contribution of a high-quality data set to minimizing computational time in the selection process. Validating the relationship between quality type and computational time clarifies the importance of selecting good-quality data to save computational time.
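
The five steps in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the selection model, the ACO parameters, or the formula behind the reported percentage. The function names (`aco_select`, `timed_selection`, `percent_faster`), the parameter defaults, the use of a single quality score per source, and the definition of "percent faster" as the time reduction relative to the slower run are all assumptions made for illustration.

```python
import random
import time


def aco_select(qualities, k, n_ants=10, n_iters=50, alpha=1.0, beta=2.0,
               rho=0.1, seed=0):
    """Pick k source indices with high total quality using a basic ACO loop.

    Hypothetical sketch: `qualities` holds one score in [0, 1] per source.
    """
    rng = random.Random(seed)
    n = len(qualities)
    pheromone = [1.0] * n
    best_subset, best_score = None, float("-inf")

    for _iteration in range(n_iters):
        for _ant in range(n_ants):
            candidates = list(range(n))
            subset = []
            for _pick in range(k):
                # Attractiveness combines pheromone with the quality heuristic.
                weights = [
                    (pheromone[i] ** alpha) * ((qualities[i] + 1e-9) ** beta)
                    for i in candidates
                ]
                chosen = rng.choices(candidates, weights=weights, k=1)[0]
                subset.append(chosen)
                candidates.remove(chosen)
            score = sum(qualities[i] for i in subset)
            if score > best_score:
                best_subset, best_score = sorted(subset), score
        # Evaporate all trails, then reinforce the best subset found so far.
        pheromone = [(1.0 - rho) * p for p in pheromone]
        for i in best_subset:
            pheromone[i] += best_score / k

    return best_subset, best_score


def timed_selection(qualities, k, **kwargs):
    """Run the selection and return (subset, score, elapsed_seconds)."""
    start = time.perf_counter()
    subset, score = aco_select(qualities, k, **kwargs)
    elapsed = time.perf_counter() - start
    return subset, score, elapsed


def percent_faster(slow_time, fast_time):
    """Time reduction as a percentage of the slower run (assumed definition)."""
    return (slow_time - fast_time) / slow_time * 100.0
```

Calling `timed_selection` once on a high-quality set of scores and once on a low-quality set, then passing the two elapsed times to `percent_faster`, mirrors steps four and five. With a fixed iteration count, this sketch is not expected to reproduce the reported 3.38% difference.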

Article Details

How to Cite
Mohd Sabri, N. A., Hasan Basari, A. S., & Emran, N. A. (2025). Impact of Data Quality Types on Computational Time in Data Source Selection Using Ant Colony Optimization. Journal of Informatics and Web Engineering, 4(3), 408–415. https://doi.org/10.33093/jiwe.2025.4.3.24
Section
Thematic (AI-Enhanced Computing and Digital Transformation)

