DATA ANALYSIS AND MACHINE LEARNING APPLICATIONS IN ENVIRONMENTAL MANAGEMENT
Keywords:
Air Pollution Epidemiology, Data Mining, Machine Learning, Predictive Modelling
Abstract
The rapid expansion of data on air contaminants and climate change, particularly as they relate to public health, presents both opportunities and challenges for traditional epidemiological methods. This study addresses these challenges by exploring advanced data collection, pattern identification, and predictive modeling techniques in air pollution research. The focus is on leveraging data mining and computational methods to improve understanding of air pollution's impact on public health, specifically ozone exposure. A comprehensive review of the scientific literature was conducted, using databases such as Professor, Scholar, Embl, and Nih to identify relevant studies on air pollution epidemiology. The review highlights how data mining, machine learning, and spatiotemporal modeling are being integrated to improve the detection, analysis, and forecasting of air pollution-related health issues. The findings reveal a growing trend in applying data mining techniques within air pollution epidemiology. Advanced methods, such as spatiotemporal analysis and geographic data mining, enable more precise tracking and forecasting of pollution-related health risks. Continuing advances in artificial intelligence, together with more sophisticated sensors and data storage technologies, are improving the accuracy and reliability of air quality monitoring and public health predictions. This study highlights the transformative potential of integrating data mining and AI techniques into air pollution epidemiology. Emerging technologies such as spatiotemporal mining and next-generation sensors pave the way for more accurate, timely, and scalable ways to monitor air quality and predict its impact on public health, opening new avenues for research and policy interventions.
License
Copyright (c) 2024 Dilovan Asaad Majeed, Hawar Bahzad Ahmad, Ahmed Alaa Hani, Subhi R. M. Zeebaree, Saman Mohammed Abdulrahman, Renas Rajab Asaad, Amira Bibo Sallow
This work is licensed under a Creative Commons Attribution 4.0 International License.