Automation in Data Engineering: Implementing GitHub Actions for CI/CD in ETL Workflows
DOI:
https://doi.org/10.5281/zenodo.13994660
Keywords:
GitHub, Data, ETL
Abstract
Automation in data engineering, particularly within ETL (Extract, Transform, Load) workflows, has become critical as data volumes increase. This paper demonstrates the use of GitHub Actions to implement Continuous Integration/Continuous Deployment (CI/CD) in ETL workflows. Our research focuses on automating the three ETL stages (data extraction, transformation, and loading) with GitHub Actions in order to reduce errors, raise deployment success rates, and minimize manual intervention. Results show a 33.33% reduction in average execution time, a 58.33% decrease in error rates, and an 18.75% increase in successful deployment rates. In addition, automation saved a total of 23 hours across ETL tasks. These findings highlight the importance of CI/CD in modern data engineering and the role of automation in making ETL workflows more reliable and efficient.
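As an illustration of the kind of check a CI job could run against an ETL workflow, the following is a minimal sketch and not the paper's implementation. The function names (`extract`, `transform`, `load`, `run_pipeline`, `check_pipeline`), the CSV columns, and the in-memory target are all hypothetical. A GitHub Actions workflow could plausibly invoke a script like this on each push, so that a failing assertion blocks deployment.

```python
import csv
import io


def extract(raw_csv: str) -> list[dict]:
    """Extract: parse CSV text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(raw_csv)))


def transform(rows: list[dict]) -> list[dict]:
    """Transform: drop rows with a missing amount and cast the fields."""
    cleaned = []
    for row in rows:
        amount = (row.get("amount") or "").strip()
        if not amount:
            continue  # skip incomplete records instead of loading bad data
        cleaned.append({"id": int(row["id"]), "amount": float(amount)})
    return cleaned


def load(rows: list[dict], target: dict) -> int:
    """Load: upsert rows into an in-memory target keyed by id (hypothetical sink)."""
    for row in rows:
        target[row["id"]] = row["amount"]
    return len(rows)


def run_pipeline(raw_csv: str, target: dict) -> int:
    """Run the three stages in order and return the number of rows loaded."""
    return load(transform(extract(raw_csv)), target)


def check_pipeline() -> None:
    """Smoke test a CI step could execute; the sample data is made up."""
    sample = "id,amount\n1,10.5\n2,\n3,7\n"
    target: dict = {}
    loaded = run_pipeline(sample, target)
    assert loaded == 2
    assert target == {1: 10.5, 3: 7.0}


if __name__ == "__main__":
    check_pipeline()
    print("ETL checks passed")
```

Keeping each stage a small pure function makes it straightforward to test in isolation, which is what lets an automated pipeline catch transformation errors before anything is loaded.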
License
Copyright (c) 2022 Ronakkumar Bathani

This work is licensed under a Creative Commons Attribution 4.0 International License.
Research articles in 'International Journal of Engineering and Management Research' are Open Access articles published under the Creative Commons Attribution 4.0 International License (CC BY 4.0), http://creativecommons.org/licenses/by/4.0/. This license allows you to share (copy and redistribute the material in any medium or format) and to adapt (remix, transform, and build upon the material) for any purpose, even commercially.






