Software engineering for data engineers

Category

industrialize data pipelines

Duration (fully-guided training)

12h

Flipped-classroom training duration: NA of videos and NA of interactive workshop.

About the Course

It is a misconception that a data product, once built, stays static forever. Requirements change, and so does the underlying technology. So how can we integrate new requirements and technology in an agile way? This course looks at several software engineering principles that improve our deliverables by letting validation kick in automatically. As a result, less time is spent on maintenance and new people can be onboarded faster.

Software engineering principles for data engineers focuses on coding best practices, the paradigms of object-oriented programming (OOP) and functional programming (FP), code structure, and DevOps practices, in particular testing (unit testing and data quality) and CI/CD. We illustrate all these concepts and work through exercises to bring the key messages home. You'll be surprised by how easy it is to fall victim to bad practices. Dedicated training raises awareness, and quality improves as a result.

Central to all of this is the idea of reducing maintenance by improving code clarity and by intensively automating all kinds of quality checks through testing and CI/CD. Both aim to reduce the time between conceiving an idea and seeing it rolled out into production.
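
To give a flavour of what automated validation can look like, here is a minimal sketch in Python. It is not taken from the course material: the function name `drop_invalid_rows`, the row fields `id` and `amount`, and the validity rules are all hypothetical, chosen only to illustrate a small pure function paired with a unit test that a CI pipeline could run on every change.

```python
from typing import Iterable


def drop_invalid_rows(rows: Iterable[dict]) -> list[dict]:
    """Keep only rows with a non-empty id and a non-negative amount.

    Hypothetical data quality rule, for illustration only.
    The function is pure: it does not modify its input.
    """
    return [
        row
        for row in rows
        if row.get("id") and row.get("amount") is not None and row["amount"] >= 0
    ]


def test_drop_invalid_rows() -> None:
    """Unit test that a runner such as pytest would discover and execute."""
    rows = [
        {"id": "a1", "amount": 10.0},
        {"id": "", "amount": 5.0},     # empty id
        {"id": "a2", "amount": -3.0},  # negative amount
        {"id": "a3", "amount": None},  # missing amount
    ]
    assert drop_invalid_rows(rows) == [{"id": "a1", "amount": 10.0}]
```

Because the function has no side effects, the test needs no setup beyond a handful of in-memory rows, which is one reason small pure functions tend to be cheap to validate automatically.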
