
When I first started studying data engineering, I felt overwhelmed by the number of tools and techniques required to run data pipelines. And learning about those tools wasn't enough: for every tool available, you also have to know how to set it up, and that can be as annoying as it sounds. Getting environment variables right, matching component versions so that nothing crashes, keeping your .jar files and dependencies up to date… well, it's a task that can scare newcomers right away.

But when a problem appears, so does a solution. For me, the solution was containerized applications and, more specifically, Kubernetes. When I first started learning about Kubernetes, I was really impressed by how easy it is to set up an application and start running things. With just one line of code, you can install and run a ton of applications without worrying about environment variables and dependencies. And once you start using it, you will wonder how you ever did things differently.

So, today, I would like to show how you can set up Airflow (one of the most widely used orchestration tools for data pipelines) in a Kubernetes cluster with GitSync, which syncs the DAGs in a code repository (like GitHub) with your Airflow installation, so you don't need to build a Docker image with your DAGs inside every time you update or create one.

Installing Helm and Kubectl

First, you need a Kubernetes cluster. For this demo, I'm using minikube, but you can use any other Kubernetes cluster, such as kind, MicroK8s, or a cloud-based one like GKE. You also need the kubectl CLI to interact with your cluster, and Helm to manage your charts and install Airflow in an easy way.
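To make the Helm and kubectl prerequisites and the GitSync idea from this article a bit more concrete, here is a minimal Python sketch. It checks whether `kubectl` and `helm` are on your PATH, then builds (without running) a Helm command that would install Airflow with GitSync enabled. It assumes the official Apache Airflow Helm chart and its `dags.gitSync.*` values; the article does not say which chart it uses, and the function names, release name, namespace, and repository URL below are all hypothetical.

```python
import shutil

# Tools the article lists as prerequisites (the cluster itself can be minikube, kind, etc.).
REQUIRED_TOOLS = ("kubectl", "helm")


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the tools from `tools` that are not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def build_airflow_install_command(dags_repo, branch="main", sub_path="dags",
                                  release="airflow", namespace="airflow"):
    """Build (but do not run) a helm command that installs Airflow with GitSync.

    Assumption: the official chart repository was added beforehand with
        helm repo add apache-airflow https://airflow.apache.org
    """
    return [
        "helm", "upgrade", "--install", release, "apache-airflow/airflow",
        "--namespace", namespace, "--create-namespace",
        # GitSync values of the official chart: pull DAGs from a Git repository
        # instead of baking them into a Docker image.
        "--set", "dags.gitSync.enabled=true",
        "--set", f"dags.gitSync.repo={dags_repo}",
        "--set", f"dags.gitSync.branch={branch}",
        "--set", f"dags.gitSync.subPath={sub_path}",
    ]


if __name__ == "__main__":
    absent = missing_tools()
    if absent:
        print("Install these first:", ", ".join(absent))
    # Hypothetical repository URL; replace it with your own DAG repository.
    command = build_airflow_install_command(
        "https://github.com/example-user/example-dags.git"
    )
    print(" ".join(command))
```

The script only prints the command so you can review it before running it yourself. Building the arguments as a list also makes it straightforward to pass them to `subprocess.run` later if you decide to automate the install.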
