Ayoub Fakir.

I build data platforms.

Senior Data Engineer & Architect. Scala · Rust · Go · Python. Distributed systems, functional programming, blockchain.

  • Certified Kubernetes Administrator (CKA-2000-008592-0100)

01 About

I'm a Senior Data Engineer & Architect with a decade of building distributed systems, data platforms, and the infrastructure that keeps them honest. I write Scala, Rust, Go, and Python (sometimes Clojure), and I have strong opinions about Functional Programming, blockchain data, and how teams should work with data.

I've led platforms at Décathlon, Glassnode, Algolia, and HPE: moving petabytes, untangling pipelines, and migrating things from where they shouldn't be to where they should. On the side, I teach Data Engineering at Paris-Est Créteil University.

02 Experience

  1. Senior Data Engineer · Décathlon

    Led the migration of Décathlon’s data pipelines from Talend/Redshift to Spark/Scala on AWS Databricks.

    • Lead Data Engineer on the PerfECO project, moving Talend/Redshift workloads to a Spark/Scala pipeline.
    • Data validation at scale (batch + streaming) with Cats / ZIO on POSLog.
    • Trained team members on Functional Programming and distributed programming.
    • Migrated Spark/Redshift workloads to AWS Databricks.
    • Built an agent-based distributed streaming system for ingestion.
    • Member of the architecture committee across multiple Décathlon projects.
    • Scala
    • Spark
    • Databricks
    • Cats
    • ZIO
    • AWS
  2. Senior Data Engineer · Glassnode

    Led the data platform ingesting and exposing on-chain data from dozens of vendors.

    • Ingested blockchain data from CoinGecko, CoinMarketCap, Sonar, Dune, ETF feeds and more.
    • Set up the Medallion architecture in GCP, integrated Snowflake and BigQuery.
    • Modernized the platform with Airflow (Composer), Spark/Dataproc, dbt.
    • Introduced Lakehouse formats (Delta / Iceberg).
    • GCP
    • Snowflake
    • BigQuery
    • Airflow
    • dbt
    • Delta
    • Iceberg
  3. Senior Data Engineer · Algolia

    Ingestion and processing of petabytes per month on the Data Platform team.

    • Ingestion via Kafka and Kinesis from cloud providers and vendors (Salesforce, …).
    • Migrated from Stitch to Meltano; built a framework to automate API/DB ingestion (Airflow + deferred ECS tasks).
    • Spark with EMR and AWS Glue. dbt framework for Analytics Engineers.
    • Led Redshift → Databricks/Snowflake feasibility studies and PoCs.
    • Lakehouse on Delta Lake + Databricks.
    • Kafka
    • Kinesis
    • Spark
    • EMR
    • Glue
    • dbt
    • Delta
    • Databricks
  4. Senior DataOps · Air Liquide

    5-month mission

    Architected and shipped an in-house Airflow data platform on EKS.

    • Studied the feasibility of a fleet of Airflow deployments on EKS and validated it with stakeholders.
    • Set up DEV/PROD infrastructure via Terraform; deployments via Kubernetes / Helm.
    • Automated cluster chores via the Airflow API.
    • Managed cross-team role access; shipped the first dbt / Spark DAGs.
    • Airflow
    • EKS
    • Terraform
    • Helm
    • dbt
    • Spark
  5. Senior Scala / Data Engineer · Hewlett Packard Enterprise

    Core Team contributor on the Harmony platform.

    • Maintained and extended the Harmony platform; admin of the team’s Kubernetes cluster.
    • CI/CD with GitHub Actions and Jenkins (legacy).
    • Scala features with ZIO and Cats.
    • Co-designed the Complex Event Processing engine from scratch.
    • Messaging via Kafka / Pulsar.
    • Scala
    • ZIO
    • Cats
    • Kafka
    • Pulsar
    • Kubernetes
  6. DataOps · HydraDX.io

    Infra for Polkadot/Kusama parachains and analytics.

    • IaC with Terraform, Consul, Rundeck, Ansible, GitHub Actions.
    • Managed the full project infrastructure (Polkadot + Kusama parachains).
    • Analytics infra: Scala/Spark + ZIO on EMR; ad-hoc on Zeppelin / JupyterHub.
    • Ephemeral testnet deployments for the Runtime team via GitHub Actions + K8s.
    • Terraform
    • Polkadot
    • Kusama
    • Spark
    • ZIO
    • Kubernetes
  7. DataOps & Senior Data Engineer · Alterway Cloud Consulting

    Lead data engineer for client audits and infra-as-code engagements.

    • IaC for various clients (Terraform, CDK).
    • Data infrastructure and job-performance audits.
    • Kubernetes ecosystem on AWS.
    • Terraform
    • CDK
    • AWS
    • Kubernetes
  8. Senior Data Engineer · Andjaro

    Industrialized a full data pipeline; built a data catalog for product teams.

    • Audited the existing data infrastructure.
    • Pipeline industrialization: Kubernetes, EMR, Airflow, Jenkins, Athena, Glue.
    • Spark/Scala cleaning + aggregation jobs.
    • Data Catalog implementation for product teams.
    • EMR
    • Airflow
    • Athena
    • Glue
    • Spark
  9. Senior Data Engineer · Voodoo.io

    Spark jobs processing terabytes daily and a new data lake on AWS.

    • Spark/Scala cleaning and aggregation across terabytes daily.
    • Built a data lake on AWS; Airflow workflows; Kubernetes.
    • CI/CD with CircleCI / S3.
    • Spark
    • AWS
    • Airflow
    • Kubernetes
    • CircleCI
  10. Senior Data Engineer · Société Générale

    via Devoteam

    Spark migration and Kafka/NiFi ingestion.

    • Migrated jobs from Spark 1.x to 2.x.
    • Ingested data through Kafka and NiFi.
    • CI/CD with Jenkins / Nexus.
    • Spark
    • Kafka
    • NiFi
  11. Data Engineer · AXA Data Innovation Lab

    via Devoteam

    Cloudera platform admin + GDPR compliance project.

    • Spark/Scala cleaning and normalization of data from AXA’s entities.
    • Cloudera cluster admin as part of the Platform Team.
    • Platform KPIs and YARN resource management.
    • On-site support for internal clients (Hong Kong, Germany, Spain, France).
    • GDPR compliance project: deletion / update of users’ sensitive data on request.
    • Spark
    • Cloudera
    • Hadoop
    • YARN
  12. Data Consultant · Devoteam Technology

    Pre-sales technical lead and internal Big Data trainer.

    • Tech / architecture lead on business opportunities.
    • Conducted technical interviews.
    • Big Data trainer at Devoteam University (Spark, Scala, Python, Hadoop).
    • Knowledge Community Leader: internal social-network articles on Big Data.
    • Spark
    • Scala
    • Python
    • Hadoop
  13. Data Engineer · Crédit Agricole CIB

    Hortonworks Hadoop clusters and PoCs.

    • Built and maintained Hortonworks Hadoop clusters.
    • Various proofs of concept.
    • Hadoop
    • Hortonworks

03 Skills

Programming

  • Scala
  • Python
  • Go
  • Rust
  • Clojure

Cloud

  • AWS
  • GCP
  • Azure

Data Platforms

  • Databricks
  • Snowflake

Data

  • Hadoop
  • Spark
  • Hudi
  • Delta
  • Iceberg

FP / Frameworks

  • ZIO
  • Cats
  • Cats Effect

NoSQL

  • Cassandra
  • HBase
  • DynamoDB
  • MongoDB

CI/CD

  • GitHub Actions
  • CircleCI
  • Jenkins
  • GitLab CI

Blockchain

  • Polkadot
  • Hyperledger
  • Ethereum

04 Education

Master's Degree, Engineering of Distributed Systems · Paris 12 University · 2015

05 Beyond Code

B23 Dossier de Vol

✈ Aviation

A Streamlit app to prep VFR flights on the Bristell B23 (F-HBTI / F-HRDV). It reproduces the 9 sections of an ACAF flight dossier, including weather, NOTAMs, a navlog with wind triangle, performance interpolation, fuel planning, and weight & balance with envelope checks, and exports the result to PDF. I built it because I'm training as a pilot at Aéroclub Air France and wanted flight prep to be faster.

  • Python
  • Streamlit
  • pandas
  • matplotlib
  • fpdf2
aero.fakir.dev

06 Awards & Teaching

Odyssey Hackathon Winner
Morocco Web Awards Winner · kezakoo.com
Lecturer in Data Engineering · Paris-Est Créteil University