Data Engineer

Ferdian Maulana Akbar

Building reliable data pipelines at scale

Jakarta, Indonesia · 5+ years exp · Open to work

Data Engineer with 5+ years of experience at PT Astra International Tbk, Indonesia's largest automotive conglomerate, specializing in ETL pipeline development, data warehouse design, and ML-powered automation. I've built production Airflow pipelines that reduced manual effort by 71–83% across digital marketing, truck monitoring, and demand forecasting projects.

Skills

Languages & Querying

Python · SQL · Node.js

Pipelines & Orchestration

Apache Airflow · Crontab

Data Modeling

dbt · Kimball · SCD · Star Schema

Cloud & Storage

GCP · BigQuery · Cloud Functions · Cloud Run · GCS · Snowflake · Databricks · PostgreSQL · SQL Server

API & Backend

FastAPI · Flask

AI, ML, & NLP

LangChain · Sentiment Analysis · Topic Modeling · Predictive Analytics

BI & Dashboard Embedding

Power BI · Looker

Web Scraping & Automation

Selenium · BeautifulSoup

Version Control & CI/CD

GitHub · GitLab

Experience

Data Engineer · Apr 2021 – Present

PT Astra International Tbk · Full-time · Jakarta

Indonesia's largest diversified conglomerate (Fortune 500 equivalent), with operations spanning automotive, financial services, and infrastructure. Leading data engineering work across ETL pipelines, ML deployment, and product development.

ETL PIPELINES & AUTOMATION

  • Ads Performance & Social Media Monitoring: built end-to-end ETL pipelines integrating Facebook Ads, Google Ads, Google Analytics, YouTube, and SimilarWeb — reduced manual effort by 83%.
  • Truck Market Monitoring: operationalized Python extraction scripts into a production Airflow pipeline — reduced manual effort by 75%.
  • Customer Truck Usage Monitoring: built Airflow-orchestrated ETL for usage data tracking — reduced manual effort by 71%.
  • Data Warehouse Migration: designed a Snowflake DWH covering finance, reseller, and sales data for a beauty company; delivered 10 SCD Type 2 dimensions and 19 fact tables in a Kimball star schema, orchestrated via Airflow.

ML & PREDICTIVE TOOLS

  • Truck Demand Prediction: designed and deployed a full ML pipeline for demand forecasting — reduced manual effort by 82%.
  • Online Lead Scoring (Car Sales): implemented data flows via Google Cloud Functions and Google Apps Script for an A/B experiment on lead scoring, increasing conversion rate by 1% and revenue by 150% vs. the control group.

PRODUCT & PLATFORM DEVELOPMENT

  • Sales Planning Tools: managed external data integrations and delivered analytics-ready datasets — won 1st place at Astra Innovation Competition 2023.
  • AI Chatbot for Sales Planning: designed and built an AI chatbot using FastAPI, LangChain, and Next.js for self-serve data insights.
  • Text Analysis Product: built NLP models for a media monitoring product — sentiment analysis, topic modeling, and location extraction on social media and news data.
  • Led the data workstream for a large-scale automotive distributor project, delivering integrated analytics dashboards and data quality monitoring across dealer and central systems.

Python · Apache Airflow · BigQuery · Snowflake · dbt · GCP · FastAPI · LangChain · SQL · Power BI

Web Developer Intern · Jun 2019 – Aug 2019

PT Bank Rakyat Indonesia Tbk · Internship · Jakarta

Developed a web-based employee attendance portal.

Web Development

Portfolio

🚗
Automotive Aftersales Pipeline

Production-grade batch data pipeline for automotive dealer aftersales analytics — raw ingestion to BigQuery mart layer with dbt and Airflow.

🤖
Marketing AI Agent

Autonomous AI agent for YouTube channel analytics — ETL pipeline feeding a LangChain + Groq (Llama 3.3 70B) agent exposed via FastAPI.

📊
Power BI Report Embedding

Flask web app for securely embedding Power BI reports via Azure AD authentication and dynamic token generation.

FastAPI ML Predictor

REST API for serving machine learning model predictions — supports single and batch inference with Dockerized deployment.

🏗️
Star Schema ETL with Airflow

Scalable ETL pipeline from GCS to BigQuery organized in a Kimball star schema — raw, core (SCD Type 2), and datamart layers with CI/CD.