Saif Ullah
Open To Work

Islamabad

Summary

Offering a highly analytical mindset and a passion for leveraging data to solve complex problems, eager to learn and grow within a dynamic team. Brings a foundational understanding of data systems and database design that can be quickly developed through hands-on experience. Ready to use and develop skills in SQL and Python in a Data Engineer role.

Overview

2 years of professional experience

Work History

Data Engineer

LAAM Technologies
05.2025 - Current
  • Reduced ClickHouse storage footprint by 20% by redesigning schema
  • Improved ClickHouse indexes, reducing data read by 60%, enabling efficient bulk ingestion into ElasticSearch
  • Sped up ClickHouse queries by 75x by eliminating redundant joins & expensive functions
  • Enabled users to analyze campaigns, inventory flow & latency by designing end-to-end pipelines on GCP & ClickHouse
  • Ensured consistent data by designing & developing a parallelized data validation service in Go for front-end events
  • Saved $2000/month by optimizing deployments & storage formats for events-handling data pipelines

Data Engineer

DCore
06.2024 - 05.2025
  • Deployed 10+ core microservices & data tools including Airbyte, Kafka, Jasper Reports, Superset, Goldilocks
  • Proposed & led the deployment of ClickHouse for efficient historical data capture
  • Built & deployed a CDC pipeline in Airbyte ingesting 100M+ records from Postgres (AWS RDS) to ClickHouse
  • Managed ClickHouse in 5+ environments & tuned its log retention to avoid OOM kills
  • Developed staging/bronze data layer for 3+ core services in DBT, standardizing raw data for downstream modelling
  • Accelerated DBT models by 8x using multi-threading & lightweight deletes in ClickHouse
  • Streamlined execution of DBT pipelines by deploying Dagster across 5+ environments
  • Increased error detection rate by 50% by creating Kibana alerts on Postgres replication slot lag
  • Participated in 24/7 on-call support for multiple sprints resolving high-priority infrastructure & pipeline issues

Data Engineer

Vyro
01.2024 - 06.2024
  • Built automated ETL pipelines on GCP to load data from GCS to BigQuery
  • Wrote advanced SQL using BigQuery with CTEs, subqueries, & window functions for efficient business analytics
  • Deployed data ingestion scripts on Cloud Functions & orchestrated the scripts using Cloud Scheduler, reducing manual intervention time by 90%
  • Reduced execution time of data ingestion scripts by 8x using asynchronous programming
  • Managed vital data pipelines from APIs including GA4, RevenueCat, Adjust, Singular & Stripe for downstream analytics

Big Data Engineer

Ufone 4G
07.2023 - 08.2023
  • Streamlined the ingestion pipeline by bypassing the HDFS dependency, transforming data & loading it directly into Teradata
  • Harnessed Teradata utilities (FastLoad, MultiLoad, TPT, TPump, BTEQ, FastExport) for simplified data loading
  • Employed diverse SerDes (XML, JSON, CSV, Fixed-width) using HiveQL to seamlessly ingest semi-structured data

Big Data Analyst

Optical Networks & Technologies Lab
06.2022 - 09.2022
  • Used Spark Streaming to analyze .pcap data in optimized batches, each containing 1 million packets
  • Executed data analysis on remote servers using PySpark, showcasing adaptability & remote processing expertise
  • Optimized .pcap parsing by customizing Scapy, converting .pcap files to dataframes & improving efficiency by 50%

Education

Bachelor of Engineering - Software Engineering

National University of Sciences & Technology (NUST)
Islamabad, Pakistan

Skills

  • SQL
  • Shell Scripting
  • Docker & K8s
  • Terraform
  • ETL development (Airbyte, DBT)
  • Data warehousing (ClickHouse, BigQuery)
  • Data modeling
  • Data orchestrators (Dagster, Airflow, Talend)
  • Data lineage (Dagster)
  • Data pipeline design
  • Git version control
  • Python programming

Certification

  • Taming Big Data with Apache Spark & Python | Udemy Jul 2022
  • Airflow in Python | Datacamp Dec 2023
  • LangChain for LLM Application Development | DeepLearning.AI Feb 2025
