Managing Consultant - Data Engineering

Building Scalable
Data Ecosystems
for 8+ Years.

With more than eight years of professional experience, I design, develop, and deliver Data Engineering solutions, ranging from standalone ETL scripts to complex frameworks on clouds such as AWS and Azure. I write clean, reusable, and well-documented code, one project at a time.

Experience

A timeline of engineering complex data architectures and leading technical transformations.

March 2021 — PRESENT

Managing Consultant

Systems Limited - Lahore, Pakistan

  • PartnerLinq - Visibility - Data Engineering Lead
  • Creator of the Data Framework (Python, Pandas, PySpark, Delta Lake) capable of acquiring, ingesting, and executing analytical processes.
  • Responsible for end-to-end solutioning and design, innovation, and optimizations.
  • Designed and modeled the metastore, data/delta lake, Canonical Models, and SQL Server datastore.
  • Designed and developed REST and GraphQL APIs for Semantic Integration, allowing outside applications to interact with internal systems.
  • Designed and orchestrated data pipelines using Databricks Workflows, Airflow, and Azure Data Factory.
  • Developed a query builder that constructs SQL queries for a variety of dialects based on a grid model.
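To give a feel for the query builder idea, here is a minimal sketch rather than the production code. The `GridModel` fields, the dialect names, and the placeholder styles (which depend on the database driver) are all assumptions made for illustration. It shows the two places where dialects typically differ: identifier quoting and pagination.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Hypothetical grid model: what a UI grid might send (columns, filters, sort, paging).
@dataclass
class GridModel:
    table: str
    columns: List[str]
    filters: List[Tuple[str, str, object]] = field(default_factory=list)  # (column, op, value)
    sort: List[Tuple[str, str]] = field(default_factory=list)             # (column, "asc"/"desc")
    limit: int = 50
    offset: int = 0

# Per-dialect identifier quoting and parameter placeholders.
# Placeholder style is an assumption; it depends on the driver in use.
QUOTES = {"postgres": ('"', '"'), "mysql": ("`", "`"), "mssql": ("[", "]")}
PLACEHOLDER = {"postgres": "%s", "mysql": "%s", "mssql": "?"}
ALLOWED_OPS = {"=", "<>", "<", "<=", ">", ">="}

def quote(name: str, dialect: str) -> str:
    # Only plain identifiers are accepted, so quoting never needs escaping.
    if not name.isidentifier():
        raise ValueError(f"invalid identifier: {name!r}")
    left, right = QUOTES[dialect]
    return f"{left}{name}{right}"

def build_query(grid: GridModel, dialect: str) -> Tuple[str, list]:
    """Return (sql, params); filter values are always passed as parameters."""
    if dialect not in QUOTES:
        raise ValueError(f"unsupported dialect: {dialect}")
    select_list = ", ".join(quote(c, dialect) for c in grid.columns)
    parts = [f"SELECT {select_list} FROM {quote(grid.table, dialect)}"]
    params: list = []
    if grid.filters:
        clauses = []
        for column, op, value in grid.filters:
            if op not in ALLOWED_OPS:
                raise ValueError(f"unsupported operator: {op}")
            clauses.append(f"{quote(column, dialect)} {op} {PLACEHOLDER[dialect]}")
            params.append(value)
        parts.append("WHERE " + " AND ".join(clauses))
    order = ", ".join(
        f"{quote(c, dialect)} {'DESC' if d.lower() == 'desc' else 'ASC'}"
        for c, d in grid.sort
    )
    if dialect == "mssql":
        # SQL Server paging requires an ORDER BY clause.
        parts.append(f"ORDER BY {order or '(SELECT NULL)'}")
        parts.append(f"OFFSET {int(grid.offset)} ROWS FETCH NEXT {int(grid.limit)} ROWS ONLY")
    else:
        if order:
            parts.append(f"ORDER BY {order}")
        parts.append(f"LIMIT {int(grid.limit)} OFFSET {int(grid.offset)}")
    return " ".join(parts), params

grid = GridModel(
    table="orders",
    columns=["id", "total"],
    filters=[("status", "=", "shipped")],
    sort=[("total", "desc")],
    limit=10,
    offset=20,
)
pg_sql, pg_params = build_query(grid, "postgres")
ms_sql, ms_params = build_query(grid, "mssql")
```

The same grid model yields `LIMIT 10 OFFSET 20` for PostgreSQL and `OFFSET 20 ROWS FETCH NEXT 10 ROWS ONLY` for SQL Server, with filter values kept out of the SQL string in both cases.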
August 2018 — March 2021

Senior Software Engineer

Northbay Solutions - Lahore, Pakistan

  • Developed ETL scripts using PySpark and created a comprehensive Data Lake framework.
  • Migrated SAS code to PySpark and developed dependency packages for AWS Glue ETL jobs.
  • Developed a service that triggers processes automatically when files are placed on S3, using AWS Lambda and CloudWatch.
  • Created dynamically parallel State Machines and frameworks to automate cloud migration and infrastructure deployments (CloudFormation).
  • Created an API using API Gateway for Kinesis Data Streams and built tools for DynamoDB access.
2018

Intern

Mentor Graphics - Lahore, Pakistan

  • Interned as QA for the Nucleus Real-time Operating System utilizing C/C++, Python, and Bash.

Core Technical Stack

Validated Expertise
Python & Bash
SQL & NoSQL
Apache Spark & Databricks
AWS & Azure Ecosystems
Airflow & Orchestration
Delta Lake / Data Lakes
GraphQL & REST APIs
Infrastructure as Code
Kafka & Big Data Engineering

Featured Projects

Flagship projects that I've created from scratch.

Data Engineering

PartnerLinq - Data Framework

Built a modular data framework using Python, Pandas, PySpark, and Delta Lake. The framework allows data pipelines for both Data Engineering and Machine Learning use cases to be created through simple orchestration. Its modules are independent and can be used in any order. It can use a Domain-Specific Language (DSL) both to curate data into the Canonical Models and, through the Semantic Model, to build analytical queries. This keeps it flexible and scalable regardless of the number of inputs and outputs or the complexity of the transformations. Complex data pipelines can be created without any deployment.
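As a rough illustration of the "independent modules, any order, no deployment" idea, here is a small sketch in plain Python. It is not the framework's actual code: the registry, the step names (`rename`, `filter_equals`), and the shape of the pipeline definition are hypothetical, and rows are plain dictionaries instead of Pandas or PySpark DataFrames to keep the example self-contained.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]
Step = Callable[[List[Row]], List[Row]]

# Hypothetical registry mapping a step name in a pipeline definition to a factory.
REGISTRY: Dict[str, Callable[..., Step]] = {}

def register(name: str):
    """Decorator that makes a step factory available under a name."""
    def decorator(factory):
        REGISTRY[name] = factory
        return factory
    return decorator

@register("rename")
def rename(mapping: Dict[str, str]) -> Step:
    # Renames columns; columns not in the mapping are kept unchanged.
    def step(rows: List[Row]) -> List[Row]:
        return [{mapping.get(k, k): v for k, v in row.items()} for row in rows]
    return step

@register("filter_equals")
def filter_equals(column: str, value: object) -> Step:
    # Keeps only the rows where the column equals the given value.
    def step(rows: List[Row]) -> List[Row]:
        return [row for row in rows if row.get(column) == value]
    return step

def run_pipeline(definition: List[dict], rows: List[Row]) -> List[Row]:
    """Run the steps named in the definition, in the order they are listed."""
    for spec in definition:
        step = REGISTRY[spec["step"]](**spec.get("args", {}))
        rows = step(rows)
    return rows

# The pipeline is plain data, so it could be stored and changed without redeploying code.
definition = [
    {"step": "rename", "args": {"mapping": {"cust": "customer_id"}}},
    {"step": "filter_equals", "args": {"column": "status", "value": "shipped"}},
]
rows = [
    {"cust": 1, "status": "shipped"},
    {"cust": 2, "status": "pending"},
]
result = run_pipeline(definition, rows)
```

Because each step only takes rows in and returns rows out, the two steps above can be reordered in the definition and still compose.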

Python PySpark Delta Lake Pandas Azure Databricks
API

PartnerLinq - Visibility - Integration API

This API, based on Django REST Framework, allows other systems to use the Data Framework for their own data engineering and machine learning needs. It keeps track of job statuses and can intelligently route jobs between multiple clusters depending on their requirements.
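The routing idea can be sketched as follows. This is an assumption-laden illustration, not the API's real logic: the `Cluster` and `Job` fields, the status values, and the "least busy eligible cluster" rule are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional

class JobStatus(Enum):
    QUEUED = "queued"
    RUNNING = "running"

@dataclass
class Cluster:
    name: str
    memory_gb: int
    supports_ml: bool
    running_jobs: int = 0

@dataclass
class Job:
    job_id: str
    min_memory_gb: int
    needs_ml: bool = False
    status: JobStatus = JobStatus.QUEUED
    cluster: Optional[str] = None

def route(job: Job, clusters: List[Cluster]) -> Optional[Cluster]:
    """Assign the job to the least busy cluster that meets its requirements."""
    eligible = [
        c for c in clusters
        if c.memory_gb >= job.min_memory_gb and (c.supports_ml or not job.needs_ml)
    ]
    if not eligible:
        return None  # the job stays QUEUED until a suitable cluster exists
    # Prefer fewer running jobs, then the smallest cluster that fits.
    chosen = min(eligible, key=lambda c: (c.running_jobs, c.memory_gb))
    chosen.running_jobs += 1
    job.cluster = chosen.name
    job.status = JobStatus.RUNNING
    return chosen

clusters = [Cluster("small", 16, False), Cluster("ml-large", 64, True)]
etl = Job("etl-1", min_memory_gb=8)
train = Job("train-1", min_memory_gb=32, needs_ml=True)
huge = Job("huge-1", min_memory_gb=128)

route(etl, clusters)    # fits both clusters; the smaller one is chosen
route(train, clusters)  # only the ML-capable cluster qualifies
route(huge, clusters)   # no cluster is large enough
```

Keeping the status and assigned cluster on the job record is what lets a caller poll for progress later.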

Django Rest Framework Azure Web Apps
API

PartnerLinq - Visibility - Semantic API

This API, based on Django REST Framework, serves metadata from the metastore. It exposes a GraphQL endpoint that the frontend consumes to build views dynamically from the metadata. It also provides a REST endpoint for backward compatibility as well as for some complex scenarios.

Django Rest Framework Strawberry-Django Azure Web Apps

Let's build something scalable.

Currently open to consulting opportunities or senior engineering leadership roles focused on high-scale data infrastructure.