Hello, I'm
M. Abdullah Janjua

Data Professional

About Me


I am a Data Engineer by core expertise, with hands-on experience across the entire data lifecycle, from raw data ingestion and pipeline design to analytics, machine learning, and AI-powered applications. I specialize in building scalable, reliable data systems that enable downstream analytics, data science, and intelligent decision-making.

My background spans data engineering, data analysis, data science, and AI engineering, allowing me to work end-to-end: performing exploratory data analysis, designing ETL/ELT pipelines, modeling data for analytics, training and deploying machine learning models, and delivering insights through dashboards and APIs.

I have also worked on AI engineering and Generative AI systems, including LLM-based applications, LangChain pipelines, vector databases, and agentic AI workflows, integrating them with structured and unstructured data sources. This enables me to bridge traditional data platforms with modern AI-driven solutions.

I focus on building solutions that are production-ready, scalable, and aligned with real-world business and research needs, with strong proficiency in Python, SQL, data pipelines, machine learning workflows, and data visualization. My strength lies in connecting data engineering foundations with advanced analytics and AI to deliver measurable impact.

My Projects

MedAssist Insight

Built an end-to-end Generative & Agentic AI pipeline in Python to process unstructured medical lab reports using OCR, NLP, and ETL workflows. Designed a RAG-based knowledge retrieval system and deployed an interactive Streamlit app for real-time insights.

End-to-End News Analytics ETL Pipeline

Designed a complete ETL pipeline to extract news articles, clean & normalize text, generate embeddings, perform clustering and classification, and create analytics-ready datasets for sentiment, keywords, and locations.

Media Sentiment & Regional Insights Dashboard

Built interactive Power BI dashboards on top of the NLP data pipeline to analyze news sentiment trends by category, location, and time, enabling exploratory and comparative analysis.

Company Atlas

Built an end-to-end data & ML pipeline in Python with scalable ETL workflows using Polars & PySpark. Applied NLP and unsupervised learning to generate company clusters and deployed an interactive Streamlit app for cluster visualization & analytics.

Data Science Analysis on Factors Affecting 1st SGPA

Collected real-world data from students via Google Forms and performed cleaning and preprocessing using Python, Pandas, and NumPy. Conducted statistical analysis and EDA with Seaborn and Matplotlib in Python to extract insights, then trained a machine learning model to predict student GPA. Built a Tableau dashboard based on the insights extracted from the analysis. Developed and deployed a Flask-based web app on Vercel, enabling real-time GPA predictions through an interactive user interface.

Sales Dashboard

Designed professional sales dashboards in Google Looker Studio, Tableau, and IBM Cognos to visualize key metrics, enabling data-driven business insights and reporting.

Maze AI

Implemented several AI search strategies in Python so that a computer could solve a maze blindly.

Classic Game Arcade

Designed and implemented a modular gaming arcade in Python, developing multiple games and integrating them into a unified system with a menu-driven interface for seamless gameplay selection.

Skills & Expertise

Data Engineering

Expertise in designing scalable ETL pipelines, data warehousing, data integration, and big data processing. Skilled in handling large datasets, ensuring data quality, and building analytics-ready pipelines.

ETL, Data Pipelines, SQL, NoSQL, Polars, PySpark, Pandas, Data Modeling
AI Engineering

Skilled in building end-to-end AI pipelines, generative AI, agentic AI, and LLM-based solutions. Experienced in NLP, embeddings, RAG systems, and deploying interactive AI applications.

LangChain, LLMs, Generative AI, Agentic AI, NLP, PyTorch, Transformers, RAG, Streamlit
Python Programming

Advanced Python development with expertise in data manipulation, statistical analysis, and automation. Proficient in comprehensive data science libraries and big data processing frameworks.

Pandas, Polars, PySpark, NumPy, SciPy, Matplotlib, Seaborn, Plotly, Scikit-learn
Machine Learning

Comprehensive ML expertise using the Scikit-learn library. Building and deploying predictive models with advanced algorithms for classification, regression, and clustering.

Scikit-learn, Classification, Regression
SQL

Expert in SQL for complex data extraction, transformation, and analysis. Experience with data warehousing, ETL processes, and database optimization.

MySQL, PostgreSQL, ETL
Data Visualization

Creating interactive dashboards and compelling visualizations using Power BI, Looker Studio, and Python libraries (Matplotlib, Seaborn, Plotly).

Matplotlib, Seaborn, Power BI, Google Looker Studio, Tableau, IBM Cognos
Excel & Analytics

Advanced Excel skills for data analysis, pivot tables, VBA automation, and business intelligence. Creating dynamic reports and dashboards.

Pivot Tables, VBA, Power Query
Statistical Analysis

Strong foundation in statistical methods, hypothesis testing, A/B testing, and experimental design. Translating data insights into actionable recommendations.

Hypothesis Testing, A/B Testing, Regression
Exploratory Data Analysis

Comprehensive EDA skills for uncovering patterns, detecting anomalies, and understanding data distributions. Creating insightful visualizations and statistical summaries.

Data Profiling, Correlation Analysis, Outlier Detection
Data Science Tools

Proficient in industry-standard tools and platforms including Google Colab, Kaggle, Jupyter Notebooks, and Git for collaborative data science projects.

Google Colab, Kaggle, Jupyter, Git, GitHub

Get In Touch