I am a Data Engineer at my core, with hands-on experience across the entire data lifecycle, from raw data ingestion and pipeline design to analytics, machine learning, and AI-powered applications. I specialize in building scalable, reliable data systems that enable downstream analytics, data science, and intelligent decision-making.
My background spans data engineering, data analysis, data science, and AI engineering, allowing me to work end-to-end: performing exploratory data analysis, designing ETL/ELT pipelines, modeling data for analytics, training and deploying machine learning models, and delivering insights through dashboards and APIs.
I have also worked on AI engineering and Generative AI systems, including LLM-based applications, LangChain pipelines, vector databases, and agentic AI workflows, integrating them with structured and unstructured data sources. This enables me to bridge traditional data platforms with modern AI-driven solutions.
I focus on building solutions that are production-ready, scalable, and aligned with real-world business and research needs, with strong proficiency in Python, SQL, data pipelines, machine learning workflows, and data visualization. My strength lies in connecting data engineering foundations with advanced analytics and AI to deliver measurable impact.
Designed a complete ETL pipeline to extract news articles, clean and normalize text, generate embeddings, perform clustering and classification, and produce analytics-ready datasets covering sentiment, keywords, and locations.
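A minimal sketch of those stages (clean, embed, cluster), using scikit-learn's TF-IDF and KMeans as illustrative stand-ins for the actual embedding and clustering components of the pipeline:

```python
# Minimal sketch of the news ETL stages: clean -> embed -> cluster.
# Library choices (TF-IDF vectors, KMeans) are illustrative assumptions,
# not necessarily the exact stack used in the original pipeline.
import re
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

def clean(text: str) -> str:
    """Strip punctuation, normalize whitespace, lowercase."""
    text = re.sub(r"[^a-zA-Z0-9\s]", " ", text)
    return re.sub(r"\s+", " ", text).strip().lower()

articles = [
    "Stocks rally as markets react to rate cut!",
    "Central bank signals another rate cut next quarter.",
    "Local team wins championship after dramatic final.",
    "Star striker injured ahead of championship final.",
]

cleaned = [clean(a) for a in articles]

# "Embeddings" here are TF-IDF vectors; a production pipeline might swap in
# transformer-based sentence embeddings instead.
vectors = TfidfVectorizer().fit_transform(cleaned)

# Group articles into topic clusters for the analytics-ready dataset.
labels = KMeans(n_clusters=2, n_init=10, random_state=42).fit_predict(vectors)
for article, label in zip(articles, labels):
    print(label, article)
```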
Collected real-world data from students via Google Forms and performed cleaning and preprocessing with Python, Pandas, and NumPy. Conducted statistical analysis and EDA with Seaborn and Matplotlib to extract insights, then trained a machine learning model to predict student GPA. Built a Tableau dashboard from the analysis findings, and developed and deployed a Flask-based web app on Vercel that serves real-time GPA predictions through an interactive user interface.
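A stripped-down sketch of the serving side of that project; the feature names, toy training data, and the linear model here are hypothetical stand-ins for the deployed version:

```python
# Minimal sketch of a Flask GPA-prediction endpoint.
# Features, training data, and model choice are illustrative only.
from flask import Flask, request, jsonify
import numpy as np
from sklearn.linear_model import LinearRegression

app = Flask(__name__)

# Toy training data: [study_hours_per_week, attendance_pct] -> GPA.
X = np.array([[5, 60], [10, 75], [15, 85], [20, 95]])
y = np.array([2.4, 2.9, 3.4, 3.8])
model = LinearRegression().fit(X, y)

@app.route("/predict", methods=["POST"])
def predict():
    payload = request.get_json()
    features = [[payload["study_hours"], payload["attendance"]]]
    gpa = float(model.predict(features)[0])
    return jsonify({"predicted_gpa": round(gpa, 2)})

if __name__ == "__main__":
    app.run(debug=True)
```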
Implemented uninformed (blind) AI search strategies in Python, enabling a program to solve a maze without any prior knowledge of its layout or heuristics.
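A sketch of one blind strategy, breadth-first search, on a toy grid maze; the maze and the particular strategy shown are illustrative:

```python
# Breadth-first search over a small grid maze (S = start, G = goal, # = wall).
from collections import deque

MAZE = [
    "S.#.",
    ".##.",
    "...G",
]

def solve(maze):
    rows, cols = len(maze), len(maze[0])
    start = next((r, c) for r in range(rows) for c in range(cols) if maze[r][c] == "S")
    frontier = deque([(start, [start])])
    visited = {start}
    while frontier:
        (r, c), path = frontier.popleft()
        if maze[r][c] == "G":
            return path  # list of (row, col) steps from start to goal
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and maze[nr][nc] != "#" and (nr, nc) not in visited:
                visited.add((nr, nc))
                frontier.append(((nr, nc), path + [(nr, nc)]))
    return None  # no path exists

print(solve(MAZE))
```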
Designed and implemented a modular gaming arcade in Python, building multiple games and integrating them into a unified, menu-driven interface for seamless game selection.
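A minimal sketch of the menu-driven shell that ties the games together; the two game entries below are placeholders, not the actual games from the project:

```python
# Menu-driven arcade shell: each game is a callable registered in GAMES.
def number_guess():
    print("You launched Number Guess (placeholder).")

def rock_paper_scissors():
    print("You launched Rock-Paper-Scissors (placeholder).")

GAMES = {
    "1": ("Number Guess", number_guess),
    "2": ("Rock-Paper-Scissors", rock_paper_scissors),
}

def main():
    while True:
        print("\n=== Arcade ===")
        for key, (name, _) in GAMES.items():
            print(f"{key}. {name}")
        print("q. Quit")
        choice = input("Select a game: ").strip()
        if choice == "q":
            break
        entry = GAMES.get(choice)
        if entry:
            entry[1]()  # launch the selected game
        else:
            print("Invalid selection.")

if __name__ == "__main__":
    main()
```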
Expertise in designing scalable ETL pipelines, data warehousing, data integration, and big data processing. Skilled in handling large datasets, ensuring data quality, and building analytics-ready pipelines.
Skilled in building end-to-end AI pipelines and generative, agentic, and LLM-based solutions. Experienced in NLP, embeddings, RAG systems, and deploying interactive AI applications.
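A self-contained sketch of the retrieval-augmented pattern; the bag-of-words index and the stubbed generation step stand in for a real vector database and LLM call:

```python
# RAG-style flow: retrieve the most relevant documents, then assemble a prompt.
# TF-IDF + cosine similarity replace the embedding model/vector DB, and the
# final LLM call is stubbed, to keep the example self-contained.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

DOCS = [
    "Orders placed before noon ship the same day.",
    "Refunds are processed within five business days.",
    "Support is available Monday through Friday, 9am-5pm.",
]

vectorizer = TfidfVectorizer()
doc_vectors = vectorizer.fit_transform(DOCS)

def retrieve(question: str, k: int = 1) -> list:
    """Return the k documents most similar to the question."""
    q_vec = vectorizer.transform([question])
    scores = cosine_similarity(q_vec, doc_vectors)[0]
    top = scores.argsort()[::-1][:k]
    return [DOCS[i] for i in top]

def answer(question: str) -> str:
    """Stub 'generation' step: in practice this prompt would go to an LLM."""
    context = " ".join(retrieve(question))
    return f"Context: {context}\nQuestion: {question}\nAnswer: (LLM response goes here)"

print(answer("How long do refunds take?"))
```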
Advanced Python development with expertise in data manipulation, statistical analysis, and automation. Proficient with the core data science libraries and big data processing frameworks.
Comprehensive ML expertise with the scikit-learn library. Building and deploying predictive models for classification, regression, and clustering.
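A small, generic example of that train/evaluate workflow in scikit-learn; the dataset and model choice here are purely illustrative:

```python
# Standard scikit-learn workflow: split, fit, evaluate.
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

print("Accuracy:", accuracy_score(y_test, model.predict(X_test)))
```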
Expert in SQL for complex data extraction, transformation, and analysis. Experience with data warehousing, ETL processes, and database optimization.
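An illustrative extraction-and-aggregation query, run against an in-memory SQLite database so the example is self-contained; the table and column names are made up:

```python
# Typical analytics-style SQL: aggregate revenue per customer per month.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER, customer TEXT, amount REAL, order_date TEXT);
    INSERT INTO orders VALUES
        (1, 'alice', 120.0, '2024-01-05'),
        (2, 'bob',    80.0, '2024-01-09'),
        (3, 'alice',  40.0, '2024-02-02');
""")

query = """
    SELECT customer,
           strftime('%Y-%m', order_date) AS month,
           SUM(amount) AS revenue
    FROM orders
    GROUP BY customer, month
    ORDER BY month, revenue DESC;
"""
for row in conn.execute(query):
    print(row)
```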
Creating interactive dashboards and compelling visualizations using Power BI, Looker Studio, and Python libraries (Matplotlib, Seaborn, Plotly).
Advanced Excel skills for data analysis, pivot tables, VBA automation, and business intelligence. Creating dynamic reports and dashboards.
Strong foundation in statistical methods, hypothesis testing, A/B testing, and experimental design. Translating data insights into actionable recommendations.
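A minimal A/B-test sketch using a two-sample t-test on synthetic conversion data (a proportions z-test would be another reasonable choice for binary outcomes):

```python
# Simulated A/B test: compare conversion outcomes between two variants.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
# Per-user conversion outcomes (1 = converted) for variants A and B.
group_a = rng.binomial(1, 0.10, size=1000)
group_b = rng.binomial(1, 0.12, size=1000)

t_stat, p_value = stats.ttest_ind(group_a, group_b)
print(f"t = {t_stat:.3f}, p = {p_value:.3f}")
if p_value < 0.05:
    print("Difference is statistically significant at the 5% level.")
else:
    print("No statistically significant difference detected.")
```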
Comprehensive EDA skills for uncovering patterns, detecting anomalies, and understanding data distributions. Creating insightful visualizations and statistical summaries.
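A quick EDA sketch combining summary statistics with a simple IQR-based anomaly check; the column and values are synthetic, real analyses run on project data:

```python
# Summary statistics plus interquartile-range outlier detection on one column.
import pandas as pd
import numpy as np

rng = np.random.default_rng(0)
df = pd.DataFrame({"daily_sales": np.append(rng.normal(200, 20, 100), [500, 15])})

print(df["daily_sales"].describe())

q1, q3 = df["daily_sales"].quantile([0.25, 0.75])
iqr = q3 - q1
outliers = df[(df["daily_sales"] < q1 - 1.5 * iqr) | (df["daily_sales"] > q3 + 1.5 * iqr)]
print("Potential anomalies:\n", outliers)
```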
Proficient in industry-standard tools and platforms including Google Colab, Kaggle, Jupyter Notebooks, and Git for collaborative data science projects.