As a Data Engineer, I thrive on bridging the gap between software engineering and data analytics. My core toolkit is Python, SQL, AWS, and PySpark. I am certified as an AWS Solutions Architect Associate and an Azure Data Engineer Associate, and I hold a bachelor's degree in Computer Science from the Indian Institute of Information Technology, Jabalpur.
I currently work as a Data Engineer at NeenOpal Inc., where I architect and implement cloud-based ETL solutions and data warehousing systems for global clients. I have worked with clients in logistics, real estate, ed-tech, and non-banking financial companies (NBFCs), delivering solutions tailored to their needs.
Outside of work, I indulge my passion for metal music, mostly doom and death metal. I review metal albums on my blog, Cursed Collection. I track my Spotify and Bandcamp listening history through Last.fm, and I also use stats.fm to monitor my Spotify listening.
Developing end-to-end cloud-based automated data warehousing and scalable ETL solutions for global clients operating in a variety of sectors
Developed a comprehensive end-to-end ETL solution for 6 distinct SBUs within a company using AWS and Power BI.
- Architected a robust data model and established a data warehouse from the ground up on Redshift.
- Developed ETL pipelines to facilitate the seamless migration of data from an array of sources, including SAP, third-party data sources via APIs, and manual Excel files.
- Developed complex SQL queries to transform raw data tables into curated datasets that feed Power BI dashboards.
Led a team of 2 Data Engineers for development of automated ETL jobs
- Orchestrated the development of ETL workflows that migrate transactional data from AWS Aurora to Redshift using AWS Glue.
- Ensured the consistent quality control and performance monitoring of daily ETL scripts, promptly addressing any issues or errors that arose.
Led a team of 2 Data Engineers for development of automated ETL jobs
- Actively engaged in collaborative meetings with stakeholders to gain a deep understanding of the underlying logic behind various sales KPIs.
- Developed SQL queries and Python scripts to execute complex calculations for the KPIs.
- Created backend data models and views in Azure SQL database to dynamically transform raw Salesforce data into insightful data for powering 8 different dashboards consisting of 30+ sales KPIs.
Took charge of the complete data engineering responsibilities for the entire project
- Engineered ETL pipelines to facilitate the monthly migration of data from government sources, extracting information from both APIs and manual files stored in S3, and seamlessly loading it into an RDS PostgreSQL database.
- Developed complex SQL queries to transform raw property-related data into analysis-ready datasets.
- Collaborated with the data scientist on a multivariate time-series forecasting model designed to predict average property price growth in various UK regions over the next decade.
Collaborated with a Senior Data Scientist in the development of a forecasting model
- Assisted in developing a hybrid forecasting model leveraging ML algorithms and statistical models.
- Automated the manual work of importing, cleaning, and preprocessing data, and scheduled automatic ML model runs.
- Used Tableau for presenting interesting insights to the stakeholders.
Assisted Prof. Anupam Agarwal in his research
- Performed a literature survey on early detection of autism in toddlers.
- Worked on an autism detection application using image processing and machine learning techniques.
- Built proofs of concept in newer areas of machine learning, such as transfer learning.
Built a neural network accelerator for the Darknet Reference Model (image classification on the ImageNet dataset) on an Intel Cyclone V SoC FPGA. Connected to an ARM Cortex-A9 processor, it achieved around 300% faster inference than the CPU.
Built a credit card fraud detection model with Amazon SageMaker and deployed it on AWS Cloud9.
Explored various models for forecasting time series and compared the performance of the models over two different metrics.
Detected anomalies with an LSTM autoencoder in 5,000 ECG time-series sequences of 140 timesteps each, where each sequence corresponds to a heartbeat from a single patient.
Explored what the top songs on Spotify from 2010-19 have in common, analyzed my personal Spotify streaming history, analyzed the discographies of 2 of my favorite artists, and more.
Predicted when to buy or sell stocks using a simple dual moving average crossover strategy, then backtested it on 5 years of stock data.
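For readers unfamiliar with the technique, here is a minimal sketch of a dual moving average crossover with a naive backtest. It is an illustration, not the project's actual code: the function names, the 3-day and 5-day windows, the all-in long-only position sizing, and the starting cash are all assumptions chosen to keep the example short.

```python
from typing import List, Optional


def simple_moving_average(prices: List[float], window: int) -> List[Optional[float]]:
    """Trailing mean at each index; None until `window` prices are available."""
    averages: List[Optional[float]] = []
    running_sum = 0.0
    for i, price in enumerate(prices):
        running_sum += price
        if i >= window:
            running_sum -= prices[i - window]  # drop the price that left the window
        averages.append(running_sum / window if i >= window - 1 else None)
    return averages


def crossover_signals(prices: List[float], short_window: int = 3,
                      long_window: int = 5) -> List[str]:
    """Label each day 'buy', 'sell', or 'hold' (window sizes are illustrative)."""
    if short_window >= long_window:
        raise ValueError("short_window must be smaller than long_window")
    short_ma = simple_moving_average(prices, short_window)
    long_ma = simple_moving_average(prices, long_window)
    signals = ["hold"] * len(prices)
    for i in range(1, len(prices)):
        if long_ma[i - 1] is None:  # not enough history yet for both averages
            continue
        was_above = short_ma[i - 1] > long_ma[i - 1]
        is_above = short_ma[i] > long_ma[i]
        if is_above and not was_above:
            signals[i] = "buy"   # short average crossed above the long average
        elif was_above and not is_above:
            signals[i] = "sell"  # short average fell back to or below the long average
    return signals


def backtest(prices: List[float], signals: List[str],
             starting_cash: float = 1000.0) -> float:
    """Naive long-only backtest: go all-in on 'buy', fully exit on 'sell'."""
    cash, shares = starting_cash, 0.0
    for price, signal in zip(prices, signals):
        if signal == "buy" and shares == 0:
            shares, cash = cash / price, 0.0
        elif signal == "sell" and shares > 0:
            cash, shares = shares * price, 0.0
    return cash + shares * prices[-1]  # value any open position at the last price


# Toy price series (made up for illustration, not real market data).
toy_prices = [10, 10, 10, 10, 10, 12, 14, 16, 14, 12, 10, 8, 6]
toy_signals = crossover_signals(toy_prices)
final_value = backtest(toy_prices, toy_signals)
```

On the toy series the strategy buys after the rally has started and sells after the decline is under way, which shows the lag inherent in moving averages. A realistic backtest would also account for transaction costs and slippage, which this sketch ignores.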
Explored forecasting models based on Long Short-Term Memory (LSTM) networks and Facebook's Prophet to predict future prices.