Hello!

I'm Jithendra Yenugula, a software engineer passionate about building scalable & efficient data systems.

Check out my blog: jithendray.github.io/blog/

Feel free to get in touch: jithendra.yenugula@gmail.com

Background

As a Data Engineer, I bridge the gap between software engineering and data analytics. My core toolkit includes Python, SQL, AWS, and PySpark. I am certified as an AWS Solutions Architect Associate and an Azure Data Engineer Associate, and I hold a bachelor's degree in Computer Science from the Indian Institute of Information Technology, Jabalpur.

I currently work as a Data Engineer at NeenOpal Inc., where I design and implement cloud-based ETL solutions and data warehousing systems for global clients. I have worked with clients in the logistics, real estate, ed-tech, and NBFC (non-banking financial company) sectors, delivering solutions tailored to their needs.

When I'm not working, I listen to metal, mostly doom and death metal. I also review metal albums on my blog, Cursed Collection. I track my Spotify and Bandcamp listening history through Last.fm, and I use stats.fm to monitor my Spotify listening history.

Skills
Languages
  • Python
  • SQL
  • PySpark
  • Scala
  • HTML/CSS
  • JavaScript
Frameworks
  • Spark
  • Hadoop
  • Keras
  • Kafka
  • Django
  • PyTorch
Tools
  • AWS
  • Git & GitHub
  • Airflow
  • Azure
  • PostgreSQL & MySQL
  • MongoDB
Technologies
  • ETL design & testing
  • Data Warehousing
  • Time Series Forecasting
  • Machine Learning
  • Data Structures & Algorithms
Experience
Jun 2021 - Present
Data Engineer

Developing end-to-end cloud-based automated data warehousing and scalable ETL solutions for global clients operating in a variety of sectors

Developed an end-to-end ETL solution for 6 distinct SBUs within a company, using AWS and Power BI.

- Architected a robust data model and established a data warehouse from the ground up on Redshift.

- Developed ETL pipelines to facilitate the seamless migration of data from an array of sources, including SAP, third-party data sources via APIs, and manual Excel files.

- Developed complex SQL queries to transform raw data tables into datasets that feed Power BI dashboards.

AWS Python SQL Redshift AWS Glue

Led a team of 2 Data Engineers for development of automated ETL jobs

- Led the development of ETL workflows that migrate transactional data from Amazon Aurora to Redshift using AWS Glue.

- Ensured the consistent quality control and performance monitoring of daily ETL scripts, promptly addressing any issues or errors that arose.

Amazon Redshift Amazon Aurora AWS Glue

Led a team of 2 Data Engineers for development of automated ETL jobs

- Actively engaged in collaborative meetings with stakeholders to gain a deep understanding of the underlying logic behind various sales KPIs.

- Developed SQL queries and Python scripts to execute complex calculations for the KPIs.

- Created backend data models and views in Azure SQL Database to transform raw Salesforce data into the datasets behind 8 dashboards covering 30+ sales KPIs.

Azure SQL Database SQL Azure Data Factory

Took charge of the complete data engineering responsibilities for the entire project

- Engineered ETL pipelines to facilitate the monthly migration of data from government sources, extracting information from both APIs and manual files stored in S3, and seamlessly loading it into an RDS PostgreSQL database.

- Developed complex SQL queries to transform raw property-related data into analysis-ready datasets.

- Collaborated with the data scientist on a multivariate time-series forecasting model that predicts average property price growth across UK regions over the next decade.

PostgreSQL Python PySpark bash
Nov 2020 - May 2021
Data Scientist Intern

Collaborated with a Senior Data Scientist on the development of a forecasting model

- Assisted in developing a hybrid forecasting model leveraging ML algorithms and statistical models.

- Automated the manual work of importing, cleaning, and preprocessing data, and scheduled automatic ML model runs.

- Used Tableau to present insights to stakeholders.

Python Machine Learning Time Series Forecasting
May 2019 - Jul 2019
Research Intern

Assisted Prof. Anupam Agarwal in his research

- Performed a literature survey on early detection of autism in toddlers.

- Worked on an autism detection application using image processing and machine learning techniques.

- Built POCs in newer areas of machine learning, such as transfer learning.

Python Machine Learning
Certificates
Other Projects

A neural network accelerator for the Darknet Reference Model, performing image classification on the ImageNet dataset on an Intel Cyclone V SoC FPGA. Connected to an ARM Cortex-A9 processor, it achieved around 300% faster inference than the CPU.

FPGA Image Classification C OpenCL

Built a credit card fraud detection model with AWS SageMaker and deployed it on AWS Cloud9.

AWS Machine Learning Python JavaScript

Explored various models for forecasting time series and compared the performance of the models over two different metrics.

Time Series Forecasting Python Machine Learning
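To illustrate what such a comparison can look like, here is a minimal sketch that scores several forecasts against the same actual values on two metrics. The project description does not name the metrics, so MAE and RMSE are assumptions, and the helper names (mae, rmse, compare_models) are hypothetical.

```python
import math


def mae(actual, predicted):
    """Mean absolute error (assumed metric, not named in the project)."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error (assumed metric, not named in the project)."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


def compare_models(actual, forecasts):
    """Score each named forecast on both metrics.

    forecasts maps a model name to its list of predictions,
    aligned one-to-one with the actual values.
    """
    return {
        name: {"mae": mae(actual, pred), "rmse": rmse(actual, pred)}
        for name, pred in forecasts.items()
    }


if __name__ == "__main__":
    actual = [1, 2, 3, 4]
    forecasts = {
        "naive_lag": [1, 1, 2, 3],         # previous value carried forward
        "global_mean": [2.5, 2.5, 2.5, 2.5],
    }
    for name, scores in compare_models(actual, forecasts).items():
        print(name, scores)
```

Reporting two metrics side by side is useful because they can disagree: RMSE penalizes large individual errors more heavily than MAE does.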

Detected anomalies in 5,000 ECG time series sequences of 140 timesteps each, where every sequence corresponds to a heartbeat from a single patient, using an LSTM autoencoder.

Python Deep Learning Anomaly Detection

Explored what the top songs on Spotify from 2010-19 have in common, analyzed my personal Spotify streaming history, analyzed the discographies of 2 of my favorite artists, and more.

Data Science Data Analysis Python

Predicted when to buy or sell stocks using a simple dual moving average crossover strategy, then backtested it over 5 years of stock data.

Data Science Python Back Testing
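The dual moving average crossover idea can be sketched roughly as follows: a buy signal fires when the short-window average rises above the long-window average, and a sell signal fires when it falls back below. This is a simplified illustration, not the project's actual code. The window sizes, the one-unit long-only backtest, and the function names are all assumptions.

```python
def moving_average(prices, window):
    """Simple moving average; None until a full window is available."""
    averages = []
    running = 0.0
    for i, price in enumerate(prices):
        running += price
        if i >= window:
            running -= prices[i - window]  # drop the value leaving the window
        averages.append(running / window if i >= window - 1 else None)
    return averages


def crossover_signals(prices, short_window=2, long_window=4):
    """Return (index, 'buy' | 'sell') at each point where the averages cross."""
    short = moving_average(prices, short_window)
    long_ = moving_average(prices, long_window)
    signals = []
    prev_above = None
    for i in range(len(prices)):
        if long_[i] is None:
            continue  # not enough history for the long window yet
        above = short[i] > long_[i]
        if prev_above is not None and above != prev_above:
            signals.append((i, "buy" if above else "sell"))
        prev_above = above
    return signals


def backtest(prices, signals):
    """Long-only backtest on one unit: enter on 'buy', exit on 'sell'."""
    profit = 0.0
    entry = None
    for i, action in signals:
        if action == "buy" and entry is None:
            entry = prices[i]
        elif action == "sell" and entry is not None:
            profit += prices[i] - entry
            entry = None
    if entry is not None:  # close any open position at the last price
        profit += prices[-1] - entry
    return profit


if __name__ == "__main__":
    prices = [10, 9, 8, 7, 6, 7, 8, 9, 10, 11, 10, 9, 8, 7, 6]
    signals = crossover_signals(prices)
    print(signals, backtest(prices, signals))
```

A real backtest would also account for transaction costs and position sizing, which this sketch ignores.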

Explored forecasting models based on Long Short-Term Memory (LSTM) networks and Facebook's Prophet to predict future prices.

Time Series Forecasting Python Statistics