My Portfolio

Data Engineer, TX, Health-Chain (Nov 2023 - Present)

Working at a startup with cross-functional teams to design an end-to-end ETL pipeline for a healthcare data analysis solution built on an HL7 FHIR server.

Boosted upstream data ingestion from the FHIR API by 25% using an Azure Function App with a timer trigger.
Reduced code integration errors by 20% by implementing GitLab version control for CI/CD pipelines.
Transformed 1 million patient records using the Spark DataFrame API to analyze patient authorization data.
Established data integration connections with APIs and cloud services.
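
As an illustration of the FHIR ingestion step, here is a minimal sketch of walking a paged FHIR search result. It relies only on the standard FHIR Bundle layout (resources under `entry[].resource`, the following page under a `link` whose `relation` is `next`). The function names and the injected `fetch` callable are hypothetical and are not taken from the Health-Chain codebase.

```python
from typing import Callable, Iterator, Optional


def next_page_url(bundle: dict) -> Optional[str]:
    """Return the URL of the next Bundle page, or None on the last page."""
    for link in bundle.get("link", []):
        if link.get("relation") == "next":
            return link.get("url")
    return None


def iter_resources(first_url: str, fetch: Callable[[str], dict]) -> Iterator[dict]:
    """Yield every resource from a paged FHIR search result.

    `fetch` is a hypothetical callable that takes a URL and returns the
    parsed JSON Bundle; in practice it would wrap an authenticated HTTP GET.
    """
    url: Optional[str] = first_url
    while url is not None:
        bundle = fetch(url)
        for entry in bundle.get("entry", []):
            resource = entry.get("resource")
            if resource is not None:  # some entries carry no resource
                yield resource
        url = next_page_url(bundle)
```

A timer-triggered function could call `iter_resources` on each run and hand the yielded resources to the next pipeline stage.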

Student Data Analyst, MI, Grand Valley State University (Aug 2021 - April 2023)

Met with stakeholders to understand data needs and conducted extensive analysis to generate tailored insights and solutions.
Automated manual data processing tasks through Airflow DAGs, saving 15 hours per week for the data engineering team.
Used PySpark on Azure Databricks to partition and bucket data, reducing data processing time by 80%.
Coordinated with teams and developed ETL workflows using Azure Synapse Analytics.
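
To show what partitioning and bucketing mean, here is a small pure-Python sketch of the idea, not the PySpark code used in this role. Rows are grouped first by a partition column and then by a hash of a bucket column, so a query on one partition value or one key only needs to touch a small slice of the data. The column names are hypothetical, and CRC32 stands in for Spark's own hash function.

```python
import zlib
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def bucket_for(key: str, num_buckets: int) -> int:
    """Map a key to a stable bucket number (CRC32 is a stand-in hash)."""
    return zlib.crc32(key.encode("utf-8")) % num_buckets


def partition_and_bucket(
    rows: Iterable[dict], partition_col: str, bucket_col: str, num_buckets: int
) -> Dict[Tuple[str, int], List[dict]]:
    """Group rows by (partition value, bucket number)."""
    layout: Dict[Tuple[str, int], List[dict]] = defaultdict(list)
    for row in rows:
        slot = (str(row[partition_col]), bucket_for(str(row[bucket_col]), num_buckets))
        layout[slot].append(row)
    return dict(layout)


# Hypothetical sample rows: same key and date always land in the same slot.
sample = [
    {"visit_date": "2023-01-01", "patient_id": "p1"},
    {"visit_date": "2023-01-01", "patient_id": "p1"},
    {"visit_date": "2023-01-02", "patient_id": "p2"},
]
layout = partition_and_bucket(sample, "visit_date", "patient_id", num_buckets=4)
```

In PySpark the same layout is usually requested through the DataFrame writer rather than built by hand.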

Data Engineer, India, Amazon (Nov 2018 - Jul 2021)

Designed and optimized ETL pipelines using S3 buckets and AWS Glue to reduce data processing time by 30%.
Used AWS services such as Lambda and EC2 to automate data ingestion and storage, leading to a 50% increase in data accessibility.
Conducted in-depth data analysis to improve data-driven decision-making and resource allocation.

Projects 🚀

Project 1: ETL Project in Azure

Description: I led an ETL project to manage and process large datasets efficiently.

Leveraged Azure Blob Storage and Azure Data Lake as secure and scalable data storage solutions.

Designed data workflows and orchestrated data processing using Azure Data Factory.

Tools Used: Azure Blob Storage, Azure Data Lake, Azure Data Factory

Project 2: API Integration Pipeline using AWS

Description: I spearheaded development of a data pipeline to connect to and ingest data from multiple API sources.

Used AWS S3 to manage incoming data and AWS Glue for data transformation, and created a dashboard in Tableau.

Integrated Python scripts to handle API requests and process data.

Tools Used: AWS S3, AWS Glue, Tableau, Python
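
The Python scripts for this project are not reproduced here. As a minimal sketch of the kind of API-request handling involved, the following standard-library helper fetches JSON and retries with exponential backoff. The function names, retry count, and delays are assumptions chosen for illustration.

```python
import json
import time
import urllib.error
import urllib.request
from typing import Callable


def get_json(url: str, timeout: float = 10.0) -> dict:
    """Fetch a URL and parse the response body as JSON."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def fetch_with_retry(
    url: str,
    fetch: Callable[[str], dict] = get_json,
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call `fetch(url)`, retrying with exponential backoff on network errors."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return fetch(url)
        except OSError:
            # URLError, HTTPError, and socket timeouts are all OSError subclasses.
            # This retries HTTP errors too; a real pipeline may want to skip 4xx.
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")
```

Passing `fetch` and `sleep` as parameters keeps the retry logic easy to test without making real network calls.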

Project 3: Real-time Data Streaming and Processing with Apache Kafka and Apache Spark

Description: Implemented a real-time data streaming and processing pipeline using Apache Kafka and Apache Spark.

Integrated various data sources into Kafka topics for seamless ingestion and used Spark Streaming to process and analyze data in real time.

The project resulted in a significant reduction in data latency, enabling timely insights and actionable decision-making.

Tools Used: Apache Kafka, Apache Spark
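
The description does not say which computations ran on the stream. As one common example of real-time processing, here is a pure-Python sketch of a tumbling-window count, the kind of aggregation a streaming engine performs over events read from a topic. The event shape (timestamp in seconds plus a key) and the function name are assumptions, and this is not Spark or Kafka code.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def tumbling_window_counts(
    events: Iterable[Tuple[float, str]], window_seconds: int
) -> Dict[Tuple[int, str], int]:
    """Count events per key in fixed, non-overlapping time windows.

    Each event is (timestamp_in_seconds, key). The result maps
    (window_start, key) to the number of events in that window.
    """
    counts: Counter = Counter()
    for timestamp, key in events:
        # Round the timestamp down to the start of its window.
        window_start = int(timestamp // window_seconds) * window_seconds
        counts[(window_start, key)] += 1
    return dict(counts)
```

A streaming engine applies the same grouping incrementally as events arrive, instead of over a finished list.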

Project 4: Regression Analysis and ML Deployment

Description: Designed and deployed machine learning pipelines on AWS, including model training, evaluation, and deployment.

Utilized AWS services such as Amazon SageMaker, Elastic Kubernetes Service (EKS), CloudWatch, and Lambda for efficient model management, scalability, and automated model retraining.

Implemented MLOps practices to establish an automated and continuous model deployment process.

Tools Used: Amazon SageMaker, Elastic Kubernetes Service (EKS), CloudWatch, Lambda

Maruthi Chadalapaka

Data Engineer

Hi 👋 ! I'm Maruthi,
Welcome to my page

I'm a result-oriented Data Engineer and Machine Learning expert with proficiency in Azure and AWS.
On the fun side, I blog on Medium and write for GoPenAI and Plumbers Of Data Science.

Experience 💼

Data Engineer, TX, Health-Chain (Nov 2023 - Present)

Student Data Analyst, MI, Grand Valley State University (Aug 2021 - April 2023)

Data Engineer, India, Amazon (Nov 2018 - Jul 2021)

Education 🎓

Master of Science, Data Science and Analytics

Bachelor of Technology, Electronics and Communication

Skills 🧑🏻‍🏫

Languages and Databases

Machine Learning Libraries

Big Data Frameworks

Cloud Technologies

Data Visualizations
