Hi 👋! I'm Maruthi.
Welcome to my page.

I'm a results-oriented Data Engineer and Machine Learning expert with proficiency in Azure and AWS.
On the fun side, I blog on Medium and write for GoPenAI and Plumbers Of Data Science.

Experience 💼

Data Engineer, TX, Health-Chain (Nov 2023 - Present)

    Working at a startup, collaborating with cross-functional teams to design an end-to-end ETL pipeline for a healthcare data analysis solution built on an HL7 FHIR server.

  • Boosted upstream data ingestion from the FHIR API by 25% using an Azure Function App with a timer trigger.
  • Reduced code integration errors by 20% by implementing GitLab version control for CI/CD pipelines.
  • Transformed 1 million patient records using the Spark DataFrame API to analyze patient authorization data.
  • Established data integration connections with APIs and cloud services.

Student Data Analyst, MI, Grand Valley State University (Aug 2021 - Apr 2023)

  • Worked with stakeholders to understand data needs and conducted extensive analysis to generate tailored insights and solutions.
  • Automated manual data processing tasks with Airflow DAGs, saving the data engineering team 15 hours per week.
  • Used PySpark and Azure Databricks to partition and bucket data, reducing data processing time by 80%.
  • Coordinated with teams and developed ETL workflows using Azure Synapse Analytics.

Data Engineer, India, Amazon (Nov 2018 - Jul 2021)

  • Designed and optimized ETL pipelines using S3 buckets and AWS Glue to reduce data processing time by 30%.
  • Used AWS services such as Lambda and EC2 to automate data ingestion and storage, leading to a 50% increase in data accessibility.
  • Conducted in-depth data analysis to improve data-driven decision-making and resource allocation.

Education 🎓

  • Master of Science, Data Science and Analytics
  • Grand Valley State University, MI, USA, GPA 3.53

    Year of Completion: Apr 2023

  • Bachelor of Technology, Electronics and Communication
  • Mahaveer Institute Of Science and Technology, India, GPA 3.50

    Year of Completion: Jun 2016

Skills 🧑🏻‍🏫

    Languages and Databases

    Python
    R
    SAS
    MySQL
    PostgreSQL
    AWS DynamoDB

    Machine Learning Libraries

    NumPy
    Pandas
    scikit-learn
    matplotlib

    Big Data Frameworks

    Apache Spark
    Apache Kafka
    Apache Airflow
    Hadoop

    Cloud Technologies

    AWS
    Azure

    Data Visualizations

    Tableau
    Power BI

Projects 🚀

    Project 1: ETL Project in Azure

    • Description: Led an ETL project to manage and process large datasets efficiently.
    • Leveraged Azure Blob Storage and Azure Data Lake as secure, scalable data storage solutions.
    • Designed data workflows and orchestrated data processing using Azure Data Factory.

    Tools Used: Azure Blob Storage, Azure Data Lake, Azure Data Factory

    Project 2: API Integration Pipeline using AWS

    • Description: Spearheaded development of a data pipeline to connect to and ingest data from multiple API sources.
    • Used AWS S3 to manage incoming data and AWS Glue for data transformation, and created a dashboard in Tableau.
    • Integrated Python scripts to handle API requests and process data.

    Tools Used: AWS S3, AWS Glue, Tableau, Python

    Project 3: Real-time Data Streaming and Processing with Apache Kafka and Apache Spark

    • Description: Implemented a real-time data streaming and processing pipeline using Apache Kafka and Apache Spark.
    • Integrated various data sources into Kafka topics for seamless ingestion and used Spark Streaming to process and analyze data in real time.
    • The project resulted in a significant reduction in data latency, enabling timely insights and actionable decision-making.

    Tools Used: Apache Kafka, Apache Spark

    Project 4: Regression Analysis and ML Deployment

    • Description: Designed and deployed machine learning pipelines on AWS, including model training, evaluation, and deployment.
    • Utilized AWS services such as Amazon SageMaker, Elastic Kubernetes Service (EKS), CloudWatch, and Lambda for efficient model management, scalability, and automated model retraining.
    • Implemented MLOps practices to establish an automated and continuous model deployment process.

    Tools Used: Amazon SageMaker, Elastic Kubernetes Service (EKS), CloudWatch, Lambda

Resume 📄