Katipalli Manisha
Data Engineer | Transforming Data into Insights
Professional Summary
Data Engineer with 4+ years of experience building scalable cloud data platforms, modern ELT pipelines, and analytics-ready datasets across
healthcare and enterprise environments. Strong expertise in Python, SQL, Spark, dbt, Azure Data Factory, and AWS services with hands-on
experience in data modeling, orchestration, and production deployments. Familiar with AI fundamentals, feature engineering workflows, and
API-driven data systems including GraphQL integrations. Proven ability to design reliable pipelines, improve data quality, and enable analytics
and ML teams through optimized, governed data solutions.
Contact Information
📱 +1 (361) 228-9360
🔗 LinkedIn: katipallimanisha
📍 United States
Core Technical Expertise
Data Engineering
Python, SQL, PySpark, ETL/ELT Development, Data Modeling, Apache Airflow, Data Warehousing
Cloud Platforms
AWS (Glue, Lambda, S3, Redshift, EMR), Azure (Data Factory, Databricks, Synapse), Snowflake
Analytics & Visualization
Tableau, Power BI, Data Analysis, Business Intelligence, KPI Development
DevOps & Automation
CI/CD Pipelines, Jenkins, Docker, Terraform, Git, Infrastructure as Code
Professional Certifications
Databricks Fundamentals
Advanced data engineering and analytics platform expertise
Microsoft Data Analytics
Get started with Microsoft data analytics certification
Accenture Data Analytics
Data Analytics and Visualization Job Simulation
GCP Platform Architect
Google Cloud Platform architecture certification
AWS Solutions Architecture
AWS APAC Solutions Architecture Job Simulation
IBM - Senior Data Engineer
August 2024 - Present | United States
Pipeline Optimization
Designed end-to-end data pipelines using PySpark, SQL, and Airflow to ingest, cleanse, and transform high-volume enterprise data, improving data quality and enabling consistent reporting.
Cloud Architecture
Built cloud-native ETL workflows on AWS using Glue, Lambda, S3, and Redshift, automating ingestion and reducing manual data preparation by 80%.
Data Lake Implementation
Implemented a scalable medallion architecture (bronze, silver, and gold layers) on Delta Lake and Snowflake, improving governance and query speed.
Key IBM Achievements
80% Automation Increase
Reduced manual data preparation through automated ETL workflows
40% Performance Boost
Reduced execution times through SQL and Spark optimization
100% CI/CD Coverage
Automated deployments using Jenkins, Docker, and Terraform
Framework Development
Developed reusable PySpark frameworks for anomaly detection, schema validation, and data reconciliation, increasing accuracy and preventing downstream reporting issues.
Cross-Functional Collaboration
Partnered with analytics and data science teams to deliver feature-ready datasets supporting predictive modeling and KPI analysis.
CitiusTech - Data Engineer
July 2020 - July 2023 | India
01 Healthcare Pipeline Engineering
Engineered large-scale pipelines using Spark, Python, and SQL to process claims, EHR, provider, and patient datasets with HIPAA-compliant workflows.
02 ETL Integration
Built ETL pipelines with Airflow, Azure Data Factory, and Python to integrate HL7, FHIR, EMR, and vendor systems.
03 Data Modeling
Designed scalable models in Snowflake and Azure Synapse for analytical queries and patient outcomes reporting.
04 Quality Frameworks
Implemented data quality frameworks using Python and Great Expectations for validation and compliance.
Healthcare Data Impact
Cloud Scaling
Leveraged Azure Databricks and AWS EMR to scale processing workloads and improve throughput for complex transformations.
BI Optimization
Developed optimized fact/dimension tables powering Power BI and Tableau dashboards for clinical analytics and executive KPI tracking.
Compliance & Governance
Developed automated audit logs and metadata lineage to support compliance audits and strengthen governance.
Performance Tuning
Optimized Spark performance through cluster tuning, partitioning strategies, and caching techniques, reducing compute overhead and improving SLA adherence for high-priority ETL workloads.
Education & Academic Background
1 Master's Degree - Computer Science
Texas A&M University-Kingsville
August 2023 - May 2025
Advanced studies in data engineering, algorithms, and distributed systems
2 Bachelor of Technology - Civil Engineering
R.V.R. & J.C. College of Engineering
2018 - 2022
Foundation in analytical thinking and problem-solving methodologies
Let's Build Something Together
Ready to Collaborate?
I'm passionate about transforming complex data challenges into scalable, efficient solutions. Whether you're looking to build robust data pipelines, optimize cloud infrastructure, or unlock insights from your data, let's connect.

Open to opportunities: Data Engineering | Cloud Architecture | ETL Development | Analytics Solutions