About me

More than six years of experience in data engineering and big data analytics, specializing in scalable data pipelines, enterprise data governance, and cloud-based analytics solutions. Expertise in the Hadoop ecosystem (Spark, Kafka, Hive), Azure (Data Factory, Databricks, Synapse), and AWS (Glue, Redshift, S3). Skilled in Python, SQL, SparkSQL, and Shell scripting for ETL automation and data transformation.

Proficient in Power BI, Tableau, Terraform, and Docker, with a strong focus on regulatory compliance (GDPR). Experienced in CI/CD pipelines, data modeling (star and snowflake schemas), and optimizing data warehouses for analytics. Collaborative team player who delivers business-driven solutions.

What I'm Doing

  • Data Governance

    Enterprise data governance, metadata management, and compliance frameworks (GDPR, HIPAA).

  • Data Tools

    Expertise in Collibra, Informatica, Alation, and Apache Atlas.

  • Big Data Technologies

    Skilled in Hadoop (HDFS, MapReduce), Apache Spark, and Hive.

  • Cloud Platforms

    Proficient in AWS (Lambda, SageMaker, Redshift, Glue) and Azure (Data Factory, Synapse, AI Studio).

  • Documentation & Communication

    Experienced in technical and business documentation, training delivery, and presentations.

  • Programming & Scripting

    Expertise in Python, SQL, and Shell scripting.

  • Infrastructure & IaC

    Proficient in Terraform, Docker, and Kubernetes.

Resume


Education

  1. Drexel University

    Aug 2017 – Jun 2025

    PhD in Computer Science (GPA: 3.9/4.0)

  2. Drexel University

    Aug 2015 – Aug 2017

    Master of Science in Computer Science (GPA: 3.9/4.0)

  3. COEP Technological University

    2002 – 2004

    Bachelor of Technology in Electronics & Telecommunication Engineering (GPA: 8.6/10.0)

Experience

  1. Amazon Web Services (AWS)

    Aug 2022 – Present

    • Designed and implemented data governance policies and standards, ensuring compliance with GDPR and internal regulations.
    • Created metadata management solutions to improve accessibility for technical and business stakeholders.
    • Developed AWS Glue-based ETL pipelines, reducing data processing times by 40%.
    • Built scalable Redshift data warehouses for analytics, optimizing query performance and storage.
    • Conducted training sessions on data governance frameworks, improving team alignment on best practices.
    • Automated infrastructure deployment using Terraform, reducing manual errors and enhancing consistency.
    • Optimized Lambda functions for real-time processing, ensuring data integrity across pipelines.
    • Utilized CloudWatch for real-time monitoring and troubleshooting of AWS services and workflows.
    • Designed and documented data pipelines to enable self-service analytics for non-technical teams.

  2. Amazon Web Services (AWS)

    Jun 2020 – Aug 2022

    • Created and managed S3-based data lakes to centralize and organize diverse datasets for analytics.
    • Leveraged Amazon Kinesis for real-time data streaming, ensuring timely insights for critical business operations.
    • Used Amazon EMR with PySpark to process large-scale data, improving performance for complex analytical workloads.
    • Developed Python scripts to integrate data from multiple sources into Amazon EMR, ensuring seamless data flow.
    • Built scalable, cost-efficient data lakes on S3 and processed them with AWS Glue and PySpark for advanced analytics and machine learning.
    • Collaborated with cross-functional teams to resolve AWS infrastructure issues, improving platform reliability.

  3. Mobileuc Technologies

    Jun 2013 – Jul 2015

    • Developed data governance frameworks and policies to standardize data management practices across the organization.
    • Built interactive Tableau dashboards to provide stakeholders with actionable insights and KPI tracking.
    • Automated ETL pipelines using SQL and Shell scripting, reducing manual intervention by 50%.
    • Designed and implemented data quality assurance processes, enhancing data accuracy and consistency.
    • Managed data ingestion workflows from diverse sources into HDFS for analysis and reporting.
    • Supported the integration of new systems into existing infrastructure with minimal downtime.
    • Conducted performance tuning for SQL queries, improving report generation speed by 30%.
    • Provided technical documentation on data governance processes to ensure alignment across teams.
    • Collaborated closely with the business analytics team to deliver solutions tailored to business needs.
    • Enhanced Tableau reports with dynamic filtering and real-time updates for better decision-making.

  4. Coriolis Technologies

    Jun 2011 – Jun 2013

    • Prepared and analyzed large datasets using SQL to derive insights and generate comprehensive reports.
    • Created technical documentation to guide data users on best practices for data governance and analytics.
    • Designed and maintained interactive dashboards using Tableau, improving data visualization.
    • Conducted data quality checks and implemented workflows to standardize incoming datasets.
    • Implemented process automation scripts in Python to streamline routine reporting tasks.
    • Collaborated with stakeholders to define reporting requirements and ensure data-driven decision-making.
    • Integrated diverse datasets from multiple sources into a centralized reporting framework.
    • Applied statistical methods to identify trends and anomalies, supporting operational improvements.
    • Reduced data processing times by 20% through workflow optimization and query enhancements.
    • Provided actionable insights from data analysis that contributed to cost savings and efficiency improvements.
