Data Engineer – Cloud Integration

5 days ago


Agra, Uttar Pradesh, India · Boston Insights · Full time

Company Overview

Boston Insights is an innovative startup creating competitive advantage for pharmaceutical companies by unlocking their clinical supply chain data and enabling end-to-end visibility. We strengthen risk resiliency and agility to ensure the uninterrupted, on-time supply of investigational drugs to patients. Our mission is to transform how pharmaceutical companies manage their clinical supply chains through cutting-edge data solutions.

Position Overview

We are seeking a Data Engineer with 5+ years of experience, deep expertise in the Microsoft Azure technology stack, and a proven ability to integrate data from external data lakes, AWS data warehouses, and enterprise supply chain solutions such as SAP. The ideal candidate will also have strong experience in data governance and in building data automation tools using Python and related languages.

Key Responsibilities

Data Integration & Pipeline Development

  • Design, build, and maintain scalable data pipelines using Azure Data Factory, Azure Synapse Analytics, and Azure Data Lake.
  • Lead the integration strategy for ingesting and harmonizing data from external sources, including AWS-based data lakes and warehouses (such as S3 and Redshift) and SAP systems.
  • Automate ETL processes for data extraction, transformation, and loading across hybrid and multi-cloud environments.
  • Build and maintain real-time and batch data integration workflows between Azure, AWS, and on-premises sources (a brief sketch follows this list).
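
For illustration only, a batch ingestion step of this kind might look like the minimal PySpark sketch below. The bucket, storage account, container, and column names are hypothetical, and it assumes the cluster already has credentials configured for both S3 and ADLS Gen2.

# Minimal sketch (hypothetical names): copy shipment records from AWS S3
# into Azure Data Lake Storage Gen2 with light harmonization.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("s3_to_adls_shipments").getOrCreate()

# Read raw shipment records exported to S3 (path is a placeholder).
raw = spark.read.parquet("s3a://example-supply-chain-bucket/shipments/")

# Deduplicate and stamp the ingestion date before landing the data in Azure.
clean = (
    raw.dropDuplicates(["shipment_id"])
       .withColumn("ingest_date", F.current_date())
)

# Write to an ADLS Gen2 container (account and container are placeholders).
clean.write.mode("append").partitionBy("ingest_date").parquet(
    "abfss://raw@examplelake.dfs.core.windows.net/shipments/"
)

In practice, a step like this would typically be orchestrated by Azure Data Factory or a Synapse pipeline rather than run standalone.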

Data Architecture & Infrastructure

  • Design and implement data lake and data warehouse solutions on the Azure platform.
  • Establish data governance frameworks and ensure data quality across all pipelines.
  • Implement security best practices for handling sensitive pharmaceutical data.
  • Create and maintain data documentation and lineage tracking.

Data Governance & Quality

  • Define and enforce data governance frameworks covering data cataloging, lineage, quality, privacy, and compliance.
  • Implement robust data validation, cleansing, and monitoring systems to ensure accuracy and reliability.
  • Support security standards through effective data management practices.

Automation & Tooling

  • Develop data automation tools and reusable components using Python, PySpark, and other relevant frameworks and languages (a brief sketch of such a component follows this list).
  • Enable end-to-end process automation for data ingestion, processing, and reporting.
  • Implement CI/CD processes for data solutions, including testing, monitoring, and alerting.
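
As a rough illustration of such a reusable component, the sketch below shows a small PySpark data-quality helper; the function and column names are hypothetical.

# Minimal sketch (hypothetical names): a reusable completeness check that a
# pipeline can call to fail fast when required columns contain nulls.
from pyspark.sql import DataFrame
from pyspark.sql import functions as F


def count_nulls(df: DataFrame, required_columns: list) -> dict:
    """Return the number of null values in each required column."""
    row = df.select(
        [F.count(F.when(F.col(c).isNull(), c)).alias(c) for c in required_columns]
    ).collect()[0]
    return {c: row[c] for c in required_columns}


def assert_complete(df: DataFrame, required_columns: list) -> None:
    """Raise an error if any required column has nulls, failing the pipeline run."""
    offenders = {c: n for c, n in count_nulls(df, required_columns).items() if n > 0}
    if offenders:
        raise ValueError(f"Completeness check failed: {offenders}")

Helpers like this are typically packaged as a shared library and invoked from Databricks or Data Factory activities, so failures surface through the pipeline's normal monitoring and alerting.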

Analytics & Reporting Support

  • Collaborate with data scientists and analysts to support advanced analytics.
  • Build data models that enable risk assessment and supply chain optimization.
  • Develop APIs and data services to support front-end applications (a brief sketch follows this list).
  • Create monitoring and alerting systems for data pipeline health.
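
Purely as an illustration of such a data service, the sketch below uses FastAPI to expose pipeline-health information to a front-end dashboard; the endpoint, model, and stubbed run data are hypothetical.

# Minimal sketch (hypothetical names): a small FastAPI service exposing
# pipeline-health data that a front-end dashboard could consume.
from datetime import datetime, timezone

from fastapi import FastAPI, HTTPException
from pydantic import BaseModel

app = FastAPI(title="Pipeline Health Service")


class PipelineStatus(BaseModel):
    pipeline: str
    last_success: datetime
    healthy: bool


# Stub: a real service would query run metadata from Azure Data Factory's
# monitoring APIs or a dedicated status table instead of this dictionary.
LAST_RUNS = {
    "s3_to_adls_shipments": datetime(2024, 1, 15, 6, 0, tzinfo=timezone.utc),
}


@app.get("/pipelines/{name}/status", response_model=PipelineStatus)
def pipeline_status(name: str) -> PipelineStatus:
    last = LAST_RUNS.get(name)
    if last is None:
        raise HTTPException(status_code=404, detail="Unknown pipeline")
    age_hours = (datetime.now(timezone.utc) - last).total_seconds() / 3600
    return PipelineStatus(pipeline=name, last_success=last, healthy=age_hours < 24)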

Collaboration & Support

  • Partner with supply chain, analytics, and business stakeholders to understand their requirements and translate them into scalable technical solutions.
  • Collaborate with SAP functional and technical teams to optimize data extraction and synchronization.

Required Technical Qualifications

  • 5+ years of professional data engineering experience.
  • Data Integration: Proven track record integrating data from AWS services (S3, Redshift, Glue, etc.) into Azure or other cloud environments.
  • Azure Data Services: Expert-level knowledge of Azure Data Factory, Azure Synapse Analytics, Azure Databricks, Azure Data Lake Storage, Azure SQL Database, and Apache Spark.
  • Database Technologies: Strong knowledge of both relational (SQL Server, PostgreSQL) and NoSQL (Cosmos DB) databases.
  • Programming Languages: Proficiency in Python, SQL, PySpark, and PowerShell for data automation and wrangling.
  • Version Control: Proficiency with Git and Azure DevOps.
  • SAP Integration: Hands-on experience with SAP data models and integrating SAP data with an Azure data lake.

Preferred Technical Skills

  • Experience with API-based data integration for cloud and enterprise applications.
  • Experience with Infrastructure as Code (ARM templates, Terraform).
  • Familiarity with data quality tools, metadata management, and automated data lineage tracking.
  • Knowledge of containerization (Docker, Kubernetes) for data automation workflows.
  • Knowledge of machine learning pipelines and MLOps practices.
  • Experience with data visualization tools such as Power BI.

Professional Skills

  • Strong problem-solving and analytical thinking abilities.
  • Excellent communication skills, with the ability to explain technical concepts to non-technical stakeholders.
  • Experience with Agile development methodologies.
  • Attention to detail and commitment to data quality.

The Opportunity We Offer

  • Competitive salary commensurate with experience.
  • Professional development training and certifications.

Work Environment

  • Remote work arrangement with flexible hours.
  • State-of-the-art technology and tools.
  • Collaborative, innovation-driven culture.
  • Access to cutting-edge pharmaceutical industry data and challenges.
  • Opportunity to shape an innovative pharma analytics platform.

Application Instructions

Please submit your resume and a cover letter highlighting:

  • Experience in Azure, AWS data integration, and SAP data extraction.
  • Examples of data automation tools or frameworks you have developed.

Join us to unlock new possibilities in pharmaceutical supply chain data through advanced engineering and multi-cloud innovation.

Boston Insights is transforming pharmaceutical supply chains through innovative data solutions. Join us in ensuring that life-saving investigational drugs reach patients on time, every time.


