Reliability Engineer

1 day ago


Pune, Maharashtra, India Roche Full time

Job Overview

We are seeking an experienced Reliability Engineer to join our team at Roche. In this role, you will be responsible for designing, implementing, and maintaining site reliability engineering (SRE) practices that ensure the reliability and performance of our production systems.

Key Responsibilities:

  • Design and implement SRE practices that align with our company's overall reliability and performance goals.
  • Develop and maintain automated monitoring and alerting systems to proactively identify and address potential issues.
  • Implement incident response procedures to effectively resolve incidents and minimize downtime.
  • Collaborate with developers and other engineers to define and implement service level agreements (SLAs).
  • Conduct regular reviews of SRE practices to ensure they remain effective and aligned with evolving needs.
  • Monitor and troubleshoot production systems to identify and resolve issues before they impact users.
  • Continuously monitor production systems for performance degradation, potential failures, and security vulnerabilities.
  • Thoroughly investigate and troubleshoot incidents to identify the root cause and implement corrective actions.
  • Proactively identify potential issues by analyzing system logs, metrics, and trends.
  • Collaborate with developers and other engineers to implement workarounds and fixes for identified issues.
  • Document incident investigations and corrective actions to prevent recurrence and improve future troubleshooting efforts.
  • Develop and implement automated monitoring and alerting systems.
  • Design and implement automated monitoring systems to collect and analyze real-time data from production systems.
  • Configure alerting systems to notify appropriate personnel of potential issues or performance deviations.
  • Continuously evaluate and improve the effectiveness of automated monitoring and alerting systems.
  • Automate repetitive tasks to improve efficiency and reduce manual intervention.
  • Collaborate with developers and other engineers to design and implement new features and infrastructure changes.
  • Work closely with developers to understand the impact of new features and code changes on system reliability and performance.
  • Provide guidance and recommendations to developers on SRE best practices and design for reliability.
  • Participate in code reviews to identify potential reliability issues and suggest improvements.
  • Collaborate with infrastructure engineers to ensure that new infrastructure components are designed and deployed with reliability and performance in mind.
  • Stay up-to-date on the latest technologies and trends in SRE and DevOps to contribute to continuous innovation and improvement.
  • Prepare and deliver technical presentations and documentation.
  • Prepare and deliver technical presentations to share SRE best practices, incident investigations, and lessons learned.
  • Document SRE practices, procedures, and guidelines to ensure knowledge transfer and consistency.
  • Contribute to internal documentation and knowledge bases to aid troubleshooting and problem-solving.
  • Present findings and recommendations to management and stakeholders to inform decision-making processes.

Required Qualifications:

  • 3 – 6 years of relevant experience
  • Proven hands-on software/application support experience, with Cloud as the main technology area.
  • Troubleshooting skills and the ability to delve deeply into technical details and acquire or create the knowledge necessary to effectively troubleshoot and repair applications.
  • Knowledge of Splunk, VictorOps, AppDynamics, and web automation tools such as Selenium, and the ability to learn new tools and technologies.
  • Collaborative team player with excellent influence and interpersonal skills; inspires confidence.
  • Experience with public Cloud providers, including Amazon Web Services architecture, tools, and Cloud methodologies.
  • Proven ability to design, implement, and maintain SRE practices that ensure system reliability and performance.
  • Experience in monitoring and troubleshooting production systems to identify and resolve incidents.
  • Familiarity with automated monitoring and alerting systems, including tool selection, configuration, and maintenance.
  • Experience collaborating with developers and other engineers to design, implement, and operate reliable systems.
  • Excellent written and verbal communication
  • Exposure to handling customers from various geographies
  • Ability to work with minimum supervision
  • Team player who shares ideas and resources
  • Flexibility to work in shifts or weekends as per schedule

Compensation Package:

The salary range for this position is $120,000-$180,000 per year, depending on experience.

About Roche:

We are a global pharmaceutical company dedicated to improving human life through science and innovation. Our mission is to reimagine medicine and redefine what it means to care.


  • Reliability Engineer

    3 weeks ago


    Pune, Maharashtra, India F337 Deutsche India Private Limited, Pune Branch Full time

    About the Role: F337 Deutsche India Private Limited, Pune Branch is seeking a skilled Reliability Engineer to join our team. As a key member of our Infrastructure team, you will be responsible for ensuring the reliability, scalability, and performance of our systems. Your Key Responsibilities: Collaborate with cross-functional teams to design, build, and maintain...


  • Pune, Maharashtra, India Red Hat India Private Limited Full time

    Red Hat India Private Limited is seeking a highly skilled Cloud Reliability Engineer to join its team. As a Cloud Reliability Engineer, you will be responsible for developing, scaling, and operating our OpenShift managed cloud services. Company Overview: Founded in 1993, Red Hat is the world's leading provider of enterprise software solutions. Our...


  • Pune, Maharashtra, India Roche Full time

    About the Role: We are seeking a highly skilled Site Reliability Engineer to join our team at Roche. As a Site Reliability Engineer, you will be responsible for designing, implementing, and maintaining site reliability engineering practices that ensure the reliability and performance of our production systems. Key Responsibilities: Design and implement SRE...

  • Reliability Engineer

    4 weeks ago


    Pune, Maharashtra, India LTIMindtree Full time

    About the Role: We are seeking a highly skilled Site Reliability Engineer to join our team at LTIMindtree. As a key member of our DevOps team, you will play a vital role in ensuring the reliability, scalability, and performance of our mission-critical services. Responsibilities: Engage in the entire lifecycle of services, from inception and design to...


  • Pune, Maharashtra, India People First Consultants Full time

    At People First Consultants, we're seeking a Cloud Reliability Engineer to join our team and play a key role in ensuring the reliability, efficiency, and performance of our applications meet our customers' needs. Responsibilities: Collaborate with development teams and other partner teams to ensure application reliability, efficiency, and performance meet...


  • Pune, Maharashtra, India Tata Technologies Full time

    Job Summary: We are seeking a skilled Reliability Assurance Engineer to join our team at Tata Technologies. As a key member of our engineering team, you will be responsible for managing the program Design Verification & Reporting (DVP&R) document and developing Verification & Validation plans. The ideal candidate will have a strong understanding of accelerated...


  • Pune, Maharashtra, India People First Consultants Full time

    Job Description: We are seeking a highly skilled Site Reliability Engineer to join our team at People First Consultants. About the Role: The successful candidate will be responsible for ensuring the reliability, scalability, and performance of our applications and systems. This will involve working closely with development teams to identify and resolve issues,...


  • Pune, Maharashtra, India PubMatic Full time

    Job Title: Site Reliability Engineer. PubMatic, a leading technology company, is seeking a skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will play a critical role in ensuring the seamless operation and optimal performance of our large-scale distributed software applications. Key Responsibilities: Monitor and analyze...


  • Pune, Maharashtra, India Coupa Software Full time

    About the Role: Coupa Software is seeking a skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will play a critical role in building and maintaining the technologies on our Coupa Cloud platform. Responsibilities: Design and develop scalable, reliable, and secure cloud-based systems. Work closely with cross-functional teams to...


  • Pune, Maharashtra, India F337 Deutsche India Private Limited, Pune Branch Full time

    About the Role: We are seeking a highly skilled Site Reliability Engineer to join our team at Deutsche Bank's Corporate Bank division. As a key member of our agile delivery team, you will play a pivotal role in ensuring the reliability, scalability, and performance of our systems. Your Key Responsibilities: Design, build, and maintain robust and efficient...


  • Pune, Maharashtra, India Hansen Technologies Full time

    About the Role: We are seeking a skilled Site Reliability Engineer to join our team in Pune, India. As a key member of our technical operations team, you will play a crucial role in ensuring the reliability, performance, and scalability of our systems. About You: We are looking for a highly motivated and experienced Site Reliability Engineer who is passionate...


  • Pune, Maharashtra, India Fulcrum Digital Full time

    About Fulcrum Digital: We are a dynamic company seeking an experienced Senior Reliability Engineer to join our team. As a key contributor, you will play a pivotal role in ensuring the reliability, scalability, and performance of our critical systems. Our company culture emphasizes collaboration and innovation. You will work closely with development,...


  • Pune, Maharashtra, India Practicology Full time

    Job Overview: We are seeking a highly skilled Cloud Reliability Engineer to join our team. As a Cloud Reliability Engineer, you will be responsible for designing, building, and maintaining scalable and reliable infrastructure in AWS (Postgres, Redis, Docker, Queues, Kinesis Streams, S3, etc.). Key Responsibilities: Infrastructure and Automation: Design and build...


  • Pune, Maharashtra, India Fulcrum Digital Full time

    About the Role: Fulcrum Digital is seeking a skilled System Reliability Engineer to join our team. As a System Reliability Engineer, you will be responsible for designing, implementing, and enhancing our deployment automation based on Chef. Your goal will be to develop a reliable and efficient release and deployment process. Key Responsibilities: Design and...

  • Reliability Engineer

    3 weeks ago


    Pune, Maharashtra, India Tata Technologies Full time

    Job Description: Duties & Responsibilities: As a key member of our team at Tata Technologies, you will be responsible for managing the program Design Verification & Reporting (DVP&R) document and developing a Verification & Validation plan that includes timing, test part allocation, and test part build. Additionally, you will develop, manage, and execute the...


  • Pune, Maharashtra, India Virtusa Full time

    Job Title: Cloud Reliability Engineer. Virtusa is seeking a skilled Cloud Reliability Engineer to join our team. The ideal candidate will have a minimum of 5 years of experience in SRE, focusing on integration platforms and cloud-based deployments. Strong programming skills, particularly in integration tier and middleware, are essential. Experience with...


  • Pune, Maharashtra, India Tata Consultancy Services Full time

    Tata Consultancy Services is a global leader in the technology arena, and we are growing together. Job Requirements: Role: Site Reliability Engineer. Experience Range: 8 – 12 Years. Location: Bangalore/Chennai/Pune/Delhi. Essential Skills: Exceptional skills in Docker/Kubernetes deployment and configuration, scaling, and management of containerized...


  • Pune, Maharashtra, India Fulcrum Digital Full time

    About the Role: Fulcrum Digital is an agile and next-generation digital accelerating company providing digital transformation and technology services right from ideation to implementation. Our team is looking for a highly skilled System Reliability Engineer to plan, manage, and oversee all aspects of a Production Environment for Big Data Platforms.

  • Reliability Engineer

    1 month ago


    Pune, Maharashtra, India Fulcrum Digital Full time

    About the Role: We are seeking a highly skilled System Reliability Engineer to join our team at Fulcrum Digital. As a key member of our digital transformation and technology services team, you will play a critical role in ensuring the smooth operation of our Production Environment Java, J2EE, and Spring Boot applications. Key Responsibilities: Plan, manage, and...

  • Reliability Engineer

    3 weeks ago


    Pune, Maharashtra, India Global Payments Asia-Pacific India Private Limited Full time

    Overview: As a Site Reliability Engineer at Global Payments, you will be responsible for ensuring the availability, latency, performance, efficiency, change management, monitoring, emergency response, and capacity planning of our systems. You will create a bridge between development and operations by applying a software engineering mindset to system...