Reliability Engineering Specialist

1 week ago


Hyderabad, Telangana, India Thomson Reuters Full time

About the Role

In this opportunity as a Systems Reliability Engineer, you will:

  • Work with application teams to manage and support applications in production environments.
  • Develop and maintain a continuous improvement strategy for ongoing support models, including release and change management for strategic environments (production, non-production, etc.).
  • Provide well-written documentation and technical presentations on projects supported by the team.
  • Provide problem management services using diagnostic and debugging tools to aid in troubleshooting efforts, including 24x7 rotating pager support.
  • Coordinate the implementation of application monitoring, establish support documentation, and provide training on products and procedures.
  • Provide technical assistance on troubleshooting and performance tuning of supported environment(s).

About You

You're a fit for the role of Systems Reliability Engineer if your background includes:

  • 3-4 years of experience in an enterprise-level operations support, SRE, or DevOps role.
  • Working knowledge of infrastructure components, such as routers, load balancers, cloud products, container systems, compute, storage, and networks.
  • Expertise in observability and monitoring tools, like Datadog, AppDynamics, Splunk, etc.
  • Deep understanding of Application Performance Monitoring (APM) and User Monitoring.
  • Knowledge of Infrastructure as Code (IaC) tools such as AWS CloudFormation, Ansible, and Terraform, and the ability to apply cloud compliance standards to application design to achieve reliability.
  • Experience in site reliability engineering in .NET, Java, Kubernetes, and Database platforms (like Postgres).
  • Experience with load balancers, networking concepts, and AWS services such as ECS, EMR, Step Functions (state machines), CloudFormation, CloudWatch, Lambda, SQS, ECR, Fargate, and Elasticsearch.
  • Sound knowledge of IT Service Management (ITSM) processes, Service Level Objectives (SLOs), Service Level Agreements (SLAs), incident resolution, and automation techniques.
  • Ability to analyze application and server logs and interpret errors.
  • Experience with incident response and recovery: responding to incidents and implementing processes for incident response, monitoring, and automated recovery.
  • Scripting knowledge in PowerShell, Bash, or other shell languages.
  • Ability to code in at least one programming language (Java, C#, Python, JavaScript, etc.).
  • Working knowledge of ITIL Change and Incident Management processes.
  • Excellent written and verbal communication skills and strong collaboration skills.

Salary

The estimated annual salary for this position is between $120,000 and $180,000, depending on location and experience.

Benefits

We offer a comprehensive benefits package, including:

  • A hybrid work model with flexible scheduling and remote work options.
  • Comprehensive health insurance plans.
  • A 401(k) plan with company match.
  • A generous paid time off policy.
  • A range of professional development opportunities.
  • A commitment to diversity, equity, and inclusion.

About Us

We are a global news and media organization dedicated to serving our customers and the public interest. We believe in the importance of transparency, accountability, and fairness in business and society.

Location

This position can be located in various cities across the United States and globally, depending on the company's needs.



  • Hyderabad, Telangana, India Talent500 Full time

    About the Role: We are seeking an experienced Cloud Reliability Engineering Specialist to join our team at FedEx ACC. As a Cloud Reliability Engineer, you will play a critical role in ensuring the scalability, performance, and reliability of our cloud-based applications.


  • Hyderabad, Telangana, India Oracle Full time

    Job Description: We are seeking an experienced Reliability Engineering Specialist to join our team at Oracle. About the Role: This is a key position that will play a crucial role in defining and developing software for tasks associated with the development, design, and debugging of software applications or operating systems. You will be responsible for managing...


  • Hyderabad, Telangana, India FedEx ACC Full time

    About FedEx ACC: We are a leading company in the logistics industry, known for our reliability and efficiency. Salary Range: $120,000 - $180,000 per year. Job Description: A Cloud Systems Reliability Specialist is responsible for ensuring the scalability, performance, and reliability of large-scale cloud-based applications. They combine software engineering...


  • Hyderabad, Telangana, India FedEx ACC Full time

    About FedEx ACC India: As a strategic technology division, we develop innovative solutions for customers and team members worldwide. Our goal is to enhance productivity, minimize expenses, and update our technology infrastructure to deliver exceptional customer experiences. A Site Reliability Engineer (SRE) combines software engineering and Cloud capabilities...


  • Hyderabad, Telangana, India GMR Group Full time

    Job Overview: The GMR Group is seeking a highly skilled Maintenance Reliability Specialist to join our team. This key role will be responsible for developing and implementing maintenance programs to improve equipment reliability and minimize downtime.


  • Hyderabad, Telangana, India F5 Full time

    F5 is a leading provider of digital transformation solutions. Our teams empower organizations to create, secure, and run applications that enhance the digital experience. We are passionate about cybersecurity, from protecting consumers to enabling companies to focus on innovation. Our culture centers around people, prioritizing diversity and individual...


  • Hyderabad, Telangana, India Tata Consultancy Services Full time

    Are you passionate about data recovery and eager to work with a leading company in the industry? Tata Consultancy Services is currently seeking a skilled Reliable Data Recovery Specialist to join our team. Job Overview: We are looking for an experienced professional who can provide reliable data recovery services for our clients. As a Reliable Data Recovery...


  • Hyderabad, Telangana, India Tanla Platforms Limited Full time

    About the Role: As a Site Reliability Engineer, you will play a pivotal role in ensuring the availability, scalability, and reliability of our platforms and applications. Your expertise will be instrumental in maintaining optimal system uptime and preventing performance issues. Key Responsibilities: Build and Maintain Scalable Deployments: Design, implement, and...


  • Hyderabad, Telangana, India Arcesium Full time

    Company Overview: Arcesium is a global financial technology firm that solves data-driven challenges faced by sophisticated financial institutions. Our platform and capabilities continuously innovate to meet tomorrow's challenges, anticipate risks, and design advanced solutions for transformational business outcomes. We value intellectual curiosity, proactive...


  • Hyderabad, Telangana, India Live Connections Full time

    We are looking for a highly skilled Site Reliability Engineering Lead to join our team at Live Connections in Hyderabad. As a key member of our organization, you will be responsible for leading and managing a team of engineers to ensure the reliability, scalability, and performance of our systems. Estimated Salary: ₹25,00,000 - ₹35,00,000 per...


  • Hyderabad, Telangana, India Truetech Full time

    Job Summary: As a Cloud Reliability Engineer at TrueTech, you will lead and manage a team of Site Reliability Engineers, providing mentorship, guidance, and support to ensure the team's success. You will also develop and implement strategies for improving system reliability, scalability, and performance. Establish and enforce SRE best practices, including...


  • Hyderabad, Telangana, India Ideagen Full time

    About the Role: We are seeking a highly skilled Senior Site Reliability Engineer to join our team as a Monitoring & Observability Lead. As a key member of our infrastructure monitoring team, you will play a critical role in ensuring the optimal performance and reliability of our SaaS infrastructure across a multi-cloud environment. As a Monitoring &...


  • Hyderabad, Telangana, India RiskInsight Consulting Pvt Ltd Full time

    Job Title: Site Reliability Engineer. We are seeking a highly skilled Site Reliability Engineer to join our team at RiskInsight Consulting Pvt Ltd. As a Site Reliability Engineer, you will be responsible for ensuring the smooth operation of our banking applications and infrastructure. Key Responsibilities: Manage a 24/7 production support team in the Banking...


  • Hyderabad, Telangana, India SID Global Solutions Full time

    At SID Global Solutions, we are seeking a highly motivated and detail-oriented Site Reliability Engineer to join our team. As an ideal candidate, you will have a strong passion for system reliability, automation, and incident response. About the Role: This is an entry-level position that offers a unique opportunity for professional growth and development. You...

  • Reliability Engineer

    3 weeks ago


    Hyderabad, Telangana, India Unison Consulting Pte Ltd Full time

    Job Description: We are seeking a highly skilled Site Reliability Engineer to join our team at Unison Consulting Pte Ltd. As a key member of our infrastructure team, you will be responsible for ensuring the high availability and performance of our applications. About the Role: The ideal candidate will have a minimum of 5-7 years' experience as a Site Reliability...


  • Hyderabad, Telangana, India Ideagen Full time

    About Us: Ideagen is a global leader in software solutions, empowering organizations to achieve their safety and quality goals. Our innovative products and services help ensure the reliability and performance of mission-critical systems. As a Monitoring and Observability Lead, you will play a critical role in shaping our SaaS infrastructure to meet the evolving...


  • Hyderabad, Telangana, India Live Connections Full time

    About Live Connections: We're a cutting-edge technology firm dedicated to delivering innovative solutions. Our team is passionate about crafting exceptional products that drive business success. Job Description: System Reliability Engineer Manager. This role offers an exciting opportunity to lead our site reliability engineering team, driving strategies for...


  • Hyderabad, Telangana, India Tech Mahindra Full time

    Job Title: Cloud Data Engineer Specialist. We are seeking an experienced Cloud Data Engineer Specialist to join our team at Tech Mahindra. This role involves designing, building, and maintaining large-scale data processing systems on cloud platforms like Azure. About the Role: As a Cloud Data Engineer Specialist, you will be responsible for developing and...


  • Hyderabad, Telangana, India Live Connections Full time

    We are seeking an experienced Site Reliability Engineering Team Lead to join our team at Live Connections in Hyderabad. About the Role: This is a leadership position that requires a strong technical background in site reliability engineering and experience in managing teams. The ideal candidate will have a proven track record of driving projects to successful...


  • Hyderabad, Telangana, India Capgemini Engineering Full time

    Job Title: Embedded Linux Kernel/Device Drivers Specialist. About the Role: We are seeking an experienced Embedded Linux Kernel/Device Drivers specialist to join our team at Capgemini Engineering in Hyderabad. The successful candidate will be responsible for developing and porting embedded software on Linux and ARM platforms. Responsibilities: Develop and...