Reliability Engineering Specialist
1 week ago
About the Role
In this opportunity as Systems Reliability Engineer, you will:
- Work with application teams to manage and support applications in production environments.
- Develop and maintain a continuous improvement strategy for ongoing support models, including release and change management for strategic environments (production, non-production, etc.).
- Provide well-written documentation and technical presentations on projects supported by the team.
- Provide problem management services using diagnostic and debugging tools to aid in troubleshooting efforts, including 24x7 rotating pager support.
- Coordinate the implementation of application monitoring, establish support documentation, and provide training on products and procedures.
- Provide technical assistance on troubleshooting and performance tuning of supported environment(s).
About You
You're a fit for the role of Systems Reliability Engineer if your background includes:
- 3-4 years of experience in an enterprise-level operations support, SRE, or DevOps role.
- Working knowledge of infrastructure components, such as routers, load balancers, cloud products, container systems, compute, storage, and networks.
- Expertise in observability and monitoring tools, like Datadog, AppDynamics, Splunk, etc.
- Deep understanding of Application Performance Monitoring (APM) and User Monitoring.
- Knowledge of Infrastructure as Code (IaC): AWS CloudFormation, Ansible, Terraform, etc., and the ability to apply cloud compliance standards to application design to achieve reliability.
- Experience in site reliability engineering in .NET, Java, Kubernetes, and Database platforms (like Postgres).
- Experience with load balancers, networking concepts, and AWS services such as ECS, EMR, State Machines/Step Functions, CloudFormation, CloudWatch, Lambda, SQS, ECR, Fargate, and Elasticsearch.
- Sound knowledge of IT Service Management (ITSM) processes, Service Level Objectives (SLOs), Service Level Agreements (SLAs), incident resolution, and automation techniques.
- Ability to analyze application and server logs and interpret errors.
- Experience in incident response and recovery, including implementing processes for incident response, monitoring, and automated recovery.
- Scripting knowledge in PowerShell, Bash, or other shell scripting languages.
- Ability to code in at least one programming language (Java, C#, Python, JavaScript, etc.).
- Working knowledge of ITIL Change and Incident Management processes.
- Excellent written and verbal communication skills and strong collaboration skills.
Salary
The estimated annual salary for this position is between $120,000 and $180,000, depending on location and experience.
Benefits
We offer a comprehensive benefits package, including:
- A hybrid work model with flexible scheduling and remote work options.
- Comprehensive health insurance plans.
- A 401(k) plan with company match.
- A generous paid time off policy.
- A range of professional development opportunities.
- A commitment to diversity, equity, and inclusion.
About Us
We are a global news and media organization dedicated to serving our customers and the public interest. We believe in the importance of transparency, accountability, and fairness in business and society.
Location
This position can be located in various cities across the United States and globally, depending on the company's needs.
-
Cloud Reliability Engineering Specialist
7 days ago
Hyderabad, Telangana, India Talent500 Full time
About the Role
We are seeking an experienced Cloud Reliability Engineering Specialist to join our team at FedEx ACC. As a Cloud Reliability Engineer, you will play a critical role in ensuring the scalability, performance, and reliability of our cloud-based applications.
-
Reliability Engineering Specialist
2 weeks ago
Hyderabad, Telangana, India Oracle Full time
Job Description
We are seeking an experienced Reliability Engineering Specialist to join our team at Oracle.
About the Role
This is a key position that will play a crucial role in defining and developing software for tasks associated with the development, design, and debugging of software applications or operating systems. You will be responsible for managing...
-
Cloud Systems Reliability Specialist
2 weeks ago
Hyderabad, Telangana, India FedEx ACC Full time
About FedEx ACC
We are a leading company in the logistics industry, known for our reliability and efficiency.
Salary Range
$120,000 - $180,000 per year
Job Description
A Cloud Systems Reliability Specialist is responsible for ensuring the scalability, performance, and reliability of large-scale cloud-based applications. They combine software engineering...
-
Reliability Engineering Specialist
7 days ago
Hyderabad, Telangana, India FedEx ACC Full time
About FedEx ACC India:
As a strategic technology division, we develop innovative solutions for customers and team members worldwide. Our goal is to enhance productivity, minimize expenses, and update our technology infrastructure to deliver exceptional customer experiences. A Site Reliability Engineer (SRE) combines software engineering and Cloud capabilities...
-
Maintenance Reliability Specialist
3 weeks ago
Hyderabad, Telangana, India GMR Group Full time
Job Overview
The GMR Group is seeking a highly skilled Maintenance Reliability Specialist to join our team. This key role will be responsible for developing and implementing maintenance programs to improve equipment reliability and minimize downtime.
-
Reliability Engineering Specialist
3 weeks ago
Hyderabad, Telangana, India F5 Full time
F5 is a leading provider of digital transformation solutions. Our teams empower organizations to create, secure, and run applications that enhance the digital experience. We are passionate about cybersecurity, from protecting consumers to enabling companies to focus on innovation. Our culture centers around people, prioritizing diversity and individual...
-
Reliable Data Recovery Specialist
3 days ago
Hyderabad, Telangana, India Tata Consultancy Services Full time
Are you passionate about data recovery and eager to work with a leading company in the industry? Tata Consultancy Services is currently seeking a skilled Reliable Data Recovery Specialist to join our team.
Job Overview:
We are looking for an experienced professional who can provide reliable data recovery services for our clients. As a Reliable Data Recovery...
-
Reliability Engineer
4 days ago
Hyderabad, Telangana, India Tanla Platforms Limited Full time
About the Role
As a Site Reliability Engineer, you will play a pivotal role in ensuring the availability, scalability, and reliability of our platforms and applications. Your expertise will be instrumental in maintaining optimal system uptime and preventing performance issues.
Key Responsibilities:
Build and Maintain Scalable Deployments: Design, implement, and...
-
Senior Reliability Engineer
3 weeks ago
Hyderabad, Telangana, India Arcesium Full time
Company Overview
Arcesium is a global financial technology firm that solves data-driven challenges faced by sophisticated financial institutions. Our platform and capabilities continuously innovate to meet tomorrow's challenges, anticipate risks, and design advanced solutions for transformational business outcomes. We value intellectual curiosity, proactive...
-
Site Reliability Engineering Lead
3 days ago
Hyderabad, Telangana, India Live Connections Full time
We are looking for a highly skilled Site Reliability Engineering Lead to join our team at Live Connections in Hyderabad. As a key member of our organization, you will be responsible for leading and managing a team of engineers to ensure the reliability, scalability, and performance of our systems.
Estimated Salary: ₹25,00,000 - ₹35,00,000 per...
-
TrueTech - Cloud Reliability Engineer
3 weeks ago
Hyderabad, Telangana, India Truetech Full time
Job Summary:
As a Cloud Reliability Engineer at TrueTech, you will lead and manage a team of Site Reliability Engineers, providing mentorship, guidance, and support to ensure the team's success. You will also develop and implement strategies for improving system reliability, scalability, and performance.
Establish and enforce SRE best practices, including...
-
Site Reliability Engineering Team Lead
2 weeks ago
Hyderabad, Telangana, India Ideagen Full time
About the Role
We are seeking a highly skilled Senior Site Reliability Engineer to join our team as a Monitoring & Observability Lead. As a key member of our infrastructure monitoring team, you will play a critical role in ensuring the optimal performance and reliability of our SaaS infrastructure across a multi-cloud environment. As a Monitoring &...
-
Site Reliability Engineer
2 months ago
Hyderabad, Telangana, India RiskInsight Consulting Pvt Ltd Full time
Job Title: Site Reliability Engineer
We are seeking a highly skilled Site Reliability Engineer to join our team at RiskInsight Consulting Pvt Ltd. As a Site Reliability Engineer, you will be responsible for ensuring the smooth operation of our banking applications and infrastructure.
Key Responsibilities:
Manage a 24/7 production support team in the Banking...
-
Site Reliability Engineer
3 weeks ago
Hyderabad, Telangana, India SID Global Solutions Full time
At SID Global Solutions, we are seeking a highly motivated and detail-oriented Site Reliability Engineer to join our team. As an ideal candidate, you will have a strong passion for system reliability, automation, and incident response.
About the Role:
This is an entry-level position that offers a unique opportunity for professional growth and development. You...
-
Reliability Engineer
3 weeks ago
Hyderabad, Telangana, India Unison Consulting Pte Ltd Full time
Job Description
We are seeking a highly skilled Site Reliability Engineer to join our team at Unison Consulting Pte Ltd. As a key member of our infrastructure team, you will be responsible for ensuring the high availability and performance of our applications.
About the Role
The ideal candidate will have a minimum of 5-7 years' experience as a Site Reliability...
-
Site Reliability Engineer
1 month ago
Hyderabad, Telangana, India Ideagen Full time
About Us
Ideagen is a global leader in software solutions, empowering organizations to achieve their safety and quality goals. Our innovative products and services help ensure the reliability and performance of mission-critical systems. As a Monitoring and Observability Lead, you will play a critical role in shaping our SaaS infrastructure to meet the evolving...
-
Chief Site Reliability Engineering Lead
7 days ago
Hyderabad, Telangana, India Live Connections Full time
About Live Connections
We're a cutting-edge technology firm dedicated to delivering innovative solutions. Our team is passionate about crafting exceptional products that drive business success.
Job Description: System Reliability Engineer Manager
This role offers an exciting opportunity to lead our site reliability engineering team, driving strategies for...
-
Cloud Data Engineer Specialist
3 days ago
Hyderabad, Telangana, India Tech Mahindra Full time
Job Title: Cloud Data Engineer Specialist
We are seeking an experienced Cloud Data Engineer Specialist to join our team at Tech Mahindra. This role involves designing, building, and maintaining large-scale data processing systems on cloud platforms like Azure.
About the Role:
As a Cloud Data Engineer Specialist, you will be responsible for developing and...
-
Site Reliability Engineering Team Lead
3 days ago
Hyderabad, Telangana, India Live Connections Full time
We are seeking an experienced Site Reliability Engineering Team Lead to join our team at Live Connections in Hyderabad.
About the Role
This is a leadership position that requires a strong technical background in site reliability engineering and experience in managing teams. The ideal candidate will have a proven track record of driving projects to successful...
-
Hyderabad, Telangana, India Capgemini Engineering Full time
Job Title: Embedded Linux Kernel/Device Drivers Specialist
About the Role:
We are seeking an experienced Embedded Linux Kernel/Device Drivers specialist to join our team at Capgemini Engineering in Hyderabad. The successful candidate will be responsible for developing and porting embedded software on Linux and ARM platforms.
Responsibilities:
Develop and...