Site Reliability Engineer III

2 weeks ago


Hyderabad, India F5 Full time
At F5, we strive to bring a better digital world to life. Our teams empower organizations across the globe to create, secure, and run applications that enhance how we experience our evolving digital world. We are passionate about cybersecurity, from protecting consumers from fraud to enabling companies to focus on innovation.

Everything we do centers around people. That means we obsess over how to make the lives of our customers, and their customers, better. And it means we prioritize a diverse F5 community where each individual can thrive.

F5 leads the market in building products to make every app run faster, smarter, and safer anywhere. To support our growing business, we need to expand our organization by creating a new team in India specializing in site reliability for our NGINX services. In this role, the successful candidate will work closely with a global team that brings a software engineering and automated-solution mindset to its work.

The Site Reliability Engineer III will be responsible for ensuring the reliability, availability, and scalability of critical NGINX systems and SaaS platforms. Systems under the care of a Site Reliability Engineer III must operate effectively and reliably through scalable builds and deployments, frequent releases, and complex architectures that encompass modern technologies. You will work closely with technical and non-technical teams throughout the organization to facilitate the design and implementation of scalable solutions, drive automation initiatives, and monitor and maintain the performance of critical NGINX systems.

We are looking for someone who has:

Multi-cloud experience, both public and private cloud.

Strong knowledge of continuous delivery, testing, security practices, performance, and disaster recovery.

Experience supporting mission-critical, customer-facing systems in production environments, including incident management and response.

Responsibilities:

Collaborate with developers to promote the concept of reliability engineering during all phases of the SDLC to detect and correct performance issues early in the lifecycle.

Scope tooling and automation, monitoring, workflow management, maintaining and improving data pipelines, CI/CD, etc. Assess gaps in as-is monitoring tool capabilities and develop automated solutions to support the production infrastructure.

Establish and enhance infrastructure and application performance metrics; provide actionable reporting to proactively identify and address issues.

Run the CI/CD infrastructure production environment by monitoring availability and taking a holistic view of system health.

Perform proactive data analysis to identify problems before a service is impacted, and ad hoc data analysis to quickly identify the root cause of service-impacting issues as they arise. Define and implement alerting rules, and manage, prioritize, and respond to alerts.

Knowledge, Skills, and Experience

Experience setting up and using incident and on-call management systems.

Experience setting up and building tools to collect and visualize data (logs, metrics, alerts), building dashboards, alerting, and monitoring systems.

Experience with deploying secure infrastructure and services in one or more cloud environments such as AWS or Azure.

Experience with configuration management and deployment automation tools, such as Terraform, Ansible, Packer, etc.

Proficiency in scripting languages such as Python and Bash.

Experience with container (Docker) and orchestration systems (Kubernetes).

Solid understanding of the Linux OS and systems administration skills.

Excellent analytical and troubleshooting skills.

Dynamic collaborator who thrives in diverse, geographically distributed locales.

Team player who demonstrates diplomacy and promotes sound ideas and concepts, paired with the desire to help others grow their skills.

Strong verbal and written communication skills.

Experience with NGINX technologies is a strong plus.

Fundamental competencies:

SYSTEM EXPERIENCE

Application Build and Deployment Processes (git*, automation pipelines, Infrastructure as code, etc.)

Automated Application Delivery (load balancers, container orchestration, service mesh, High Availability architectures, Frontend, Backend technologies including database, etc.)

Service Operation (Define, instrument, measure, and manage service level objectives. Experience with observability tooling including logging infrastructure, time series metrics databases, tracing systems, alert definitions, etc.)

Incident management (service restoration, root cause analysis, postmortem authorship, define roles and responsibilities, etc.)

Security awareness and competencies, including security as code.

Configuration management

OBSERVABILITY

Explores beyond the obvious to ensure Service Level Objectives (SLO) are met.

Understands and measures system behaviors to quickly and efficiently diagnose, identify, and address needs.

Proactively tests, automates, and monitors outputs, leveraging signals to infer services and needs.

Data management to explore properties, patterns, and distributed tracing

SOLUTIONIST

Constantly seeking ways to improve systems, making them more efficient and reducing toil.

Understands the difference between short-term strategic and long-term fixes

Simplifies decisions and judgments by recognizing what to pay attention to and what to ignore; a proficient problem solver. Tenacious and resourceful with an inherent predisposition toward action; unafraid to try something new in the name of innovation.

FORWARD THINKING

Possesses an inherent bias toward innovation, always abreast of developing ideas and technologies. Thoughtfully and strategically considers future needs and opportunities, and advocates positive change.

Technological creativity and capacity

COMMUNICATION AND COLLABORATION

Conveys information, vision, and strategy in an accurate and timely manner, adjusting to ensure understanding based on the audience. Actively listens; seeks to understand rather than respond. Proactively solicits and values diverse perspectives, ideas, and opinions.

The Job Description is intended to be a general representation of the responsibilities and requirements of the job. However, the description may not be all-inclusive, and responsibilities and requirements are subject to change.

  • Linux Engineer III

    2 weeks ago


    Hyderabad, India RealPage, Inc. Full time

    SUMMARY The Engineer III is a senior role reporting to the Sr. Manager, requiring 8 to 10 years of experience as a Site Reliability Engineer (SRE). The successful candidate will have a strong background in Linux systems support, cloud platforms like AWS/OCI/GCP, and automation technologies. This role requires flexibility to work in shifts and a commitment to...


  • Hyderabad, India Insight Global Full time

    Required Skills and Experience * - Bachelor's or master's degree in computer science, Software Engineering, or a related field. - Proven experience (7+ years) in SRE, automation testing - Strong skills in developing and implementing automation testing strategies and frameworks. - Solid understanding of site reliability principles and best practices. - Leadership...


  • Hyderabad, India Virtusa Full time

    Site Reliability Engineer - CREQ188641 Description Position: SRE Primary skills: DevOps CI/CD pipeline Location: Hyderabad Should have proficiency in understanding of application monitoring stack (Logs, Events, Metrics and Alerts) and ability to visualize and set up end-to-end observability. Should have proficiency in industry standard monitoring tools...


  • Hyderabad, India Snaphunt Full time

    The Offer: Work within a company with a solid track record of success. Great work environment. Attractive salary & benefits. The Job: You will be responsible for: Gathering and evaluating user feedback. Providing code documentation and other inputs to technical documents. Supporting continuous improvement by investigating alternatives and new technologies and...


  • Hyderabad, India Microsoft Full time

    Overview Are you interested in working for one of the most exciting products at Microsoft, passionate about exceeding customer expectations and advancing Microsoft's cloud first strategy? Are you interested in a start-up like the environment, passionate about cloud computing technology and driving growth in one of Microsoft's core businesses? If so,...


  • Hyderabad, India Korn Ferry Full time

    Role - Site Reliability Engineer. Exp - 5+ years Required. Location - Hyderabad (Work from Office-Hybrid). Shift Timings - 5 AM - 1 PM IST. We are looking for a Site Reliability Engineer with a strong development background to join our team. In this role, you will be responsible for ensuring the reliability and performance of our systems. You will work closely to our...


  • Hyderabad, India FedEx ACC Full time

    Skill Required: Under general supervision, assists in the development and design of deliverables that support the resolution of moderately complex problems and technical design gaps. Supports improvement initiatives that are aligned with overarching global reliability of the company's systems, including capacity planning, failover strategies, performance...


  • Hyderabad, India ValueLabs Full time

    Experienced in SRE or Site Reliability Engineer Design, implement, and maintain automated processes for deploying, monitoring, and managing applications on Azure DevOps. Collaborate with cross-functional teams to optimize system performance, reliability, and scalability. Develop and maintain tools for continuous integration, continuous deployment (CI/CD),...