Senior Principal Site Reliability Engineer
1 week ago
At F5, we strive to bring a better digital world to life. Our teams empower organizations across the globe to create, secure, and run applications that enhance how we experience our evolving digital world. We are passionate about cybersecurity, from protecting consumers from fraud to enabling companies to focus on innovation.
Everything we do centers around people. That means we obsess over how to make the lives of our customers, and their customers, better. And it means we prioritize a diverse F5 community where each individual can thrive.
- F5xc SRE: Act as a hands-on SRE focused on automation and toil reduction, and participate in Ops cycles to support our product.
- Provide on-call support on a rotation basis, resolving issues promptly and ensuring operational excellence in managing and maintaining distributed networking and security products.
- Easy-to-Use Automation: Continue to grow the infrastructure automation (k8s, ArgoCD, Helm Charts, Golang services, AWS, GCP, Terraform) with a focus on ease of configuration.
- Environment Stability using Observability: Create new observability (metrics & alerts), evolve what already exists, and participate in regular monitoring of infrastructure for stability.
- Collaborative Engagement: Collaborate closely with application owners and SRE team members as part of roadmap execution and continuous improvement of existing systems.
- Scalable & Resilient Systems: Design and deploy systems and infrastructure that are highly available and resilient across the configured failure domains.
- Design systems using strong security principles with security by default.
The Job Description is intended to be a general representation of the responsibilities and requirements of the job. However, the description may not be all-inclusive, and responsibilities and requirements are subject to change.
Knowledge, Skills And Abilities
- Hands-on experience with the Cortex suite of observability tools, including Cortex, Loki, Tempo, and Prometheus integration for scalable, multi-tenant monitoring systems.
- Proficient in deploying and managing Cortex in microservice environments, including configuration of distributors, ingesters, queriers, and store-gateways for high availability and performance.
- Experienced with Grafana Mimir, including cluster setup, alerting, rule evaluation, and long-term metric storage at scale.
- Skilled in optimizing Cortex/Mimir query performance, tuning compaction, and managing sharding/replication for massive telemetry workloads.
- Familiar with integrating Cortex/Mimir with Grafana dashboards, Thanos, or Prometheus Remote Write to support observability-as-a-service use cases.
- Elasticsearch: Deep understanding of indexing strategies, query optimization, cluster management, and tuning for high-throughput use cases. Familiarity with slow query analysis, scaling, and shard management.
- ClickHouse: Proven experience in designing and managing OLAP workloads, optimizing query performance, and implementing efficient table engines and materialized views.
- Apache Kafka: Expertise in event streaming architecture, topic design, producer/consumer configuration, and handling high-volume, low-latency data pipelines. Experience with Kafka Connect and Schema Registry is a plus.
- Vector: Proficiency in configuring Vector for observability pipelines, including log transformation, enrichment, and routing to multiple sinks (e.g., Elasticsearch, S3, ClickHouse).
- Hands-on programming experience in at least one language (Python or Golang), plus shell scripting.
- Strong networking fundamentals and experience dealing with different layers of the networking stack.
- SRE/DevOps on Linux & Kubernetes: Demonstrate excellent, hands-on knowledge of deploying workloads and managing their lifecycle on Kubernetes, with practical experience debugging issues.
- Experience in upgrading workloads for SaaS Services without downtime.
- On-call experience managing everyday operations for production environments, including managing production alerts and using dashboards to debug issues.
- GitOps: Experience with Helm charts/kustomizations and GitOps tools like ArgoCD/FluxCD.
- CI/CD: Experience working with/designing functional CI/CD systems.
- Cloud Infrastructure: Prior experience deploying workloads and managing their lifecycle on any cloud provider (AWS/GCP/Azure).
Qualifications
- Typically requires at least 15 years of related experience with a bachelor's degree, 12+ years with a master's degree, or 10+ years with a PhD, or equivalent experience.
- Excellent organizational agility and communication skills throughout the organization.
Environment
- Empowered Work Culture: Experience an environment that values autonomy, fostering a culture where creativity and ownership are encouraged.
- Continuous Learning: Benefit from the mentorship of experienced professionals with solid backgrounds across diverse domains, supporting your professional growth.
- Team Cohesion: Join a collaborative and supportive team where you'll feel at home from day one, contributing to a positive and inspiring workplace.
Please note that F5 contacts candidates only through an F5 email address or an automated email notification from Workday.
Equal Employment Opportunity
It is the policy of F5 to provide equal employment opportunities to all employees and employment applicants without regard to unlawful considerations of race, religion, color, national origin, sex, sexual orientation, gender identity or expression, age, sensory, physical, or mental disability, marital status, veteran or military status, genetic information, or any other classification protected by applicable local, state, or federal laws. This policy applies to all aspects of employment, including, but not limited to, hiring, job assignment, compensation, promotion, benefits, training, discipline, and termination. F5 offers a variety of reasonable accommodations for candidates. Requesting an accommodation is completely voluntary. F5 will assess the need for accommodations in the application process separately from those that may be needed to perform the job. Request by contacting