Manager - Site Reliability (SRE)
3 days ago
Imagine what we could do together. At Apple, new ideas have a way of becoming excellent products, services, and customer experiences very quickly. Bring passion and dedication to your job and there's no telling what you could accomplish. The people here at Apple don't just build products - they craft the kind of wonder that's revolutionized entire industries. It's the diversity of those people and their ideas that encourages the innovation that runs through everything we do, from amazing technology to industry-leading environmental efforts.
Apple's ETS group is looking for a versatile Site Reliability Engineering (SRE) Manager with great technical acumen and a strong background in operations, automation, implementation, and development. As a Site Reliability Engineering Manager, you will lead a team responsible for ensuring that high-volume, critical enterprise platforms and applications stay available and scale seamlessly. The applications span a broad spectrum of security platforms, including anomaly detection, malware and abuse detection and prevention, and edge security, as well as integrations with Apple's supply chain partners such as manufacturers, logistics providers, banks, resellers, and business customers.
Description
As a Site Reliability Engineering (SRE) Manager, you will be responsible for building, developing, and retaining a high-performing team of software engineers and for creating an environment where they can thrive and succeed. While the primary role is leading and managing employees, you should have deep technical knowledge of distributed systems, cloud computing, and security platforms, and be able to quickly understand and respond to peer teams' needs. Strong experience working with short release cycles is also encouraged. Do not hesitate to:
- Actively participate in architectural and functional design, implementation, and troubleshooting sessions
- Review hardware, software infrastructure, and application functionality to identify and optimize performance bottlenecks
- Drive major incident management to restore order
- Spearhead the design and implementation of comprehensive monitoring for applications, integrations, and anomalies
- Innovate, find opportunities, and drive automation efforts across various platform and security applications
- Work closely with the cross-functional IT organization, the business group, Apple's production support team, application engineers, systems engineers, database administrators, and the QA team to ensure the implementation and reliability of platforms and applications
A proven track record of managing, motivating, and providing technical guidance to a team of software engineers to draw out their best work will be key to success. Ensuring quality in every deliverable, creative thinking, strong problem solving, and the ability to collaborate with other global cross-functional teams in a fast-paced environment will be meaningful attributes to succeed in this role.
Minimum Qualifications
- 12+ years of demonstrated experience in a Site Reliability Engineering, DevOps, or infrastructure-focused role
- 3+ years of experience leading and managing high performance SRE teams
- Proven track record of leading sophisticated SRE projects and enterprise services at large scale
- Strong analytical, troubleshooting and problem solving skills
- Good knowledge of at least one object-oriented programming language (preferably Java or Python)
- Unix Performance Monitoring & Tuning
- Good understanding of database concepts, PL/SQL, and NoSQL technologies
- Hands-on experience with monitoring and data analysis tools (e.g., Prometheus, Splunk, Grafana, CloudWatch)
- Experience building and operating container orchestration systems such as Kubernetes or EKS
- Deep understanding of security concepts and protocols: authentication, authorization, signing, encryption, SSL/TLS, SSH/SFTP, PKI, X.509 certificates, and PGP
- Good fundamentals in release management and continuous integration
- Familiarity with modern web services architectures, cloud platforms such as AWS, GCP, and Azure, and distributed storage systems (ScaleIO, Amazon S3)
- Ability to communicate with large cross-functional teams about various engineering topics such as system architecture, detailed design, APIs, project schedules etc.
- Ability to make the right trade-offs when dealing with functional complexity, conflicting priorities, and aggressive schedules
- Ability to represent the team and remove hurdles so that each team member can operate at the highest level of efficiency and productivity
- Ability to hire, mentor and manage the performance of a large team
- Ability to connect with senior executives and business stakeholders
- A learning attitude to continuously improve self, team and the organisation
- Ability to work under pressure and manage difficult situations in a fast-paced work environment
- Bachelor's or Master's degree, or equivalent experience, in Computer Science or a related field
Preferred Qualifications
- Experience with runtime configuration and troubleshooting of Java and JVM technologies
- Good fundamentals in data modelling and machine learning algorithms
- Strong knowledge of securing applications and a thorough understanding of the OWASP Top 10 risks and solutions