Persistent Systems | SRE Manager

1 day ago


Hyderabad, India Persistent Systems Full time
About Position:
As a Site Reliability Manager, you will play a pivotal role in ensuring the scalability, performance, and reliability of our software systems. You will work closely with the product development team to design, build, and maintain the infrastructure and tools needed to support our systems and guarantee uptime for our customers. The Cloud Engineering Manager is responsible for infrastructure deployed through infrastructure-as-code and GitOps practices across multiple major cloud platforms. The SRE team, within Cloud Engineering, advances the DevOps model by solving challenging problems, addressing inefficiencies, and enabling developer effectiveness through the adoption and maturation of DevOps capabilities. The SRE team works remotely across the globe, using collaboration tools and agile methodologies. We balance day-to-day support for our application development teams with our planned objectives. These objectives align to a clear vision across the themes of Global Scalability, Operational Cost Efficiency, and Infrastructure and Workload Reliability.
Role: SRE Manager
Location: Hyderabad
Experience: 8+ Years
Job Type: Full Time Employment
What You'll Do:
Team Management: Establish and grow a high-performing team within the global SRE group and Cloud Engineering organization.
Provide guidance, support, and career development opportunities to team members.
Coach and mentor other team members within the SRE team.
Act as site lead for all Cloud Engineering team members in your region.
Service Reliability and Availability: Set and maintain high standards for service reliability and availability. Continuously update and monitor Service Level Indicators (SLIs) and Service Level Objectives (SLOs) to achieve SRE goals.
Incident Response and Root Cause Analysis: Help scale a global 24/7/365 on-call rotation and participate in critical incident responses.
Ensure prompt and effective resolutions to incidents, and lead root cause analysis activities to prevent future occurrences.
Continuous Improvement: Foster a culture of continuous improvement within the SRE team. Identify and resolve pain points across the entire engineering team, improve developer experience, and adopt best practices.
Reporting: Regularly report team and project status to relevant stakeholders.
Team Leading: Lead a team of 2-4 members. Oversee and assign work, guide team members on their regular tasks, provide input, and enable them to do their jobs smoothly.
Manage impediments and make sure the required help is available.
Track and monitor performance, and report regular updates to the leadership team.
Collaboration with Development Teams: Work closely with application development teams to incorporate their feedback, improving developer experience and reducing toil.
Participate in service capacity planning and demand forecasting, software performance analysis, and system tuning.
Define and track key metrics related to service reliability and performance.
Work closely with development teams to ensure that platforms are designed with "operability" in mind.
Tool and Framework Development: Communicate and promote best practices by building out tools and frameworks that increase the adoption of SRE practices across engineering teams.
Develop automation and systems management tools for daily operations. Design, develop, and maintain scalable, automated, user-friendly systems, tools, and processes to support our software development and operations.
Strategic Planning: Present plans and proposals to Engineering Leadership and contribute to the creation of SRE roadmaps.
Lead projects that fulfill objectives shared across SRE teams.
Infrastructure and Deployment Tooling: Develop tooling to manage infrastructure and deployments, primarily using Python and Terraform, to support application teams.
Elevated Support: Provide an advanced level of support to application teams, helping to resolve complex issues and improve system performance.
Lead root cause analysis of critical issues and incidents, providing long-term resolution and mitigation plans.
Facilitate blameless post-mortems and drive comprehensive post-mortem analyses and reviews. Participate in on-call within the Cloud SRE team, ensuring high availability and timely incident response.
Enhanced Observability: Implement and refine observability frameworks to enhance visibility into the system's health and performance.
Utilize APM, synthetic monitoring, and logging tools to proactively identify and address performance bottlenecks and system anomalies.
Develop and maintain dashboards for real-time monitoring of critical metrics, ensuring rapid response to incidents and continuous system improvement.
Expertise You'll Bring:
Bachelor’s degree in Computer Science, Engineering, or a related field, or equivalent work experience.
6+ years of experience in a Site Reliability Engineer, DevOps, or similar role.
2+ years of team management experience, with a deep understanding of SRE methodologies and strong hands-on abilities.
In-depth knowledge of software development processes and tools.
Strong understanding of cloud computing services (AWS, Google Cloud, Azure).
Experience with infrastructure as code tools such as Terraform.
Experience with container orchestration (e.g., Kubernetes, Docker Swarm).
Familiarity with continuous integration (CI) and continuous deployment (CD) methodologies and tools (e.g., Jenkins, GitLab CI).
Deep understanding of networking protocols and services.
Strong troubleshooting and problem-solving skills.
Excellent communication and collaboration skills.
Experience with monitoring and alerting tools (e.g., New Relic, Grafana, ELK stack).
Knowledge of Application Performance Management (APM), synthetic monitoring, and other observability tools to ensure high system performance and reliability.
Benefits:
Competitive salary and benefits package
Culture focused on talent development with quarterly promotion cycles and company-sponsored higher education and certifications.
Opportunity to work with cutting-edge technologies.
Employee engagement initiatives such as project parties, flexible work hours, and Long Service awards
Annual health check-ups
Insurance coverage: group term life, personal accident, and Mediclaim hospitalization for self, spouse, two children, and parents
Our company fosters a values-driven and people-centric work environment that enables our employees to:
Accelerate growth, both professionally and personally
Impact the world in powerful, positive ways, using the latest technologies.
Enjoy collaborative innovation, with diversity and work-life wellbeing at the core.
Unlock global opportunities to work and learn with the industry’s best.
Let’s unleash your full potential at Persistent.
“Persistent is an Equal Opportunity Employer and prohibits discrimination and harassment of any kind.”