Site Reliability Engineering Manager
3 weeks ago
The Role:
An SRE Manager is ultimately responsible for system reliability, developer productivity, and time to market, and works to reduce the technical debt of the services the SRE team supports. We seek managers who are passionate about site reliability to influence and drive the strategic SRE mission.
As a Site Reliability Engineering Manager working on critical services, your mission will be to ensure our services are fast, highly available, scalable, and able to withstand unprecedented increases in load. The Site Reliability Engineering Manager will be at the heart of solving production problems, with a scope that runs from the kernel to the application. The position requires the flexibility to take a holistic approach to troubleshooting and the ability to delve deeply into technical details. The Site Reliability Engineering Manager is co-located with the various application development teams, which ensures they acquire the domain knowledge needed to troubleshoot and repair an outage effectively. The Site Reliability Engineering Manager will build automation tools for system health and production acceptance tests to validate production changes, and will ensure the system is well instrumented and highly fault tolerant.
Key Leadership Responsibilities:
- Engage, influence, and evangelize SRE practices with development, operational and product groups to align technology service/solution delivery.
- Drive quality accountability within the organization with well-defined processes, metrics, and goals for process quality. This includes leading effective postmortems and ensuring actions are followed up.
- Manage the availability, latency, scalability, and efficiency of Bloomberg applications by instilling reliability engineering into our development life cycle, with a focus on fault-tolerant approaches.
- Drive capacity planning, performance analysis, instrumentation, and other non-functional systems requirements.
- Define and report progress on strategic initiatives and project-level tasks to all stakeholders, including senior executives and clients, using communication approaches suited to each constituency.
- Implement metrics driven processes to ensure service quality targets are met.
Key skills:
- Expert knowledge in all aspects of designing, developing, and managing large real-time systems.
- Project and process management
- Prior successful experience as a systems performance or site/systems reliability engineer.
- Mastery of fault tolerant approaches in a large-scale distributed environment and high-performance systems.
- Demonstrated experience working in large, complex systems environments.
- AWS cloud experience is mandatory.
- Experience in Infrastructure-as-Code using Terraform.
- Experience with securing AWS workloads and with security practices will be a huge plus.
- Experience in Site Reliability Engineering (preferred) and in automating repetitive tasks using Python, PowerShell, etc.
- Experience delivering complex solutions using common languages and tooling such as C#, JS, TypeScript, YAML, Terraform, and PHP.
- Extensive experience with configuring and monitoring via tools such as DataDog, ELK, Splunk, AppDynamics, etc.
- Experience collaborating across multiple functional and/or technical teams to deliver an Agile-based project.
- Demonstrated growth mindset: enthusiasm for learning new technologies quickly and applying that knowledge to practical business problems.
- Ability to communicate with team members and partners to work through technical solutions.
- Demonstrated knowledge of fundamental cloud security (e.g., Identity and Access Management, firewalls, etc.).
Qualifications, Knowledge, and Experience:
- The successful candidate will possess an outstanding record of professional experience and will thrive in an environment that demands accountability. They must possess significant technology management and product development experience. They must also have strong planning, organizational, and communication skills, and be a key driver in helping the team understand the big picture.
- Proven leader of technology solutions in a high-volume transaction environment.
- Accomplished leader with 5+ years managing regional and global areas.
- Have excellent time management, communication, decision-making, presentation, and organizational skills.
- Maintain excellent written and verbal communications with clients, employees, and management chain, including status reports, project plans, presentations, etc.
- Ability to lead across functions and motivate a matrix staff.
Our teams thrive in a culture of openness, creativity, leadership, customer-centricity, and people growth.