Site Reliability Engineer
2 months ago
Job Description :
- The Staff Platform Engineer 1 acts as a Technical Lead.
- Leads one of the Platform Engineering teams from the technical side.
- Defines, implements, and maintains Platform Engineering best practices across BF & BVs.
- Works independently on architectural design.
- Solves complex problems.
- Mentors junior and mid-level engineers and handles high-priority platform-related issues.
- Is also responsible for focusing on value creation, growth, and serving customers with full ownership and accountability, delivering exceptional customer and business results, and embracing a non-hierarchical culture of collaboration, transparency, and trust.
Key Responsibilities :
Platform Engineering :
- Lead the implementation of critical platform enhancements.
- Mentor and guide mid-level engineers.
- Develop programs to scale mentorship and knowledge-sharing initiatives.
- Collaborate with leadership on shaping the overall technology strategy.
- Contribute comfortably across all SRE verticals: Infrastructure as Code, Logging, Monitoring, Security, CI/CD, Developer Empowerment, and Costing.
- Handle and drive all critical P0 incidents, perform the RCA, and ensure availability.
- Technically mentor the team and drive innovation.
- Lead capacity planning initiatives, collaborating with cross-functional teams to optimize infrastructure resources.
- Own upgrades and maintenance of critical platform components such as Kubernetes, Istio, databases, and Kafka.
- Take a leadership role in incident response, ensuring rapid and effective resolution of critical issues.
- Lead and participate in the design and implementation of software architecture to ensure system reliability and performance.
- Innovate, identify bottlenecks, and drive PoCs with team members.
- Lead the design and implementation of platform solutions.
- Think reliability first for each infrastructure component, though some help or validation may still be needed when looking at the long-term picture.
- Contribute to the establishment of best practices.
- Participate in on-call rotations, responding promptly to production outages, and providing operational support to ensure seamless system functionality.
- Contribute to the continuous monitoring and maintenance of system performance, proactively identifying and addressing potential issues.
- Utilize analytical skills to meticulously analyze and debug software application issues, demonstrating the ability to identify root causes and implement robust solutions.
- Efficiently troubleshoot basic platform-related problems, ensuring quick and effective resolution to minimize impact on operations.
- Proactively identify opportunities for process improvement and efficiency gains within the platform engineering domain.
Minimum Qualifications/education :
- Bachelor's degree in Computer Science, Software Engineering, or a related field.
Minimum experience :
- 7-9 years of experience in a related function.
Skills :
- Substantial experience in platform engineering.
- Define and implement advanced automation strategies, leveraging cutting-edge tools and technologies.
- Proven expertise in designing large-scale platforms.
- Strong command of, and high proficiency in, containerization (Docker, Kubernetes).
- Identify opportunities for efficiency gains and implement strategies for continuous improvement.
- Proven experience in setting up Kafka and Elasticsearch infrastructure from scratch.
- Contribute significantly to the architectural design of systems, ensuring they meet the highest standards of reliability, scalability, and performance.
- Strong communication and collaboration skills.
- Strong understanding of cloud platforms (AWS, Azure, GCP).
- Strong expertise in cloud design and in helping development teams implement architectures with best practices.
- Strong experience with database administration tasks, including automation, performance monitoring and tuning, query optimization, data archiving, and indexing strategies.
- Strong experience in developing Helm charts, Custom Resource Definitions, and advanced tooling for Kubernetes.
- Demonstrated experience in advanced CI/CD, such as reducing build time and deployment time and improving DORA metrics for deployments.
- Demonstrated experience in developing SLI and SLO dashboards per business vertical and maintaining the overall SLI.
- Extensive experience in platform engineering.
- In-depth knowledge of cloud services and infrastructure as code.
- Strong leadership and communication skills.
- Strong experience in delivering highly available, fault-tolerant solutions.
- In-depth knowledge of advanced Kubernetes concepts, Istio service mesh, and load balancing.
- Able to understand technical opportunities and easily translate them into software requirements.
- Knowledge of e-Commerce ecosystem and value chain from both a business and a technical standpoint.
- Effective verbal and written communication skills for collaborating with team members; expertise in turning technical messages into clear explanations of why change is needed that appeal to key business personas and non-technical stakeholders.
- Strong ability to articulate the big picture with or without details and work in ambiguous situations.
- Strong business communication and presentation skills.
- Strong English language skills (speaking, reading, and writing) with exceptional business writing; Arabic is a plus.