SRE Engineer
1 week ago
Immediate joiner / 15-30 days notice period
As an SRE engineer, you will be responsible for creating standards and frameworks for applications and services hosted on AWS, covering resiliency, high availability, multi-region strategy, and DR strategy. You will also need to identify which apps or services do not adhere to these standards and plan their compliance. You will collaborate with development teams to ensure that best practices for reliability and resiliency are in place.
RESPONSIBILITIES
Own the overall system/framework design in line with client requirements
Set standards for resiliency, high availability, multi-region strategy, and DR strategy for applications and services on AWS/Cloud
Run test suites/frameworks to find non-compliant services/applications
Plan the strategy for migrating non-compliant applications/services so that they adhere to resiliency standards
Know how best to monitor systems and respond when things go wrong, continually writing and revising response playbooks to reduce the time needed to fix any breakdown that may occur
Ensure software applications remain reliable amidst frequent updates from development teams
Collaborate with development teams to optimize application performance & resiliency on AWS platforms
EXPERIENCE AND REQUIRED SKILL SETS
5+ years of experience in AWS, CI/CD and DevOps tools
Strong understanding of Cloud-based architecture & cloud operations
Working understanding of infrastructure and application monitoring platforms such as Datadog, OpenSearch, and the ELK Stack
Good understanding of performance and capacity monitoring, including its configuration and optimization
5+ years of experience in setting up strategy, processes and checks for resiliency in AWS
Knowledge of Linux, shell scripting, and Python is preferred
Working knowledge of Terraform is good to have
Excellent problem-solving skills and attention to detail.
Ability to work independently as well as collaboratively in a team environment.
EDUCATION
Bachelor's or master's degree in Computer Science, Engineering, Software Engineering, or a relevant field.
Interested candidates should submit their updated resume.
-
DevOps SRE
1 week ago
Gurugram, Panchkula, India Vishvavyapsamadhanam Services Full time ₹ 8,00,000 - ₹ 25,00,000 per year. Immediate joiner / 15-30 days notice period. SRE + Python (5+ years). In this role, you will play a crucial part in shaping the firm's infrastructure reliability and efficiency by implementing robust Site Reliability Engineering practices.
-
Observability Engineer – SRE
4 weeks ago
Gurugram, India GSPANN Full time. Description: GSPANN is hiring an Observability Engineer with expertise in Site Reliability Engineering (SRE). The role focuses on leveraging SRE principles, automation, and AI-driven observability to enhance reliability and scalability across cloud and ERP environments. Role and Responsibilities: Leverage Application Performance Management (APM) tools such...
-
SRE
2 weeks ago
Gurugram, Hyderabad, Noida, India Zensar Full time ₹ 15,00,000 - ₹ 25,00,000 per year. Short Description for Internal Candidates: Bachelor's degree in Computer Science, IT, or equivalent. - 3–6 years in SRE, Observability, Application Monitoring, or Performance Engineering roles. - Hands-on exposure to Glassbox and Sumo Logic strongly preferred. Description for Candidates: We are seeking a Site Reliability Engineer (SRE) with a strong focus on...
-
Site Reliability Engineer
3 weeks ago
Gurugram, Gurugram, India Impronics Technologies Full time. Job Description: We are seeking a seasoned Site Reliability Engineer (SRE) with a solid background in payment systems and high-availability architectures. The ideal candidate will have hands-on experience managing large-scale, distributed systems in production, with a deep understanding of reliability, scalability, and performance tuning in the financial...
-
Lead Systems Engineer
4 weeks ago
Gurugram, India Epam Full time. Description: Join our organization as a Lead Systems Engineer (DevOps & SRE) and play a crucial role in ensuring the reliability, scalability, capacity planning, and performance of our infrastructure and applications. The ideal candidate will have a strong background in software engineering, system administration, containerization, and cloud...
-
SRE Lead
2 weeks ago
Gurugram, Hyderabad, India GSPANN Full time ₹ 12,00,000 - ₹ 36,00,000 per year. Role & responsibilities: Building unified dashboards; dashboards to track overall health of the production support (derived from SNOW); dashboards to track overall health of the applications (using observability tools like AppDynamics, Dynatrace); identify automation opportunities and implement the same; self-healing using scripting languages of Java, Python; Integration...
-
DevOps (SRE) Engineer
4 weeks ago
Gurugram, India MindPec Solutions Full time. KEY RESPONSIBILITIES: Ensure 24x7 uptime and stability of production systems. Investigate and troubleshoot production issues. Collaborate with developers to optimize system performance. Participate in on-call rotation to provide 24/7 support for critical systems. Work on automation and enhancements to reduce manual processes / intervention. Relevant 5+ years of...
-
DevOps (SRE) Engineer
3 weeks ago
Gurugram, India MindPec Solutions Full time. KEY RESPONSIBILITIES: Ensure 24x7 uptime and stability of production systems. Investigate and troubleshoot production issues. Collaborate with developers to optimize system performance. Participate in on-call rotation to provide 24/7 support for critical systems. Work on automation and enhancements to reduce manual processes / intervention. Relevant 5+ years of...