Consultant - Site Reliability Engineer (SRE)
4 weeks ago
Responsibilities: Observability using Dynatrace, Elastic Stack, or any other logging tools.
Mandatory skills: Hands-on experience with the APM (application monitoring) tools mentioned above. At least 5 years of relevant experience in setting up monitors, integrating with alerting tools, and integrating with third-party tools via API/REST services. Experience troubleshooting issues, performing RCA, and creating and maintaining dashboards of system and application issues in APM tools. Deep understanding of the set-up of monitoring tools: installation, tuning, vendor management, patching, and upgrades.
Development: Develop tools and solutions to automate and standardize deployments and operations for system, application, and infrastructure monitoring. Write scripts for setting up monitors in Application Performance Management tools such as Dynatrace AppMon, etc. Develop dashboards within the APM tool for researching application issues. Integrate with systems over APIs/REST services.
Nice to have
System administration: Apply a software development and engineering mindset to sys admin activities. Focus on improving the monitoring and observability of systems in a measurable way.
Define Service Level Indicators and Objectives: Provide monitoring services for systems so that teams can begin to track their SLOs and SLIs. Also assist in providing realistic objectives for the future and advise on proper SLAs for customers.
Monitoring: Set up and tune monitors for application health status and performance. Improve monitoring based on symptoms instead of outages. Recommend, support, and train on the use of effective monitoring tools and processes that allow real-time system monitoring as well as analysis of long-term reliability trends. Create new alerts to find anomalies and understand the root cause of system failures. Integrate monitoring tools with third-party tools such as XMatters, alerting tools, and SMS tools.
Automation: Document every action to convert findings into repeatable actions and then into automation. Discover efficiency by automating things to remove repetitive toil like watching dashboards, executing scripts, and other manual endeavors. Help optimize the on-call rotation by adding automation and context to alerts, leading to better real-time collaborative response from on-call responders.
Manage Incidents: Understanding and usage of incident management. Assist stakeholders in examining incidents and establishing processes to help prevent or minimize similar problems from arising. Develop procedures and policies by which technical support teams will operate. These processes will be applied to help in such areas as service failures and security threats. Will also train IT support staff.
Facilitate Retrospective: Provide RCA solutions. Derive trends from recurring system and application issues using dashboards.
Good written and verbal communication skills. Highly motivated individual with a positive and proactive attitude to work and a willingness to make changes to improve operational efficiency through innovation, process and procedure, and adopting and adapting ideas and practices from elsewhere. Ready to work in shifts, on weekends, and on a flexible schedule. Excellent team skills with the ability to listen and contribute to discussions and meetings. Ability to motivate staff.
Qualifications: Graduate; Bachelor's in Engineering/Technology degree preferred. Any SRE/Cloud/Technology related certifications are good to have.
-
Senior Site Reliability Engineer
3 weeks ago
Pune, India Jade Global Full time Job Description Job Title: Senior Site Reliability Engineer (SRE) – Datadog Observability Experience Required: 8+ years overall in SRE and Infrastructure Operations with minimum 3+ years hands-on experience in Datadog Location: Hyderabad preferable but open for Pune and remote Job Summary: We are seeking an experienced Site Reliability Engineer...
-
Senior Site Reliability Engineer
2 weeks ago
Pune, India Jade Global Full time Job Description Job Title: Senior Site Reliability Engineer (SRE) – Datadog Observability Experience Required: 8+ years overall in SRE and Infrastructure Operations with minimum 3+ years hands-on experience in Datadog Location: Hyderabad preferable but open for Pune and remote Job Summary: We are seeking an experienced Site Reliability Engineer...
-
Site Reliability Engineer
1 week ago
Pune, India GfK Full time Description About You: You are a DevOps or Site Reliability Engineer with a passion for cloud infrastructure and automation. You’re a self-starter and you love keeping up to date with the latest developments in cloud, configuration management and container technologies. You understand the benefits of an immutable infrastructure and you enjoy enabling...
-
Senior Site Reliability Engineer
2 weeks ago
Pune, India Jade Global Full time Job Description Job Title: Senior Site Reliability Engineer (SRE) – Datadog Observability Experience Required: 8+ years overall in SRE and Infrastructure Operations with minimum 3+ years hands-on experience in Datadog Location: Hyderabad preferable but open for Pune and remote Job Summary: We are seeking an experienced Site...
-
Senior Site Reliability Engineer
2 weeks ago
Pune, India Jade Global Full time Job Description Job Title: Senior Site Reliability Engineer (SRE) – Datadog Observability Experience Required: 8+ years overall in SRE and Infrastructure Operations with minimum 3+ years hands-on experience in Datadog Location: Hyderabad preferable but open for Pune and remote Job Summary: We are seeking an experienced Site Reliability Engineer...
-
Senior Site Reliability Engineer
2 weeks ago
Pune, India Jade Global Full time Job Description Job Title: Senior Site Reliability Engineer (SRE) – Datadog Observability Experience Required: 8+ years overall in SRE and Infrastructure Operations with minimum 3+ years hands-on experience in Datadog Location: Hyderabad preferable but open for Pune and remote Job Summary: We are seeking an experienced Site Reliability...
-
Site Reliability Engineer
6 days ago
Bengaluru, Chennai, Pune, India Growel Softech Private Limited Full time Job Description We are seeking a skilled Site Reliability Engineer (SRE) to join our team in India. The ideal candidate will be responsible for ensuring the reliability, performance, and availability of our production systems. You will work closely with development teams to design scalable systems and implement automation to improve operational...
-
Site Reliability Engineer
2 weeks ago
Pune, India Synechron Full time We have an immediate opportunity for an SRE (Senior Site Reliability Engineer) with 5 to 9 years of experience. Synechron – Pune Job Role: SRE (Senior Site Reliability Engineer) Job Location: Pune About Synechron: We began life in 2001 as a small, self-funded team of technology specialists. Since then, we’ve grown our organization to 14,500+ people, across 58 offices, in 21...