Sr SRE
2 weeks ago
Required Skills & Experience
- 10+ years of experience in SRE or DevOps roles.
- Deep expertise in Kubernetes (deployment, troubleshooting, performance tuning), networking (firewalls, routing, connectivity issues), and relational databases (patching, auditing, performance tuning).
- Strong scripting skills (e.g., Python, Bash) for tooling and automation.
- Experience in both operations and development, including the ability to debug at the code layer.
- Proven ability to lead through influence and solve problems across teams.
- Comfortable navigating organizational blockers and driving issues to resolution.
- Experience with incident response and postmortem processes.
- Familiarity with monitoring and observability tools (Splunk Observability, Dynatrace, Datadog, Grafana, Prometheus, etc.).
- Ability to mentor and coach other engineers and development teams.
- Strong communication skills and the ability to explain complex technical issues clearly to both technical and non-technical audiences.
- Ability to work cross-functionally with DBAs, network engineers, developers, and leadership.
Job Description
An employer is looking for an SRE to join their enterprise-level SRE team. They are building a specialized team of Senior Site Reliability Engineers to act as embedded technical experts across their IT organization. This team will be responsible for solving complex production issues, guiding development teams, and building tools that improve system resilience and observability.
This is not a traditional SRE role. You will be a technical leader, coach, and hands-on problem solver who thrives in ambiguity and drives results across organizational boundaries. The role does not sit on the infrastructure side (no Terraform or server provisioning work); it supports applications in production and requires both development and operations skills.
Responsibilities
• Investigate and resolve high-impact production issues across infrastructure and applications.
• Embed with dev teams to guide them through performance, reliability, and architectural challenges.
• Participate in incident response bridges as a technical expert.
• Build tools and scripts to detect vulnerabilities, automate checks, and improve system visibility.
• Conduct post-incident audits and ensure follow-through on remediation.
• Collaborate with DBAs, network engineers, and platform teams to unblock and resolve issues.
• Proactively identify issues and drive them to resolution without waiting for direction.
-
Lead Python Developer
4 days ago
Ahmedabad, Pune, India Zymr Systems Full time ₹ 15,00,000 - ₹ 25,00,000 per year
We are looking for a Lead Platform Engineer to build functional and efficient server-side applications. The responsibilities include participating in all phases of the Agile software development lifecycle and coaching junior developers. Your ultimate goal is to create high-quality, cloud-native products that meet customer needs.
Required Experience: 8 +...