Senior Site Reliability Engineer
3 weeks ago
• Infinite tech exposure & mentorship
• Live case problem-solving with real impact
• Hackdays and continuous learning through tech talks
• Fun, collaborative work environment that's more sincere than serious

Key Responsibilities:
• Cloud Infrastructure Management: Manage, deploy, and monitor highly scalable and resilient infrastructure using Microsoft Azure.
• Containerization & Orchestration: Design, implement, and maintain Docker containers and Kubernetes clusters for microservices and large-scale applications.
• Automation: Automate infrastructure provisioning, scaling, and management using Terraform and other Infrastructure-as-Code (IaC) tools.
• CI/CD Pipeline Management: Build and maintain CI/CD pipelines for continuous integration and delivery of applications and services, ensuring high reliability and performance.
• Monitoring & Incident Management: Implement and manage monitoring, logging, and alerting systems to ensure system health, identify issues proactively, and lead incident response for operational challenges.
• Kafka & Apigee Management: Manage and scale Apache Kafka clusters for real-time data streaming and Apigee for API management.
• Scripting & Automation: Use scripting languages (e.g., Python, Bash, Go) to automate repetitive tasks, enhance workflows, and optimize infrastructure management.
• Collaboration: Work closely with development teams to improve application architecture for high availability, low latency, and scalability.
• Capacity Planning & Scaling: Conduct performance tuning and capacity planning for cloud and on-premises infrastructure.
• Security & Compliance: Ensure security best practices and compliance requirements are met in the design and implementation of infrastructure and services.

Required Skills & Qualifications:
• Experience: 7+ years of experience as a Site Reliability Engineer (SRE), Platform Engineer, DevOps Engineer, or in a similar role in cloud environments.
• Cloud Expertise: Strong hands-on experience with Microsoft Azure services, including compute, storage, networking, and security services.
• Containerization & Orchestration: Proficiency in managing and deploying Docker containers and orchestrating them with Kubernetes.
• Infrastructure as Code (IaC): Deep knowledge of Terraform for provisioning and managing infrastructure.
• CI/CD: Experience building, maintaining, and optimizing CI/CD pipelines using tools like Jenkins, GitLab CI, Azure DevOps, or others.
• Message Brokers: Hands-on experience with Kafka for distributed streaming and with messaging services like Service Bus/Event Hubs (exposure to Kafka is good to have).
• API Management: Familiarity with Apigee or similar API management tools (exposure to Apigee is good to have).
• Scripting & Automation: Expertise in scripting with languages such as Python, Bash, Go, or similar.
• Monitoring & Logging: Experience with monitoring tools like New Relic, Prometheus, Grafana, and Azure Monitor, and logging solutions such as the ELK stack (Elasticsearch, Logstash, Kibana).
• Version Control: Strong experience using Git, Bitbucket, and GitHub for source control.
• Problem-Solving: Excellent troubleshooting skills and the ability to resolve complex infrastructure and application issues.
• Collaboration & Communication: Ability to work in a collaborative, cross-functional environment and communicate complex technical issues effectively to both technical and non-technical teams.

Preferred Skills:
• Experience with additional cloud providers such as AWS or Google Cloud.
• Familiarity with other message brokers such as RabbitMQ or ActiveMQ.
• Experience with Apigee Edge for managing APIs and microservices.
• Knowledge of networking concepts and technologies, such as load balancing, DNS, and VPNs.
-
Site Reliability Expert
2 days ago
Gurgaon, Haryana, India beBee Careers Full time
A key member of the team, this Senior Site Reliability Engineer will focus on incident management and troubleshooting, developing and improving monitoring, alerting, and diagnostic tools, and conducting blameless postmortems. The SRE Specialist will also be responsible for automation and infrastructure as code, managing infrastructure using tools like...
-
Senior Site Reliability Engineer
2 weeks ago
Gurgaon, Haryana, India Cloudologic Full time
Company Description: Cloudologic is a prominent cloud consulting and IT service provider based in Singapore and rooted in India, focusing on cloud operations, cyber security, and managed services. With a decade of expertise, our dedication to delivering high-quality services has earned the trust of clients worldwide, making us a valued partner in the tech...
-
Senior Site Reliability Engineer
3 days ago
Gurgaon, Haryana, India Cloudologic Full time
Company Description: Cloudologic is a prominent cloud consulting and IT service provider based in Singapore and rooted in India, focusing on cloud operations, cyber security, and managed services. With a decade of expertise, our dedication to delivering high-quality services has earned the trust of clients worldwide, making us a valued partner in the tech...
-
Gurgaon, Haryana, India Crescendo Full time
About Crescendo Global: Crescendo Global is a niche recruitment agency specializing in senior to C-level placements. We pride ourselves on delivering a memorable job search and leadership hiring experience for both job seekers and employers. Job Summary: Senior Technical Lead. We are seeking a highly skilled Senior Technical Lead to join our team. This...
-
Lead Site Reliability Engineer
2 weeks ago
Gurgaon, Haryana, India UnitedHealth Group Full time
Optum is a global organization that delivers care, aided by technology to help millions of people live healthier lives. The work you do with our team will directly improve health outcomes by connecting people with the care, pharmacy benefits, data and resources they need to feel their best. Here, you will find a culture guided by diversity and inclusion,...
-
Site Reliability Engineer
4 weeks ago
Gurgaon, Haryana, India myGwork Full time
This job is with Synechron, an inclusive employer and a member of myGwork – the largest global platform for the LGBTQ+ business community. Please do not contact the recruiter directly. Overall Summary: We are seeking a skilled and experienced SRE Engineer to join our team. The ideal candidate will...
-
Gurgaon, Haryana, India MyGwork Full time
About the Role: We are seeking an experienced IT professional to join our Automotive Insights team as a Principal Site Reliability Engineer. The role will be responsible for setting Operational and Site Reliability Engineering (SRE) standards that our support teams can leverage. The Impact: By joining our team, you will have the opportunity to work closely...
-
Site Reliability Engineer
1 day ago
Gurgaon, Haryana, India Karix Full time
Role: Site Reliability Engineer (L2 Support)
Location: Gurgaon (WFO)
About the role: We are seeking an experienced Site Reliability Engineer who acts as a bridge between development and IT operations, taking on operational tasks to ensure the efficient functioning of Service platforms. They are responsible for monitoring, automating, and improving the...
-
Senior Site Reliability Engineer
2 weeks ago
Gurgaon, Haryana, India UnitedHealth Group Full time
Optum is a global organization that delivers care, aided by technology to help millions of people live healthier lives. The work you do with our team will directly improve health outcomes by connecting people with the care, pharmacy benefits, data and resources they need to feel their best. Here, you will find a culture guided by diversity and inclusion,...
-
Senior Site Reliability Engineer
2 weeks ago
Gurgaon, Haryana, India Cvent Full time
Job Description: As a Site Reliability Engineer, you'll use your advanced development and operations knowledge to identify and prioritize issues, find universal solutions to common problems, and mentor and support junior staff. Additionally, you will: Enlighten, Enable and Empower a fast-growing set of multi-disciplinary teams, across multiple...