SRE (Site Reliability Engineer)
2 days ago
Job Overview
We are looking for a detail-oriented and experienced Site Reliability Engineer to join our team. The Site Reliability Engineer will be responsible for creating and implementing scalable software solutions in order to meet system and application performance goals. You will also be responsible for troubleshooting system errors and resolving any relevant issues.
Roles and Responsibilities
System Monitoring and Incident Response: SREs are responsible for implementing monitoring solutions to track system health,
performance, and availability. They proactively monitor systems, identify issues, and respond to incidents
promptly, working to minimize downtime and mitigate impacts.
Post-Incident Analysis: SREs lead incident response efforts, coordinate with cross-functional teams, and
conduct post-incident analysis to identify root causes and implement preventive measures.
Continuous Improvement and Reliability Engineering: SREs drive continuous improvement efforts by
identifying areas for enhancement, implementing best practices, and fostering a culture of reliability
engineering. They participate in post-mortems, conduct blameless retrospectives, and drive initiatives to
improve system reliability, stability, and maintainability.
Collaboration and Knowledge Sharing: SREs collaborate closely with software engineers, operations teams,
and other stakeholders to ensure smooth coordination and effective communication. They share knowledge,
provide technical guidance, and contribute to the development of a strong engineering culture.
Support and maintain configuration management for various applications and systems
Implement comprehensive service monitoring, including dashboards, metrics, and alerts
Define, measure, and meet key service level objectives, such as uptime, performance, incidents, and chronic
problems
Partner with application and business stakeholders to ensure high-quality product development and release
Collaborate with the development team to enhance system reliability and performance.
Qualifications
Bachelor's degree in Information Technology, Computer Science, or related field.
Strong knowledge of software development processes and procedures.
Strong problem-solving abilities.
Excellent understanding of computer systems, servers, and network systems.
Ability to work under pressure and manage multiple tasks simultaneously.
Strong communication and interpersonal skills.
Strong knowledge of coding languages like Python, Java, Go, etc.
Ability to program (structured and OOP) using one or more high-level languages, such as Python, Java, C/C++,
Ruby, and JavaScript
Experience with distributed storage technologies such as NFS, HDFS, Ceph, and Amazon S3, as well as dynamic
resource management frameworks (Apache Mesos, Kubernetes, YARN)
Job Description
Experience with cloud computing platforms such as AWS, Azure, or Google Cloud
Experience with DevOps tools such as Git, Jenkins, Ansible, Terraform, Docker, etc.
Experience with monitoring tools such as Splunk, Prometheus
Skills: problem solving, post-incident analysis, aws, monitoring tools, cloud computing, key service level objectives, reliability engineering, configuration management, devops practices, coding languages, monitoring tools (splunk, prometheus), continuous improvement, site reliability engineering, service monitoring, incident response, reliability, software development processes, system monitoring, splunk, devops tools (git, jenkins, ansible, terraform, docker), kubernetes, cloud computing (aws, azure, google cloud), devops, ansible, programming (python, java, go, c/c++, ruby, javascript), prometheus, cloud infrastructure, monitoring services
Keywords: cloud computing, splunk, prometheus, software development processes, system monitoring, devops tools, git, jenkins, ansible, terraform, docker, python, java, go, c/c++, ruby, javascript, Site Reliability Engineering*
Mandatory Key Skills: cloud computing, splunk, prometheus, software development processes, system monitoring, devops tools, git, jenkins, ansible, terraform, docker, python, java, go, c/c++, ruby, javascript, Site Reliability Engineering*
-
Site Reliability Engineer
7 days ago
Pune, Maharashtra, India Idox Full time ₹ 9,00,000 - ₹ 12,00,000 per year
Site Reliability Engineer (AWS)
Pune, India
About the role
We are seeking a driven and detail-oriented Site Reliability Engineer (SRE) with a strong passion for building resilient, scalable cloud infrastructure. This role offers an exciting opportunity for professionals with 2 to 4 years of experience in DevOps, Cloud, or Infrastructure to deepen their...
-
Site Reliability Engineer
1 week ago
Pune, Maharashtra, India Talent Worx Full time ₹ 15,00,000 - ₹ 25,00,000 per year
Site Reliability Engineer (SRE)
At Talent Worx, we are looking for a dedicated Site Reliability Engineer (SRE) to join our team. This role involves maintaining high availability and reliability of our services through the application of software engineering practices and systems administration skills. The ideal candidate will bridge the gap between...
-
Site Reliability Engineer
2 weeks ago
Pune, Maharashtra, India Equifax Full time US$ 1,25,000 - US$ 1,75,000 per year
Site Reliability Engineering (SRE) at Equifax is a discipline that combines software and systems engineering for building and running large-scale, distributed, fault-tolerant systems. SRE ensures that internal and external services meet or exceed reliability and performance expectations while adhering to Equifax engineering principles. SRE is also an...
-
Site Reliability Engineer
5 days ago
Pune, Maharashtra, India Creospan Inc. Full time ₹ 15,00,000 - ₹ 28,00,000 per year
Creospan is a growing tech collective of makers, shakers, and problem solvers, offering solutions today that will propel businesses into a better tomorrow. "Tomorrow's ideas, built today." In addition to being able to work alongside equally brilliant and motivated developers, our consultants appreciate the opportunity to learn and apply new skills and...
-
Site Reliability Engineering
7 days ago
Pune, Maharashtra, India Deutsche Bank Full time ₹ 10,00,000 - ₹ 25,00,000 per year
Site Reliability Engineering (SRE) Lead, VP
Job ID: R0402474
Full/Part-Time: Full-time
Regular/Temporary: Regular
Listed:
Location: Pune
Position Overview
Job Title: Site Reliability Engineering (SRE) Lead
Corporate Title: Vice President
Location: Pune, India
Role Description
We are seeking an experienced and highly capable Site Reliability Engineering (SRE) Lead to...
-
Site Reliability Engineer
3 days ago
Pune, Maharashtra, India Equifax Full time ₹ 10,00,000 - ₹ 25,00,000 per year
Site Reliability Engineering (SRE) at Equifax is a discipline that combines software and systems engineering for building and running large-scale, distributed, fault-tolerant systems. SRE ensures that internal and external services meet or exceed reliability and performance expectations while adhering to Equifax engineering principles. SRE is also an...
-
Site Reliability Engineer
2 weeks ago
Pune, Maharashtra, India Techverito Software Solutions LLP Full time ₹ 8,00,000 - ₹ 24,00,000 per year
Job Description
3-5 years of proven and progressive experience as an SRE or DevOps Engineer. As an SRE, you will have a strong background in cloud infrastructure management and deployment, with expertise in AWS cloud, DevOps tools, and the Kubernetes ecosystem. The primary focus of this role will be to design, implement, and manage our cloud...
-
Site Reliability Engineer
7 days ago
Pune, Maharashtra, India Aziro Full time ₹ 15,00,000 - ₹ 20,00,000 per year
We are hiring a "SRE [Site Reliability Engineer] Infrastructure Support" engineer with deep expertise in Linux, Kubernetes, and hardware infrastructure management for our "Enterprise-grade high-performance supercomputing" platform. We are helping enterprises and service providers build their AI inference platforms for end users, powered by our...
-
Associate Site Reliability Engineer
7 days ago
Pune, Maharashtra, India Acquia Full time ₹ 5,00,000 - ₹ 12,00,000 per year
Job Title: Associate Site Reliability Engineer
Acquia is the open source digital experience company. We provide the world's most ambitious brands with technology that allows them to embrace innovation and create customer moments that matter. At Acquia we believe in the power of community and collaboration – giving our customers the freedom to build tomorrow...
-
Site Reliability Engineer
3 weeks ago
Pune, Maharashtra, India TechVerito Full time
About the Role:
3-5 years of proven and progressive experience as an SRE or DevOps Engineer. As an SRE, you will have a strong background in cloud infrastructure management, migration, and deployment, with expertise in Google Cloud Platform (GCP), DevOps tools, and the Kubernetes ecosystem. The primary focus of this role will be to migrate and manage our...