Site Reliability Architect
7 days ago
Job Description :
Core Skills :
- 12 to 14 years of experience in Site Reliability Engineering, DevOps, or a related field, with at least 3 years in a senior or architect-level role.
- Strong expertise in system architecture, distributed systems, cloud computing (e.g., AWS, Azure, GCP), containerization (e.g., Docker, Kubernetes), and infrastructure as code (e.g., Terraform, Ansible).
- Proficiency in one or more programming/scripting languages (e.g., Python, Groovy, Shell, PowerShell, or similar).
- Strong background in DevOps practices and cloud technologies, with a focus on ensuring the scalability, reliability, and security of cloud infrastructure.
- Experience with monitoring and observability tools (e.g., Dynatrace, Prometheus, Grafana, ELK stack, Datadog).
- Experience in integrating SRE with backend technologies such as databases and messaging systems. Strong understanding of software engineering principles and practices.
- Deep understanding of incident management, root cause analysis, and post-incident review processes.
- Involvement in setting strategic direction for SRE practices, leading technical initiatives, and promoting a culture of excellence in site reliability engineering.
- Excellent problem-solving and communication skills and ability to work collaboratively in a fast-paced and dynamic environment.
- Proven ability to lead technical projects, influence cross-functional teams, and drive change.
- Excellent verbal and written communication skills, with the ability to articulate complex technical concepts to both technical and non-technical audiences.
- Certifications in relevant technologies, such as Cloud certified DevOps Architect or Cloud Operations Support Architect.
Key Responsibilities :
- Architecting Systems : Design and architect highly available, scalable, and resilient systems to meet the demands of our growing user base and evolving business needs.
- Reliability Engineering : Develop and implement strategies to improve system reliability, including incident management, monitoring, and automated remediation.
- Performance Optimization : Identify and address performance bottlenecks, optimize system performance, and ensure efficient resource utilization.
- Collaboration : Partner with development teams, product managers, and other stakeholders to integrate SRE practices into the development lifecycle and ensure alignment with business objectives.
- Automation : Drive automation initiatives to reduce manual intervention, increase efficiency, and improve system reliability.
- Incident Management : Lead post-incident reviews, root cause analysis, and develop strategies for preventing future incidents.
- Best Practices : Establish and enforce best practices for system design, monitoring, and incident management.
- Mentorship : Provide guidance and mentorship to junior SREs and engineering teams on SRE principles and practices.
Qualifications :
- Experience : 8+ years of experience in Site Reliability Engineering, DevOps, or a related field, with at least 3 years in a senior or architect-level role.
- Technical Skills : Programming : Proficiency in one or more programming languages (e.g., Python, Go, Java, or similar).
- Monitoring Tools : Experience with monitoring and observability tools (e.g., Prometheus, Grafana, ELK stack, Datadog).
- Incident Response :
- Leadership : Proven ability to lead technical projects, influence cross-functional teams, and drive change.
Communication :
Preferred Qualifications :
- Certifications : Relevant certifications (e.g., AWS Certified Solutions Architect, Google Professional Cloud Architect) are a plus.
- Experience : Previous experience in high-growth or high-availability environments.
-
Site Infrastructure Expert
7 days ago
Anywhere in India/Multiple Locations Infogain Full time
About This Role
This is an exciting opportunity to join our team as a Site Infrastructure Expert at Infogain. As a key member of our infrastructure team, you will be responsible for designing and implementing reliable, scalable, and secure systems that meet the needs of our growing user base. The ideal candidate will have a strong background in DevOps...
-
Infogain - Site Reliability Architect - DevOps
4 weeks ago
India Infogain Full time
Job Description : Core Skills :
- 12 to 14 years of experience in Site Reliability Engineering, DevOps, or a related field, with at least 3 years in a senior or architect-level role.
- Strong expertise in system architecture, distributed systems, cloud computing (e.g., AWS, Azure, GCP), containerization (e.g., Docker, Kubernetes), and infrastructure as code...
-
Site Reliability Engineer
2 weeks ago
Anywhere in India/Multiple Locations ca-one tech cloud inc Full time
Job Description :
- At least 5 years of experience in configuring enterprise-level Linux systems within a highly networked environment.
- Expertise in using Chef for configuration management and automation, including the creation and management of Chef cookbooks and recipes.
- Strong proficiency in scripting languages such as Python and Bash for automating...
-
Emorphis Technologies
3 weeks ago
Anywhere in India/Multiple Locations Emorphis Technologies Pvt. Ltd. Full time
Job Summary : We are looking for a highly skilled Site Reliability Engineer with 8+ years of experience to drive reliability, performance, and efficiency across our applications and infrastructure. This role requires a deep understanding of cloud platforms, observability tools, and automation technologies, coupled with a proactive approach to improving...
-
Site Reliability Engineer
4 weeks ago
India Coforge Full time
Job Title: Site Reliability Engineer
Skills: SRE, CI/CD, AWS, Python, Terraform & Kubernetes
Location: Hyderabad (Work from Office)
Experience: 6-14 Years
Note: Immediate joiners are preferable
Job Description: We at Coforge are hiring a Site Reliability Engineer with the following skillset:
- Design, implement, and manage scalable and secure cloud-based...
-
Site Reliability Engineer
4 weeks ago
India, India BigRio Full time
Job Title: Site Reliability Engineer
Location: Remote with Quarterly visits to Chennai, Tamil Nadu, India
Duration: Full-Time
About BigRio: BigRio is a remote-based technology consulting firm headquartered in Boston, MA. We deliver software solutions ranging from custom development and software implementation to data analytics and machine learning/AI...
-
Integration Architect
2 weeks ago
Anywhere in India/Multiple Locations Dimiour Full time
Job Description for Integration Architect Profile. We are seeking a highly skilled Integration Architect with over 15 years of experience to join our team. As Integration Architect, you will be responsible for designing and implementing scalable, reliable, and secure solutions that meet the needs of our clients. You will work closely with our development...
-
Site Reliability Engineer
4 weeks ago
India CorroHealth Full time
Hiring Alert! We are looking for a highly skilled Site Reliability Engineer (SRE) for our Product Development team based in Noida. Only immediate joiners preferred. Only candidates who are available for a face-to-face (F2F) interview round should apply.
Job Description
We are seeking a highly skilled Site Reliability Engineer (SRE) to join our team. The ideal...
-
Technical Architect
4 days ago
Anywhere in India/Multiple Locations Dimiour Full time
Role : Platform App Architect. Location : Remote. We are seeking a highly skilled Platform App Architect with over 15 years of experience to join our team. As a Platform App Architect, you will be responsible for designing and implementing scalable, reliable, and secure platforms that meet the needs of our clients. You will work closely with our development...
-
Site Reliability Engineer
3 hours ago
India Agivant Technologies Full time
Job Description : We are looking for a highly skilled Site Reliability Engineer (SRE) with strong engineering and architectural expertise to design, implement, and manage large-scale, mission-critical infrastructure across multiple data centers and cloud providers. As an SRE, you will be responsible for architecting and optimizing our global infrastructure,...
-
Platform Architect Expert
4 days ago
Anywhere in India/Multiple Locations Dimiour Full time
Job Description
We are seeking a highly skilled Platform App Architect to join our team at Dimiour. As a key member, you will be responsible for designing and implementing scalable, reliable, and secure platforms that meet the needs of our clients. Your primary focus will be on working closely with our development team to ensure that our platforms are built...
-
Scalable Software Architect
4 days ago
Anywhere in India/Multiple Locations Cybyrotek Solutions Full time
Key Responsibilities
As a Scalable Software Architect at Cybyrotek Solutions, you will be responsible for designing and implementing high-performance, scalable, and reliable backend systems using Go (Golang). You will also work closely with cross-functional teams to design and implement system architecture, optimize applications for maximum speed and...
-
Site Reliability Leader
16 hours ago
India Vaco Binary Full time
Vaco Binary is looking for a talented Site Reliability Leader to lead our DevOps team. As a Site Reliability Leader, you will be responsible for ensuring the reliability, scalability, and security of our applications and infrastructure. You will work closely with our development teams to design and implement collaborative solutions for enterprise applications...
-
Site Reliability Engineer
4 weeks ago
India CSC Full time
Role: Site Reliability Engineer
Location: Mumbai / Bangalore / Chennai
Job Shift: 12PM IST – 9PM IST
Working Model: Hybrid
Intro: Do you want to be noticed for your work? Make a difference every day? Be Impactful? Work with cutting edge technology. If so, you will fit in perfectly at CSC and especially within the Regulatory Technology Team. The world’s...
-
Cloud Site Reliability Engineer
21 hours ago
India Agivant Technologies Full time
Job Description : We are looking for a highly skilled Site Reliability Engineer (SRE) with strong engineering and architectural expertise to design, implement, and manage large-scale, mission-critical infrastructure across multiple data centers and cloud providers. As an SRE, you will be responsible for architecting and optimizing our global infrastructure,...
-
Site Reliability Engineer
3 weeks ago
India iVedha Inc. Full time
Site Reliability Engineer (SRE)
Remote in India and have to work in EST (US/Canada) Time Zone with 24*7 Support Model
Position Overview: We are seeking a highly skilled and motivated Site Reliability Engineer (SRE) with strong expertise in Python, advanced proficiency in Azure-based infrastructure, and significant experience in Customer Reliability...