
Site Reliability Engineer
4 weeks ago
Job Description:
- Engage with our product teams to understand requirements, design, and implement resilient and scalable infrastructure solutions
- Operate, monitor, and triage all aspects of our production and non-production environments
- Collaborate with other engineers on code, infrastructure, design reviews, and process enhancements.
- Evaluate and integrate new technologies to improve system reliability, security, and performance
- Develop and implement automation to provision, configure, deploy, and monitor Apple services
- Participate in an on-call rotation providing hands-on technical expertise during service-impacting events
- Design, build, and maintain highly available and scalable infrastructure
- Implement and improve monitoring, alerting, and incident response systems
- Automate operations tasks and develop efficient workflows
- Conduct system performance analysis and optimization
- Collaborate with development teams to ensure smooth deployment and release processes
- Implement and maintain security best practices and compliance standards
- Troubleshoot and resolve system and application issues
- Participate in capacity planning and scaling efforts
- Stay up-to-date with the latest trends, technologies, and advancements in SRE practices
- Contribute to capacity planning, scale testing, and disaster recovery exercises.
- Approach operational problems with a software engineering mindset
- BS degree in computer science or equivalent field with 5+ years of experience
- 5+ years in an Infrastructure Ops, Site Reliability Engineering, or DevOps-focused role.
- Knowledge of Linux operating system principles, networking fundamentals, and systems management.
- Demonstrable fluency in at least one of the following languages: Java, Python, or Go
- Experience managing and scaling distributed systems in a public, private, or hybrid cloud environment
- Develop and implement automation tools and apply best practices for system reliability.
- Take responsibility for the availability and scalability of our services, and manage disaster recovery and other operational tasks.
- Collaborate with the development team to improve logging, metrics, and traces in the application codebase for observability.
- Collaborate with data science teams and other business units to design, build, and maintain the infrastructure that runs machine learning and generative AI workloads.
- Influence architectural decisions with a focus on security, scalability, and performance.
- Find and fix problems in production, and work to prevent them from recurring
Preferred Qualifications:
- Familiarity with microservices architecture and container orchestration with Kubernetes.
- Awareness of key security principles, including encryption and keys (types and exchange protocols).
- Understanding of SRE principles, including monitoring, alerting, error budgets, fault analysis, and automation.
- Strong sense of ownership, with a desire to communicate and collaborate with other engineers and teams.
- Ability to identify and communicate technical and architectural problems, while working with partners and their team to iteratively find solutions.
-
Site Reliability Engineer
3 weeks ago
Ahmedabad, Gujarat, India ACL Digital Full time
Job Description: - Continuously monitor system performance and identify potential issues before they impact users. - Experience working with industry-leading monitoring tools. - Respond to incidents related to monitoring systems, troubleshoot Level 1 issues, and resolve issues promptly. - Analyze monitoring data to identify trends, anomalies, to...
-
Site Reliability Engineer
2 weeks ago
Ahmedabad, Gujarat, India ACL Digital Full time
Job Description: - Continuously monitor system performance and identify potential issues before they impact users. - Experience working with industry-leading monitoring tools. - Respond to incidents related to monitoring systems, troubleshoot Level 1 issues, and resolve issues promptly. - Analyze monitoring data to identify trends, anomalies, to...
-
Site Reliability Manager
1 day ago
Ahmedabad, Gujarat, India beBeeReliability Full time ₹ 9,00,000 - ₹ 12,00,000
Job Description: We are seeking a skilled IT Operations professional to join our team. In this role, you will be responsible for delivering high-quality IT services to ensure the smooth operation of our site. Deliver site IT services in line with quality, reliability, and cost expectations. Lead Margin Improvement Projects delivery and monitor site incidents and...
-
Senior Site Reliability Engineer
3 days ago
Ahmedabad, Gujarat, India beBeeReliability Full time ₹ 2,00,00,000 - ₹ 2,50,00,000
Job Title: Senior Site Reliability Engineer. About the Job: We are seeking a seasoned Senior Site Reliability Engineer to join our team as a technical leader, coach, and hands-on problem solver. Key Responsibilities: Investigate and resolve high-impact production issues across infrastructure and applications. Educate and guide development teams on performance,...
-
Site Reliability Engineer
1 day ago
Ahmedabad, Gujarat, India Azilen Technologies Full time
Job Purpose: To ensure the reliability, performance, and resilience of our systems by managing Windows and Linux servers, SQL Server, .NET applications, and Azure services, while bridging development and operations teams to foster a culture of reliability. Who you are: ● Lead incident management processes, carry out on-call duties, and effectively use incident...
-
Site Engineer
3 weeks ago
Ahmedabad, Gujarat, India Devashish Infrastructure Pvt Ltd Full time
About the Role: Devashish Infrastructure Pvt. Ltd. is seeking a skilled and dedicated Project Site Engineer to oversee the on-site execution of Pre-Engineered Building (PEB) projects. The ideal candidate will have hands-on experience in managing site-level activities, coordinating with teams and vendors, and ensuring timely and quality execution as per...
-
Site Reliability Engineer
4 weeks ago
Ahmedabad, Gujarat, India VOLANSYS (An ACL Digital Company) Full time
Experience: 5+ Years. Work Mode: Work from office only. Job Description: 1. AWS Cloud Infrastructure: Design, deploy, and manage scalable, secure, and highly available systems on AWS. Optimize cloud costs, enforce tagging, and implement security best practices (IAM, VPC, GuardDuty, etc.). Automate infrastructure provisioning using Terraform or AWS CDK. Ensure...
-
Site Engineer
3 weeks ago
Ahmedabad, Gujarat, India S V SHAH PROJECTS & CONSULTANT LLP Full time
Company: S.V. Shah Project and Consultant LLP. Position: Site Engineer. Site Location: Prantij (Sabarkantha). Qualification: Diploma / B.E in Civil Engineering. Post: 5 Nos. About Us: S.V. Shah Project and Consultant LLP is a leading project management consultancy specializing in construction management. We offer comprehensive services from the design and development...
-
Electrical Site Engineer
3 days ago
Ahmedabad, Gujarat, India Aarvi Encon Limited Full time
Greetings from Aarvi Encon Limited. We have an opening for an Electrical Site Engineer position at our Dahej (Gujarat) location. Job Description: Execution and planning of electrical project and site work, such as: Coordinate with and support the design team on drawings and technical documents. Inspection and testing of electrical panels, motors, earthing pits, and DBs. Verifying BOQ and...
-
Site Reliability Engineer
1 week ago
Ahmedabad, Gujarat, India Wipro Full time
Primary Skills (Must have): - Well versed in Unix shell scripting. - Good at building CI/CD; familiar with Jenkins, Git, and Maven. - Troubleshooting using logs, Splunk / Dynatrace, and alert configuration. - Good knowledge of ITSM – Incident, Change, and Problem Management. Must be able to extract, modify, and update data in Postgres, SQL DB. Job Description...