
Site Reliability Engineer
6 days ago
We are hiring an "SRE (Site Reliability Engineer) Infrastructure Support" engineer with deep expertise in Linux, Kubernetes, and hardware infrastructure management for our "Enterprise-grade high-performance supercomputing" platform. We help enterprises and service providers build AI inference platforms for their end users, powered by our state-of-the-art RDU (Reconfigurable Dataflow Unit) hardware architecture. This is a high-impact, high-visibility role. The ideal candidate will play a pivotal role in supporting and maintaining our enterprise infrastructure stack, ensuring high availability and optimal performance across mission-critical AI & ML environments. The role involves close collaboration with global SRE and Platform teams to manage and troubleshoot enterprise systems and clusters.
Key Responsibilities:
- Linux Administration: Manage, configure, and optimize Linux servers (RHEL, Ubuntu, or similar), including patching, security hardening, and performance tuning.
- Kubernetes Administration: Deploy, manage, and troubleshoot Kubernetes clusters, ensuring reliability and scalability.
- Hardware Infrastructure Management: Oversee physical data center infrastructure, including servers, storage, and networking hardware.
- Security & Compliance: Apply security patches and upgrades for Linux-based Kubernetes environments and ensure compliance with organizational policies.
- Collaboration & Support: Work closely with SRE and Platform teams worldwide to support enterprise systems and clusters.
- Ticket-Based Case Management: Handle tickets efficiently using tools such as Salesforce or ServiceNow.
Required Qualifications:
- Strong hands-on experience with Linux system administration (RHEL, Ubuntu, or similar). RHCSA/RHCE certification is a plus.
- Solid understanding of Kubernetes administration; CKA/CKS certification is a plus.
- Hands-on experience with bare-metal and hardware infrastructure (servers, storage, networking).
- Good understanding of networking concepts (TCP/IP, DNS, Load Balancers, Firewalls); knowledge of Juniper OS is a plus.
- Strong troubleshooting skills across hardware, OS, and Kubernetes environments.
- Knowledge of automation tools such as Ansible, Python, Bash, or similar is a plus.
- Familiarity with monitoring and observability tools (Prometheus, Grafana, ELK) is a plus.
Soft Skills:
- Strong communication, problem-solving, and collaboration abilities.
- Ability to work effectively in fast-paced, dynamic environments and adapt to evolving AI & ML technologies.
- Proactive mindset with a focus on automation, scalability, and operational excellence.
Why Join Us:
- Work on cutting-edge AI & ML infrastructure supporting mission-critical applications.
- Collaborate with global teams and gain exposure to advanced cloud-native and enterprise technologies.
- Opportunity to grow your expertise in Linux, Kubernetes, and data center operations.
-
Specialist - Site Reliability Engineer
4 days ago
Pune, Maharashtra, India Accelya Group Full time ₹ 20,00,000 - ₹ 25,00,000 per year For more than 40 years, Accelya has been the industry's partner for change, simplifying airline financial and commercial processes and empowering the air transport community to take better control of the future. Whether partnering with IATA on industry-wide initiatives or enabling digital transformation to simplify airline processes, Accelya drives the...
-
Specialist - Site Reliability Engineer
4 days ago
Pune, Maharashtra, India Accelya Group Full time ₹ 15,00,000 - ₹ 25,00,000 per year For more than 40 years, Accelya has been the industry's partner for change, simplifying airline financial and commercial processes and empowering the air transport community to take better control of the future. Whether partnering with IATA on industry-wide initiatives or enabling digital transformation to simplify airline processes, Accelya drives the...
-
Site Reliability Engineer
14 hours ago
Pune, Maharashtra, India Ather Energy Full time ₹ 6,00,000 - ₹ 18,00,000 per year You'll be our: Site Reliability Engineer. You'll be based at: Pune Zonal Office. You'll be aligned with: Cloud and Data Platform Lead / Cloud Architect. You'll be a member of: Cloud and Data Platform Team. Ather's fleet of smart scooters is growing rapidly, and so is the volume of data they generate. Our Vehicle Data Platform (VDP) is the core of this ecosystem, and...
-
SRE (Site Reliability Engineer)
20 hours ago
Pune, Maharashtra, India Apex One Full time ₹ 6,00,000 - ₹ 18,00,000 per year Job Overview: We are looking for a detail-oriented and experienced Site Reliability Engineer to join our team. The Site Reliability Engineer will be responsible for creating and implementing scalable software solutions in order to meet system and application performance goals. You will also be responsible for troubleshooting system errors and resolving any...
-
Site Reliability Engineer
4 days ago
Pune, Maharashtra, India Idox Full time ₹ 9,00,000 - ₹ 12,00,000 per year Site Reliability Engineer (AWS), Pune, India. About the role: We are seeking a driven and detail-oriented Site Reliability Engineer (SRE) with a strong passion for building resilient, scalable cloud infrastructure. This role offers an exciting opportunity for professionals with 2 to 4 years of experience in DevOps, Cloud, or Infrastructure to deepen their...
-
Site Reliability Engineer
4 weeks ago
Pune, Maharashtra, India Reveille Technologies Full time Job Summary: We are seeking a skilled and proactive Site Reliability Engineer (SRE) with a strong DevOps mindset and hands-on experience in application troubleshooting. The ideal candidate will be responsible for ensuring the reliability, scalability, and performance of our applications and infrastructure. This role requires a blend of software engineering,...
-
Site Reliability Engineer
4 weeks ago
Pune, Maharashtra, India Allianz Full time Site Reliability Engineer (SRE) - One Identity Access Management. The primary objective of the Site Reliability Engineer (SRE) specializing in One Identity Access Management is to ensure the seamless operation, reliability, and scalability of IAM systems within the organization. This role is critical in maintaining system integrity, optimizing performance, and...
-
Site Reliability Engineer
4 weeks ago
Pune, Maharashtra, India Uplers Full time Job Description. Must have skills required: Azure DevOps, SRE concepts, TerraData, CDC, CDC tool, NEWREL. Good to have skills: AWS CloudWatch. Reflections Info Systems (one of Uplers' clients) is looking for: Site Reliability Engineer who is passionate about their work, eager to learn and grow, and who is committed to delivering exceptional results. If you are a...
-
Site Reliability Engineer
3 days ago
Pune, Maharashtra, India Creospan Inc. Full time ₹ 15,00,000 - ₹ 28,00,000 per year Creospan is a growing tech collective of makers, shakers, and problem solvers, offering solutions today that will propel businesses into a better tomorrow. "Tomorrow's ideas, built today." In addition to being able to work alongside equally brilliant and motivated developers, our consultants appreciate the opportunity to learn and apply new skills and...
-
Site Reliability Engineering
5 days ago
Pune, Maharashtra, India Deutsche Bank Full time ₹ 10,00,000 - ₹ 25,00,000 per year Site Reliability Engineering (SRE) Lead, VP. Job ID: R0402474. Full/Part-Time: Full-time. Regular/Temporary: Regular. Listed: Location: Pune. Position Overview. Job Title: Site Reliability Engineering (SRE) Lead. Corporate Title: Vice President. Location: Pune, India. Role Description: We are seeking an experienced and highly capable Site Reliability Engineering (SRE) Lead to...