SRE - AI Infrastructure

2 weeks ago


Pune, India Ellicium Full time

**What are we about?**:

At Ellicium, be ready for contagious excitement, worthy challenges, and an enriching learning experience every day. We trust in the process of failing fast and learning fast. You will find ‘Ellicians’ putting their heart into getting the perfect dish during our monthly potlucks or arguing over the best Nolan movie during lunch breaks. It is all about having fun. We are passionate people with immense love for what we do, and we are proud of what we have created.

**Our Key Values**:

- AMBITION: Empowering individuals to reach new heights and achieve their ambitious goals.
- TEAMWORK: Collaborating to achieve extraordinary results and support each other’s success.
- GROWTH: Providing continuous learning and development opportunities, unlocking your full potential.
- COMMUNITY: Building a supportive and inclusive community where we make a positive impact together.

**Perks and Benefits**:

Businesses need to make better and faster decisions by analyzing data to stay competitive and future-ready.

- Targeted Bonus Program
- Health Care
- Competitive Salary

**SRE - AI Infrastructure**:
**Primary skills: Containers, Kubernetes, DevOps, Python, Golang, TDD, Linux**
**Years of experience: 5-7+**

The Service Operations team at “Product Platform” Systems is responsible for building and operating the platform and infrastructure that enables us to deliver our groundbreaking capabilities to enterprise customers.
As a site reliability engineer on this team, you will lead key system engineering and automation functions, enhancing our capabilities to provide a reliable and scalable service for customers, in a hybrid deployment pattern.

**How you will make an impact**:

- Assume broad responsibilities for successful delivery of our “Product Platform” services in a hybrid model, including but not limited to deployment, configuration, integrations, and ongoing operations.
- Take ownership of ongoing updates, upgrades, and patches on customer environments.
- Augment ongoing efforts to design and develop automation for deployments, updates and upgrades of the entire “Product Platform” software stack.
- Build the systems and tools for centralized command and control of distributed environments.
- Partner and collaborate with product and engineering teams to improve the security posture and operational readiness of our systems with the flexibility to integrate into unique customer environments.
- Participate in on-call rotation responsibilities.

**Basic Qualifications**:

- Bachelor's and/or Master's degree in CS/EE or a related field.
- 5+ years of hands-on experience as an SRE with a focus on systems and infrastructure for cloud/SaaS production requirements.
- Extensive experience building, configuring, securing, and administering Linux systems in large-scale production environments.
- Strong scripting/programming skills (Python preferred) and experience with automated deployment systems, e.g. Ansible, Terraform, etc.
- Systematic problem-solving approach to troubleshooting, and the desire to solve the root cause of common problems in 24×7 environments.

**Preferred Qualifications**:

- Deep understanding of DNS, DHCP, LDAP, NFS, Kerberos, PAM, PXE, SNMP, SSH, HTTP/S, and NTP, and of troubleshooting network performance issues.
- Knowledge of software development processes and methods, CI/CD pipelines and experience with common version control software.
- Knowledge of virtualization, multiple hypervisor technologies, Kubernetes cluster administration and management.
- Experience with monitoring and logging systems and the ability to identify new technologies as appropriate.
- Configuration and maintenance of web servers, load balancers, databases, storage systems and messaging systems.
- A passion to design for high availability and scale, with the discipline and desire for extensive automation.
- Strong communication skills with the ability and willingness to work with diverse teams, and customers, across multiple time zones.


  • Infrastructure SRE

    1 week ago


    Pune, India Barclays Full time

    Job title: Infrastructure SRE. Location: Pune. About Barclays: Barclays is a British universal bank. We are diversified by business, by different types of customers and clients, and by geography. Our businesses include consumer banking and payments operations around the world, as well as a top-tier, full service, global corporate and investment bank, all of...


  • Kothrud, Pune, Maharashtra, India CRUTZ LEELA ENTERPRISES Full time ₹ 15,40,374 per year

    SRE - Infrastructure Support Engineer – JD: We are hiring a "SRE [Site Reliability Engineer] Infrastructure Support" engineer with deep expertise in Linux, Kubernetes, and hardware infrastructure management for our "Enterprise-grade high-performance supercomputing" platform. We are helping enterprises and service providers build their AI inference platforms for...


  • SRE

    3 weeks ago


    Pune, India Virtusa Full time

    SRE - CREQ Description Minimum 5 years of work experience as an SRE (not Traditional Production Support) covering integration platforms on cloud-based deployments. Coding experience in any programming language, particularly for integration tier and middleware. Working in a 24x7 operations support model for mission critical applications and infrastructure...

  • Software Engineer

    3 weeks ago


    Pune, India Maersk Full time

    About the Role We are looking for a highly skilled Software Engineer with strong AI/ML expertise and a foundational understanding of SRE principles to help transform reliability engineering through intelligent, automation-driven solutions. This role is not just about applying AI; it’s about applying engineering mindset and AI capabilities to...


  • Pune, Maharashtra, India Emergys Full time ₹ 15,00,000 - ₹ 20,00,000 per year

    Experience: 6+ years. NP: 0 to 30 days. Please find JD: We are hiring a "SRE [Site Reliability Engineer] AI ML Support" engineer for our "Enterprise-grade high-performance supercomputing" platform. We are helping enterprises and service providers build their AI inference platforms for end users, powered by our state-of-the-art RDU (Reconfigurable Dataflow...


  • Pune, Maharashtra, India Procallisto solution Full time ₹ 12,00,000 - ₹ 36,00,000 per year

    We are seeking an experienced DevOps Engineer with proven expertise in GitHub to GitLab migration, strong hands-on skills in Python programming, AWS, and Site Reliability Engineering (SRE) practices. The ideal candidate will play a key role in modernizing our CI/CD pipelines, improving cloud infrastructure, and ensuring high system reliability and...


  • Pune, India Maersk Full time

    We are looking for a highly skilled and experienced Site Reliability Engineer (SRE) who will play a key role in transforming reliability engineering through AI-based innovation, while bringing deep expertise in core SRE practices. This role is not just about applying AI; it’s about being a hands-on SRE first: someone who understands real-world...

  • Senior SRE Engineer

    4 days ago


    Pune, Maharashtra, India Biyani Technologies Full time ₹ 6,61,000 - ₹ 22,00,801 per year

    Role: Senior SRE Engineer (AWS/GCP). We are looking for a Senior Site Reliability Engineer (SRE) with expertise in AWS/GCP, Kubernetes, CI/CD, and automation. The role involves designing, building, and scaling cloud infrastructure, leading DevOps best practices, and ensuring system reliability in a modern cloud-based environment. Responsibilities: Lead the...