Site Reliability Engineer

3 months ago


Bangalore Metropolitan Area, India Quantzig Full time

Job Summary:

As a Site Reliability Engineer (SRE) specializing in Machine Learning and AI Platform, you will play a critical role in designing, implementing, and maintaining a highly scalable, reliable, and performant infrastructure to support our organization's machine learning and artificial intelligence initiatives. You'll collaborate closely with cross-functional teams including data scientists, software engineers, and product managers to ensure our ML/AI platform meets the highest standards of reliability, availability, and efficiency.


Key Responsibilities:


1. Main Role as SRE:

- Design and implement robust, scalable, and automated infrastructure solutions to support our machine learning and artificial intelligence workloads.

- Proactively identify and address potential performance bottlenecks, reliability issues, and security vulnerabilities in the ML/AI platform.

- Collaborate with AI engineering teams to define best practices for deploying, monitoring, and managing machine learning models and pipelines in production environments.

- Continuously optimize infrastructure components for cost-effectiveness, scalability, and performance.

- Optimize platform performance and ensure security and compliance standards are met.

- Collaborate with cross-functional teams to troubleshoot and resolve platform-related issues.

- Provide technical guidance and mentorship to junior team members.

- Create, govern, and continuously improve the IaC automation framework and scripts for our company-wide solutions.

- Provide direct technical design and delivery support for top-priority initiatives while governing, influencing, and approving all other initiatives.

- Provide support and guidance across the company on technical design and standards.

- Ensure that delivered solutions are aligned to enterprise standards (Architecture, Operations, and Infrastructure) and of high quality while maintaining the required non-functional attributes such as performance, supportability, security, usability, reliability and stability.



2. Deliverables:

- Help architect and deploy highly available and fault-tolerant infrastructure for hosting machine learning models and training pipelines.

- Implement automated deployment pipelines for deploying ML/AI models and pipelines into production environments.

- Develop and maintain monitoring and alerting systems to ensure the health and performance of the ML/AI platform.

- Create documentation and provide training to internal teams on best practices for operating and troubleshooting the ML/AI platform.

- Contribute to the development of internal tools and frameworks to streamline machine learning workflow processes.


Qualifications:


Level of educational attainment required:

  • 5-10 years of experience
  • Academic Degree – BE or BTech, MCA, M.Sc.
  • Engineer, IT-Related professions


- Extensive experience in designing, implementing, and managing cloud-based infrastructure solutions, preferably on Azure; GCP experience is a plus.

- Proficiency in containerization technologies such as Docker and orchestration frameworks like Kubernetes.

- Strong skills in programming and infrastructure-as-code tooling, such as Python and Terraform.

- Experience with monitoring and observability tools such as Prometheus, Grafana, and ELK stack.

- Excellent problem-solving and communication skills, with a proactive and collaborative approach to working in cross-functional teams.