Site Reliability Engineer – 2

4 weeks ago


India [24]7.ai Full time

Job Role: Site Reliability Engineer – 2

Location: Bangalore

Working Hours: Permanent Night Shifts (PST Working Hours)


Job Description

At [24]7.ai, we’re passionate about building software that solves problems. We count on our site reliability engineers (SREs) to empower our users with a rich feature set, high availability, and stellar performance to pursue their missions. As we expand our customer deployments, we are seeking an experienced SRE to deliver insights from massive-scale data in real time. Specifically, we are searching for someone who brings fresh ideas, demonstrates a unique and informed viewpoint, and enjoys collaborating with a cross-functional team to develop real-world solutions and positive user experiences at every interaction.

Objectives of this Role:

  • Run the production environment by monitoring availability and taking a holistic view of system health.
  • Build software and systems to manage platform infrastructure and applications.
  • Improve reliability, quality, and time-to-market of our suite of software solutions.
  • Measure and optimize system performance, with an eye toward pushing our capabilities forward, getting ahead of customer needs, and innovating to continually improve.
  • Provide primary operational support and engineering for multiple large, distributed software applications.

Required Skills:

  • Strong working knowledge of Red Hat Linux environments.
  • Good knowledge of Python and Bash shell script development.
  • Ability to program in one or more high-level languages, such as Python or Perl.
  • Experience with logging, monitoring, alerting, CI/CD, and big data platform tools.
  • Strong communication and analytical/problem-solving skills.
  • Good understanding of cloud technologies such as GCP and Azure.

Preferred Qualifications

  • Bachelor’s degree in computer science or another highly technical, scientific discipline.
  • Previous success in technical engineering.
  • Coding experience beyond simple scripts.
  • Good understanding of networking concepts (load balancers, TCP/IP, firewalls).
  • Experience with logging and monitoring tools such as Logstash, Kibana, and Grafana.
  • Strong debugging skills.

Responsibilities:

  • Should be flexible to work PST working hours.
  • Perform Incident Management and Change Management to maintain the continuous availability of all Cloud Infrastructure services.
  • Ensure all SRE and operating procedures are maintained and executed.
  • Work in partnership with stakeholders to design, implement, manage, and support a highly available and secure infrastructure.
  • Maintain a 24x7 production environment with a high level of service availability, perform quality reviews, and manage operational issues.
  • Partner with development teams in defining and implementing improvements in service architecture.
  • Interface with Dev, QA, and Ops teams to identify root causes and re-instrument triggers to prevent future network degradation and outages.
  • Explore and innovate with new cloud technologies, features, and tools to improve the platform, and automate using Bash, Python, or Perl.
  • Implement automation and orchestration for manual processes required to operate and deploy cloud services; be at the heart of developing new ideas into internal tools by working closely with teams.
  • Partner with development teams to improve services through rigorous testing and release procedures.
  • Analyze alarms and dashboards to identify problem areas, report incidents, troubleshoot, and escalate as required.
  • Gather and analyze metrics from both operating systems and applications to assist in performance tuning and fault finding.
  • Perform ticket reviews and updates through the JIRA ticketing tool.
  • Manage, coordinate, and document all types of maintenance and events.
  • Must take initiative and be proactive.
  • Must take on the responsibility to learn new products and procedures.
  • Participate in system design consulting, platform management, and capacity planning.
  • Create sustainable systems and services through automation and uplifts.
  • Conduct post-incident reviews, create actionable reports, and recommend application optimizations to engineering teams.
  • Implement proactive monitoring, alerting, trend analysis, and self-healing systems.
  • Understand the existing architecture and work with various engineering teams to develop and execute strategies to provide a high-quality global production service.

About [24]7.ai

[24]7.ai is a leader in the Conversational AI market, with more than 250 Fortune 500/1000 customers. We continue to transform our business to drive greater value for our team, shareholders, and customers through new product development and market growth.

[24]7.ai is redefining the way companies interact with consumers. Using Artificial Intelligence and Machine Learning to understand consumer intent, [24]7.ai’s technology helps companies create a personalized, predictive, and effortless customer experience across all channels. The world’s largest and most recognizable brands are using intent-driven engagement from [24]7.ai to assist several hundred million visitors annually, through more than 1.5 billion conversations, most of which are automated and learn from each consumer experience.
