Site Reliability Engineer – 2

4 weeks ago


Delhi, India [24]7.ai Full time
Job Role: Site Reliability Engineer – 2
Location: Bangalore
Working Hours: Permanent Night Shifts (PST Working Hours)
Job Description
At [24]7.ai, we’re passionate about building software that solves problems. We count on our site reliability engineers (SREs) to empower our users with a rich feature set, high availability, and stellar performance to pursue their missions. As we expand our customer deployments, we are seeking an experienced SRE to deliver insights from massive-scale data in real time. Specifically, we are searching for someone who brings fresh ideas, demonstrates a unique and informed viewpoint, and enjoys collaborating with a cross-functional team to develop real-world solutions and positive user experiences at every interaction.
Objectives of this Role:
Run the production environment by monitoring availability and taking a holistic view of system health.
Build software and systems to manage platform infrastructure and applications.
Improve reliability, quality, and time-to-market of our suite of software solutions.
Measure and optimize system performance, with an eye toward pushing our capabilities forward, getting ahead of customer needs, and innovating to continually improve.
Provide primary operational support and engineering for multiple large, distributed software applications.
Required Skills:
Strong working knowledge of Red Hat Linux environments.
Good knowledge of Python and Bash shell script development.
Ability to program in one or more high-level languages, such as Python or Perl.
Experience with logging, monitoring, alerting, CI/CD, and big data platform tools.
Strong communication and analytical/problem-solving skills.
Good understanding of cloud technologies such as GCP and Azure.
Preferred Qualifications
Bachelor’s degree in computer science or another highly technical, scientific discipline.
Previous success in technical engineering.
Coding experience beyond simple scripts.
Good understanding of networking concepts (load balancers, TCP/IP, firewalls).
Experience with logging and monitoring tools such as Logstash, Kibana, and Grafana.
Strong debugging skills.
Responsibilities:
Should be flexible to work in PST working hours.
Perform Incident Management and Change Management to maintain the continuous availability of all Cloud Infrastructure services.
Ensure all SRE and operating procedures are maintained and executed.
Work in partnership with stakeholders to design, implement, manage, and support a highly available and secure infrastructure.
Maintain a 24x7 production environment with a high level of service availability; perform quality reviews and manage operational issues.
Partner with development teams in defining and implementing improvements in service architecture.
Interface with Dev, QA, and Ops teams to perform root cause analysis and re-instrument triggers to prevent future network degradation and outages.
Explore and innovate with new cloud technologies, features, and tools to improve the platform, and automate using languages such as Bash, Python, or Perl.
Implement automation and orchestration for manual processes required to operate and deploy cloud services. Be at the heart of developing new ideas into internal tools by working closely with teams.
Partner with development teams to improve services through rigorous testing and release procedures.
Analyze alarms and dashboards to identify problem areas, report incidents, troubleshoot, and escalate as required.
Gather and analyze metrics from both operating systems and applications to assist in performance tuning and fault finding.
Perform ticket review and updates through the JIRA ticketing tool.
Manage, coordinate, and document all types of maintenance and events.
Must take initiative and be proactive.
Must take on the responsibility to learn new products and procedures.
Participate in system design consulting, platform management, and capacity planning.
Create sustainable systems and services through automation and uplifts.
Conduct post-incident reviews, create actionable reports, and recommend application optimizations for engineering teams.
Implement proactive monitoring, alerting, trend analysis, and self-healing systems.
Understand the existing architecture and work with various engineering teams to develop and execute strategies to provide a high-quality global production service.
About [24]7.ai
[24]7.ai is a leader in the Conversational AI market, with more than 250 Fortune 500/1000 customers. We continue to transform our business to drive greater value to our team, shareholders, and customers through new product development and market growth.
[24]7.ai is redefining the way companies interact with consumers. Using Artificial Intelligence and Machine Learning to understand consumer intent, [24]7.ai’s technology helps companies create a personalized, predictive and effortless customer experience across all channels. The world’s largest and most recognizable brands are using intent-driven engagement from [24]7.ai to assist several hundred million visitors annually, through more than 1.5 billion conversations, most of which are automated and learn from each consumer experience.

  • Delhi, Delhi, India Serendipity Recruiting Full time

    Job Description: As a Site Reliability Engineer (SRE), you will play a vital role in continuously driving improvements in observability, performance, and reliability, aiming to make a substantial impact across the federal government. Our client firmly believes that exceptional technology services are built upon exceptional individuals. For over two decades, our...


  • Delhi, India Daxko Full time

    Company Description: Daxko powers health & wellness throughout the world. Every day our team members focus their passion and expertise in helping health & wellness facilities operate efficiently and engage their members. Whether a neighborhood yoga studio, a national franchise with locations in every city, a YMCA or JCC--and every type of organization in...


  • Delhi, India Quiktrak, LLC Full time

    Job Title: Azure Site Reliability Engineer (SRE) / DevOps Engineer. Job Description: Summary: As an Azure Site Reliability Engineer (SRE) / DevOps Engineer, you will be responsible for ensuring the reliability, scalability, and performance of our cloud infrastructure on the Azure platform. This role involves managing deployments, implementing continuous...


  • Delhi, India Cricbuzz.com Full time

    Site Reliability Engineer. We are looking for a highly skilled and motivated Web Server Site Reliability Engineer to join our team. As a Web Server Site Reliability Engineer, you will be responsible for ensuring the reliability, scalability, and performance of our web server infrastructure and CDN services. Experience: 3-5 years. Responsibilities: ● Design,...


  • Delhi, India Korn Ferry Full time

    Role: Site Reliability Engineer. Exp: 5+ years required. Location: Hyderabad (Work from Office-Hybrid). Shift Timings: 5 AM - 1 PM IST. We are looking for a Site Reliability Engineer with a strong development background to join our team. In this role, you will be responsible for ensuring the reliability and performance of our systems. You will work closely to our...


  • Delhi, India SID Global Solutions Full time

    Dear Candidates, we are looking for immediate joiners (8 to 9 years of experience) for our Hyderabad location: a talented Site Reliability Engineer-Manager to join our dynamic team and contribute to the development of our cutting-edge web applications. If you're passionate about the role and have experience in SRE, GCP, and Kubernetes, send me your updated CV: Please...


  • Delhi, Delhi, India Exoscale Full time

    Job Description: Exoscale is the leading Swiss/European cloud service provider. With services covering the full cloud infrastructure spectrum - from fast deploying virtual machines to S3 compatible object storage - Exoscale provides a simple and scalable experience in order to let its clients focus on their core business. Join a dynamic working environment with...


  • Delhi, India System Soft Technologies Full time

    Title: Site Reliability Engineer. 100% REMOTE. The Site Reliability Engineer (SRE) is a technician who utilizes an array of skills to enhance reliability in critical customer-facing digital assets. The SRE is responsible for maintaining the availability and performance of relevant systems through supporting, building, and enhancing applications, tools and...


  • Delhi, India Aventurine Technologies Inc Full time

    Job Description: SRE (Site Reliability Engineer). Dallas, TX – Hybrid (F2F interview will be requested). 6+ Mon Contract. Note: Look for candidates with 9+ years’ experience. Job Description (SRE): • Collaborating closely with engineering teams on building and enhancing tooling and automation solutions for faster resolution of issues impacting SLOs and...


  • Delhi, India System Soft Technologies Full time

    Title: Site Reliability Engineer. 100% REMOTE. The Site Reliability Engineer (SRE) is a technician who utilizes an array of skills to enhance reliability in critical customer-facing digital assets. The SRE is responsible for maintaining the availability and performance of relevant systems through supporting, building, and enhancing applications, tools and...


  • Delhi, India shreeniwas Full time

    Site Reliability Engineer (SRE). Work together with a team of highly motivated individuals with great passion in the Fintech space. It will be a startup environment, where things are extremely fast-paced, filled with many exciting challenges and opportunities to learn and adopt new technologies. Come join us if you are up for a...


  • Delhi, India Azilen Technologies Full time

    Job purpose: Design & implement the best-engineered technical solutions using the latest technologies and tools. Who you are: Bachelor’s degree in Computer Science, E&C Engineering, IT Engineering, or a related field (2023-2024 passout). Any professional certification in areas like Cloud Administration (AWS, Azure, GCP, etc.), Site Reliability Engineering, Security, etc....


  • Delhi, Delhi, India System Soft Technologies Full time

    Title: Site Reliability Engineer. 100% REMOTE. The Site Reliability Engineer (SRE) is a technician who utilizes an array of skills to enhance reliability in critical customer-facing digital assets. The SRE is responsible for maintaining the availability and performance of relevant systems through supporting, building, and enhancing applications, tools and...


  • Delhi, India WaferWire Cloud Technologies Full time

    Role: SRE (Site Reliability Engineer). Experience: 4+ Years. About WaferWire Cloud Technologies: WaferWire Cloud Technologies is a leading provider of innovative cloud solutions aimed at transforming businesses and driving digital growth. With a focus on cutting-edge technology and customer-centric approaches, we empower organizations to thrive in the digital...


  • Delhi, India System Soft Technologies Full time

    Title: Site Reliability Engineer. 100% REMOTE. Applications are written in .NET (Python or any other scripting would be good); we need more of a dev background than operations. Automation experience: Ansible preferred, but good with Terraform as well. Doesn’t need to come from a 24x7 environment but needs to be okay working in that environment. AWS preferred but any...


  • Delhi, Delhi, India WaferWire Cloud Technologies Full time

    Role: SRE (Site Reliability Engineer). Experience: 4+ Years. About WaferWire Cloud Technologies: WaferWire Cloud Technologies is a leading provider of innovative cloud solutions aimed at transforming businesses and driving digital growth. With a focus on cutting-edge technology and customer-centric approaches, we empower organizations to thrive in the digital era....


  • Delhi, India World Wide Technology Full time

    World Wide Technology (WWT) is a global technology integrator and supply chain solutions provider. WWT employs more than 7000 people worldwide and operates in more than 2 million square feet of state-of-the-art warehousing, distribution, and integration space strategically located throughout the world. WWT is ranked on Glassdoor Best Places to Work for 12...


  • Delhi, Delhi, India SID Global Solutions Full time

    "Note: Males only". Role: SRE. Location: Hyderabad - On-site (Rotational shifts). Exp: 2-3 yrs. Skills: 1-2 years of experience in 24x7 support of enterprise-level applications. Strong problem-solving skills and attention to detail. Excellent communication and teamwork abilities. Willingness to learn and adapt in a fast-paced environment. Knowledge of CI/CD pipelines...


  • New Delhi, India dentsu Full time

    The purpose of this role is to ensure the availability and stability of production and test platforms. Job Title: Site Reliability Engineer. Job Description: Key responsibilities: Troubleshoots and owns issues in our development, test and production environments, including performance optimisation and continuous tuning. Works alongside the DevOps team in...


  • Delhi, India System Soft Technologies Full time

    Job Summary: The Site Reliability Engineer (SRE) is a technician who utilizes an array of skills to enhance reliability in critical customer-facing digital assets. The SRE is responsible for maintaining the availability and performance of relevant systems through supporting, building, and enhancing applications, tools and engaging with infrastructure teams....