Senior Site Reliability Engineer

3 days ago


Remote, India OutSystems Full time ₹ 10,00,000 - ₹ 25,00,000 per year

There are NO limits to your career: come shape the future and be part of a truly unique global culture at OutSystems.

About the role

Site Reliability Engineering (SRE) is a discipline that incorporates aspects of software engineering and applies them to infrastructure and operations problems. The main goal of SRE is to create scalable and highly reliable systems. Our SREs ensure the reliability, performance, and scalability of our production systems while enabling rapid development and deployment of new features and services.

SREs at OutSystems work closely with development teams, acting as an extension of the team, in adopting the reliability tenets with the shared goal of meeting Service Level Objectives (SLOs) and thus delivering a smooth and frictionless Customer Experience.

Key Responsibilities

As an SRE at OutSystems, you will:

  • Lead and onboard services and teams to the reliability tenets;
  • Establish and maintain Service Level Objectives (SLOs) and Service Level Agreements (SLAs);
  • Design and implement scalable, reliable, and secure infrastructure, while ensuring cloud-native best practices;
  • Collaborate with software development teams to ensure systems are resilient (observable, fault-tolerant, recoverable, scalable) and performant;
  • Implement monitoring, alerting, logging, and tracing solutions to detect and respond to incidents;
  • Lead incident response efforts, ensuring quick resolution and minimal downtime, and conduct RCA/post-mortems;
  • Automate every operational task, with a special focus on fast incident detection & recovery;
  • Foster a culture of continuous improvement and knowledge sharing;
  • Communicate effectively with stakeholders, providing updates on system reliability and performance;
  • Participate in on-call rotation to provide 24/7 support for production systems.

The main KPIs that aid in understanding the impact and success of the SRE function at OutSystems are:

  • Service Level Agreement (SLA) and Service Level Objective (SLO) compliance;
  • SLO Coverage and Detection Ratio;
  • MTTD - Mean time to detect;
  • MTTA - Mean time to acknowledge;
  • MTTR - Mean time to resolve.
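
For illustration only, the three time-based KPIs above could be computed from incident timestamps roughly as follows. This is a minimal sketch, not a description of OutSystems tooling: the `Incident` record, its field names, and the sample data are hypothetical, and it assumes MTTD is measured from fault start to detection, MTTA from detection to acknowledgement, and MTTR from fault start to resolution (conventions vary between teams).

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean


@dataclass
class Incident:
    """Hypothetical incident record; all field names are assumptions."""
    started: datetime       # when the fault began
    detected: datetime      # when monitoring raised the alert
    acknowledged: datetime  # when an on-call engineer acknowledged it
    resolved: datetime      # when service was restored


def mean_minutes(deltas):
    """Average a list of timedeltas and return the result in minutes."""
    return mean(d.total_seconds() for d in deltas) / 60


def incident_kpis(incidents):
    """Return MTTD, MTTA and MTTR in minutes for a list of incidents."""
    return {
        "MTTD": mean_minutes([i.detected - i.started for i in incidents]),
        "MTTA": mean_minutes([i.acknowledged - i.detected for i in incidents]),
        "MTTR": mean_minutes([i.resolved - i.started for i in incidents]),
    }


# Made-up sample data: two incidents starting at the same reference time.
t0 = datetime(2024, 1, 1, 12, 0)
minutes = lambda n: timedelta(minutes=n)
incidents = [
    Incident(t0, t0 + minutes(4), t0 + minutes(6), t0 + minutes(34)),
    Incident(t0, t0 + minutes(6), t0 + minutes(12), t0 + minutes(66)),
]
print(incident_kpis(incidents))
# {'MTTD': 5.0, 'MTTA': 4.0, 'MTTR': 50.0}
```

Lower values are better for all three, and tracking them per service makes it easier to see whether detection or recovery is the slower step.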

Qualifications / What You Need To Succeed:

The following illustrates the desired profile for a Site Reliability Engineer. The selection of candidates will nevertheless vary depending on specific knowledge of the field and prior experience.

  • STEM degree (BSc or MSc in Software Engineering, Computer Science, or a related field);
  • 5+ years of experience in software development and/or operations;
  • Proficiency in at least one high-level programming language (C++, Python, Java, C#, etc.);
  • Strong troubleshooting and debugging skills;
  • Fluency in English and excellent communication skills;
  • Communication - communicates effectively in English, both orally and in writing, showing empathy for the other person;
  • Humility - accepts mistakes with a humble attitude, apologizes for them, and mitigates them as soon as possible to limit their impact;
  • Accountability - takes ownership of problems and sees them through. Even without all the knowledge needed to proceed alone, involves the right people to reach closure;
  • Negotiation Skills - handles tough and politically complex conversations with colleagues and customers, defusing disagreements and leading all parties toward mutual agreement and understanding;
  • Process Oriented - is organized and follows defined processes, while challenging inefficient ones and suggesting improvements;
  • Problem-solving - takes a top-down approach to problems, starting with a wide scope, breaking them into smaller pieces, and narrowing down as the analysis progresses. Thinks critically, analyzing information objectively and making reasoned judgments.

Experience in any of the following is valued, but not fully required:

  • Containerization technologies and orchestration platforms, mainly Kubernetes (CKA, CKAD, CKS certifications are valued);
  • Experience with automation and Infrastructure as Code (IaC) tools, such as AWS CloudFormation, Terraform, Puppet, Chef, Spacelift, etc.;
  • Experience with Python, Go, Bash/Shell scripting, or other automation tools/languages;
  • Familiarity with AWS services like EC2, RDS, ELB, CloudFront, Lambda, etc.;
  • Proficiency in monitoring and troubleshooting complex distributed systems;
  • Experience with Grafana, ELK stack, Prometheus, or others;
  • Strong understanding of designing resilient and fault-tolerant systems;
  • Expertise in debugging complex distributed systems.

The Longer Story:

OutSystems enables enterprise teams to build AI-powered applications and agents that reduce manual work, streamline internal operations, and accelerate impact. A proven low-code foundation combined with agentic AI and AI app generation capabilities empowers teams to move up to 10x faster with the assurance of security, scalability, and governance built in.

As the future becomes agentic, our customers need us now more than ever. AI has opened the door to extraordinary possibilities—but inside the enterprise, things are moving fast and feeling chaotic. Some early adopters are making progress in production, but for many, AI tools are sprawling without governance, data isn't ready, and talent isn't there yet. Enterprises are still drowning in application backlogs and struggling with legacy systems. But with the right platform, AI doesn't have to add to the chaos. It can become the breakthrough that brings clarity—and drives real, enterprise-wide impact. At OutSystems, we've built that platform, providing the tools necessary for enterprises to overcome these hurdles.

We are looking for passionate, talented, and motivated people to join us in helping our customers build, deploy, and scale apps and agents—fast, helping them accelerate innovation while enabling secure, governed human-AI collaboration.

OutSystems is a truly global company, with more than 850,000 developer community members, 1,700 employees, more than 500 partners, and thousands of active customers in over 75 countries and across 21 industries. Founded in 2001, OutSystems has offices in the United States, United Kingdom, the Netherlands, Portugal, Germany, the UAE, Japan, Hong Kong, Malaysia, Australia, India, and Singapore, and of course has a thriving, worldwide community of remote employees.

Amongst our 2,400 customers are some of the world's most recognizable brands across diverse industries—brands like Toyota, Heineken, Bosch, KeyBank, and UCLA. These customers are the reason we have a 4.6 star rating on G2. Their success is ours, and their stories demonstrate tangible ROI and transformational impact. We are a 9x Gartner Magic Quadrant Leader for Low-Code Application Platforms and a multi-year leader in the Forrester Wave. We're recognized not just as leaders but as visionaries with a strong ability to execute, now extending our leadership into the AI and agentic application development arena.

Working at OutSystems

Our goal is to ensure that OutSystems is a place for bright, happy, and motivated people who share a common purpose and take pride in doing excellent work to pursue our vision of providing the AI-powered low-code development platform enterprise leaders trust to build, secure, and evolve their business applications, agents, and core systems. Our culture is focused on our core values of trust, customer success, innovation, and alignment. Our team members operate with transparency, integrity, and accountability, define success through the lens of the outcomes we deliver for our customers, push the boundaries with excellence, and work together toward our shared vision to deliver on what matters most.

What do we have to offer you?

  • A company that is always growing, changing, and innovating. We challenge each other to innovate in our products, in our team, and how we use our own technology. And we give our teams space to be proactive and creative.
  • Real career opportunities. We care about growth and development. Yes, vertical career progression is a possibility, but it's not the only one. From lateral moves and joining different teams to mastering specialized skills, we support your growth no matter what your goals are.
  • Work colleagues that are as smart, hard-working, and driven as you. We act as one global OutSystems team, taking ownership and working together toward a shared vision.
  • Disrupting the status quo is in our DNA. In fact, it's why our company exists.
  • We ask "why" a lot. It helps us connect our individual work to the bigger picture and sometimes even uncover a better way.

Are you ready for the next step in your career? Then we'd love to hear from you.

OutSystems nurtures an inclusive culture of diversity, where everyone feels empowered to be their authentic self and perform at their best. We are a company that embraces the creativity and innovation that come through diverse perspectives. We are committed to creating a team that reflects society through inclusive programs and initiatives and are proud to be an equal opportunity employer. All qualified applicants receive equal consideration regardless of race, place of origin, color, age, marital status, religion, sex, sexual orientation, gender expression or identity, protected veteran status, disability status or any other status protected by law.



  • Remote, India Granicus Full time ₹ 15,00,000 - ₹ 25,00,000 per year

    Job Summary: Opening from Default - All locations The Company Serving the People Who Serve the People Granicus is driven by the excitement of building, implementing, and maintaining technology that is transforming the Govtech industry by bringing governments and their constituents together. We are on a mission to support our customers by meeting the needs of...


  • Remote, India Rackspace Technology Full time

    Job Description Site Reliability Engineer / Observability Engineer Public Cloud - Offerings and Delivery - Workforce Mgmt & Delivery Ops / Full - Time / Remote Rackspace is building up its Professional Services Center of Excellence on Application Performance Monitoring Suites. If you enjoy solving complex business problems and can contribute to building next...


  • Remote, India Immersive Infotech Pvt Ltd Full time ₹ 13,20,000 per year

    Job Title: Site Reliability Engineer (SRE) Experience: 6+ Years Work Hours: European Time Zone (till 9:30 PM IST) Location: Remote/Offshore (India) Key Responsibilities Manage and optimize Windows and Linux server environments in Azure. Ensure system reliability, uptime, and performance aligned with defined SLOs/SLAs/SLIs. Lead incident management and drive root...


  • Remote, India Boost-IT Full time

    Boost IT is a Portuguese technology consultancy company, we are integrated into one of the most entrepreneurial groups in Portugal, with investment in more than 30 companies. We want to be known for being the most dynamic, energetic and reliable company to operate in the market and, for that, we want to count on you. If you're passionate about technology and...


  • India Remote Cyberhaven Full time ₹ 15,00,000 - ₹ 25,00,000 per year

    About the role We're looking for an experienced Site Reliability Engineer to make sure systems are reliable, scalable, and performing well, especially in production environments. Our technology is new and rapidly evolving. As an early member of the team, you'll play a key role in shaping the reliability architecture, building scalable infrastructure, and...


  • Remote, India Qube Cinema Full time ₹ 8,00,000 - ₹ 24,00,000 per year

    At Qube Cinema, technology and innovation are at our core. Our purpose is to bring to life every story to engage, entertain and enlighten the world. As a company with a passion for cinema, we are committed to creating a seamless world of digital cinema with products that are innovative, powerful, reliable, cost-effective, and constantly evolving to cater to...