Senior Site Reliability Engineer, Platform

1 month ago


Gurugram, India GEMINI Full time

Department: Platform

Our Platform organization’s purpose is to enable Gemini to scale effectively and empower our engineering teams to focus on building innovative financial products and experiences for individuals around the world. Within Platform, the Site Reliability Engineering team is responsible for partnering with Gemini’s other engineering teams to ensure all our systems are architected, engineered and deployed to be resilient, reliable and performant.

The Embedded SRE team is part of Site Reliability Engineering. It engages directly with our other engineering teams to onboard them onto our platform systems, reviews and recommends design and architectural decisions, and guides teams on implementing the tooling provided by the larger Platform organization so that systems can scale and react to changing conditions, with continuous improvement loops.

The Role: Senior Site Reliability Engineer

You will be an integral part of leading Gemini’s engineering teams towards modern DevOps practices, both by developing and providing modern automation and operational tooling and by working cross-functionally across Gemini’s engineering teams to influence and shape our development practices and culture.

Responsibilities:

  • Provide primary operational support and engineering for various Gemini services
  • Improve reliability, quality and time-to-market across all Gemini services and offerings
  • Guide engineering teams onto the various supported services provided by Platform
  • Run ongoing performance evaluations and improvements for Gemini systems
  • Give architecture recommendations and engagement as part of the SDLC
  • Create “Production-ready Scorecards” to evaluate the health of systems pre-launch
  • Implement and teach monitoring, alerting and automated resolution best practices
  • Define SLIs and SLOs with Engineering teams
  • Educate and guide engineering teams on reliability and resiliency best practices, such as statelessness, chaos testing and blue/green deployments
  • Build operational tooling and automations

Qualifications:

  • 7+ years using monitoring, alerting, and automation tooling to understand and remediate performance and health issues in systems at scale
  • Good knowledge of cloud technology providers such as AWS, GCP, or Azure
  • Experience in a code-first environment, developing automated solutions to solve support and operational issues
  • Experience as a Technical Leader within a team, helping evaluate and make tech decisions for the team
  • Experience working with containerization such as Nomad, EKS (k8s), Docker, etc.
  • Experience working with Configuration Management such as Ansible, Chef, Puppet
  • Experience writing scripts or CLI tools that help increase Developer Productivity in high-level languages like Python, Go, etc.
  • Experience analyzing system and application performance, identifying bottlenecks, and recommending architectural or systemic improvements
  • Experience working with Engineering teams, teaching, training, and mentoring on how to implement best-practice technical solutions
  • Experience working in a code-driven, automation-first public cloud infrastructure (Terraform)

It Pays to Work Here

The compensation & benefits package for this role includes:

  • Competitive base salary
  • Benefits
  • Discretionary annual bonus



  • Gurugram, India Airtel Digital Full time

    Site Reliability Engineer is one of the critical roles in the technology team, and the person working in this team will be responsible for application performance, availability, reliability and system uptime. The candidate is responsible for providing consultation and strategic recommendations by quickly assessing and remediating complex platform availability issues....



  • Senior SRE

    1 month ago


    Gurugram, India Epam Full time

    Description EPAM is a leading global provider of digital platform engineering and development services. We are committed to having a positive impact on our customers, our employees, and our communities. We embrace a dynamic and inclusive culture. Here you will collaborate with multi-national teams, contribute to a myriad of innovative projects that...

  • Senior SRE

    2 weeks ago


    Gurugram, India Epam Full time

    Description EPAM is a leading global provider of digital platform engineering and development services. We are committed to having a positive impact on our customers, our employees, and our communities. We embrace a dynamic and inclusive culture. Here you will collaborate with multi-national teams, contribute to a myriad of innovative projects...


  • Gurugram, India StatusNeo Full time

    Job Description: We are seeking a highly skilled and experienced Senior Site Reliability Engineer with expertise in Core Tools and DevOps to join our dynamic team. The ideal candidate will have a strong background in Linux administration, cloud infrastructure, Infrastructure as Code (IaC), Python programming, and be a subject matter expert in DevOps tools...




  • Gurugram, India Codersbrain technology pvt ltd Full time

    Key Responsibilities: - Provide expert production support for application teams utilizing our platform, ensuring high availability, reliability, and performance. - Diagnose and resolve complex issues in production environments, collaborating closely with development teams and stakeholders. - Implement and maintain monitoring, alerting, and logging solutions to...

  • Principal Engineer

    1 month ago


    Gurgaon,Gurugram, India SAR HR Consultancy Full time

    Principal Engineer - SRE. What You Need for this Position: - 10+ years of hands-on technical experience within the realm of Site Reliability Engineering - Architect-level understanding of one or more of the major public cloud services (AWS, GCP & Azure), using them to effectively design secure and scalable services. - Strong understanding of SRE concepts and...


  • Gurugram, India Epam Full time

    Description EPAM is a leading global provider of digital platform engineering and development services. We are committed to having a positive impact on our customers, our employees, and our communities. We embrace a dynamic and inclusive culture. Here you will collaborate with multi-national teams, contribute to a myriad of innovative projects...


  • Bangalore/Gurgaon/Gurugram, IN Codersbrain technology pvt ltd Full time

    Key Responsibilities: - Provide expert production support for application teams utilizing our platform, ensuring high availability, reliability, and performance. - Diagnose and resolve complex issues in production environments, collaborating closely with development teams and stakeholders. - Implement and maintain monitoring, alerting, and logging solutions to...


  • Gurgaon/Gurugram, India E-Qube Digital Services Full time

    Job Description: - 5-7 years' experience in cloud infrastructure engineering roles - 1-3 years' experience as a Site Reliability Engineer or similar role in a global organization - Bachelor's degree in computer science, information systems or other related field (or equivalent work experience) - Customer service: experience working with...


  • Gurugram, India GEMINI Full time

    Department : Security In the emerging industry of digital assets, there is nothing more important than trust. The Gemini security team forms the backbone of trust. In fact, Gemini’s very first hires were security specialists and we continue to tackle unique challenges in the crypto space. Our team ensures that our customers, clients, and employees are...


  • Gurugram, India GEMINI Full time

    Department : Security In the emerging industry of digital assets, there is nothing more important than trust. The Gemini security team forms the backbone of trust. In fact, Gemini’s very first hires were security specialists and we continue to tackle unique challenges in the crypto space. Our team ensures that our customers, clients, and employees...


  • Gurugram, India GEMINI Full time

    Department : Platform The Role: Senior Cloud Platform Engineer Responsibilities: Continuous Integration systems Continuous Delivery systems Observability systems including: Logging, distributed tracing, metrics, dashboards, alerting, synthetic monitoring Gemini integration of third-party and open source off-the-shelf SaaS...


  • Gurugram, India GEMINI Full time

    Department : Platform The Role: Senior Cloud Platform Engineer Responsibilities: Continuous Integration systems Continuous Delivery systems Observability systems including: Logging, distributed tracing, metrics, dashboards, alerting, synthetic monitoring Gemini integration of third-party and open source off-the-shelf SaaS offerings ...


  • Gurugram, India Cvent Full time

    Overview: Founded in 1999, Cvent has become the global leader in meetings, event, travel, and hospitality technology, with more than 4,000 employees worldwide. As a leading cloud-based technology company, we have over 28,000 customers, including 80% of the Fortune 100 companies, in more than 100 countries. Cvent’s software solutions optimize the entire...


  • Gurugram, India Acefone Full time

    Key Responsibilities: 1. Telephony Infrastructure Management: Design, implement, and maintain internet telephony systems to ensure high availability and call quality. Manage and optimize cloud telephony services to scale with our growing user base. Troubleshoot and resolve telephony-related issues to minimize downtime and disruptions. 2. Cloud Expertise: Utilize...



