Senior Site Reliability Developer

5 days ago


India Oracle Full time

Job Description

Solve complex problems related to infrastructure cloud services and build automation to prevent problem recurrence. Design, write, and deploy software to improve the availability, scalability, and efficiency of Oracle products and services. Develop designs, architectures, standards, and methods for large-scale distributed systems. Facilitate service capacity planning and demand forecasting, software performance analysis, and system tuning.

Compute is one of the core organisations within OCI. We are responsible for providing compute power, i.e. virtual machines (VMs) and bare metal instances (BMs). The cloud pretty much cannot exist without our org. We develop and operate multiple services (Provisioning, Monitor, Repair, Control Plane, Data Plane, Re-imaging, etc.) behind the scenes, which work like magic for our customers. We're looking for hands-on engineers with expertise in and a passion for solving difficult problems in distributed systems, virtualised infrastructure, and highly available services. You should be an expert in Linux and Python/Java and have system engineering experience. You value simplicity and scale, work comfortably in a collaborative, agile environment, and are excited to learn.

Qualifications:

- Bachelor's degree in Computer Science and Engineering or a related engineering field.
- 6+ years of experience delivering and operating large-scale, highly available distributed systems.
- 5+ years of experience with Linux system engineering.
- 4+ years of experience with Python/Java building infrastructure automation.
- Strong infrastructure troubleshooting skills.
- Experience in CI/CD, cloud computing, and networking.

Responsibilities

- Work with the Site Reliability Engineering (SRE) team on shared full-stack ownership of a collection of services and/or technology areas.
- Understand the end-to-end configuration, technical dependencies, and overall behavioral characteristics of production services.
- Take responsibility for the design and delivery of the mission-critical stack, with a focus on security, resiliency, scale, and performance.
- Hold authority for end-to-end performance and operability.
- Partner with development teams in defining and implementing improvements in service architecture.
- Articulate the technical characteristics of services and technology areas, and guide development teams to engineer and add premier capabilities to the Oracle Cloud service portfolio.
- Understand and communicate the scale, capacity, security, and performance attributes and requirements of the service and technology stack.
- Demonstrate a clear understanding of automation and orchestration principles.
- Act as the ultimate escalation point for complex or critical issues that have not yet been documented as Standard Operating Procedures (SOPs).
- Use a deep understanding of service topology and dependencies to troubleshoot issues and define mitigations.
- Understand and explain the effect of product architecture decisions on distributed systems.
- Bring professional curiosity and a desire to develop a deep understanding of services and technologies.

Career Level - IC3
