
Site Reliability Developer 3
3 days ago
Job Description

Solve complex problems related to infrastructure cloud services and build automation to prevent problem recurrence. Design, write, and deploy software to improve the availability, scalability, and efficiency of Oracle products and services. Design and develop designs, architectures, standards, and methods for large-scale distributed systems. Facilitate service capacity planning and demand forecasting, software performance analysis, and system tuning.

Within the Oracle Health (OHAI) organization, the new EHR and Clinical AI Agent cloud services are at the forefront of new generative AI services for healthcare organizations. Building on the success of the established Digital Assistant (ODA) product, EHR and AI Agent enable healthcare providers to leverage advanced AI technologies, together with voice commands, to reduce manual work and focus on patient care.

Oracle Health EHR is expanding its OCI Operations team and is looking to bring in new Site Reliability Engineers. As an SRE, you will be engaged in solving technical challenges on an advanced OCI cloud service platform, focusing on areas such as reliability, scalability, resilience, security, and performance. You will define how to use the latest technologies to optimize the operational efficiency of the service. You will gain a deep understanding of chatbots, cognitive services, machine learning, and analytics. You will work with a team pushing the boundaries of a scalable, self-healing, autonomous platform built on Kubernetes, Docker, Prometheus, and Grafana. You will be exposed to a wide range of OCI cloud services and understand how we interact with many dependent services across the organization.

Areas of responsibility

- Service Ownership: As part of the EHR/Clinical Agent team, you will be responsible for all operational aspects of the OCI services included in our portfolio. Understand the end-to-end configuration, technical dependencies, and overall behavioral characteristics of the Digital Assistant suite of products. Own end-to-end availability, reliability, and performance of a Cloud Service. Participate in LiveSite operations, working rapidly to mitigate issues that may arise.
- Service Design: Design and implement solutions for rolling out software and security updates with zero downtime. Partner with development and product management to build and maintain platform and automation frameworks to ensure maximum up-time and predictability, preventing outages and service interruptions or degradation. Analyze system failures and develop rapid response processes.
- Operations engineering: Evaluate the operation of cloud service deployments across commercial and government datacenters. Monitor the degradation of the service and dependencies under load, and implement solutions to ensure high availability to our customers. Analyze resource utilization and scaling requirements in a high-end production system. Resolve security vulnerabilities to conform to corporate and government security standards.
- Automation: Building on your understanding of automation and orchestration principles, you will identify opportunities to automate SRE procedures in production environments. The solutions implemented will be designed to minimize the possibility of errors being introduced into the system.
- Technical expertise: Handle complex, critical issues encountered in production environments, drawing on your accumulated technical knowledge to rapidly identify the issues and apply steps to mitigate them. Develop an understanding of the underlying AI technologies used to implement the Clinical Digital Assistant service. As an SME, you will be called in to handle major incidents, and your understanding of the architecture and dependent services will position you to apply mitigations to resolve the issue quickly, then work with development to help implement preventative actions.

Career Level - IC3

Requirements

- 5+ years of professional experience as a Site Reliability Engineer or equivalent experience.
- BS or MS in Information Technology/Computer System Engineering, or equivalent.
- Excellent team skills, can-do attitude, focus on quality.
- Strong troubleshooting capabilities targeting complicated problems in remote systems.
- Experience with production operations and best practices for deploying quality code in production.
- Experience with public cloud (OCI, AWS, GCP, Azure).
- Experience and working knowledge in Python, Perl and/or Shell Scripting.
- Knowledge of Infrastructure as Code (IaC) tools such as Shepherd and Terraform.
- Experience with public cloud managed Kubernetes.
- Experience with cloud-native administration and monitoring/alerting technologies such as Docker, Helm, Prometheus, Grafana, EFK/ELK, Jaeger, or similar technologies.
- Knowledge of version control using Git.
- Experience in a Linux/Unix environment.

Work with the Site Reliability Engineering (SRE) team on the shared full stack ownership of a collection of services and/or technology areas. Understand the end-to-end configuration, technical dependencies, and overall behavioral characteristics of production services. Responsible for the design and delivery of the mission critical stack, with focus on security, resiliency, scale, and performance. Authority for end-to-end performance and operability. Partner with development teams in defining and implementing improvements in service architecture. Articulate technical characteristics of services and technology areas and guide Development Teams to engineer and add premier capabilities to the Oracle Cloud service portfolio. Understand and communicate the scale, capacity, security, performance attributes, and requirements of the service and technology stack. Demonstrate clear understanding of automation and orchestration principles. Act as ultimate escalation point for complex or critical issues that have not yet been documented as Standard Operating Procedures (SOPs). Utilize a deep understanding of service topology and dependencies to troubleshoot issues and define mitigations. Understand and explain the effect of product architecture decisions on distributed systems. Professional curiosity and a desire to develop a deep understanding of services and technologies.

Career Level - IC3
-
Site Reliability Developer 3
24 hours ago
Noida, India Oracle Full time Job Description Solve complex problems related to infrastructure cloud services and build automation to prevent problem recurrence. Design, write, and deploy software to improve the availability, scalability, and efficiency of Oracle products and services. Design and develop designs, architectures, standards, and methods for large-scale...
-
Site Reliability Developer 4
2 weeks ago
India Oracle Full time Job Description You will be responsible to work with Site Reliability Engineering (SRE) team on the shared full stack ownership of a collection of services and/or technology areas. Understand the end-to-end configuration, technical dependencies, and overall behavioral characteristics of production services. Responsible for the design and delivery of the...
-
Site Reliability Developer 3
3 weeks ago
Hyderabad, India Oracle Full time Job Description Solve complex problems related to infrastructure cloud services and build automation to prevent problem recurrence. Design, write, and deploy software to improve the availability, scalability, and efficiency of Oracle products and services. Design and develop designs, architectures, standards, and methods for large-scale...
-
Site Reliability Engineer
4 days ago
Bengaluru, Karnataka, India ViewSonic Full time Job Requirements: Bachelor's degree in Computer Science, Engineering, or a related field. 3+ years of experience in a relevant role, such as Site Reliability Engineer, DevOps Engineer, or similar, is preferred but not mandatory. Basic understanding of AWS solutions including EC2, S3, CloudWatch, Lambda, and RDS. Interest and understanding of Platform Engineering...
-
Site Reliability Engineer
3 weeks ago
India Akamai Full time Do you want to grow your career in Linux and Site Reliability Engineering? Would you like to contribute to the foundation of a new public cloud platform? Join our IaaS Site Reliability Engineering (SRE) team. We design, develop, and operate infrastructure and services that power the backbone of our cloud platform. This is a rare opportunity to help build a...
-
Site Reliability Engineer
2 weeks ago
India Akamai Full time ₹ 5,00,000 - ₹ 15,00,000 per year Do you want to grow your career in Linux and Site Reliability Engineering? Would you like to contribute to the foundation of a new public cloud platform? Join our IaaS Site Reliability Engineering (SRE) team. We design, develop, and operate infrastructure and services that power the backbone of our cloud platform. This is a rare opportunity to help build a...
-
Site reliability engineer
3 weeks ago
India Employ Full time Role - Site Reliability Engineer (SRE) / Platform Engineering / DevOps Engineering roles Location – Fully Remote Type - 6 months Contract Work Ex - 5+ Yrs We’re working with an AI product company that’s building the next generation of Gen AI powered developer platforms. We’re looking for an experienced Site Reliability Engineer to join...
-
Site Reliability Engineer
3 days ago
Bengaluru, India BNP Paribas Full time Job Description Dear Candidate, BNP Paribas is hiring for a Site Reliability Engineer for the Bangalore location! Kindly apply via the link below as soon as possible if interested; we shall take your candidature ahead once the application is submitted: https://bwelcome.hr.bnpparibas/su/cba292db5cf89f02 Technical & Behavioral Competencies: Mandatory skills: Site Reliability...
-
Site Reliability Engineer III
2 weeks ago
Hyderabad, India Chase Bank Full time Job Description There's nothing more exciting than being at the center of a rapidly growing field in technology and applying your skillsets to drive innovation and modernize the world's most complex and mission-critical systems. As a Site Reliability Engineer III at JPMorgan Chase within the Consumer and Community Banking, you will solve complex and broad...
-
Site Reliability Engineer
3 weeks ago
India Synechron Full time We have an immediate opportunity for SRE (Senior Site Reliability Engineer), 5 to 9 years. Synechron – Bangalore Job Role: SRE (Senior Site Reliability Engineer) Job Location: Bangalore Notice Period: Within 30 days About Synechron We began life in 2001 as a small, self-funded team of technology specialists. Since then, we’ve grown our organization to 14,500+...