Principal Site Reliability Engineer
4 days ago
Oracle is seeking a motivated Principal Site Reliability Engineer who thrives in a fast-paced, rapidly evolving technology environment. This position requires broad knowledge of Linux administration, AI technologies, software development, cloud computing, networking, cloud security, performance analysis, and monitoring to provide stability, security, performance, and reliability for our infrastructure. The Site Reliability Engineer is expected to work with multiple service and product development teams, identifying cross-team issues that create risk for operations across the organization and resolving those issues with a mixture of engineering, development, troubleshooting expertise, and general operational guidance. This role also requires excellent communication and organizational skills. The candidate is expected to collaborate with service owners, other engineers, and developers to deliver a superior support experience to the development community.
Responsibilities
- Troubleshoot and resolve complex issues related to Linux environments and Oracle Cloud Infrastructure (OCI)
- Design and deliver mission-critical automation using Chef and Python, with a focus on security, resiliency, scale, and performance.
- Design, develop, and implement AI-driven solutions for business challenges.
- Collaborate with cross-functional teams to integrate AI models into production environments.
- Identify opportunities and drive the implementation of automation to improve service health, availability and reliability
- Act as the escalation point for critical issues that may not have a documented procedure, and provide root cause analysis.
- Quickly grasp and analyze new technologies that are complex and rapidly changing and integrate those into automation and infrastructure support.
- Author functional and technical documentation and standard operating procedures (SOPs).
- Collaborate with development teams in defining and implementing improvements in service architecture.
- Articulate technical characteristics of services and technology areas and guide cross-functional teams to engineer and add capabilities to internal tools.
Knowledge and Skills
- 6 to 12 years of experience in Site Reliability Engineering and automation.
- Experience in Linux administration with expertise in kernel-level debugging and performance tuning.
- Experience in cloud technologies and infrastructure management.
- Expertise in design, development, and implementation of AI-driven solutions for business challenges.
- Skilled in debugging operating system performance issues
- Expertise in working with highly available, fault-tolerant, distributed systems.
- Expertise in developing scripts and tools to automate routine tasks, improving efficiency.
- Experience in troubleshooting application, compute, storage, and database issues to enhance reliability and scalability.
- Strong background in operations management and problem resolution.
- Development experience with Python and infrastructure management using Chef
- Proven experience managing high-availability production environments.
- Experience working with global teams across multiple time zones.
Qualifications required
- 6 to 12 years of experience working in an IT Operations/Infrastructure team
- A Bachelor's degree in Computer Engineering, Software Engineering, Computer Science, or a related area is preferred
-
Principal Site Reliability Engineer
2 weeks ago
Hyderabad, Telangana, India JPMorganChase Full time ₹ 20,00,000 - ₹ 40,00,000 per year
JOB DESCRIPTION: Join a globally recognized financial organization and advance your profession to new heights by contributing to revolutionary projects. You've discovered the perfect environment to have a major impact. As a Principal Site Reliability Engineer at JPMorgan Chase within the Consumer & Community Banking division, you will leverage your advanced...
-
Principal Site Reliability Engineer
1 week ago
Hyderabad, Telangana, India Cubic Corporation Full time ₹ 8,00,000 - ₹ 24,00,000 per year
Business Unit: Cubic Transportation Systems
Company Details: When you join Cubic, you become part of a company that creates and delivers technology solutions in transportation to make people's lives easier by simplifying their daily journeys, and defense capabilities to help promote mission success and safety for those who serve their nation. Led by our...
-
Principal Site Reliability Engineer
2 weeks ago
Hyderabad, Telangana, India Cubic Corporation Full time ₹ 1,20,000 - ₹ 2,00,000 per year
Business Unit: Cubic Transportation Systems
Company Details: When you join Cubic, you become part of a company that creates and delivers technology solutions in transportation to make people's lives easier by simplifying their daily journeys, and defense capabilities to help promote mission success and safety for those who serve their nation. Led by our...
-
Principal Site Reliability Engineer
2 weeks ago
Hyderabad, Telangana, India Cubic Transportation Systems Full time ₹ 25,00,000 - ₹ 35,00,000 per year
Business Unit: Cubic Transportation Systems
Company Details: When you join Cubic, you become part of a company that creates and delivers technology solutions in transportation to make people's lives easier by simplifying their daily journeys, and defense capabilities to help promote mission success and safety for those who serve their nation. Led by our...
-
Principal Site Reliability Engineer
3 days ago
Hyderabad, Telangana, India Cubic Corporation Full time
Business Unit: Cubic Transportation Systems
Company Details: When you join Cubic, you become part of a company that creates and delivers technology solutions in transportation to make people's lives easier by simplifying their daily journeys, and defense capabilities to help promote mission success and safety for those who serve their nation. Led by our...
-
Principal Site Reliability Engineer
2 weeks ago
Hyderabad, Telangana, India Cubic Corporation Full time ₹ 18,00,000 - ₹ 50,00,000 per year
Business Unit: Cubic Transportation Systems
Company Details: When you join Cubic, you become part of a company that creates and delivers technology solutions in transportation to make people's lives easier by simplifying their daily journeys, and defense capabilities to help promote mission success and safety for those who serve their nation. Led by our...
-
Site Reliability Engineer
5 days ago
Hyderabad, Telangana, India Assurant Full time
Site Reliability Engineer, GCC-Assurant
The Site Reliability Engineer (SRE) will be part of the Assurant Reliability Team, specifically within the Site Reliability Engineering area. This remote position, based in India, focuses on building and maintaining reliable, scalable systems through a combination of software development and network diagnostics. The...
-
Site Reliability Engineer
7 days ago
Hyderabad, Telangana, India Assurant Full time ₹ 12,00,000 - ₹ 36,00,000 per year
Site Reliability Engineer, GCC-Assurant
The Site Reliability Engineer (SRE) will be part of the Assurant Reliability Team, specifically within the Site Reliability Engineering area. This remote position, based in India, focuses on building and maintaining reliable, scalable systems through a combination of software development and network diagnostics. The...
-
Site Reliability Engineer
6 days ago
Hyderabad, Telangana, India Elios Talent Full time
Site Reliability Engineer
Key Highlights
- Build, automate, and support cloud-native infrastructure powering high-availability platforms
- Contribute to automation-first engineering across AWS, Terraform, CI/CD, and observability tooling
- Improve reliability, uptime, system health, and performance across production environments
- Strengthen DevSecOps...
-
Site Reliability Engineer
2 weeks ago
Hyderabad, Telangana, India VXI Global Solutions Full time
We are looking for a Site Reliability Engineer with 3+ years of experience to design, implement, and manage robust observability solutions across our cloud infrastructure and applications. The ideal candidate will have hands-on experience with Prometheus, Grafana, Google Cloud Monitoring, and OpenTelemetry, along with exposure to SolarWinds. You should be...