Principal Site Reliability Engineer
1 day ago
Oracle is seeking a motivated Principal Site Reliability Engineer who thrives in a fast-paced, rapidly evolving technology environment. This position requires broad knowledge of Linux administration, AI technologies, software development, cloud computing, networking, cloud security, performance analysis, and monitoring to provide stability, security, performance, and reliability for our infrastructure. The Site Reliability Engineer is expected to work with multiple service and product development teams, identifying cross-team issues that create risk for operations across the organization and resolving those issues with a mixture of engineering, development, and troubleshooting expertise and general operational guidance. This role also requires excellent communication and organizational skills. The candidate is expected to collaborate with service owners, other engineers, and developers to deliver a superior support experience to the development community.
Responsibilities
- Troubleshoot and resolve complex issues related to Linux environments and Oracle Cloud Infrastructure (OCI)
- Design and deliver mission-critical automation using Chef and Python, with a focus on security, resiliency, scale, and performance.
- Design, develop, and implement AI-driven solutions for business challenges.
- Collaborate with cross-functional teams to integrate AI models into production environments.
- Identify opportunities and drive the implementation of automation to improve service health, availability and reliability
- Act as an escalation point for critical issues that may not have a documented procedure, and provide root cause analysis.
- Quickly grasp and analyze new technologies that are complex and rapidly changing and integrate those into automation and infrastructure support.
- Author functional and technical documentation and standard operating procedures (SOPs).
- Collaborate with development teams in defining and implementing improvements in service architecture.
- Articulate technical characteristics of services and technology areas and guide cross-functional teams to engineer and add capabilities to internal tools.
Knowledge and Skills
- 6-12 years of experience in Site Reliability Engineering and automation.
- Experience in Linux administration with expertise in kernel-level debugging and performance tuning.
- Experience in cloud technologies and infrastructure management.
- Expertise in design, development, and implementation of AI-driven solutions for business challenges.
- Skilled in debugging operating system performance issues
- Expertise in working with highly available, fault-tolerant, distributed systems.
- Expertise in developing scripts and tools to automate routine tasks, improving efficiency.
- Experience in troubleshooting application, compute, storage, and database issues to enhance reliability and scalability.
- Strong background in operations management and problem resolution.
- Development experience with Python and infrastructure management using Chef
- Proven experience managing high-availability production environments.
- Experience working with global teams across multiple time zones.
Qualifications required
- 6 to 12 years of experience working in an IT Operations/Infrastructure team
- Bachelor's degree in Computer Engineering, Software Engineering, Computer Science, or a related area is preferred
-
Principal Site Reliability Engineer
1 week ago
Hyderabad, Telangana, India Cubic Transportation Systems Full time ₹ 12,00,000 - ₹ 36,00,000 per year Hiring Principal Site Reliability Engineer. Experience: 12+ Years. Location: Hyderabad. Notice: Immediate to 30 Days. We're seeking an experienced Site Reliability Engineer (SRE) to ensure our services are robust, scalable, secure, and maintainable. You will blend software engineering and systems operations to automate processes, monitor performance, lead incident...
-
Principal Site Reliability Engineer
2 weeks ago
Hyderabad, Telangana, India JPMorgan Chase Full time ₹ 45,00,000 - ₹ 90,00,000 per year Join a globally recognized financial organization and advance your profession to new heights by contributing to revolutionary projects. You've discovered the perfect environment to have a major impact. As a Principal Site Reliability Engineer at JPMorgan Chase within the Consumer & Community Banking division, you will leverage your advanced expertise to...
-
Principal Site Reliability Engineer
1 week ago
Hyderabad, Telangana, India Cubic Transportation Full time ₹ 12,00,000 - ₹ 36,00,000 per year Hiring Principal Site Reliability Engineer. Experience: 12 to 18 Years. Location: Hyderabad. Notice Period: Immediate to 30 Days. Key Responsibilities: Design, deploy, and maintain scalable, secure applications and infrastructure in cloud or hybrid environments. Implement and manage robust monitoring, alerting, and observability systems. Automate recurrent operational...
-
Site Reliability Engineer
1 week ago
Hyderabad, Telangana, India Oracle Financial Services Software Ltd Full time ₹ 12,00,000 - ₹ 36,00,000 per year Senior Principal Site Reliability Engineer, Fusion SRE. About Oracle Cloud: Oracle Cloud is a comprehensive suite of cloud services, including infrastructure, platform, and applications, designed to help organizations build, deploy, and manage workloads securely at scale. At Oracle, we are building the most intelligent future of cloud computing. Our...
-
Principal Site Reliability Engineer
24 hours ago
Hyderabad, Telangana, India Amgen Inc Full time ₹ 8,00,000 - ₹ 12,00,000 per year We are looking for a Site Reliability Engineer/Cloud Engineer (SRE) to work on the performance optimization, standardization, and automation of Amgen's critical infrastructure and systems. This role is crucial to ensuring the reliability, scalability, and cost-effectiveness of our production systems. The ideal candidate will work on operational excellence...
-
Site Reliability Engineering
1 week ago
Hyderabad, Telangana, India Acesoft Labs Full time ₹ 20,00,000 - ₹ 25,00,000 per year Hi, kindly find the JD below. Job Title: Site Reliability Engineering (SRE) Manager. Location: Hyderabad. Employment Type: Full-Time. Work Model: 3 Days from Office (Hybrid). Summary: The SRE Manager at TechBlocks India will lead the reliability engineering function, ensuring infrastructure resiliency and optimal operational performance. This hybrid role blends...
-
Site Reliability Engineering
1 week ago
Hyderabad, Telangana, India TECHBLOCKS Full time ₹ 12,00,000 - ₹ 36,00,000 per year Job Title: Site Reliability Engineering (SRE) Manager. Location: Hyderabad. Employment Type: Full-Time. Work Model: 3 Days from Office (Hybrid). Summary: The SRE Manager at TechBlocks India will lead the reliability engineering function, ensuring infrastructure resiliency and optimal operational performance. This hybrid role blends technical leadership with team...
-
Principal Site Reliability Engineer
1 week ago
Hyderabad, Telangana, India Cubic Corporation Full time ₹ 12,00,000 - ₹ 36,00,000 per year Business Unit: Cubic Transportation Systems. Company Details: When you join Cubic, you become part of a company that creates and delivers technology solutions in transportation to make people's lives easier by simplifying their daily journeys, and defense capabilities to help promote mission success and safety for those who serve their nation. Led by our...
-
Principal Site Reliability Engineer
6 days ago
Hyderabad, Telangana, India Cubic Corporation Full time ₹ 8,00,000 - ₹ 24,00,000 per year Business Unit: Cubic Transportation Systems. Company Details: When you join Cubic, you become part of a company that creates and delivers technology solutions in transportation to make people's lives easier by simplifying their daily journeys, and defense capabilities to help promote mission success and safety for those who serve their nation. Led by our...
-
Site Reliability Engineering Manager
1 week ago
Hyderabad, Telangana, India Zortech Solutions Full time ₹ 1,04,000 - ₹ 1,30,878 per year Job Title: Site Reliability Engineering (SRE) Manager. Location: Hyderabad. Employment Type: Full-Time. Work Model: Hybrid (3 Days from Office). About the Role: We are looking for an experienced Site Reliability Engineering (SRE) Manager to lead our reliability engineering function, ensuring infrastructure resiliency, operational excellence, and seamless user experiences....