Principal Cloud Operation Engineer
3 days ago
At Oracle Cloud Infrastructure (OCI), we build the future of the cloud for enterprises as a diverse team of fellow creators and inventors. We act with the speed and attitude of a start-up, with the scale and customer focus of the leading enterprise software company in the world. Compute is one of the core organisations within OCI. We are responsible for providing compute power, i.e., virtual machines (VMs) and bare metal machines (BMs). The cloud pretty much cannot exist without our org. The Compute org comprises a family of critical foundational infrastructure services that drive OCI's hardware lifecycle activities.
Work with the Site Reliability Engineering (SRE) team on the shared full-stack ownership of a collection of services and/or technology areas. Understand the end-to-end configuration, technical dependencies, and overall behavioural characteristics of production services. Take responsibility for mitigating critical customer incidents and for the deployments or testing required to improve the security, performance, availability, and scalability of the service. Serve as the authority for end-to-end performance and operability. Partner with development teams to meet SLAs and unblock customers. Articulate the technical characteristics of services and technology areas, and guide development teams to engineer and add premier capabilities to the Oracle Cloud service portfolio. Understand and communicate the scale, capacity, security, and performance attributes and requirements of the service and technology stack. Demonstrate a clear understanding of automation and orchestration principles. Act as the ultimate escalation point for complex or critical issues that have not yet been documented as Standard Operating Procedures (SOPs). Use a deep understanding of service topologies and their dependencies to troubleshoot issues and define mitigations. Understand and explain the effect of product architecture decisions on distributed systems. Bring professional curiosity and a desire to develop a deep understanding of services and technologies.
Responsibilities
Install, monitor, maintain, support, and optimize all production server hardware and software. Provide escalated technical support for complex technical issues, which may include leading problem management cases and providing status updates to management. Coordinate escalated support cases and lead appropriate internal technical resources and/or third-party vendors to resolution, and coordinate a storage infrastructure of Oracle system and database appliances. Take responsibility for Oracle production environments; assist with server operating system and application upgrades, bug fixes, and patching; and work on standardization projects for both hardware and software under the Oracle technology stack while providing the consistent system uptime expected in a cloud environment. Provide on-call support on a rotating basis.
Responsibilities include, but are not limited to:
- Incident Management
- Support and troubleshooting of Staging/Production environments
- Respond to and resolve incidents as per SLAs
- Organise, anticipate, plan, and work on call in shifts for multiple services (must be open to shift work and show flexibility)
- Maintain Service High Availability
- Release Management
- Test and Deploy solutions and automate to replace manual processes
- Build and maintain deployment tools/procedures
- Zero downtime deployments and a high availability mindset
- Define and build innovative solution methodologies and assets around infrastructure, cloud migration and deployment operations at scale.
- Work with service teams to resolve complex issues that require troubleshooting and knowledge of code.
- Keep documentation up to date and resolve similar tickets with lower turnaround time and within SLA
- Ensure production security posture
- Ensure monitoring is robust and effective
- Change Management
- Perform Root Cause Analysis
Required Skills:
- 3+ years of overall experience in the IT industry
- Minimum 4 years of experience as a Sys Admin/Support
- Strong systems architecture skills
- Strong Linux administration and troubleshooting skills (understanding of different hardware families)
- Virtualisation technologies
- Scripting languages (Python, Bash, etc.)
- Understanding of Networking, Cloud Computing, Load Balancers
- Hands-on experience with monitoring/instrumentation tools (Prometheus/Grafana, New Relic, Elastic, or equivalent).
- Experience maintaining high-scale deployments and managing high-throughput, I/O-intensive services.
Career Level - IC4
-
Principal Software Engineer
4 weeks ago
Bengaluru, India Oracle Taleo Full time. Job Description: We are building a new Software Assurance Gateway team at OCI. Our mission is to build and operate a set of gateway services to ensure the security and integrity of the services running within a customer's tenancy. The team will develop, maintain and operationalize this new class of services with a high degree of resiliency, scalability and...
-
Principal Engineer, VP
4 weeks ago
Bengaluru, India NatWest Group Full time. Job Description: Join our digital revolution in NatWest Digital X. In everything we do, we work to one aim: to make digital experiences which are effortless and secure. So we organise ourselves around three principles: engineer, protect, and operate. We engineer simple solutions, we protect our customers, and we operate smarter. Our people work differently...
-
Cloud Engineer
4 weeks ago
Pune, India Uplers Full time. Job Description: Principal FinOps Engineer. Experience: 10 - 20 years. Salary: Competitive. Preferred Notice Period: Within 45 days. Opportunity Type: Hybrid (Pune). Placement Type: Permanent. (*Note: This is a requirement for one of Uplers' clients - Perforce Software.) Must-have skills: FinOps OR Cloud Cost Optimization OR Cloud financial management OR Cloud...
-
Principal Data Engineer
2 weeks ago
India Nexuspoint Consultant Full time. Job Title: Principal Data Engineer. Contract Period: more than 6 months. Experience: 6+. Working hours: 9:30am to 6:30pm IST. About the Role: We are looking for an accomplished Principal Data Engineer to lead the ideation, architecture, design and development of our next-generation enterprise data platform. This role requires a visionary technologist with deep...
-
Principal Data Scientist
2 weeks ago
Bengaluru, Karnataka, India Netcore Cloud Full time. About us: Netcore Cloud is a MarTech platform helping businesses design, execute, and optimize campaigns across multiple channels. With a strong focus on leveraging data, machine learning, and AI, we empower our clients to make smarter marketing decisions and deliver exceptional customer experiences. Our team is passionate about innovation and collaboration,...
-
Principal Engineer
2 weeks ago
Bengaluru, Karnataka, India ThoughtSpot Full time. About the Role: We are looking for a Principal Engineer to shape the architecture of ThoughtSpot’s cloud-native, distributed platform and scale it to support our ambitious product vision. You will bring deep systems thinking, architectural expertise, and pragmatic problem-solving to help evolve our platform across reliability, scalability, developer...
-
Principal Software Engineer
2 weeks ago
India Oracle Full time. Job Description: Oracle Cloud Infrastructure (OCI) provides mission-critical cloud services to enterprises worldwide. The Network Reliability Engineering (NRE) Automation, Reporting, and Tooling team builds innovative solutions that boost the productivity and efficiency of the Global Network Operations Center (GNOC). Our tooling empowers the GNOC...