Principal Service Reliability Engineer

3 weeks ago


Hyderabad Secunderabad Telangana India NetSuite Full time

Oracle, the world leader in Enterprise Cloud, is hiring the most creative technologists in the industry as we continue to add customer-centric, premier, groundbreaking, secure, hyper-scale solutions throughout all levels of the cloud stack. Oracle's cloud ecosystem is the only complete business cloud platform on the planet, with market-leading and business-redefining solutions spanning SaaS, DaaS, PaaS, and IaaS. Oracle's Cloud applications, such as Enterprise Resource Management, Customer Relationship Management, Human Capital Management, and Supply Chain Management, are used by thousands of customers across the globe and are the broadest and most innovative in the industry, providing businesses with adaptive intelligence, standardized business processes, and competitive advantage at low cost.

As part of the market-leading ERP Cloud, Oracle ERP Cloud Operations offers a broad suite of modules and capabilities designed to empower modern finance and deliver customer success with streamlined processes, increased efficiency, and improved business decisions.

ERP Cloud Operations is looking for hardworking, innovative, high-caliber, team-oriented superstars who want to be a major part of a dynamic revolution in the development of modern cloud-based business applications. We are seeking highly capable, best-in-the-world developers, architects, and technical leaders at the very top of the industry in terms of skills, capabilities, and proven delivery, who seek out and implement imaginative and strategic, yet practical, solutions: people who calmly take measured and necessary risks while putting customers first.

Key Tasks and Responsibilities

. Service Ownership- You will be a part of the SRE team, whose mission is the shared full-stack ownership of a collection of services, with our Service Development and Operations SRE partners.

. Ownership Scope- You will understand the end-to-end configuration, technical dependencies, and overall behavioral characteristics of the production services you own. In partnership with your Service Development and Operations SRE partners, you will be responsible for ensuring that services are designed and delivered to be critically important, with a focus on monitoring, telemetry, security, resiliency, scale, and performance.

. Service Design- You will partner with the SRE Architect, Service Development, and Operations SRE teams in defining and implementing improvements in service architecture, both current and future.

o You will be an authority at articulating the technical characteristics of your services and the dependencies between services, and will guide Service Development teams to engineer and add SRE capabilities to the Oracle SaaS/ERP service portfolio.
o You will participate in feature design reviews to ensure Monitoring, Telemetry, Reliability, Automation, and Runtime Debuggability are represented as first-class, design-time priorities.
o You will provide technical leadership in defining software engineering patterns, practices, and coding standards focused on increasing reliability and resilience of Oracle SaaS/ERP services. You will deliver strong work artifacts (reusable components, plug-ins, blueprints, sample code, scripts and tooling, etc.) to streamline adoption by Service Development.
. Operations Engineering- You will understand and be able to communicate the scale, capacity, security, and performance attributes and requirements of the services you own. You are an authority, able to understand and communicate every characteristic of your service stack, such as:
o Degradation and behavior under load of the services and their dependencies.
o End-to-end tuning needs, optimizing resource utilization, as load patterns fluctuate.
o Instrumentation and metrics that clearly describe the service behaviors.
o Scaling requirements and patterns.
o Resiliency and recoverability, ensuring that backup / restore and disaster recovery capabilities are implemented, tested and maintained.

. Technical Experts- You are the ultimate customer concern point for complex or critical issues that have not yet been documented as SOPs for Level 1 staff. You will usually get called in during major incidents as an SME, when the source of a problem is unclear. You will have the deep understanding of service topology and dependencies required to solve issues and define mitigations.

. Incident Response- You will be the primary author of technical content for both customer and internal communication used throughout the incident response process, e.g. postmortem/root cause analysis, end-to-end repair item definition, fixes in production.

. Automation- You will have a clear understanding of automation and orchestration principles, and will be eager to automate wherever and whenever the possibility arises, while simultaneously eliminating technical debt. Automation must be a part of your DNA.

. Prevention- Using data-driven incident findings, you will work on solutions that will ultimately prevent the incident/problem from arising ever again, and interim solutions to more quickly resolve the problem next time.

Skills and Qualifications

. Minimum of 5 years of software development, with demonstrated knowledge of professional software engineering standard methodologies for the full software development process, including coding standards, code reviews, source control, build and release processes, continuous deployment, and test suite development and maintenance.

. Experience deploying and running large-scale online systems built on Cloud platforms such as Oracle Cloud, AWS, Azure, Google Cloud Platform, and/or OpenStack.

. Experience designing and implementing solutions for platform and application layer telemetry, monitoring, scalability, performance, and reliability.

. Experience coordinating resources across teams with varied strengths to restore service and maintain SLAs. ITIL certification is preferred.

. Excellent written and verbal technical communications with technical and non-technical peers, customers, and at times, executive leadership.

. Proven success in contributing in a collaborative, team-oriented environment, with the ability to establish and cultivate relationships between multiple teams and navigate dependencies.

. 3+ years of experience:
o Working in systems and network administration, application security, DevOps and/or Site Reliability Engineering.
o Hands-on with web protocols and Linux/Unix tools and architecture, from kernel to shell, file systems, and client-server protocols.
o Using C#, PowerShell/Shell script, ASP.NET/MVC, JavaScript, TypeScript, React, or T-SQL.
o Maintaining and analyzing large-scale distributed services.
o Building automated tools in Python, Java, GoLang, and/or Ruby.

. Experience with monitoring and alerting using technologies like Prometheus, Sensu, Nagios, Kafka, Wavefront, BigPanda, DataDog, and/or PagerDuty.

. Experience designing, implementing, and deploying with Docker, Kubernetes, and serverless (Lambda).

. Experience with Oracle Linux, Red Hat Linux, Ubuntu, CentOS, CoreOS, and/or Amazon Linux.

. Experience with one or more orchestration and deployment tools, e.g. CloudFormation, Terraform, Ansible, Packer, and/or Chef.

. Experience with one or more CI tools: Jenkins, TeamCity, Bamboo, Artifactory.

. Experience with configuration management systems such as Ansible, Chef, or Puppet.

. Experience with Agile software development practices.

. Knowledge of testing methodologies, the testing pyramid (Unit, Integration, UI, E2E, etc.), testing frameworks, and testing automation tools like QTP, OATS, and Selenium.

. Determined to keep moving things forward even in the face of ambiguity and imperfect knowledge (resilient to hazards of 'analysis paralysis').

. BS in Computer Science or a related field and 7 years of relevant experience.


