Operations & Site Reliability Engineer

3 weeks ago


Hyderabad, India Apple Full time
Summary:
People at Apple don’t just build products; they craft the kind of experience that has revolutionised entire industries. The diverse collection of our people and their ideas encourages innovation in everything we do. Imagine what you could do here. Join Apple, and help us leave the world better than we found it. At Apple, new ideas have a way of becoming extraordinary products, services, and customer experiences very quickly. Every single day, people do amazing things at Apple. Join Apple’s Service Management team as an Operations and Site Reliability Engineer to inspire the team toward operational excellence and to improve the availability, scalability and security of multiple highly scalable, fault-tolerant, business-critical, global applications in the Apple Service Management space. Lead operational planning, readiness, monitoring, measurement of system health, incident management and communication for these enterprise-level applications. Build and manage systems, infrastructure and applications through automation. Develop tools that bring operational parity across all applications to improve the team’s efficiency. The candidate’s skills will be a strong blend of Operations Lead and Engineering.
Key Qualifications:
- Strong sense of ownership, customer service, and integrity demonstrated through clear communication
- Experience in leading and driving operations teams for large-scale, critically important applications working in a 24x7 operations and on/off-shore support model
- Experience in strategizing and achieving operational excellence in global distributed systems
- Strong knowledge of production support practices for managing web and iOS applications
- Experience in fixing issues, analyzing logs, and building metrics and operational dashboards
- Passion for eliminating repetitive manual processes using automation
- Experience in interpreting data from systems like Hubble, ExtraHop, Splunk and other monitoring tools
- Fundamental understanding of distributed systems, including microservices, messaging brokers and versioning
- Experience in Java, JEE, REST, Swift/Objective-C, database schema design and data access technologies
- Deep understanding of programs using a high-level programming language like C, Java, Ruby, Python, or Perl
- Experience managing large numbers of diverse systems with containers (Docker), build systems (Jenkins, Ansible, Spinnaker), and infrastructure as a service (Kubernetes, AWS)
- Understanding of the Linux operating system, including kernel, memory, processes, threads, static/shared libraries, IPC and signals
- Understanding of standard networking protocols and components such as HTTP, DNS, TCP/IP, ICMP, the OSI model, subnetting and load balancing is a plus
- Experience in ethical hacking, system security and fraud monitoring is an added advantage
- Self-starter, flexible, motivated to learn in a fast-paced environment and comfortable working as part of a team of versatile engineers
- Excellent communication and leadership skills
- Excellent organizational and documentation skills
- Passion for quality and the optimal user experience
Description:
- We are looking for a highly technical and motivated individual who will own ultimate responsibility for operations of Service systems, working with teams to ensure 24x7 operations and the smooth rollout of applications that our customers use every day, while improving our tool suite and developing new tools to raise operational efficiency and product quality.
- Identify and handle key performance indicators for global applications. Drive operational improvements, metrics tracking and implementation of standard methodologies through level one production support and engineering teams.
- Handle the Production backlog with the business team and prioritize fixes in planned releases. Keep close tabs on all product releases and ensure smooth and safe deployments in Production. Drive and handle product rollouts and partner/retail on-boardings.
- Lead the Production Support team to ensure all servers and applications are monitored on an ongoing basis, with alerts covering CPU, memory, and storage utilization, as well as network and security issues, and performance tuning. Monitor the production footprint and lead the effort for Capacity Planning.
- Keep track of and interact with the Data Center, Network and other system teams to plan out OS patches, system upgrades and maintenance.
- Drive the team to build and implement automated application health checks that ensure the high availability of applications.
- Along with applying your technical skills, you will have the opportunity to let your creative juices flow. You will work very closely with the team to design, develop and operate the best development support and automation tools you can imagine.
Additional Requirements:

  • Hyderabad, India Insight Global Full time

    Required Skills and Experience: - Bachelor's or master's degree in computer science, Software Engineering, or a related field. - Proven experience (7+ years) in SRE, automation testing. - Strong skills in developing and implementing automation testing strategies and frameworks. - Solid understanding of site reliability principles and best practices. - Leadership...


  • Hyderabad, India Korn Ferry Full time

    Role - Site Reliability Engineer. Exp - 5+ years required. Location - Hyderabad (Work from Office-Hybrid). Shift Timings - 5 AM - 1 PM IST. We are looking for a Site Reliability Engineer with a strong development background to join our team. In this role, you will be responsible for ensuring the reliability and performance of our systems. You will work closely to our...


  • Hyderabad, India Microsoft Full time

    Overview: Are you interested in working for one of the most exciting products at Microsoft, passionate about exceeding customer expectations and advancing Microsoft's cloud-first strategy? Are you interested in a start-up-like environment, passionate about cloud computing technology and driving growth in one of Microsoft's core businesses? If so,...


  • Hyderabad, India WaferWire Cloud Technologies Full time

    Role: SRE (Site Reliability Engineer). Experience: 4+ Years. About WaferWire Cloud Technologies: WaferWire Cloud Technologies is a leading provider of innovative cloud solutions aimed at transforming businesses and driving digital growth. With a focus on cutting-edge technology and customer-centric approaches, we empower organizations to thrive in the digital...


  • Hyderabad, India Quiktrak, LLC Full time

    Job Title: Azure Site Reliability Engineer (SRE) / DevOps Engineer. Job Description: Summary: As an Azure Site Reliability Engineer (SRE) / DevOps Engineer, you will be responsible for ensuring the reliability, scalability, and performance of our cloud infrastructure on the Azure platform. This role involves managing deployments, implementing continuous...


  • Hyderabad, India Microsoft Full time

    Overview: Do you have a passion for high-scale services and working with some of Microsoft’s most critical cloud capabilities? We’re looking for a Senior Site Reliability Engineer with the right mix of software development, Cloud experience and passion for quality to envision, design, and deliver solutions for Microsoft's cloud Infrastructure. ...


  • Hyderabad, India Virtusa Full time

    Site Reliability Engineer - CREQ188641 Description Position: SRE Primary skills: DevOps CI/CD pipeline Location: Hyderabad Should be proficient in understanding the application monitoring stack (Logs, Events, Metrics and Alerts) and able to visualize and set up end-to-end observability. Should have proficiency in industry standard monitoring tools...


  • Hyderabad, India Snaphunt Full time

    The Offer Work within a company with a solid track record of success Great work environment Attractive salary & benefits The Job You will be responsible for : Gathering and evaluating user feedback. Providing code documentation and other inputs to technical documents. Supporting continuous improvement by investigating alternatives and new technologies...


  • Hyderabad, India FedEx ACC Full time

    Skill Required: Under general supervision, assists in the development and design of deliverables that support the resolution of moderately complex problems and technical design gaps. Supports improvement initiatives that are aligned with overarching global reliability of the company's systems, including capacity planning, failover strategies, performance...


  • Hyderabad, Telangana, India Alter Domus Full time

    ABOUT US We are Alter Domus. Meaning "The Other House" in Latin, Alter Domus is proud to be home to 85% of the top 30 asset managers in the alternatives industry, and more than 5,000 professionals across 23 countries. With a deep understanding of what it takes to succeed in alternatives, we believe in being different. Invest yourself in the alternative,...