Site Reliability Engineer II

1 week ago


Bengaluru India Microsoft Full time

Job Description

The Production Engineering and Artificial Intelligence (AI) Group, part of the Linux Systems Group within Microsoft, plays a critical role in powering Azure Cloud. This team ensures that Azure operates with the latest version of Linux software at the highest levels of quality and performance, serving as the gatekeeper for production software. The team achieves this at Azure scale through efficient automation and by leveraging artificial intelligence to reduce the human effort required for these responsibilities. This is an excellent opportunity to join the Production Engineering and AI Group and contribute to the growth of Microsoft's Azure Cloud infrastructure.

As a Site Reliability Engineer II, you will be responsible for ensuring that software deployments follow safe rollout processes while driving operational excellence. You will leverage technical expertise, telemetry analysis, and advanced artificial intelligence to maintain reliability and performance across large-scale systems.

Microsoft's mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.

Responsibilities

- Independently write code or scripts that automate scalable operations processes (e.g., monitoring, alerting, deploying products and updates) across components and features of products.
- Create, test, and deploy changes through a safe deployment process (SDP) and improve the observability, security, reliability, and operability of systems operating at hyper scale.
- Use tools and processes, leveraging AI capabilities, to troubleshoot problems affecting the availability, security, reliability, and performance of components.
- Enable the team to increase the velocity at which changes can be reliably and safely deployed in production, and monitor the effects of these changes.
- Respond to incidents during regular on-call rotations and take appropriate action to mitigate impact. You will develop alerts and automated monitoring infrastructure to notify of degradation in performance or availability, and draw insights from this data to manage infrastructure optimally.

Qualifications

Required Qualifications:

- 4+ years technical experience in software engineering, network engineering, or systems administration
  - OR Bachelor's Degree in Computer Science, Information Technology, or related field AND 1+ year(s) technical experience in software engineering, network engineering, or systems administration
  - OR Master's Degree in Computer Science, Information Technology, or related field.
- 1+ years experience in Cloud Infrastructure and Data Center Expertise
  - Managing public cloud infrastructure or large-scale data center setups.
  - Site Reliability Engineering (SRE) principles.
  - Safe deployment practices in hyper-scale data centers.
  - Distributed systems designed for high availability and incident handling protocols.
- 1+ years experience in Programming and Automation Skills
  - Python and Bash or PowerShell scripting and advances in cloud technologies.

Other Qualifications

- Ability to meet Microsoft, customer, and/or government security screening requirements is required for this role. These requirements include, but are not limited to, the following specialized security screenings:
  - Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud Background Check upon hire/transfer and every two years thereafter.

Preferred Qualifications

- 5+ years technical experience in software engineering, network engineering, or systems administration
  - OR Bachelor's Degree in Computer Science, Information Technology, or related field AND 2+ years technical experience in software engineering, network engineering, or systems administration
  - OR Master's Degree in Computer Science, Information Technology, or related field AND 1+ year(s) technical experience in software engineering, network engineering, or systems administration.
- 1+ year(s) people management experience.

#azurecorejobs

Microsoft is an equal opportunity employer. Consistent with applicable law, all qualified applicants will receive consideration for employment without regard to age, ancestry, citizenship, color, family or medical care leave, gender identity or expression, genetic information, immigration status, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran or military status, race, ethnicity, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable local laws, regulations and ordinances. If you need assistance and/or a reasonable accommodation due to a disability during the application process, read more about requesting accommodations.



  • Bengaluru, Karnataka, India Microsoft Full time ₹ 8,00,000 - ₹ 24,00,000 per year

    The Production Engineering and Artificial Intelligence (AI) Group, part of the Linux Systems Group within Microsoft, plays a critical role in powering Azure Cloud. This team ensures that Azure operates with the latest version of Linux software at the highest levels of quality and performance, serving as the gatekeeper for production software. The team...


  • India Microsoft Full time

    Job Description The Production Engineering and Artificial Intelligence (AI) Group, part of the Linux Systems Group within Microsoft, plays a critical role in powering Azure Cloud. This team ensures that Azure operates with the latest version of Linux software at the highest levels of quality and performance, serving as the gatekeeper for production software....


  • Bengaluru, Karnataka, India JPMorganChase Full time US$ 80,000 - US$ 1,20,000 per year

    Description Play a key role in ensuring system reliability at one of the world's most iconic and largest financial institutions. As a Site Reliability Engineer II at JPMorgan Chase within the Corporate Technology, Finance Last Mile Reporting team, you will use technology to solve business problems and leverage software engineering best practices as we strive...


  • Bengaluru, India Flipkart Full time

    Hiring Site Reliability Engineers Exp : 2.5 +years (Excluding internship) Location : Bangalore Apply Here : H7x UGUH The engineer will work in the Reliability and Productivity Engineering team and is responsible for building industry standard large scale platforms to be utilised across FK that helps to significantly improve the reliability of systems and...


  • Bengaluru, India Flipkart Full time

    Hiring Site Reliability Engineers Exp : 2.5 +years (Excluding internship) Location : Bangalore Apply Here : The engineer will work in the Reliability and Productivity Engineering team and is responsible for building industry standard large scale platforms to be utilised across FK that helps to significantly improve the reliability of systems and bring efficiency...


  • Bengaluru, Karnataka, India d416f97b-2589-437a-8e64-3348cfe4008b Full time ₹ 12,00,000 - ₹ 36,00,000 per year

    Hiring Site Reliability Engineers Exp : 2.5 +years [Excluding internship] Location : Bangalore Apply Here : The engineer will work in the Reliability and Productivity Engineering team and is responsible for building industry standard large scale platforms to be utilised across FK that helps to significantly improve the reliability of systems and bring...


  • Bengaluru, Karnataka, India Microsoft Full time ₹ 12,00,000 - ₹ 24,00,000 per year

    The Production Engineering and Artificial Intelligence (AI) Group, part of the Linux Systems Group within Microsoft, plays a critical role in powering Azure Cloud. This team ensures that Azure operates with the latest version of Linux software at the highest levels of quality and performance, serving as the gatekeeper for production software. The team...


  • India Akamai Full time ₹ 12,00,000 - ₹ 24,00,000 per year

    Do you like collaborating across teams to solve complex problems? Do you enjoy solving large scale systems problems? Join our Zero Trust Security Team. Akamai is a leading developer of a distributed platform for cloud computing, security, and content delivery. At SIA Enterprise, we develop protective measures that harness Akamai's real-time cloud security...


  • India Atlan Full time ₹ 12,00,000 - ₹ 24,00,000 per year

    Data is at the core of modern business, yet many teams struggle with its overwhelming volume and complexity. At Atlan, we're changing that. As the world's first active metadata platform, we help organisations transform data chaos into clarity and seamless collaboration. From Fortune 500 leaders to hyper-growth startups, from automotive innovators redefining...


  • Bengaluru, Karnataka, India CME Group Full time

    CME Group is the world's leading and most diverse derivatives marketplace, offering futures and options across a wide range of industries. We are seeking a passionate SRE to join our dynamic team. The Application Site Reliability Engineer II will help ensure the reliability and performance of our Markets trading and real-time post-trade systems; systems where...