Engineering-L2-Hyderabad-Vice President-Software Engineering-Bengaluru/Hyderabad

5 days ago


Hyderabad, Telangana, India Goldman Sachs Full time

Job Category: Vice President

Site Reliability Engineer - Vice President

Site Reliability Engineering (SRE) is an engineering discipline that combines software and systems engineering to build and run scalable, massively distributed, fault-tolerant systems. At Goldman Sachs, SRE is responsible for improving the availability and reliability of the firm’s most critical platform services and ensures they meet the requirements of our internal and external users. It is also responsible for the firmwide policies and standards focused on the firm’s digital resilience. We are looking for engineers who are motivated to collaborate with our businesses to build and run sustainable production systems which can evolve and adapt to changes in our fast-paced global business environment.

The SRE team develops and maintains platforms and tools which help other Engineering teams in Goldman Sachs to build and operate reliable and resilient systems. These systems span on-premises datacenters and multiple public cloud environments. The platforms we offer include central logging, monitoring agents, and alerting, and we provide tools to drive adoption and improvements to capacity planning, operational readiness assessments, production incident postmortems, SLIs, SLOs, and deployment automation, including canary releases. The products and services we provide to our internal customers are used by thousands of engineers every day. We believe that reliability is the most important feature of any system, and we are devoted to giving our engineers the platforms and tools they need to build and operate reliable products.

Role Overview

As a Site Reliability Engineer (SRE) at Goldman Sachs, you will be a pivotal leader in ensuring the availability, reliability, and scalability of the firm’s most critical platform applications and services. You will combine deep software and systems engineering expertise to architect, build, and run large-scale, massively distributed, fault-tolerant systems. This role involves providing technical leadership, mentoring senior engineers, and collaborating closely with internal teams and executive stakeholders to build and operate sustainable production systems that can adapt to our dynamic global business environment. You will drive a culture of continuous improvement, championing the adoption of advanced SRE principles and best practices across the organization.

Responsibilities

  • Strategic Reliability and Performance: Drive the strategic direction for availability, scalability, and performance of mission-critical applications and platform services, ensuring alignment with firm-wide objectives.

  • Architectural Leadership: Lead the design, build, and implementation of highly available, resilient, and scalable infrastructure and application architectures.

  • Advanced Automation and Tooling: Architect and develop sophisticated platforms, tools, and automation solutions to eliminate toil, optimize operational workflows, and enhance deployment processes across the enterprise.

  • Complex Incident Management and Post-Mortem Analysis: Lead critical incident response, conduct in-depth root cause analysis for systemic issues, and implement long-term preventative measures to significantly enhance system stability and resilience.

  • System Design and Capacity Planning: Partner with development teams to embed reliability into application design from inception, provide expert system design consulting, and lead comprehensive capacity planning initiatives for future growth.

  • Observability and Insights: Define and implement advanced monitoring, high-volume logging with multi-user query capabilities, and tracing strategies to provide deep, actionable insights into application performance, infrastructure health, and user experience.

  • Technical Vision and Mentorship: Provide technical vision, lead complex technical projects, conduct rigorous code reviews, enforce SDLC best practices, and actively mentor and develop senior and staff-level engineers.

  • Technology Evaluation and Adoption: Stay at the forefront of industry trends and advancements, evaluating and integrating cutting-edge tools and frameworks to significantly improve operational efficiency and reliability.

  • On-Call Leadership: Participate in and lead on-call rotations, providing expert guidance and hands-on support for critical system incidents.

Qualifications

Experience: Minimum of 10-15 years of hands-on experience in Site Reliability Engineering, with a proven track record in architecting, designing, building, and maintaining highly available, scalable, and fault-tolerant systems at an enterprise level.

Technical Proficiency:

  • Exceptional programming skills in one or more major languages such as Java, Python, or Go, with a focus on building robust, scalable software.

  • Extensive hands-on experience with cloud platforms (e.g., AWS, GCP) and deep expertise in containerization and orchestration technologies (e.g., Docker, Kubernetes).

  • Mastery of Infrastructure as Code (IaC) tools (e.g., Terraform, CloudFormation) and configuration management tools (e.g., Puppet, Chef, Ansible).

  • Profound understanding of Linux internals, networking, distributed systems, and advanced system performance tuning.

  • Expertise in designing and implementing comprehensive monitoring, alerting, logging, and tracing solutions (e.g., Prometheus, Grafana, ELK stack, Datadog, PagerDuty).

  • Deep experience with CI/CD tools and practices (e.g., Jenkins, GitLab, Maven).

  • Strong foundation in databases and distributed systems.

  • Exceptional problem-solving abilities and analytical skills, with a track record of resolving complex technical challenges.

Preferred Experience:

  • Experience with distributed databases like Elasticsearch.

  • Experience working with GCP BigQuery.

  • Experience with messaging systems like Kafka.

Education: Advanced degree (Bachelor’s, Master’s, or PhD) in Computer Science or a related technical field involving coding and/or systems engineering, or equivalent practical experience.

Soft Skills:

  • Superior communication, collaboration, and interpersonal skills, with the ability to influence technical direction, lead cross-functional initiatives, and effectively engage with global teams and executive leadership.

  • Proven ability to work independently, manage multiple complex stakeholders, and drive significant organizational change.

The Goldman Sachs Group, Inc., 2023. All rights reserved. Goldman Sachs is an equal opportunity employer and does not discriminate on the basis of race, color, religion, sex, national origin, age, veterans status, disability, or any other characteristic protected by applicable law.



  • Hyderabad/ Secunderabad, India Goldman Sachs Services Pvt Ltd Full time ₹ 12,00,000 - ₹ 36,00,000 per year

    Job Description Site Reliability Engineer - Vice President Site Reliability Engineering (SRE) is an engineering discipline that combines software and systems engineering to build and run scalable, massively distributed, fault-tolerant systems. At Goldman Sachs, SRE is responsible for improving the availability and reliability of the firm’s most critical...


  • Hyderabad/ Secunderabad, India Goldman Sachs Services Pvt Ltd Full time ₹ 12,00,000 - ₹ 36,00,000 per year

    Job Description Unix/Compute Engineering is a global team that architects and manages the Linux, Virtualization, and Server Hardware computing platform at Goldman Sachs. The Unix/Compute Engineering team works closely with application developers and strategists to build and deploy technology solutions at Goldman Sachs. The team currently supports an...


  • Hyderabad, India Goldman Sachs Full time

    Site Reliability Engineer - Vice President Site Reliability Engineering (SRE) is an engineering discipline that combines software and systems engineering to build and run scalable, massively distributed, fault-tolerant systems. At Goldman Sachs, SRE is responsible for improving the availability and reliability of the firm’s most critical platform...


  • Hyderabad, Telangana, India Goldman Sachs Full time

    Job Category: Vice President. Senior Site Reliability Engineer (SRE), 12 Years Experience. Short Description for Internal Candidates: The Senior Site Reliability Engineer (SRE) will serve as a technical leader and subject matter expert responsible for defining, implementing, and optimizing the reliability, performance, and scalability of our most...

  • Vice President

    3 days ago


    Hyderabad, Telangana, India TERRA INTERNATIONAL MUN Full time

    Company Description Terra International MUN is the world’s first sustainable Model United Nations conference, focusing on climate action and diplomacy. We host international conferences that bring together young leaders, diplomats, and changemakers to address pressing environmental and social issues. Our mission is to inspire the next generation to take...


  • Hyderabad, Telangana, India Goldman Sachs Services Pvt Ltd Full time ₹ 20,00,000 - ₹ 25,00,000 per year

    Engineering-L2-Hyderabad-Vice President-Software Engineering Unix/Compute Engineering is a global team that architects and manages the Linux, Virtualization, and Server Hardware computing platform at Goldman Sachs. The Unix/Compute Engineering team works closely with application developers and strategists to build and deploy technology solutions at Goldman...

  • Vice President

    7 days ago


    Hyderabad, India TERRA INTERNATIONAL MUN Full time

    Company Description Terra International MUN is the world’s first sustainable Model United Nations conference, focusing on climate action and diplomacy. We host international conferences that bring together young leaders, diplomats, and changemakers to address pressing environmental and social issues. Our mission is to inspire the next generation to take...

  • Vice President

    2 days ago


    Hyderabad, India TERRA INTERNATIONAL MUN Full time

    Company Description Terra International MUN is the world’s first sustainable Model United Nations conference, focusing on climate action and diplomacy. We host international conferences that bring together young leaders, diplomats, and changemakers to address pressing environmental and social issues. Our mission is to inspire the next generation to take...

  • Vice President

    1 day ago


    Hyderabad, India TERRA INTERNATIONAL MUN Full time

    Company Description Terra International MUN is the world’s first sustainable Model United Nations conference, focusing on climate action and diplomacy. We host international conferences that bring together young leaders, diplomats, and changemakers to address pressing environmental and social issues. Our mission is to inspire the next generation to take...