
Mgr, Systems Reliability Engineering
3 days ago
Job Summary
Act as a bridge between Development, Operations, and the Business Team.
Lead and guide teams of support analysts in global hub locations.
Contribute to the seamless management of highly critical incidents and resumption, ensuring systematic communication to stakeholders and working closely with SMEs on root cause identification and defect fixing.
Strong stakeholder management is necessary with the CIO, Senior Managers in TS, Country CTMs covering respective applications, and Key Business Users.
Handle day-to-day operational issues, including daily health checks of applications and processes, working closely with end users, the CIO team, developers, and infrastructure teams to prioritise and resolve tickets and to provide workarounds.
Identify and deliver improvement opportunities, such as manual task automation, performance and throughput improvements, and batch optimisation, that create efficiencies and optimise technical processing (SIP initiatives: system improvement plans), thereby improving production service stability.
Identify and implement solutions to improve overall Operations efficiency that should result in cost savings or avoidance.
Ownership of the observability platform for production engineering.
Risk and issues escalation and management.
Manage SLAs with internal and external teams as well as 3rd party vendors.
Accurately estimate development efforts, timelines, and task plans for projects involving applications.
Identify opportunities to eliminate all manual and repeatable activities.
Promote a collaborative team environment that fosters creativity, innovation, and high performance.
Ensure team members maintain a high standard of professionalism in every aspect of their work.
Communicate and be the focal point for dissemination of information from management to the team and vice versa.
Attend technology and business working committees.
Identify and acknowledge team members' strengths and nurture their skills to benefit the team.
Strong people and relationship management; international exposure and the ability to handle cultural diversity.
Drive proficiency in the team in building strategic alliances and maintaining successful stakeholder relationships through effective communication, while providing service that exceeds expectations.
Good understanding of Technology GRC and controls in all disciplines and technology domains.
Self-driven and independent, with a rigorous and analytical approach to risk management, high attention to detail, and effective control execution and due diligence.
Gravitas and leadership skills to work effectively in partnership with colleagues globally within the business.
Maintain detailed working knowledge of relevant laws and regulations, industry trends, and security products.
Proven experience in leading initiatives, discussions, and coordination of many dependencies in complex and challenging circumstances.

Key Responsibilities

Strategy
Work as a member of the SRE team to enhance the application and infrastructure resiliency of the service through self-healing and automated failovers, targeting 99.9% up-time for customers.
Oversee the planned/unplanned disruption of production infrastructure to ensure accountability for building resilient, always-on systems.
Build resilience into the application so underlying system failures are handled gracefully and do not impact end users.
Influence design and development teams to always be thinking of the rainy-day scenarios.
Availability/Reliability: Take responsibility for meeting SLA/XLA expectations around the operability and reliability of our critical user service journeys, where our customers expect a 24x7 digital service offering. Examples of always-on techniques to be used include caching, circuit breakers, dark and canary releases, store and service patterns, and alternate user experience flows.
Latency/Performance: Drive conversation around development velocity using SLI/SLO data to ensure development velocity vs service reliability is optimized in partnership with Product Teams.
Iteratively review SLI/SLO/Error Budget policy to ensure the quantitative indicators of customer experience are accurate.
Where an increased focus on reliability is required, influence senior stakeholders to ensure resourcing effort is made available.

Business
Identify opportunities to eliminate all manual and repeatable activities (toil) via tooling and automation.
Reduce the number of repeat incidents by permanently fixing the underlying root cause of issues.
Functional knowledge of Cash Products and Payment schemes.

Processes
Transition to Production: Champion and evolve continuous delivery best practice standards to reduce release-related incidents and manual hand-offs, and achieve our aspiration of zero ops.
Partner with development teams to ensure applications are designed with scale, resilience, and performance in mind.

People and Talent
Collaborate with global teams in production engineering, Development, and Infrastructure, and with regional team leads.

Risk Management
Identify key issues in the business areas being supported and, based on this information, put in place appropriate controls and measures to assess, monitor, control, and mitigate risks.
Ensure a full understanding of the risk and control environment within Technology Services.
Ensure support procedures are in place and adhere to Group Security/Audit policies within Technology Services.
Active engagement with all audit issues arising in this support environment.

Governance
Responsible for assessing the effectiveness of the governance, oversight, and controls and, if necessary, overseeing changes in these areas.
Awareness and understanding of the regulatory framework in which the Group operates, and the regulatory requirements and expectations relevant to the role.

Regulatory and Business Conduct
Display exemplary conduct and live by the Group's Values and Code of Conduct.
Take personal responsibility for embedding the highest standards of ethics, including regulatory and business conduct, across Standard Chartered Bank. This includes understanding and ensuring compliance with, in letter and spirit, all applicable laws, regulations, guidelines, and the Group Code of Conduct.
Effectively and collaboratively identify, escalate, mitigate, and resolve risk, conduct, and compliance matters.
Lead to achieve the outcomes set out in the Bank's Conduct Principles: Fair Outcomes for Clients; Effective Financial Markets; Financial Crime Compliance; The Right Environment.
Serve as a Director of the Board. Exercise authorities delegated by the Board of Directors and act in accordance with Articles of Association (or equivalent).

Key stakeholders
Business Heads in the country and the group
Domain Heads in Tech Services
Country CIO and CTM
Business CIO
Hive Lead and Chapter Leads
Product Owners

Other Responsibilities
Design highly scalable and robust application architecture, class structures, and flows independently, with on-time delivery and without compromise on quality.
Application debugging and production issue investigation.
Requirement analysis and discussion with stakeholders and the technical team.
Proper documentation and delivery following the Agile delivery model.
Lead the team on technical design and guide it to resolve complex technical issues.
Weekly status report to senior management.
Follow SCB standards and compliance.

Qualifications
Education: A bachelor's or master's degree in computer science (CS) or information technology (IT).
Training: ITIL, DevOps, SRE.

Skills and Experience
Python
Bash
UNIX (Linux, AIX)
Prometheus, ITRS Geneos, or equivalent monitoring systems
ELK
Grafana
Apache Kafka
IBM MQ
Solace
Ansible
Control-M
Database and SQL
Web/Application Servers (JBoss, Apache)
Java
Spring Boot

About Standard Chartered
We're an international bank, nimble enough to act, big enough for impact. For more than 170 years, we've worked to make a positive difference for our clients, communities, and each other. We question the status quo, love a challenge, and enjoy finding new opportunities to grow and do better than before. If you're looking for a career with purpose and you want to work for a bank making a difference, we want to hear from you. You can count on us to celebrate your unique talents, and we can't wait to see the talents you can bring us.
Our purpose, to drive commerce and prosperity through our unique diversity, together with our brand promise, to be here for good, are achieved by how we each live our valued behaviours. When you work with us, you'll see how we value difference and advocate inclusion.

Together we:
Do the right thing and are assertive, challenge one another, and live with integrity, while putting the client at the heart of what we do.
Never settle, continuously striving to improve and innovate, keeping things simple and learning from doing well, and not so well.
Are better together: we can be ourselves, be inclusive, see more good in others, and work collectively to build for the long term.

What we offer
In line with our Fair Pay Charter, we offer a competitive salary and benefits to support your mental, physical, financial, and social wellbeing.
Core bank funding for retirement savings, medical and life insurance, with flexible and voluntary benefits available in some locations.
Time-off including annual leave, parental/maternity (20 weeks), sabbatical (12 months maximum), and volunteering leave (3 days), along with minimum global standards for annual and public holiday, which is combined to 30 days minimum.
Flexible working options based around home and office locations, with flexible working patterns.
Proactive wellbeing support through Unmind, a market-leading digital wellbeing platform, development courses for resilience and other human skills, global Employee Assistance Programme, sick leave, mental health first-aiders, and all sorts of self-help toolkits.
A continuous learning culture to support your growth, with opportunities to reskill and upskill and access to physical, virtual, and digital learning.
Being part of an inclusive and values driven organisation, one that embraces and celebrates our unique diversity across our teams, business functions, and geographies, where everyone feels respected and can realise their full potential.

38711
-
Reliable Systems Engineer
1 week ago
Chennai, Tamil Nadu, India beBeeReliability Full time System Reliability Expert: We are seeking a talented and proactive system reliability expert to join our infrastructure team. The ideal candidate will combine software engineering expertise with systems engineering skills to build scalable, reliable, and efficient systems. Key Responsibilities: Design, implement, and manage scalable, resilient, and secure...
-
System Reliability Engineer
7 days ago
Chennai, Tamil Nadu, India beBeeTechnical Full time US$ 1,20,000 - US$ 1,40,000 We are seeking a skilled technical expert to enhance our automated risk detection systems. This role involves supporting and maintaining rule-based logic, data analysis processes, and scalable tools. In this position, you will play a vital part in system reliability, operational support, and technical troubleshooting, contributing to the protection of...
-
Lead Systems Engineer
3 days ago
Chennai, Tamil Nadu, India Epam Systems Full time US$ 1,50,000 - US$ 2,00,000 per year We are looking for a highly skilled Lead Systems Engineer to take ownership of the design, implementation, and maintenance of Google Cloud Platform (GCP) environments for cutting-edge projects. This role requires a visionary with extensive cloud expertise to lead infrastructure initiatives, ensure system reliability, and push the boundaries of...
-
Lead Systems Engineer
3 days ago
Chennai, Tamil Nadu, India EPAM Systems Full time US$ 1,50,000 - US$ 2,00,000 per year We are looking for a highly skilled Lead Systems Engineer to take ownership of the design, implementation, and maintenance of Google Cloud Platform (GCP) environments for cutting-edge projects. This role requires a visionary with extensive cloud expertise to lead infrastructure initiatives, ensure system reliability, and push the boundaries of...
-
Reliability Systems Architect
2 days ago
Chennai, Tamil Nadu, India beBeeReliability Full time ₹ 15,00,000 - ₹ 20,00,000 Job Overview: We are seeking a highly skilled Reliability Engineer to join our team. The ideal candidate will have expertise in designing and implementing reliable systems, as well as experience with Kubernetes, Containers, Cloud, and Database. The Reliability Engineer will be responsible for ensuring the availability, reliability, and performance of our...
-
Reliable System Developer
4 days ago
Chennai, Tamil Nadu, India beBeeObservability Full time ₹ 15,00,000 - ₹ 25,00,000 Job Description: We are seeking a skilled Site Reliability Engineer to join our team. The ideal candidate will have experience with Dynatrace, observability, and cloud computing platforms. They will be responsible for designing and implementing reliable systems that can handle production traffic efficiently.
-
Reliable Systems Expert
4 days ago
Chennai, Tamil Nadu, India beBeeEngineer Full time US$ 1,04,000 - US$ 1,30,878 Job Title: Senior Site Reliability Engineer. We are looking for a Senior Site Reliability Engineer to join our team. This is an exciting opportunity to work with us and contribute to the success of our organization. As a Senior Site Reliability Engineer, you will be responsible for ensuring the reliability and performance of our systems and services. You will...
-
Senior Systems Engineer
3 days ago
Chennai, Tamil Nadu, India EPAM Systems Full time US$ 1,25,000 - US$ 1,75,000 per year We are looking for a highly skilled Senior Systems Engineer specializing in Google Cloud Platform (GCP) to join our team and take charge of designing, configuring, and maintaining cutting-edge cloud infrastructure. This role is essential in driving innovation and ensuring a scalable, secure, and efficient platform for critical projects. Responsibilities: Design,...
-
Senior Systems Engineer
3 days ago
Chennai, Tamil Nadu, India Epam Systems Full time US$ 1,25,000 - US$ 1,75,000 per year We are looking for a highly skilled Senior Systems Engineer specializing in Google Cloud Platform (GCP) to join our team and take charge of designing, configuring, and maintaining cutting-edge cloud infrastructure. This role is essential in driving innovation and ensuring a scalable, secure, and efficient platform for critical projects. Responsibilities: Design,...
-
System Reliability Specialist
1 week ago
Chennai, Tamil Nadu, India beBeeReliability Full time ₹ 8,00,000 - ₹ 12,00,000 Job Title: System Reliability Specialist. Experience: 7 to 12 Years. Qualification: Diploma/BE (Mech./Instru.). Locations: Vadodara, Chennai. Responsibility: We are seeking an experienced System Reliability Specialist to join our team. The successful candidate will be responsible for maintaining and troubleshooting complex systems, including instruments, valves,...