Lead Site Reliability Engineer - India - Tanla Platforms Limited
2 weeks ago
About the Role: As a Site Reliability Engineer, you will be responsible for ensuring platform and application availability, scalability, and reliability, while maintaining optimal system uptime.
What you'll be responsible for?
- Build, monitor and maintain highly scalable, large-scale deployments.
- Install and deploy new releases and environments for applications.
- Proactively monitor systems and applications, develop and maintain monitoring tools and dashboards, and ensure high availability of production environments by identifying performance issues and implementing corrective actions.
- Incident Management: Lead incident response efforts, diagnose root causes, and implement long-term solutions to prevent recurrence. Ensure effective communication during outages.
- Collaboration & Coordination: Work closely with cross-functional teams to ensure efficient platform integration, API management, and campaign execution, while providing technical guidance and support as needed.
- Troubleshooting and Root Cause Analysis: Utilize your expertise to investigate and resolve incidents quickly during crisis situations, performing root cause analysis to prevent recurrence.
- Ensure high availability of production environments by monitoring performance metrics and implementing corrective actions when necessary.
- Platform Integration: Manage and oversee the integration of various APIs, ensuring seamless interoperability between systems and third-party services.
- Support the compliance and security integrity of the environments.
- Adhere to process compliance and ensure platform reliability.
- Experience with monitoring and automation using Prometheus/Grafana, ELK, Datadog, Dynatrace, or other observability tools.
- Experience with container management (such as Docker) and microservices architectures, in cloud or on-premises infrastructure.
What you'd have?
- Kubernetes: Expertise in creation, maintenance, scaling, and upgrades of Production clusters.
- Docker: Must have experience writing Dockerfiles that follow industry-standard best practices.
- CI/CD: Must have hands-on experience with Azure DevOps/Jenkins, creating and executing pipelines in a multi-target environment.
- Troubleshooting skills: Expertise in analyzing application logs to drill down and identify issues, using logging stacks such as ELK, Dynatrace, and Splunk.
- Monitoring Stacks: Expertise in Grafana, including building and managing dashboards on various data sources.
- Programming Skills: Experience creating and managing Bash scripts and Ansible, with some exposure to Terraform.
- Environment: Excellent hands-on skills in Linux environments, with the ability to troubleshoot issues at the OS level.
- Experience using project management tools such as JIRA.
- Experience deploying and managing distributed queuing systems such as Redis, Kafka, RabbitMQ, IBM MQ, and MSMQ.
- Experience deploying and managing databases in standalone and cluster modes, with basic DB skills in Postgres, MySQL, and ClickHouse.
- Prior experience working on high-traffic, highly scalable platforms is an added advantage.
- Good command of Linux and networking concepts (TLS/SSL, DNS, load balancers, etc.) and troubleshooting skills in large-scale environments.
- Deep understanding of basic security concepts and protocols: authentication, authorization, signing, encryption, SSL/TLS, SSH/SFTP, and X.509 certificates.
- Good knowledge of ITIL terminology for incident and problem management
- Track record of excellent interpersonal, analytical, and communication skills.
- Bachelor of Science in Computer Science or other related discipline.
Why join us?
- Impactful Work: Play a pivotal role in safeguarding Tanla's assets, data, and reputation in the industry.
- Tremendous Growth Opportunities: Be part of a rapidly growing company in the telecom and CPaaS space, with opportunities for professional development.
- Innovative Environment: Work alongside a world-class team in a challenging and fun environment, where innovation is celebrated.
Tanla is an equal opportunity employer. We champion diversity and are committed to creating an inclusive environment for all employees.