
Lead Site Reliability Engineer
3 weeks ago
About the Role: As a Site Reliability Engineer, you will be responsible for ensuring platform and application availability, scalability, and reliability, while maintaining optimal system uptime.
What you'll be responsible for?
- Build, monitor and maintain highly scalable, large-scale deployments.
- Install and deploy new releases and application environments.
- Proactively monitor systems and applications, develop and maintain monitoring tools and dashboards, and ensure high availability of production environments by identifying performance issues and implementing corrective actions.
- Incident Management: Lead incident response efforts, diagnose root causes, and implement long-term solutions to prevent recurrence. Ensure effective communication during outages.
- Collaboration & Coordination: Work closely with cross-functional teams to ensure efficient platform integration, API management, and campaign execution, while providing technical guidance and support as needed.
- Troubleshooting and Root Cause Analysis: Utilize your expertise to investigate and resolve incidents quickly during crisis situations, performing root cause analysis to prevent recurrence.
- Ensure high availability of production environments by monitoring performance metrics and implementing corrective actions when necessary.
- Platform Integration: Manage and oversee the integration of various APIs, ensuring seamless interoperability between systems and third-party services.
- Support the compliance and security integrity of the environments.
- Adhere to process compliance and ensure platform reliability.
- Experience with monitoring and automation using Prometheus and Grafana, ELK, Datadog, Dynatrace, or other observability tools.
- Experience with container management (such as Docker) and microservices architectures in cloud or on-premises infrastructure.
What you'd have?
- Kubernetes: Expertise in creating, maintaining, scaling, and upgrading production clusters.
- Docker: Must have experience writing Dockerfiles that comply with industry-standard best practices.
- CI/CD: Must have hands-on experience with Azure DevOps/Jenkins, creating and executing pipelines in a multi-target environment.
- Troubleshooting skills: Expertise in analyzing application logs to drill down and identify issues, with expertise in logging stacks such as ELK, Dynatrace, and Splunk.
- Monitoring Stacks: Expertise in Grafana, including building and managing dashboards on various data sources.
- Programming Skills: Experience creating and managing Bash scripts and Ansible, with some exposure to Terraform.
- Environment: Excellent hands-on skills in Linux environments and the ability to troubleshoot issues at the OS level.
- Experience using project management tools such as JIRA.
- Experience deploying and managing distributed queuing systems such as Redis, Kafka, RabbitMQ, IBM MQ, and MSMQ.
- Experience deploying and managing databases in standalone and cluster modes, with basic DB skills in Postgres, MySQL, and ClickHouse.
- Prior experience working on high-traffic, highly scalable platforms is an added advantage.
- Good command of Linux, networking concepts (TLS/SSL, DNS, load balancers, etc.), and troubleshooting in large-scale environments.
- Deep understanding of basic security concepts and protocols: authentication, authorization, signing, encryption, SSL/TLS, SSH/SFTP, and X.509 certificates.
- Good knowledge of ITIL terminology for incident and problem management
- Track record of excellent interpersonal, analytical, and communication skills.
- Bachelor of Science in Computer Science or a related discipline.
Why join us?
- Impactful Work: Play a pivotal role in safeguarding Tanla's assets, data, and reputation in the industry.
- Tremendous Growth Opportunities: Be part of a rapidly growing company in the telecom and CPaaS space, with opportunities for professional development.
- Innovative Environment: Work alongside a world-class team in a challenging and fun environment, where innovation is celebrated.
Tanla is an equal opportunity employer. We champion diversity and are committed to creating an inclusive environment for all employees.
www.tanla.com
-
Site Reliability Engineer
10 hours ago
Kozhikode, Kerala, India beBeeReliability Full time ₹ 2,50,00,000 - ₹ 3,50,00,000
Job Overview: The role of Site Reliability Engineer focuses on ensuring the reliability and scalability of financial systems. Sustainability Management: Responsible for day-to-day operations of accounting and finance applications and data platforms to ensure they meet business expectations. Availability and Reliability: Ensure that accounting and finance...
-
Site Reliability Specialist
11 hours ago
Kozhikode, Kerala, India beBeeSre Full time ₹ 18,75,346 - ₹ 25,16,230
Job Summary: We are seeking a seasoned Site Reliability Engineer to ensure the stability and scalability of accounting platforms. The ideal candidate will have 5-7 years of experience in Site Reliability Engineering, DevOps, or Production Engineering, with strong expertise in monitoring/observability tools, CI/CD pipelines, automation frameworks, and IaC...
-
Site Reliability Engineer
2 days ago
Kozhikode, Kerala, India beBeeReliability Full time ₹ 1,50,00,000 - ₹ 2,50,00,000
We are seeking a skilled Site Reliability Engineer, a seasoned professional with over 12 years of experience in the industry, to take on a challenging role. About the Position: The ideal candidate will be responsible for driving and implementing a robust SRE strategy that aligns with our business objectives. Key responsibilities include: Promoting a culture of...
-
Senior Director of Site Reliability
4 days ago
Kozhikode, Kerala, India beBeeSiteReliability Full time ₹ 1,50,00,000 - ₹ 2,50,00,000
Job Summary: We are seeking a highly experienced Senior Director of Site Reliability to lead our SRE efforts. About the Role: The Senior Director of Site Reliability will be responsible for developing and implementing a comprehensive SRE strategy, promoting automation culture in operating services, identifying toil-heavy processes and automating them, developing...
-
Trustworthy Site Reliability Specialist
11 hours ago
Kozhikode, Kerala, India beBeeReliability Full time ₹ 25,00,000 - ₹ 30,00,000
Site Reliability Engineer Role: We are looking for a skilled Site Reliability Engineer with expertise in ensuring the stability and scalability of financial platforms. The ideal candidate will have experience in implementing automation, monitoring, and incident response to drive operational excellence. Key Responsibilities: Ensure accounting and finance platforms...
-
Senior Software Engineer
13 hours ago
Kozhikode, Kerala, India beBeeDevOps Full time ₹ 1,44,00,000 - ₹ 2,16,00,000
Senior Software Engineer - Site Reliability. We are seeking a highly skilled Senior Software Engineer to join our team. As a key member of our site reliability engineering team, you will play a critical role in ensuring the high availability and performance of our systems. You will be responsible for identifying potential system issues early and implementing...
-
Kozhikode, Kerala, India beBeeEngineering Full time ₹ 1,20,00,000
Job Opportunity: The Senior Technical Manager for Site Reliability Engineering will lead a remote team of engineers, driving operational excellence and fostering a high-performing team culture. This role is responsible for overseeing day-to-day operations, technical mentorship, and strategic alignment with business goals. About the Job: This is a leadership...
-
Site Reliability Professional
2 days ago
Kozhikode, Kerala, India beBeeReliability Full time ₹ 1,50,00,000 - ₹ 1,80,00,000
Job Description: We are seeking a skilled professional to fill the role of Site Reliability Engineer. The successful candidate will be responsible for ensuring the seamless operation of our digital infrastructure. The ideal candidate will have expertise in troubleshooting, automation, and customer support with proficiency in Java, Python, Bash, or similar...
-
Technical Lead
2 days ago
Kozhikode, Kerala, India beBeeReliability Full time ₹ 2,00,00,000 - ₹ 2,50,00,000
Job Title: Technical Lead - Platform Reliability. We are seeking an experienced technical professional to lead the platform reliability team. The successful candidate will be responsible for ensuring the stability, scalability, and operational excellence of our financial platforms. The ideal candidate will have a strong background in site reliability...
-
System Reliability Engineering Lead
1 week ago
Kozhikode, Kerala, India beBeeReliability Full time ₹ 25,00,000 - ₹ 40,00,000
Job Title: System Reliability Engineering Lead. The Role: Our company is seeking an experienced and dynamic Site Reliability Engineering (SRE) Lead to oversee the reliability, scalability, and performance of our critical systems. As an SRE Lead, you will play a pivotal role in establishing and implementing SRE practices, leading a team of engineers, and driving...