
Reliable Systems Specialist
20 hours ago
System Reliability Engineer Job Description
Our company is seeking an experienced System Reliability Engineer to join our team. The successful candidate will be responsible for building and maintaining the platform components of our Observability product.
- Build and fine-tune the platform components of the Observability product.
- Collaborate with the performance, data ingestion, platform DevOps, and data visualization teams within the Observability product.
- Support and maintain applications onboarded to the Grafana-based observability, ingestion, and visualization stack, including queries written in PromQL and log query languages, and the related monitoring technologies.
Requirements
- Experience gathering and organizing large volumes of data to instrument an enterprise observability solution.
- Ability to recommend baseline monitoring thresholds, performance monitoring KPIs, and SLAs.
- Ability to install agents and forwarders, integrate APIs, and set up performance monitoring alerts, dashboards, and data trend analysis.
- Good knowledge of Azure foundation components, e.g. Application Gateway, API Management (APIM), Virtual Network, NSG, Load Balancer, and Azure VM.
- Experience with databases such as Azure SQL, PostgreSQL, MySQL, MongoDB, or time-series databases (TSDB).
- Knowledge of monitoring tools such as Log Analytics, AppDynamics, Grafana, Prometheus, Splunk, and SiteScope.
- Hands-on Azure/GCP experience pulling observability data from managed services.
- Go/Python coding experience, or a background in SRE development and OpenTelemetry implementation.
- Experience deploying, managing, and optimizing enterprise-level observability platforms built on Grafana OSS products such as Mimir, Loki, and Tempo, with collectors such as Fluent Bit or Vector.
- Ability to design and develop standard Grafana dashboards for critical metrics of various Azure/GCP services using the observability data.
- Proficiency in at least one of the following is required: Java, Python, Go, or Node.js.
- Experience working with ServiceNow or similar Service Management tools.
- Familiarity with cloud technologies in Azure, AWS, and Google Cloud.
- Experience with PCF, Docker, and Kubernetes platforms is required.
- Experience with DevOps and CI/CD tools and processes is required.
Benefits
The successful candidate will have the opportunity to work with a highly motivated team, contributing to the development of cutting-edge technology solutions.
About Us
Our company is committed to innovation and excellence in the field of system reliability engineering. We offer a dynamic and challenging work environment that fosters growth and professional development.
-
Reliable Systems Engineer
4 days ago
Kollam, Kerala, India beBeeReliability Full time ₹ 1,50,00,000 - ₹ 2,50,00,000
About This Role
We're seeking a skilled reliability engineer to join our team. As a Senior Site Reliability Engineer, you'll play a critical role in ensuring the stability and performance of our applications. Your primary focus will be on designing and developing resilient systems that can withstand various types of failures. You'll work closely with...
-
Reliable System Architect
7 days ago
Kollam, Kerala, India beBeesystem Full time ₹ 14,24,100 - ₹ 25,17,700
About Us
We are seeking a Site Reliability Engineer to ensure the reliability, scalability, and performance of our critical systems.
-
Reliable System Developer
3 days ago
Kollam, Kerala, India beBeeEngineering Full time ₹ 1,25,00,000 - ₹ 1,75,00,000
Site Reliability Engineering Professional
Key Responsibilities
- Develop and maintain distributed, real-time systems that serve key stakeholders in the global business.
- Optimize system operations through automation and tooling to ensure efficiency and scalability.
- Build foundational technical components of a robust SRE program across multiple complex...
-
System Resilience Specialist
20 hours ago
Kollam, Kerala, India beBeeReliability Full time ₹ 30,00,000 - ₹ 35,00,000
Job Opportunity: System Resilience Specialist
We are seeking a skilled System Resilience Specialist to join our team.
Key Responsibilities:
- Collaborate with development teams to design and implement scalable, reliable, and resilient systems on AWS.
- Develop and maintain monitoring, alerting, and logging solutions to ensure the availability and performance of...
-
Distributed Systems Reliability Engineer
2 days ago
Kollam, Kerala, India beBeeReliability Full time ₹ 10,00,000 - ₹ 16,00,000
Job Overview
We are seeking a highly skilled and motivated professional to join our team as a reliability engineer. The ideal candidate will have a strong background in computer science and experience working with distributed systems, cloud platforms, and containerization technologies.
- Design, build, and maintain resilient systems that meet the demands of our...
-
Reliability Engineering Specialist
7 days ago
Kollam, Kerala, India beBeeSre Full time ₹ 1,50,00,000 - ₹ 2,00,00,000
SRE Lead Position
We are seeking a seasoned professional to assume the role of SRE Lead. As a critical member of our team, you will be responsible for overseeing the reliability and scalability of our systems. This is an exceptional opportunity to leverage your expertise in cloud platforms, infrastructure-as-code tools, and distributed systems to drive...
-
AI Reliability Expert
20 hours ago
Kollam, Kerala, India beBeeDataQuality Full time ₹ 1,50,00,000 - ₹ 2,00,00,000
Machine Learning Specialist
As a Machine Learning Engineer - Data Quality Lead, you will play a pivotal role in ensuring the reliability and quality of our Large Language Models.
- Automate pipelines that guarantee data integrity, safety, and relevance for training datasets.
- Run proxy fine-tuning and reward modeling experiments to validate dataset quality.
- Design...
-
Senior Reliability Engineer
1 day ago
Kollam, Kerala, India beBeeFinancial Full time ₹ 2,00,00,000 - ₹ 2,50,00,000
About the Role:
As a key member of our operational team, you will play a pivotal role in ensuring the reliability, scalability, and excellence of our financial systems. This critical position is focused on overseeing the day-to-day operations for Accounting and Finance applications and data platforms, guaranteeing they run smoothly and meet business...
-
Software Reliability Engineer
2 days ago
Kollam, Kerala, India beBeeQuality Full time ₹ 8,00,000 - ₹ 12,50,000
Job Opportunity: Data Quality Assurance Specialist
As a skilled quality assurance professional, you will be responsible for ensuring the quality and reliability of software products through manual and automated testing techniques. This entails executing thorough tests to identify and rectify defects.
Key Responsibilities:
-
Autonomous Vehicle Systems Specialist
1 day ago
Kollam, Kerala, India beBeeAutonomous Full time ₹ 12,00,000 - ₹ 16,00,000
Job Title: Autonomous Vehicle Systems Specialist
Experience: 4 to 8 years in testing and validating Autonomous Vehicles and Advanced Driver Assistance Systems.
Key Responsibilities:
- Test and validate AV/ADAS functions such as Adaptive Cruise Control (ACC), Lane Keeping (LK), Autonomous Emergency Braking (AEB), Automated Parking Assist (APA), and more.
- Validate...