Site Reliability Engineer
3 days ago
About the role:
We are seeking a highly skilled Site Reliability Engineer (SRE) to join our dynamic team. In this role, you will be responsible for ensuring the reliability, scalability, and performance of our satellite communication systems, which leverage AI and ML for automation and optimization. You will play a key role in maintaining the infrastructure, automating deployment processes, and troubleshooting complex issues in a mission-critical environment.
Key Responsibilities:
- System Reliability & Monitoring:
  - Design, implement, and maintain monitoring and alerting systems to ensure high availability and performance of satellite communication platforms.
  - Proactively identify and address system bottlenecks, vulnerabilities, and other reliability challenges.
  - Ensure infrastructure is capable of supporting AI and ML workloads at scale, with a focus on automation and efficiency.
- Infrastructure Management & Automation:
  - Build and maintain CI/CD pipelines for satellite communication AI/ML applications, ensuring smooth deployment and integration processes.
  - Implement and optimize cloud-native architectures, using platforms such as AWS, GCP, or Azure, to support AI/ML models and satellite communication systems.
  - Automate scaling, deployment, and configuration of infrastructure to ensure high availability and fault tolerance.
- Incident Management & Root Cause Analysis:
  - Lead incident response efforts, including troubleshooting, root cause analysis, and resolution of production issues.
  - Implement post-mortem analysis processes to continuously improve the reliability and performance of systems.
  - Ensure the implementation of best practices for incident documentation, including actionable feedback and lessons learned.
- Collaboration & Continuous Improvement:
  - Work closely with engineering teams, including AI/ML developers, software engineers, and network engineers, to identify areas for improvement and optimize system performance.
  - Collaborate with satellite engineers to integrate AI/ML solutions into the satellite communication stack, ensuring performance optimization and automation.
  - Contribute to the development of internal tools and dashboards to enhance system reliability and transparency.
- Security & Compliance:
  - Ensure security best practices are implemented across the satellite communication platform, particularly regarding AI/ML data privacy and satellite systems.
  - Collaborate with security teams to ensure systems are compliant with industry standards and regulations.
Qualifications:
- Required:
  - Bachelor's degree in Computer Science, Engineering, or related field (or equivalent practical experience).
  - 3+ years of experience in Site Reliability Engineering, DevOps, or Infrastructure Engineering.
  - Strong knowledge of cloud platforms (AWS, GCP, Azure) and container orchestration tools (Kubernetes, Docker).
  - Experience with infrastructure-as-code tools (Terraform, Ansible, etc.).
  - Strong expertise in monitoring, logging, and alerting tools (Prometheus, Grafana, ELK Stack, etc.).
  - Familiarity with AI/ML systems and how they can be scaled and managed in production environments.
  - Experience with scripting languages (Python, Bash, Go, etc.) for automation and tool development.
- Preferred:
  - Experience with satellite communication systems or space-based infrastructure.
  - Knowledge of networking protocols and technologies related to satellite communication.
  - Experience with machine learning frameworks (TensorFlow, PyTorch, etc.) and deploying AI models in production.
  - Familiarity with disaster recovery, backup strategies, and high-availability configurations for cloud-based systems.
  - Certification in cloud platforms (AWS Certified Solutions Architect, Google Cloud Professional Cloud Architect, etc.).
Skills & Attributes:
- Problem-Solving & Critical Thinking: Ability to think creatively and analytically to solve complex problems in real-time.
- Collaboration: Excellent team player with the ability to work cross-functionally in a collaborative environment.
- Adaptability: Able to thrive in a fast-paced, constantly evolving environment and adapt to new technologies and methodologies.
- Communication: Strong written and verbal communication skills, with the ability to explain technical concepts clearly to non-technical stakeholders.
What We'll Offer
- Professional development opportunities.
- Collaborative and innovative work environment.
- Exposure to the aviation and maritime domains, along with related business knowledge.
- Connectivity and content engineering knowledge, plus business knowledge.
- Opportunity to work in cross-functional teams.
- Performance-based bonus
- Opportunity to work across teams and organizations.
Neuron is an Equal Opportunity Employer. Employment opportunities at Neuron are based upon one's qualifications and capabilities to perform the essential functions of a particular job. All employment opportunities are provided without regard to race, religion, sex (including sexual orientation and transgender status), pregnancy, childbirth or related medical conditions, national origin, age, veteran status, disability, genetic information, or any other characteristic protected by law.
-
Site Reliability Engineer
2 days ago
Chennai, Tamil Nadu, India Insent Full time ₹ 6,00,000 - ₹ 18,00,000 per year We are looking to hire a site reliability engineer for our super fast-growing team. As a site reliability engineer, you will be responsible for deploying, supporting, monitoring, and troubleshooting a large-scale microservice-based system, and for documenting the IT infrastructure, policies, and procedures. **About Insent** Insent is a super fast-growing, enterprise...
-
Site Reliability Engineer Trainer
4 days ago
Medavakkam, Chennai, Tamil Nadu, India Intellion Technologies Pvt Ltd Full time ₹ 2,40,000 - ₹ 18,00,000 per year Job Title: Site Reliability Engineer Trainer (Part-Time / Freelance) Job Description: We are looking for an experienced Site Reliability Engineer (SRE) Trainer for a part-time freelance role. The trainer will be responsible for delivering practical and interactive sessions to learners, covering key concepts and hands-on aspects of Site Reliability...
-
Civil Site Engineer
2 weeks ago
Kundrathur, Chennai, Tamil Nadu, India The Chennai Engineer Full time ₹ 1,20,000 - ₹ 3,00,000 per year Supervise and oversee construction projects to ensure they meet specifications and timelines. Provide technical support and direction to construction teams. Ensure projects comply with health and safety regulations. Coordinate and manage site activities and resources. Handle day-to-day problems that arise on the construction site. Liaise with clients,...
-
Site Reliability Engineer
2 weeks ago
Tamil Nadu, India Tata Consultancy Services Full time TCS has been a great pioneer in feeding the fire of young techies like you. We are a global leader in the technology arena and there’s nothing that can stop us from growing together. What we are looking for: Role: Digital: Site Reliability Engineering (SRE) Experience Range: 4 – 7 Years Location: Chennai/Pune/Kolkata SRE Team Skills: (Must have) In...
-
AWS Site Reliability Engineer
3 days ago
Tamil Nadu, India HTC Global Services Full time HTC – A brief profile: Established in 1990, HTC Inc., a company with headquarters in Troy, Michigan, is a leading global Information Technology solution and BPO provider. HTC assists clients across multiple industry verticals, offering turnkey project lifecycle in e-business, data warehousing, embedded systems, ECM, SCM, CRM, and ERP solutions. HTC Inc....
-
Site Reliability Engineer II
4 weeks ago
Chennai, Tamil Nadu, India Trimble Full time Your Title: Site Reliability Engineer II Job Location: Chennai, India Our Department: Trimble Platform Are you interested in cutting-edge cloud technologies and ready to get your hands dirty in the cloud world? Do you like to be part of a core team with industry-leading site reliability engineering standards? About the Role: Are you passionate about cutting-edge cloud...
-
Site Reliability Engineer
3 days ago
Chennai, Tamil Nadu, India Elgebra Full time ₹ 6,00,000 - ₹ 18,00,000 per year Hiring: Site Reliability Engineer – 7+ Years Location: Bangalore / Chennai Payroll: Elgebra Client: Qincline Joining: Immediate to 15 Days Role Overview: We are looking for an experienced Site Reliability Engineer (SRE) with over 6 years of expertise to join our team. The ideal candidate will have strong technical skills, a problem-solving mindset, and the...
-
Site Reliability Engineer
2 weeks ago
Chennai, India Ford Motor Company Full time Job Description: Ford is seeking an experienced Site Reliability Engineer (SRE) to join our team and lead the development, enhancement, and extension of our global monitoring and observability platform. Enterprise Technology plays a critical part in shaping the future of mobility. If you're looking for the chance to leverage...
-
Site Reliability Engineer
2 weeks ago
Bengaluru, India Relanto Full time Job Description Job Title: Site Reliability Engineer Summary: We are looking for a Site Reliability Engineer to join our Digital & Transformation department. The ideal candidate will have 2-3 years of experience in this field and will be responsible for ensuring the reliability, availability, and performance of our systems and applications. Roles And...
-
Site Reliability Engineer
5 days ago
Chennai, India Siemens Full time Job Description: Dear Aspirant! We empower our people to stay resilient and relevant in a constantly changing world. We're looking for people who are always searching for creative ways to grow and learn. People who want to make a real impact, now and in the future. Does that sound like you? Then it seems like you'd make a great addition to our vibrant...