SME on DevOps


Thiruvananthapuram, India | Tata Elxsi | Full time

SME on DevOps / SRE

Tata Elxsi’s strong domain expertise in Media and Communications, complemented by our experience in delivering industry-vertical use cases, enables customers to differentiate and win. We offer integrated services from research and strategy to electronics and mechanical design, software development, validation, and deployment, supported by design studios, global development centers, and offices around the globe. These cover the entire spectrum of 5G services, edge computing services, and subsystems, through to the connectivity, cloud platform, and infrastructure elements.

We are seeking a Principal Architect – SRE with 10 years of experience who can contribute to the success & advancement of our projects through their leadership & expertise.

Role and Responsibilities

Own the infra, network, and security architecture and implementation of highly scalable service delivery platforms
Lead the evaluation, implementation, and streamlining of SRE practices and of end-to-end orchestration integration, deployment, and in-life management
Act as SME in the tools and frameworks used for monitoring, maintaining, and automating production systems, applications, and networks, and in their integration
Lead architectural revamps of production systems for operational scaling, platform reliability, or other modernization initiatives
Support process alignment between the engineering and operations teams on integration cadence, deployment processes, and service assurance

Qualifications and Education Requirements

Bachelor/Master of Engineering or MCA, with overall experience of 10+ years
4+ years of experience in OnPrem/Cloud infra, including production deployment and management
4+ years of experience working as an SRE supporting live systems, with experience in technically leading orchestration or modernization efforts (beyond regular maintenance of the platform)

Performance KPIs:

Project success: The number of projects that the Chief Architect has been consulted on and provided solutions and/or architectures for is a measure of their ability to help teams achieve their goals.
Customer satisfaction: The number of successful technical solutions proposed and accepted by customers is a measure of the Chief Architect's ability to understand and meet the needs of their customers.
Customer engagement: The number of customer consultations completed is a measure of the Chief Architect's commitment to providing support and guidance to their customers.
Architecture quality: The number of projects sponsored to review and ensure architecture quality is a measure of the Chief Architect's commitment to ensuring that the organization's cloud infrastructure is designed and implemented in a secure and scalable manner.
Innovation: The number of CoEs created for new service areas is a measure of the Chief Architect's ability to identify and develop new opportunities for the organization.

Preferred Skills

Must Have:

Technical skills:

System Architecture: Understanding of system design and architecture principles, including microservices, distributed systems, and monoliths.
Cloud Platforms: Proficiency in cloud computing platforms such as AWS, Azure, or GCP, including services like EC2, S3, VPC, and managed Kubernetes services.
Containerization and Orchestration: Experience with containerization technologies like Docker and container orchestration platforms like Kubernetes.
Proficiency in IaC tools like Terraform, CloudFormation, or Ansible for automating infrastructure provisioning and management.
Strong skills in scripting (e.g., Bash) and programming languages (e.g., Python, Go, Ruby) for automation and tooling development.
Expertise in setting up and configuring monitoring tools like Prometheus and Grafana, and observability platforms such as Elasticsearch, Logstash, Kibana (ELK), or similar solutions.
Deep understanding of incident management practices, including incident command, incident response playbooks, and post-incident analysis.
Understanding of security best practices, encryption, access control, and compliance requirements (e.g., SOC 2, HIPAA, GDPR).
Knowledge of networking concepts, including routing, firewalls, VPNs, and troubleshooting network issues.
Experience with log aggregation and analysis tools like Splunk, ELK Stack, or Sumo Logic.
Skills in capacity planning, understanding resource utilization patterns, and making infrastructure scaling decisions.
Expertise in creating and testing disaster recovery and backup solutions.
Experience with change management processes and tools to track and manage infrastructure and application changes.

Technical leadership:

Project estimation, including effort estimation and infra estimation
Familiarity with design notation tools such as the C4 Model, PlantUML, or similar
Experience leading multiple teams in deploying and managing different service delivery platforms

Certification: Cloud infra, Virtualization tools, Monitoring and management tools, etc.

Job location: Bangalore / Trivandrum / Chennai

Qualification - B.E, B.Tech, MCA, M.E, M.Tech/M.Sc (Elec)

Job Code - DU3003