Techops I
1 week ago
ID: 1406 | 1-2 yrs | Bengaluru | careers

ClearTax is on a mission to simplify the financial lives of Indians. Product & Engineering excellence is what defines us and what we stand for, and the opportunity is to build products that will shape ClearTax's vision. We are looking for engineering leaders who can join this mission. Our engineers are involved in all parts of the product life cycle: idea generation, design, planning, execution, and shipping. We create reliable, scalable and highly performant systems. A commitment to teamwork, hustle and good communication skills are key requirements.

RESPONSIBILITIES
- Drive operational excellence for the connected services that a business offers to its customers, delivering an "always on" operation, year round, at the right cost.
- Use knowledge of technology and operational best practices to drive the design, development and implementation of operational standards and capabilities for connected services that enable highly available, scalable and reliable customer experiences.
- Analyze and synthesize a variety of inputs to drive the end-to-end incident management process for multiple offerings.
- Develop monitoring architecture and implement monitoring agents, dashboards, escalations, and alerts.
- Develop and drive incident management processes, playbooks and stakeholder communication mechanisms.
- Oversee change management and configuration management operating mechanisms.
- Drive root cause analysis (RCA) and risk management processes.
- Drive ongoing improvements and efficiencies in operational practices, tools and processes.

REQUIRED QUALIFICATIONS
- Strong OS fundamentals, including process and memory management
- Strong networking concepts
- Good knowledge of Linux command-line tools and the ability to write Bash scripts. Proficiency with at least one other scripting language (Perl/Python/Ruby).
- Expertise in UNIX/Linux/Windows operating system administration and management
- DBMS: MySQL. Familiarity with Postgres or MongoDB is a plus.
- Strong understanding of system monitoring and troubleshooting
- Experience working in a 24/7 team managing large-scale distributed systems
- Familiarity with CI/CD tools like Jenkins
- Familiarity with configuration management tools like Ansible, Terraform, Chef, Puppet, etc.
- Familiarity with AWS service administration is a bonus.
-
Site Reliability Engineer
2 weeks ago
Bengaluru, India | Pro5.ai | Full time

Our client is seeking a Site Reliability Engineer I to join their growing technology operations team. This role is ideal for someone passionate about system reliability, incident response, and cross-team collaboration in a large-scale cloud environment.

What You’ll Do
- Act as the first point of contact for all customer-affecting issues.
- Drive and manage the...