Site Reliability Engineer

3 months ago


Bangalore Urban, India PhonePe Full time

Job Overview:


As a Site Reliability Engineer (SRE) specializing in Data Platform OnPremise, you will play a critical role in deployment, ensuring the reliability, scalability, and performance of our Cloudera Data Platform (CDP) infrastructure. You will collaborate closely with cross-functional teams to design, implement, and maintain robust systems that support our data-driven initiatives. The ideal candidate will have a deep understanding of Cloudera Data Platform, strong troubleshooting skills, and a proactive mindset towards automation and optimization. You will play a pivotal role in ensuring the smooth functioning, operation, performance and security of large high density Cloudera-based infrastructure.


Key Responsibilities:


  1. Implementation of Cloudera Data Platform: Lead the implementation process of Cloudera Data Platform on-premises, including planning, installation, configuration, and integration with existing systems.
  2. Infrastructure Management: Manage and maintain the Cloudera-based infrastructure, ensuring optimal performance, high availability, and scalability. This includes monitoring system health, troubleshooting issues, and performing routine maintenance tasks.
  3. Data Security and Compliance: Implement and enforce security best practices to safeguard data integrity and confidentiality within the Cloudera environment. Ensure compliance with relevant regulations and standards (e.g., GDPR, HIPAA, DPR).
  4. Performance Optimization: Continuously optimize the Cloudera infrastructure to enhance performance, efficiency, and cost-effectiveness. Identify and resolve bottlenecks, tune configurations, and implement best practices for resource utilization.
  5. Capacity Planning: Monitor resource utilization trends and plan for future capacity needs. Proactively identify potential capacity constraints and propose solutions to address them.
  6. Backup and Disaster Recovery: Implement robust backup and disaster recovery strategies to ensure data protection and business continuity. Test and maintain backup and recovery procedures regularly.
  7. Patches & Upgrades: Routinely apply recommended patches and perform rolling upgrades of the platform in accordance with the advisory from Cloudera, InfoSec and Compliance.
  8. Documentation and Knowledge Sharing: Create comprehensive documentation for configurations, processes, and procedures related to the Cloudera Data Platform. Share knowledge and best practices with team members to foster continuous learning and improvement.
  9. Collaboration and Communication: Collaborate effectively with cross-functional teams including data engineers, developers, and IT operations personnel. Communicate project status, issues, and resolutions clearly and promptly.


Qualifications:

  1. Bachelor's degree in Computer Science, Engineering, or related field.
  2. Proficiency in Linux system administration, shell scripting, and networking concepts.
  3. 5+ years of experience in managing Big Data infrastructure.
  4. Strong understanding of distributed computing principles and experience with Hadoop ecosystem technologies (HDFS, MapReduce, YARN, Hive, Spark, etc.).
  5. Hands-on experience with configuration management tools (e.g., Salt,Ansible, Puppet, Chef).
  6. Strong scripting skills (e.g., Python, Bash) for automation and troubleshooting.
  7. Experience with monitoring and logging solutions (e.g., Prometheus, Grafana, ELK stack).
  8. Knowledge of networking principles and protocols (TCP/IP, UDP, DNS, DHCP, etc.).
  9. Experience with managing *nix based machines and strong working knowledge of quintessential Unix programs and tools (e.g. Ubuntu, Fedora, Redhat, etc.)
  10. Excellent communication skills and the ability to collaborate effectively with cross-functional teams.
  11. Excellent analytical, problem-solving, and troubleshooting skills..
  12. Proven ability to work well under pressure and manage multiple priorities simultaneously.


Good To Have:

  1. Cloudera Certified Administrator (CCA) or Cloudera Certified Professional (CCP) certification preferred.
  2. Minimum 5 years of experience in managing and administering medium/large hadoop based environments (>100 machines), including Cloudera Data Platform (CDP) experience is highly desirable.
  3. Familiarity with Open Data Lake components such as Ozone, Iceberg, Spark, Flink, etc.
  4. Familiarity with containerization and orchestration technologies (e.g. Docker, Kubernetes, OpenShift) is a plus



  • Bangalore Urban, India Integra Connect Full time

    About IntegraConnect Integra Connect delivers a comprehensive, integrated suite of cloud-based technologies and services that enable specialty groups to optimize clinical and financial performance as reimbursement shifts to value-based models. Connected by the IntegraCloud platform, the company’s core applications span population health including care...


  • Bangalore Urban, India Integra Connect Full time

    About IntegraConnect Integra Connect delivers a comprehensive, integrated suite of cloud-based technologies and services that enable specialty groups to optimize clinical and financial performance as reimbursement shifts to value-based models. Connected by the IntegraCloud platform, the company’s core applications span population health including care...


  • Bangalore Urban, India Integra Connect Full time

    About IntegraConnect Integra Connect delivers a comprehensive, integrated suite of cloud-based technologies and services that enable specialty groups to optimize clinical and financial performance as reimbursement shifts to value-based models. Connected by the IntegraCloud platform, the company’s core applications span population health including care...


  • Bangalore Urban, India Integra Connect Full time

    About IntegraConnect Integra Connect delivers a comprehensive, integrated suite of cloud-based technologies and services that enable specialty groups to optimize clinical and financial performance as reimbursement shifts to value-based models. Connected by the IntegraCloud platform, the company’s core applications span population health including care...


  • bangalore, India Cyitechsearch Full time

    We are hiring for Site Reliability Engineer Skills : - Develop and provide operational support for full-stack software applications.- Relevant industry certifications, such as through the Site Reliability Engineering (SRE) Foundation.- Five years' experience as a site reliability engineer or similar role.- Collaborate with development operations staff to...


  • Bangalore, Karnataka, India Cyitechsearch Full time

    We are hiring for Site Reliability Engineer Skills : - Develop and provide operational support for full-stack software applications.- Relevant industry certifications, such as through the Site Reliability Engineering (SRE) Foundation.- Five years' experience as a site reliability engineer or similar role.- Collaborate with development operations staff to...


  • Bangalore, Karnataka, India Cyitechsearch Full time

    We are hiring for Site Reliability Engineer Skills : - Develop and provide operational support for full-stack software applications.- Relevant industry certifications, such as through the Site Reliability Engineering (SRE) Foundation.- Five years' experience as a site reliability engineer or similar role.- Collaborate with development operations staff to...


  • Bangalore, India Cyitechsearch Full time

    We are hiring for Site Reliability Engineer Skills : - Develop and provide operational support for full-stack software applications.- Relevant industry certifications, such as through the Site Reliability Engineering (SRE) Foundation.- Five years' experience as a site reliability engineer or similar role.- Collaborate with development operations staff...


  • bangalore, India Cricbuzz.com Full time

    Site Reliability EngineerWe are looking for a highly skilled and motivated Web Server Site Reliability Engineer to join our team. As a Web Server Site Reliability Engineer, you will be responsible for ensuring the reliability, scalability, and performance of our web server infrastructure and CDN services.Experience - 3 - 5 yearsResponsibilities:● Design,...


  • Bangalore, India Cyitechsearch Full time

    About the job :We are hiring for Site Reliability EngineerExperience : 5+ Years Work Model : Remote / Contract 3 years Skills : Develop and provide operational support for fullstack software applications. Relevant industry certifications, such as through the Site Reliability Engineering (SRE) Foundation. Five years' experience as a site reliability engineer...


  • bangalore, India Kunato Full time

    Site Reliability Engineer (SRE) - Python/GolangJob Description:We are seeking a highly skilled and passionate Site Reliability Engineer (SRE) to join our technology team. The ideal candidate will possess strong programming skills with expertise in Python, Golang, or both. This role is pivotal in ensuring the high availability, performance, and security of...


  • bangalore, India Kunato Full time

    Site Reliability Engineer (SRE) - Python/GolangJob Description:We are seeking a highly skilled and passionate Site Reliability Engineer (SRE) to join our technology team. The ideal candidate will possess strong programming skills with expertise in Python, Golang, or both. This role is pivotal in ensuring the high availability, performance, and security of...


  • bangalore, India Qure.ai Full time

    About the jobJob Title: Site Reliability EngineerDepartment: EngineeringLocation: BangaloreYears of experience: 2-5 yearsType: Full Time EmploymentAbout Qure.ai:Qure.ai is one of the fastest-growing startups in India, which develops Artificial Intelligence enabled products and platforms for healthcare diagnostics. We create cutting-edge solutions that...


  • bangalore, India Kunato Full time

    Site Reliability Engineer (SRE) - Python/Golang Job Description: We are seeking a highly skilled and passionate Site Reliability Engineer (SRE) to join our technology team. The ideal candidate will possess strong programming skills with expertise in Python, Golang, or both. This role is pivotal in ensuring the high availability, performance, and security...


  • bangalore, India Vistex Full time

    Vistex is currently hiring a Site Reliability Engineer. The Vistex Site Reliability Engineer will be primarily responsible for service availability, performance, monitoring, incident response, and capacity planning. This is a highly technical, hands-on role with a strong focus on automation, accurate monitoring, actionable alerting, resilient design,...


  • bangalore, India Vistex Full time

    Vistex is currently hiring a Site Reliability Engineer. The Vistex Site Reliability Engineer will be primarily responsible for service availability, performance, monitoring, incident response, and capacity planning. This is a highly technical, hands-on role with a strong focus on automation, accurate monitoring, actionable alerting, resilient design,...


  • bangalore, India Ultrabot Innovations Full time

    Position Overview :As a Senior Site Reliability Engineer with 5-8 years of experience, you will play a key role in ensuring the reliability, scalability, and performance of our systems and infrastructure. You will leverage your expertise in Site Reliability Engineering (SRE) to implement best practices and methodologies, effectively troubleshoot complex...


  • bangalore, India Integra Connect Full time

    About IntegraConnectIntegra Connect delivers a comprehensive, integrated suite of cloud-based technologies and services that enable specialty groups to optimize clinical and financial performance as reimbursement shifts to value-based models. Connected by the IntegraCloud platform, the company’s core applications span population health including care...


  • bangalore, India Qure.ai Full time

    About the job Job Title: Site Reliability Engineer Department: Engineering Location: Bangalore Years of experience: 2-5 years Type: Full Time Employment About Qure.ai: Qure.ai is one of the fastest-growing startups in India, which develops Artificial Intelligence enabled products and platforms for healthcare diagnostics. We create cutting-edge...


  • bangalore, India Integra Connect Full time

    About IntegraConnectIntegra Connect delivers a comprehensive, integrated suite of cloud-based technologies and services that enable specialty groups to optimize clinical and financial performance as reimbursement shifts to value-based models. Connected by the IntegraCloud platform, the company’s core applications span population health including care...