Site Reliability Engineer - On-Prem

4 weeks ago


Delhi, India PhonePe Full time
Job Overview:

As a Site Reliability Engineer (SRE) specializing in Data Platform On-Premise, you will play a critical role in deployment, ensuring the reliability, scalability, and performance of our Cloudera Data Platform (CDP) infrastructure. You will collaborate closely with cross-functional teams to design, implement, and maintain robust systems that support our data-driven initiatives. The ideal candidate will have a deep understanding of Cloudera Data Platform, strong troubleshooting skills, and a proactive mindset towards automation and optimization. You will play a pivotal role in ensuring the smooth functioning, operation, performance, and security of large, high-density Cloudera-based infrastructure.

Key Responsibilities:

  • Implementation of Cloudera Data Platform: Lead the implementation process of Cloudera Data Platform on-premises, including planning, installation, configuration, and integration with existing systems.
  • Infrastructure Management: Manage and maintain the Cloudera-based infrastructure, ensuring optimal performance, high availability, and scalability. This includes monitoring system health, troubleshooting issues, and performing routine maintenance tasks.
  • Data Security and Compliance: Implement and enforce security best practices to safeguard data integrity and confidentiality within the Cloudera environment. Ensure compliance with relevant regulations and standards (e.g., GDPR, HIPAA, DPR).
  • Performance Optimization: Continuously optimize the Cloudera infrastructure to enhance performance, efficiency, and cost-effectiveness. Identify and resolve bottlenecks, tune configurations, and implement best practices for resource utilization.
  • Capacity Planning: Monitor resource utilization trends and plan for future capacity needs. Proactively identify potential capacity constraints and propose solutions to address them.
  • Backup and Disaster Recovery: Implement robust backup and disaster recovery strategies to ensure data protection and business continuity. Test and maintain backup and recovery procedures regularly.
  • Patches & Upgrades: Routinely apply recommended patches and perform rolling upgrades of the platform in accordance with advisories from Cloudera, InfoSec, and Compliance.
  • Documentation and Knowledge Sharing: Create comprehensive documentation for configurations, processes, and procedures related to the Cloudera Data Platform. Share knowledge and best practices with team members to foster continuous learning and improvement.
  • Collaboration and Communication: Collaborate effectively with cross-functional teams including data engineers, developers, and IT operations personnel. Communicate project status, issues, and resolutions clearly and promptly.

Qualifications:

  • Bachelor's degree in Computer Science, Engineering, or a related field.
  • Proficiency in Linux system administration, shell scripting, and networking concepts.
  • 5+ years of experience in managing Big Data infrastructure, Terraform.
  • Strong understanding of distributed computing principles and experience with Hadoop ecosystem technologies (HDFS, MapReduce, YARN, Hive, Spark, etc.).
  • Hands-on experience with configuration management tools (e.g., Salt, Ansible, Puppet, Chef).
  • Strong scripting skills (e.g., Python, Bash) for automation and troubleshooting.
  • Experience with monitoring and logging solutions (e.g., Prometheus, Grafana, ELK stack).
  • Knowledge of networking principles and protocols (TCP/IP, UDP, DNS, DHCP, etc.).
  • Experience managing *nix-based machines (e.g., Ubuntu, Fedora, Red Hat) and strong working knowledge of essential Unix programs and tools.
  • Excellent communication skills and the ability to collaborate effectively with cross-functional teams.
  • Excellent analytical, problem-solving, and troubleshooting skills.
  • Proven ability to work well under pressure and manage multiple priorities simultaneously.

Good To Have:

  • Cloudera Certified Administrator (CCA) or Cloudera Certified Professional (CCP) certification preferred.
  • Minimum 5 years of experience in managing and administering medium/large Hadoop-based environments (>100 machines); Cloudera Data Platform (CDP) experience is highly desirable.
  • Familiarity with Open Data Lake components such as Ozone, Iceberg, Spark, Flink, etc.
  • Familiarity with containerization and orchestration technologies (e.g., Docker, Kubernetes, OpenShift) is a plus.



  • Delhi, Delhi, India PhonePe Full time

    Job Overview: As a Site Reliability Engineer (SRE) specializing in Data Platform On-Premise, you will have a crucial role in the deployment process to ensure the reliability, scalability, and performance of our Cloudera Data Platform (CDP) infrastructure. Your collaboration with various teams will be key in designing, implementing, and maintaining sturdy...


  • Delhi, India World Wide Technology Full time

    World Wide Technology (WWT), a global technology integrator and supply chain solutions provider. WWT employs more than 7000 people worldwide and operates in more than 2 million square feet of state-of-the-art warehousing, distribution, and integration space strategically located throughout the world. WWT is ranked on Glassdoor Best Places to Work for 12...


  • Delhi, India Cricbuzz.com Full time

    Site Reliability Engineer. We are looking for a highly skilled and motivated Web Server Site Reliability Engineer to join our team. As a Web Server Site Reliability Engineer, you will be responsible for ensuring the reliability, scalability, and performance of our web server infrastructure and CDN services. Experience: 3-5 years. Responsibilities: ● Design,...


  • Delhi, India ViewSonic Full time

    Job Requirements: Bachelor’s degree in Computer Science, Engineering, or a related field. 3+ years of experience as a Site Reliability Engineer, DevOps Engineer, or similar role. Proficient in AWS solutions including but not limited to EC2, S3, CloudWatch, Lambda, and RDS. Strong understanding of Platform Engineering concepts and principles. Experience with...


  • Delhi, India Korn Ferry Full time

    Role: Site Reliability Engineer. Experience: 5+ years required. Location: Hyderabad (Work from Office, Hybrid). Shift Timings: 5 AM - 1 PM IST. We are looking for a Site Reliability Engineer with a strong development background to join our team. In this role, you will be responsible for ensuring the reliability and performance of our systems. You will work closely to our...


  • Delhi, India SID Global Solutions Full time

    Dear Candidates, we are looking for immediate joiners with 8 to 9 years of experience for the Hyderabad location. We are seeking a talented Site Reliability Engineer-Manager to join our dynamic team and contribute to the development of our cutting-edge web applications. If you're passionate about the role and have experience in SRE, GCP, and Kubernetes, send me your updated CV: Please...


  • Delhi, India Daxko Full time

    Company Description: Daxko powers health & wellness throughout the world. Every day our team members focus their passion and expertise in helping health & wellness facilities operate efficiently and engage their members. Whether a neighborhood yoga studio, a national franchise with locations in every city, a YMCA or JCC--and every type of organization in...


  • Delhi, India SID Global Solutions Full time

    Dear Candidates, we are looking for immediate joiners with 8 to 9 years of experience for the Hyderabad location. We are seeking a talented Site Reliability Engineer-Manager to join our dynamic team and contribute to the development of our cutting-edge web applications. If you're passionate about the role and have experience in SRE, GCP, and Kubernetes, send me your updated CV: find below the...


  • Delhi, Delhi, India Serendipity Recruiting Full time

    Job Description: As a Site Reliability Engineer (SRE), you will play a vital role in continuously driving improvements in observability, performance, and reliability, aiming to make a substantial impact across the federal government. Our client firmly believes that exceptional technology services are built upon exceptional individuals. For over two decades, our...


  • Delhi, India Quiktrak, LLC Full time

    Job Title: Azure Site Reliability Engineer (SRE) / DevOps Engineer. Job Description: Summary: As an Azure Site Reliability Engineer (SRE) / DevOps Engineer, you will be responsible for ensuring the reliability, scalability, and performance of our cloud infrastructure on the Azure platform. This role involves managing deployments, implementing continuous...


  • Delhi, India World Wide Technology Full time

    Responsibilities: This role is part of a dedicated team of SREs that operate mission-critical IT Infrastructure and Cloud Management platforms. The role requires communication skills and patience to work with people as well as technology. We encourage our engineers to work with and understand our users, leveraging our experience to understand problems, and...


  • Delhi, Delhi, India Exoscale Full time

    Job Description: Exoscale is the leading Swiss/European cloud service provider. With services covering the full cloud infrastructure spectrum - from fast deploying virtual machines to S3 compatible object storage - Exoscale provides a simple and scalable experience in order to let its clients focus on their core business. Join a dynamic working environment with...


  • Delhi, India System Soft Technologies Full time

    Title: Site Reliability Engineer. 100% REMOTE. The Site Reliability Engineer (SRE) is a technician who utilizes an array of skills to enhance reliability in critical customer-facing digital assets. The SRE is responsible for maintaining the availability and performance of relevant systems through supporting, building, and enhancing applications, tools and...


  • Delhi, India World Wide Technology Full time

    Responsibilities: This role is part of a dedicated team of SREs that operate mission-critical IT Infrastructure and Cloud Management platforms. The role requires communication skills and patience to work with people as well as technology. We encourage our engineers to work with and understand our users, leveraging our experience to understand problems, and...


  • Delhi, India System Soft Technologies Full time

    Title: Site Reliability Engineer. 100% REMOTE. The Site Reliability Engineer (SRE) is a technician who utilizes an array of skills to enhance reliability in critical customer-facing digital assets. The SRE is responsible for maintaining the availability and performance of relevant systems through supporting, building, and enhancing applications, tools and...


  • Delhi, Delhi, India World Wide Technology Full time

    Responsibilities: This role is part of a dedicated team of SREs that operate mission-critical IT Infrastructure and Cloud Management platforms. The role requires communication skills and patience to work with people as well as technology. We encourage our engineers to work with and understand our users, leveraging our experience to understand problems, and...


  • Delhi, India Amicon Hub Services Full time

    Why WE: Competitive salary and performance-based incentives; opportunities for career growth and advancement; collaborative and innovative work environment; cutting-edge technology solutions; strong commitment to employee development and well-being. Job details: Position: DevOps Engineer. Nature of Job: Permanent. Location: Noida. Working Day...


  • Delhi, Delhi, India System Soft Technologies Full time

    Title: Site Reliability Engineer. 100% REMOTE. The Site Reliability Engineer (SRE) is a technician who utilizes an array of skills to enhance reliability in critical customer-facing digital assets. The SRE is responsible for maintaining the availability and performance of relevant systems through supporting, building, and enhancing applications, tools and...


  • Delhi, India WaferWire Cloud Technologies Full time

    Role: SRE (Site Reliability Engineer). Experience: 4+ Years. About WaferWire Cloud Technologies: WaferWire Cloud Technologies is a leading provider of innovative cloud solutions aimed at transforming businesses and driving digital growth. With a focus on cutting-edge technology and customer-centric approaches, we empower organizations to thrive in the digital...