Senior Database Infrastructure Engineer - Cassandra, DataStax, Big Data Pipelines

2 weeks ago


Pune, India HEROIC.com Full time

HEROIC Cybersecurity (HEROIC) is seeking a Senior Data Infrastructure Engineer with deep expertise in DataStax Enterprise (DSE) and Apache Cassandra to help architect, scale, and maintain the data infrastructure that powers our cybersecurity intelligence platforms.

You will be responsible for designing and managing fully automated big data pipelines that ingest, process, and serve hundreds of billions of breached and leaked records sourced from the surface, deep, and dark web. You'll work with DSE Cassandra, Solr, and Spark, helping us move toward a 99% automated pipeline for data ingestion, enrichment, deduplication, and indexing, all built for scale, speed, and reliability.

This position is critical in ensuring our systems are fast, reliable, and resilient as we ingest thousands of unique datasets daily from global threat intelligence sources.

What you will do: 

  • Design, deploy, and maintain high-performance Cassandra clusters using DataStax Enterprise (DSE)
  • Architect and optimize automated data pipelines to ingest, clean, enrich, and store billions of records daily
  • Configure and manage DSE Solr and Spark to support search and distributed processing at scale
  • Automate dataset ingestion workflows from unstructured surface, deep, and dark web sources
  • Handle cluster management, replication strategy, capacity planning, and performance tuning
  • Ensure data integrity, availability, and security across all distributed systems
  • Write and manage ETL processes, scripts, and APIs to support data flow automation
  • Monitor systems for bottlenecks, optimize queries and indexes, and resolve production issues
  • Research and integrate third-party data tools or AI-based enhancements (e.g., smart data parsing, deduplication, ML-based classification)
  • Collaborate with engineering, data science, and product teams to support HEROIC’s AI-powered cybersecurity platform




Requirements
  • Minimum 5 years of experience with Cassandra / DataStax Enterprise in production environments
  • Hands-on experience with DSE Cassandra, Solr, Apache Spark, CQL, and data modeling at scale
  • Strong understanding of NoSQL architecture, sharding, replication, and high availability
  • Advanced knowledge of Linux/Unix, shell scripting, and automation tools (e.g., Ansible, Terraform)
  • Proficient in at least one programming language: Python, Java, or Scala
  • Experience building large-scale automated data ingestion systems or ETL workflows
  • Solid grasp of AI-enhanced data processing, including smart cleaning, deduplication, and classification
  • Excellent written and spoken English communication skills
  • Prior experience with cybersecurity or dark web data (preferred but not required)



Benefits
  • Position Type: Full-time
  • Location: Pune, India (Remote – Work from anywhere)
  • Compensation: Competitive salary depending on experience
  • Benefits: Paid Time Off + Public Holidays
  • Professional Growth: Amazing upward mobility in a rapidly expanding company.
  • Innovative Culture: Fast-paced, innovative, and mission-driven. Be part of a team that leverages AI and cutting-edge technologies. 

     

About Us: HEROIC Cybersecurity (HEROIC) is building the future of cybersecurity. Unlike traditional cybersecurity solutions, HEROIC takes a predictive and proactive approach to intelligently secure our users before an attack or threat occurs. Our work environment is fast-paced, challenging, and exciting. At HEROIC, you’ll work with a team of passionate, engaged individuals dedicated to intelligently securing the technology of people all over the world.

Position Keywords: DataStax Enterprise (DSE), Apache Cassandra, Apache Spark, Apache Solr, AWS, Jira, NoSQL, CQL (Cassandra Query Language), Data Modeling, Data Replication, ETL Pipelines, Data Deduplication, Data Lake, Linux/Unix Administration, Bash, Docker, Kubernetes, CI/CD, Python, Java, Distributed Systems, Cluster Management, Performance Tuning, High Availability, Disaster Recovery, AI-based Automation, Artificial Intelligence, Big Data, Dark Web Data







  • Pune, India POWER IT SERVICES Full time

    Cassandra Database Administrator. Experience: 7 years. Technical Skills: • Strong experience with Apache Cassandra (Open Source, DataStax, ScyllaDB). • Deep understanding of Cassandra architecture, partitions, replication, and consistency levels. • Proficiency in CQL (Cassandra Query Language) and data modelling. • Hands-on experience with Linux...


  • Pune, India Trigent Software Private Limited Full time

    Role Purpose: Cassandra and Postgres Database Administrator (DBA). Job Summary: Cassandra Database Administration: Design, configure, and maintain Cassandra clusters deployed in HSBC. Install, configure, and manage multi-node Cassandra clusters both on-premise and cloud, both public and internal. Perform database backups, recovery, and restoration...


  • Pune, India Devlats Pvt Ltd Full time

    Sr. Big Data Engineer. Location: Pune. Experience: 8+ years overall; 6+ years relevant. Mode: Hybrid. Role Overview: We are seeking a talented Sr. Big Data Engineer to design, develop, and support a highly scalable, distributed SaaS-based Security Risk Prioritization product. You will lead the design and evolution of our data platform and pipelines, providing...

  • Big Data Engineer

    2 weeks ago


    Pune, Maharashtra, India Talent Sketchers Full time ₹ 15,00,000 - ₹ 25,00,000 per year

    Designation: Big Data Engineer. Experience: 4+ Years. Work Mode: Remote Opportunity. Notice Period: Immediate Joiners / Serving Notice Period. Job Description: Role: Big Data Engineer. This Data Engineer will be engaged in data science-related research and software application development and engineering duties related to our enterprise-grade Wi-Fi technology to provide...

  • Big Data Engineer

    2 weeks ago


    Pune, India RiskInsight Consulting Pvt Ltd Full time

    Responsibilities: Design, develop, and implement robust Big Data solutions using technologies such as Hadoop, Spark, and NoSQL databases. Build and maintain scalable data pipelines for effective data ingestion, transformation, and analysis. Collaborate with data scientists, analysts, and cross-functional teams to understand business requirements and...


  • Pune, Maharashtra, India Equifax Full time ₹ 80,00,000 - ₹ 2,00,00,000 per year

    The Database Engineer will be actively involved in the evaluation, review, and management of databases. You will be part of a team that supports a range of applications and databases. You should be well versed in database administration, which includes installation, performance tuning, and troubleshooting. A strong candidate will be able to rapidly troubleshoot...

  • Sr Data Engineer

    4 weeks ago


    Pune, India Ankercloud Full time

    Job Description: As a Senior Data Engineer you will play a crucial role in the development, maintenance, and optimization of our data infrastructure. You will work closely with cross-functional teams, including data scientists, analysts, and software engineers, to ensure that our data pipelines and systems are reliable, scalable, and efficient. Key...


  • Pune, India LIFTU TECHNOLOGY PRIVATE LIMITED Full time

    Job Summary: We are seeking an experienced Big Data Architect to lead the design and implementation of scalable, secure, and high-performing big data solutions. The ideal candidate will possess a deep understanding of big data technologies, data modeling, and cloud-based data services. You will collaborate with cross-functional teams including data...


  • Pune, India Devlats Pvt Ltd Full time

    About the Role: We are seeking a highly skilled and experienced Big Data Engineer to join our team. The ideal candidate will have strong expertise in designing, building, and optimizing data pipelines and architectures, as well as a solid understanding of big data tools and... Responsibilities: Design, develop, and maintain scalable and high-performance...