Platform Resilience Architect

12 hours ago


Bangalore, India Rakuten Symphony Full time

Job Title: Platform Resilience Architect
Location: Bangalore, Hybrid

Why should you choose us?

Rakuten Symphony is a Rakuten Group company that provides global B2B services for the mobile telco industry and enables next-generation, cloud-based, international mobile services. Building on the technology Rakuten used to launch Japan’s newest mobile network, we are taking our mobile offering global. To support our ambitions to provide an innovative cloud-native telco platform for our customers, Rakuten Symphony is looking to recruit and develop top talent from around the globe. We are looking for individuals to join our team across all functional areas of our business, from sales to engineering and from support functions to product development. Let’s build the future of mobile telecommunications together.

What Do We Expect From You

This role architects and owns the master test strategy for the entire OSS platform, ensuring its resilience, scalability, and operational readiness in a cloud-native environment. The architect will define the "how" and "what" of platform quality by designing comprehensive test scenarios, chaos engineering experiments, and security test plans. The strategic purpose is to guarantee that the underlying platform for the entire OSS suite is robust, secure, and production-grade by engineering the comprehensive validation strategy required to certify high-value capabilities like automated deployment, High Availability (HA), and Disaster Recovery (DR).

The company may expect you to undertake other tasks outside of this job description. This job description is not exhaustive and may be updated from time to time.

Responsibilities

Resilience & Chaos Engineering Architecture

  • Architect the comprehensive chaos engineering strategy to proactively identify systemic weaknesses and validate the resilience of the OSS platform against infrastructure, network, and application failures.
  • Design and govern the master test strategy for validating the platform's multi-DC High Availability and Disaster Recovery procedures, defining the methodologies to measure and certify Recovery Time Objective (RTO) and Recovery Point Objective (RPO) against business requirements.

Performance & Scalability Validation Strategy

  • Design the end-to-end performance testing strategy to certify the platform's scalability, latency, and throughput against defined Service Level Objectives (SLOs).
  • Architect the validation approach for resource management and capacity planning, ensuring the platform's efficiency and ability to scale cost-effectively under various load conditions.

Security & Compliance Validation Architecture

  • Architect the platform's security validation strategy, defining the continuous test plans for penetration testing, vulnerability scanning, and software supply chain security (e.g., container image scanning).
  • Design the validation framework to ensure the platform's configuration and deployment pipelines adhere to industry compliance standards and internal security best practices.

Platform Lifecycle & Operability Validation

  • Design the validation architecture for the OSS platform's complete lifecycle, including automated "zero-touch" installation, seamless in-service upgrades, and dynamic scaling.
  • Architect the test strategy for platform observability, ensuring that logging, monitoring, and alerting mechanisms are sufficient to guarantee operational readiness and rapid fault isolation.

Technical Governance & Enablement

  • Define and evangelize the technical requirements for the next generation of infrastructure simulators and chaos engineering tools, providing clear specifications to the Tools Engineering team.
  • Act as the primary technical authority on platform quality, providing strategic guidance and quality benchmarks to the Platform Development, SRE, and Product QE teams to influence design for testability.

Qualifications

Experience and Expertise

  • 8+ years in platform-focused engineering roles such as SRE, DevOps, or Cloud Engineering, with at least 3 years in an architect role focused on cloud-native infrastructure.

Analytical and Problem-Solving Skills

  • Ability to think like an adversary (for both security and failure testing) to uncover systemic weaknesses in a distributed platform before they manifest in production.
  • Passion for solving problems and delivering optimal solutions.

Technical Skills

  • Expert-level knowledge of Kubernetes, container runtimes, and service mesh technologies (e.g., Istio).
  • Deep experience with Infrastructure as Code (Terraform, Ansible) and CI/CD practices.
  • Hands-on experience with chaos engineering principles and tools (e.g., LitmusChaos, Chaos Mesh).

Additional Skills

  • Certified Kubernetes Administrator (CKA) or similar certification.
  • Experience with large-scale public cloud deployments (AWS, Azure, GCP).
  • Experience with observability stacks (Prometheus, Grafana, Loki).
  • Understanding of DevSecOps principles and their enforcement across the lifecycle.
  • Prior exposure to working in cross-continental distributed organizations.
  • Working knowledge of the Atlassian suite, Confluence, and modern OKR tracking tools.
  • Good understanding of Agile principles and processes.
  • Experience working in a fluid, start-up-like environment.

Rakuten Shugi Principles

Our worldwide practices describe specific behaviours that make Rakuten unique and united across the world. We expect Rakuten employees to model these 5 Shugi Principles of Success.

  • Always improve, always advance. Only be satisfied with complete success - Kaizen.
  • Be passionately professional. Take an uncompromising approach to your work and be determined to be the best.
  • Hypothesize - Practice - Validate - Shikumika. Use the Rakuten Cycle to succeed in unknown territory.
  • Maximize Customer Satisfaction. The greatest satisfaction for workers in a service industry is to see their customers smile.
  • Speed Speed Speed. Always be conscious of time. Take charge, set clear goals, and engage your team.



  • Bangalore, India Rakuten Symphony Full time

    Job Title: Platform Resilience Architect Location: Bangalore, Hybrid Why should you choose us? Rakuten Symphony is a Rakuten Group company that provides global B2B services for the mobile telco industry and enables next-generation, cloud-based, international mobile services. Building on the technology Rakuten used to launch Japan’s newest mobile network,...


  • Bangalore, India Rakuten Symphony Full time

    Job Title: Platform Resilience Architect Location: Bangalore, Hybrid Why should you choose us? Rakuten Symphony is a Rakuten Group company that provides global B2B services for the mobile telco industry and enables next-generation, cloud-based, international mobile services. Building on the technology Rakuten used to launch Japan’s newest mobile network,...


  • Bangalore, India Rakuten Symphony Full time

    Job Title: Platform Resilience Architect Location: Bangalore, Hybrid Why should you choose us? Rakuten Symphony is a Rakuten Group company that provides global B2B services for the mobile telco industry and enables next-generation, cloud-based, international mobile services. Building on the technology Rakuten used to launch Japan's newest mobile network, we...


  • Bangalore district, India Rakuten Symphony Full time

    Job Title: Platform Resilience Architect Location: Bangalore, Hybrid Why should you choose us? Rakuten Symphony is a Rakuten Group company that provides global B2B services for the mobile telco industry and enables next-generation, cloud-based, international mobile services. Building on the technology Rakuten used to launch Japan’s newest mobile network,...


  • Bangalore, India beBeeArchitect Full time

    Job Title: Platform Resilience Architect At our company, we're building a cutting-edge cloud-based platform for the mobile telco industry. We're looking for a skilled professional to join our team and help shape the future of mobile telecommunications. Key Responsibilities Design and implement comprehensive chaos engineering strategies to identify systemic...

  • Associate Architect

    2 days ago


    Bangalore, India Quantiphi Full time

    Role: Associate Architect - MLOps / LLMOps Experience: 6 to 8 Years Location: Bangalore / Mumbai (Hybrid) Job Summary: Join our dynamic team as a Platform Architect and leverage your expertise in production-scale platforms within the GenAI or ML domain. In this role, you'll be instrumental in designing, developing and maintaining cutting-edge build and...

  • Associate Architect

    5 minutes ago


    Bangalore, India Quantiphi Full time

    Role: Associate Architect - MLOps / LLMOps Experience: 6 to 8 Years Location: Bangalore / Mumbai (Hybrid) Job Summary: Join our dynamic team as a Platform Architect and leverage your expertise in production-scale platforms within the GenAI or ML domain. In this role, you'll be instrumental in designing, developing and maintaining cutting-edge build and test...

  • Associate Architect

    2 weeks ago


    Bangalore district, India Quantiphi Full time

    Role: Associate Architect - MLOps / LLMOps Experience: 6 to 8 Years Location: Bangalore / Mumbai (Hybrid) Job Summary: Join our dynamic team as a Platform Architect and leverage your expertise in production-scale platforms within the GenAI or ML domain. In this role, you'll be instrumental in designing, developing and maintaining cutting-edge build and...

  • Platform Architect

    1 week ago


    Bangalore, India Photon Full time

    Java Platform Architect / Enterprise Architect Bangalore Role Focus This role functions as an Enterprise Architect responsible for the comprehensive architecture and design of the entire system landscape, covering front-end, back-end, database, middleware, and cloud infrastructure. The primary focus is architecting scalable, microservices-based, event-driven...


  • Bangalore, India beBeeClimate Full time

    Job Opportunity: Climate Risk Analyst The role of a Climate Risk Analyst involves conducting thorough climate change studies, vulnerability assessments, and climate risk analysis to support climate-resilient development initiatives. Design and evaluate agricultural investment projects with consideration for climate change impacts. Conduct comprehensive analysis...