Site Reliability Engineer
3 weeks ago
Job Description

Our Company

Changing the world through digital experiences is what Adobe's all about. We give everyone, from emerging artists to global brands, everything they need to design and deliver exceptional digital experiences. We're passionate about empowering people to create beautiful and powerful images, videos, and apps, and transform how companies interact with customers across every screen.

We're on a mission to hire the very best and are committed to creating exceptional employee experiences where everyone is respected and has access to equal opportunity. We realize that new ideas can come from everywhere in the organization, and we know the next big idea could be yours.

Adobe Pass is a leading authentication and authorization platform that enables seamless access to premium TV and video content across devices. It powers TV Everywhere experiences by allowing users to sign in with their pay-TV credentials to watch subscribed content from broadcasters and streaming services. Trusted by major media companies, Adobe Pass ensures secure, scalable, and frictionless user authentication, while providing insights and analytics that help content providers deliver personalized and compliant viewing experiences.

Key Responsibilities:

- System Design & Architecture: Design, build, and maintain scalable, highly available infrastructure and services for the Adobe Pass platform. Collaborate with engineering teams to ensure new products and features are designed with reliability and scalability in mind. Create resilient architectures that prevent downtime and enhance service reliability through redundancy, failover strategies, and automated recovery mechanisms.
- Automation & Tooling: Develop automation frameworks for continuous integration/continuous deployment (CI/CD) pipelines, infrastructure provisioning, and operational tasks. Build tools to monitor system performance, reliability, and capacity, reducing manual interventions and operational overhead. Drive initiatives for end-to-end automation, optimizing for efficiency and reducing human error.
- Monitoring & Incident Management: Implement and maintain robust monitoring systems that detect anomalies and provide real-time alerting on key system metrics (latency, availability, etc.). Lead incident management processes, including troubleshooting, root cause analysis, and post-mortem reviews to prevent future occurrences. Collaborate with support and engineering teams to develop strategies for minimizing incidents and reducing mean time to recovery (MTTR).
- Performance Optimization & Capacity Planning: Analyze system performance and make recommendations for improvement, focusing on latency reduction, increased throughput, and cost efficiency. Conduct capacity planning to ensure the infrastructure can scale efficiently to meet the growing demands of Adobe's advertising platform. Perform load testing and simulate peak traffic scenarios to identify potential bottlenecks.
- Collaboration & Knowledge Sharing: Partner with software engineers, product managers, and other stakeholders to understand business requirements and ensure system reliability meets the platform's needs. Document best practices, system designs, and incident response procedures, promoting knowledge sharing within the team. Mentor and provide technical leadership to junior engineers, fostering a culture of continuous learning and improvement.

Qualifications:

- Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
- 7+ years of experience in site reliability engineering, infrastructure engineering, or a similar role.
- Proven experience in managing large-scale distributed systems, preferably in cloud environments such as AWS, Azure, or GCP.
- Strong programming and scripting skills (e.g., Python, Go, Bash) with a focus on automation.
- Deep understanding of containerization and orchestration technologies (Docker, Kubernetes, etc.).
- Expertise in monitoring tools (Prometheus, Grafana, Datadog) and incident management practices.
- Experience with CI/CD pipelines, infrastructure as code (Terraform, CloudFormation), and version control (Git).
- Solid knowledge of networking, storage, and database systems, both relational and NoSQL.
- Excellent problem-solving, troubleshooting, and analytical skills.

Preferred Qualifications:

- Experience working with advertising platforms or related digital marketing technologies.
- Familiarity with big data systems and analytics platforms (Hadoop, Kafka, Spark).
- Cloud certifications (AWS, Azure, GCP).
- Experience with security best practices and compliance standards (ISO, SOC 2, GDPR).

Adobe is proud to be an equal opportunity employer. We do not discriminate based on gender, race or color, ethnicity or national origin, age, disability, religion, sexual orientation, gender identity or expression, veteran status, or any other applicable characteristics protected by law.

Adobe aims to make Adobe.com accessible to any and all users. If you have a disability or special need that requires accommodation to navigate our website or complete the application process, email or call (408) 536-3015.
-
Site Reliability Engineer
3 weeks ago
India Pagos Consultants Full time
We are looking for experienced site reliability engineers to join a founding team of startup-minded individuals that will lay the groundwork for our new fintech offering. This team will play a pivotal role in spearheading innovation. As such, you will have the opportunity to shape the early architecture and design of the system and set the trajectory for its...
-
Site Reliability Engineer
4 weeks ago
India Datum Technologies Group Full time
Job Title: Site Reliability Engineer (SRE) – AWS
Experience: 8+ years
Location: Chennai / Mumbai
Work Mode: Hybrid
Key Skills: AWS, Terraform, Kubernetes, Docker, Grafana, Prometheus, Datadog
Job Summary: We are looking for a skilled Site Reliability Engineer (SRE) with strong AWS experience and a solid background in DevOps, automation, observability, and...
-
Site Reliability Engineer
2 weeks ago
India Insight Global Full time
Company: Insight Global
Duration: Approved for 1 year
📍 Location: Remote (India)
💼 Type: Contract with Insight Global Client
💰 Compensation: 14 LPA – 20 LPA
🕒 Working Hours: Normal IST hours
🚀 Start Date: Immediate (No notice period)
About the Role
Join our Site Reliability Engineering (SRE) team as a Fullstack Developer, focused on building...
-
Site Reliability Engineer
2 weeks ago
Bengaluru, Karnataka, India Qure ai Technologies Full time ₹ 9,00,000 - ₹ 12,00,000 per year
About Qure.AI: Qure.AI is an equal opportunity employer. Qure.AI is a leading Healthcare Artificial Intelligence (AI) company disrupting the 'status quo' by enhancing diagnostic imaging and improving health outcomes with the assistance of machine-supported tools. Qure.AI taps deep learning technology to provide an automated interpretation of radiology exams like X-rays,...
-
Site Reliability Engineer
2 weeks ago
India Insight Global Full time
Company: Insight Global
Duration: Approved for 1 year
📍 Location: Remote (India)
💼 Type: Contract with Insight Global Client
💰 Compensation: 14 LPA – 20 LPA
🕒 Working Hours: Normal IST hours
🚀 Start Date: Immediate (No notice period)
About the Role
Join our Site Reliability Engineering (SRE) team as a Fullstack Developer, focused on building and...
-
Site Reliability Engineer
6 days ago
India Pythian Full time
Remote Site Reliability Engineer, India (multiple time zones, work from home)
Why Pythian
At Pythian, we are experts in strategic database and analytics services, driving digital transformation and operational excellence. Pythian, a multinational company, was founded in 1997 and...
-
Site Reliability Engineer
2 weeks ago
India Hydrolix Full time
About the job
At Hydrolix, we are revolutionizing the world of data management and analytics with our innovative cloud data platform, purpose-built for petabyte-scale datasets. Our mission is to help organizations drastically reduce data costs while increasing their data retention. We are looking for a Site Reliability Engineer (SRE) with 8 to 10+ years...
-
Site Reliability Engineering
1 week ago
Bengaluru, Karnataka, India Thakral One Full time US$ 60,000 - US$ 1,20,000 per year
Company Description
Thakral One, headquartered in Singapore, is a technology consulting and services company with a strong presence across Asia. The company specializes in technology-driven consulting, custom solution development, data analytics, and leveraging cloud capabilities to deliver enhanced decision support and practical outcomes. Collaborating...