
Cloud Site Reliability, Staff
6 days ago
The Synopsys IT cloud team is responsible for providing a best-in-class EDA infrastructure and design environment in the public cloud, optimized to meet the scale and complexity of EDA workloads.
As we expand our cloud deployments, we are looking for a talented Site Reliability Engineer with experience in EDA/HPC environments to deliver insights from massive-scale data in real time. Specifically, we are searching for someone who brings fresh ideas, demonstrates a unique and informed viewpoint, and enjoys collaborating with a cross-functional team to develop real-world solutions and positive user experiences at every interaction.
**Responsibilities**
- Measure and optimize system performance, with an eye toward pushing our capabilities forward, getting ahead of customer needs, and innovating to continually improve.
- Implement, maintain, and consult on the observability stack that supports the needs of multiple internal stakeholders.
- Utilize your deep experience and problem-solving skills to help prevent and investigate production issues.
- Participate in the design and implementation of new system layers for high-complexity compute environments.
**Desired Skills**:
- A degree in Computer Science or a related field, with a minimum of 5 years of experience in SRE roles.
- Knowledge of Cloud engineering / architecture with Azure, AWS or GCP.
- Familiarity with containerization technologies such as Docker, Swarm and Kubernetes.
- Knowledge of Infrastructure as Code (IaC), configuration management, and systems automation tools at scale (e.g. Terraform, Ansible).
- Deep knowledge of Linux OS, Networking and NFS technologies.
- Experience with data stores and search engines such as Elasticsearch is a must. Experience with Prometheus, Grafana, and similar technologies is a plus.
- Experience with CI/CD: GitOps / GitHub Actions, ArgoCD, Flux.
- Solid Python programming skills and experience.
- Experience with SLURM, Linux, networking, and NFS is required.
- Excellent problem-solving skills and attention to detail.
- Ability to work collaboratively with other teams and stakeholders.
- Ability to work in a fast-paced and dynamic environment.
- Experience implementing and delivering monitoring solutions in development, QA, and Production environments.
- Domain Knowledge of the underlying infrastructure requirements such as Networking, Storage, and Hardware Optimization.
- Proven experience in High-Performance Computing environments for HPC/EDA workload.
- Extremely strong problem-solving and troubleshooting skills, with knowledge of and experience in high availability (HA) and scalability.
**Personal attributes**:
- A team player with strong collaboration skills. Proven communication skills, both verbal and written.
- Passion for continuous learning and knowledge sharing.
- Ability to drive continuous improvement and propose innovative solutions.