AWS SRE

1 week ago


Mumbai Maharashtra, India Blazeclan Technologies Full time

1. Setting up a world-class observability platform for multi-cloud infrastructure services. Reviewing and contributing to the observability setup for the infrastructure of new and existing cloud apps. Analyzing, troubleshooting, and designing vital services, platforms, and infrastructure while always thinking about reliability, scalability, resilience, automation, security, and performance.

2. Continue improving cloud product reliability, availability, maintainability, and cost/benefit, including developing fault-tolerant tools to ensure general robustness of the cloud infra.

3. Responsible for the availability, performance, monitoring, and incident response, among other things, of the platforms and services of the cloud landing zone.

4. Manage capacity across public and private cloud resource pools, including automating the scale-up and scale-down of environments.

5. Ensuring that everything that goes to production complies with a set of general requirements like diagrams, documents, security compliance, dependencies of other services, monitoring and logging plans, backups, and possible high availability setups.

6. Ensuring the efficient functioning of cloud resources and functions in accordance with company security policies and cloud security best practices.

7. Employ exceptional problem-solving skills, with the ability to see and solve issues before they affect business productivity.

8. Support developers in optimising and automating cloud engineering activities, e.g. real-time migration, provisioning, and deployment.

9. Monitoring and acting on hardware degradation, networking problems, high resource usage, or slow responses on the cloud landing zone.

10. Preparing and maintaining runbooks with the procedures needed to get services up and running again quickly in case of any issues.

11. Enable automation for key functions such as CI/CD across SDLC phases, monitoring, alerting, incident response, infra provisioning, and patching.

12. By focusing on system reliability, Site Reliability Engineers reduce operational expenses, lessen and mitigate failure points, and automate monotonous, time- and resource-wasting tasks, saving both effort and money.

13. Failure resolution is preemptive: SREs identify failure causes early and mitigate faults more holistically.

14. Developing and maintaining cloud solutions in accordance with best practices.

15. Perform incident analysis on a regular basis with the intention of preventing incidents and finding long-term solutions for them.

**Requirements**:

1. Experience / working knowledge with configuring, deploying, and operating public cloud services (Azure, AWS, GCP)

2. Knowledge of best practices related to security, performance, high-availability, and disaster recovery.

3. Demonstrate a proven record of handling production issues, planning escalation procedures, conducting post-mortems, impact analysis, risk assessments and other related procedures.

4. In-depth experience with continuous integration and continuous deployment pipelines

5. Blend of both Development and SRE mindset (i.e. software and infrastructure)

6. Experience with APM solutions and multi-cloud monitoring solutions (Zabbix, Grafana, CloudWatch, Cloud Logging, network performance monitoring, etc.)

7. Familiarity with network and security features e.g. cloud network topology, BGP, routing, TCP/IP, DNS, SMTP, HTTPS, Security, Guardrails etc.

8. Skilled in identifying performance bottlenecks, identifying anomalous system behavior, and determining the root cause of incidents.

9. Expertise in infrastructure as code (e.g., CloudFormation, ARM, Terraform, Ansible, Chef, Puppet)

10. High availability engineering experience (region, availability zone, data replication clustering)

11. Awareness of open-source tools and scripting languages (e.g. Python, PowerShell, shell)

12. Deep understanding of software development lifecycles and cloud economics, incl. knowledge of consumption-driven TCO

13. Understanding of network architectures suitable for different cloud topologies with familiarity with user expectations / OLAs for cloud services

14. Good knowledge of security implications of public & private cloud infra design

15. Experience with firewalls such as Palo Alto and Fortinet, WAFs, Cisco routers, and proxy devices

16. Good understanding and knowledge of container platforms, e.g. Docker, Kubernetes, EKS, GKE, Anthos, OpenShift, etc.

17. Product development in a scaled agile environment with awareness of DevOps methodologies and ability/mindset to drive rapid prototyping and piloting of cloud solutions

18. Azure, AWS, and GCP certifications preferred.

19. Basic Database experience, including knowledge of SQL and NoSQL, and related data stores such as Postgres.

20. Procedural and troubleshooting documentation skills

21. Good communication and collaboration skills.

22. Client management skills.

23. Specific Kubernetes experience, such as experience with deploying and managing Kubernetes clus



  • Bengaluru, Chennai, Mumbai, India Hexaware Technologies Full time US$ 1,50,000 - US$ 2,00,000 per year

    Title: DevOps & SRE Architect / DevOps & SRE Presales Lead. Key Responsibilities: He/She will be responsible for working with the DevOps & SRE practice team on different practice-related activities along with working on consulting opportunities. Job Description: A minimum of 15 years of experience in IT (preferably in software development, testing or...


  • Navi Mumbai, Maharashtra, India Spring HR Services Full time

    **Position - DevOps SRE Practitioner** **Experience - 5+ Years** **CTC - Up to 15 LPA** **Location - Navi Mumbai** **Industry - Media, Entertainment, Sports, OTT** **Qualification - B.Sc / M.Sc / M.C.A / B.E: IT or Computer Science** **Skills required**: - Experience in initiating and implementing DevOps practices. - Strong in scripting and markup...


  • Mumbai, Maharashtra, India beBeeReliability Full time ₹ 12,00,000 - ₹ 30,00,000

    Lead SRE at a Startup. This is a challenging role that involves leading and mentoring a team of site reliability engineers to ensure high availability and uptime across distributed systems hosted on AWS. Key Responsibilities: Define and implement SRE best practices, such as SLAs, SLOs, SLIs, error budgets, and chaos engineering. Build and maintain automated CI/CD...

  • SRE

    5 days ago


    Pune, Maharashtra, India Hitachi Solutions Full time

    **Company Description** **About Hitachi Solutions India Pvt Ltd**: Hitachi Solutions, Ltd., headquartered in Tokyo, Japan, is a core member of the Information & Telecommunication Systems Company of the Hitachi Group and a recognized leader in delivering proven business and IT strategies and solutions to companies across many industries. The company provides...

  • SRE

    7 days ago


    Pune, Maharashtra, India Virtusa Full time

    JD: Site Reliability Engineer. Minimum 5 years of work experience as an SRE (not traditional production support) covering integration platforms on cloud-based deployments. Skills like AWS and GCP, including Kubernetes as a service (EKS, Fargate, GKE), Docker, etc. Extensive coding experience in any major programming language, particularly for integration tier and...

  • SRE DevOps II

    1 week ago


    Mumbai, Maharashtra, India Upstox Full time

    Software Development Engineer II (SRE-DevOps Engineer). Mumbai/Bangalore. Technology - Engineering / Full-Time / On-Site. The Upstox Story: Upstox is one of India's leading fintech companies with a mission to simplify trading and investing to make it easily accessible to the masses. We aim to enable everyone, from new investors to seasoned traders, to invest...


  • Pune, Maharashtra, India HNM Solutions Full time

    **Role: AWS services - Mid Level** **Location: Pune, India** **Experience: 8 to 13 yrs** **Description**: **Key Responsibilities** - Ability to create an SRE backlog - broad things that need to be done right to implement SRE at scale - Work with senior stakeholders to agree and drive the backlog - Liaise and manage dependencies across teams which...

  • AWS DevOps Engineer

    1 day ago


    Pune, Maharashtra, India Neosoft Technologies Full time

    JD - DevOps Engineer. Bachelor's degree in Computer Science, Information Technology, or a related field. 5+ years of experience in DevOps or SRE. In-depth knowledge of cloud services and their best practices, AWS preferred. Ability to support and maintain AWS, Azure, or other cloud accounts. Strong knowledge of Amazon services like ASG, Load Balancer, S3, EC2,...

  • SRE DevOps

    4 days ago


    Mumbai, India Zycus Full time

    **Zycus** is looking for **Site Reliability Engineering - Developers**. **JOB DESCRIPTION**: Understanding of tools like Grafana, Ansible, and Prometheus and how they are used. Should have hands-on experience in coding with Python, NodeJS, and ReactJS. Should have architectural practice knowledge on AWS to build scalable solutions. Understanding of AWS serverless...


  • Chennai, Mumbai, Pune, India Hexaware Technologies Full time ₹ 15,00,000 - ₹ 20,00,000 per year

    Manage and maintain AWS infrastructure including EC2, S3, RDS, VPC, and other services. Monitor system performance and optimize configuration to ensure efficient operation. Very strong experience in Unix OS administration and networking. Implement security best practices and manage security groups, IAM policies, and access control. Design, implement, and maintain...