Site Reliability Engineer

3 weeks ago


India Career Stone Consultant Full time

PRINCIPAL ACCOUNTABILITIES:


1. AWS Infrastructure Design:

o Lead the design and implementation of scalable, reliable, and secure AWS infrastructure.

o Provide expertise in architecting solutions that maximize the benefits of AWS services.

o Lead the upgrade of Apache web servers for improved performance and security.

o Oversee the database (DB) upgrade process, ensuring minimal downtime and data integrity.

o Manage the upgrade of application servers to enhance overall system efficiency.


2. Automation and AWS Tooling:

o Develop and maintain automation tools for deployment, monitoring, and operations on AWS.

o Implement and enhance infrastructure as code (IaC) using AWS CloudFormation or similar tools.


3. Service Availability Monitoring and Incident Response:

o Set up and maintain monitoring solutions on AWS to proactively identify and address system issues.

o Respond to and resolve incidents, ensuring minimal downtime and impact on users.

o Get involved during major incidents: use the available monitors to debug, identify the cause, and engage the right team to resolve the issue.

o Prepare a proper root cause analysis (RCA) for each incident and get the right team to work on preventive steps.

o Keep tabs on minor incidents and look for trends to ensure they do not lead to a major incident.


4. AWS Best Practices:

o Enforce AWS best practices for security, performance, and cost optimization.

o Stay current with AWS advancements and integrate relevant technologies into our infrastructure.


5. Collaboration and Communication:

o Work closely with development, operations, and QA teams to foster a DevOps culture.

o Effectively communicate AWS-related insights, recommendations, and project status.

o Facilitate the upgrade of Kafka and other essential tools within the solution engineering framework.

o Engage in change planning with the cloud team for seamless upgrades and troubleshoot any issues that arise.


6. Cloud Security:

o Implement and maintain Akamai edge security and WAF measures for optimal protection.

o Oversee monitoring activities to proactively identify and address security vulnerabilities.

o Collaborate with the solution team to conduct cloud security checks and upgrade planning.

o Work closely with the solution engineering team and the security team to resolve security issues promptly.

o Manage DDoS, WAF, edge firewall, and network security tasks, including continuous monitoring.

o Coordinate corrective actions with the cloud team/AWS to ensure a secure cloud environment.


7. High Traffic Events:

o Evaluate infrastructure needs for high-traffic events, ensuring appropriate sizing and scaling.

o Monitor traffic patterns and collaborate with basic cloud architects to optimize performance.


8. FinOps Cost Management:

o Monitor storage utilization and implement strategies to optimize costs.

o Oversee infrastructure utilization, controlling costs through effective monitoring.

o Monitor CPU, memory, and other resource parameters, optimizing resource consumption.

o Conduct regular checks on data storage to ensure efficient utilization.


SKILLS AND KNOWLEDGE:

Qualifications:

• Bachelor’s degree in Computer Science, Engineering, or a related field.


Work Experience:

• 6-10 years of hands-on experience as a Site Reliability Engineer, with a focus on AWS.

• Hands-on experience with AWS, cloud infrastructure, AWS cloud security, high-traffic events, and FinOps cost management.


Required Skills:

• Proficiency in scripting languages (e.g., Python, Bash) and experience with AWS SDKs.

• In-depth knowledge of AWS services and a proven track record of implementing solutions on AWS.

• Experience with container orchestration tools (e.g., Kubernetes, Docker Swarm) on AWS.

• Understanding of web, middleware, and database technologies such as Apache, WildFly, MySQL, and Kafka.

• Familiarity with cloud security measures and high-traffic event management.

• Knowledge of FinOps principles and cost management in cloud environments.

• Strong problem-solving and troubleshooting skills.

• Excellent communication and collaboration skills.


