Staff Site Reliability Engineer
7 days ago
We’re looking for an experienced Site Reliability Engineer to fill the mission-critical role of ensuring that our complex, web-scale systems are healthy, monitored, automated, and designed to scale. You will use your background as an operations generalist to work closely with our development teams from the early stages of design all the way through identifying and resolving production issues. The ideal candidate will be passionate about an operations role that involves deep knowledge of both the application and the product, and will also believe that automation is a key component of operating large-scale systems.

6-Month Accomplishments
- Familiarize yourself with the Poshmark tech stack and functional requirements.
- Get comfortable with the automation tools and frameworks used within the CloudOps organization and the associated deployment processes.
- Gain in-depth knowledge of the relevant product functionality and the infrastructure it requires.
- Start contributing by working on small- to medium-scale projects.
- Understand and follow the on-call rotation as a secondary to become familiar with the on-call process.

12+ Month Accomplishments
- Execute projects independently with little guidance from the lead.
- Create meaningful alerts and dashboards for the various subsystems involved in the targeted infrastructure.
- Identify gaps in infrastructure and suggest or implement improvements.
- Get involved in the on-call rotation.

Responsibilities
● Serve as a primary point of responsibility for the overall health, performance, and capacity of one or more of our Internet-facing services.
● Gain deep knowledge of our complex applications.
● Assist in the roll-out and deployment of new product features and installations to facilitate our rapid iteration and constant growth.
● Develop tools to improve our ability to rapidly deploy and effectively monitor custom applications in a large-scale UNIX environment.
● Work closely with development teams to ensure that platforms are designed with "operability" in mind.
● Function well in a fast-paced, rapidly changing environment.
● Participate in a 12x7 on-call rotation.

Desired Skills
● 4+ years of experience in a Systems Engineering/Site Reliability Operations role is required, ideally in a startup or fast-growing company.
● 4+ years in a UNIX-based large-scale web operations role.
● 4+ years of experience providing 12/7 support for large-scale production environments.
● Battle-proven, real-life experience in running a large-scale production operation.
● Experience working on cloud-based infrastructure, e.g., AWS, GCP, Azure.
● Hands-on experience with continuous integration tools such as Jenkins, configuration management with Ansible, and systems monitoring and alerting with tools such as Nagios, New Relic, and Graphite.
● Experience scripting/coding.
● Ability to use a wide variety of open source technologies and tools.

Technologies we use:
● Ruby, JavaScript, NodeJs, Tomcat, Nginx, HaProxy
● MongoDB, RabbitMQ, Redis, ElasticSearch
● Amazon Web Services (EC2, RDS, CloudFront, S3, etc.)
● Terraform, Packer, Jenkins, Datadog, Kubernetes, Docker, Ansible, and other DevOps tools

Please note that Poshmark will not be able to sponsor a work-related visa for this position.
-
Staff Site Reliability Engineer
4 weeks ago
Chennai, India Poshmark Full time
We’re looking for an experienced Site Reliability Engineer to fill the mission-critical role of ensuring that our complex, web-scale systems are healthy, monitored, automated, and designed to scale. You will use your background as an operations generalist to work closely with our development teams from the early stages of design all the way through...
-
Site Reliability Engineer
1 week ago
Chennai, India Datum Technologies Group Full time
Job Title: Site Reliability Engineer (SRE) – Azure & AI
Experience: 7+ years
Work Mode: Hybrid
Work Location: Chennai/Mumbai/Gurgaon
Job Summary: We are looking for an experienced Site Reliability Engineer (SRE) with strong expertise in Microsoft Azure, AI infrastructure, and automation. The ideal candidate will have a solid background in managing cloud...
-
Site Reliability Engineer
2 weeks ago
Chennai, India Zyoin Group Full time
Description: MoneyForward is seeking a Site Reliability Engineer (SRE) to lead the reliability, scalability, and performance of our products. This role involves making critical technical decisions, collaborating with development and platform engineering teams, and ensuring that our systems remain resilient and scalable to support stable business...
-
Site Reliability Engineer
2 weeks ago
Chennai, India Tata Consultancy Services Full time
Role: Site Reliability Engineer
Experience: 4 to 7 Years
Locations: Chennai/Pune/Kolkata
-
Site Reliability Engineer
3 weeks ago
Chennai, India Tata Consultancy Services Full time
Role: Site Reliability Engineer
Experience: 4 to 7 Years
Locations: Chennai/Pune/Kolkata