Staff Site Reliability Engineer
1 day ago
We’re looking for an experienced Site Reliability Engineer to fill the mission-critical role of ensuring that our complex, web-scale systems are healthy, monitored, automated, and designed to scale. You will use your background as an operations generalist to work closely with our development teams from the early stages of design all the way through identifying and resolving production issues. The ideal candidate will be passionate about an operations role that involves deep knowledge of both the application and the product, and will also believe that automation is a key component of operating large-scale systems.

6-Month Accomplishments
● Familiarize yourself with the Poshmark tech stack and functional requirements.
● Get comfortable with the automation tools and frameworks used within the CloudOps organization and with the associated deployment processes.
● Gain in-depth knowledge of the relevant product functionality and the infrastructure it requires.
● Start contributing by working on small to medium-scale projects.
● Understand and follow the on-call rotation as a secondary to become familiar with the on-call process.

12+ Month Accomplishments
● Execute projects independently with little guidance from the lead.
● Create meaningful alerts and dashboards for the various sub-systems involved in the targeted infrastructure.
● Identify gaps in the infrastructure and suggest or implement improvements.
● Get involved in the on-call rotation.

Responsibilities
● Serve as a primary point of responsibility for the overall health, performance, and capacity of one or more of our Internet-facing services.
● Gain deep knowledge of our complex applications.
● Assist in the roll-out and deployment of new product features and installations to facilitate our rapid iteration and constant growth.
● Develop tools to improve our ability to rapidly deploy and effectively monitor custom applications in a large-scale UNIX environment.
● Work closely with development teams to ensure that platforms are designed with "operability" in mind.
● Function well in a fast-paced, rapidly-changing environment.
● Participate in a 12x7 on-call rotation.

Desired Skills
● 4+ years of experience in a Systems Engineering/Site Reliability Operations role is required, ideally in a startup or fast-growing company.
● 4+ years in a UNIX-based large-scale web operations role.
● 4+ years of experience providing 12/7 support for large-scale production environments.
● Battle-proven, real-life experience in running a large-scale production operation.
● Experience working on cloud-based infrastructure, e.g., AWS, GCP, Azure.
● Hands-on experience with continuous integration tools such as Jenkins, configuration management with Ansible, and systems monitoring and alerting with tools such as Nagios, New Relic, and Graphite.
● Experience scripting/coding.
● Ability to use a wide variety of open source technologies and tools.

Technologies we use:
● Ruby, JavaScript, Node.js, Tomcat, Nginx, HAProxy
● MongoDB, RabbitMQ, Redis, Elasticsearch
● Amazon Web Services (EC2, RDS, CloudFront, S3, etc.)
● Terraform, Packer, Jenkins, Datadog, Kubernetes, Docker, Ansible, and other DevOps tools

Please note that Poshmark will not be able to sponsor work-related visas for this position.
-
Site Reliability Engineer
1 day ago
Tamil Nadu, India Datum Technologies Group Full time
Job Title: Site Reliability Engineer (SRE) – Azure & AI
Experience: 7+ years
Work Mode: Hybrid
Work Location: Chennai/Mumbai/Gurgaon
Job Summary: We are looking for an experienced Site Reliability Engineer (SRE) with strong expertise in Microsoft Azure, AI infrastructure, and automation. The ideal candidate will have a solid background in managing cloud...
-
AWS Site Reliability Engineer
1 week ago
Tamil Nadu, India HTC Global Services Full time
HTC – A brief profile
Established in 1990, HTC Inc., a company with headquarters in Troy, Michigan, is a leading global Information Technology solution and BPO provider. HTC assists clients across multiple industry verticals, offering turnkey project lifecycle in e-business, data warehousing, embedded systems, ECM, SCM, CRM, and ERP solutions. HTC Inc....
-
Site Reliability Engineer Trainer
1 week ago
Medavakkam, Chennai, Tamil Nadu, India Intellion Technologies Pvt Ltd Full time ₹ 2,40,000 - ₹ 18,00,000 per year
Job Title: Site Reliability Engineer Trainer (Part-Time / Freelance)
Job Description: We are looking for an experienced Site Reliability Engineer (SRE) Trainer for a part-time freelance role. The trainer will be responsible for delivering practical and interactive sessions to learners, covering key concepts and hands-on aspects of Site Reliability...
-
MLOps Site Reliability Engineer
2 days ago
Tamil Nadu, India KLA Full time
Company Overview
KLA is a global leader in diversified electronics for the semiconductor manufacturing ecosystem. Virtually every electronic device in the world is produced using our technologies. No laptop, smartphone, wearable device, voice-controlled gadget, flexible screen, VR device or smart car would have made it into your hands without us. KLA invents...
-
Site Reliability Engineer
1 week ago
Chennai, Tamil Nadu, India Insent Full time ₹ 6,00,000 - ₹ 18,00,000 per year
We are looking to hire a site reliability engineer for our super-fast-growing team. As a site reliability engineer, you will be responsible for deploying, supporting, monitoring, and troubleshooting a large-scale, microservice-based system, and for documenting the IT infrastructure, policies, and procedures. **About Insent** Insent is a super-fast-growing, enterprise...
-
Site Reliability Engineer
1 week ago
Chennai, Tamil Nadu, India Quvia Full time ₹ 9,00,000 - ₹ 12,00,000 per year
About the role: We are seeking a highly skilled Site Reliability Engineer (SRE) to join our dynamic team. In this role, you will be responsible for ensuring the reliability, scalability, and performance of our satellite communication systems, which leverage AI and ML for automation and optimization. You will play a key role in maintaining the infrastructure,...
-
Senior Site Reliability Engineer
2 days ago
Chennai, Tamil Nadu, India Miratech Full time
Company Description: Miratech helps visionaries change the world. We are a global IT services and consulting company that brings together enterprise and start-up innovation. Today, we support digital transformation for some of the world's largest enterprises. By partnering with both large and small players, we stay at the leading edge of technology, remain nimble...
-
Mainframe SRE
3 weeks ago
Chennai, Tamil Nadu, India Kyndryl Full time
Who We Are: At Kyndryl, we design, build, manage, and modernize the mission-critical technology systems that the world depends on every day. So why work at Kyndryl? We are always moving forward, always pushing ourselves to go further in our efforts to build a more equitable, inclusive world for our employees, our customers, and our communities. The Role: Join us as a...
-
MLOps Site Reliability Engineer
3 days ago
IND-Tamil Nadu-Chennai-KLA, India KLA Full time ₹ 12,00,000 - ₹ 36,00,000 per year
Company Overview
KLA is a global leader in diversified electronics for the semiconductor manufacturing ecosystem. Virtually every electronic device in the world is produced using our technologies. No laptop, smartphone, wearable device, voice-controlled gadget, flexible screen, VR device or smart car would have made it into your hands without us. KLA invents...
-
Senior Reliability Engineer
2 days ago
Tamil Nadu, India Wing Full time
About Wing: Wing offers drone delivery as a safe, fast, and sustainable solution for last-mile logistics. Consumer appetites for on-demand services are increasing, but current delivery methods are inefficient, costly, and contribute to road accidents and air pollution. Wing's fleet of highly automated delivery drones can transport small packages directly...