Site Reliability Engineer II
4 months ago
Are you ready to make your mark with a true industry disruptor? ZineOne, a subsidiary of Session AI, the pioneer of in-session marketing, is looking to add talented team members to help us grow into the premier revenue tool for e-commerce. We work with some of the leading brands nationwide and we innovate how brands connect with and convert customers.
Job Description
This position offers a hands-on, technical opportunity as a vital member of the Site Reliability Engineering Group. Our SRE team is dedicated to ensuring that our Cloud platform operates seamlessly, efficiently, and reliably at scale. The ideal candidate will bring over five years of experience managing cloud-based Big Data solutions, with a strong commitment to resolving operational challenges through automation and sophisticated software tools.
Candidates must uphold a high standard of excellence and possess robust communication skills, both written and verbal. A strong customer focus and deep technical expertise in areas such as Linux, automation, application performance, databases, load balancers, networks, and storage systems are essential.
Key Responsibilities
As a Session AI SRE, you will:
- Design and implement solutions that enhance the availability, performance, and stability of our systems, services, and products.
- Develop, automate, and maintain infrastructure as code for provisioning environments in AWS, Azure, and GCP.
- Deploy modern automated solutions that enable automatic scaling of the core platform and features in the cloud.
- Apply cybersecurity best practices to safeguard our production infrastructure.
- Collaborate on DevOps automation, continuous integration, test automation, and continuous delivery for the Session AI platform and its new features.
- Manage data engineering tasks to ensure accurate and efficient data integration into our platform and outbound systems.
- Utilize expertise in DevOps best practices, shell scripting, Python, Java, and other programming languages, while continually exploring new technologies for automation solutions.
- Design and implement monitoring tools for service health, including fault detection, alerting, and recovery systems.
- Oversee business continuity and disaster recovery operations.
- Create and maintain operational documentation, focusing on reducing operational costs and enhancing procedures.
- Demonstrate a continuous learning attitude with a commitment to exploring emerging technologies.
Qualifications
- Experience with cloud platforms like AWS, Azure, and GCP, including their management consoles and CLI.
- Proficiency in building and maintaining infrastructure on:
  - AWS, using services such as EC2, S3, ELB, VPC, CloudFront, Glue, and Athena.
  - Azure, using services such as Azure VMs, Blob Storage, Azure Functions, Virtual Networks, Azure Active Directory, and Azure SQL Database.
  - GCP, using services such as Compute Engine, Cloud Storage, Cloud Functions, VPC, Cloud IAM, and BigQuery.
- Expertise in Linux system administration and performance tuning.
- Strong programming skills in Python, Bash, and Node.js.
- In-depth knowledge of container technologies like Docker and Kubernetes.
- Experience with real-time big data platforms and components such as HDFS/HBase, ZooKeeper, and Kafka.
- Familiarity with central logging systems such as ELK (Elasticsearch, Logstash, Kibana).
- Competence in implementing monitoring solutions using tools like Grafana, Telegraf, and InfluxDB.
Benefits
- Competitive salary package and stock options
- Opportunity for continuous learning
- Fully sponsored EAP services
- Excellent work culture
- Opportunity to be an integral part of our growth story and grow with our company
- Health insurance for employees and dependents
- Flexible work hours
- Remote-friendly company
-
Site Reliability Engineer
4 months ago
Mumbai, India dentsu Full time
The purpose of this role is to ensure the availability and stability of production and test platforms. Job Title: Site Reliability Engineer. Job Description: Key responsibilities: Troubleshoots and owns issues in our development, test and production environments, including performance optimisation and continuous tuning. Works alongside the DevOps team in...
-
Construction Site Lead II
3 months ago
Mumbai, Maharashtra, India NES Fircroft Remote Work Freelance Full time
We have an opportunity with one of our reputed clients in India for the position of "Construction Site Lead II". Position: Construction Site Lead II. Location: Mumbai. Experience: 5+ years. Duration: 1 year project. Job Description: - Review of relevant Mechanical deliverables - Stewarding mechanical work as per plan with EPC contractor - On ground...
-
Site Care Partner II
3 months ago
Mumbai, Maharashtra, India Parexel Full time
Job Purpose: The Site Care Partner II (SCP II) is the “face of the client” and therefore accountable for ensuring that sites receive necessary support and engagement, issues are resolved, and the client’s reputation is upheld throughout the study lifecycle. The SCP II is the main client point of contact for investigative sites; accountable for site start...
-
Site Reliability Engineering Manager
2 months ago
Mumbai, India Talent Socio Full time
Job Description: - Lead and mentor a team of Site Reliability Engineers (SREs) responsible for ensuring the reliability, availability, and performance of critical systems. - Establish and enforce engineering practices focused on automation, monitoring, and process improvement to enhance system reliability and operational efficiency. - Conduct thorough and...
-
Site Reliability Engineer
4 months ago
Mumbai, India IMC Full time
As a Site Reliability Engineer at IMC, you'll be an integral member of a highly experienced team, responsible for maintaining a robust, best-in-class, low-latency trading environment. The skills necessary to excel could range from system administration, network troubleshooting, database optimization, software development, release management and...
-
Senior Site Reliability Engineer
2 months ago
Mumbai, India CimpressVista Full time
Senior Site Reliability Engineer. You have successfully completed a degree in computer science or comparable training (e.g. as an IT specialist) or have gained several years of relevant professional experience in the DevOps environment. Experience working with: Agile methods and cloud technologies/architecture in AWS. Database administration to a small extent...
-
Senior Site Reliability Engineer I
3 days ago
Mumbai, India RELX India (Pvt) Ltd Risk div Company Full time
About the role: We are seeking a highly motivated and experienced Site Reliability Engineer (SRE) to manage and optimize our AWS cloud resources. The ideal candidate will have a strong background in AWS, Terraform, Kubernetes, and scripting, with proficiency in monitoring and CI/CD tools. Experience with HashiCorp Vault is a plus. Responsibilities: ...
-
Site Reliability Engineer
1 day ago
Mumbai, India Jio Full time
Site Reliability Engineer (SRE) with Automation. Job Overview: As a Site Reliability (SRE)/DevOps Automation Engineer, you will be responsible for the availability, automation, performance, efficiency, scaling, monitoring and emergency response for any incidents/issues in applications. You will use your deep understanding of platforms, architecture, people,...
-
Site Reliability Engineer
2 days ago
Mumbai, India Jio Full time
Site Reliability Engineer (SRE) with Automation. Job Overview: As a Site Reliability (SRE)/DevOps Automation Engineer, you will be responsible for the availability, automation, performance, efficiency, scaling, monitoring and emergency response for any incidents/issues in applications. You will use your deep understanding of platforms, architecture,...
-
Senior Site Reliability Engineer SRE
2 months ago
Mumbai, India Ztek Consulting INC Full time
Job Title: Senior Site Reliability Engineer (SRE). Duration: 612 months. Location: Hybrid Fort Worth TX. Work Type: Rate: Pay range offered to a successful candidate will be based on several factors including the candidate's education, work experience, work location, specific job duties, certifications, etc. Job Summary: A Site Reliability Engineer is responsible...
-
Site Reliability Engineer
3 days ago
Mumbai, India Antal International Full time
Job Description: A major player in the tech industry, which specializes in retail technology, AI, ML, and big data, is seeking new talent. Established by alumni from a top engineering institute, this organization manages a vast network of brands and stores. Headquartered in Mumbai, it is recognized for its innovation and expertise across multiple tech...
-
Site Reliability Engineer
1 week ago
Mumbai, India Cyber Sphere LLC Full time
Site Reliability Engineer (SRE) to join our team. Qualifications: - 4+ years of software engineering experience - BS Engineering/Computer Science or equivalent experience required. Responsibilities: - Design, deploy, and maintain a highly available and scalable data infrastructure on Azure OpenAI, databases and event-driven services - Monitor and optimize the...
-
Senior Site Reliability Engineering Manager
2 months ago
Mumbai, India IDFC FIRST Bank Full time
Role/Job Title: Senior Site Reliability Engineering Manager. Function/Department: Information Technology. Job Purpose: The Site Reliability Engineering (SRE) department plays a pivotal role in providing a seamless experience for our customers. With state-of-the-art technology and tools, we are transforming the overall application development and...
-
Senior Site Reliability Engineering Manager
3 days ago
Mumbai, India IDFC FIRST Bank Full time
Role/Job Title: Senior Site Reliability Engineering Manager. Function/Department: Information Technology. Job Purpose: The Site Reliability Engineering (SRE) department plays a pivotal role in providing a seamless experience for our customers. With state-of-the-art technology and tools, we are transforming the overall application development and...
-
Site Reliability Engineer
1 month ago
Mumbai, India Cyber Sphere LLC Full time
Salary: 40 LPA - 60 LPA. We are seeking a talented and experienced Site Reliability Engineer (SRE) to join our team. As an SRE, you will play a crucial role in ensuring the reliability, scalability, and performance of our Azure AI Services platform. You will work closely with cross-functional teams to design, implement, and maintain robust infrastructure and...
-
Condition Monitoring Engineer CAT II
3 weeks ago
Mumbai, India IRD Mechanalysis Limited Full time
We urgently need a B Tech graduate with Vibration CAT II certification. Responsibilities: Offer vibration analysis service to our client in Mumbai. You will be posted on site and will report to the client every morning. Qualifications: B Tech plus Vibration Analyst CAT II
-
Senior Site Reliability Engineer I
3 days ago
Mumbai, India RELX India (Pvt) Ltd Risk div Company Full time
Job Description for Senior Site Reliability Engineer (SRE). Position Overview: We are seeking a dynamic Site Reliability Engineer (SRE) with 7-9 years of experience in system administration who has a deep proficiency in automation. The ideal candidate will be instrumental in monitoring and incident response and will possess comprehensive knowledge...
-
Senior DevOps Engineer
3 days ago
Navi Mumbai, India Capabiliq IT Services (OPC) Private Limited Full time
Responsibilities: - Define processes for the DevOps program and align to best practice standards - Support product delivery teams integrating into existing pipelines and platforms. - Plan for and manage operational resilience for network and application while minimizing the effect on the business - Develop and extend DevOps tooling and automation efforts...