▷ Immediate Start OpenStack Engineer

4 days ago


Yelahanka, India WhiteLotus Talent Partners Full time

We are looking for an L0 and L1 Site Reliability Engineer (SRE) Support engineer to join our Krutrim Cloud Site Reliability operations team and ensure the smooth functioning of our cloud infrastructure powered by OpenStack and Kubernetes. In this role, you will focus on monitoring, basic troubleshooting, and incident response, helping to maintain high system availability, reliability, and performance. You will identify and address simple issues and escalate more complex problems to senior SREs when needed. The ideal candidate has a basic understanding of cloud infrastructure (especially OpenStack and Kubernetes), containerized environments, and system monitoring. This position offers an excellent opportunity for someone looking to grow into a more advanced SRE or DevOps role.

Key Responsibilities:

For L0 Support (Level 0):

- Incident Monitoring & Triage:
  - Respond to system alerts and monitor infrastructure health using tools like Prometheus, Grafana, and Observability for both OpenStack and Kubernetes.
  - Identify low-level issues and follow runbooks or predefined scripts to perform first-level triage.
  - Document and escalate unresolved incidents to L1 or L2 based on established escalation protocols.
- System Health Checks:
  - Perform daily health checks for Kubernetes pods, nodes, and OpenStack instances.
  - Verify basic functionality of VMs, containers, and network services within the environment.
- Basic Troubleshooting:
  - Resolve simple issues such as VM reboots, pod failures, and network connectivity issues within OpenStack or Kubernetes environments.
  - Follow the predefined steps for basic troubleshooting tasks like restarting services or clearing logs.
- Ticket Management:
  - Log incidents and issues into a ticketing system (e.g., JIRA, ServiceNow) for tracking and escalation.
  - Update incident tickets and provide relevant information for ongoing resolution efforts.

For L1 Support (Level 1):

- Incident Resolution:
  - Investigate and resolve issues more complex than those handled at L0, such as Kubernetes pod crashes, network misconfigurations in OpenStack, and minor service disruptions.
  - Work with tools like kubectl to troubleshoot Kubernetes pods and nodes, and the OpenStack CLI to diagnose problems with VMs, storage, and networks.
- Automation & Scripting:
  - Automate routine tasks, such as VM provisioning, pod deployments, or status checks, using basic scripting languages (Python, Bash).
  - Improve automation workflows based on feedback and frequently encountered issues.
- Log Aggregation & Monitoring:
  - Review logs and metrics collected from ELK Stack, Prometheus, Grafana, or other logging tools to detect trends and potential issues.
  - Analyze logs and metrics from OpenStack and Kubernetes clusters to pinpoint underlying problems (e.g., high CPU usage, memory leaks).
- Basic Network & Storage Management:
  - Investigate networking issues related to Neutron (for OpenStack) and CNI configurations (for Kubernetes).
  - Manage storage resources within OpenStack and Kubernetes (e.g., creating persistent volumes, debugging storage access issues).
- Collaboration & Escalation:
  - Work closely with L2 and L3 engineers on complex troubleshooting or advanced system issues that require in-depth knowledge.
  - Share knowledge with the team and assist in creating new documentation or updating existing troubleshooting guides.
- User and Permissions Management:
  - Perform basic user management tasks within OpenStack (e.g., creating and managing tenants and security groups).
  - Review and modify Kubernetes RBAC (Role-Based Access Control) settings based on user access needs.

Skills & Qualifications:

Required Skills:

- Basic Cloud & Kubernetes Knowledge:
  - Familiarity with OpenStack architecture (e.g., Nova, Neutron, Cinder).
  - Basic understanding of Kubernetes components, including pods, services, deployments, and namespaces.
- Systems & Networking:
  - Knowledge of Linux/Unix-based operating systems (e.g., Ubuntu, CentOS, Red Hat).
  - Understanding of networking concepts like DNS, IP routing, and VLANs in cloud environments.
- Monitoring & Alerting Tools:
  - Familiarity with monitoring tools like Prometheus, Grafana, Zabbix, or CloudWatch for alert management and system health monitoring.
- Troubleshooting & Incident Response:
  - Experience in using log aggregation tools (ELK Stack, Splunk) and interpreting logs for incident detection.
  - Ability to perform basic troubleshooting steps (e.g., restarting services, running basic shell commands) to resolve issues.
- Communication Skills:
  - Strong communication skills to collaborate effectively with senior SREs, developers, and other teams.
  - Ability to document incidents, solutions, and troubleshooting steps clearly.

Preferred Skills:

- Basic Scripting & Automation:
  - Exposure to scripting languages such as Bash, Python, or Go to automate basic administrative tasks.
- Cloud Platform Experience:
  - Familiarity with other cloud technologies such as AWS, Azure, or Google Cloud Platform.
- Certifications:
  - Basic certifications such as CompTIA Linux+, AWS Certified Solutions Architect, Certified Kubernetes Administrator (CKA), or OpenStack COA are a plus.


  • Founding Engineer

    19 hours ago


    Yelahanka, India Eazy AI Full time

    Immediate joiners preferred. Full-time hybrid role with 3 days in office in Whitefield, Bengaluru. About the Company Eazy AI is a conversational AI platform transforming online shopping across the Middle East and South Asia. Our AI-powered solution replaces traditional product search with intelligent conversations using text, voice, and image. Eazy AI...


  • Yelahanka, India ACL Digital Full time

    Hi All, ACL Digital is looking for "Senior Design Verification Engineers" Exp Level: 4+ years Notice period: Immediate to 30 days Location: Bangalore and Hyderabad Job Description: - Must have good knowledge of verification flows - Excellent hands-on debug skills and problem-solving attitude. - Experience of working in complex test-bench/model in...


  • Yelahanka, India People Prime Worldwide Full time

    About Client: Our client is an AI-first Innovation Engineering Services & Solutions company headquartered in Pittsburgh, whose core purpose is to impact lives by transforming businesses through innovation. With a presence in 23 global locations, it boasts an engineering headcount of more than 5,500 employees. The company engages with its clients through...


  • Yelahanka, India Palo Alto Networks Full time

    Our Mission At Palo Alto Networks® everything starts and ends with our mission: Being the cybersecurity partner of choice, protecting our digital way of life. Our vision is a world where each day is safer and more secure than the one before. We are a company built on the foundation of challenging and disrupting the way things are done, and we’re looking...


  • Yelahanka, India TeqUniq Solutions LLP Full time

    Company Description TeqUniq Solutions LLP is a group of professionals with a strong background in services required for the entire construction life cycle. We have extensive experience in Architecture, Interior, Mechanical, Electrical, and Plumbing (MEP) design, Revit modeling, 3D Rendering, coordination, photo & video documentation, and digital solutions....


  • Yelahanka, India IntraEdge Full time

    Job Title: TST Support Engineer – Engineer II Location: Bangalore Job Type: Full-Time Experience: 4+ Years As a TST Support Engineer, you’ll ensure high availability, stability, and reliability of production systems. You’ll own incident management, automation, and continuous improvement of infrastructure and deployment processes. Looking for SRE...


  • Yelahanka, India Objectways Full time

    Job Title: Private Cloud Security Engineer Location: Bangalore (Hybrid – 3 days in office) Experience Required: 5+ years Role Overview As a Private Cloud Security Engineer, you will play a vital role in safeguarding our on-premise or privately hosted cloud environments. You will be responsible for designing, implementing, and monitoring robust security...


  • Yelahanka, India Evolving Systems Full time

    Title: Apprenticeship Position - Junior Support Engineer Location: Remote, India Stipend: 15k per month Apprentice Period: 12 months Who We Are HeadSpin is a unique developer platform that combines data science insights and global device infrastructure to enable companies to perfect their digital experiences during the engineering cycle. HeadSpin platform...


  • Yelahanka, India Moder Full time

    About Us Moder, formerly known as Archwell Operations, is a part of Archwell Holdings founded in 2017. We are a tech forward outsourcing company specializing in supporting the US Mortgage, Insurance, and Banking industries. We specialize in end-to-end component-based outsourcing, managing one-off projects to become an extension of the customer service or...

  • Sr. BI Engineer

    3 weeks ago


    Yelahanka, India Visa Full time

    Job Description The Global Data Team With data and AI being the fuel that drives our future - our strategies, policies, and business successes around data will define our future growth prospects. Unlocking the value available through the innovative use of data and AI/Machine Learning on behalf of consumers, businesses, and communities is key to our future....