
Site Reliability Engineering Manager
3 weeks ago
Ready to be pushed beyond what you think you're capable of?
At Coinbase, our mission is to increase economic freedom in the world. It's a massive, ambitious opportunity that demands the best of us, every day, as we build the emerging onchain platform and with it, the future global financial system.
To achieve our mission, we're seeking a very specific candidate. We want someone who is passionate about our mission and who believes in the power of crypto and blockchain technology to update the financial system. We want someone who is eager to leave their mark on the world, who relishes the pressure and privilege of working with high-caliber colleagues, and who actively seeks feedback to keep leveling up. We want someone who will run towards, not away from, solving the company's hardest problems.
Our work culture is intense and isn't for everyone. But if you want to build the future alongside others who excel in their disciplines and expect the same from you, there's no better place to be.
We are seeking a highly experienced and talented Principal Engineer to join our team. This individual will be one of the most senior individual contributors at Coinbase and will play a crucial role in influencing across multiple areas. This person will mentor other ICs to promote technical excellence and professional growth.
The mission of the Platform Product Group engineers is to build a trusted, scalable and compliant platform to operate with speed, efficiency and quality. Our teams build and maintain the platforms critical to the existence of Coinbase. There are many teams that make up this group which include Product Foundations (i.e. Identity, Payment, Risk, Proofing & Regulatory, Finhub), Machine Learning, Customer Experience, and Infrastructure.
The Reliability team is a vital part of the Infrastructure (Platform) org, responsible for paving the path for systems reliability and scalability. We manage multiple company-wide projects: scalability and load testing, a configuration management system, a canary-based safe release capability that improves company-wide systems reliability and reduces customer impact, and an embedded SRE function that works with product teams to improve the reliability of their systems. The team is also responsible for the mission-critical incident management function, which helps mitigate and resolve high-severity incidents to minimize customer impact.
As an Engineering Manager, you will promote a reliability culture across Coinbase. You will support company-wide goals to scale the system by 10-20x and help secure service configurations and secrets by building and enhancing world-class service configuration management systems. Your customer focus will help reduce customer incidents by building and enhancing the Safe Release (canary-based deployment) capability and onboarding thousands of services, which together ship hundreds of deployments every day. You will be responsible for hiring and retaining top talent, and you will build trust and relationships with cross-functional teams to make embedded SRE programs successful.
What you'll be doing (i.e., job duties):
- Promote a reliability culture across Coinbase. Support company-wide goals to scale the system by 10-20x and help secure service configurations and secrets by building and enhancing world-class service configuration management systems.
- Apply your customer focus to reduce customer incidents by building and enhancing the Safe Release (canary-based deployment) capability and onboarding thousands of services, which together ship hundreds of deployments every day.
- You will be responsible for hiring and retaining top talent.
- Build trust and relationships with cross-functional teams to make embedded SRE programs successful.
- Collaborate with engineers, product managers, and leadership to understand testing pain points and develop a strategy with a detailed roadmap. Generate alignment with stakeholders across the company.
- Actively listen to customer feedback and iterate to improve solutions.
- Be a thoughtful technical voice within the team, aiding in diligent architectural decisions and fostering a culture of high-quality code and engineering processes.
- Take ownership of the team's processes and services, ensuring SLA adherence.
- Work closely with our talent organization to identify and recruit exceptional engineers who align with Coinbase's culture and contribute to our products.
- Coach your direct reports to have a positive impact on the organization and support their career growth.
What we look for in you (i.e., job requirements):
- 10+ years of software engineering/SRE experience, including 2+ years as a people manager.
- Knowledge of SRE, DevOps, incident management, and reliability tooling such as canary releases and load testing.
- Experience with public cloud and general infrastructure such as Kubernetes, load balancers, auto-scaling, and basic networking, plus observability tools like Datadog and troubleshooting knowledge.
- Strong communication skills and ability to explain technical concepts clearly and simply
- Strong interpersonal skills working with Engineers from junior to principal levels
- Demonstrated critical thinking under pressure
- A willingness to dive into understanding, debugging, and improving any layer of the stack
Nice to haves:
- Prior experience designing and building reliable systems capable of handling high throughput and low latency
- Prior experience with high-severity incident management processes and on-call support.
- Experience with observability and monitoring systems such as Kibana, Datadog, etc.
- Familiarity with working in rapid growth environments
- Experience with AWS, GCP, Azure, or other cloud environment
- Experience designing and building reliable systems
- Experience working in a highly regulated environment
- Experience writing company-facing blog posts and training materials
Commitment to Equal Opportunity
Coinbase is committed to diversity in its workforce and is proud to be an Equal Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, creed, gender, national origin, age, disability, veteran status, sex, gender expression or identity, sexual orientation or any other basis protected by applicable law. Coinbase will also consider for employment qualified applicants with criminal histories in a manner consistent with applicable federal, state and local law. For US applicants, you may view Pay Transparency, Employee Rights and Equal Employment Opportunity is the Law notices by clicking on their corresponding links. Additionally, Coinbase participates in the E-Verify program in certain locations, as required by law.
Coinbase is also committed to providing reasonable accommodations to individuals with disabilities. If you need a reasonable accommodation because of a disability for any part of the employment process, please send an e-mail to accommodations[at]coinbase.com and let us know the nature of your request and your contact information. For quick access to screen reading technology compatible with this site click here to download a free compatible screen reader (free step by step tutorial can be found here).
Global Data Privacy Notice for Job Candidates and Applicants
Depending on your location, the General Data Protection Regulation (GDPR) and California Consumer Privacy Act (CCPA) may regulate the way we manage the data of job applicants. Our full notice outlining how data will be processed as part of the application procedure for applicable locations is available here. By submitting your application, you are agreeing to our use and processing of your data as required.