Senior Manager, Infrastructure Site Reliability Engineering

Full Time
Remote
Job description

US, Remote

This role is responsible for the daily operations of the Infrastructure Site Reliability Engineering (ISRE) team and combines project management, team management, and engineering duties. You will oversee operational activities, delegate tasks, and plan and coordinate projects. You will also provide insight into team activities and status, identify project blockers, and work to resolve or escalate issues.

The ideal candidate understands the rigors of working in a fast-paced, deeply technical environment. You will take ownership of and responsibility for all team activities, communicate and collaborate with stakeholders throughout the organization, and work with a sense of urgency to drive and complete projects and team objectives. You must be passionate about individual contributions, career development, and progression, and provide guidance and mentoring.

Job Responsibilities

  • Act as primary point-of-contact for all infrastructure projects and requests

  • Assume lead role in troubleshooting, service restoration, and root cause analysis of incidents and outages

  • Provide project management, planning, and road-mapping support

  • Be the driving force behind our automation, monitoring, and observability initiatives

  • Build and maintain operational tools for deployment, monitoring, and analysis of the infrastructure and systems

  • Work collaboratively with software engineering to define infrastructure and deployment requirements; be a sounding board and provide recommendations for engineering

  • Establish, document, publish, and communicate ISRE standards, processes, and procedures

  • Plan, strategize, and assign team goals and objectives

  • Provide professional mentorship and career development for team members

  • Seek opportunities for continuous improvements in our tools, technologies and processes

  • All other duties and responsibilities as assigned

  • Participate in a 24x7x365 on-call rotation

Skills & Competencies

  • Proven track record working in large-scale environments

  • Expert-level administration and operational support for various Linux operating systems

  • Deep knowledge of server and system hardware

  • Experience working with Linux systems from kernel to shell, including working with system libraries, file systems, and client-server protocols

  • Experience with networking (TCP/IP, UDP, ICMP, ARP, DNS, load balancing, etc.)

  • Experience with configuration management tools (Ansible)

  • In-depth understanding of OS, systems security, encryption, and networking stack and interfaces

  • Configuration automation: automation cookbooks and test suites for Ansible and Chef

  • Excellent coding or scripting skills, including template building (Python, Groovy, or Bash)

  • Cloud technologies: Kubernetes, Docker, and AWS Infrastructure Services

  • Working knowledge of content management systems, source control systems, Git, Jira, Confluence, and ServiceNow

  • Must have excellent interpersonal skills; solid communication skills, both written and verbal

  • Must be organized, detail-oriented, and able to manage multiple tasks simultaneously with the ability to prioritize appropriately

  • Familiarity with the Korean language is a bonus

Education & Experience

  • A Bachelor’s degree in Computer Science or a related technical field, or equivalent industry experience

  • Minimum of 8 years of industry experience in engineering with 4+ years of leadership experience

  • 5+ years of experience with major Incident Management, Program Management or related Incident Command processes

  • Experience in managing, collaborating, and influencing global teams

Joyent is committed to employing a diverse workforce and providing Equal Employment Opportunities for all individuals regardless of race, color, religion, gender, age, national origin, marital status, sexual orientation, gender identity, status as a protected veteran, genetic information, status as a qualified individual with a disability, or any other characteristic protected by law.

Compensation and Benefits

Compensation for this position will vary among specific regions due to geographical differentials in the labor market, and actual pay will be determined considering factors such as relevant skills, experience, and comparison to other employees in the role. Therefore, the annual base compensation range for this role (depending on the geographical location) is expected to be between $140,000 and $240,000.

Regular full-time employees (salaried or hourly) have access to benefits including Medical, Dental, Vision, Life Insurance, 401(k), Employee Purchase Program, Vacation and Sick leave, electronic reimbursement and many more. In addition, regular full-time employees (salaried or hourly) are eligible for bonus compensation based on individual, department, and company performance.

About Joyent

Joyent, a wholly owned subsidiary of Samsung, is the open cloud company. Joyent builds technology at the pinnacle of scale, performance, stability, and security to accelerate the transformation toward the mobile and cloud-centric world. Joyent designs, builds, and manages market-competitive cloud computing solutions and services for Samsung Electronics and its partners at global scale.

How To Apply

To apply, please submit a brief introduction, a copy of your resume, and a link to your GitHub or LinkedIn profile to jobs@joyent.com with "Senior Manager, Infrastructure Site Reliability Engineering" in the subject line. We are an equal-opportunity employer, building a diverse and inclusive team. Qualified applicants with criminal histories will be considered for the position in a manner consistent with the Fair Chance Ordinance.

Disclaimer: This job description is not designed to cover or contain a comprehensive listing of activities, duties or responsibilities that are required of the employee. Duties, responsibilities and activities may change or new ones may be assigned at any time with or without notice.
