Lead Cloud Engineer (Remote)

Full Time
Atlanta, GA 30301
Job description

POSITION PURPOSE

Working at the world's largest home improvement retailer is a career-defining experience. New associates decide to join us because: they love our people and culture, they get to work with cutting-edge technologies, and they spend every day solving large-scale problems that matter. As a Home Depot QuoteCenter associate, you will impact the daily lives and decisions of our customers who spend billions of dollars at our stores in North America.

At The Home Depot QuoteCenter, our mission is to “enable a frictionless customer experience to sell Pros the complete job for the planned purchase.” Behind everything our users see and experience within The Home Depot QuoteCenter application ecosystem is the Build, Integrate and Connect operating model. It is built and managed by a diverse team of Merchants, Product Managers, Engineers, Marketeers, Operations, and Support professionals who enable Doers to Get More Done.

We're committed to maintaining a fun, engaging, and inclusive environment, ensuring the agility of a close-knit team and driving results that enable The Home Depot to continue to be a leader in our industry. At The Home Depot QuoteCenter, we value diversity in all its forms, and we work hard to support growing associate skills in a fast-paced collaborative environment.

The Home Depot QuoteCenter technology team is focused on radically reimagining the shopping experience at Home Depot utilizing the latest web technologies and data tools.

As a Lead Cloud Engineer, you will be responsible for the infrastructure and security of QuoteCenter's cloud-native platform in GCP as well as the tooling and integration with The Home Depot's wider Enterprise.

This will require you to maintain high site uptime/availability while embracing rapid change and growth using a strong DevSecOps mindset of continuous delivery and site automation. This role requires deep technical knowledge, adaptability, hands-on execution, and a drive towards reliability and disaster-resilience. In this role you will:

  • Drive the practice of Reliability Engineering in the Cloud Infrastructure domain.
  • Partner closely with Software Engineering and Architecture teams to develop DevSecOps solutions in cloud infrastructure that are reliable, efficient, and maintainable.
  • Design and deliver improvements to existing cloud-native processes and technology.
  • Set the standard for infrastructural engineering excellence.
  • Mentor and upskill junior team members in Cloud Engineering.
  • Partner with teams across The Home Depot enterprise.
  • Serve as a Subject Matter Expert in your domain.

To learn more about QuoteCenter, watch this short video: https://bit.ly/HomeDepotQCPro


Major Tasks, Responsibilities & Key Accountabilities:


  • 25% - Architecture and Solution Design
  • 35% - Implementation and Support
  • 25% - Team Mentoring and Education
  • 10% - Professional Development
  • 5% - Administrative and Planning Activities

Nature and Scope:

This position reports to the Manager, Site Reliability Engineering.
This position has no direct reports.

Environmental Job Requirements:

Travel:
Typically requires overnight travel less than 10% of the time.

Additional Environmental Job Requirements:

Standard Minimum Qualifications:
Must be eighteen years of age or older.
Must be legally permitted to work in the United States.

Additional Minimum Qualifications:

Education Required: The knowledge, skills and abilities typically acquired through the completion of a high school diploma and/or GED.

Years of Relevant Work Experience: 8 years

Preferred Qualifications:

  • 5-8 years of relevant work experience
  • 3-5 years Cloud Native Engineering - Expertise in one of the major cloud providers (AWS, GCP, Azure)
    • Proficient in production systems design including High Availability, Disaster Recovery, Performance, Efficiency, and Security
    • Strong preference for Cloud Architect Certification in Azure, AWS, or GCP

Required - Technical Proficiencies

  • Deep experience with
    • Kubernetes
    • Terraform
    • Helm
    • Configuration Management (OS Config, Ansible)
    • Containerization (Docker, containerd, et al)
    • Secrets Management (Vault, ASM, GSM, Azure Key Vault, et al)
    • CI/CD solutions (Spinnaker, Harness.io, Jenkins, CircleCI, GitHub Actions, et al)
    • Observability Tools (Prometheus, Influx, ELK, New Relic, Datadog)
  • Proficient in fundamentals of cloud-native Networking and Security.
  • Proficient in Linux/Unix-based and Windows Operating Systems
  • Strong understanding of POSIX standards and commands
  • Deep understanding of modern microservice based architectures and operations
  • Proficient in production monitoring concepts and implementation including synthetic, real user, application performance, system, log, time-series, and dashboarding
  • Proficient with one or more Object Oriented Programming Languages including but not limited to: C#, TypeScript, JavaScript, Java, Go, Python
  • Expert in shell scripting and in standard data serialization formats (JSON, YAML, et al)
  • Expertise in Git and GitOps workflows
  • Thorough understanding of data structures and algorithms
  • Knowledge of software design patterns

Required - Competencies

  • Cultivates Innovation: Creating new and better ways for the organization to be successful
  • Action Oriented: Taking on new opportunities and tough challenges with a sense of urgency, high energy and enthusiasm
  • Collaborates: Building partnerships and working collaboratively with others to meet shared objectives
  • Communicates Effectively: Developing and delivering multi-mode communications that convey a clear understanding of the unique needs of different audiences
  • Drives Results: Consistently achieving results, even under tough circumstances
  • Global Perspective: Taking a broad view when approaching issues; using a global lens
  • Interpersonal Savvy: Relating openly and comfortably with diverse groups of people
  • Manages Ambiguity: Operating effectively, even when things are not certain, or the way forward is not clear
  • Optimizes Work Processes: Knowing the most effective and efficient processes to get things done, with a focus on continuous improvement
  • Self-Development: Actively seeking new ways to grow and be challenged using both formal and informal development channels
  • Situational Adaptability: Adapting approach and demeanor in real time to match the shifting demands of different situations

Additional Desired Experience

  • Exposure to modern object-oriented programming languages (preferably Java or .NET C#)
  • Experience in destructive testing methodologies and tools such as Chaos Monkey
  • Experience in defensive coding practices and patterns for high availability
  • Hands-on experience with Service Discovery and Routing Mesh technologies, e.g. Envoy, Istio, Anthos Service Mesh, et al
  • Experience with modern deployment strategies (Blue/Green, Canary, et al)


Delivery & Execution

  • Defines team level and infrastructural best practices and engineering excellence
    • Develops automated mechanisms to drive them forward
    • E.g. Code Style Guides, Static Code Analysis, Sentinel and/or OPA policies, testing standards, etc.
  • Architects network (VPC, Subnet, CIDR, Firewall, etc.), IAM, and infrastructure solutions based on business needs and future planning for the infrastructure platform
    • Contributes to meaningful architecture diagrams and other documentation needed for security reviews or other interested parties
  • Automates infrastructure change management and pipelining (CI/CD)
  • Automates application-level change management and pipelining (CI/CD)
  • Constantly reflects on, reviews, and proposes improvements to our infrastructure, security, tooling, processes, standards, and capabilities with a continuous learning and improvement mindset
  • Collaborates and pairs with outside team members (e.g. Architects, Engineers, product management) to create secure, reliable, scalable software solutions
  • Drives and delivers on the major workstreams and business requirements
  • Documents, reviews and ensures that all quality and change control standards are met
  • Writes custom code or scripts to automate infrastructure, monitoring services, and test cases
  • Writes custom code or scripts to do “destructive testing” to ensure adequate resiliency in production
  • Creates meaningful dashboards, logging, alerting, and responses to ensure that issues are captured and addressed proactively
  • Contributes to enterprise-wide tools to drive destructive testing, automation, or engineering empowerment
  • Defines Service Level Objectives to constantly measure reliability in production and help prioritize backlog work

Support & Enablement

  • Fields questions from other product teams or support teams
  • Monitors tools and participates in conversations to encourage collaboration across product teams
  • Provides infrastructure support for services running in production
  • Proactively monitors production Service Level Objectives
  • Proactively reviews the performance and capacity of all aspects of production: code, infrastructure, data, and message processing
  • Triages high priority issues and outages as they arise
  • Collaborates with other Leads in the org to drive skills development and training in cloud tooling.

Learning

  • Participates in and leads learning activities around modern software design and development core practices (communities of practice)
  • Proactively views articles, tutorials, and videos to learn about new technologies and best practices being used within other technology organizations
  • Attends conferences and learns how to apply new technologies where appropriate
