Staff Site Reliability Engineer

  • Location BENTONVILLE, AR
  • Career Area Information Technology
  • Job Function Information Technology
  • Employment Type Full Time
  • Position Type Salary
  • Requisition 959435BR

What you'll do

As a member of the Global Technical Engineering and Operations (GTEO) SRE team, you will work with other SRE and DevOps practitioners to produce mission-critical infrastructure, tools, and processes that ensure the highest levels of availability and reliability for all our websites. As a senior member of the team, you will be expected to work with management, peers, and customers to define and implement the team's technical vision.

You're right for the job if you're comfortable with deep technical Linux and networking topics and with distributed architectures. You will work cross-functionally with a variety of teams and be a core contributor to every significant engineering service or solution that we deliver to our stakeholders. You'll excel if you have enthusiasm for digging deep and a flair for sharp technical communication, prioritization, and organization. You will work directly with our Software Engineering teams to build our next-generation “always up” cloud-based e-commerce/Retail and Enterprise platform.

Site Reliability Engineers are hybrid systems and software engineers who take ownership of reliability, scalability, automation, and other issues related to the uptime and availability of Walmart’s e-commerce/Retail and Enterprise platform. Our goal is to build, scale, and guard the systems that delight our customers. To do so, you will need strong skills in the following areas:
- Design, write, and build tools to improve the reliability, latency, availability, and scalability of Walmart e-commerce/Retail and Enterprise products.
  o Engender reliability and availability, starting with metrics and measurements.
  o Enable scaling by providing tools, developing training, and/or augmenting processes.
  o Build tools and automation to prevent the recurrence of problems in mission-critical products and services.

  • Develops innovation strategies, processes, and best practices
  • Drives the execution of multiple business plans and projects
  • Ensures business needs are being met
  • Leads and participates in medium- to large-scale, complex, cross-functional projects
  • Leads the discovery phase of medium to large projects to produce a high-level design
  • Leads the work of small groups of six to ten engineers, including offshore associates, for assigned Engineering projects
  • Promotes and supports company policies, procedures, mission, values, and standards of ethics and integrity
  • Provides supervision and development opportunities for associates
  • Supports business objectives
  • Troubleshoots business and production issues
  • Utilizes industry research to improve Walmart's technology environment

Minimum Qualifications

Bachelor’s degree and 6 years of experience, OR Master’s degree and 3 years of experience

Preferred Qualifications

- 10+ years in a software development, DevOps, or SRE role.
- Experience in designing, investigating, analyzing, and troubleshooting large-scale enterprise systems.
- Methodical and systematic problem-solving approach, combined with a solid sense of ownership, initiative, and drive.
- Fluency with running services at scale; in-depth understanding of Unix systems internals and networking.
- Networking knowledge and in-depth understanding of network concepts, such as protocols (TCP/IP, UDP, ICMP, etc.), MAC addresses, IP packets, DNS, OSI layers, and load balancing.
- Understanding of Unix/Linux systems from kernel to shell and beyond, taking in system libraries, file systems, and client-server protocols along the way. Experience administering Linux systems in a production environment.
- Programming experience in one or more of the following languages: Go, Java, Python, Ruby, Shell
- Bachelor's Degree in Computer Science or a related field, or relevant work experience
- Experience with distributed version control like Git or similar
- Experience with IaaS and PaaS providers such as AWS, Azure, and OpenStack
- Experience with enterprise monitoring solutions like AppDynamics, New Relic, Prometheus, Graphite, Nagios, Sensu and Splunk
- Familiarity with continuous integration/deployment processes and tools such as Jenkins, Maven, Nexus, etc.

About Walmart Labs

Imagine working in an environment where one experiment can catapult an entire industry toward a smarter future. That’s what we do at Walmart Labs. We’re a team of 4,000+ software engineers, data scientists, designers and product managers within Walmart, the world’s largest retailer, delivering innovations that improve how our customers shop and our enterprise operates.

Hello, NW Arkansas

With over 200 miles of trails, an emerging locally sourced food scene, and the world-renowned Crystal Bridges Museum, NWA has something for everyone.


All the benefits you need for you and your family

  • 100% coverage for in-network preventive care
  • Retirement Plan
  • Vision Plans
  • Dental Plans
  • Exclusive Discounts
