Principal Software Engineer

What you'll do at

As a Principal Software Engineer (Reliability Engineering), you are responsible for working with Walmart’s state-of-the-art ecommerce fulfillment center warehouse management system (WMS) as part of the Supply Chain Technology organization. The initiatives require ensuring the smooth functioning of the WMS and creating a great customer order fulfillment experience. We are looking for intellectually curious engineers who are passionate about technology and about creating engineering solutions to operations problems, such as optimizing existing systems, building monitoring infrastructure, and eliminating work through automation. You will find innovative ways to reduce time spent on manual operations and proactively identify potential downtime.

Responsibilities include:
• Work closely with the development team on maintaining the operational health and performance of core application functions
• Manage and triage tickets. Drive prioritization and execution of work based on impact
• Scale systems sustainably through mechanisms such as easy-to-use tooling and automation. Work in concert with application developers, infrastructure engineers, and business operations to evolve systems/products for better scalability, reliability and development velocity
• Drive new playbooks to help reduce mean time to discover, mean time to triage and mean time to recovery for incidents. Prioritize and automate high-volume playbooks
• Develop optimal incident response processes and drive root cause analysis
• Demonstrate up-to-date expertise in Software Engineering and apply this to the development, execution, and improvement of action plans.
• Participate in multiple multi-scale projects.
• Technical understanding of core infrastructure, cloud services, platforms and micro-services
• Ability to understand and capture key data from logs
• Ability to triage effectively: detect issues and distinguish symptom from cause.
• Analyze trends to proactively prevent incidents.
• Identify and drive continuous improvement efforts to reduce waste (eliminate, automate or streamline).
• Build tools to improve visibility, proactively detect issues and restore system availability.
• Strong focus on collecting and inferring metrics.
• Analyze systems and make recommendations to prevent possible problems.
• Take the lead on issue resolution activities using knowledge of complex and company-wide systems.
• Perform build, deployment and continuous integration processes to move the code and configurations from local development environments to QA & Production environments.
• Work as a Level 2 production support engineer on a rotation basis, helping the Level 1 production support team with any production issue where engineering help is required.
• Responsible for production environment health as first priority, enabling automated monitoring and alerting to meet SLAs.
• Clear communication skills.
• Responsible for production environment health as first priority, enabling automated monitoring and alerting and ensuring close to 100% uptime.
• Troubleshoot business and production issues

Minimum Qualifications

• Bachelor's or Master's degree in Computer Science and 15+ years of experience
• Proven industry experience with large scale distributed systems
• Solid experience with object-oriented and/or event driven systems
• Strong Java programming experience
• Extensive experience building services using back end technologies (Java, Spring, Hibernate)
• In-depth knowledge of SQL/NoSQL and database technologies (Oracle, Cassandra, Hive)
• Experience automating tasks with scripting languages such as Shell, Perl, Python, Bash, and JavaScript
• Systematic problem-solving approach, strong communication skills, a sense of ownership and drive
• Deep understanding of service metrics and alarms through the development of dashboards, service KPIs and alarming systems
• Aptitude to
• Experience working in an operational environment with mission-critical tier-one services and associated pager duty

Preferred Qualifications

• Strong aptitude to debug and optimize code
• Attitude to thrive in a fun, fast-paced, start-up-like environment
• Experience in production system operations (logging, telemetry, alerting etc.)
• Excellent communication and problem-solving skills
• Has ambition and vigor to add value to a rapidly growing development team

About Walmart Labs

Imagine working in an environment where one experiment can catapult an entire industry toward a smarter future. That’s what we do at Walmart Labs. We’re a team of 4,000+ software engineers, data scientists, designers and product managers within Walmart, the world’s largest retailer, delivering innovations that improve how our customers shop and our enterprise operates.

Hello, Silicon Valley

You don’t have to choose between your career and your lifestyle in Silicon Valley. Here, you can have both.

All the benefits you need for you and your family

  • 100% coverage for in-network preventative care
  • Retirement Plan
  • Vision Plans
  • Dental Plans
  • Exclusive Discounts
