Job Description

Site Reliability Engineer
Client is based in Austin, TX (Remote is approved)
6 Month+ Contract
Job Description:
Site Reliability Engineering (SRE) is a discipline that combines software and systems engineering for building and running large-scale, distributed, fault-tolerant systems. SRE ensures that internal and external services meet or exceed reliability and performance expectations while adhering to Equifax engineering principles.
You will have the opportunity to work with our client, a highly innovative digital healthcare company endeavoring to make high-quality healthcare accessible and affordable for every person across the globe. You will be building and supporting highly scalable, digital-first platforms that deliver virtual clinical operations and integrated, personalized healthcare, utilizing millions of mobile devices, a state-of-the-art software stack, and advanced AI engines.
Seeking a Site Reliability Engineer with experience deploying and managing globally distributed systems, with strong development experience in the commercial application software space. This role will work closely with product teams and platform engineers to improve API and service performance, identify opportunities for improvement, and define metrics to monitor success. While both defining and evangelizing SRE best practices to improve reliability and performance, this role will also participate in on-call rotations, anticipate problems, dig deep for root causes, and implement solutions to prevent further occurrences.
Responsibilities & Expectations:
  • Develops and demonstrates an advanced knowledge of deploying and managing globally distributed systems
  • Comfortable building and maintaining Python or Java codebases
  • Enjoys troubleshooting code in a variety of programming languages, such as Python, Java, JavaScript, Ruby, etc.
  • Knowledge of containers, Kubernetes, and AWS
  • Participate in on-call rotations
  • Passionate about building monitoring tools and automation to improve the quality of applications and infrastructure, and potentially contributing to the open-source community
  • Identify problems quickly, or in advance, through thorough investigation, and apply solutions to prevent further occurrences
  • Actively working to improve systems and metrics for success
  • Participate in Design and Security Reviews
  • Define and evangelize SRE best practices to improve reliability and performance
  • Challenges the status quo and understands full scope of application architecture and design
  • Strong verbal and written communication skills are necessary due to the dynamic nature of collaborations with customers, vendors, and other engineering teams, solving complex business problems together
  • A solution-minded person who can work independently as well as with and across teams
Required Qualifications:
  • Bachelor's Degree
  • Experience in operating, troubleshooting, and scaling production systems
  • Proficiency in Python coding is highly desirable
  • Experience with Python, Java, JavaScript, Ruby, or similar
  • Advanced Knowledge of AWS (certification preferred)
  • Comfort level working in teams practicing Agile and/or Scrum Methodologies
  • At least 3 years of experience
  • Development or Support Experience across a reasonable subset of the following:
      • Kubernetes, Docker, CircleCI
      • New Relic, AppDynamics, Datadog, or similar
      • Python 3
      • Java 11
      • Scala 2
      • Azure Pipelines
      • React, React Native
      • Postgres
      • Honeycomb
      • Jenkins
      • AWS, GCP, or Azure
      • REST/GraphQL
  • Cloud-native principles, design, and architecture, with strong working knowledge of cloud DevOps processes and implementation
  • Solid implementation experience in building logging and exception-handling frameworks
Preferred Qualifications:
  • Master’s degree
  • Healthcare industry experience
  • AI / Machine Learning solutions domain exposure

Application Instructions

Please click on the link below to apply for this position. A new window will open and direct you to apply at our corporate careers page. We look forward to hearing from you!

Apply Online