Job description
V7 is an AI data platform to automate any visual task, voted by Forbes as one of the top 25 machine learning startups of 2021. We have raised over $40 million in venture funding and are backed by some of the most competent individuals in AI. We manage the training data and models of hundreds of AI companies and enterprises. What sets us apart is our team's obsession with pushing our product to where AI will be three years from today.
About the team:
The mission of the Platform team is to provide secure, reliable infrastructure that enables high-quality, performant services. You will help us maximise uninterrupted service for our rapidly growing number of customers.
We look after petabytes of data in S3 that we use to train computer vision models. Our industry is new, so the Platform team is continually looking for ways that SRE and DevOps practices can help the company and our customers with the challenges of building businesses around machine learning and computer vision.
Technologies we use:
AWS (S3, EC2, ECS, CloudFront, RDS)
Terraform, GitHub
Linux (various)
Python, Bash, JS
You’ll be working on:
Building a robust, secure, self-hosted observability stack
Kubernetes (AWS EKS)
Prometheus
Grafana
OpenTelemetry
Jaeger
Improving developer experience and accelerating time to ship
GitHub
Deploying and curating a fleet of GitHub Actions runners
Integrating end-to-end tests (Cypress) in CI and CD pipelines
Implementing LocalStack for pre-merge testing
Implementing and maintaining strong cybersecurity patterns
Improving security through best practices and educating colleagues
Supporting our highly regulated customers in the healthcare industry
Building cross-cloud solutions
Working with customer-facing teams to help validate use cases on AWS, Azure, and GCP
Requirements:
4+ years of overall experience and 2+ years in a DevOps/SRE role
Knowledge of Linux operating systems and computer networking
Experience writing code in a programming language such as C++, Java, Elixir, Python, Go, etc.
Experience administering cloud-based infrastructure (e.g. AWS) using Terraform
Ability to troubleshoot production issues related to computer infrastructure, configuration, monitoring, deployments, continuous integration and delivery
Nice to have:
Interest in machine learning and computer vision
Hands-on experience building Kubernetes clusters
Implementing federated authentication systems (SSO, OIDC, SAML)
Building streamlined CI/CD pipelines
Relational database management or administration
Benefits:
Unlimited vacation: just tell us when you need time off
Stock options
Work from anywhere
Company retreats in stunning locations
Paid tickets, accommodation, and travel to relevant national or international conferences (NeurIPS, ICCV, CVPR, ...) to expand your network and knowledge during normal times
Compensation Range: £100K - £120K