Scalability & Reliability Engineer, London

Work with the Product Development teams to raise the level of operations knowledge and skill through regular pair programming and expert advice, and to lead the development of a cross-team vision for future infrastructure improvements.

WANTED: An infrastructure expert to join our extreme programming team, who will design and advocate incremental improvements to our production infrastructure. You will help us handle our ever-increasing capacity and latency demands, while moving fast and delivering reliably.


You will pair-program with developers to implement your ideas, provide an expert voice in team discussions, and research new technologies. You must be equally happy discussing ideas and pair programming, and keen to develop your own skills in all areas of product development.


In this role you will work with a wide range of technologies, from infrastructure management code and application code through to datastores and networks. You’ll help the whole team embed monitoring and scaling capabilities into everything we build.


About Unruly Tech Operations

At Unruly we have an innovative approach to tech operations. We have come to value:


Cross-functional teams over teams cooperating

Rapid feedback over careful planning

Automation over manual work

Continuous deployment over planned releases

Monitoring value over monitoring servers

We believe developers who are building new product features should also be responsible for looking after them in production. This means developers are on call, there is no infrastructure operations team, and operations sits at the heart of our development process.


Reports to: Team Lead, Product Development

Location: Shoreditch (London, UK)

Employment Type: Permanent

Working Hours: Monday to Friday, 09:30 – 18:00

Salary: Highly Competitive



Private health cover, iPhone or Android phone, Cycle to Work scheme, childcare vouchers, season ticket loan, laptop allowance, conference attendance allowance. There are monthly company social events, weekly deliveries of fruit, Xbox contests, occasional poker nights, film nights and an annual Unruly Festival. Plus we spend way too much time watching and sharing the most awesome videos on the Web.


About the Role: Mission


Your mission will be to:

  • Work with the Product Development teams to raise the level of operations knowledge and skill through regular pair programming and expert advice, and to lead the development of a cross-team vision for future infrastructure improvements.


About the Role: Key Relationships

  • Product Development teams
  • CTO
  • SVP
  • Team Leads


About You: Experience

You must have:

  • Experience in gathering requirements as well as designing and building infrastructure to meet them (capacity and reliability SLAs, monitoring)
  • Run or led training sessions to share your knowledge and expertise
  • Experience evaluating and benchmarking solutions for performance and suitability
  • Worked on a project in close collaboration with application developers
  • Worked with cloud solutions (we use AWS)
  • Experience with multiple approaches to monitoring and alerting (we use Nagios, Graphite, Collectd, Grafana)
  • Experience with infrastructure automation and configuration management tools (we use Puppet, Fabric)


About You: Skills


You must be:

  • An expert on web technologies – HTTP, web servers, CDN configuration, DNS
  • Able to explain and advise on distributed systems design, fault tolerance and consensus
  • Confident working with SQL databases including replication (we use Postgres)
  • Confident working with NoSQL datastores (we use Redis, Cassandra and others)
  • Competent at shell scripting and at least one other programming language
  • Competent with open-source monitoring tools such as Nagios, Monit, Graphite
  • Competent in automation tools such as Terraform, Puppet, Fabric
  • Competent in at least two of:
    • Bulk data processing pipelines with tools such as Hadoop (EMR), Kafka
    • Networking – TCP/IP, BGP
    • JVM internals – Garbage collection algorithms, JIT debugging
    • Data Warehousing


About You: Behaviour

You must be:

  • Sociable – happy to pair-program daily
  • Flexible – able to adapt and align work to changing requirements and priorities
  • Patient – able to collaborate with people with different levels of experience
  • A team player – able to take shared ownership of code
  • Passionate – interested in the latest technologies and trends
  • Confident – able to clearly express your ideas in discussions


About You: Education & Qualifications

  • Honours-level degree or equivalent preferred (University of Life also counts)


The Wow Factor

We’d love it if you have:

  • An understanding of ad-serving and programmatic ad technologies
  • Worked in an extreme programming team, or similar environment
  • Worked in a continuous-delivery environment
  • Experience in coaching and mentoring developers
  • The same enthusiasm for working on application features as for operations
  • The ability to see how everything fits into the bigger picture
  • Experience with Splunk
  • Experience with various build tools (Maven, Grunt, sbt, Gradle)


To Apply:

Send an email with CV attached and your name and “Scalability & Reliability Engineer” in the subject line to Please specify your availability to commence the role and don’t forget to tell us where you heard about the role! All applicants must be authorised to work in the UK.

We love reviewing all the applications we receive, but unfortunately we’re not able to get back to everyone individually. If we’d like to move forward with your application we’ll definitely be in touch!
