Joe’s Mots on Software

Trying to be cogent.

I’m a software engineer with a focus on designing for simplicity, reliability and stability in distributed systems.

At SoundCloud I’m building tools for the musicians and artists who host content on the platform. Before that I was the Tech Lead of the Data Platform, working on the architecture of our data infrastructure and guiding how we work with data across the company.

Posts

February 12, 2019 Off–Platform Validation – automatic checks with supervision

SoundCloud Premier Distribution allows creators to distribute their music from SoundCloud to other streaming platforms and stores. For many of our users, this will be their first experience with the strict requirements of the music industry supply chain on metadata and media. Here we’ll look at how a system of automatic and manual validations allows users to get fast feedback as they prepare a release.

December 1, 2017 Concepts in DataFlow – basic definitions

Google's 2015 paper on the Dataflow model describes general solutions to general data pipeline processing problems. The terms they use have been helpful to me in understanding patterns in these problems.

June 20, 2017 Who Owns the Data? – a better model for data ownership

We have a good solution for ownership of services in a microservices architecture. We can learn from this to define ownership of datasets in a way that reduces the total cost of maintenance and integration across teams.

February 17, 2017 Gödel Blockchain Theorem – limits of verifiability

A blockchain allows independent parties to make verifiable statements. This works with bitcoin, whose value comes from the system itself, but fails in applications where the value is external.

January 6, 2017 Two Phase Commit – an old friend

Two-phase commit is a long-established means of keeping two resources strongly synchronised. These days it's not so sexy, but it's an important part of the heritage of distributed computing.

December 30, 2016 Understanding Record Shredding – storing nested data in columns

Record shredding allows nested data structures to be considered in a sort-of-tabular way, and stored in a columnar data store. This post describes the intuition behind how this can be done while preserving message structure, as in Dremel and Parquet.

December 7, 2016 Levels of Robustness – so many things to go wrong

We like our code to be "robust". This post looks at different failure modes against which a system needs to be protected.

October 15, 2016 Everything Is a Tradeoff – in praise of writing down design choices

Being explicit about costs and implications when making choices makes future decisions easier when things change. A collaborative document can be a great implementation of this.

[Series] February 24, 2016 Just In Time – introduction to JVM compilation

An introduction to compilation for the JVM, bytecode and JIT compilation, and benchmarking with JMH. It accompanies a talk I gave to the Berlin-Brandenburg Scala User Group.

[How To] January 26, 2016 One Source, Two Jars – multiple builds for one project

How to build two artifacts from one source folder in SBT

December 14, 2015 What is the State monad? – functional programming with state

Learning about what the State monad represents and how to use and understand it

[How To] December 3, 2015 Sequence All The Things – implement sequence on your own types

How to add Applicative and Traverse instances for your own types, use sequence, sequenceU and Unapply

April 12, 2015 Learning about Non-blocking I/O – no-code intro

Deriving how non-blocking I/O must work, from first principles

April 20, 2014 Guava Testlib Example – a brief introduction to generated tests

Step-by-step guide to using the Guava Testlib library for test case generation

[How To] June 17, 2013 JMX in a DMZ – understanding RMI settings

An SSH tunnel can allow access to a JMX endpoint that is only exposed to the local machine.

[How To] June 13, 2013 SSH tunnel into a DMZ – creating a path back home

SSH tunnelling allows opening a hole back through a firewall or NAT, and it's really easy to set up.

May 13, 2013 Interviewing for Programmers – my approach to interviewing

A description of my interview approach while at GSA – what I was looking for and what I expected from candidates