Absurdist Arcana Ben Weintraub's blog

What's a URI?

There are some things that almost everyone is familiar with, but which relatively few people understand deeply. You might go your whole life never awakening to your own lack of knowledge on a topic that you deal with in a passing manner daily.

For me, URIs were like this (and still are, to some extent). During a recent work discussion about how to structure internally-used URIs, a co-worker helped me realize this.

RFC 3986 formally defines the structure of a URI, along with related concepts like relative references. If you’ve never tried, it’s worth reading (or at least skimming the table of contents)! I’ll admit to only having tried for the first time a few weeks ago, despite working as a software engineer for the last 15 years.

One of the most surprising things I learned while reading it was that URIs do not need to contain two slashes (//). To understand the alternative constructions of URIs, which looked different from what was in my head, I spent some time making railroad diagrams from (a subset of) the ABNF definitions in Appendix A of RFC 3986. They’re pretty, and you might enjoy them:

[Railroad diagrams. "What I thought a URI was": scheme : authority, then segments, then ? query and # fragment. "What I learned a URI was": scheme : followed by either an authority with path-abempty, or path-absolute, path-rootless, or path-empty, then ? query and # fragment. Path annotations: path-abempty (begins with '/' or is empty), path-absolute (begins with '/' but not '//'), path-rootless (begins with a segment), path-empty (zero-length path).]
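
To make the four path forms concrete, here is a minimal Python sketch that splits a URI using the regular expression given in Appendix B of RFC 3986 and then reports which path form it uses. The function name `classify` and the sample URIs are my own illustrations, not anything defined by the RFC, and the sketch does no validation beyond what that regular expression does.

```python
import re
from typing import Optional, Tuple

# Regular expression from RFC 3986, Appendix B. Every part is optional,
# so it matches any string (including relative references).
URI_RE = re.compile(r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?")


def classify(uri: str) -> Tuple[Optional[str], Optional[str], str, str]:
    """Return (scheme, authority, path, path_kind) for a URI string.

    `classify` is a hypothetical helper for illustration only.
    scheme is None when the input has no scheme (a relative reference).
    """
    m = URI_RE.match(uri)
    scheme, authority, path = m.group(2), m.group(4), m.group(5)
    if m.group(3) is not None:
        # "//" was present, so there is an authority (possibly empty)
        # and the path must be path-abempty.
        kind = "path-abempty"
    elif path == "":
        kind = "path-empty"
    elif path.startswith("/"):
        kind = "path-absolute"
    else:
        kind = "path-rootless"
    return scheme, authority, path, kind


if __name__ == "__main__":
    for example in [
        "https://example.com/a/b?x=1#top",
        "file:///etc/hosts",
        "mailto:someone@example.com",
        "urn:isbn:0451450523",
        "foo:/bar",
        "foo:",
    ]:
        print(example, "->", classify(example))
```

Only the first two examples contain "//"; the rest have no authority component at all (the authority comes back as `None`), which is the case my mental model was missing.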