A footgun in Swift’s Codable protocol

Swift’s Codable protocol helps you create structs that can be serialized to and from an external representation such as JSON. The compiler generates code that does the heavy lifting, mapping Swift’s types to and from the appropriate JSON types.

A simple use of Codable might look like this. First, we define a struct (here representing a user) and mark it as conforming to the Codable protocol.

struct User: Codable {
  var name: String
  var age: Int
}

We can then encode a User struct as JSON:

import Foundation

let user = User(name: "Alice", age: 42)

let encoder = JSONEncoder()
let data = try! encoder.encode(user)

print(String(data: data, encoding: .utf8)!)
// -> {"name":"Alice","age":42}

And we can decode a User from JSON:

let data = Data("{ \"name\": \"Bob\", \"age\": 36 }".utf8)

let decoder = JSONDecoder()
let user = try! decoder.decode(User.self, from: data)

print(user)
// -> User(name: "Bob", age: 36)

This is pretty handy. Just by declaring that our struct User conforms to Codable, we can convert it to and from a JSON representation without needing to write any serialization or deserialization code ourselves.

Mapping names

Let’s say that we’re sending these JSON-encoded User objects to and from an HTTP service. And let’s say that the developers of that service add a new field to store the user’s (optional) favorite ice cream flavor.

{
	"name": "Alice",
	"age": 42,
	"favorite_ice_cream_flavor": "pistachio"
}

To support this additional field, we could add a property to our User struct declared in Swift. But Swift code typically uses camelCase for names, not snake_case like the developers of the HTTP service have chosen to use in their JSON. Can we name the Swift property favoriteIceCreamFlavor and somehow map it to the snake case field name used in the JSON?

Yes, we can. There are two ways. The first is to declare an explicit CodingKeys enum: a lookup table that maps the property names on our Swift struct to the corresponding names in its encoded representation.

struct User: Codable {
  var name: String
  var age: Int
  var favoriteIceCreamFlavor: String

  enum CodingKeys: String, CodingKey {
    case name = "name"
    case age = "age"
    case favoriteIceCreamFlavor = "favorite_ice_cream_flavor"
  }
}

Encoders and decoders (like the JSONEncoder and JSONDecoder in the example above) will automatically use the CodingKeys enum if one is present.

The second way is to keep the struct simpler, but then to modify how we encode and decode it.

struct User: Codable {
  var name: String
  var age: Int
  var favoriteIceCreamFlavor: String
}

// to encode, configure the encoder like this:
let encoder = JSONEncoder()
encoder.keyEncodingStrategy = .convertToSnakeCase

// and to decode, configure the decoder like this:
let decoder = JSONDecoder()
decoder.keyDecodingStrategy = .convertFromSnakeCase

The second option looks tempting. It’s more concise, especially for structs with many fields. However, it also has a hidden footgun.

The footgun

Suppose that the HTTP service adds another field, avatar_url, which is a URL string pointing to an image file for the user’s avatar (profile picture).

We might add that to our User struct like this:

struct User: Codable {
  var name: String
  var age: Int
  var favoriteIceCreamFlavor: String
  var avatarURL: String
}

Let’s try this out to make sure that encoding is working.

let user = User(
	name: "Alice",
	age: 42,
	favoriteIceCreamFlavor: "pistachio",
	avatarURL: "http://example.com/alice.png"
)

let encoder = JSONEncoder()
encoder.keyEncodingStrategy = .convertToSnakeCase
let data = try! encoder.encode(user)

print(String(data: data, encoding: .utf8)!)
// -> {"age":42,"avatar_url":"http:\/\/example.com\/alice.png","name":"Alice","favorite_ice_cream_flavor":"pistachio"}

Great. Now let’s try to decode that same JSON.

let decoder = JSONDecoder()
decoder.keyDecodingStrategy = .convertFromSnakeCase

// 'data' is the same data from above
let user = try! decoder.decode(User.self, from: data)
// -> crashes: try! turns the thrown DecodingError into a fatal error

The last line above fails. This is the error that is thrown:

Swift.DecodingError.keyNotFound(
  CodingKeys(stringValue: "avatarURL", intValue: nil),
  Swift.DecodingError.Context(
    codingPath: [],
    debugDescription: "No value associated with key CodingKeys(stringValue: \"avatarURL\", intValue: nil) (\"avatarURL\"), with divergent representation avatarUrl, converted to avatar_url.",
    underlyingError: nil
  )
)

The problem is that while convertToSnakeCase mapped "avatarURL" to "avatar_url", the inverse is not true: convertFromSnakeCase has mapped "avatar_url" to "avatarUrl" (note the different capitalization).
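To inspect the failure without crashing, the decode call can be wrapped in do/catch instead of try!. The following is a minimal sketch, using a trimmed-down User struct and a hand-written JSON payload that are assumptions for illustration; the missingKey variable exists only to record which key the decoder was looking for.

```swift
import Foundation

// Trimmed-down version of the struct from the example above.
struct User: Codable {
  var name: String
  var avatarURL: String
}

// Hand-written payload using the snake_case names the service sends.
let json = Data(#"{"name": "Alice", "avatar_url": "http://example.com/alice.png"}"#.utf8)

let decoder = JSONDecoder()
decoder.keyDecodingStrategy = .convertFromSnakeCase

// Records the key the decoder failed to find, if any.
var missingKey: String? = nil

do {
  let user = try decoder.decode(User.self, from: json)
  print(user)
} catch DecodingError.keyNotFound(let key, let context) {
  // The decoder looked for "avatarURL", but the converted key was "avatarUrl".
  missingKey = key.stringValue
  print("Missing key \(key.stringValue): \(context.debugDescription)")
} catch {
  print("Other decoding error: \(error)")
}
```

Catching the error this way doesn't fix the mismatch, but it makes the mismatched key visible in logs instead of terminating the process.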

The root of the problem

Fundamentally, the problem is that there are more camelCase names than snake_case names. The camelCase strings avatarURL and avatarUrl both map to the same snake_case string avatar_url. Mapping that snake_case string back to a camelCase string requires deciding how to interpret it.
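The many-to-one mapping is easy to see by encoding two structs that differ only in how they capitalize the acronym. This is a small sketch; the struct names UppercaseAcronym and TitlecaseAcronym are made up for illustration.

```swift
import Foundation

// Two hypothetical structs that differ only in the capitalization of "URL".
struct UppercaseAcronym: Encodable { var avatarURL: String }
struct TitlecaseAcronym: Encodable { var avatarUrl: String }

let encoder = JSONEncoder()
encoder.keyEncodingStrategy = .convertToSnakeCase

// Encode one value of each type and view the results as strings.
let a = String(decoding: try! encoder.encode(UppercaseAcronym(avatarURL: "x")), as: UTF8.self)
let b = String(decoding: try! encoder.encode(TitlecaseAcronym(avatarUrl: "x")), as: UTF8.self)

print(a)
print(b)
// Both lines show the same key, avatar_url, so the original
// capitalization cannot be recovered from the JSON alone.
```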

I had wrongly assumed that when no explicit coding key was provided, Swift’s encoder and decoder would each generate an implicit coding key based on the property names in the struct, and then use that key to map struct properties to JSON keys (for encoding) or JSON keys to struct properties (for decoding). That way, decoding with the convertFromSnakeCase strategy would map avatar_url in the JSON to either avatarURL or avatarUrl in the Swift struct, depending on which field was present in the struct’s definition (if both were present in the same struct, a runtime error would probably be appropriate). This would guarantee that decode(encode(foo)) == foo (assuming that foo’s fields were individually encodable and decodable without loss of information).

But that’s not how it works. What really happens is that when decoding a JSON document, Swift applies the convertFromSnakeCase transformation to each key, which simply deletes each underscore and capitalizes the letter that follows it. If the resulting name doesn’t match any property on the struct, the decoder ignores that field in the JSON. At the end, if any required property on the struct hasn’t been found, the decoder throws an error.
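The transformation described above can be approximated in a few lines. This is a simplified sketch, not Foundation’s actual implementation, which handles additional edge cases such as leading and trailing underscores; the function name simplifiedCamelCase(fromSnakeCase:) is hypothetical.

```swift
// Simplified approximation of the snake_case to camelCase conversion:
// drop each underscore and capitalize the first letter of the word after it.
// Hypothetical helper; Foundation's real implementation handles more edge cases.
func simplifiedCamelCase(fromSnakeCase key: String) -> String {
  let parts = key.split(separator: "_")
  guard let first = parts.first else { return key }
  let rest: [String] = parts.dropFirst().map { part in
    part.prefix(1).uppercased() + String(part.dropFirst())
  }
  return ([String(first)] + rest).joined()
}

print(simplifiedCamelCase(fromSnakeCase: "favorite_ice_cream_flavor"))
// -> favoriteIceCreamFlavor
print(simplifiedCamelCase(fromSnakeCase: "avatar_url"))
// -> avatarUrl
```

Nothing in this function knows which property names the target struct declares, so it has no way to produce avatarURL.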

How to avoid this

Some software projects follow the convention that only the first letter of an acronym in camelCase is capitalized, as if it were an ordinary word. This means using HttpRequest and messageId instead of HTTPRequest and messageID. If we stuck to this rule in our Codable structs, then Swift’s convertFromSnakeCase would reliably do what we wanted. But this feels odd to me, because Apple’s Foundation library seems to exclusively do the opposite: it’s full of names like URLSession and objectID and ISO8601DateFormatter. The Swift language API Design Guidelines also suggest uniformly uppercasing (or downcasing) acronyms and initialisms in camelCase names.
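For comparison, here is a sketch of what that convention looks like in practice. The property is named avatarUrl rather than avatarURL, and the struct is trimmed down and marked Equatable purely so the round trip can be checked.

```swift
import Foundation

// Same idea as the User struct above, but with the acronym written as "Url".
struct User: Codable, Equatable {
  var name: String
  var avatarUrl: String
}

let encoder = JSONEncoder()
encoder.keyEncodingStrategy = .convertToSnakeCase

let decoder = JSONDecoder()
decoder.keyDecodingStrategy = .convertFromSnakeCase

let original = User(name: "Alice", avatarUrl: "http://example.com/alice.png")

// avatarUrl -> avatar_url -> avatarUrl, so the round trip succeeds.
let roundTripped = try! decoder.decode(User.self, from: encoder.encode(original))
print(roundTripped == original)
```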

Ultimately, what I settled on was to always use explicit CodingKeys. The enum provides a bidirectional mapping between struct properties and encoded JSON fields, and it’s checked at compile time: omitting a struct property from the enum produces an error, unless that property has a default value, in which case it is silently left out of the encoded form. Using explicit CodingKeys ensures that decode(encode(foo)) always succeeds.
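Applied to the User struct from earlier, that approach might look like the sketch below. The Equatable conformance is an addition made only so that the round trip can be compared; the encoder and decoder use their default key strategies.

```swift
import Foundation

struct User: Codable, Equatable {
  var name: String
  var age: Int
  var favoriteIceCreamFlavor: String
  var avatarURL: String

  // Explicit mapping from Swift property names to JSON field names.
  enum CodingKeys: String, CodingKey {
    case name
    case age
    case favoriteIceCreamFlavor = "favorite_ice_cream_flavor"
    case avatarURL = "avatar_url"
  }
}

let original = User(
  name: "Alice",
  age: 42,
  favoriteIceCreamFlavor: "pistachio",
  avatarURL: "http://example.com/alice.png"
)

// No key strategies are configured: CodingKeys does all the name mapping.
let data = try! JSONEncoder().encode(original)
let decoded = try! JSONDecoder().decode(User.self, from: data)

print(String(decoding: data, as: UTF8.self))
print(decoded == original)
```

Because the same enum drives both directions, the name written during encoding is exactly the name looked up during decoding.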

Maybe in the future Apple can fix this rough edge in the Codable protocol. It would in theory be possible for JSONEncoder and JSONDecoder to work in the way I described above, where each would use reflection on the struct to generate name mapping tables that are invertible, so that encoding and decoding always perform exactly opposite transformations. But for now I’ll be manually defining name mappings for all of my Codable structs.