I lean toward this. Or, instead of allowing nil, something like: must be present and not nil, but can be :not-provided.
My reason is that I think about the form presented to the user: did we ask for a middle name on it or not? If the form didn’t have a middle name input box, then I’d leave the key out: the form didn’t ask for a middle name, so there is no middle name key. If the form does have a middle name input box, then I want to know that, and I also want to know when the user didn’t enter anything in it. So if they don’t provide a middle name, I would make the value either nil, which in my system is interpreted as :not-provided, or a keyword or special type indicating that it was not provided (not a string, so that it can’t conflict with a user whose middle name really is "not-provided").
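As a rough sketch of that idea, the function below builds the data map from a form submission. The names `form->data`, `asked-fields`, and `params` are hypothetical, and treating a blank string as "nothing entered" is an assumption made for illustration.

```clojure
(require '[clojure.string :as str])

;; Hypothetical helper: `asked-fields` lists the inputs the form displayed,
;; `params` is the raw submitted map of field -> string.
;; Only fields the form asked for become keys; blank or missing input
;; for an asked field becomes :not-provided.
(defn form->data
  [asked-fields params]
  (into {}
        (map (fn [field]
               (let [v (get params field)]
                 [field (if (str/blank? v) :not-provided v)])))
        asked-fields))

;; Form with a middle name box that the user left empty:
(form->data [:first :middle :last]
            {:first "Martin" :middle "" :last "King"})
;; => {:first "Martin", :middle :not-provided, :last "King"}

;; Older form without a middle name box: no :middle key at all.
(form->data [:first :last]
            {:first "Martin" :last "King"})
;; => {:first "Martin", :last "King"}
```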
I find that knowing the difference between data that was looked for but not found and data that was asked for but not provided comes in handy sometimes, so I like to model it explicitly.
Another case is when you query something. If you tried to get a middle name from your DB or some other source and it was not found, then I’d rather get {:middle-name :not-found} or even {:middle-name nil} instead of {}. I’ll assume that if there is no :middle-name key, no attempt was made to acquire it, whereas if the key is there with nil or a keyword explaining why the value is missing, an attempt was made but yielded nothing. The keyword is even better because it can tell you why it yielded nothing, like :not-found or :not-provided, and, unlike nil, it is unlikely to appear by accident because of a bug.
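Here is a minimal sketch of how a consumer might tell these cases apart. The function `middle-name-status` and the status keywords `:never-attempted`, `:attempted-but-empty`, and `:present` are hypothetical names chosen for illustration. It uses `find` rather than `get`, because `get` returns nil both for a missing key and for a key holding nil.

```clojure
;; Hypothetical helper that classifies the state of :middle-name in a map.
(defn middle-name-status
  [m]
  (if-let [[_ v] (find m :middle-name)]   ; find returns nil only when the key is absent
    (cond
      (nil? v)     :attempted-but-empty   ; key present, no reason given
      (keyword? v) v                      ; e.g. :not-found or :not-provided
      :else        :present)              ; an actual middle name
    :never-attempted))                    ; key absent: nobody looked or asked

(middle-name-status {})                           ;; => :never-attempted
(middle-name-status {:middle-name nil})           ;; => :attempted-but-empty
(middle-name-status {:middle-name :not-found})    ;; => :not-found
(middle-name-status {:middle-name "Luther"})      ;; => :present
```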
This also has the advantage that with Spec for example, conform can tell the difference between your various schemas.
For example, imagine that a year ago the form didn’t have a :middle key, so the data from back then would look like:
{:first "Martin"
 :last "King"}
And eventually a middle name was added and now the data would look like:
{:first "Martin"
 :middle nil
 :last "King"}
You could then spec it like:
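(require '[clojure.spec.alpha :as s])

;; s/def needs a namespace-qualified keyword as the spec name
(s/def ::first-last-form
  (s/keys :req-un [::first ::last]))
(s/def ::first-last-middle-form
  (s/keys :req-un [::first ::last ::middle]))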
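And now, given some form data, s/conform can tell you which of these two specs it belongs to, without you needing to use nominal types. (Check the more specific spec first: s/keys maps are open, so data with a :middle key also satisfies the first-last spec.)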
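One way this could look in practice is to combine the two form specs with `s/or` and read the tag that `s/conform` returns. This is only a sketch: the value specs (`string?`, `s/nilable`), the combined `::form` spec, and the `form-kind` helper are assumptions added for illustration.

```clojure
(require '[clojure.spec.alpha :as s])

;; Assumed value specs, for illustration only.
(s/def ::first string?)
(s/def ::last string?)
(s/def ::middle (s/nilable string?))   ; key must be present, value may be nil

(s/def ::first-last-form
  (s/keys :req-un [::first ::last]))
(s/def ::first-last-middle-form
  (s/keys :req-un [::first ::last ::middle]))

;; s/or tries branches in order, so the more specific form goes first;
;; otherwise the open first-last spec would match both shapes.
(s/def ::form
  (s/or :first-last-middle ::first-last-middle-form
        :first-last        ::first-last-form))

;; Hypothetical helper: returns the tag of the matching branch.
(defn form-kind
  [data]
  (let [result (s/conform ::form data)]
    (if (s/invalid? result)
      :unknown
      (first result))))

(form-kind {:first "Martin" :last "King"})              ;; => :first-last
(form-kind {:first "Martin" :middle nil :last "King"})  ;; => :first-last-middle
```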
If you didn’t do it this way, you could not tell which type of form the data came from, unless you had a nominal typing scheme like:
{:type :first-last-middle
 :first "Martin"
 :last "King"}
This now tells you that even though there is no :middle key, this was a :first-last-middle form, so you similarly know that the user was asked for a middle name but entered nothing.
So in summary, I like my keys to tell me what fields my data is expected to have. Keys map to expectations of what data can be there: if the map represents user input, every key is something the user was asked to provide; if the map represents a DB result, the keys are the columns in my table. If the user was not asked for it, or the table doesn’t have the column, then the key wouldn’t be on the map.
And similarly, I like my values to tell me what values are on my data, where I consider the lack of a value to be a value too, one I’m normally okay using nil for. It means that for some reason the value isn’t there for this entry.
OK, one last thing: this is not like OO. I just want to point this out. In OO the opposite normally happens. You end up with a class that is the union of all possible keys over time, and then things are null but you don’t know why. Since in OO the object cannot represent “key not applicable”, you’re forced to have the field and put null in it. You can do better in OO as well and use subtyping to represent all the schema variants, but it’s a lot more work, people get lazy, and in practice this doesn’t happen.
The same goes for anything that cannot mix different schemas. SQL DBs suffer from this, since you can’t have old rows without a column. So again, you’re forced to make up some value for all these old rows and pretend that the user didn’t enter anything or entered some weird default (which is really weird if the column type is a Number).