It is a bit of a subtle difference, but a type isn’t exactly the same as a set of values. A type is a symbolic tag associated with some data, with implicit rules related to it as defined by the given type system that manages it.
For example, these are types in Clojure:
- java.lang.Long
- java.lang.String
- clojure.lang.PersistentVector
You can think of a type as a label, and typed data as a piece of data with an accompanying type label.
What happens next is that, the type of some data (aka data type) will be tracked by the type system. And its specific type will define implicit rules the type system can rely on, for example, it can define the operations that are permitted to manipulate the data, the semantic meaning of the data, the memory encoding for the data, relations to other types in the type system, etc.
Now here is where things get a bit tricky. A type can be used to restrict the set of values from which a variable, function, form, etc. may take its value, but a type is not a set of values and it doesn’t need to be.
Let me try to simplify this. Here’s a piece of data using Clojure’s EDN notation:
{:name "John"
:age 12}
You can consider this to be an untyped piece of data. Now, say we wanted to add types to it, here’s one way to do it:
{:name "John"
:age 12
:type :person}
This is now typed data, and :person
is now a type. That’s all a type is. Now, the type system can choose to use this type in any way it wants. So you can imagine for example, it could validate that only data of a certain type is provided to a given function, for example:
(defn kill-person [person]
{:pre [(= :person (:type person))]
:post [(= :person (:type %))]}
(assoc person :dead? true))
Now in a real type system, the type is at a much more low level. Untyped data are bits in memory. And the type system for any known section of bits will track the type of those bits. Thus that will specify how the bits are to even be encoded/decoded and stored. And then things grow out of that.
The relation to a set is thus dependent on the type system in use, and it is implicit. The type doesn’t actually list out the possible values, and it doesn’t even necessarily have a predicate that it can use to validate if the data actually fits the implicit assumptions of a particular type, though most type system will, at least for primitive types.
So, if you followed along, you can see that a type doesn’t necessarily define a set. So checking if a type is a subtype of another doesn’t need to perform a subset operation. It can just check the type information provided by the type system. For example:
{:value "c"
:type :char}
{:value "cool"
:type :string}
Now the type system might define the following subtyping relation:
(def subtype
{:char :string})
And now if you asked (isa? :char :string)
the type system can just look in the subtype map to see if it is true or false. No set operations are involved.
So now in Clojure (JVM), the first type system involved is the Java type system. This type system is used to define encodings for the memory bits in use by the program, so you have all the primitive types like:
- int
- long
- char
- double
- array
- etc.
Those types define the memory layout and the decoding scheme (like big endian). They are also used to validate and specify the possible values they represent. That’s where you could say that there exist a set of values that maps to the specific type, but that set existing, does not mean that the type itself is that set. It is more like a symbolic representation of that set. And it doesn’t even mean there are subtyping relations if their respective sets were to be subset of each other. For example, a long is not necessarily an int. You can’t use an int in place to something that needs a long without converting one to the other. That’s because their encodings aren’t compatible, an int is not just the subset of longs from -2,147,483,648 to 2,147,483,647. So even though the set of value they represent have a subset relation, the types are not compatible.
Types in Clojure are thus closely related to types in Java. Each type being either a primitive type, or a class/interface type. Like in Java, classes/interfaces can extend/implement others, thus defining subtyping relations. You can create new JVM types using deftype and defprotocol. The rules of the Java type system will apply to them. Protocols extend that beyond what the Java type system allows, by making type relations definable dynamically at runtime as well, and so that they can be declared outside the type definition itself. Java determines if a given type isa subtype of another purely through these relationship definition. In real Java, you’d also have generic types, which have more complex rules around determining subtyping relations, those don’t exist in Clojure.
Clojure also has another type system which is the derive
based one. This type system is a lot like the one I just demonstrated in my examples above. You can kind of use it to build your own custom type systems on top of it. It integrates with the Java one, because Java types can be defined as subtypes, so you can for example:
(derive java.lang.LinkedList ::collection)
Nothing uses this by default. Though isa?
, parents
, ancestors
, descendants
, etc., all integrate with it. The most common use case is probably to use it in combination with multi-methods for polymorphic dispatch.
No it can’t. And like I said, types won’t necessarily map back and forth. That said, it isn’t impossible, there can exist a relation from predicates to types and back. This is what spectrum tries to do: GitHub - arohner/spectrum
Predicates, sets and types?
Alright, so now, all of these concepts can overlap in some ways and have a lot of intersect. Yet, none of them have universal formal definitions as well. So each one must be taken within the context of the theory or concrete system they are being used in.
So if:
- A set SX is a distinct collection of values.
- A predicate PX can be written to return true if X is an element of the set SX.
- A type TX can be created whose definition is that it restricts data to only those which are in the set SX or for which PX returns true.
You can see how there is some possibility of a correspondence here. That said, depending which of those three things you actually have, you can’t do the same things, and you can’t necessarily easily derive the others.
What about core.spec?
Ok, I’ll finish by talking about core.spec. This system exposes predicates, and lets the user define predicates. I hope I managed to explain how predicates are not types, and neither are they sets. Given a predicate, it isn’t trivial (or maybe even possible not sure), to figure out if any of the Java types would correspond to it or be a subtype of it if it were to have a corresponding type.
You could define corresponding types yourself, and corresponding relations if you wanted too. But I don’t know if one can derive those programmatically or even mathematically. Like:
(s/def ::foo (s/int-in 20 50))
;; Now let us also assume the type ::foo, which is the type of all values
;; that validate when tested against the spec ::foo
(s/def ::bar (s/int-in 30 40))
;; Now let us also assume the type ::bar, which is the type of all values
;; that validate when tested against the spec ::bar
;; Now, how would we know what relations exist between ::foo and ::bar?
;; Could we infer them? Maybe, but more easily you could just specify them:
(derive ::bar ::foo)
(isa? ::bar ::foo)
;;=> true
And if you wanted to grow a type system over that, for every function whose spec say that the return or any of the arguments should be a valid ::foo or ::bar spec, you can also try to type check their corresponding types.
If you wanted to infer or derive complex relationship from the specs to their types and between them, you most likely would face the halting problem, as I’m pretty sure it isn’t feasible (unless you restrict yourself to only certain predicate or use some non-turing complete predicate system), but with some level of provided annotation it might be doable.