Showing posts with label ocaml. Show all posts
Showing posts with label ocaml. Show all posts

2008-02-28

Fun with the Value Restriction

This is not a valid OCaml program:

type 'a endo = Endo of int * ('a -> 'a)
let poly = Endo ((succ 23),(fun x -> x))
type thing = { u : unit }
let mono = Endo (0,(fun { u = u } -> { u = u }))
let _ = [mono;poly]


The typechecker rejects it, with this error message:

File "vrmh.ml", line 5, characters 14-18:
This expression has type 'a endo but is here used with type thing endo
The type constructor thing would escape its scope


Removing the last line makes it typecheck. As does exchanging the second and third lines, thusly:

type 'a endo = Endo of int * ('a -> 'a)
type thing = { u : unit }
let poly = Endo ((succ 23),(fun x -> x))
let mono = Endo (0,(fun { u = u } -> { u = u }))
let _ = [mono;poly]


Yes. Notice the mysterious action-at-a-distance. Notice also how the type inferred for poly (see ocamlc -i) is not polymorphic at all, but rather thing endo, the same as mono. Notice further that it would not be possible to define it with an explicit ascription of that type before the definition (i.e., outside the scope) of thing itself. Apparently this is bad.

And then there's the reason why the type of poly didn't have its type variable generalized, thus leaving it with the type '_a endo and subject to unification with the type of mono. This is, of course, the value restriction of ML fame, which I will not explain here; but it's a little more confusing (read: I spent far too much time trying to come up with the mostly minimal test case seen here) (so, yes, this actually is something I ran into in a halfway-real program) in OCaml, where it's been relaxed in certain ways. But not others. For example, this:

type 'a endo = Endo of int * ('a -> 'a)
let poly = Endo (24,(fun x -> x))
type thing = { u : unit }
let mono = Endo (0,(fun { u = u } -> { u = u }))
let _ = [mono;poly]


Is fine.

2007-10-12

Fun with Polymorphic Variants

ML variants. Wonderful stuff:

# type color = Red | Yellow | Green | Cyan | Blue | Magenta | Other of int * int * int;;

OCaml has polymorphic variants, where you don't need to bother with declaring types like that:

# `Foobar;;
- : [> `Foobar ] = `Foobar


It's almost like it's dynamically typed, except it can't go wrong. Of course, when you do go to define types, you have to deal with the type variables hidden in abbreviations like the above, but that's not the point. Nor is this unfortunate limitation of the type sytem:

# function `Foo i -> `Foo (string_of_int i) | x -> x;;
This expression has type [> `Foo of int ] but is here used with type
  [> `Foo of string ]
Types for tag `Foo are incompatible


No, this post is actually about representations. Since OCaml is statically typed to the point of having no runtime type information, it can represent a variant case with no data using the same bits it would use for an int. And it does this thing — meaning that 0, false, the empty list and so on are all equal; but, unlike in Lisp, there's no way to know that without breaking the type system, as the comparison operators work only on two things of like type.

So, break the type system; you could use Obj.repr and Obj.is_int to verify what I just said, or:

# external unsafe_to_int : 'a -> int = "%identity";;
external unsafe_to_int : 'a -> int = "%identity"
# List.map unsafe_to_int [Red;Yellow;Green;Cyan;Blue;Magenta];;
- : int list = [0; 1; 2; 3; 4; 5]


Well, what about polymorphic variants? You can use different, non-disjoint sets of tags in different places, and it somehow has to work. This is somehow:

# unsafe_to_int `Foobar;;
- : int = 807339693


Gee. I wonder how it does that...

# unsafe_to_int `A;;
- : int = 65
# unsafe_to_int `B;;
- : int = 66
# unsafe_to_int `AA;;
- : int = 14560
# unsafe_to_int `AB;;
- : int = 14561
# unsafe_to_int `BA;;
- : int = 14783


Oh look. It's a simple polynomial hash function. At this point, your first thought should be approximately this:

# `Uctakw == `Uctakw;;
- : bool = true
# `Uctakw == `Foobar;;
- : bool = false
# `Uctakw == `Saazaa;;
Variant tags `Uctakw and `Saazaa have same hash value. Change one of them.