in the past week or so, as mentioned (in brief) previously, i've been learning a lot about the development of how we came to understand the second-order lambda calculus, data abstraction, and eventually modern-day module systems. as a frequent SML user, i think i have (and had before this week) a good intuitive grasp on how the module system works, but only recently have i come to have a deeper understanding of what's under the hood, logically speaking, and how it got that way.
i want to try to explain just the surface-scratching essentials of this logical and historical gem, as an expository exercise. i'm primarily interested in telling you about the dialogue between mitchell-plotkin and macqueen in the two papers linked, but i may also involve (and contrast) the modern take in f-ing modules*, which we discussed in RG yesterday. all of these papers are quite good, so read them if you want, but with any luck you won't need to refer to them for my explanation to make sense.
first let's just talk about first-order logic, particularly existential quantifiers. so you have propositions that look like ∃x:τ.P(x) (where τ might be some arbitrary object domain, like natural numbers), and the rule you might have learned for coming up with such a thing might be
t : τ    P(t)
-------------
 ∃x:τ.P(x)
and for using such a thing might be
∃x:τ.P(x)    x:τ, P(x) |- C
---------------------------
            C
for any proposition C that's closed with respect to x. this is the formulation under which you can think of an existential as an infinitary "or"; specifically, its closed-scope elim is similar to disjunction's:
A v B    A |- C    B |- C
-------------------------
            C
but there's another way you might like to view these things: after all, the way you came up with it looks an awful lot like you supplied two things, a term and a proof that the predicate holds of it. but when you *use* it, you don't get to know what the term is, you just get that it has type τ and that P holds of it, and you work parametrically over that. you don't have direct access to the witness of the existential. what if instead we viewed a proof of this thing as a pair, and gave it elimination rules to project out the components? we can do this with proof terms:
M : ∃x:τ.P(x)
-------------
  π1 M : τ

M : ∃x:τ.P(x)
---------------------
π2 M : [π1 M/x] P(x)
these two different formulations of ∃ are called "weak sums" and "strong sums", respectively. strong sums cough up more from the existential, but they're messier to deal with because now you have to worry about substituting proof terms into types.
(tangentially, if you replace τ with a metavariable ranging over propositions, you've got dependent types in the usual sense (props depending on proofs of props), and you could tell a similar story there, but i wanted to start with something you might have seen in a vanilla intro logic course, and afaict that's usually how they do it.)
now fast-forward to the early 70s and 80s: people are motivated by programming languages much more than before. this gives rise to Girard's and Reynolds' independent discovery of essentially the same calculus: second-order logic, or the polymorphic lambda calculus. you get this if you take first-order logic and now let the quantifier-bound variables range over types (AKA propositions) instead of terms. with ∀, this results in polymorphism, e.g. "∀a.a → a" is the type of the polymorphic identity function.
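(in proof-term language, the identity is the term that takes a type and then a value of that type and hands it right back,

Λa. λx:a. x  :  ∀a. a → a

and ML's fn x => x, of type 'a -> 'a, is the familiar prenex-polymorphic version of the same thing.)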
with ∃, it turns out you can describe data abstraction. for example "∃a.a×a → bool" might be the type "comparable", inhabited by anything that has a comparison function, or "∃a.(real×real → a) × (a → real) × (a → real)" might be the type describing a geometric point, which has a function for making such a thing out of two real coordinates, and ways of getting the xcoord and ycoord out of it, but the implementation is left abstract.
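to make the connection to modules concrete: in SML, that point existential reads as a signature, and a structure matching it supplies the witness. (the names POINT, mk, getx, gety here are made up purely for illustration.)

signature POINT =
sig
  type a                        (* the existentially bound type *)
  val mk   : real * real -> a
  val getx : a -> real
  val gety : a -> real
end

(* one possible witness: a point is just a pair of reals *)
structure CartesianPoint : POINT =
struct
  type a = real * real
  fun mk (x, y) = (x, y)
  fun getx (x, _) = x
  fun gety (_, y) = y
end

(with the plain ascription ": POINT" above, nothing is hidden yet; we'll get to that.)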
the question of import is: are these things strong sums or weak sums?
mitchell and plotkin say they are weak sums. this means the proof terms and their typing rules look like
T type    M : [T/a]B
-----------------------
pack[∃a.B](T, M) : ∃a.B

M : ∃a.B    a type, x:B |- N : C
--------------------------------
  unpack (a, x) as M in N : C
where, again, C is not allowed to mention the existentially bound thing (a).
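(as a quick sanity check on that restriction: unpack (a, x) as M in 17 is fine at type int, since int doesn't mention a, but unpack (a, x) as M in x is ruled out: its type would have to be B, which mentions a, and a means nothing outside the unpack.)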
they explicitly argue against the idea of "projecting" out the first component -- that is, the type component -- quite correctly, because that would defeat the very purpose, i.e. abstraction. as soon as you can project out the type component used to implement the interface expressed by the existential, then you're no longer abstract in it. what's more, the naive change to the system would actually render it inconsistent; you can get type:type with impredicative strong sums.
but if you think about it a little harder, the weak existentials seem awfully restrictive: the type is forgotten, and you can no longer refer to it at all except inside the scope of a completely hypothetical thing. in short, this isn't modular. if you define a module somewhere, shrinkwrap it, and send it to your coworker, then he has to open it at the very top-level of his code to get at the existence of its type component. and if he ever opens it a second time, e.g.
unpack (a, x) as M in (unpack (b, y) as M in [...])
does not yield that a=b or x=y. moreover, if you ever have *several* modules you want to use, you have to open them *all* at the top level, so now you've got namespace collisions to worry about; you've forgotten that you had a bunch of cleanly separated program components and just thrown all their guts into the same bag o' code.
furthermore, sometimes you *do* want to leak your implementation (i.e. expose the witness type to the client), but you still want to give it an alias to use consistently in your module (like
type point = real * real
).

so along comes macqueen and says, "we really would like these to look more like strong sums." so he invents this stratified dependent type theory mumbo jumbo that solves the type:type problem and describes a calculus with projections witness and out to get a hold of the type and term components of the module.

but the original objection still holds: this may achieve modularity, but it sacrifices abstraction. neither author seems to fully come clean with (or maybe even fully understand) the fact that there are sacrifices on both sides, and neither offers an alternative to the strengths of the other's work.
it was only much later that we hit upon the notion of sealing. if you essentially treat a module as a strong sum, but you seal it at a particular signature (existential type), then you hide the implementation... with the caveat that sealing is a static effect, so you don't get the equality described above (e.g. if you write down a structure and seal it, then do the same identical thing again, the two sealed structures are not equal, nor are their components)... unless you declare it somewhere further out with a name you can refer to. what's more, you can have transparent ascription that allows you to seal a module, but blab the secret identity of some of its type components in the signature you seal it at, thereby making it accessible to the rest of the world. the formalism all this fits in is, uh, beyond the scope of this post? maybe next time. but it's all in the ML module system that we (or i, at least) tend to take completely for granted.
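in SML syntax, reusing the POINT signature and CartesianPoint structure sketched above (the rest of the names are again made up):

structure P1 :> POINT = CartesianPoint
structure P2 :> POINT = CartesianPoint
(* sealing the same structure twice: P1.a and P2.a are distinct
   abstract types, even though both are real * real under the hood *)

structure P :> POINT = CartesianPoint
(* but everything that goes through the one name P shares the one type P.a *)

signature XYPOINT = POINT where type a = real * real
structure Q :> XYPOINT = CartesianPoint
(* still sealed, but the signature blabs that Q.a is real * real,
   so clients are allowed to rely on that *)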
so, not to get all "respect your elders" about it, but i hope this lends to you, as it did to me, a greater appreciation for the way these simple little logical tools of the past grew up into big powerful module systems -- with equally sexy but much more modern type theory -- via these first glimpses at the tension between modularity and abstraction.
*looks like i didn't get to this. if i do a sequel i will probably talk about it as well as phase splitting. should i do that or was this boring/badly explained? moar y/n?
Date: 2009-11-19 12:34 am (UTC)

Also, all the module calculi I've ever encountered strike me as tragic in the very specific sense that the ML module system seems very intuitive* (I mean, I learned almost everything I know about it from my 212 exam in 20 minutes...:-P) but the calculi to describe it are very complicated. But whenever I sit down to think about module formalizations, it becomes apparent that most of the complication is necessary. (I should read the F-ing modules paper again.)
*modulo a few things about whether functor application is generative (I think SML and OCaml may do this differently?) and the whole diamond-inclusion, sharing fiasco**,***.
** I really like "with type" in OCaml, which I /think/ is equivalent to "where type" in SML in that it lets you punch a hole in a signature by saying "S with type t = int" to produce a new signature which is weaker than S (where S is a sig or a signature variable). I guess this is an intuitive solution to that problem.
*** and also datatype copying!!!
Date: 2009-11-19 12:37 am (UTC)

I have exactly the same feeling, and this is why I'm so excited by the F-ing modules paper --- it seems to explain the whole mess reasonably simply, in contemporary language, and is only about as complicated as you'd think it should be given that you've programmed in the language.
Date: 2009-11-19 06:00 am (UTC)

william confirms that with type is exactly the same as where type. and yeah, it solves the diamond problem and makes the weird "sharing" construct redundant.
"it becomes apparent that most of the complication is necessary."
i think it really depends what you mean by necessary.
argh so sleepy articulation power.. fading...
Date: 2009-11-19 06:09 am (UTC)

Is there a good technical reason functor application should be generative? I've been using OCaml for a while, and applicative functors feel very natural. (ETA: there are some weird corner cases where proving that the types that result from two different applications of the same functor are the same fails for mysterious reasons, though. I *think* I once found a mailing list post by Xavier mentioning that there was a better way to do this than they did, but I do not remember the details and I think it was unclear to me at the time.)
Date: 2009-11-19 06:49 am (UTC)

The real question is, given some functor F and some module M, and the following code,
module A = F(M)
module B = F(M)
does A.t = B.t?
So, even if application has effects, I think functor application can still be applicative under this definition.
Date: 2009-11-19 06:53 am (UTC)

functor F(M : S) =
struct
datatype t = Foo
val v = ref ()
end
is effectful, but we can still ask if the functor application is applicative in the same way as I said above: i.e., is type t in any module created by applying this functor equal to type t in any other.
Date: 2009-11-19 07:21 am (UTC)

Yes: sometimes it's what you want. Suppose the result signature of a functor has some abstract type t which, under the hood, is represented by some stateful thing like a reference, or keys into a lookup table, or something like that. The implementor of such a functor might reasonably hope that clients won't be able to get t's from some other instantiation of the functor which -- although its type component is equal -- has a different reference or a different lookup table on the inside.
Here's a simple example illustrating the problem:
Now suppose we say
Now certainly we both agree that M.t and N.t are equal types, namely both int ref, but they have different invariants, since they are inhabited by different int ref values! But i can do all sorts of nefarious things like N.get M.x and M.inc N.x, and in a more involved setting like the lookup table keys i alluded to, in some cases this violation of abstraction might result in a run-time error.

As far as i can tell, there is no way to obtain implementor-side guarantees like this from O'Caml's applicative functors. Interestingly, you can always make the two types actually different by sealing the two modules, e.g.:
but this pushes the resolution to the client-side -- i.e. the client of the module has to realize the extra sealing is necessary to ensure the proper invariants are maintained. If you can find a way to enforce this guarantee on the implementor's side in O'Caml, i'd love to see it!
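In SML, by contrast, functor application is generative, so the implementor gets this guarantee without any cooperation from the client; roughly (with made-up names):

signature COUNTER =
sig
  type t
  val x   : t
  val get : t -> int
  val inc : t -> unit
end

functor Counter () :> COUNTER =
struct
  type t = int ref
  val x = ref 0
  fun get r = !r
  fun inc r = r := !r + 1
end

structure M = Counter ()
structure N = Counter ()

(* M.t and N.t are distinct abstract types, so nefarious things like
   N.get M.x and M.inc N.x are rejected by the type checker *)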
Date: 2009-11-19 07:55 am (UTC)

Alternatively, if one could be encoded in terms of the other but not vice versa, this would be a compelling argument for one over another. But I think this isn't (easily) true (as you kind of point out).
I'm sure there are cases where one would want applicative functors, too. I know I did last summer at Janest, I just can't remember why. I think it had to do with wanting to use both A.t and B.t (say) as keys to map.
Date: 2009-11-19 08:47 am (UTC)

I have a vague recollection in the back of my head that in some module calculus or another you could encode generative functors as applicative functors followed by sealing. I just couldn't figure out a way to get this to work in O'Caml -- it seems to do everything in its power to keep you from sealing the result of a functor :)
Date: 2009-11-19 12:34 am (UTC)

For instance, I didn't know impredicative strong sums were inconsistent. Do you remember how the proof goes or where it comes from?
During "fast-forward through the 70s and 80s" I have a mental image of John Reynolds standing still and rapidly changing hairstyles and clothes, with a pipe sometimes appearing and disappearing. It is basically the best mental image.
Also:
Historically, Reynolds and Girard didn't discover second-order logic itself, did they? But rather its Curry-Howard analogue? I am actually not sure about this but my gut says higher order logic itself was an early 20th century thing.
Also also:
Obviously with focusing-colored glasses on, strong sums look like the "negative version" of existentials. What is the deal with this? Most recently I tend to believe that Σ just is positive, and blame its unfriendliness of negative-Σ-with-projections with spine form on a focusing error, and the difficulties of strong sums seem to corroborate that. And yet "F-ing modules" works! That is, at least there's an elaboration that makes positive sums look negative. Is this telling us anything interesting about what we can do with other positive types?
Date: 2009-11-19 12:45 am (UTC)

Matthew took the Which logical system are you? quiz and the result is Full Second-Order Logic (Standard Semantics)
Expressive power, it is said, is the greatest aphrodisiac—and you have it in spades. People tend to find you intensely attractive, but often also a bit scary. Some of them think you’re definitely too much to handle. Characterizing the intended model of arithmetic up to isomorphism is just an average night out for you; you’ve been known to formulate sentences equivalent to the continuum hypothesis on the second date. Those who don’t know you sometimes confuse you with your rather shy sibling—the one with the Henkin semantics and the stammer—but they quickly realize their error upon meeting you. Lately, however, you’ve been worrying that you’re becoming respectable: at times you secretly long for the old days when scandalized Quineans, convinced that your relationship with set theory was unnatural and corrupting, felt obliged to warn their children against you.
On a more serious note, yeah, it was definitely earlier on. I could have sworn it shows up in Goedel's work, but I'm not seeing it.
Date: 2009-11-19 01:07 am (UTC)

well, the reference is "impredicative strong existential equivalent to type:type" by hook and howe. i've not looked at it, but william and i tried (and i think nearly succeeded) to reconstruct it based on karl's hint in class that it has something to do with being able to code up "type" as an existential... so something like ∃a.top, because you can take any type, pack it up with (), and now it inhabits that type.
Date: 2009-11-19 06:06 am (UTC)

i would speculate based on possible (mis)understandings of things people were saying at RG, but i think i should sleep first.
Date: 2009-11-19 05:51 am (UTC)

Most grad students' experiences writing such things on personal blogs seem to be "agh why are you talking about that???"—you have a very cooperative readership! (This is of course generally true, but.)
It's really awesome that you have this "classic papers" course going on. It's very annoying that in most graduate work you're sorta expected to know the foundational papers but expected to be using your time more wisely than actually reading them. Somehow I need to figure out the Important Pre-OT Phonology that I need to read...
Date: 2009-11-19 06:08 am (UTC)

Most grad students' experiences writing such things on personal blogs seem to be "agh why are you talking about that???"
really? what blogs? i don't think i've ever seen that. usually if people are bored by a topic they just won't comment.
yeah, i'm really happy classic papers came into being, for exactly the reason you say.
Date: 2009-11-19 03:05 pm (UTC)

Mostly like LJs as opposed to full blogs for this purpose, of course. It's pretty surprising that people don't just not comment, but I guess it's some manifestation of LJ's "forcing you to read" in the form of Friends pages?
Totally going to try asking my advisor today if there's some canonical list of things I minimally should read. I am Inspired.
Date: 2009-11-19 03:53 pm (UTC)

So... um, I guess a proposition returns a boolean, right? So you're dividing lots-of-spaces-operator(t-of-type-tau, undefined-function) by a boolean, and I feel like that's probably a type error.
This isn't going to stop me from commenting, mind. ^_^
Date: 2009-11-19 08:42 am (UTC)

∃x:τ.P(x) just means: P is some statement in which x is used, and there exists some x for which P is true. τ is the type of x, which is really important in the type theory, but you can think of it as sorta-kinda just specifying a set of elements from which we know the x is chosen.
Date: 2009-11-19 12:07 pm (UTC)

You may have seen "there exists an x such that P(x)" written
(∃x)P(x)
and "for all x such that P(x)"
(∀x)P(x)
or sometimes even in very old-fashioned math as
(x)P(x)
but we tend to write
∃x.P(x)
∀x.P(x)
as part of a general notational habit of writing a dot between any bound variable being introduced on the left of the dot, and the scope in which it is allowed to be mentioned on the right. Like, we also write
λx.E(x)
for the function that takes an argument x, and returns E(x).