in the past week or so, as mentioned (in brief) previously, i've been learning a lot about the development of how we came to understand the second-order lambda calculus, data abstraction, and eventually modern-day module systems. as a frequent SML user, i think i have (and had before this week) a good intuitive grasp on how the module system works, but only recently have i come to have a deeper understanding of what's under the hood, logically speaking, and how it got that way.
i want to try to explain just the surface-scratching essentials of this logical and historical gem, as an expository exercise. i'm primarily interested in telling you about the dialogue between mitchell-plotkin and macqueen in the two papers linked, but i may also involve (and contrast) the modern take in f-ing modules*, which we discussed in RG yesterday. all of these papers are quite good, so read them if you want, but with any luck you won't need to refer to them for my explanation to make sense.
first let's just talk about first-order logic, particularly existential quantifiers. so you have propositions that look like ∃x:τ.P(x) (where τ might be some arbitrary object domain, like natural numbers), and the rule you might have learned for coming up with such a thing might be
t : τ    P(t)
-------------
 ∃x:τ.P(x)
and for using such a thing might be
∃x:τ.P(x)    x:τ, P(x) |- C
---------------------------
            C
for any proposition C that's closed with respect to x. this is the formulation under which you can think of an existential as an infinitary "or"; specifically, its closed-scope elim is similar to disjunction's:
A v B    A |- C    B |- C
-------------------------
            C
but there's another way you might like to view these things: after all, the way you came up with it looks an awful lot like you supplied two things, a term and a proof that the predicate holds of it. but when you *use* it, you don't get to know what the term is, you just get that it has type τ and that P holds of it, and you work parametrically over that. you don't have direct access to the witness of the existential. what if instead we viewed a proof of this thing as a pair, and gave it elimination rules to project out the components? we can do this with proof terms:
M : ∃x:τ.P(x)
-------------
  π1 M : τ

M : ∃x:τ.P(x)
---------------------
π2 M : [π1 M/x] P(x)
these two different formulations of ∃ are called "weak sums" and "strong sums", respectively. strong sums cough up more from the existential, but they're messier to deal with because now you have to worry about substituting proof terms into types.
(tangentially, if you replace τ with a metavariable ranging over propositions, you've got dependent types in the usual sense (props depending on proofs of props), and you could tell a similar story there, but i wanted to start with something you might have seen in a vanilla intro logic course, and afaict that's usually how they do it.)
now fast-forward to the early 70s and 80s: people are motivated by programming languages much more than before. this gives rise to Girard's and Reynolds' independent discovery of essentially the same calculus: second-order logic, or the polymorphic lambda calculus. you get this if you take first-order logic and now let the quantifier-bound variables range over types (AKA propositions) instead of terms. with ∀, this results in polymorphism, e.g. "∀a.a → a" is the type of the polymorphic identity function.
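(in proof-term language, the identity is the term that takes a type and then a value of that type and hands it right back,

Λa. λx:a. x  :  ∀a. a → a

and ML's fn x => x, of type 'a -> 'a, is the familiar prenex-polymorphic version of the same thing.)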
with ∃, it turns out you can describe data abstraction. for example "∃a.a×a → bool" might be the type "comparable", inhabited by anything that has a comparison function, or "∃a.(real×real → a) × (a → real) × (a → real)" might be the type describing a geometric point, which has a function for making such a thing out of two real coordinates, and ways of getting the xcoord and ycoord out of it, but the implementation is left abstract.
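to make the connection to modules concrete: in SML, that point existential reads as a signature, and a structure matching it supplies the witness. (the names POINT, mk, getx, gety here are made up purely for illustration.)

signature POINT =
sig
  type a                        (* the existentially bound type *)
  val mk   : real * real -> a
  val getx : a -> real
  val gety : a -> real
end

(* one possible witness: a point is just a pair of reals *)
structure CartesianPoint : POINT =
struct
  type a = real * real
  fun mk (x, y) = (x, y)
  fun getx (x, _) = x
  fun gety (_, y) = y
end

(with the plain ascription ": POINT" above, nothing is hidden yet; we'll get to that.)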
the question of import is: are these things strong sums or weak sums?
mitchell and plotkin say they are weak sums. this means the proof terms and their typing rules look like
T type    M : [T/a]B
-----------------------
pack[∃a.B](T, M) : ∃a.B

M : ∃a.B    a type, x:B |- N : C
--------------------------------
  unpack (a, x) as M in N : C
where, again, C is not allowed to mention the existentially bound thing (a).
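(as a quick sanity check on that restriction: unpack (a, x) as M in 17 is fine at type int, since int doesn't mention a, but unpack (a, x) as M in x is ruled out: its type would have to be B, which mentions a, and a means nothing outside the unpack.)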
they explicitly argue against the idea of "projecting" out the first component -- that is, the type component -- quite correctly, because that would defeat the very purpose, i.e. abstraction. as soon as you can project out the type component used to implement the interface expressed by the existential, then you're no longer abstract in it. what's more, the naive change to the system would actually render it inconsistent; you can get type:type with impredicative strong sums.
but if you think about it a little harder, the weak existentials seem awfully restrictive: the type is forgotten, and you can no longer refer to it at all except inside the scope of a completely hypothetical thing. in short, this isn't modular. if you define a module somewhere, shrinkwrap it, and send it to your coworker, then he has to open it at the very top-level of his code to get at the existence of its type component. and if he ever opens it a second time, e.g.
unpack (a, x) as M in (unpack (b, y) as M in [...])
does not yield that a=b or x=y. moreover, if you ever have *several* modules you want to use, you have to open them *all* at the top level, so now you've got namespace collisions to worry about; you've forgotten that you had a bunch of cleanly separated program components and just thrown all their guts into the same bag o' code.
furthermore, sometimes you *do* want to leak your implementation (i.e. expose the witness type to the client), but you still want to give it an alias to use consistently in your module (like
type point = real * real
).

so along comes macqueen and says, "we really would like these to look more like strong sums." so he invents this stratified dependent type theory mumbo jumbo that solves the type:type problem and describes a calculus with projections witness and out to get a hold of the type and term components of the module.

but the original objection still holds: this may achieve modularity, but it sacrifices abstraction. neither author seems to fully come clean with (or maybe even fully understand) the fact that there are sacrifices on both sides, and neither offers an alternative to the strengths of the other's work.
it was only much later that we hit upon the notion of sealing. if you essentially treat a module as a strong sum, but you seal it at a particular signature (existential type), then you hide the implementation... with the caveat that sealing is a static effect, so you don't get the equality described above (e.g. if you write down a structure and seal it, then do the same identical thing again, the two sealed structures are not equal, nor are their components)... unless you declare it somewhere further out with a name you can refer to. what's more, you can have transparent ascription that allows you to seal a module, but blab the secret identity of some of its type components in the signature you seal it at, thereby making it accessible to the rest of the world. the formalism all this fits in is, uh, beyond the scope of this post? maybe next time. but it's all in the ML module system that we (or i, at least) tend to take completely for granted.
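in SML syntax, reusing the POINT signature and CartesianPoint structure sketched above (the rest of the names are again made up):

structure P1 :> POINT = CartesianPoint
structure P2 :> POINT = CartesianPoint
(* sealing the same structure twice: P1.a and P2.a are distinct
   abstract types, even though both are real * real under the hood *)

structure P :> POINT = CartesianPoint
(* but everything that goes through the one name P shares the one type P.a *)

signature XYPOINT = POINT where type a = real * real
structure Q :> XYPOINT = CartesianPoint
(* still sealed, but the signature blabs that Q.a is real * real,
   so clients are allowed to rely on that *)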
so, not to get all "respect your elders" about it, but i hope this lends to you, as it did to me, a greater appreciation for the way these simple little logical tools of the past grew up into big powerful module systems -- with equally sexy but much more modern type theory -- via these first glimpses at the tension between modularity and abstraction.
*looks like i didn't get to this. if i do a sequel i will probably talk about it as well as phase splitting. should i do that or was this boring/badly explained? moar y/n?
Date: 2009-11-19 12:34 am (UTC)

Also, all the module calculi I've ever encountered strike me as tragic in the very specific sense that the ML module system seems very intuitive* (I mean, I learned almost everything I know about it from my 212 exam in 20 minutes...:-P) but the calculi to describe it are very complicated. But whenever I sit down to think about module formalizations, it becomes apparent that most of the complication is necessary. (I should read the F-ing modules paper again.)
*modulo a few things about whether functor application is generative (I think SML and OCaml may do this differently?) and the whole diamond-inclusion, sharing fiasco**,***.
** I really like "with type" in OCaml, which I /think/ is equivalent to "where type" in SML in that it lets you punch a hole in a signature by saying "S with type t = int" to produce a new signature which is weaker than S (where S is a sig or a signature variable). I guess this is an intuitive solution to that problem.
*** and also datatype copying!!!
Date: 2009-11-19 12:37 am (UTC)

I have exactly the same feeling, and this is why I'm so excited by the F-ing modules paper --- it seems to explain the whole mess reasonably simply, in contemporary language, and is only about as complicated as you'd think it should be given that you've programmed in the language.
Date: 2009-11-19 06:00 am (UTC)

william confirms that with type is exactly the same as where type. and yeah, it solves the diamond problem and makes the weird "sharing" construct redundant.
"it becomes apparent that most of the complication is necessary."
i think it really depends what you mean by necessary.
argh so sleepy articulation power.. fading...
Date: 2009-11-19 06:09 am (UTC)

Is there a good technical reason functor application should be generative? I've been using OCaml for a while, and applicative functors feel very natural. (ETA: there are some weird corner cases where proving that the types that result from two different applications of the same functor are the same fails for mysterious reasons, though. I *think* I once found a mailing list post by Xavier mentioning that there was a better way to do this than they did, but I do not remember the details and I think it was unclear to me at the time.)
Date: 2009-11-19 06:49 am (UTC)

The real question is, given some functor F and some module M, and the following code,
module A = F(M)
module B = F(M)
does A.t = B.t?
So, even if application has effects, I think functor application can still be applicative under this definition.
Date: 2009-11-19 06:53 am (UTC)

functor F(M : S) =
struct
datatype t = Foo
val v = ref ()
end
is effectful, but we can still ask if the functor application is applicative in the same way as I said above: i.e., is type t in any module created by applying this functor equal to type t in any other.
Date: 2009-11-19 07:21 am (UTC)

Yes: sometimes it's what you want. Suppose the result signature of a functor has some abstract type t which, under the hood, is represented by some stateful thing like a reference, or keys into a lookup table, or something like that. The implementor of such a functor might reasonably hope that clients won't be able to get t's from some other instantiation of the functor which -- although its type component is equal -- has a different reference or a different lookup table on the inside.
Here's a simple example illustrating the problem:
Now suppose we say
Now certainly we both agree that M.t and N.t are equal types, namely both int ref, but they have different invariants, since they are inhabited by different int ref values! But i can do all sorts of nefarious things like N.get M.x and M.inc N.x, and in a more involved setting like the lookup table keys i alluded to, in some cases this violation of abstraction might result in a run-time error.

As far as i can tell, there is no way to obtain implementor-side guarantees like this from O'Caml's applicative functors. Interestingly, you can always make the two types actually different by sealing the two modules, e.g.:
but this pushes the resolution to the client-side -- i.e. the client of the module has to realize the extra sealing is necessary to ensure the proper invariants are maintained. If you can find a way to enforce this guarantee on the implementor's side in O'Caml, i'd love to see it!
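In SML, by contrast, functor application is generative, so the implementor gets this guarantee without any cooperation from the client; roughly (with made-up names):

signature COUNTER =
sig
  type t
  val x   : t
  val get : t -> int
  val inc : t -> unit
end

functor Counter () :> COUNTER =
struct
  type t = int ref
  val x = ref 0
  fun get r = !r
  fun inc r = r := !r + 1
end

structure M = Counter ()
structure N = Counter ()

(* M.t and N.t are distinct abstract types, so nefarious things like
   N.get M.x and M.inc N.x are rejected by the type checker *)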
Date: 2009-11-19 07:55 am (UTC)

Alternatively, if one could be encoded in terms of the other but not vice versa, this would be a compelling argument for one over another. But I think this isn't (easily) true (as you kind of point out).
I'm sure there are cases where one would want applicative functors, too. I know I did last summer at Janest, I just can't remember why. I think it had to do with wanting to use both A.t and B.t (say) as keys to map.
Date: 2009-11-19 08:47 am (UTC)

I have a vague recollection in the back of my head that in some module calculus or another you could encode generative functors as applicative functors followed by sealing. I just couldn't figure out a way to get this to work in O'Caml -- it seems to do everything in its power to keep you from sealing the result of a functor :)
Date: 2009-11-19 12:34 am (UTC)

For instance, I didn't know impredicative strong sums were inconsistent. Do you remember how the proof goes or where it comes from?
During "fast-forward through the 70s and 80s" I have a mental image of John Reynolds standing still and rapidly changing hairstyles and clothes, with a pipe sometimes appearing and disappearing. It is basically the best mental image.
Also:
Historically, Reynolds and Girard didn't discover second-order logic itself, did they? But rather its Curry-Howard analogue? I am actually not sure about this but my gut says higher order logic itself was an early 20th century thing.
Also also:
Obviously with focusing-colored glasses on, strong sums look like the "negative version" of existentials. What is the deal with this? Most recently I tend to believe that Σ just is positive, and blame its unfriendliness of negative-Σ-with-projections with spine form on a focusing error, and the difficulties of strong sums seem to corroborate that. And yet "F-ing modules" works! That is, at least there's an elaboration that makes positive sums look negative. Is this telling us anything interesting about what we can do with other positive types?
Date: 2009-11-19 12:45 am (UTC)

Matthew took the Which logical system are you? quiz and the result is Full Second-Order Logic (Standard Semantics)
Expressive power, it is said, is the greatest aphrodisiac—and you have it in spades. People tend to find you intensely attractive, but often also a bit scary. Some of them think you’re definitely too much to handle. Characterizing the intended model of arithmetic up to isomorphism is just an average night out for you; you’ve been known to formulate sentences equivalent to the continuum hypothesis on the second date. Those who don’t know you sometimes confuse you with your rather shy sibling—the one with the Henkin semantics and the stammer—but they quickly realize their error upon meeting you. Lately, however, you’ve been worrying that you’re becoming respectable: at times you secretly long for the old days when scandalized Quineans, convinced that your relationship with set theory was unnatural and corrupting, felt obliged to warn their children against you.
On a more serious note, yeah, it was definitely earlier on. I could have sworn it shows up in Goedel's work, but I'm not seeing it.
Date: 2009-11-19 01:07 am (UTC)

well, the reference is "impredicative strong existential equivalent to type:type" by hook and howe. i've not looked at it, but william and i tried (and i think nearly succeeded) to reconstruct it based on karl's hint in class that it has something to do with being able to code up "type" as an existential... so something like ∃a.top, because you can take any type, pack it up with (), and now it inhabits that type.
Date: 2009-11-19 06:06 am (UTC)

i would speculate based on possible (mis)understandings of things people were saying at RG, but i think i should sleep first.
Date: 2009-11-19 05:51 am (UTC)

Most grad students' experiences writing such things on personal blogs seem to be "agh why are you talking about that???"—you have a very cooperative readership! (This is of course generally true, but.)
It's really awesome that you have this "classic papers" course going on. It's very annoying that in most graduate work you're sorta expected to know the foundational papers but expected to be using your time more wisely than actually reading them. Somehow I need to figure out the Important Pre-OT Phonology that I need to read...
Date: 2009-11-19 06:08 am (UTC)

Most grad students' experiences writing such things on personal blogs seem to be "agh why are you talking about that???"
really? what blogs? i don't think i've ever seen that. usually if people are bored by a topic they just won't comment.
yeah, i'm really happy classic papers came into being, for exactly the reason you say.
Date: 2009-11-19 03:05 pm (UTC)

Mostly like LJs as opposed to full blogs for this purpose, of course. It's pretty surprising that people don't just not comment, but I guess it's some manifestation of LJ's "forcing you to read" in the form of Friends pages?
Totally going to try asking my advisor today if there's some canonical list of things I minimally should read. I am Inspired.
Date: 2009-11-19 03:53 pm (UTC)

So... um, I guess a proposition returns a boolean, right? So you're dividing lots-of-spaces-operator(t-of-type-tau, undefined-function) by a boolean, and I feel like that's probably a type error.
This isn't going to stop me from commenting, mind. ^_^
Date: 2009-11-19 08:42 am (UTC)

∃x:τ.P(x) just means: P is some statement in which x is used, and there exists some x for which P is true. τ is the type of x, which is really important in the type theory, but you can think of it as sorta-kinda just specifying a set of elements from which we know the x is chosen.
Date: 2009-11-19 12:07 pm (UTC)

You may have seen "there exists an x such that P(x)" written
(∃x)P(x)
and "for all x such that P(x)"
(∀x)P(x)
or sometimes even in very old-fashioned math as
(x)P(x)
but we tend to write
∃x.P(x)
∀x.P(x)
as part of a general notational habit of writing a dot between any bound variable being introduced on the left of the dot, and the scope in which it is allowed to be mentioned on the right. Like, we also write
λx.E(x)
for the function that takes an argument x, and returns E(x).