Hacker News

And... I still understand nothing about category theory.

Are there any more comprehensible primers out there or is this just a field of mathematics that I shouldn't expect to understand without years of prior study in other requisite fields? Or am I better off building something in Haskell and learning by doing? I know some discrete math, but all attempts at broaching category theory have been so far totally unsuccessful for me.



My understanding improved when I started thinking about using category theory in programming, more in the way of "here are some design patterns stolen from abstract algebra" rather than "man! now I should learn this entire branch of mathematics".

The branch of math itself would be mostly useless anyway, because it is mostly concerned with relations between categories, and as a programmer you usually work within just a single category: the category of functions and types in your programming language :-)

But from the practical, design-pattern viewpoint, I would suggest trying purescript :-)

My journey was working through the purescript book [1] up to the Applicative Validation chapter. And then I played around with the flare backend in try-purescript [2]. And then it clicked for me and I was suddenly able to read and comprehend Typeclassopedia [3] :-)

[1] https://leanpub.com/purescript/read [2] http://try.purescript.org/?backend=flare&session=a06d83dd-d0... [3] https://wiki.haskell.org/Typeclassopedia


I'd say there is not much point to category theory if we were to restrict ourselves to a single category. For instance, monoids are categories (with a single object) that are not the category of functions. Another set are the Kleisli categories[1], which are equivalent to monads.

You mention Applicative Functors, which form another (different) category where arrows from a to b are of the form `f (a -> b)`, and the composition and identity laws of these arrows are equivalent to the applicative laws.

[1] https://en.wikipedia.org/wiki/Kleisli_category


I got my point about a single category from a friend of mine who studied category theory and applied it to problems in topology.

The way he explained it, they were often searching for relations between categories that didn't intersect at all.

And the way I was learning the applications of category theory in e.g. Haskell, I viewed it more as learning about algebraic structures in the category of Haskell types and functions.

So right now I have a reasonable idea about e.g. applicative functors in Haskell. But thinking about applicative functors as their own category is one abstraction step above what I am usually used to :-)


A category abstracts the concept of function composition. You need two things: an identity element and a composition operation. For functions, the identity element is the identity function id(x) = x and the composition operation is compose(f,g) = (x -> f(g(x))). The twist is that you can't compose just any two functions; their input and output types have to match up. If g is A -> B and f is B -> C you can compose them to a function A -> C, but the output type of g has to match the input type of f.

Categories generalise this concept from composing functions to composing anything with a notion of input and output type. In practice that often ends up being function composition anyway, but with restrictions on the function. For instance, if you compose two linear functions you get a linear function, so these form a category. The same goes for continuous functions, differentiable functions, etcetera. Quadratic functions, however, do not form a category, because composing two quadratic functions may give a function of higher degree: if f(x) = x^2 then compose(f,f) is x^4. Polynomials, on the other hand, do form a category.

In general this need not be composition of functions, as long as there's an identity element and a compose operation with a notion of input and output type. Binary relations are the easiest example I can think of.
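The idea above can be sketched in a few lines of Python (the `Arrow`/`compose` names are my own illustration, not anything standard): an arrow carries explicit input and output types, and composition refuses arrows whose types don't match up.

```python
# Hypothetical sketch: an "arrow" records its input and output types,
# and compose only accepts arrows whose types line up.

class Arrow:
    def __init__(self, src, dst, fn):
        self.src, self.dst, self.fn = src, dst, fn

    def __call__(self, x):
        return self.fn(x)

def identity(obj):
    # the identity arrow on an object: id(x) = x
    return Arrow(obj, obj, lambda x: x)

def compose(f, g):
    # compose(f, g) = "f after g"; only legal when g's output
    # type matches f's input type
    if g.dst != f.src:
        raise TypeError(f"cannot compose: {g.dst} != {f.src}")
    return Arrow(g.src, f.dst, lambda x: f(g(x)))

length = Arrow(str, int, len)               # an arrow str -> int
double = Arrow(int, int, lambda n: 2 * n)   # an arrow int -> int

h = compose(double, length)                 # str -> int
print(h("abcd"))                            # 8
```

Trying `compose(length, double)` raises a TypeError, because an int output can't feed a str input; that restriction is exactly the "twist" described above.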


So it's more a generalization of groups, right?


A monoid (weaker than group: doesn't need to have an inverse) is an example of a category, with a single object G (namely the underlying set) and a lot of arrows from G -> G.

If you take the natural numbers N = {0, 1, ...} as a set and + as a binary operator forming a monoid (N, +, 0), then the arrows would correspond to (. + 0), (. + 1), (. + 2) etc. Here (. + 0) can be seen as the identity arrow, and composition of arrows (. + a) and (. + b) yields a new arrow (. + (a+b)).
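A small Python sketch of this one-object category (the representation is my own choice): an arrow (. + n) is represented just by the number n, so composing arrows is addition and (. + 0) is the identity.

```python
# Hypothetical encoding of the monoid (N, +, 0) as a one-object
# category: the arrow (. + n) is represented by n itself.

identity = 0            # the identity arrow (. + 0)

def compose(a, b):
    # (. + a) composed with (. + b) is (. + (a + b))
    return a + b

def apply_arrow(n, x):
    # run the arrow (. + n) on a value
    return x + n

# identity and associativity laws hold because (N, +, 0) is a monoid
assert compose(identity, 5) == 5
assert compose(compose(1, 2), 3) == compose(1, compose(2, 3))
# composing then applying agrees with applying one after the other
assert apply_arrow(compose(2, 3), 10) == apply_arrow(3, apply_arrow(2, 10))
```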


Yes, roughly. You can think about a category like a group which carries around "type parameters" to say which "elements" you can and can't multiply. Also, "elements" may not be invertible.

For example, for each N, there is a group of invertible N by N matrices. If we want to allow non-square matrices too, we can bundle them up as a category:

- The objects are the natural numbers

- A map from N to M is an M row by N column matrix

- We compose A : N -> M and B : M -> K by multiplying them as matrices to get BA : N -> K. The objects (i.e., the dimensions) are type checking which things we can and can't multiply
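The matrix category above can be sketched in plain Python (the `matmul` helper is my own minimal version, not a library call): the dimensions act as the objects, and composition fails unless they agree.

```python
# Hedged sketch: non-square matrices as arrows between natural numbers.
# A map N -> M is an M-row by N-column matrix; composition is matrix
# multiplication, and the dimensions "type check" the composite.

def matmul(B, A):
    # Compose A : N -> M with B : M -> K, yielding BA : N -> K.
    m, n = len(A), len(A[0])      # A is M x N
    k, m2 = len(B), len(B[0])     # B is K x M
    if m2 != m:
        raise TypeError("dimension mismatch: objects must agree")
    return [[sum(B[i][j] * A[j][l] for j in range(m)) for l in range(n)]
            for i in range(k)]

A = [[1, 2, 3],
     [4, 5, 6]]        # 2 x 3 matrix: a map 3 -> 2
B = [[1, 0],
     [0, 1],
     [1, 1]]           # 3 x 2 matrix: a map 2 -> 3

BA = matmul(B, A)      # a map 3 -> 3
print(len(BA), len(BA[0]))   # 3 3
```

Attempting `matmul(A, A)` raises a TypeError: a map 3 -> 2 can't be composed with another map 3 -> 2, which is exactly the type checking the objects provide.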


There are two differences:

1) Groups have an inverse operation, categories don't.

2) You can compose any two elements in a group, whereas in a category the input and output types have to match up.


The category-like generalization of the group is the groupoid: all arrows are isomorphisms.


I can recommend the video series by Bartosz Milewski (https://www.youtube.com/watch?v=I8LbkfSSR58&list=PLbgaMIhjbm...) I'm at the 4th video and it's the first time that I start to get an understanding of the whole thing.

But expect it to take some time. (10 videos, most are around 1h; I needed to pause several times to think or try something out.)


Those videos are good but still I struggle!


What also helped me understand CT a little bit better was the book "Haskell Programming from first Principles" (http://haskellbook.com/)

This also needs a lot of time and effort to comprehend!


Hm, as someone who's finished that book, I don't know that I'd say Haskell Programming from First Principles really teaches much category theory -- it teaches specific applications of a few things as they exist in Haskell.

I'm currently working through Conceptual Mathematics (an intro category theory book), and while the Haskell book has some overlap, it's been extremely minor. The main thing I can think of at the moment is determining the number of arbitrary maps that exist between sets. There is some overlap of things like monoids and functors, but even those have a different, more nuanced treatment in category theory (for example, the relatively simple fact that endomaps are functorial interpretations of the sum monoid was new to me, coming from that haskell book).


I'd recommend learning a couple of the fields that category theory abstracts over first. It's cool as a generalisation over topology and set theory, for example (and shows you how some theorems from both are "really" the same theorem), but that only works if you know topology and set theory.


I think this is the fundamental roadblock to teaching category theory to people who don't have a relatively diverse math background. If you have one, the early exposition of category theory is a constant stream of formalizing connections that already had a nebulous version in your head (e.g. products, coproducts, kernels, etc.). If not, you're robbed of any kind of a-ha! moment, and the motivation is unclear at best.


The single paper I have understood about this subject is http://www.sciencedirect.com/science/article/pii/01676423879... . It is the foundation of the Caml Light language that later gave birth to the famous OCaml. This paper is very dense, but everything is explained.


For anyone wondering how this is useful outside of programming and/or looking for an easier introduction, check out Category Theory for the Sciences [1]. There are free digital versions on Github, too [2]. Here's the reasoning given by the author:

"Information is inherently dynamic; the same ideas can be organized and reorganized in countless ways, and the ability to translate between such organizational structures is becoming increasingly important in the sciences. Category theory offers a unifying framework for information modeling that can facilitate the translation of knowledge between disciplines."

[1] https://mitpress.mit.edu/books/category-theory-sciences

[2] https://github.com/mmai/Category-Theory-for-the-Sciences


If you have a math-background, category theory is easy to learn, as it's merely a trivial (in retrospect, but not in prospect) generalisation of existing mathematics. It's basically the abstract theory of binary associative 'thingies' and 'functions' that preserve these 'thingies'. Most mathematicians pick it up by osmosis without really trying. That's because they understand so well what CT abstracts from.

Experience has shown that without a maths background learning CT is extremely difficult. Almost everybody basically fails.

As somebody who taught himself CT without a math background, I suggest to expect between 100 and 300 hours of serious, intensive study before CT begins to make sense.

My recommendation is to bite the bullet and study one of the many good textbooks, and solve every exercise, and rote-learn the definitions until they become second nature. You will know you've understood CT when you see why Peter Freyd's quip "[t]he purpose of [category theory] is to show that which is trivial is trivially trivial" is apt. The key concepts to learn are: category, functor, natural transformation, adjunction, universal property, limit and dualisation. Once you understand those, an extremely large number of prima facie separate mathematical concepts can be seen as instances of the same few trivial ideas. But they must be rephrased in this seemingly alien and unnatural language called CT. My favourite example is Lawvere's unification of most known paradoxa (e.g. Cantor, Russell, Goedel, Tarski) as basically a trivial fixpoint [1]. Note that after Lawvere had realised what most paradoxa have in common using a categorical re-formulation, it then became trivial to get the same insight without CT [2].

Let me illustrate the power of looking at a phenomenon using a new language with the example of computation itself: we have two dominant abstractions of computation:

- Turing machines

- Lambda calculus

Simplifying only a tiny bit: Turing machines gave us the theory of computational complexity but have not been helpful at all in the development of programming languages in general, and typing systems in particular; in contrast, lambda-calculus has been the foundation of most work on programming languages in general and practically all work on types. Yet lambda-calculus has been mostly silent on complexity theory (although [3] is the beginning of a rapprochement).

Finally, let me state my belief that the average programmer currently has no need to understand CT. The main categorical concept that has filtered down to conventional programming is the monad, and monads can be explained and understood completely without reference to CT.

[1] F. W. Lawvere, Diagonal arguments and cartesian closed categories.

[2] N. S. Yanofsky, A Universal Approach to Self-Referential Paradoxes, Incompleteness and Fixed Points.

[3] https://arxiv.org/abs/1601.01233
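As a quick illustration of the claim that monads can be understood without CT, here is a minimal Maybe-style monad in plain Python, with no categorical vocabulary at all. This is my own sketch; it uses the poor-man's encoding where None plays the role of Nothing (so it can't distinguish "no result" from "a result that happens to be None").

```python
# Hypothetical Maybe-style monad: None stands for failure (Nothing).

def unit(x):
    # wrap a plain value into the monad (here, trivially, the value itself)
    return x

def bind(mx, f):
    # apply f to the value inside mx, short-circuiting on failure
    return None if mx is None else f(mx)

def safe_div(a, b):
    # a computation that can fail
    return None if b == 0 else a / b

# Chained computations abort at the first failure:
print(bind(bind(unit(10), lambda x: safe_div(x, 2)),
           lambda y: safe_div(y, 0)))                # None
print(bind(unit(10), lambda x: safe_div(x, 5)))      # 2.0
```

The whole "design pattern" is just: `bind` threads a value through failure-aware steps so the caller never writes the None-checks by hand.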


Can you explain the fixed point idea more in a beginner way? I'm curious to understand the result.


A function has a fixed point when, for some value x, f(x) = x. This is interesting because it's an input the function leaves unchanged.

In the lingua franca of JavaScript:

  var timesThree = (x => x * 3)
  //for this function 0 is a fixed point...
  var a = 0
  console.log(a == timesThree(a))
  //7 is not a fixed point
  var b = 7
  console.log(b == timesThree(b))


Thanks for the response. My question is how does fixed point relate to the incomplete theorems and decidability?


The key to e.g. Goedel's results is the Fixed Point Lemma: if ψ is a formula with v free then there is a sentence φ such that, provably, φ <=> ψ(<φ>) [where <..> is the numerical code of .. ].

The proof, if you are interested, is not difficult. Let sub be the function that describes, via codes, substituting the (numeral n_ of the number) n for a free variable v of a formula χ : sub(<χv>, n_) = <χn_>. Then the PROOF is: consider ψ(sub(v,v)), call it θv, let m be <θv> and let φ be θm_. Then, provably, φ <=> ψ(sub(m_,m_)) <=> ψ(sub(<θv>,m_)) <=> ψ(<θm_>) <=> ψ(<φ>). Ta-da!

Goedel used this to get the "formula that says I am not provable", φ <=> not Prov(<φ>).


> explain the fixed point idea

The Yanofsky paper explains it in quite a gentle manner.

The key insight behind Lawvere's abstract framework for paradoxa is the general existence of certain fixpoints.

Definition. We say that a set B has the fixpoint property if every function f:B→B has a fixpoint (i.e. f b = b for some b in B).

With this convenient definition, we are now ready to state and prove Lawvere's ridiculously simple and yet great theorem.

Theorem (Lawvere). If e:A→(A→B) is a surjective function, then B has the fixed point property.

The proof is quite easy. Let e:A→(A→B) be surjective. We have to show that B has the fixpoint property. That means for every f:B→B there is b∈B such that f(b)=b. Choose a function f:B→B and define the function g:A→B by setting

   a ↦ f (e a a)
As e is surjective, there must be a0∈A such that

   e a0 = g.
But then immediately

   f (g a0) = f (e a0 a0) = g a0
Hence g a0 is a fixpoint of f's.

Now many/most paradoxa are special cases of Lawvere's theorem. But for each paradox, the specific functions involved are a bit different. Let's look at an example.

Theorem (Cantor). There is no surjection e:A→Pow(A).

To see why this is true, note that Pow(A) is isomorphic to A→Bool. But there is a function on Bool that has no fixpoints, for example negation ¬:Bool→Bool, contradicting Lawvere's theorem.

   ---------------
What is the intuition behind Lawvere's theorem? At first worrying about A→(A→Bool) is a bit surprising. What does that have to do with paradoxa and self-reference?

Well, what does it mean that A can speak about itself? To approach an answer, we could maybe first ask a simpler question: what does it mean to speak about A? How about this for an answer: to speak about A means to say something about A's elements. What does it mean to say something about A's elements? Maybe stating whether any given element a∈A has a property of interest? But what's a property? Easy: a property of A's elements is a function

   p:A→Bool
But we don't want just a fixed property, we want arbitrary properties. To do so, we have to consider the function space

   A→Bool
And how can we turn this into self-reference? What if each element a in A corresponded to a property over A? In other words, (with a lot of handwaving) self-reference means the existence of a surjective function

   A→(A→Bool)
The next step is to wonder: why Bool? Why not any old set? Note that Cantor's theorem continues to hold if we replace Bool with a larger set, but does not hold if B in A→(A→B) has cardinality 1. What Bool and larger sets have in common is that we can rearrange them, i.e. there is a permutation that doesn't have a fixpoint.


Fascinating! A couple of questions:

Are there any non-singleton sets with the fixpoint property?

> a ↦ f (e a a)

Don't you mean

   a ↦ e a a
? Because later you expand (g a0) to (e a0 a0).


No, g is given by a ↦ f (e a a). The expansion you refer to works because e a0 = g.


Honestly, Saunders MacLane's book, "Categories for the Working Mathematician", was very useful for this non-mathematician. It was surprisingly accessible if I started at the introduction and did not move forward until I grokked the content.

I won't pretend I absorbed all (or even most) of it, but I at least understand the "Standard Haskell Joke" now (which is a funny misquote from his book)[1][2]. I also understand much more clearly why Haskell is so associated with Category Theory.

[1]: "A monad is just a monoid in the category of endofunctors, what's the problem?"

[2]: What MacLane actually wrote is:

All told, a monad in X is just a monoid in the category of endofunctors of X, with product × replaced by composition of endofunctors and unit set by the identity endofunctor.


nLab has been one of the best sources I've come across. My aim is to express and work with philosophical logic in a more formal way, and nLab is a great source for that as well. Milewski's YouTube lectures gets mentioned a lot, but I couldn't make much sense of them before watching Tom LaGatta's more informal and conversational presentation https://youtu.be/o6L6XeNdd_k Good luck and keep at it.



