I'm wondering: given a random truth table with N binary variables and one binary output, what is the smallest network (worst case, measured in number of parameters) that can learn it?
This is technically true, but I wonder how close you could get to 100% accuracy with minimal size. I would expect you could get above 98% with a network that's a few megabytes in size.
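For an upper bound on exact memorization: any N-variable Boolean function can be represented by a one-hidden-layer threshold network with one hidden unit per input row, giving on the order of N·2^N parameters. A minimal sketch of that construction (function and variable names are my own, not from any library):

```python
import numpy as np

def exact_mlp(table, n):
    """Build a one-hidden-layer threshold network that memorizes `table`,
    a length-2**n array of 0/1 outputs, with one hidden unit per input row."""
    # Row i of `patterns` is the binary expansion of i (little-endian).
    patterns = np.array([[(i >> j) & 1 for j in range(n)] for i in range(2 ** n)])
    W = 2 * patterns - 1       # hidden weights: +1 where the pattern bit is 1, -1 where 0
    b = patterns.sum(axis=1)   # W_i @ x reaches b_i only when x equals pattern i exactly
    v = np.asarray(table)      # output weights simply copy the truth table

    def net(x):
        hidden = (W @ x >= b).astype(int)  # one-hot: only the matching pattern's unit fires
        return int(v @ hidden)
    return net

rng = np.random.default_rng(0)
n = 4
table = rng.integers(0, 2, size=2 ** n)  # a random truth table on n variables
net = exact_mlp(table, n)

# The network reproduces every row of the table exactly.
for i in range(2 ** n):
    x = np.array([(i >> j) & 1 for j in range(n)])
    assert net(x) == table[i]
```

That gives 100% accuracy but at exponential size, which is why the interesting question is how much you can compress below 2^N parameters while only giving up a percent or two of accuracy.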