This makes me think of over-fitting in machine learning. The epicycles fit the data very well, but only by adding a lot of parameters to the model. It makes me wonder whether there is a bias/variance trade-off for scientific models, though I don't know how to express the connection formally.
In machine learning, we hold out a dataset to test for over-fitting. We don't exactly have a spare universe to test our scientific models against, but if there had been a second planetary system to test on back then, maybe it would have been clear sooner that the "epicycles" model fit only this solar system, from the vantage point of Earth. Or maybe you could "train" the model on some heavenly bodies and test it on others?
I'm pretty sure I'm making a fool of myself at this point and missing something obvious, and I'm hoping one or more of you will point that obvious thing out to me.
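Your train-on-some-bodies, test-on-others idea can at least be sketched on toy data. Here's a minimal numpy illustration (my own made-up setup, nothing to do with real orbital mechanics): extra polynomial terms play the role of extra epicycles, and a held-out set is what exposes the over-fit that the training fit hides.

```python
import numpy as np

rng = np.random.default_rng(0)

# Noisy samples of a simple underlying curve (stand-in for observations).
x = np.linspace(-1, 1, 40)
y = np.sin(np.pi * x) + rng.normal(0, 0.2, size=x.shape)

# Hold out every third point -- the "spare universe" we wish we had.
test = np.arange(x.size) % 3 == 0
train = ~test

def fit_eval(degree):
    # Fit a polynomial (more degrees ~ more epicycles) on training points only.
    coeffs = np.polyfit(x[train], y[train], degree)
    pred = np.polyval(coeffs, x)
    train_err = np.mean((pred[train] - y[train]) ** 2)
    test_err = np.mean((pred[test] - y[test]) ** 2)
    return train_err, test_err

for d in (1, 4, 12):
    tr, te = fit_eval(d)
    print(f"degree {d:2d}: train MSE {tr:.3f}, test MSE {te:.3f}")
```

The training error can only go down as you add terms, which is exactly why it tells you nothing about over-fitting; the held-out error is the one that turns back up.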
Kepler's ellipses fit the data with an even simpler model. Newton added an underlying model which could be generalized to hypothetical bodies. In other words, you could plan something like an Apollo mission with it. I doubt you could do something like that with the Copernican, Ptolemaic, or even the Keplerian model. Newton's model gave you enough insight to hack. And not just surface hacks, but deep hacks. Everything before was merely descriptive.
Newton's model also showed convergence. The way the planets moved became connected with the way cannonballs behaved. Mechanics could also subsume models of buildings and machines. Engineering and architecture were unified by Newtonian Mechanics.
It's not just a matter of fitting. It's a matter of transcending current models. (Another reason to study different programming languages/paradigms.)
I think you've hit on a weakness of the whole machine learning paradigm. I might get this wrong, but I believe every machine learning algorithm necessarily introduces some inductive bias into selecting which possibilities to consider, and without some kind of bias, learning is impossible.
But once you've chosen how to bias your model, you will only search for solutions in the space defined by that bias: you figure out the parameters of the ellipses describing the movement of heavenly bodies, but you never question whether ellipses were a good choice to begin with. There is also feature selection: how you decide which aspects of reality (or measurements of reality, really) are relevant to the learning problem. (There are feature selection techniques, but they presume you already have a finite set of candidate features and just determine which ones have the most value.)
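To make the "space defined by that bias" point concrete, here's a small numpy sketch (a toy example of my own, not any standard benchmark): once you commit to straight lines as your hypothesis space, no amount of parameter search will recover a sinusoid, because even the *optimal* line in that space has large irreducible error.

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(0, 2 * np.pi, 100)
y = np.sin(x) + rng.normal(0, 0.1, size=x.shape)

def best_fit_mse(features):
    # Exact least-squares optimum over whatever hypothesis space
    # the feature map defines -- no search could do better than this.
    X = np.column_stack(features)
    coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)
    return np.mean((y - X @ coeffs) ** 2)

# Bias 1: "the answer is a straight line" -- search only slopes and intercepts.
line_mse = best_fit_mse([np.ones_like(x), x])

# Bias 2: a hypothesis space that happens to contain the truth.
sine_mse = best_fit_mse([np.ones_like(x), np.sin(x)])

print(f"best possible line MSE: {line_mse:.3f}")
print(f"best possible sine MSE: {sine_mse:.3f}")
```

The gap between the two isn't a failure of optimization; it's the cost of the modeling choice made before any data was seen, which is exactly the decision the learning algorithm itself never revisits.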
It seems that, perhaps, this kind of paradigm-busting discovery is out of reach of current machine learning methods, and that the decisions about what to model and how to bias the model are where humans add value to the process.
This is all philosophical bullshit at this point, but I remain curious about the relationship between learning algorithms and scientific discovery. If anyone is still reading, are there any good books on this topic?