> So why is it so hard to get all of that orchestrated? Why do we have to boil the ocean to get some sane defaults for all of this? Why do we need to micromanage absolutely everything here?
This post truly resonates with me; however, I don't think we appreciate just how many things are necessary to run a web application and do it well. There is an incredible amount of complexity that we attempt to abstract away.
Sometimes I wish there were a tool that could tell me how many lines of active code are responsible for the processes currently running on any of my servers, and in which languages. Off the top of my head, here's what's necessary to ship an enterprise web app in 2021:
RUNTIMES - No one* writes web applications in assembly or in a low-level language like C with no dependencies. There is usually a complex runtime involved, like the JVM (for Java), the CLR (for .NET), or whichever Python or Ruby implementation is in use, and those are already absolutely huge.
LIBRARIES - Then there are libraries for common tasks in each language: serving web requests, serving files, processing JSON, server-side rendering, RPC, message queueing and so on, multiplied by the fact that there isn't just one web development language but many. Whether that's a good thing or a bad thing, I'm not sure. Oh, and the front end can be really complex too, since there are numerous libraries/frameworks for rendering things interactively in a browser (Angular, Vue, React, jQuery), each with its own toolchain.
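Even the "simple" case of serving a request and emitting JSON leans on a runtime plus standard-library plumbing. As a rough sketch, here is a framework-free WSGI app using only the Python standard library, exercised with a synthetic request rather than a real server:

```python
import json
from wsgiref.util import setup_testing_defaults

def app(environ, start_response):
    """Tiny WSGI app: a single JSON endpoint, no framework."""
    body = json.dumps({"path": environ["PATH_INFO"]}).encode()
    start_response("200 OK", [("Content-Type", "application/json"),
                              ("Content-Length", str(len(body)))])
    return [body]

# Exercise it without binding a socket, via a synthetic WSGI environ:
environ = {}
setup_testing_defaults(environ)
environ["PATH_INFO"] = "/hello"
captured = {}

def start_response(status, headers):
    captured["status"] = status

result = b"".join(app(environ, start_response))
print(captured["status"], result.decode())
```

Even this toy still depends on CPython, its `json` and `wsgiref` modules, and the WSGI spec itself; a production framework adds routing, middleware and much more on top.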
PACKAGING - Then there are all the ways to package software: Docker containers, other OCI-compatible containers (ones that have nothing to do with the Docker toolchain, like buildah + podman), approaches like Vagrant, shipping full-size VMs, or just copying files onto a server and configuring the environment with Ansible, Chef, Puppet, or Salt, or by hand. Automating any of this can likewise be done in any number of ways: GitLab CI, GitHub Actions, Jenkins, Drone or something else.
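For the container route, a minimal Dockerfile already encodes a surprising number of decisions (base image, dependency install, entrypoint). A hypothetical sketch for a Python app, where `app.py` and `requirements.txt` are placeholder names:

```dockerfile
# Hypothetical Python app; image tag and file names are placeholders.
FROM python:3.9-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
CMD ["python", "app.py"]
```

Each line here is itself a layer with caching semantics, which is exactly the sort of complexity the tooling tries to hide.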
RUNNING - When you actually run your apps, what you have to manage is an entire operating system, from the network stack to resource management to everything else. And, of course, there are multiple OS distributions with different tools and approaches to a variety of tasks (for example, OpenRC in Alpine vs systemd in Debian/Ubuntu).
INGRESS - These OSes don't live in a vacuum either, so you end up needing a point of ingress, possibly load balancing or rate limiting, and eventually you introduce something like Apache, Nginx, Caddy or Traefik, optionally with something like certbot for the former two. Those are absolutely huge dependencies as well; just look at how many modules a typical Apache installation has, all to make sure your site can be viewed securely, requests can be rate limited, paths rewritten and so on!
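To make that concrete, here is what a minimal ingress layer might look like as an Nginx config sketch, combining TLS, per-IP rate limiting and a reverse proxy. The domain, certificate paths and backend port are illustrative, not prescriptive:

```nginx
# Per-IP rate limiting + reverse proxy; names and ports are illustrative.
limit_req_zone $binary_remote_addr zone=per_ip:10m rate=10r/s;

server {
    listen 443 ssl;
    server_name example.com;
    ssl_certificate     /etc/letsencrypt/live/example.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/example.com/privkey.pem;

    location / {
        limit_req zone=per_ip burst=20;
        proxy_pass http://127.0.0.1:8080;
        proxy_set_header Host $host;
    }
}
```

Fifteen lines, and it already touches three subsystems (TLS, rate limiting, proxying), each with a manual's worth of tuning knobs behind it.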
DATA - And of course you'll also need to store your data somewhere. You might manage your databases with the aforementioned automation approaches, but at the end of the day you are still running something with decades of research and updates behind it, whether that's SQLite, MariaDB, MySQL, PostgreSQL, SQL Server, S3, MongoDB, Redis or anything else. All of them have their own interfaces and use cases; for example, you might use MariaDB for data storage, S3 for files and Redis as a cache.
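Even the smallest of these, SQLite, hides an enormous amount of engineering behind a tiny API. A self-contained sketch using Python's built-in `sqlite3` module with an in-memory database (so it leaves nothing on disk):

```python
import sqlite3

# In-memory database: the full SQL engine, no server, no files.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("INSERT INTO users (name) VALUES (?)", ("alice",))
conn.commit()
rows = conn.execute("SELECT id, name FROM users").fetchall()
print(rows)
conn.close()
```

Those few lines invoke a parser, query planner, transaction system and storage engine; swap in MariaDB or PostgreSQL and you add network protocols, auth and replication on top.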
SUPPORT - And that's still not it! You probably also want some analytics, be it Google Analytics, Matomo, or something else. And monitoring: something like Nagios, Zabbix, or a Prometheus and Grafana setup. Oh, and you'd better run something for log aggregation, like ELK or Graylog. And don't forget APM either, to see in depth what's going on inside your app, with something like Apache SkyWalking.
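Log aggregation alone usually means emitting structured logs the shipper can parse. As a sketch of that idea, here is a one-JSON-object-per-line formatter built on Python's standard `logging` module (writing to an in-memory stream just so the example is self-contained):

```python
import io
import json
import logging

stream = io.StringIO()

class JsonFormatter(logging.Formatter):
    # One JSON object per line: the shape that ELK/Graylog-style
    # log shippers typically expect to ingest.
    def format(self, record):
        return json.dumps({
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        })

handler = logging.StreamHandler(stream)
handler.setFormatter(JsonFormatter())
log = logging.getLogger("app")
log.addHandler(handler)
log.setLevel(logging.INFO)
log.propagate = False  # keep the demo output in our stream only

log.info("user %s logged in", "alice")
line = stream.getvalue().strip()
print(line)
```

And that's just the emitting side; the aggregation stack still has to ship, index and retain those lines.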
OTHERS - There can be additional pieces on top of that as well: a service mesh to aid with service discovery, circuit breakers to route traffic appropriately, security solutions like Vault to make sure your credentials aren't leaked, and sometimes an autoscaling solution too.
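The circuit breaker is a good example of how even the "small" patterns carry hidden state machines. A bare-bones Python sketch of the idea (real implementations such as resilience4j or pybreaker add half-open probing, metrics and per-exception policies):

```python
import time

class CircuitBreaker:
    """Sketch only: after `max_failures` consecutive errors the circuit
    opens and calls fail fast for `reset_after` seconds."""
    def __init__(self, max_failures=3, reset_after=30.0):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.failures = 0
        self.opened_at = None

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_after:
                raise RuntimeError("circuit open, failing fast")
            # cool-down elapsed: close the circuit and try again
            self.opened_at = None
            self.failures = 0
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = time.monotonic()
            raise
        self.failures = 0
        return result
```

After two consecutive failures with `max_failures=2`, subsequent calls fail fast with `RuntimeError` instead of hammering the broken backend, which is the whole point of the pattern.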
In summary, it's not just that there are a lot of tools for doing any single thing, but that there are far too many concerns to be addressed in the first place. With that in mind, it's really amazing that you can run things on a Raspberry Pi at all, and that many of these tools can scale from a small VPS up to huge servers handling millions of requests.
That said, it doesn't always have to be this complex. If you want a maximally simple setup, just use something like PHP with an RDBMS like MariaDB/MySQL and server-side rendering. Serve it from a cheap VPS (I have been using Time4VPS, affiliate link in case you want to check them out: https://www.time4vps.com/?affid=5294, though DigitalOcean, Vultr, Hetzner, Linode and others are perfectly fine too), and maybe use some super minimal CI like GitLab CI, Drone, or whatever your platform of choice supports.
That should be enough for most side projects and personal pages. I personally opted for Docker containers with Docker Swarm + Portainer, since that's the simplest setup I can use for a large variety of software and for my own projects in different technologies, though that's a personal preference. Of course, not every project needs to scale to millions of users, so it's not like I need something as involved as Kubernetes (though Rancher + K3s can also be good, and many people enjoy Nomad).
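For the Swarm route, a whole deployment can fit in one short stack file. A hypothetical sketch, deployable with `docker stack deploy -c stack.yml myapp`; the registry, image names and ports are placeholders:

```yaml
# Hypothetical Swarm stack file; image names and ports are placeholders.
version: "3.8"
services:
  web:
    image: registry.example.com/myapp:latest
    ports:
      - "80:8080"
    deploy:
      replicas: 2
      restart_policy:
        condition: on-failure
  db:
    image: mariadb:10.6
    environment:
      MARIADB_ROOT_PASSWORD_FILE: /run/secrets/db_root_password
    secrets:
      - db_root_password
secrets:
  db_root_password:
    external: true
```

That one file covers packaging, running, replication and secret handling at once, which is a big part of why I find Swarm a comfortable middle ground.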
Edit: there are PaaS offerings out there that make things noticeably easier by handling some of the concerns above for you, but that can lead to vendor lock-in, so be careful with those. Regardless, solutions like Heroku or Fly.io may be worth checking out as well, though I'd suggest you first read this article: https://www.gnu.org/philosophy/who-does-that-server-really-s...
* with very few exceptions