Discourse releases new versions quite regularly, and we tend to keep up to get the latest features and security patches. The process currently goes like this:
1. I get an email that there’s a new version
2. ssh root@clojureverse.org
3. tmux at
4. cd /var/discourse/ && ./launcher rebuild app && ./launcher cleanup
5. the site is down for a few minutes
6. we’re up to date
This isn’t the worst process and I’m fine with continuing this way, but if someone would like to look at improving it that would be really great.
Apparently the downtime can be reduced by “splitting the discourse docker container into two containers”. I don’t know the fine details of this, but search for those terms and you should find some links.
Discourse also supports updating through the admin UI, but I stopped using that because the box we were running on would run out of memory halfway through, and I had to restart the process from a terminal anyway. That was a while ago, though. Maybe the box we’re currently on can handle it, or maybe it’s time to move to something bigger (we’re on DigitalOcean; there’s a post somewhere in the staff-only category with more details about the setup).
Currently the site is simply down while upgrading; it would be great if it could show a friendly message instead.
Finally, perhaps there’s a way to automate this so the site upgrades automatically. I’d be all for it: less manual work and less chance of getting pwnd.
If you want to support your friendly Clojure community then this is a great way to do it, and you might pick up some devops-foo along the way. If that sounds like fun then please get in touch!
I don’t know much about Discourse, but if you run Docker one trivial thing to do could be to spin up a second container to bind to port 80 with a webserver saying “we’ll be back soon”.
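A minimal sketch of that idea, assuming the stock `nginx` image from Docker Hub and that port 80 is free at that moment (the Discourse container has to be stopped first, since only one container can bind the port). The container name `maintenance`, the `PAGE_DIR` path and the `DOCKER` override are made up for illustration:

```shell
#!/usr/bin/env bash
# Sketch only: serve a static "back soon" page from a throwaway nginx container.
set -eu

DOCKER="${DOCKER:-docker}"                    # override for a dry run, e.g. DOCKER=echo
PAGE_DIR="${PAGE_DIR:-/var/www/maintenance}"  # hypothetical directory holding the page

write_page() {
  mkdir -p "$PAGE_DIR"
  cat > "$PAGE_DIR/index.html" <<'EOF'
<!doctype html>
<title>Back soon</title>
<h1>We'll be back soon</h1>
EOF
}

maintenance_up() {
  write_page
  # /usr/share/nginx/html is the default document root of the official nginx image
  $DOCKER run -d --rm --name maintenance -p 80:80 \
    -v "$PAGE_DIR:/usr/share/nginx/html:ro" nginx
}

maintenance_down() {
  $DOCKER stop maintenance
}
```

With `DOCKER=echo` the functions print the `docker` arguments instead of running them. As written, the page is served with a 200 status and only at `/`; other URLs get nginx’s default 404. The container also has to be stopped again before Discourse can take port 80 back.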
That said, I think we can live with a 5-minute downtime every once in a while.
I won’t have time to do this myself but here’s a rough recipe I’d use:
1. put nginx or similar in front of it
2. set a flag so that nginx shows a Maintenance page
3. do the maintenance
4. remove the flag
I’d put it all in a script; I wouldn’t use tmux for this but instead background the command after sshing in.
Potential problems:
- how to deal with errors during the process?
- logging
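The recipe and the two concerns above could be sketched roughly like this. It is only an illustration: the flag path, the log path and the script name are assumptions, and nginx would still need its own rule (not shown) to serve the Maintenance page while the flag file exists. The upgrade command is the one from the first post.

```shell
#!/usr/bin/env bash
# Sketch only: wrap the upgrade in a maintenance flag, with logging and error handling.
set -euo pipefail

FLAG="${FLAG:-/tmp/maintenance.flag}"     # hypothetical flag file that nginx would check
LOG="${LOG:-/tmp/discourse-upgrade.log}"  # hypothetical log location
UPGRADE_CMD="${UPGRADE_CMD:-cd /var/discourse && ./launcher rebuild app && ./launcher cleanup}"

log() {
  # Append a UTC-timestamped line to the log file
  printf '%s %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$*" >> "$LOG"
}

upgrade() {
  local rc=0
  touch "$FLAG"
  log "maintenance flag set"
  # Capture the exit code instead of letting set -e abort the script
  bash -c "$UPGRADE_CMD" >> "$LOG" 2>&1 || rc=$?
  if [ "$rc" -eq 0 ]; then
    rm -f "$FLAG"
    log "upgrade finished, flag removed"
  else
    # Keep the flag so visitors still see the Maintenance page
    log "upgrade failed with exit code $rc, flag left in place"
  fi
  return "$rc"
}

# Only run when asked to, so the file can also be sourced safely
if [ "${1:-}" = "run" ]; then
  upgrade
fi
```

Saved as, say, `upgrade.sh`, it could be backgrounded after sshing in with `nohup ./upgrade.sh run &`. Leaving the flag in place on failure is a design choice: a broken site probably should keep showing the Maintenance page until someone has looked at the log.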
Even better would be to make a copy of the current directory, work on that, and only copy it over if everything worked. However, I’m not sure whether Discourse uses a database, which would obviously make this more complicated.
To report back: I did the most recent upgrade through the web UI. I had to do it in several steps, upgrading component by component, but in the end it all worked, so it seems the out-of-memory issue has been fixed.