Hi. I’m kind of a noob in the world of self-hosting, and of Matrix for that matter. But I was wondering: how heavy is it to host a Matrix server?
My understanding of how Matrix works is that each participating server in a room stores the full history, and then later some sort of merging happens, or something like that.
How is that sustainable? Say in 5 years Matrix becomes mainstream, 5 people join my server, and each of them also joins 3 different 10k+ people rooms with long histories. Do I now have to account for that, or do people have to be careful about joining larger rooms when they sign up on a smaller-ish server?
Or do I not understand how Matrix works? Thanks.


It can balloon as it scales up. Matrix.org (the flagship homeserver) has had at least one DB corruption, and that’s with their own Rust bindings for Synapse. Small communities, especially ones that share rooms between them, should be fine on most systems. Make regular backups of the DB.
And, importantly, run the DB on PostgreSQL, not SQLite, and implement the regular DB maintenance steps explained in the wiki. I’ve been running mine like that in a small VM for about 6 months; I join large communities and run WhatsApp, Google Messages and Discord bridges, and my DB is 400 MB.
Before, when I was still testing and hadn’t implemented the regular DB maintenance, it ballooned up to 10 GB in 4 months.
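Side note: if you want to see where the growth actually lives, you can ask Postgres for its largest tables (on Synapse installs the biggest one is typically `state_groups_state`). Run this in `psql` against the synapse database; it uses only standard Postgres catalog views, nothing Synapse-specific:

```sql
-- Top 10 tables by total size (including indexes and TOAST data)
SELECT relname AS table_name,
       pg_size_pretty(pg_total_relation_size(relid)) AS total_size
FROM pg_catalog.pg_statio_user_tables
ORDER BY pg_total_relation_size(relid) DESC
LIMIT 10;
```

Comparing the output before and after a maintenance run makes it obvious what the purges and the compressor are actually reclaiming.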
It is my understanding that all of the ballooning DB is room state, something you can’t really prune. What exactly are you pruning from the DB?
I purge media older than 2 weeks using these. Then I purge the largest rooms’ history events using these. Then I compress the DB using this.
It looks like this:
```shell
export PGPASSWORD=$DB_PASS
export MYTOKEN="mytokengoeshere"
export TIMESTAMP=$(date --date='2 weeks ago' '+%s%N' | cut -b1-13)

echo "DB size:"
psql --host core -U synapse_user -d synapse -c "SELECT pg_size_pretty(pg_database_size('synapse'));"

echo "Purging remote media"
curl \
  -X POST \
  --header "Authorization: Bearer $MYTOKEN" \
  "http://localhost:8008/_synapse/admin/v1/purge_media_cache?before_ts=${TIMESTAMP}"
echo ''

echo 'Purging local media'
curl \
  -X POST \
  --header "Authorization: Bearer $MYTOKEN" \
  "http://localhost:8008/_synapse/admin/v1/media/delete?before_ts=${TIMESTAMP}"
echo ''

echo 'Purging room Arch Linux'
export ROOM='!usBJpHiVDuopesfvJo:archlinux.org'
curl \
  -X POST \
  --header "Authorization: Bearer $MYTOKEN" \
  --data-raw '{"purge_up_to_ts":'${TIMESTAMP}'}' \
  "http://localhost:8008/_synapse/admin/v1/purge_history/${ROOM}"
echo ''

echo 'Purging room Arch Offtopic'
export ROOM='!zGNeatjQRNTWLiTpMb:archlinux.org'
curl \
  -X POST \
  --header "Authorization: Bearer $MYTOKEN" \
  --data-raw '{"purge_up_to_ts":'${TIMESTAMP}'}' \
  "http://localhost:8008/_synapse/admin/v1/purge_history/${ROOM}"
echo ''

echo 'Compressing db'
/home/northernlights/scripts/synapse_auto_compressor -p postgresql://$DB_USER:$DB_PASS@$DB_HOST/$DB_NAME -c 500 -n 100

echo "DB size:"
psql --host core -U synapse_user -d synapse -c "SELECT pg_size_pretty(pg_database_size('synapse'));"

unset PGPASSWORD
```

And periodically I run `VACUUM;`.
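One detail worth calling out: the admin API’s `before_ts` / `purge_up_to_ts` parameters take milliseconds since the epoch, which is what the `date ... '+%s%N' | cut -b1-13` line computes — `%s%N` prints seconds followed by nanoseconds, and truncating to 13 digits leaves milliseconds. A quick sketch to convince yourself (GNU `date`, fixed instant so the result is deterministic):

```shell
# Same trick as the script: seconds+nanoseconds, truncated to 13 digits.
# This works while epoch seconds are 10 digits long (2001 through 2286);
# for earlier dates the truncation would land in the wrong place.
TS=$(date --date='2020-01-01 00:00:00 UTC' '+%s%N' | cut -b1-13)
echo "$TS"   # 1577836800000, i.e. 1577836800 s * 1000
```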
Thank you for the queries. The rhetorical question is: why isn’t the server handling this?
I don’t know; I can’t speak for the devs. It is weird that if you don’t implement these API calls, buried a bit deep in the wiki, you end up storing every meme and screenshot anybody posted on any instance for the rest of time. But I found them through issue reports, with many people asking for these to be implemented by default via, for instance, a simple setting like “purge after X days” and a list of rooms to include in or exclude from the history clean-up.
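For what it’s worth, newer Synapse releases do ship config options along these lines; if I’m reading the docs right, something like this in `homeserver.yaml` covers the media side and a basic message-retention policy. Option names and behaviour depend on your Synapse version, so treat this as a sketch to check against the docs, not gospel:

```yaml
# homeserver.yaml -- availability depends on Synapse version
media_retention:
  local_media_lifetime: 90d    # delete local media not accessed in 90 days
  remote_media_lifetime: 14d   # delete cached remote media not accessed in 14 days

retention:
  enabled: true
  default_policy:
    min_lifetime: 1d
    max_lifetime: 90d          # events older than this become eligible for purging
```

With these set, the server handles the purging on its own schedule instead of you driving the admin API from a script.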
There are also issues with the state disagreement / resolution algorithms across federation.
Has this been solved? Maybe it’s also due to database corruption, where some state is forgotten across the federation, and thus the algorithm breaks down?