Combining multiple files into modifiable archive?

16 comments, last by Shaarigan 4 years, 2 months ago

The use of many small files for a saved game doesn't seem necessary. If you “update” a saved game frequently, there are more natural options:

  • With enough memory for an extra copy of the game state, set aside a read-only snapshot of all data structures when you want to save the game, take as long as it takes to serialize this large blob to a compressed file while the game continues, and then discard the snapshot from memory. Loading the game requires deserializing the game state while the player waits.
  • Reliable game state snapshots (possibly a fixed initial state, e.g. the name of a map and an RNG seed, rather than all dynamic details of a game in progress) can be combined with recording and replaying an event log (small, append-only write operations), possibly containing just player commands.
  • A well-organized SQLite database can store events, game state or both, with unlimited and branching history and separate files only for independent games. Aggressive normalization can replace compression, write operations are very small, and you can also archive a database “offline” by stripping indexes and slack space before applying single-file compression.
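As a rough illustration of the second option (initial state plus an append-only event log), here is a minimal Python sketch. The `Game` class, its three commands, and the one-JSON-line-per-command log format are all made up for the example; the only point is that replaying the same commands from the same seed reproduces the same state.

```python
import json
import random


class Game:
    """Toy game state (hypothetical): a seeded RNG, a position and a score."""

    def __init__(self, seed: int):
        self.rng = random.Random(seed)
        self.x = 0
        self.score = 0

    def apply(self, command: str) -> None:
        # Every state change goes through here, so the same commands
        # applied to the same seed always give the same result.
        if command == "left":
            self.x -= 1
        elif command == "right":
            self.x += 1
        elif command == "loot":
            self.score += self.rng.randint(1, 6)


def log_command(log_path: str, command: str) -> None:
    # Append-only write: one small JSON line per player command.
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(json.dumps(command) + "\n")


def load_game(seed: int, log_path: str) -> Game:
    # Rebuild the state by replaying the whole log from the initial state.
    game = Game(seed)
    with open(log_path, encoding="utf-8") as log:
        for line in log:
            game.apply(json.loads(line))
    return game
```

Replay time grows with the length of the log, which is why the post suggests combining the log with occasional full snapshots.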

Omae Wa Mou Shindeiru

The problem is that I have an infinite universe, so I simply store each chunk in a separate file. It was the easiest solution before I ran into problems. At least, it is a problem when copying the savegame: it takes forever to copy 100000 files. Maybe having separate files is actually beneficial when syncing them between multiple servers. Compressing them separately would definitely make sense for this purpose.

What do you mean by “chunks”? Are there parts of the game world that can be somehow “deactivated” when they have no player-visible activity?

Hopefully you can split the potentially large game state into read-only data (e.g. maps), procedurally generated data that can be regenerated without saving it, events that can be forgotten (e.g. monsters in an RPG are reset to their initial positions) and events that need not be fragmented by chunk (e.g. which monsters in an RPG have been killed, anywhere).

Omae Wa Mou Shindeiru

Even if your universe is infinite, your data is still finite (or you would need infinite memory and infinite disk space). So how big is your data set? Are we talking gigabytes (which may still fit into RAM) or terabytes (which probably won't even fit on disk)?

If you do need to swap game state data to disk because it won't fit into RAM, I would recommend using a database (probably SQLite), with the chunks stored as binary blobs. But the last time I wasn't able to fit the game state in RAM was in the age of 16-bit DOS, so you probably don't even need to do that.
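A minimal sketch of the "chunks as binary blobs" idea, using Python's built-in `sqlite3` and `zlib` modules. The table name, the three-integer chunk key and the per-chunk zlib compression are assumptions made for the example, not something specified in this thread.

```python
import sqlite3
import zlib


def open_store(path: str) -> sqlite3.Connection:
    # One database file holds every chunk of one savegame.
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS chunks ("
        " x INTEGER NOT NULL, y INTEGER NOT NULL, z INTEGER NOT NULL,"
        " data BLOB NOT NULL,"
        " PRIMARY KEY (x, y, z))"
    )
    return conn


def save_chunk(conn: sqlite3.Connection, x: int, y: int, z: int, raw: bytes) -> None:
    # "with conn" commits on success and rolls back on an exception.
    with conn:
        conn.execute(
            "INSERT OR REPLACE INTO chunks (x, y, z, data) VALUES (?, ?, ?, ?)",
            (x, y, z, zlib.compress(raw)),
        )


def load_chunk(conn: sqlite3.Connection, x: int, y: int, z: int):
    # Returns the uncompressed chunk bytes, or None if it was never saved.
    row = conn.execute(
        "SELECT data FROM chunks WHERE x = ? AND y = ? AND z = ?", (x, y, z)
    ).fetchone()
    return None if row is None else zlib.decompress(row[0])
```

Copying the savegame then means copying a single file (while the game is not writing to it), instead of 100000 small ones.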

There is read-only data, like the generated structures, and data that can change when a player is nearby, including the blocks they placed/removed and plants that grow etc. Chunks far away from the player are not loaded and therefore not modified.

I think the best thing to do is to generate the archive once an entire planet is unloaded and then delete all the files. When the planet is loaded again, all writeable data is unpacked when accessed, while the read-only stuff can stay in the archive and is only read into memory.

Beosar said:

There is read-only data, like the generated structures, and data that can change when a player is nearby, including the blocks they placed/removed and plants that grow etc. Chunks far away from the player are not loaded and therefore not modified.

I think the best thing to do is to generate the archive once an entire planet is unloaded and then delete all the files. When the planet is loaded again, all writeable data is unpacked when accessed, while the read-only stuff can stay in the archive and is only read into memory.

So something like Minecraft? There, I think, unloaded chunks are really just on the disk in a directory of files, not in RAM at all. This takes advantage of the file system to deal with the problem, which generally means “wasting” a lot of space in case files grow, get deleted/replaced, etc., but in a way the file system can reclaim. It is also why many filesystems really hate being 90%+ full. I believe FAT32, for example, doesn't do this, and fragmentation was a huge issue. Also, any backup of a live game generally needs special consideration: don't just copy/zip/etc. the directory on a live server/game.

I think zip or other similar common formats are potentially still a good solution for backups. To keep the regular “in-use” game size down you might want to compress the stored chunks anyway, using something like deflate or some other commonly supported algorithm. Then they are already in the right format, and creating the zip backup/"save" is mostly concatenating files with some headers in between. Something like tar could also work: normally a tar contains mostly uncompressed files and gzip is then applied to the whole thing, but nothing technically says it must be done that way. (In “normal usage”, applying gzip/etc. on top of an archive of, say, JPEG, PNG or MP3 files is largely a waste of time, because those files are already compressed.)
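For the backup case, a minimal sketch with Python's standard `zipfile` module might look like the following. The function names and directory layout are hypothetical. Note that `zipfile` compresses each member itself, so this does not show the "concatenate already-deflated chunks" shortcut described above, only the plain "many files into one archive" step. It should be run on a save that is not currently being written to.

```python
import zipfile
from pathlib import Path


def pack_save(save_dir: str, archive_path: str) -> int:
    """Pack every file under save_dir into one zip; return the file count.

    archive_path must lie outside save_dir, otherwise the archive
    would try to include itself.
    """
    root = Path(save_dir)
    count = 0
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(root.rglob("*")):
            if path.is_file():
                # Store paths relative to the save directory.
                zf.write(path, arcname=path.relative_to(root).as_posix())
                count += 1
    return count


def read_chunk(archive_path: str, name: str) -> bytes:
    # Each member is compressed separately, so a single chunk can be
    # read without unpacking the rest of the archive.
    with zipfile.ZipFile(archive_path) as zf:
        return zf.read(name)
```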

As for general performance, small files are bad. Putting lots of small files into some sort of archive and then randomly accessing them is still bad; at best, intelligent in-memory caching (by the OS and/or, say, user-mode database engines) saves you. (Oh, you just wrote that? Well, I kept a hot copy here for you. You want this tiny file/row/record that comes after the last one? Lucky you, I read an X KB page last time and kept it, and it happens to include the thing you want now as well.)

One consideration is to make sure that the size of the “chunks” you store to disk is fairly large. Computers have lots of memory today, so what is one 20MB block instead of 40 separate 500KB items? These IO blocks might be different from the chunks that are actively updated in game; for example, you might combine 3x3 chunks into a block for IO purposes and save/load them together.

EDIT: I just searched, this is in fact basically how Minecraft does it. While chunks are 16x16x256 blocks, 32x32 chunks (so 512x512x256 blocks) are a “region”. The game saves/loads that entire region as one individual file.
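The grouping itself is simple arithmetic. Here is a small sketch assuming 32x32 chunks per region as described above; the function names and the file naming scheme are invented for the example and are not taken from Minecraft's actual format.

```python
REGION_SIZE = 32  # chunks per region along each horizontal axis


def region_of(cx: int, cz: int) -> tuple:
    # Floor division also maps negative chunk coordinates correctly:
    # chunk -1 belongs to region -1, not region 0.
    return cx // REGION_SIZE, cz // REGION_SIZE


def local_index(cx: int, cz: int) -> int:
    # Position of the chunk inside its region file, in 0..1023.
    return (cx % REGION_SIZE) + (cz % REGION_SIZE) * REGION_SIZE


def region_filename(cx: int, cz: int) -> str:
    # Hypothetical naming scheme: one file per region.
    rx, rz = region_of(cx, cz)
    return f"region_{rx}_{rz}.bin"
```

With this mapping, 1024 neighbouring chunks end up in one file, so walking around a loaded area touches a handful of files instead of thousands.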

By the way, a good database implementation can work in parallel, even for writing data. This is how our game design software's database worked. As long as you are not writing to the same chunks of data at the same time, you can acquire write access to as many different data chunks as you want. This works especially well if you use a technique like memory-mapped files, which are kept in OS virtual memory (a kind of double-buffered read/write) and synced to disk when needed.

This, however, requires clever lock management so that no data gets corrupted, but that is not much work.
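A very small sketch of what such lock management could look like, assuming one lock per chunk key (the `ChunkLocks` class is hypothetical and is not the database mentioned above). Writers to different chunks run in parallel, while writers to the same chunk take turns.

```python
import threading
from collections import defaultdict
from contextlib import contextmanager


class ChunkLocks:
    """Hands out one lock per chunk key (illustrative only)."""

    def __init__(self):
        self._guard = threading.Lock()              # protects the lock table
        self._locks = defaultdict(threading.Lock)   # chunk key -> lock

    @contextmanager
    def writing(self, key):
        # Look up (or create) the chunk's lock under the guard, then
        # release the guard before blocking on the chunk lock itself.
        with self._guard:
            lock = self._locks[key]
        with lock:
            yield
```

A writer would wrap each modification in `with locks.writing(chunk_key): ...`. The lock table here only ever grows; a real implementation would also have to drop locks for unloaded chunks and handle operations that span several chunks without deadlocking.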
