Model synthesis (also wave function collapse or 'wfc') is a family of constraint-solving algorithms commonly used in procedural generation, especially in the video game industry.
Some video games known to have utilized variants of the algorithm include Bad North, Townscaper, and Caves of Qud.
The first example of this type of algorithm was described by Paul Merrell, who termed it 'model synthesis' first in his 2007 i3D paper [1] and also presented at the 2008 SIGGRAPH conference and his 2009 PhD thesis.[2] The name 'wave function collapse' later became the popular name for a variant of that algorithm, after an implementation by Maxim Gumin was published in 2016 on a GitHub repository with that name.[3] Gumin's implementation significantly popularised this style of algorithm, with it becoming widely adopted and adapted by technical artists and game developers over the following years.[3]
There were a number of inspirations to Gumin's implementation, including Merrell's PhD dissertation, and convolutional neural network style transfer.[4][5] The popular name for the algorithm, 'wave function collapse', is from an analogy drawn between the algorithm's method and the concept of superposition and observation in quantum mechanics.[6][7] Some innovations present in Gumin's implementation included the usage of overlapping patterns, allowing a single image to be used as an input to the algorithm.[8]
Some have speculated that the reason Gumin's implementation proved more popular than Merrell's, may have been due to the 'model synthesis' implementation's lower accessibility, its 3D focus, or perhaps the general public's computing constraints at the time.[9]
One of the differences between Merrell & Gumin's implementation and 'wave function collapse' lies in the decision of which cell to 'collapse' next. Merrell's implementation uses a scanline approach, whereas Gumin's always selects as next cell the one with the lowest number of possible outcomes.[10]
The WFC or 'model synthesis' algorithm has some variants.[6] Gumin and Merrell's implementations are described below, and other variants are noted:
Merrell's earlier implementation is substantially the same as Gumin's with some minor differences.
(1) In Merrell's version, there is no requirement to select the cell with the lowest number of possible output states for collapse. Instead, a scanline approach is adopted. According to Merrell, this results in a lower failure rate of the model without any negative effect on quality.[10] Some commentators have noted however that the scanline approach to 'collapse' tends to result in directional artifacts.[11]
(2) Merrell's approach performs the algorithm in chunks, rather than all-at-once. This approach greatly reduces the failure rate for many large complex models; especially in a 3D space.[10]
In April 2023 Shaad Alaka and Rafael Bidarra of Delft University proposed 'Hierarchical Semantic wave function collapse'. Essentially, the algorithm is modified to work beyond simple, unstructured sets of tiles. Prior to their work, all WFC algorithm variants operated on a flat set of tile choices per cell.[12]
Their generalised approach organizes tile-sets into a hierarchy, consisting of abstract nodes called 'meta-tiles', and terminating nodes called 'leaf tiles'.[13] For example, on the first pass, WFC might make a certain tile a meta-tile of 'castle' type; which on a second pass will be collapsed into other tiles based on a rule, e.g. a 'wall' or 'grass' tile.
{{cite book}}
: CS1 maint: location missing publisher (link)
In 2016, Maxim Gumin unleashed the WFC algorithm, publishing a repository containing his initial implementation. Since then, WFC has had a profound impact on technical artists and game developers, getting adopted, adapted and used in commercially published and upcoming projects (Caves of Qud, Townscaper, Matrix Awakens).
In the case of WFC it is inspired off three distinctive but functionally similar algorithms& concepts; Texture Synthesis (Specifically Discrete Synthesis), Markov Chains & Quantum Mechanics. WFC was also additionally inspired by convolution neural network style transfer (CNN Style Transfer).
Years before, Merrel had published the conceptually identical Model Synthesis algorithm, though it did not catch on as much as WFC did, possibly due to its lower accessibility, main 3D focus and computing requirements at the time.
The first difference is in the step where we choose a cell and pick a label. The cells are chosen in a different order. Model synthesis sweeps through the grid in scanline order. WFC chooses the lowest entropy cell.
Unfortunately, this method can introduce directional artifacts.
To the best of our knowledge, however, all such WFC algorithm variants operate on a flat set of tile choices per cell.
We propose a solution to these shortcomings by introducing (i) the notion of meta-tile, an abstract tile that represents a semantic group of tiles, along with (ii) a graph-like structure that is able to represent the hierarchy among them and the constraints between them.