Sparse Set
SparseSet
is Shipyard's default storage. This chapter explains the basics
of how it works, the actual implementation is more optimized both in term of speed and memory.
Overview
To understand how Shipyard uses sparse sets, we must first understand how sparse sets work.
A basic sparse set is a data structure for storing integers. It is comprised of two
arrays: sparse
and dense
.
To insert an integer i
, we first set the next available slot in the dense
array to i
,
and then set sparse[i]
to the position of i
in the dense array. Let's walk through
an example.
We start off with an empty sparse set:
- Sparse Array:
[]
- Dense Array:
[]
To add 3
to our sparse set, we first append it to dense
and then set sparse[3]
to 0
(the position of 3
in dense
):
- Sparse Array:
[U, U, 0]
- Dense Array:
[3]
U
is short for uninitialised.
If we then add 0
, the sparse set will look like so:
- Sparse Array:
[1, U, 0]
- Dense Array:
[3, 0]
Searching a sparse set is O(1)
. To check if the integer i
exists we check whether
dense[sparse[i]] == i
. For example, to look up 3
in our example sparse set, we should
first check sparse[check]
. sparse[check]
is equal to 0
and so next we check
dense[0]
. Since dense[0] == 3
we can say that 3
is in our example sparse set.
Shipyard
So far, we've only seen how sparse sets can store integers. However, Shipyard has to store both entity IDs (basically just integers) and components, requiring us to use a slightly more complicated data structure. Shipyard makes two major changes to the traditional sparse set described above.
Firstly, Shipyard sparse sets are actually composed of three arrays: sparse
, dense
, and
data
. dense
stores the entity IDs, whereas data
contains the actual components of the
entities. dense
and data
are linked: their lengths are always the same. data[i]
is
the component for the entity with the ID located at dense[i]
. Whenever dense
changes,
so does data
.
Secondly, Shipyard uses multiple sparse sets, one for each type of component. The dense
array
in each sparse set contains the EntityIds
of the entities that have that
component.
Let's walk through an example:
#[derive(Component)]
struct FirstComponent(pub u32);
#[derive(Component)]
struct SecondComponent(pub u32);
let mut world = World::new();
let entity_id_0 = world.add_entity((FirstComponent(322),));
let entity_id_1 = world.add_entity((SecondComponent(17),));
let entity_id_2 = world.add_entity((FirstComponent(5050), SecondComponent(3154)));
let entity_id_3 = world.add_entity((FirstComponent(958),));
For this example we will assume that the entity IDs are in order i.e. entity_id_0 == 0
, entity_id_1 == 1
, etc.
The world data will now be stored in two sparse sets, one for each component:
SparseSet<FirstComponent>:
sparse: [0, U, 1, 2]
dense: [0, 2, 3]
data: [FirstComponent(322), FirstComponent(5050), FirstComponent(958)]
SparseSet<SecondComponent>:
sparse: [U, 0, 1]
dense: [1, 2]
data: [SecondComponent(17), SecondComponent(3154)]
U
is short for uninitialised.
Iteration
To iterate over a single sparse set, we can simply iterate over the data
array.
However, Shipyard also lets us iterate over multiple sparse sets.
To iterate over multiple sparse sets, we first pick the shortest set (comparing the lengths
of the dense
arrays) and then iterate over the dense
array of the shortest set. For each
entity ID, we check whether all the other sparse sets contain it, and if they do, we yield
the entity ID in the iterator.
Let's walk through an example with the sparse set we defined above:
let (firsts, seconds) = world
.borrow::<(View<FirstComponent>, View<SecondComponent>)>()
.unwrap();
for (first, second) in (&firsts, &seconds).iter() {
// Do some stuff
}
We first check which has the shortest dense set. The SecondComponent
sparse set does, so
we begin iterating over its dense
array.
The first entity ID is 1
. Since we are iterating over SecondComponent
, we already know
that entity 1
has a SecondComponent
; we just need to check if the entity has a
FirstComponent
. As described above, to check whether an entity has a component, we have
to check if dense[sparse[id]] == id
in the sparse set of the component. sparse[1]
in
SparseSet<FirstComponent>
is uninitialised and so we know that entity 1
does not have
a FirstComponent
.
The next entity that contains a SecondComponent
is 2
. However, this time, sparse[2]
in SparseSet<FirstComponent>
is equal to 1
and dense[1]
is equal to 2
, which means
that entity 2
has a FirstComponent
meaning we can yield it in the iterator.
After iterating over all the items in the SecondComponent
sparse set, we are done.
Removal
Removing is done by swap removing from both dense
and data
and updating sparse
in
consequence.
Continuing the previous example if we call:
world.remove::<(FirstComponent,)>(entity_id_0);
The internal representation now looks like this:
sparse: [U, U, 0, 1]
dense: [2, 3]
data: [FirstComponent(5050), FirstComponent(958)]
dense
and data
shifted to the left, the first element in sparse is now uninitialised,
and the indexes at sparse[2]
and sparse[3]
were updated.
Additional Resources
This blog post goes into more detail on sparse sets and compares them with archetypes, another common way of representing data in ECS libraries. The blog post is part of a larger series about the design and internals of ECS systems.