Parallelism

By the late 90s and early 2000s, CPUs were getting close to the physical limits of transistors, and manufacturers couldn't "just" make their products faster anymore. The solution: more cores.

Nowadays almost all devices come with multiple cores; it would be a shame to use just one.

In ECS there are two main ways to split work across cores: running systems on separate threads, or using a parallel iterator inside a single system. We call these two methods "outer-parallelism" and "inner-parallelism," respectively.

Outer-parallelism

We'll start with the simpler of the two. It's so simple there's nothing to do: workloads handle all the work for you. We even (almost) used multiple threads back in the Systems chapter.

As long as the "parallel" feature is set (it's enabled by default), workloads will try to execute systems in parallel as much as possible. A few rules define what's "possible":

  • Systems accessing AllStorages stop all threading.
  • There can't be any other access during an exclusive access, so ViewMut<T> will block threading for T.

When you make a workload, all of its systems are checked and batches (groups of systems that don't conflict) are created.
add_to_world returns information about these batches, including why each system didn't make it into the previous batch.
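To make the batching idea concrete, here is a simplified, self-contained sketch of how systems could be grouped by their declared borrows. This is not shipyard's actual scheduler; the SystemInfo struct, the component names, and the greedy "join the current batch or start a new one" strategy are all illustrative assumptions.

```rust
use std::collections::HashSet;

// Illustrative only, NOT shipyard's real scheduler: each system
// declares which storages it borrows and whether the borrow is
// exclusive (like ViewMut<T>) or shared (like View<T>).
#[derive(Clone)]
struct SystemInfo {
    name: &'static str,
    shared: HashSet<&'static str>,
    exclusive: HashSet<&'static str>,
}

// Two systems conflict when one's exclusive borrow overlaps
// anything the other one touches.
fn conflicts(a: &SystemInfo, b: &SystemInfo) -> bool {
    a.exclusive
        .iter()
        .any(|t| b.shared.contains(t) || b.exclusive.contains(t))
        || b.exclusive
            .iter()
            .any(|t| a.shared.contains(t) || a.exclusive.contains(t))
}

// Walk the systems in order; a system joins the current batch if it
// conflicts with none of its members, otherwise it starts a new one.
fn build_batches(systems: &[SystemInfo]) -> Vec<Vec<&'static str>> {
    let mut batches: Vec<Vec<SystemInfo>> = Vec::new();
    for sys in systems {
        match batches.last_mut() {
            Some(batch) if batch.iter().all(|s| !conflicts(s, sys)) => {
                batch.push(sys.clone())
            }
            _ => batches.push(vec![sys.clone()]),
        }
    }
    batches
        .iter()
        .map(|batch| batch.iter().map(|s| s.name).collect())
        .collect()
}

fn main() {
    let movement = SystemInfo {
        name: "movement",
        shared: ["Velocity"].into_iter().collect(),
        exclusive: ["Position"].into_iter().collect(),
    };
    let render = SystemInfo {
        name: "render",
        shared: ["Position"].into_iter().collect(),
        exclusive: HashSet::new(),
    };
    let physics = SystemInfo {
        name: "physics",
        shared: HashSet::new(),
        exclusive: ["Velocity"].into_iter().collect(),
    };
    // "movement" writes Position and reads Velocity, so it conflicts
    // with both other systems and runs alone; "render" and "physics"
    // touch disjoint storages and can share the second batch.
    println!("{:?}", build_batches(&[movement, render, physics]));
}
```

The same logic explains the report add_to_world gives back: a system lands in a later batch exactly when one of its borrows conflicts with a system already placed before it.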

Inner-parallelism

While parallel iterators do require us to modify our code, it's just a matter of using par_iter instead of iter.
Don't forget to import rayon: par_iter returns a ParallelIterator.

Example:

use rayon::prelude::*;
use shipyard::ViewMut;

fn many_u32s(mut u32s: ViewMut<u32>) {
    // par_iter splits the storage across rayon's thread pool
    u32s.par_iter().for_each(|i| {
        // -- snip --
    });
}

Don't replace all your iter method calls just yet, however! Using a parallel iterator comes with an upfront overhead cost. It only beats its sequential counterpart when the storage is large enough for the gain in processing efficiency to make up for that overhead.
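To get a feel for where that overhead comes from, here is a plain-std sketch (no shipyard or rayon) that splits a slice across threads by hand. The parallel_sum function, the chunking strategy, and the thread count are illustrative assumptions; rayon's thread pool amortizes the spawn/join cost shown here, but the principle is the same: small workloads don't pay for the coordination.

```rust
use std::thread;

// Plain-std illustration (not rayon or shipyard): each spawned
// thread sums one chunk of the slice. Spawning and joining threads
// is exactly the kind of coordination cost that makes parallel
// iteration slower than sequential iteration on small storages.
fn parallel_sum(data: &[u64], threads: usize) -> u64 {
    let chunk_len = data.len().div_ceil(threads).max(1);
    thread::scope(|scope| {
        let handles: Vec<_> = data
            .chunks(chunk_len)
            .map(|chunk| scope.spawn(move || chunk.iter().sum::<u64>()))
            .collect();
        handles.into_iter().map(|h| h.join().unwrap()).sum()
    })
}

fn main() {
    let data: Vec<u64> = (1..=1_000).collect();
    println!("sum = {}", parallel_sum(&data, 4));
}
```

For a thousand u64s the threaded version will usually lose to a simple data.iter().sum(); the crossover point depends on how much work each element needs, so when in doubt, measure both.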