For Drools Planner 6.0, I am working on **highly scalable construction heuristics**. Furthermore, they ‘ll be far more flexible and optionally configurable. This post focuses on how multiple planning variables affect scalability.

In the curriculum course use case, the planning entity *Lecture* has a planning variable *Period* and a planning variable *Room*. That makes 2 planning variables per entity. For example, dataset comp07 has 434 lectures, 25 periods and 20 rooms.

Suppose we use the construction heuristic *First Fit Decreasing* (FFD) on that. Simply explained: FFD sorts the lectures in a queue, takes the first lecture from that queue and assigns it the best remaining Period and Room. It repeats this until all lectures are assigned.

### What’s wrong with the old FFD implementation?

If there are multiple variables, the old implementation will, for every lecture, try a combination of every period and every room. So, for every lecture, it will try a **cartesian product** of the period value set and the room value set. In comp07 it tries 25 periods * 20 rooms = 500 moves for each of the 434 lectures. Suppose I have 5 variables of 1 000 values each: that would make 1 000 000 000 000 000 moves per entity!

The new implementation now also supports an alternative: do a **union** instead: for every lecture, try every period and assign the period, then try every room and assign the room. In comp07 it tries 25 period

So how does this affect score results and performance?

### Benchmark cartesianProduct vs union of multiple variables

Unsurprisingly, *cartesianProduct* has consistently better results than *union* (because it investigates much more combinations):

*Union* breaks more hard and soft constraints. But this is an unfair comparison: *union* took less time. So that leaves more time for local search to optimize those constraints.

If we look at the number of milliseconds to run both algorithms, we can clearly see that *cartesianProduct* takes much longer than *union*.

Especially if we look at the scalability graph, we can see a clear **scalability problem with cartesianProduct**, unlike with

*union*:

Here, there are only with 2 planning variables (period and room). More planning variables has even worse scalability.

So what can Local Search do with that extra time? Let’s give both configurations 2 minutes, so local search gets whatever time is left after the construction heuristic has finished:

For this dataset scale, there’s no clear winner in this case: they are competitive. If we scale out, I imagine we ‘ll see *union* as the clear winner. So if my dataset is small/medium and I have an execution time of at least 1 minute, I might go for *cartesianProduct*. Otherwise, I ‘d prefer *union*. In any case, I ‘d use the Drools Planner’s benchmarker toolkit to help me decide that.

Interested in the benchmark’s details? Take a look at the full benchmark report here.