### Refactoring the Drools Compiler

In the past few weeks we have been working hard on redesigning the architecture of Drools, the rules engine, and of the rest of our ecosystem of runtime engines.

In this blog post I want to focus a bit on the refactoring of the KnowledgeBuilder, a core component of the build infrastructure of the v6-v7 APIs. For the ongoing work on Drools 8, we are rethinking the entire design of this and other components.

In the latest stable version of the 7 series, the KnowledgeBuilderImpl is a 2500-line class that contains logic for processing resources of different types (such as DRL, DMN, BPMN, PMML, XLS, etc.).

On the main branch, the same class is now less than half that size, and most of what remains is public methods that we kept for backwards compatibility and that now delegate to new self-contained classes.

The main problem with the KnowledgeBuilderImpl was that it was both the class holding the logic for building assets and a sort of "context" object that was passed around to collect pieces of information.

The main goals of the refactoring were:

1. Refactoring most of the state inside the KnowledgeBuilderImpl into smaller objects with well-defined boundaries
2. Moving the building logic related to the DRL family (plain DRL, XLS, DSLs etc.) to a series of smaller, composable CompilationPhases
3. Ensuring that each CompilationPhase never referred directly to the KnowledgeBuilderImpl
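
The third goal can be illustrated with a minimal sketch. All names below (`Phase`, `GlobalsContext`, `SimpleGlobalsContext`, `GlobalsPhase`) are hypothetical simplifications chosen for this example and are not the actual Drools types. The point is only that a phase receives a narrow interface rather than the whole builder:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified phase contract (an assumption for this sketch).
interface Phase {
    void process();
    List<String> getResults(); // messages (e.g. errors) produced by this phase
}

// Narrow view of the builder state: only what a globals phase needs.
interface GlobalsContext {
    void addGlobal(String name, Class<?> type);
    Map<String, Class<?>> getGlobals();
}

// Self-contained holder of the globals state, with no other builder logic.
final class SimpleGlobalsContext implements GlobalsContext {
    private final Map<String, Class<?>> globals = new LinkedHashMap<>();

    @Override
    public void addGlobal(String name, Class<?> type) {
        globals.put(name, type);
    }

    @Override
    public Map<String, Class<?>> getGlobals() {
        return globals;
    }
}

// The phase depends on the narrow interface, never on the whole builder.
final class GlobalsPhase implements Phase {
    private final Map<String, String> declaredGlobals; // global name -> class name
    private final GlobalsContext context;
    private final List<String> results = new ArrayList<>();

    GlobalsPhase(Map<String, String> declaredGlobals, GlobalsContext context) {
        this.declaredGlobals = declaredGlobals;
        this.context = context;
    }

    @Override
    public void process() {
        for (Map.Entry<String, String> entry : declaredGlobals.entrySet()) {
            try {
                // Resolve the declared class name and register the global.
                context.addGlobal(entry.getKey(), Class.forName(entry.getValue()));
            } catch (ClassNotFoundException e) {
                // Unresolvable types are reported as results, not thrown.
                results.add("Unknown type for global '" + entry.getKey() + "': " + entry.getValue());
            }
        }
    }

    @Override
    public List<String> getResults() {
        return results;
    }
}
```

Because `GlobalsPhase` only needs a `GlobalsContext`, it can be exercised in isolation with `SimpleGlobalsContext`, without constructing a full builder.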

The same work involved the CompositeKnowledgeBuilderImpl (which decorates KnowledgeBuilderImpl) and the ModelBuilderImpl (which subclasses the KnowledgeBuilderImpl).

As you can imagine, the work was long and iterative, but the good news is that it is now possible to put the CompilationPhases in sequence, instantiating them without requiring the entire KnowledgeBuilder, but just its constituents.

The KnowledgeBuilderImpl itself now implements a few interfaces by delegating to self-contained objects (e.g. BuildResultCollector, GlobalVariableContext, PackageRegistryManager). The phases always refer to such interfaces, e.g., a RuleCompilationPhase only refers to a TypeDeclarationContext.
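
The delegation pattern described here might look roughly like the following sketch. The interfaces and classes (`ResultCollector`, `GlobalContext`, `FacadeBuilder` and their implementations) are hypothetical and much smaller than the real Drools ones. They only show a facade that implements several narrow interfaces by forwarding to self-contained objects:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical narrow interfaces; the real Drools ones are richer.
interface ResultCollector {
    void addResult(String message);
    List<String> getAllResults();
}

interface GlobalContext {
    void addGlobal(String name, Class<?> type);
    Class<?> getGlobal(String name);
}

// Self-contained objects, each owning one piece of state.
final class ResultCollectorImpl implements ResultCollector {
    private final List<String> results = new ArrayList<>();

    @Override
    public void addResult(String message) {
        results.add(message);
    }

    @Override
    public List<String> getAllResults() {
        return results;
    }
}

final class GlobalContextImpl implements GlobalContext {
    private final Map<String, Class<?>> globals = new HashMap<>();

    @Override
    public void addGlobal(String name, Class<?> type) {
        globals.put(name, type);
    }

    @Override
    public Class<?> getGlobal(String name) {
        return globals.get(name);
    }
}

// Facade kept for backwards compatibility: it holds no state of its own
// and forwards every call to the matching self-contained object.
final class FacadeBuilder implements ResultCollector, GlobalContext {
    private final ResultCollector results = new ResultCollectorImpl();
    private final GlobalContext globals = new GlobalContextImpl();

    @Override
    public void addResult(String message) {
        results.addResult(message);
    }

    @Override
    public List<String> getAllResults() {
        return results.getAllResults();
    }

    @Override
    public void addGlobal(String name, Class<?> type) {
        globals.addGlobal(name, type);
    }

    @Override
    public Class<?> getGlobal(String name) {
        return globals.getGlobal(name);
    }
}
```

With this shape, code that accepts a `ResultCollector` works the same whether it is handed the legacy facade or a standalone `ResultCollectorImpl`, which is what lets existing callers keep working while new code uses the smaller objects directly.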

The result is that it is now possible to put such phases in sequence to produce a self-contained rule compiler:

    List<CompilationPhase> phases = asList(
            new ImportCompilationPhase(packageRegistry, packageDescr),
            new TypeDeclarationAnnotationNormalizer(annotationNormalizer, packageDescr),
            new EntryPointDeclarationCompilationPhase(packageRegistry, packageDescr),
            new AccumulateFunctionCompilationPhase(packageRegistry, packageDescr),
            new TypeDeclarationCompilationPhase(packageDescr, typeBuilder, packageRegistry, null),
            new WindowDeclarationCompilationPhase(packageRegistry, packageDescr, typeDeclarationContext),
            new FunctionCompilationPhase(packageRegistry, packageDescr, configuration),
            new ImmutableGlobalCompilationPhase(packageRegistry, packageDescr, globalVariableContext),
            new RuleAnnotationNormalizer(annotationNormalizer, packageDescr),
            new RuleValidator(packageRegistry, packageDescr, configuration),
            new ImmutableRuleCompilationPhase(packageRegistry, packageDescr, parallelRulesBuildThreshold,
                    attributesForPackage, resource, typeDeclarationContext),
            new ConsequenceCompilationPhase(packageRegistryManager)
    );
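
How a list of phases like this could be driven is sketched below. The `CompilationStep` contract, `NamedStep` and `StepRunner` are assumptions made for illustration and do not claim to match the actual Drools interfaces. The sketch simply runs each step in order and gathers whatever results the steps report:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical minimal phase contract, assumed for this sketch.
interface CompilationStep {
    void process();
    List<String> getResults();
}

// Toy step: records its name in a shared log and optionally reports an error.
final class NamedStep implements CompilationStep {
    private final String name;
    private final List<String> log;
    private final String error; // null when the step succeeds

    NamedStep(String name, List<String> log, String error) {
        this.name = name;
        this.log = log;
        this.error = error;
    }

    @Override
    public void process() {
        log.add(name);
    }

    @Override
    public List<String> getResults() {
        return error == null
                ? Collections.<String>emptyList()
                : Collections.singletonList(error);
    }
}

final class StepRunner {
    // Runs each step in list order and gathers all reported results.
    static List<String> runAll(List<CompilationStep> steps) {
        List<String> all = new ArrayList<>();
        for (CompilationStep step : steps) {
            step.process();
            all.addAll(step.getResults());
        }
        return all;
    }
}
```

This runner keeps going after a step reports an error. Stopping at the first failing step would be an equally reasonable choice, depending on whether later steps can still produce useful diagnostics.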


The same is true for both the traditional in-memory compiler and the new canonical model compiler.

This huge refactoring makes it possible to reuse most of the logic in the traditional compilation flow in a new compiler architecture that is currently being worked on. Stay tuned for more details!