Esteban's Blog

/dev/rnd

Knowledge Agent: Incremental Change Set Build Implementation

This post continues "Knowledge Agent incremental change-set processing and binary diff". Here I will try to explain the internals of how kagent processes change sets incrementally. The content is very technical and is aimed at developers trying to get the most out of incremental change-set processing. I am also posting it to get feedback and comments on the current implementation.

I will walk through the implementation using a test scenario in which monitored resources are added, modified and deleted, and explain how each operation affects the kagent and its kbase.

KnowledgeAgentImpl resource and definition mappings

The current implementation of the KnowledgeAgent interface maintains a map of resources to kdefinitions. The resource -> kdefinition mapping is implemented using a Map keyed by resource, whose values are the collections of kdefinitions compiled from each resource. The Agent keeps this map up to date at all times: when a resource is modified, its changes are applied to the map.

Adding Resources

When the Agent is notified about new resources, it receives a ChangeSet with its “resourcesAdded” collection filled. KnowledgeAgentImpl first compiles each resource to get its kdefinitions, then adds them to the kbase, and finally creates a new entry in the map.

Adding a new resource to the agent can also be done on demand, using a change-set XML like the following:

<change-set>
   <add>
      <resource source='http://localhost:9000/rules1.drl' type='DRL' />
      <resource source='http://localhost:9000/rules2.drl' type='DRL' />
   </add>
</change-set>

The agent will set up a new listener on every added resource to detect further modifications.

If, in our example, rules1.drl defines two rules (rule1 and rule2) and rules2.drl defines just one (rule3), the resource/kdefinitions mapping of KnowledgeAgentImpl will have the following structure:

{
http://localhost:9000/rules1.drl => [rule1, rule2],
http://localhost:9000/rules2.drl => [rule3]
}

Because the resources were new, no diff is performed.

If a new resource redefines an existing kdefinition of a previous resource, the current kdefinition inside the kbase is replaced with the new version. Let’s suppose that we add a new resource (rules3.drl) that redefines rule1. The previous mapping will now look like this (after the agent is notified):

{
http://localhost:9000/rules1.drl => [rule1, rule2],
http://localhost:9000/rules2.drl => [rule3],
http://localhost:9000/rules3.drl => [rule1]
}

You may notice that there is still a mapping between rules1.drl and rule1, even though the version of rule1 now in the kbase comes from rules3.drl. While this stale entry is a problem we are looking at, in this instance it is harmless because rules1.drl actually has a definition for rule1.

Even though rule1 was modified in the kbase, no diff was performed. This is because the Agent only performs a diff when a resource is modified, not when a resource is added. In this case, rule1 was updated in the kbase simply because it was added as a new rule. We are trying to determine whether it would be useful to remove the rules1.drl => rule1 mapping or not. The next time rules1.drl gets modified, this mapping will be fixed.
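To make the add path concrete, here is a minimal sketch in plain Java. It is not KnowledgeAgentImpl code: the class AddSketch, its fields and the defs helper are hypothetical, resources and kdefinitions are reduced to strings, and the kbase is modeled as a simple name -> body map. It only illustrates the behavior described above: definitions are put into the kbase without any diff, and the earlier resource keeps its entry for an overwritten rule.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical model of the add path; this is not KnowledgeAgentImpl code.
class AddSketch {
    // resource URL -> names of the kdefinitions compiled from it
    final Map<String, Set<String>> mapping = new LinkedHashMap<>();
    // definition name -> definition body currently active in the kbase
    final Map<String, String> kbase = new LinkedHashMap<>();

    // Builds a name -> body map from alternating name/body arguments.
    static Map<String, String> defs(String... nameBodyPairs) {
        Map<String, String> result = new LinkedHashMap<>();
        for (int i = 0; i < nameBodyPairs.length; i += 2) {
            result.put(nameBodyPairs[i], nameBodyPairs[i + 1]);
        }
        return result;
    }

    // "compiled" stands in for the kdefinitions obtained by compiling the resource.
    void add(String resource, Map<String, String> compiled) {
        kbase.putAll(compiled); // same-named definitions are replaced, no diff
        mapping.put(resource, new LinkedHashSet<>(compiled.keySet()));
    }

    public static void main(String[] args) {
        AddSketch agent = new AddSketch();
        agent.add("rules1.drl", defs("rule1", "body-a", "rule2", "body-b"));
        agent.add("rules2.drl", defs("rule3", "body-c"));
        agent.add("rules3.drl", defs("rule1", "body-a2")); // redefines rule1
        System.out.println(agent.mapping); // rules1.drl still lists rule1
        System.out.println(agent.kbase.get("rule1"));
    }
}
```

In this model the entry of rules1.drl is never touched by the third add, which is exactly the stale mapping discussed above.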

Modifying Resources

As previously mentioned, KnowledgeAgentImpl adds a listener to every added resource. When a monitored resource is modified, a new change-set is created with its “resourcesModified” attribute filled and passed to the agent for processing.

A modified resource can update the current kbase in three different ways:

  • Add new definitions.
  • Modify existing definitions.
  • Remove existing definitions.

Any combination of these operations is supported. To support them, when KnowledgeAgentImpl processes a modified resource, it first compiles the new version of the resource and then compares each definition with the mapped ones.

Continuing with the example, if we add a new rule definition (rule4) to rules2.drl, the agent will be notified about this change and after processing it the mapping will be:

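{
http://localhost:9000/rules1.drl => [rule1, rule2],
http://localhost:9000/rules2.drl => [rule3, rule4],
http://localhost:9000/rules3.drl => [rule1]
}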


The Agent compares the version of rule3 in the modified resource (rules2.drl) with the one it has in the map. Because rule3 didn’t change, it is not replaced in the kbase. The new version of rules2.drl contains a new rule (rule4); because this rule is not present in the map, the Agent assumes that it is new and adds it to the kbase.

You can also modify a resource to overwrite an existing definition. Suppose we add a new definition of rule3 to rules3.drl and also remove the previous definition of rule1 from that file. After the agent processes the generated change-set, the maps will look like this:
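The comparison step can be sketched as a small classification routine. This is only an illustration under simplifying assumptions: the class DiffSketch and its fields are hypothetical, definitions are plain strings, and "changed" means the two strings differ, whereas the real agent compares compiled kdefinitions.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper: classifies the definitions of a modified resource.
class DiffSketch {
    final Set<String> added = new TreeSet<>();
    final Set<String> modified = new TreeSet<>();
    final Set<String> removed = new TreeSet<>();
    final Set<String> unchanged = new TreeSet<>();

    // mapped:   name -> body as last seen for this resource
    // compiled: name -> body in the new version of the resource
    static DiffSketch diff(Map<String, String> mapped, Map<String, String> compiled) {
        DiffSketch d = new DiffSketch();
        for (Map.Entry<String, String> e : compiled.entrySet()) {
            String old = mapped.get(e.getKey());
            if (old == null) {
                d.added.add(e.getKey());       // not in the map: assumed new
            } else if (old.equals(e.getValue())) {
                d.unchanged.add(e.getKey());   // same content: kbase untouched
            } else {
                d.modified.add(e.getKey());    // different content: replace in kbase
            }
        }
        for (String name : mapped.keySet()) {
            if (!compiled.containsKey(name)) {
                d.removed.add(name);           // mapped before, gone now
            }
        }
        return d;
    }

    public static void main(String[] args) {
        Map<String, String> mapped = new LinkedHashMap<>();
        mapped.put("rule3", "body-c");
        Map<String, String> compiled = new LinkedHashMap<>();
        compiled.put("rule3", "body-c");
        compiled.put("rule4", "body-d");
        DiffSketch d = diff(mapped, compiled);
        System.out.println("added=" + d.added + " unchanged=" + d.unchanged);
    }
}
```

With the rules2.drl example, rule3 lands in unchanged and rule4 in added, matching the description above.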

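{
http://localhost:9000/rules1.drl => [rule1, rule2],
http://localhost:9000/rules2.drl => [rule3, rule4],
http://localhost:9000/rules3.drl => [rule3]
}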

In this case, rules3.drl is the modified resource. The map contains an entry for rule1, but this rule is not present in the new version of the resource, hence it is removed from the kbase. Rule3 appears as a new definition and is then added to the kbase, overriding the definition that came from rules2.drl (the rules2.drl => rule3 entry is now stale). Again, note that this overwriting is not detected by the Agent.

When we remove all the definitions from a resource, it stays mapped in the agent. This is because the agent is still monitoring the resource for changes. We can see an example of this situation if we modify rules3.drl so that it contains just the package name. This will create a new change-set that will leave the agent’s maps in the following state:

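{
http://localhost:9000/rules1.drl => [rule1, rule2],
http://localhost:9000/rules2.drl => [rule3, rule4],
http://localhost:9000/rules3.drl => []
}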

Please notice that removing the last version of a rule (in this case rule3 from rules3.drl) doesn’t put any previous version of it back in the kbase. This is true even if we modify the previous resource but not the definition itself. To make it clear, suppose that we modify the definition of rule4 in rules2.drl. The agent will process the resource and compare the mapped definitions against the new ones. Before comparing the original definition of rule3 with the definition present in the modified resource, the Agent checks whether the current definition is still present in the kbase. In this case, rule3 doesn’t exist anymore (it was removed when we modified rules3.drl), so rule3 is not added to the kbase. But the mapping will be updated, and rules2.drl will no longer be mapped to rule3:

{
http://localhost:9000/rules1.drl => [rule1, rule2],
http://localhost:9000/rules2.drl => [rule4],
http://localhost:9000/rules3.drl => []
}

This behavior could cause problems: the next time rules2.drl changes (suppose that we add a new rule), rule3 will appear again. Because it is not mapped, rule3 will be new for the Agent. This could cause unintended modifications of the kbase.

The change-set needed to mark a resource as modified (if you want to do it manually) should look like this:
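The disappearing and reappearing rule3 can be reproduced with a small model of the modify path. Everything here is a simplification written for this explanation: ModifySketch is a hypothetical class, definitions are strings, and the "still present in the kbase" check is done by name only. It is meant as a sketch of the behavior described above, not as the agent's actual code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model of the modify path; not KnowledgeAgentImpl code.
class ModifySketch {
    // resource -> (definition name -> body as last mapped)
    final Map<String, Map<String, String>> mapping = new LinkedHashMap<>();
    // definition name -> body currently active in the kbase
    final Map<String, String> kbase = new LinkedHashMap<>();

    void modify(String resource, Map<String, String> compiled) {
        Map<String, String> old = mapping.getOrDefault(resource, new LinkedHashMap<>());
        Map<String, String> updated = new LinkedHashMap<>();
        // Mapped definitions missing from the new version leave the kbase.
        for (String name : old.keySet()) {
            if (!compiled.containsKey(name)) {
                kbase.remove(name);
            }
        }
        for (Map.Entry<String, String> e : compiled.entrySet()) {
            String name = e.getKey();
            if (!old.containsKey(name)) {
                kbase.put(name, e.getValue());      // unmapped: treated as new
                updated.put(name, e.getValue());
            } else if (kbase.containsKey(name)) {
                if (!old.get(name).equals(e.getValue())) {
                    kbase.put(name, e.getValue());  // changed: replaced in kbase
                }
                updated.put(name, e.getValue());
            }
            // Mapped but no longer in the kbase: not re-added, dropped from the mapping.
        }
        mapping.put(resource, updated);
    }

    public static void main(String[] args) {
        ModifySketch agent = new ModifySketch();
        Map<String, String> mapped = new LinkedHashMap<>();
        mapped.put("rule3", "body-c");
        mapped.put("rule4", "body-d");
        agent.mapping.put("rules2.drl", mapped);
        agent.kbase.put("rule4", "body-d"); // rule3 was already removed via rules3.drl

        Map<String, String> v2 = new LinkedHashMap<>(mapped);
        v2.put("rule4", "body-d2");         // only rule4 changes
        agent.modify("rules2.drl", v2);
        System.out.println(agent.mapping.get("rules2.drl").keySet());

        Map<String, String> v3 = new LinkedHashMap<>(v2);
        v3.put("rule5", "body-e");          // unrelated new rule
        agent.modify("rules2.drl", v3);
        System.out.println(agent.kbase.keySet()); // rule3 is back
    }
}
```

The second call shows the problem: a change that only adds rule5 also brings rule3 back, because rule3 was no longer mapped to rules2.drl.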

<change-set>
   <modify>
      <resource source='http://localhost:9000/rules1.drl' type='DRL' />
   </modify>
</change-set>

Removed Resources

The last operation that can be performed on a resource is to remove it from the kbase. This means removing all the definitions present in the resource. When a resource is removed, the agent also releases any listener it had on that resource. In our example, we can remove the rules3.drl resource. If we do that, the agent’s maps will have the following structure:

{
http://localhost:9000/rules1.drl => [rule1, rule2],
http://localhost:9000/rules2.drl => [rule4]
}

Because rules3.drl didn’t contain any mapped definition, the kbase wasn’t modified. The listener for rules3.drl was also removed. Strange things happen when removing resources that contain overwritten definitions. If we add a new definition for rule1 in rules2.drl and then remove rules1.drl, rule1’s definition will be removed from the kbase. And because rules2.drl also contains a definition of rule3, this rule will be present in the kbase again (rule3 reappears because it was no longer mapped to rules2.drl).

The resulting mapping of this operation will be:

{
http://localhost:9000/rules2.drl => [rule1, rule3, rule4]
}

The rule1 entry is stale: rule1 is still mapped to rules2.drl, but it is no longer in the kbase.

The current implementation only supports the deletion of rule and function definitions. Deleting type declarations is not yet supported by Drools, and there is an open bug (https://jira.jboss.org/jira/browse/JBRULES-2374) that prevents queries from being removed from the kbase. Please note that this also applies to modifications of definitions: remember that when a definition is modified, it is first removed from the kbase and then added again.
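The removal step and the stale entry it leaves behind can be modeled as follows. RemoveSketch and staleEntries are hypothetical names introduced for this sketch; the kbase is reduced to a set of definition names, and the starting state is entered by hand to match the scenario just described.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical model of resource removal; not KnowledgeAgentImpl code.
class RemoveSketch {
    // resource -> names of the definitions mapped to it
    final Map<String, Set<String>> mapping = new LinkedHashMap<>();
    // names of the definitions currently in the kbase
    final Set<String> kbase = new LinkedHashSet<>();

    // Removes every definition mapped to the resource, then forgets the resource.
    void remove(String resource) {
        Set<String> mapped = mapping.remove(resource);
        if (mapped != null) {
            kbase.removeAll(mapped);
        }
    }

    // Lists mapped definitions that the kbase no longer contains.
    Map<String, Set<String>> staleEntries() {
        Map<String, Set<String>> stale = new LinkedHashMap<>();
        for (Map.Entry<String, Set<String>> e : mapping.entrySet()) {
            Set<String> missing = new LinkedHashSet<>(e.getValue());
            missing.removeAll(kbase);
            if (!missing.isEmpty()) {
                stale.put(e.getKey(), missing);
            }
        }
        return stale;
    }

    public static void main(String[] args) {
        RemoveSketch agent = new RemoveSketch();
        agent.mapping.put("rules1.drl", new LinkedHashSet<>(Arrays.asList("rule1", "rule2")));
        agent.mapping.put("rules2.drl", new LinkedHashSet<>(Arrays.asList("rule1", "rule3", "rule4")));
        agent.kbase.addAll(Arrays.asList("rule1", "rule2", "rule3", "rule4"));

        agent.remove("rules1.drl");
        System.out.println(agent.kbase);          // rule1 and rule2 are gone
        System.out.println(agent.staleEntries()); // rules2.drl still lists rule1
    }
}
```

Note that staleEntries only catches mappings whose definition has left the kbase. It would not catch the other kind of dirty mapping, where the definition is still in the kbase but comes from a different resource.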

To remove a resource manually, you can create an XML change-set similar to this:

<change-set>
   <remove>
      <resource source='http://localhost:9000/rules1.drl' type='DRL' />
   </remove>
</change-set>

Known Issues

There are two major issues in this implementation: dirty mapping and crash recovery. Dirty mapping occurs when a definition is overwritten by another resource. The agent’s mapping keeps the original resource/definition entry, and this can cause problems when resources are modified or removed. I have already described these problems in this post. One solution could be to update all the resource mappings when a resource is modified. This way we could avoid all the stale mappings. As a good practice, and to avoid dirty mapping, knowledge definitions should only be modified in their original resources.

Another major issue is that the last state of the agent’s kbase cannot be restored after a crash. The agent doesn’t keep the order in which the resources should be applied. Even worse, the resources could have been modified since the original version. The agent should maintain an up-to-date copy of the kbase’s definitions after each change-set is processed.
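One possible reading of the "update all the resource mappings" idea is sketched below. This is not something the agent currently does, and OwnershipSketch and claim are hypothetical names: whenever a resource defines a set of names, those names are dropped from every other resource's entry, so each definition is mapped to at most one resource.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of a possible fix; the agent does not work this way today.
class OwnershipSketch {
    // resource -> names of the definitions it currently owns
    final Map<String, Set<String>> mapping = new LinkedHashMap<>();

    // Records that "resource" now defines "names" and drops those names
    // from the entries of all other resources.
    void claim(String resource, Set<String> names) {
        for (Map.Entry<String, Set<String>> e : mapping.entrySet()) {
            if (!e.getKey().equals(resource)) {
                e.getValue().removeAll(names);
            }
        }
        mapping.put(resource, new LinkedHashSet<>(names));
    }

    public static void main(String[] args) {
        OwnershipSketch agent = new OwnershipSketch();
        agent.claim("rules1.drl", new LinkedHashSet<>(Arrays.asList("rule1", "rule2")));
        agent.claim("rules2.drl", new LinkedHashSet<>(Arrays.asList("rule3")));
        agent.claim("rules3.drl", new LinkedHashSet<>(Arrays.asList("rule1"))); // overwrites rule1
        System.out.println(agent.mapping); // rule1 now appears only under rules3.drl
    }
}
```

The trade-off is that the map no longer records that rules1.drl also contains a definition of rule1, so removing rules3.drl later would not restore the older version without recompiling rules1.drl.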

8 comments on “Knowledge Agent: Incremental Change Set Build Implementation”

  1. Pingback: Tweets that mention Knowledge Agent: Incremental Change Set Build Implementation « Esteban's Blog -- Topsy.com

  2. Giovanni Motta
    May 27, 2010

    Good post, clear explanation. Regards.

  3. Jamie Shaw
    May 13, 2011

    So what happens if I have a session executing at the time that kagent detects a change in a .drl file? Does that session complete with the old rules or are the rules modified in the middle of the execution?

    • esteban
      May 31, 2011

      Strange things could happen in that situation 😛
      The best way to handle this is to turn off ResourceScanner and to call ResourceScanner.scan() manually when you know you are in a “safe-point”.

  4. sumatheja
    May 1, 2012

    Hi,
    I have a problem refreshing the stateful knowledge session whenever there is a change in the rule assets in Guvnor. Below is the code I use for configuration:

    StatefulKnowledgeSession ksession = null;
    KnowledgeAgentConfiguration kconf = KnowledgeAgentFactory.newKnowledgeAgentConfiguration();
    KnowledgeBaseConfiguration config = KnowledgeBaseFactory.newKnowledgeBaseConfiguration();
    config.setOption( EventProcessingOption.STREAM );
    KnowledgeBase kbase = KnowledgeBaseFactory.newKnowledgeBase( config );
    kconf.setProperty("drools.agent.newInstance", "false");
    KnowledgeAgent kagent = KnowledgeAgentFactory.newKnowledgeAgent("MyAgent1", kbase, kconf);
    kagent.applyChangeSet(ResourceFactory.newUrlResource("http://127.0.0.1:8380/drools-guvnor/LDIChangeSet.xml"));

    kbase = kagent.getKnowledgeBase();
    ksession = kbase.newStatefulKnowledgeSession();

    I have a session into which I’ve inserted thousands of facts. Whenever there is a change in the rule base, do I need to reinsert all the facts using the new knowledge base? Any help would be appreciated. Thanks in advance.

    • esteban
      May 1, 2012

      The changes you make in your rules should affect all the stateful sessions you have. What is the problem you are experiencing?

      • sumatheja
        May 1, 2012

        The existing session was not updated with the latest knowledge base… However, just a while ago I added

        ResourceFactory.getResourceChangeNotifierService().start();
        ResourceFactory.getResourceChangeScannerService().start();

        Everything works fine now 🙂

        Can you tell me the use of

        config.setOption( EventProcessingOption.STREAM );

        Thanks for the response

      • esteban
        May 1, 2012

        STREAM mode is related to drools-fusion. You can read more about it in drools-fusion documentation: http://docs.jboss.org/drools/release/5.4.0.CR1/drools-fusion-docs/html_single/index.html

Information

This entry was posted on May 26, 2010 in drools, java.