Newsletter #12: Apr/May/Jun Edition

Writing expert systems is both a science and an art. Writing efficient expert systems is a science, an art, and voodoo :-) Seriously, it can be difficult to write expert systems. Even when they work, it is amazing how much faster they can be made with a little thought and experimentation. In the Tips page of the IE Help File, we discussed some methods for making your expert systems more efficient. In this article, we are going to walk you through the decisions and considerations we make when speeding up an expert system. Hopefully, it will give you some tips for optimizing your own expert systems.

Note that this entire article deals with optimizing your expert systems from within the expert system language; none of it tries to optimize the engine underneath. As such, even if you don't own the IECS, you can follow along by downloading the console demo (http://www.riversoftavg.com/Files/IEConsolev20.zip). The expert system we developed and refined is also available for download (www.RiverSoftAVG.com/Files/OptimizationTutorial.zip).

For this article, we created a small expert system. Its entire purpose is to take a "number" fact and keep incrementing it by 1. The expert system contains 10 rules, which share the work of incrementing the fact based on its current value. In other words, rule 1 is activated when the "number" fact value is between 0 and 99 and its action increments the number by 1, rule 2 is activated when the value is between 100 and 199, etc. This is obviously a contrived example, but it gives us a few important properties for our purposes:

	Many rules which must be evaluated every time the "number" fact changes
	Each rule has constraints which we can work to optimize
	There is only one execution path to optimize, so it is simpler to work with as an example. Your expert systems will have multiple paths of execution, so you need to be careful not to optimize a path which occurs rarely at the expense of another path which occurs often.

The basic expert system is in "Simple ES, Original.IE". All the rules are the same: they match on the number fact and have constraints to check that the fact is actually a number and in the range they want. Here is the first rule:

(defrule match-and-add-01
  "match numbers and add 1"
  ?f1 <- (num ?n&:(numberp ?n)&:(>= ?n 0)&:(< ?n 100))
  =>
  (modify ?f1 (__Data (+ ?n 1)))
)

Starting the Advanced Console Demo, we can load this file and run it. Be careful: this expert system executes indefinitely. To bound our execution, we tell the console to fire 2000 rules by typing (run 2000). After a few moments, the inference engine finishes and we get the stats. The number fact is now at 2000. In our case, we executed 2000 rules in 1.996 seconds. That figure is an average: we reloaded the expert system and executed it again 3 times. Ok, this is the baseline we have to beat.

Do big optimizations first

Optimizing an expert system is a lot like optimizing code. Don't waste your time optimizing small parts of the expert system for small gains when you should be looking for the big gains. In practice, this means looking for ways to reduce the number of rules, the number of patterns, the number of fact templates, the number of facts, etc. It is the expert system equivalent of getting rid of routines entirely rather than optimizing the code within them, or of switching to a better data structure, such as a hash table instead of a list, to speed up retrieving values.

For our example, we could obviously discard all the rules and constraints and just write one rule that matches a number fact and increments it by 1:

(defrule match-and-add-01
  "match numbers and add 1"
  ?f1 <- (num ?n)
  =>
  (modify ?f1 (__Data (+ ?n 1)))
)

If we run this, the expert system is over 2.5 times as fast! Even if we modify the actions part of the rule and put in a giant IF statement (one branch for each of the previous rules), the expert system is still 2.4 times as fast. However, that would ruin all the fun (and make for a short article). And in practice, there are many occasions where it is not possible to optimize the "big picture". So we are going to cheerfully throw the above optimization away and try to optimize the "little things." :-)

Investigate Constraints, Patterns, and Sharing

The first thing we need to look at is constraints versus function calls. Depending on the expert system engine and real world circumstances, it can be faster to use connective constraints (the & (and) and | (or) symbols) or to use a function call. If this expert system were matching discrete values, e.g., (password ?p), in some cases it would be faster to use constraints ( (password ?p&password1|password2) ) and in others functions ( (password ?p&:(or (eq password1 ?p) (eq password2 ?p))) ). For OR constraints, the IECS actually converts the first pattern above into the second pattern, so we would see no difference in that case. However, in our example, we specify AND constraints, e.g., the number must be a number AND greater than or equal to 0 AND less than 100. The IECS leaves the separate AND constraints as is. Let's try converting the AND constraints to one AND function call which calls the other functions:

(defrule match-and-add-01
  "match numbers and add 1"
  ?f1 <- (num ?n&:(and (numberp ?n) (>= ?n 0) (< ?n 100)))
  =>
  (modify ?f1 (__Data (+ ?n 1)))
)

If we change all the rules to the sample above ("Simple ES, one AND Constraint.ie"), load it, and run 2000, amazingly enough we get a performance boost of 15%! Not bad.

Well, let's keep going. So far, we have 10 rules, each with its own unique pattern. The IECS uses a RETE network, which optimizes rules by sharing common patterns: a shared pattern only has to be evaluated once, no matter how many rules use it. Let's try taking the constraints out of the pattern entirely and putting them in a TEST pattern. This way, every rule can share the (num ?n) pattern. Our rules would now look similar to the one below:

(defrule match-and-add-01
  "match numbers and add 1"
  ?f1 <- (num ?n)
  (test (and (numberp ?n) (>= ?n 0) (< ?n 100)))
  =>
  (modify ?f1 (__Data (+ ?n 1)))
)

Let's change rules, lock and load :-) This time, we got a speed boost of 30% over the original expert system. Now we are rolling!

Ok, sharing patterns worked in that case. Maybe we can share more: all the rules have the test (numberp ?n). Let's break that test out into its own TEST pattern and try it. Here is what a rule would look like ("Simple ES, Share Patterns (2).ie"):

(defrule match-and-add-01
  "match numbers and add 1"
  ?f1 <- (num ?n)
  (test (numberp ?n))
  (test (and (>= ?n 0) (< ?n 100)))
  =>
  (modify ?f1 (__Data (+ ?n 1)))
)

When we try this expert system, we get only a 30% boost again. Darn! More than likely, that is a product of the simplicity of the shared TEST pattern. It is hard to get a faster function call than just checking if something's a number!

One more optimization to be aware of: the order of patterns can make a big difference. Even just reordering the TEST patterns so that the numberp TEST pattern comes last can change performance noticeably. Of course, such an expert system would not be correct, because the range tests would raise exceptions if the value in the number fact was not a number. But in your own expert systems, think about the ordering of your rule patterns. Reducing the number of partial matches will really help with efficiency. For our example, we don't see any ordering changes we can make safely, so let's go on.
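To make the ordering advice concrete, here is a hypothetical sketch. The employee, on-call, and page facts and both rule names are invented for illustration and are not part of the tutorial's expert system. The sketch assumes there are many employee facts but only a few on-call facts, that the engine joins patterns in the order they are written (as RETE-based engines typically do), and that CLIPS-style ; comments are accepted.

```clips
; Hypothetical facts: many (employee ?id) facts, few (on-call ?id) facts.

; Common pattern first: every employee fact becomes a partial match
; that must then be joined against the on-call facts.
(defrule page-on-call-slow
  (employee ?id)
  (on-call ?id)
  =>
  (assert (page ?id)))

; Rare pattern first: only the few on-call facts become partial matches.
(defrule page-on-call-fast
  (on-call ?id)
  (employee ?id)
  =>
  (assert (page ?id)))
```

Both rules are activated by the same combinations of facts; only the number of partial matches the engine has to carry differs. As with everything else in this article, measure before assuming the reordering pays off.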

Reduce Number of Function Calls

Hmmm, now what? Let's start trying to reduce the number of function calls we make. It seems logical that if we make fewer function calls, we should see some improvement in execution time. Patterns and function calls do not work the same way in the IECS: function calls require overhead to set up and execute, while patterns are more "native" to the engine, if you will. All of our rules above have a TEST pattern which ANDs two other tests together. But the patterns of a rule are implicitly ANDed by their nature, so perhaps we can break up that TEST pattern once more into 2 TEST patterns ("Simple ES, Share Patterns (3).ie"). This will get rid of one AND function call. Let's try it... Hey! This actually worked! Changing all the rules and executing, we see a speed boost of 37% over the original expert system!
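A rule in this version would presumably look like the sketch below. It is a reconstruction from the description above, not a copy of the rule in "Simple ES, Share Patterns (3).ie"; it simply splits the AND call from the previous version into two separate TEST patterns.

```clips
(defrule match-and-add-01
  "match numbers and add 1"
  ?f1 <- (num ?n)
  (test (numberp ?n))
  (test (>= ?n 0))   ; was (test (and (>= ?n 0) (< ?n 100)))
  (test (< ?n 100))
  =>
  (modify ?f1 (__Data (+ ?n 1))))
```

The numberp TEST pattern stays first, so the range tests never see a value that is not a number.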

Use the structure of your expert system to reduce function calls and other overhead

Now, let's go back to the drawing board and think "big picture" a little. Every rule in our expert system checks that the value in the "number" fact is actually a number. We are writing the expert system, so we could just assume that this will always be a number and get rid of the (numberp ?n) TEST pattern. However, this could be dangerous in the future if we modify our expert system and start overloading the number fact. Or, we may add in something to query the user for the original number. What if the user doesn't put in a number? The number fact won't detect it. So, let's be safe. Let's rethink the structure of our expert system to make it safer and faster.

Instead of using an ordered fact with one multislot, let's define an unordered fact with one INTEGER slot. This slot will automatically enforce type safety, allow us to get rid of our numberp function calls in every rule, and avoid the overhead of a multislot compared to a slot. Our new fact template for num will look like:

(deftemplate num (slot n0 (type INTEGER)))

The rules will now reference the n0 slot and we get rid of the numberp TEST pattern:

(defrule match-and-add-01
  "match numbers and add 1"
  ?f1 <- (num (n0 ?n))
  (test (and (>= ?n 0) (< ?n 100)))
  =>
  (modify ?f1 (n0 (+ ?n 1)))
)

Let's change the rest of the rules ("Simple ES, Slots and TEST Patterns.ie"), load the expert system, and test it. Wow! 69% faster than the original! We have come far from our original expert system.

Don't carry around unused information

Right now, we need to pause to discuss something that does not apply to our current expert system but can easily occur in yours. Many expert systems will have facts with a massive number of slots. For example, a personnel expert system may have slots for ID, First Name, Middle Name, Last Name, Address, Phone Number, Fax Number, Cell Number, Salary, Tips, Mother's Maiden Name, et cetera, et cetera, et cetera. On the surface it makes sense: keep all the data together in one place.

However, these same expert systems often have rules that deal with only a few of the slots of these massive facts. Even though they are not used, the inference engine needs to carry around and duplicate these large facts. Not only do they consume a lot of memory, they also consume a lot of processor time. If you are concerned about memory or speed, don't carry around unused slots! Only keep the slots that you use and that help make the fact unique (such as ID). If you absolutely must have access to that information (perhaps for one rule invocation at the end which summarizes the expert system results), write a user function that accepts the ID and returns this extra information.
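As a rough sketch of this advice, suppose the rules only ever look at an employee's salary and tips. The template, slot, and rule names below are hypothetical, chosen to echo the personnel example; everything else about the employee is assumed to stay outside the fact, reachable through a user function keyed on the ID.

```clips
; Hypothetical slimmed-down template: only the key and the slots the rules use.
(deftemplate employee
  (slot id (type INTEGER))
  (slot salary (type INTEGER))
  (slot tips (type INTEGER)))

; A rule that needs nothing beyond these three slots.
(defrule flag-high-tips
  (employee (id ?id) (salary ?s) (tips ?t&:(> ?t ?s)))
  =>
  (assert (high-tips ?id)))
```

Names, addresses, and phone numbers never enter the engine, so they are never copied when a fact is modified.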

For example, we took the above expert system, "Simple ES, Slots and TEST Patterns.ie" and just added 30 more slots to the fact template. If you run this expert system, it is almost TWICE as slow, just from adding those slots!

Reduce Number of Function Calls, Redux

Now, back to our regularly scheduled program. If you look closely at our expert system so far, you may be able to spot one more optimization we can make to reduce the number of function calls. Basically, every rule (except the last one) in our expert system checks to see if the number fact is between 2 other numbers. We could reduce the number of function calls for most rules from 2 to 1 if we could just find or write a "between" function. Writing such a function would be trivial; see newsletter 8 for details on writing a function (http://riversoftavg.com/NewsLetters/newsletter2002AprMayJun.htm).

However, in our case, we don't even need to write a function. It turns out the "<" function accepts more than 2 arguments, so it can be used to check the ordering of any number of values. It returns TRUE if "for all arguments, argument n is less than argument n + 1." Ah ha! Let's rewrite rules 2 through 9 to take advantage of this:

(defrule match-and-add-02
  "match numbers and add 1"
  ?f1 <- (num (n0 ?n))
  (test (< 99 ?n 200))
  =>
  (modify ?f1 (n0 (+ ?n 1)))
)

The file "Simple ES, Reduced Funcalls (Slot).ie" contains these changes. Let's load it and run it. Cool! Almost TWICE as fast as the original: a 90% boost.

Try Control Facts

We are going to try one more optimization technique in this article: control facts (see newsletter 4 for details, http://riversoftavg.com/NewsLetters/newsletter2001AprMay.htm#Tip:%20How%20to%20partition%20your%20rules). The idea behind control facts is that these facts control which rules get activated based on the value in the control fact. The traditional idea is to have a "phase" control fact, which could have values of initialization, decision, action, etc. Rules are written for a particular phase by putting the control fact in as a rule pattern:

(defrule control-fact-rule
  (phase initialization)
  (other-patterns)
  =>
)

Not only are control facts good for partitioning your expert system, they can also help the inference engine do less work. If a specific control fact is not present, rules that need that control fact will not evaluate any more patterns. The rule above would not evaluate (other-patterns) until the (phase initialization) fact is present.
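One common way to move from one phase to the next is a rule that retracts the current control fact and asserts the next one. The sketch below is illustrative only: the rule name is invented, and it assumes that the standard CLIPS retract action and salience declaration are available in your engine.

```clips
; Hypothetical rule: once no other initialization rule is waiting to fire,
; switch the expert system to the decision phase.
(defrule end-initialization-phase
  (declare (salience -10))   ; lower priority than the phase's own rules
  ?p <- (phase initialization)
  =>
  (retract ?p)
  (assert (phase decision)))
```

The negative salience is there so that this rule fires only after the ordinary initialization rules (which have the default salience of 0) have had their turn.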

Our example expert system is not a good candidate for control facts, but let's try it and see. We will define an is-high-number fact template. This fact will only be present when the number fact is 500 or higher. Then, we can rewrite the rules based on whether they are looking for a high number:

(defrule match-and-add-01
  "match numbers and add 1"
  (not (is-high-number))
  ?f1 <- (num (n0 ?n))
  (test (and (>= ?n 0) (< ?n 100)))
  =>
  (modify ?f1 (n0 (+ ?n 1)))
)

...

(defrule match-and-add-06
  "match numbers and add 1"
  (is-high-number)
  ?f1 <- (num (n0 ?n))
  (test (< 499 ?n 600))
  =>
  (modify ?f1 (n0 (+ ?n 1)))
)

Now, half of the rules won't even be evaluated every time the number fact changes because they have been "disabled" by the control fact. In theory, this will reduce the number of pattern match checks and function calls. In practice, we need to see if the overhead for the control fact outweighs the benefits.

We need to write one more rule which will assert the control fact once the number gets high enough:

(defrule check-if-high-number
  (not (is-high-number))
  (num (n0 ?n&:(>= ?n 500)))
  =>
  (assert (is-high-number))
)

Ok, let's finish rewriting the rules ("Simple ES, Reduced Funcalls (slot) Control Facts.ie"), load the expert system, and run. Darn, 83% improvement, less than the previous expert system. In this case, control facts don't help us. Interestingly though, even though we have one extra rule and one extra fact, we didn't lose much speed. It is easy to see that in other cases where there are perhaps more rules, more patterns, or more expensive functions, the control facts optimization would help us.

One more point about control facts: the control fact should usually be the first pattern in a rule. Otherwise, the other patterns are evaluated before the control fact ever gets a chance to help us. Just rewriting the rules to put the control fact last ("Simple ES, Reduced Funcalls (slot) Mistake Control Facts.ie") makes the expert system 10% slower than when the control facts are at the beginning.
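For reference, the slower ordering would look something like the rule below. It is a sketch based on the description above, not a copy of the rule in the "Mistake Control Facts" file.

```clips
(defrule match-and-add-06
  "match numbers and add 1"
  ?f1 <- (num (n0 ?n))
  (test (< 499 ?n 600))
  (is-high-number)   ; control fact last: the patterns above are evaluated first
  =>
  (modify ?f1 (n0 (+ ?n 1))))
```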

Conclusion

Ok, that's it. With a few judicious optimizations, we have gotten almost twice as fast as the original expert system. If we collapse all the rules into one rule, which works for us in this case, we get about 2.5 times the speed! The chart below shows our results.

Well, I hope this short article has been educational and sparked some ideas for how you may be able to increase the speed of your own expert systems. Next time, we will look at how to optimize fuzzy expert systems. Happy Writing!

Send mail to webmasterNO@SPAMRiverSoftAVG.com with questions or comments about this web site. Copyright © 2002-2010 RiverSoftAVG Last modified: September 20, 2010
