RiverSoftAVG Newsletter #12
Apr/May/Jun 2003
Hello and welcome to Newsletter #12! We don't know about
the third time, but the twelfth time is definitely not the charm because we
are late, late, late! We apologize for our tardiness. We have been working
hard on getting the Object Inspector beta started and it has been a lot more
work than expected (isn't it always?). We do have a good newsletter for you
with lots of news and a great article on optimizing your expert systems.
A reminder, this newsletter and all previous ones are available for download from the web site at http://www.riversoftavg.com/articles_&_tips.htm
Contents
Article: Optimizing Your Expert Systems
News: RiverSoftAVG Mailing List Location Has Changed
News: Beta Test for Object Inspector Component Suite Has Started
Tip: How to have history data for your facts in the IECS
Download: IECS Updated to v2.02
Download: HelpScribble Apprentice Updated
Note: You received this newsletter because you are an
owner of a RiverSoftAVG product. If you received this newsletter by mistake
or if for any reason you wish to not receive any future mailings from
RiverSoftAVG, just reply to this message informing us so. We apologize for
any intrusion.
Article: Optimizing Your Expert Systems
Writing expert systems is both a science and an art. Writing efficient
expert systems is a science, art, and voodoo :-) Seriously, it can be
difficult to write expert systems. Even when they work, it is amazing how
much faster they can be made with a little thought and experimentation. In
the Tips page of the IE Help File, we discussed some methods for making your
expert systems more efficient. In this article, we are going to walk you
through decisions and considerations we make to increase the speed of your
expert system. Hopefully, this article will show you tips for optimizing
your own expert systems.
Note that this entire article deals with optimizing your expert systems from
within the expert system language; none of it tries to optimize the engine
underneath. As such, even if you don't own the IECS, you can follow along
in this article by downloading the console demo at http://www.riversoftavg.com/Files/IEConsolev20.zip
The expert system we developed and refined can be found at
www.RiverSoftAVG.com/Files/OptimizationTutorial.zip
For this article, we created a small expert system. The entire purpose of
this expert system is to take a "number" fact and keep incrementing it by
1. The expert system contains 10 rules, which share incrementing the fact
based on the current number of the fact. In other words, rule 1 is
activated when the "number" fact value is between 0 and 99 and its actions
increment the number by 1, rule 2 is activated when the "number" fact value
is between 100 and 199, etc. This is obviously a contrived example, but it
gives us a couple of important properties for our purposes:
The basic expert system is in "Simple ES, Original.IE". All the rules are
the same: they match on the number fact and have constraints to check that
the fact is actually a number and in the range they want. Here is the first
rule:
(defrule match-and-add-01 "match numbers and add 1"
    ?f1 <- (num ?n&:(numberp ?n)&:(>= ?n 0)&:(< ?n 100))
    =>
    (modify ?f1 (__Data (+ ?n 1) ))
)
Starting the Advanced Console Demo, we can load this file and run it. Be
careful: this expert system executes indefinitely. To bound our
execution, we will type (run 2000) into the console so that only 2000
rules are executed. After a few moments, the inference engine finishes and
we get the stats. The number fact is now at 2000. In our case, we executed
2000 rules in 1.996 seconds. We reloaded the expert system and executed it
again 3 times to get the average number above. Ok, this is the baseline
we have got to beat.
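To keep the later comparisons straight, here is a small Python sketch for turning timings into the kind of figures quoted in this article. It is not part of the IECS. The function names are made up for illustration, and the definition of "boost" (how much more work gets done in the same time) is our assumption about how such percentages are usually computed.

```python
def rules_per_second(rules_fired: int, seconds: float) -> float:
    """Throughput of one run: rules executed per second."""
    return rules_fired / seconds


def percent_boost(baseline_seconds: float, new_seconds: float) -> float:
    """Speed gain over the baseline, as a percentage (assumed definition).

    A run that takes half the baseline time is a 100% boost (twice as fast).
    """
    return (baseline_seconds / new_seconds - 1.0) * 100.0


# The baseline from the article: 2000 rules in 1.996 seconds.
baseline_rate = rules_per_second(2000, 1.996)
print(f"{baseline_rate:.0f} rules per second")
```

Only the baseline timing comes from the article. Any other timing you feed into percent_boost would be your own measurement.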
Do big optimizations first
Optimizing an expert system is a lot like optimizing code. Don't waste
your time optimizing small parts of the expert system for small gains when
you should be looking for the big gains. In practice, this means looking
for ways to reduce the number of rules, the number of patterns, the number
of fact templates, the number of facts, etc. It is the expert system
equivalent of optimizing by getting rid of routines entirely rather than
tuning the code within them, or of switching to a better data structure
(such as a hash table instead of a list) to speed up retrieving values.
For our example, we could obviously discard all the rules and constraints
and just write one rule that matches a number fact and increments it by 1:
(defrule match-and-add-01 "match numbers and add 1"
    ?f1 <- (num ?n)
    =>
    (modify ?f1 (__Data (+ ?n 1) ))
)
If we run this, the expert system is over 2.5 times as fast! Even if we
modify the actions part of the rule and put in a giant IF statement (one
for each of the previous rules), the expert system is still 2.4 times as
fast. However, that would ruin all the fun (and make a short article).
And in practice, there are many occasions where it is not possible to
optimize the "big picture". So we are going to cheerfully throw the above
optimization away and try to optimize the "little things." :-)
Investigate Constraints, Patterns, and Sharing
The first thing we need to look at is constraints versus function calls.
Depending on the expert system engine and real world circumstances, it can
be faster to use connective constraints (the & (and) symbol and the like)
or to use a function. If this expert system were matching discrete values,
e.g., (password ?p), in some cases it would be faster to use constraints (
(password ?p&password1|password2) ) and in others functions ( (password
?p&:(or (eq password1 ?p) (eq password2 ?p))) ). For OR constraints, the
IECS actually converts the first pattern above into the second pattern, so
we will see no difference for that case. However, in our example, we
actually specify AND constraints, e.g., the number must be a number AND
greater than or equal to 0 AND less than 100. The IECS leaves the
separate AND constraints as is. Let's try converting the AND constraints
to one AND function call which calls the other functions:
(defrule match-and-add-01 "match numbers and add 1"
    ?f1 <- (num ?n&:(and (numberp ?n) (>= ?n 0) (< ?n 100)))
    =>
    (modify ?f1 (__Data (+ ?n 1) ))
)
If we change all the rules to the sample above ("Simple ES, one AND
Constraint.ie"), load it, and run 2000, amazingly enough we get a
performance boost of 15%! Not bad.
Well, let's keep going. So far, we have 10 rules, all with unique rule
patterns. The IECS uses a RETE network for optimizing rules by sharing
common patterns, so a shared pattern only has to be evaluated once no
matter how many rules use it. Let's try taking the constraints out of the
pattern entirely and putting them in a TEST pattern. This way, every rule
can share the (num ?n) pattern. Our rules would now look similar to the one
below:
(defrule match-and-add-01 "match numbers and add 1"
    ?f1 <- (num ?n)
    (test (and (numberp ?n) (>= ?n 0) (< ?n 100)))
    =>
    (modify ?f1 (__Data (+ ?n 1) ))
)
Let's change rules, lock and load :-) This time, we got a speed boost of
30% over the original expert system. Now we are rolling!
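To make the idea of pattern sharing a little more concrete, here is a toy Python sketch. It is not how the IECS or a real RETE network is implemented. It only counts how often the fact-name check runs when each of our 10 rules owns its pattern versus when all rules share one pattern and keep only their range test. All names are made up for illustration.

```python
from typing import List, Tuple

Fact = Tuple[str, int]          # e.g. ("num", 250)
Range = Tuple[int, int]         # inclusive low, exclusive high


def run_unshared(fact: Fact, ranges: List[Range]) -> Tuple[List[int], int]:
    """Every rule checks the fact name itself: one pattern check per rule."""
    pattern_checks = 0
    activated = []
    for index, (low, high) in enumerate(ranges):
        pattern_checks += 1
        if fact[0] == "num" and low <= fact[1] < high:
            activated.append(index)
    return activated, pattern_checks


def run_shared(fact: Fact, ranges: List[Range]) -> Tuple[List[int], int]:
    """The fact name is checked once; rules only run their own range test."""
    pattern_checks = 1
    if fact[0] != "num":
        return [], pattern_checks
    activated = [i for i, (low, high) in enumerate(ranges)
                 if low <= fact[1] < high]
    return activated, pattern_checks


# Ten rules covering 0-99, 100-199, ..., 900-999.
ten_ranges = [(i * 100, (i + 1) * 100) for i in range(10)]
print(run_unshared(("num", 250), ten_ranges))
print(run_shared(("num", 250), ten_ranges))
```

Both versions activate the same rule. The shared version simply does the common work once, which is the effect we are hoping the engine gives us.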
Ok, sharing patterns worked in that case. Maybe we can share more tests:
all the rules have the test (numberp ?n). Let's break that test out into
its own TEST pattern and try it. Here is what a rule would look like
("Simple ES, Share Patterns (2).ie"):
(defrule match-and-add-01 "match numbers and add 1"
    ?f1 <- (num ?n)
    (test (numberp ?n))
    (test (and (>= ?n 0) (< ?n 100)))
    =>
    (modify ?f1 (__Data (+ ?n 1) ))
)
When we try this expert system, we only get a 30% boost again. Darn! More
than likely that is a product of the simplicity of the shared TEST
pattern. It is hard to get a faster function call than just checking if
something's a number!
One more optimization to be aware of: the order of patterns can make a
big difference. In our example, just reordering the TEST patterns so that
the numberp TEST pattern comes last can make a big difference. Of course,
such an expert system would not be correct, because there would be
exceptions if the value in the number fact was not a number. But in your
own expert systems, think about the ordering of your rule patterns.
Reducing the number of partial matches will really help with efficiency.
For our example, we don't see any ordering changes we can make safely, so
let's go on.
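Since our example has no safe reordering to show, here is a small Python sketch of why pattern order matters for partial matches. It is a toy left-to-right join, not the IECS matching algorithm, and the "order" and "vip" facts are invented for illustration.

```python
from typing import Callable, List, Sequence, Tuple

Fact = Tuple                      # ("order", order_id, customer_id) or ("vip", customer_id)
Pattern = Callable[[Tuple[Fact, ...], Fact], bool]


def is_order(partial: Tuple[Fact, ...], fact: Fact) -> bool:
    """Match an order; if a vip is already bound, the customer ids must agree."""
    if fact[0] != "order":
        return False
    return all(fact[2] == f[1] for f in partial if f[0] == "vip")


def is_vip(partial: Tuple[Fact, ...], fact: Fact) -> bool:
    """Match a vip; if an order is already bound, the customer ids must agree."""
    if fact[0] != "vip":
        return False
    return all(f[2] == fact[1] for f in partial if f[0] == "order")


def count_partial_matches(facts: Sequence[Fact],
                          patterns: Sequence[Pattern]) -> List[int]:
    """Return how many partial matches exist after each pattern, left to right."""
    partials: List[Tuple[Fact, ...]] = [()]
    counts = []
    for pattern in patterns:
        partials = [p + (f,) for p in partials for f in facts if pattern(p, f)]
        counts.append(len(partials))
    return counts


# 200 orders spread over 50 customers, and a single vip customer.
facts = [("order", i, i % 50) for i in range(200)] + [("vip", 7)]
print(count_partial_matches(facts, [is_order, is_vip]))   # broad pattern first
print(count_partial_matches(facts, [is_vip, is_order]))   # selective pattern first
```

Both orderings end with the same complete matches, but putting the selective pattern first keeps far fewer partial matches around along the way.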
Reduce Number of Function Calls
Hmmm, now what? Let's start trying to reduce the number of function calls
we make. It seems logical to think that if we can reduce the number of
function calls, we should see some improvements in execution time. First,
patterns and function calls do not work the same in the IECS. Function
calls require overhead in setting up and executing. Patterns are more
"native", if you will, to the engine. All of our rules above have a TEST
pattern which ANDs two other tests together. By their nature, rule
patterns are implicit ANDs. Perhaps we can break up the TEST pattern
once more into 2 TEST patterns ("Simple ES, Share Patterns (3).ie"). This
will get rid of one AND function call. Let's try it... Hey! This
actually worked! Changing all the rules and executing, we see a speed
boost of 37% over the original expert system!
Use the structure of your expert system to reduce function calls and other overhead
Now, let's go back to the drawing board a little bit and think "big
picture" a little. Every rule in our expert system checks to ensure that
the value in the "number" fact is in actuality a number. We are writing
the expert system so we could just assume we know that this will always be
a number and just get rid of the (numberp ?n) TEST pattern. However, this
could be dangerous in the future if we modify our expert system and start
overloading the number fact. Or, we may add in something to query the
user for the original number. What if the user doesn't put in a number?
The number fact won't detect it. So, let's be safe. Let's rethink the
structure of our expert system to make it safer and faster.
Instead of using an ordered fact with one multislot, let's define an
unordered fact with one INTEGER slot. This slot will automatically
enforce type safety, allow us to get rid of our numberp function calls in
every rule, and avoid the overhead of a multislot compared to a slot. Our
new fact template for num will look like:
(deftemplate num (slot n0 (type INTEGER)))
The rules will now reference the n0 slot and we get rid of the numberp TEST pattern:
(defrule match-and-add-01 "match numbers and add 1"
    ?f1 <- (num (n0 ?n))
    (test (and (>= ?n 0) (< ?n 100)))
    =>
    (modify ?f1 (n0 (+ ?n 1) ))
)
Let's change the rest of the rules ("Simple ES, Slots and TEST Patterns.ie"),
load the expert system, and test it. Wow! 69% faster than the original!
We have come far from our original expert system.
Don't carry around unused information
Right now, we need to pause to discuss something that does not apply to
our current expert system but can easily occur in yours. Many expert
systems will have facts with a massive number of slots. For example, a
personnel expert system may have slots for ID, First Name, Middle Name,
Last Name, Address, Phone Number, Fax Number, Cell Number, salary, tips,
Mother's Maiden Name, et cetera, et cetera, et cetera. On the surface it
makes sense. Keep all the data together in one place.
However, these same expert systems often have rules that deal with only a
few of the slots of these massive facts. Even though they are not used,
the inference engine needs to carry around and duplicate these large
facts. Not only do they consume a lot of memory, they also consume a lot
of processor time. If you are concerned about memory or speed, don't
carry around unused slots! Only keep the slots that you use and that help
make the fact unique (such as ID). If you absolutely must have access to
that information (perhaps for one rule invocation at the end which
summarizes the expert system results), write a user function that accepts
the ID and returns this extra information.
For example, we took the above expert system, "Simple ES, Slots and TEST
Patterns.ie" and just added 30 more slots to the fact template. If you
run this expert system, it is almost TWICE as slow, just from adding those
slots!
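The "lean fact plus lookup function" idea can be sketched in Python as follows. This only illustrates the shape of the design. In the IECS you would write a user function instead, and the names PersonFact, DETAILS_BY_ID and get_details are all hypothetical.

```python
from typing import Dict, NamedTuple


class PersonFact(NamedTuple):
    """The lean fact: only the ID plus the slot the rules actually match on."""
    person_id: int
    salary: float


# Everything else lives outside the fact list, keyed by ID (made-up sample data).
DETAILS_BY_ID: Dict[int, Dict[str, str]] = {
    1: {"first_name": "Ann", "last_name": "Example", "phone": "555-0100"},
}


def get_details(person_id: int) -> Dict[str, str]:
    """Stand-in for a user function that fetches the rarely used slots on demand."""
    return DETAILS_BY_ID[person_id]


fact = PersonFact(person_id=1, salary=50000.0)
print(get_details(fact.person_id)["last_name"])
```

The engine only ever copies the two-slot fact around. The bulky details are fetched by ID in the one place that needs them.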
Reduce Number of Function Calls, Redux
Now, back to our originally scheduled program. If you look closely at our
expert system so far, you may be able to spot one more optimization we can
do to reduce the number of function calls. Basically, every rule (except
the last one) in our expert system always checks to see if the number fact
is between 2 other numbers. We could reduce the number of function calls
for most rules from 2 to 1 if we could just find or write a "between"
function. Writing such a function would be trivial; see newsletter 8 for
details on writing a function (http://riversoftavg.com/NewsLetters/newsletter2002AprMayJun.htm).
However, in our case, we don't even need to write a function. It turns
out the "<" function accepts more than 2 arguments. It can be used to
check the ordering of any number of values. It returns TRUE
if "for all arguments, argument n is less than argument n + 1." Ah ha!
Let's rewrite rules 2 through 9 to take advantage of this:
(defrule match-and-add-02 "match numbers and add 1"
    ?f1 <- (num (n0 ?n))
    (test (< 99 ?n 200))
    =>
    (modify ?f1 (n0 (+ ?n 1) ))
)
The file, "Simple ES, Reduced Funcalls (Slot).ie", contains these changes.
Let's load it and run it. Cool! Almost TWICE as fast as the original
(90% faster).
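The rule quoted above for the multi-argument "<" ("for all arguments, argument n is less than argument n + 1") can be written out as a short Python sketch. The function name strictly_increasing is ours, not an IECS name.

```python
def strictly_increasing(*args: float) -> bool:
    """TRUE if every argument is less than the argument that follows it."""
    return all(a < b for a, b in zip(args, args[1:]))


n = 150
# One call replaces the pair of tests (>= n 100) and (< n 200) for integers.
print(strictly_increasing(99, n, 200))
```

Note that the lower bound becomes 99 rather than 100 because the comparison is strict. That trick is only safe because the slot holds integers.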
Try Control Facts
We are going to try one more optimization technique in this article: control facts (see newsletter 4 for details, http://riversoftavg.com/NewsLetters/newsletter2001AprMay.htm#Tip:%20How%20to%20partition%20your%20rules). The idea behind control facts is that these facts control which rules get activated based on the value in the control fact. The traditional idea is to have a "phase" control fact, which could have values of initialization, decision, action, etc. Rules are written for a particular phase by putting the control fact in as a rule pattern:
(defrule control-fact-rule
    (phase initialization)
    (other-patterns)
    =>
)
Not only are control facts good for partitioning your expert system, but they can also help the inference engine do less work. If a specific control fact is not present, rules that need that control fact will not evaluate any more patterns. The rule above would not evaluate (other-patterns) until the (phase initialization) fact is present.
Our example expert system is not a good candidate for control facts, but let's try it and see. We will define an is-high-number fact template. This fact will only be present when the number fact is 500 or above. Then, we can rewrite the rules based on whether they are looking for a high number:
(defrule match-and-add-01 "match numbers and add 1"
    (not (is-high-number))
    ?f1 <- (num (n0 ?n))
    (test (and (>= ?n 0) (< ?n 100)))
    =>
    (modify ?f1 (n0 (+ ?n 1) ))
)
...
(defrule match-and-add-06 "match numbers and add 1"
    (is-high-number)
    ?f1 <- (num (n0 ?n))
    (test (< 499 ?n 600))
    =>
    (modify ?f1 (n0 (+ ?n 1) ))
)
Now, half of the rules won't even be evaluated every time the number fact changes because they have been "disabled" by the control fact. In theory, this will reduce the number of pattern match checks and function calls. In practice, we need to see if the overhead for the control fact outweighs the benefits.
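The "in theory" half of that argument can be sketched in a few lines of Python. This toy model only counts range tests. It deliberately ignores the cost of asserting and matching the control fact itself, which is exactly the overhead we still have to measure. The function name and the 500 threshold handling are illustrative assumptions, not IECS code.

```python
from typing import List, Tuple


def count_range_tests(n: int, ranges: List[Tuple[int, int]],
                      use_control_fact: bool) -> int:
    """Count how many rules run their range test for the current number."""
    is_high_number = n >= 500          # stands in for the control fact
    calls = 0
    for low, high in ranges:
        if use_control_fact:
            rule_wants_high = low >= 500
            if rule_wants_high != is_high_number:
                continue               # rule is "disabled": test never runs
        calls += 1
        _ = low <= n < high            # the range test itself
    return calls


ten_ranges = [(i * 100, (i + 1) * 100) for i in range(10)]
print(count_range_tests(250, ten_ranges, use_control_fact=False))
print(count_range_tests(250, ten_ranges, use_control_fact=True))
```

With ten evenly split rules, the control fact halves the number of range tests per change. Whether that wins overall depends on how expensive those tests are compared to the extra fact and pattern.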
We need to write one more rule which will assert the control fact once the number gets high enough:
(defrule check-if-high-number
    (not (is-high-number))
    (num (n0 ?n&:(>= ?n 500)))
    =>
    (assert (is-high-number))
)
Ok, let's finish rewriting the rules ("Simple ES, Reduced Funcalls (slot) Control Facts.ie"), load the expert system, and run. Darn, an 83% improvement, less than the previous expert system. In this case, control facts don't help us. Interestingly though, even though we have one extra rule and one extra fact, we didn't lose much speed. It is easy to see that in other cases where there are perhaps more rules, more patterns, or more expensive functions, the control facts optimization would help us.
One more point about control facts: the control fact should usually be the first pattern in a rule. Otherwise, the other patterns would be evaluated first before the control fact could ever help us. Just rewriting the rules and putting the control fact last ("Simple ES, Reduced Funcalls (slot) Mistake Control Facts.ie") makes the expert system 10% slower than when the control facts are at the beginning.
Conclusion
Ok, that's it. With a few judicious optimizations, we have gotten almost twice as fast as the original expert system. If we collapse all the rules into one rule, which works for us in this case, we get almost 2.5 times the speed! The chart below shows our results.
Well, I hope this short article has been educational and sparked some ideas
for how you may be able to increase the speed of your own expert systems.
Next time, we will look at how to optimize fuzzy expert systems. Happy
Writing!
News: RiverSoftAVG Mailing List Location Has Changed
Our hosting service has added mailing list support so we have moved our
mailing list. The mailing list is there to help users of RiverSoftAVG
products ask questions of each other and of us about our products, such as
how to use a product, how to install it, etc. Interested users can go to
http://riversoftavg.com/mailman/listinfo/support-list_riversoftavg.com to
sign up.
News: Object Inspector Component Suite Beta Has Started
It took a while, but the Object Inspector Component Suite beta testing has
started. The beta period officially started July 9th. We are very proud of
the Object Inspector and think it has some amazing features that make it stand
out from the crowd. If you want to help, it is not too late. Go to
http://www.riversoftavg.com/beta_signup.htm to sign up.
Our Object Inspector has the following features:
You can get more information at
http://www.riversoftavg.com/object_inspector.htm
Tip: How to have history data for your facts in the IECS
The Inference Engine Component Suite does not
retain previous versions, or history data, for the facts in its fact list.
When a fact is retracted or modified, the old fact is gone. However,
occasionally, you may need to keep a history of certain facts in your expert
system. For example, you may want to track the last 10 transactions for a
customer. This tip tells you how to add support for retaining facts and
values in the inference engine.
Note that this tip only works for one
execution of your program. To enable persistence, you would need to stream
out the history facts.
For our example, we are going to save facts of a
certain type, all facts from the "save-facts" fact template. You obviously
could expand this concept to save any number or type of facts. First, we need
to store the old facts someplace. We are going to create a HistoryFacts
variable which stores IFact interfaces (the interface used by the inference
engine for manipulating facts). So let's declare our variable and allocate it
(note that we don't have to worry about freeing the facts within the variable
since they are reference counted interfaces):
var
  HistoryFacts: TFacts; // HistoryList

procedure TForm1.FormCreate(Sender: TObject);
begin
  HistoryFacts := TFacts.Create(nil);
end;
Now, we need to actually save facts as they are created. We will hook into the OnAssertion event of the TInferenceEngine. The OnAssertion event is called every time a fact is asserted into the fact list. So, in this event, we just need to watch for "save-facts" and store them in our variable:
procedure TForm1.InferenceEngine1Assertion(Sender: TObject; const Fact: IFact);
begin
  if Fact.Template.TemplateName = 'save-facts' then
  begin
    // Save a history of our facts
    HistoryFacts.Add(Fact.Clone() as IFact);
    // keep our queue size to 100 or less
    if HistoryFacts.Count > 100 then
      HistoryFacts.Delete(0);
  end;
end;
Voila, we are saving a history of our save-facts. However, this is not very useful unless we can access them somehow. We need to write a TUserFunction which will access our history list. For our example, we are going to write a user function that returns TRUE if the history list contains 10 or more facts. To create a user function, drop a TUserFunction on your form and set its properties:
object UserFunction1: TUserFunction
  ArgumentString = '<factid-expression>'
  Comment = 'Give us history for a fact'
  FunctionName = 'get-history'
  MaxArgumentCount = 1
  MinArgumentCount = 1
  Engine = InferenceEngine1
  OnCall = UserFunction1Call
  Left = 128
  Top = 8
end
In its OnCall event, use the history variable:
procedure TForm1.UserFunction1Call(Sender: TObject; FunCall: IFunCall;
  Context: TIEContext; var Result: IIEValue);
var
  Fact: IFact;
begin
  // our function will return TRUE if there are 10 or more facts in the history
  // list.
  // get the first parameter which will be the fact id we want history of
  // we always resolve the argument in case it is a variable name
  Fact := InferenceEngine1.Facts.FactById[ FunCall.Argument[0].Resolve(Context).AsFactId ];
  if (Fact.Template.TemplateName <> 'save-facts') or (HistoryFacts.Count < 10) then
    result := FCFalse
  else
    result := FCTrue;
end;
To test our history variable and its function, we will write a rule that uses the function.
This simple rule will only activate when get-history reports that there are 10 or more facts in the history list:
(defrule check-history "check if we have saved 10 facts or more"
    ?id <- (save-facts ?$)
    (test (get-history ?id))
    =>
    (printout t "Our history is finally 10 or more")
)
Now, in our expert system, the preceding rule will activate when 10 or more save-facts are asserted:
(assert (save-facts Tom Grubb))
<Fact-2>
(assert (save-facts John Smith))
<Fact-3>
(assert (save-facts 3))
<Fact-4>
(assert (save-facts 4))
<Fact-5>
(assert (save-facts 5))
<Fact-6>
(assert (save-facts 6))
<Fact-7>
(assert (save-facts 7))
<Fact-8>
(assert (save-facts 8))
<Fact-9>
(assert (save-facts 9))
<Fact-10>
(agenda)
for a total of 0 activations.
At this point,
we have asserted 9 facts into the history list, so the check-history rule has
still not activated. Let's assert one more and we see that check-history has
now activated.
(assert (save-facts 10))
<Fact-11>
(agenda)
0 check-history : f-11 (1.0000)
for a total of 1 activations.
That's it! We are done. This was a simple
example, but it should illustrate how to create history lists in your own
expert systems. Good luck!
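If you want to prototype the same bookkeeping outside Delphi, the history logic of this tip can be sketched in Python. This is only an analogue of the idea (save copies of matching facts, cap the list at 100, answer "do we have at least 10?"). The FactHistory class and its method names are hypothetical and have nothing to do with the IECS API.

```python
from collections import deque
from typing import Deque, Sequence, Tuple


class FactHistory:
    """Bounded history of asserted facts for one template name."""

    def __init__(self, template_name: str, max_size: int = 100) -> None:
        self.template_name = template_name
        # A deque with maxlen drops the oldest entry automatically when full.
        self._facts: Deque[Tuple] = deque(maxlen=max_size)

    def on_assert(self, template_name: str, values: Sequence) -> None:
        """Call this for every asserted fact; only matching templates are saved."""
        if template_name == self.template_name:
            self._facts.append(tuple(values))   # store a copy, not the original

    def has_at_least(self, count: int) -> bool:
        """Plays the role of the get-history test in the rule above."""
        return len(self._facts) >= count

    def __len__(self) -> int:
        return len(self._facts)


history = FactHistory("save-facts")
for value in range(9):
    history.on_assert("save-facts", [value])
print(history.has_at_least(10))
history.on_assert("save-facts", [10])
print(history.has_at_least(10))
```

Using a deque with maxlen replaces the explicit "delete the first entry when the count passes 100" step from the Delphi event handler.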
Download: IECS Updated to v2.02
The Inference Engine Component Suite has gotten a small update that
consolidates a couple of bug fixes. Registered users can
go to our new support page and download it. The update links are password
protected so registered users will need to request a password from
support@RiverSoftAVG.com
Please use the email address you used to register the IECS and include your
order number.
Here is a list of the fixes/changes:
of them (Thanks E.P.)
Download: HelpScribble Apprentice Updated
HelpScribble Apprentice, our freeware utility application for HelpScribble
users, has been updated since our last newsletter. HelpScribble
Apprentice has been updated to read HelpScribble v6.x and v7.x files.
Here is the list of changes for this release:
HelpScribble Apprentice is provided as a free utility. No warranties or
guarantees are implied. Use at your own risk. REQUIRES HelpScribble for
generating help files. HelpScribble Apprentice may be downloaded at
http://www.riversoftavg.com/downloads.htm#Free%20Programs