Extract from "A Comparison of Similarity Measures for Case-Based Reasoning" by R.H. Tenback

Introduction

Since the introduction of the first knowledge-based system in the 1970s, interest in these systems has increased rapidly. For a long time, research focused on rule-based expert systems, which use production rules to capture domain knowledge and apply an inference algorithm to reason with it. In recent years, the focus of research has shifted from this rule-based approach to new approaches to knowledge-based reasoning. One of these new approaches is case-based reasoning.

Case-based reasoning differs from traditional rule-based systems in that knowledge is represented not in rules but in examples. It builds on the idea that human expertise is composed not of formal structures like rules, but of experience: a human expert reasons by relating a new problem to previous ones. Case-based reasoning thus amounts to reasoning by comparing a new problem with a set of stored previous problems and their solutions. The solution to the new problem is constructed by retrieving similar problems from memory and adapting their associated solutions to the new problem.

Case-based reasoning has several advantages over reasoning with rules. The main advantage is that it is relatively easy to set up a knowledge base. While experience has shown that it is generally very difficult to capture knowledge of a problem domain in a set of rules, examples of problems in this domain with their associated solutions are often readily available or can easily be acquired. Another advantage is that case-based reasoning can be used in problem domains that are not well understood. Finally, a case-based reasoning system can easily be expanded: expanding it amounts to adding appropriate new examples to the set of cases. Expanding a rule-based system, on the other hand, is much more difficult: adding one rule often means rewriting a large part of the rule set.

A major problem in case-based reasoning, however, lies in the retrieval of cases that are sufficiently similar to the new problem at hand. For the purpose of retrieval, a case-based reasoning system uses a similarity measure. Based on the specific measure employed, the system associates with each case a numerical value indicating the similarity between that case and the problem under consideration. The basic idea is that the cases with the highest similarity are retrieved from memory. The solutions of the retrieved cases are then combined to create a solution for the new problem. The difficulty with this approach is that it is hard to find a similarity measure that actually assigns high values to the cases that are similar to the new problem. Several different similarity measures have been designed, mostly with a specific domain of application in mind. Because these measures are fine-tuned to different problem domains, their performances are not easily compared. In this thesis, we will investigate various similarity measures and describe experiments in which these measures are applied to the same problem domain, to give insight into their performances.
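
To make the retrieval step concrete, the following is a minimal sketch in Python. It is not taken from the thesis: the similarity measure (an inverse of the Euclidean distance), the majority vote used to combine solutions, the parameter k, and the small case base are all assumptions chosen for illustration only.

```python
from collections import Counter
from math import sqrt


def similarity(a, b):
    """Assumed measure: maps Euclidean distance into (0, 1]; 1.0 means identical."""
    distance = sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return 1.0 / (1.0 + distance)


def retrieve(case_base, problem, k=3):
    """Return the k stored cases most similar to the new problem, best first."""
    scored = [(similarity(features, problem), features, solution)
              for features, solution in case_base]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:k]


def solve(case_base, problem, k=3):
    """Combine the retrieved solutions by majority vote (one possible rule)."""
    retrieved = retrieve(case_base, problem, k)
    votes = Counter(solution for _, _, solution in retrieved)
    return votes.most_common(1)[0][0]


# Hypothetical case base: (feature vector, solution) pairs.
case_base = [
    ((1.0, 1.0), "A"),
    ((1.2, 0.9), "A"),
    ((5.0, 5.0), "B"),
    ((5.1, 4.8), "B"),
    ((0.9, 1.1), "A"),
]

print(solve(case_base, (1.1, 1.0)))  # prints: A
```

Replacing the body of similarity with a different measure leaves the rest of the sketch unchanged, which is the sense in which different measures can be compared on one and the same case base.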
...