java.lang.Object
  FreeCBR.CBR
public class CBR
This is a CBR (Case-Based Reasoning) API implementation. It finds the
closest match among cases in a case set. Each case consists of
a predefined set of features. The features are defined by a
name and a datatype where the datatype may be any of String,
MultiString, Float, Int and Bool.
The closest match is calculated using a weighted Euclidean distance, like the Pythagorean theorem in n dimensions.
The returned "hit percentage" is calculated as
100 * (1 - sqrt(case distance / sum(weights))) and takes a value
between 0 and 100.
The case distance between the search and a case is a floating point number
calculated as:
case distance = weight_1 * dist_1^2 +
weight_2 * dist_2^2 + ... +
weight_n * dist_n^2
where
dist_i is the distance between the searched feature and the actual case feature. This value is a float between 0 and 1 where 0 means exact hit and 1 means maximum distance.
weight_i is the weight for feature number "i". It is an integer >= 0, default = 5.
n is the number of features searched for.
This means that the case distance is >= 0 (0 means exact match) and <= sum_{i=1..n}(weight_i), that the weighted Euclidean distance sqrt(case distance) is <= sqrt(sum_{i=1..n}(weight_i)), and that sqrt(case distance / sum(weights)) lies between 0 and 1.
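The hit percentage formula above can be sketched in a few lines of Java. This is only an illustration of the arithmetic, not the FreeCBR implementation: the class and method names are hypothetical, and rejecting a weight sum of 0 is an assumption, since the text does not say how that case is handled.

```java
final class HitPercentageSketch {

    /** Weighted sum of squared feature distances: sum(weight_i * dist_i^2). */
    static double caseDistance(double[] dists, int[] weights) {
        if (dists.length != weights.length) {
            throw new IllegalArgumentException("one weight per feature distance is required");
        }
        double sum = 0.0;
        for (int i = 0; i < dists.length; i++) {
            sum += weights[i] * dists[i] * dists[i];
        }
        return sum;
    }

    /** 100 * (1 - sqrt(case distance / sum(weights))). */
    static double hitPercentage(double[] dists, int[] weights) {
        long weightSum = 0;
        for (int w : weights) {
            weightSum += w;
        }
        if (weightSum == 0) {
            // Assumption: a search where every weight is 0 is rejected here.
            throw new IllegalArgumentException("at least one weight must be > 0");
        }
        return 100.0 * (1.0 - Math.sqrt(caseDistance(dists, weights) / weightSum));
    }

    public static void main(String[] args) {
        double[] dists = {0.0, 0.5, 1.0}; // each feature distance lies in [0, 1]
        int[] weights = {5, 5, 5};        // 5 is the documented default weight
        System.out.println(hitPercentage(dists, weights));
    }
}
```

With every feature distance at 0 the result is 100, and with every feature distance at 1 it is 0, which matches the stated range.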
The distance between the searched feature and the actual case feature
is calculated as:
If the case value or the searched value is "?", the case feature is disqualified
and not included in the result.
If the search consists only of "?" values, there are no hits.
The "normal" algorithm (let's call it the NormalDistance algorithm) is
distance = min(1, diff(searchedvalue, casevalue)/((maxvalue - minvalue) * infinity_constant))
or in other words
if diff(searchedvalue, casevalue) > (maxvalue - minvalue) * infinity_constant then distance = 1, else
distance = diff(searchedvalue, casevalue)/((maxvalue - minvalue) * infinity_constant)
where "infinity_constant" is a constant that defines what distance is regarded as infinity.
The "logarithmic" algorithm (let's call it the LogarithmicDistance algorithm) is
distance = ln(NormalDistance * (e - 1) + 1)
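The two formulas above can be sketched as follows. The names are hypothetical, and two points are assumptions rather than documented behavior: diff() is taken to be the absolute difference of two numeric values, and a zero value range (maxvalue equal to minvalue) is treated as distance 0 for equal values and 1 otherwise.

```java
final class FeatureDistanceSketch {

    /** min(1, diff / ((maxvalue - minvalue) * infinity_constant)). */
    static double normalDistance(double searched, double caseValue,
                                 double minValue, double maxValue,
                                 double infinityConstant) {
        double diff = Math.abs(searched - caseValue); // assumption: diff() is |a - b|
        double range = (maxValue - minValue) * infinityConstant;
        if (range <= 0.0) {
            // Assumption: avoid dividing by zero when all case values are equal.
            return diff == 0.0 ? 0.0 : 1.0;
        }
        return Math.min(1.0, diff / range);
    }

    /** ln(NormalDistance * (e - 1) + 1): maps 0 to 0 and 1 to 1. */
    static double logarithmicDistance(double normalDistance) {
        return Math.log(normalDistance * (Math.E - 1.0) + 1.0);
    }

    public static void main(String[] args) {
        double normal = normalDistance(20.0, 70.0, 0.0, 100.0, 1.0);
        System.out.println(normal);
        System.out.println(logarithmicDistance(normal));
    }
}
```

For values strictly between 0 and 1 the logarithmic distance is larger than the normal distance, so small deviations are penalized more heavily on the logarithmic scale.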
If the search is done for:
"=" and fuzzy linear: if exact match then distance = 0 else use the NormalDistance algorithm
"=" and fuzzy logarithmic: if exact match then distance = 0 else use the LogarithmicDistance algorithm
"=" and strict: if exact match then distance = 0 otherwise the entire case is disqualified and not included in the result
"=" and flat: if exact match then distance = 0 otherwise distance = 1
"!=" and fuzzy linear: if exact match then distance = 1 else use the NormalDistance algorithm inverted
"!=" and fuzzy logarithmic: if exact match then distance = 1 else use the LogarithmicDistance algorithm inverted
"!=" and strict: if exact match then the entire case is disqualified and not included in the result, otherwise distance = 0
"!=" and flat: if exact match then distance = 1 otherwise distance = 0
">=" and fuzzy linear: if searched value >= case value then 0 otherwise use the NormalDistance algorithm
">=" and fuzzy logarithmic: if searched value >= case value then 0 otherwise use the LogarithmicDistance algorithm
">=" and strict: if searched value >= case value then 0 otherwise the entire case is disqualified and not included in the result
">=" and flat: if searched value >= case value then 0 otherwise 1
">" and fuzzy linear: if searched value > case value then 0 otherwise use the NormalDistance algorithm
">" and fuzzy logarithmic: if searched value > case value then 0 otherwise use the LogarithmicDistance algorithm
">" and strict: if searched value > case value then 0 otherwise the entire case is disqualified and not included in the result
">" and flat: if searched value > case value then 0 otherwise 1
"<=" and fuzzy linear: if searched value <= case value then 0 otherwise use the NormalDistance algorithm
"<=" and fuzzy logarithmic: if searched value <= case value then 0 otherwise use the LogarithmicDistance algorithm
"<=" and strict: if searched value <= case value then 0 otherwise the entire case is disqualified and not included in the result
"<=" and flat: if searched value <= case value then 0 otherwise 1
"<" and fuzzy linear: if searched value < case value then 0 otherwise use the NormalDistance algorithm
"<" and fuzzy logarithmic: if searched value < case value then 0 otherwise use the LogarithmicDistance algorithm
"<" and strict: if searched value < case value then 0 otherwise the entire case is disqualified and not included in the result
"<" and flat: if searched value < case value then 0 otherwise 1
"max" and fuzzy linear: the NormalDistance algorithm between current case value and the max case value
"max" and fuzzy logarithmic: the LogarithmicDistance algorithm between current case value and the max case value
"max" and strict: if searched value is the max case value then 0, otherwise the entire case is disqualified and not included in the result
"max" and flat: if searched value is the max case value then 0, otherwise 1
"min" and fuzzy linear: the NormalDistance algorithm between current case value and the min case value
"min" and fuzzy logarithmic: the LogarithmicDistance algorithm between current case value and the min case value
"min" and strict: if searched value is the min case value then 0, otherwise the entire case is disqualified and not included in the result
"min" and flat: if searched value is the min case value then 0, otherwise 1
| Field Summary | |
|---|---|
| static int | DEFAULT_WEIGHT - Default weight |
| protected int | INFINITY_CONSTANT - Values further away than this are considered infinity |
| static int | SEARCH_OPTION_INVERTED - Should the search result be inverted? |
| static short | SEARCH_SCALE_FLAT - Search with a "flat" scale: if the hit is not exact it is treated as maximum distance |
| static short | SEARCH_SCALE_FUZZY_LINEAR - Search with a linear scale. |
| static short | SEARCH_SCALE_FUZZY_LOGARITHMIC - Search with a logarithmic scale |
| static short | SEARCH_SCALE_STRICT - Search "strict": if the hit is not exact the case is not included in the result at all |
| static short | SEARCH_TERM_EQUAL - Search for closest value. |
| static short | SEARCH_TERM_GREATER - Search for greater values. |
| static short | SEARCH_TERM_GREATER_OR_EQUAL - Search for greater or equal values. |
| static short | SEARCH_TERM_LESS - Search for smaller values. |
| static short | SEARCH_TERM_LESS_OR_EQUAL - Search for smaller or equal values. |
| static short | SEARCH_TERM_MAX - Search for maximum values, the higher the better. |
| static short | SEARCH_TERM_MIN - Search for minimum, the lower the better. |
| static short | SEARCH_TERM_NOT_EQUAL - Search for non-equal values. |
| Constructor Summary | |
|---|---|
| CBR() | Constructor that initiates the CBR with no data. |
| CBR(java.lang.String logfile, boolean verbose, boolean silent) | Constructor that initiates the CBR with no data |
| CBR(java.lang.String datafile, java.lang.String logfile, boolean verbose, boolean silent) | Constructor that initiates the CBR with data |
| Method Summary | |
|---|---|
| void | addCase(Feature[] features) - Adds a case to the set |
| void | addCase(java.lang.String caseString) - Adds a case to the set |
| void | addFeature(java.lang.String name, short type) - Adds a feature (column) to the set. |
| Feature[] | editCase(int caseNum, Feature[] features) - Replaces specified case with another |
| Feature[] | getCase(int caseNum) - Returns the case at the specified position |
| java.lang.String | getDatafile() - Returns the name of the data file currently in use |
| java.lang.String | getFeatureName(int featureNum) - Returns the name of the specified feature |
| int | getFeatureNum(java.lang.String featureName) - Returns the number of the feature that carries the specified name |
| short | getFeatureType(int featureNum) - Returns the datatype of the specified feature |
| Feature | getFeatureValue(int caseNum, int featureNum) - Returns the specified feature of the specified case |
| java.lang.String | getFeatureValueAX(int caseNum, int featureNum) - Returns the specified feature of the specified case. |
| int | getINFINITY_CONSTANT() - Returns the current infinity constant |
| java.lang.String | getLogfile() - Returns the name of the current log file |
| double | getMaxFloatValue(int featureNum) - Returns the maximum floating point value of all cases for the specified feature |
| long | getMaxIntValue(int featureNum) - Returns the maximum integer value of all cases for the specified feature |
| double | getMinFloatValue(int featureNum) - Returns the minimum floating point value of all cases for the specified feature |
| long | getMinIntValue(int featureNum) - Returns the minimum integer value of all cases for the specified feature |
| int | getNumCases() - Returns the number of cases in current set |
| int | getNumFeatures() - Returns the number of features that each case has. |
| boolean | getSilent() - Returns the silence state |
| java.lang.String[] | getUsedStringValues(int featureNum) - Returns all of the string values used at specified feature, works for String and MultiString features |
| java.lang.String | getUsedStringValuesAX(int featureNum, java.lang.String separator) - Returns all of the string values used at specified feature, works for String and MultiString features. |
| boolean | getVerbose() - Returns the verbose state |
| void | initialize(java.lang.String datafile, java.lang.String logfile) - Initializes the CBR if not already done. |
| void | loadSet(java.lang.String filename) - Loads a case set to memory |
| void | newSet(java.lang.String[] featureNames, java.lang.String[] featureTypeNames) - Empties the memory: deletes the current set from memory and creates a new empty set with the specified feature names and feature data types |
| void | readData() - Reads the data from the datafile. |
| Feature[] | removeCase(int caseNum) - Removes the specified case from the set |
| void | removeFeature(int featureNumber) - Deletes a feature (column) from the set. |
| void | saveSet(java.lang.String filename, boolean setDefault) - Saves the entire case set |
| CBRResult[] | search(int[] searchFeatureNumbers, Feature[] searchValues, int[] searchWeights, int[] searchTerms, int[] searchScales, int[] searchOptions) - Performs a search for the best match. |
| WebResult | search(java.lang.Object req) - Performs a search for the best match in a "web" way. |
| CBRResult[] | search(java.lang.String[] searchFeatureNames, java.lang.String[] searchValues, int[] searchWeights, int[] searchTerms, int[] searchScales, int[] searchOptions) - Performs a search for the best match. |
| java.lang.String | searchAX(java.lang.Object[] searchFeatureNames, java.lang.Object[] searchValues, java.lang.Object[] searchWeights, java.lang.Object[] searchTerms, java.lang.Object[] searchScales, java.lang.Object[] searchOptions, java.lang.String resultSeparator, java.lang.String caseSeparator) - Performs a search for the best match, used primarily by ActiveX components. |
| void | setDatafile(java.lang.String datafile) - Sets the data file. |
| void | setFeatureName(int featureNum, java.lang.String newName) - Sets the name of the specified feature to the specified value |
| void | setFeatureType(int featureNum, short newType) - Sets the datatype of the specified feature |
| void | setFeatureValue(int caseNum, int featureNum, java.lang.String value) - Sets the specified feature of the specified case to the specified value |
| void | setINFINITY_CONSTANT(int infinity) - Sets the infinity constant |
| void | setLogfile(java.lang.String logfile) - Sets the log file to the specified path |
| void | setSilent(boolean silent) - Sets the silence state |
| void | setVerbose(boolean verbose) - Sets the verbose state |
| Methods inherited from class java.lang.Object |
|---|
| clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Field Detail |
|---|
protected int INFINITY_CONSTANT
public static final int DEFAULT_WEIGHT
public static final short SEARCH_TERM_EQUAL
public static final short SEARCH_TERM_NOT_EQUAL
public static final short SEARCH_TERM_GREATER_OR_EQUAL
public static final short SEARCH_TERM_GREATER
public static final short SEARCH_TERM_LESS_OR_EQUAL
public static final short SEARCH_TERM_LESS
public static final short SEARCH_TERM_MAX
public static final short SEARCH_TERM_MIN
public static final short SEARCH_SCALE_FUZZY_LINEAR
public static final short SEARCH_SCALE_FUZZY_LOGARITHMIC
public static final short SEARCH_SCALE_FLAT
public static final short SEARCH_SCALE_STRICT
public static final int SEARCH_OPTION_INVERTED
| Constructor Detail |
|---|
public CBR()
public CBR(java.lang.String logfile,
boolean verbose,
boolean silent)
logfile - path to the file to write log information to. May be set to "null", which means no logging.
verbose - if true then extra verbose information is added to the logfile
silent - if true then no information is output to standard error
public CBR(java.lang.String datafile,
java.lang.String logfile,
boolean verbose,
boolean silent)
throws java.io.IOException
datafile - path to the datafile
logfile - path to the file to write log information to. May be set to "null", which means no logging.
verbose - if true then extra verbose information is added to the logfile
silent - if true then no information is output to standard error
java.io.IOException - if unable to read file
| Method Detail |
|---|
public boolean getVerbose()
public void setVerbose(boolean verbose)
verbose - the verbose state to assume
public boolean getSilent()
public void setSilent(boolean silent)
silent - the silence state to assume
public java.lang.String getLogfile()
public void setLogfile(java.lang.String logfile)
logfile - file to use as log file
public java.lang.String getDatafile()
public void setDatafile(java.lang.String datafile)
initialize()
datafile - file to use as input data file
See also: readData(), initialize(String, String)
public void readData()
throws java.io.IOException,
FreeCBR.NoDataException
initialize() instead.
java.io.IOException - if unable to read from the current data file
NoDataException - if no fileHandler previously specified
See also: setDatafile(String), initialize(String, String)
public void initialize(java.lang.String datafile,
java.lang.String logfile)
throws java.io.IOException
datafile - file to use as input data file. If null then a new empty case set is created. If the datafile was already specified with the same value, nothing happens.
logfile - file to use for logging. If null, or if the log file was already specified with the same value, nothing happens.
java.io.IOException - if an error occurs when reading the data file
See also: setLogfile(String), setDatafile(String), readData()
public int getINFINITY_CONSTANT()
See also: INFINITY_CONSTANT
public void setINFINITY_CONSTANT(int infinity)
infinity - integer to use as infinity
See also: INFINITY_CONSTANT
public int getNumCases()
public int getNumFeatures()
public void addCase(java.lang.String caseString)
caseString - a string describing the case to add: a tab-separated
string with the feature values in the correct order. MultiString
values are separated by semicolons. An example might be
"HP[tab]1000.5[tab]CD-RW;DVD;Scanner"
public void addCase(Feature[] features)
features - an array of features for the case to add
See also: Feature
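The case string format described for addCase(String), tab-separated feature values with semicolon-separated MultiString values, can be assembled as in the sketch below. The helper names are hypothetical and not part of FreeCBR; the sketch assumes that the values themselves contain no tab or semicolon characters, since the text describes no escaping rule.

```java
import java.util.Arrays;
import java.util.List;

final class CaseStringSketch {

    /** Joins the values of one MultiString feature with semicolons. */
    static String multiString(List<String> values) {
        return String.join(";", values);
    }

    /** Joins the feature values of one case, in feature order, with tabs. */
    static String buildCaseString(List<String> featureValues) {
        return String.join("\t", featureValues);
    }

    public static void main(String[] args) {
        String accessories = multiString(Arrays.asList("CD-RW", "DVD", "Scanner"));
        String caseString = buildCaseString(Arrays.asList("HP", "1000.5", accessories));
        // The resulting string is the kind of argument addCase(String) expects.
        System.out.println(caseString);
    }
}
```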
public Feature[] getCase(int caseNum)
throws FreeCBR.NoDataException
caseNum - the number of the case to retrieve (0-based)
NoDataException - when no data is in case base
public Feature[] editCase(int caseNum,
Feature[] features)
caseNum - the number of the case to replace
features - the features of the new case
public Feature[] removeCase(int caseNum)
caseNum - the number of the case to delete
public void addFeature(java.lang.String name,
short type)
name - the name of the new feature
type - the type of the new feature
public Feature getFeatureValue(int caseNum,
int featureNum)
throws FreeCBR.NoDataException
caseNum - the number of the case to retrieve
featureNum - the number of the feature to retrieve
NoDataException - when no data is read
public java.lang.String getFeatureValueAX(int caseNum,
int featureNum)
throws FreeCBR.NoDataException
caseNum - the number of the case to retrieve
featureNum - the number of the feature to retrieve
NoDataException - when no data is read
public void setFeatureValue(int caseNum,
int featureNum,
java.lang.String value)
throws FreeCBR.NoDataException
caseNum - the number of the case to change
featureNum - the number of the feature to change
value - new value to use
NoDataException - when no data is read
public java.lang.String getFeatureName(int featureNum)
throws FreeCBR.NoDataException
featureNum - the number of the feature whose name to retrieve
NoDataException - when no data is read
public void setFeatureName(int featureNum,
java.lang.String newName)
throws FreeCBR.NoDataException
featureNum - the number of the feature whose name to change
newName - the new feature name to use
NoDataException - when no data is read
public int getFeatureNum(java.lang.String featureName)
throws FreeCBR.NoDataException
featureName - the name of the feature
NoDataException - when no data is read
public short getFeatureType(int featureNum)
throws FreeCBR.NoDataException
featureNum - the number of the feature whose type to retrieve
NoDataException - when no data is read
See also: Feature
public void setFeatureType(int featureNum,
short newType)
throws FreeCBR.NoDataException
featureNum - the number of the feature whose type to change
newType - the feature type
NoDataException - when no data is read
See also: Feature
public void removeFeature(int featureNumber)
featureNumber - the number of the feature to delete
public java.lang.String[] getUsedStringValues(int featureNum)
throws FreeCBR.IllegalTypeException,
FreeCBR.NoDataException
featureNum - the number of the feature whose string values to retrieve
NoDataException - when no data is read
IllegalTypeException - if the feature type is not String or MultiString
public java.lang.String getUsedStringValuesAX(int featureNum,
java.lang.String separator)
throws FreeCBR.IllegalTypeException,
FreeCBR.NoDataException
featureNum - the number of the feature whose string values to retrieve
separator - the separator to use
NoDataException - when no data is read
IllegalTypeException - if the feature type is not String or MultiString
public long getMinIntValue(int featureNum)
throws FreeCBR.IllegalTypeException,
FreeCBR.NoDataException
featureNum - the number of the feature whose minimum value to retrieve
IllegalTypeException - if feature not of type Int
NoDataException - when no data is read
public long getMaxIntValue(int featureNum)
throws FreeCBR.IllegalTypeException,
FreeCBR.NoDataException
featureNum - the number of the feature whose maximum value to retrieve
IllegalTypeException - if feature not of type Int
NoDataException - when no data is read
public double getMinFloatValue(int featureNum)
throws FreeCBR.IllegalTypeException,
FreeCBR.NoDataException
featureNum - the number of the feature whose minimum value to retrieve
IllegalTypeException - if feature not of type Float
NoDataException - when no data is read
public double getMaxFloatValue(int featureNum)
throws FreeCBR.IllegalTypeException,
FreeCBR.NoDataException
featureNum - the number of the feature whose maximum value to retrieve
IllegalTypeException - if feature not of type Float
NoDataException - when no data is read
public void saveSet(java.lang.String filename,
boolean setDefault)
throws java.io.IOException
filename - name of the file to save as. If null then save to current file.
setDefault - sets the specified filename to default if true. Otherwise saves as the specified file name this time only.
java.io.IOException - if an error occurs when saving the set
public void loadSet(java.lang.String filename)
throws java.lang.Exception
filename - name of the file to use.
java.lang.Exception - if an error occurs when loading the set
public void newSet(java.lang.String[] featureNames,
java.lang.String[] featureTypeNames)
featureNames - an array of feature names to use
featureTypeNames - an array of data type names to use (such as "String", "Float" and so on)
See also: Feature
public WebResult search(java.lang.Object req)
throws FreeCBR.NoDataException,
java.lang.Exception
featX, weightX, termX,
scaleX and optionX where X is
the number of the feature corresponding to this value
An example could be CBRBean.search(req)
where req.getQueryString() might look like
feat0=Compaq&scale0=0&feat3=1000&weight3=10
req - the servlet (or jsp) request. Must be javax.servlet.http.HttpServletRequest
NoDataException - when not enough data is present
java.lang.Exception
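To make the featX, weightX, termX, scaleX and optionX naming concrete, the sketch below groups the parameters of a query string such as the example above by feature number. It is a hypothetical helper, not how FreeCBR reads the request; it works on the raw query string, skips URL decoding for brevity, and ignores any parameter that does not follow the naming pattern.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class WebQuerySketch {

    // Parameter names consist of a kind followed by the feature number X.
    private static final Pattern KEY =
            Pattern.compile("(feat|weight|term|scale|option)(\\d+)");

    /** Returns, per feature number, the kinds given for it and their values. */
    static Map<Integer, Map<String, String>> groupByFeature(String query) {
        Map<Integer, Map<String, String>> byFeature = new TreeMap<>();
        for (String pair : query.split("&")) {
            int eq = pair.indexOf('=');
            if (eq < 0) {
                continue; // not a name=value pair
            }
            Matcher m = KEY.matcher(pair.substring(0, eq));
            if (!m.matches()) {
                continue; // unrelated parameter
            }
            int featureNum = Integer.parseInt(m.group(2));
            byFeature.computeIfAbsent(featureNum, k -> new LinkedHashMap<>())
                     .put(m.group(1), pair.substring(eq + 1));
        }
        return byFeature;
    }

    public static void main(String[] args) {
        System.out.println(groupByFeature("feat0=Compaq&scale0=0&feat3=1000&weight3=10"));
    }
}
```

For the example query this yields two entries: feature 0 with a search value and a scale, and feature 3 with a search value and a weight.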
public java.lang.String searchAX(java.lang.Object[] searchFeatureNames,
java.lang.Object[] searchValues,
java.lang.Object[] searchWeights,
java.lang.Object[] searchTerms,
java.lang.Object[] searchScales,
java.lang.Object[] searchOptions,
java.lang.String resultSeparator,
java.lang.String caseSeparator)
throws FreeCBR.NoDataException
searchFeatureNames - array of names of the features. Must be an array of Strings.
searchValues - array of strings describing the features to search for. Must be an array of Strings.
searchWeights - array of weights for the search, valid values are >= 0 where 0 means don't care. May be set to null, which means all features are equally important. Must be Null or an array of Integers.
searchTerms - array of terms of the search. May be any of the CBR.SEARCH_TERM_* constants. Must be Null or an array of Integers.
searchScales - array of the scale to use. May be any of CBR.SEARCH_SCALE_FUZZY_LINEAR, CBR.SEARCH_SCALE_FUZZY_LOGARITHMIC, CBR.SEARCH_SCALE_FLAT and CBR.SEARCH_SCALE_STRICT. Default (when set to 0 or null) is CBR.SEARCH_SCALE_FUZZY_LINEAR. Must be Null or an array of Integers.
searchOptions - array of options on how to perform the search. Default is no options. Must be Null or an array of Integers.
resultSeparator - string to use to separate the case number and the match percentage in the result
caseSeparator - string to use to separate the cases in the result
If the resultSeparator
is ":" and the caseSeparator is ";", the result might look like:
"3:33.3;0:25;2:12.5;1:12.5;4:0" which would mean that the best
match is case number 3 with a search hit of 33.3%, case number 0 has a hit
rate of 25% and case number 1 and 2 have a hit rate of 12.5% each. Case
number 4 has the lowest hit rate, 0%. The cases are always returned in
decreasing hit order.
NoDataException - when not enough data is present
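A caller that receives such a result string can split it back into case numbers and hit percentages. The sketch below is a hypothetical helper, not part of FreeCBR, and assumes that percentages use "." as the decimal separator, as in the example above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

final class SearchAxResultSketch {

    /** One entry of the result: a case number and its hit percentage. */
    static final class Hit {
        final int caseNum;
        final double percent;

        Hit(int caseNum, double percent) {
            this.caseNum = caseNum;
            this.percent = percent;
        }
    }

    /** Splits a result string into hits, keeping the order in which they appear. */
    static List<Hit> parse(String result, String resultSeparator, String caseSeparator) {
        List<Hit> hits = new ArrayList<>();
        if (result.isEmpty()) {
            return hits;
        }
        // Pattern.quote treats the separators literally, even if they are regex metacharacters.
        for (String entry : result.split(Pattern.quote(caseSeparator))) {
            String[] parts = entry.split(Pattern.quote(resultSeparator), 2);
            hits.add(new Hit(Integer.parseInt(parts[0].trim()),
                             Double.parseDouble(parts[1].trim())));
        }
        return hits;
    }

    public static void main(String[] args) {
        for (Hit hit : parse("3:33.3;0:25;2:12.5;1:12.5;4:0", ":", ";")) {
            System.out.println("case " + hit.caseNum + " -> " + hit.percent + "%");
        }
    }
}
```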
public CBRResult[] search(java.lang.String[] searchFeatureNames,
java.lang.String[] searchValues,
int[] searchWeights,
int[] searchTerms,
int[] searchScales,
int[] searchOptions)
throws FreeCBR.NoDataException
searchFeatureNames - array of names of the features
searchValues - array of strings describing the features to search for
searchWeights - array of weights for the search, valid values are >= 0 where 0 means don't care. May be set to null, which means all features are equally important.
searchTerms - array of terms of the search. May be any of the CBR.SEARCH_TERM_* constants.
searchScales - array of the scale to use. May be any of CBR.SEARCH_SCALE_FUZZY_LINEAR, CBR.SEARCH_SCALE_FUZZY_LOGARITHMIC, CBR.SEARCH_SCALE_FLAT and CBR.SEARCH_SCALE_STRICT. Default (when set to 0 or null) is CBR.SEARCH_SCALE_FUZZY_LINEAR.
searchOptions - array of options on how to perform the search. Default is no options.
NoDataException - when not enough data is present
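How the four searchScales values treat a non-exact hit can be sketched for an "=" search, following the rules in the class description. The enum and method below are hypothetical stand-ins: the real API uses the short constants CBR.SEARCH_SCALE_*, whose numeric values are not given here, and an empty result is this sketch's way of expressing "the entire case is disqualified".

```java
import java.util.OptionalDouble;

final class EqualTermSketch {

    /** Hypothetical stand-in for the CBR.SEARCH_SCALE_* constants. */
    enum Scale { FUZZY_LINEAR, FUZZY_LOGARITHMIC, STRICT, FLAT }

    /**
     * Feature distance for an "=" search.
     * normalDistance is the NormalDistance value in [0, 1] for this feature.
     * An empty result means the case is disqualified.
     */
    static OptionalDouble equalDistance(double normalDistance, boolean exactMatch, Scale scale) {
        if (exactMatch) {
            return OptionalDouble.of(0.0); // every scale gives 0 for an exact match
        }
        switch (scale) {
            case FUZZY_LINEAR:
                return OptionalDouble.of(normalDistance);
            case FUZZY_LOGARITHMIC:
                return OptionalDouble.of(Math.log(normalDistance * (Math.E - 1.0) + 1.0));
            case STRICT:
                return OptionalDouble.empty();
            case FLAT:
                return OptionalDouble.of(1.0);
            default:
                throw new AssertionError(scale);
        }
    }

    public static void main(String[] args) {
        for (Scale scale : Scale.values()) {
            System.out.println(scale + ": " + equalDistance(0.3, false, scale));
        }
    }
}
```

The other search terms differ only in the condition that yields distance 0, so the same structure would apply with a different test in place of exactMatch.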
public CBRResult[] search(int[] searchFeatureNumbers,
Feature[] searchValues,
int[] searchWeights,
int[] searchTerms,
int[] searchScales,
int[] searchOptions)
searchFeatureNumbers - array of numbers of the features
searchValues - array of features to search for
searchWeights - array of weights for the search, valid values are 0 to 10 where 0 means don't care and 10 means "must match". May be set to null, which means all features are equally important.
searchTerms - array of terms of the search. May be any of the CBR.SEARCH_TERM_* constants.
searchScales - array of the scale to use. May be any of CBR.SEARCH_SCALE_FUZZY_LINEAR, CBR.SEARCH_SCALE_FUZZY_LOGARITHMIC, CBR.SEARCH_SCALE_FLAT and CBR.SEARCH_SCALE_STRICT. Default (when set to 0 or null) is CBR.SEARCH_SCALE_FUZZY_LINEAR.
searchOptions - array of options on how to perform the search. Default is no options.
See also: search(String[], String[], int[], int[], int[], int[])