Lexical Similarity Measure

My last post described Levenshtein's algorithm, which quantifies the difference between two sequences using the "edit distance" metric. Recently, I read about a measure that uses the same metric to calculate the lexical similarity between two strings Li and Lj:

SM(Li, Lj) := max(0, (min(|Li|, |Lj|) - ed(Li, Lj)) / min(|Li|, |Lj|))

where

|Li| and |Lj| are the lengths of Li and Lj, and
ed(Li, Lj) is the edit distance between Li and Lj.
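
To make the formula concrete, here is a minimal Python sketch. The function names edit_distance and lexical_similarity are my own, not taken from the paper, and the edit distance is the plain Levenshtein distance with unit costs for insertion, deletion and substitution. The formula is undefined when the shorter string is empty (division by zero), so the handling of that case below is an assumption on my part.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance with unit costs, keeping only two rows of the table."""
    previous = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        current = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute ca with cb (or match)
            ))
        previous = current
    return previous[-1]


def lexical_similarity(li: str, lj: str) -> float:
    """SM(Li, Lj) = max(0, (min(|Li|, |Lj|) - ed(Li, Lj)) / min(|Li|, |Lj|))."""
    shortest = min(len(li), len(lj))
    if shortest == 0:
        # Assumption (not from the formula): two empty strings count as
        # identical, an empty string versus a non-empty one as dissimilar.
        return 1.0 if li == lj else 0.0
    return max(0.0, (shortest - edit_distance(li, lj)) / shortest)


print(lexical_similarity("TopHotel", "Top_Hotel"))  # 0.875
print(lexical_similarity("kitten", "sitting"))      # 0.5
print(lexical_similarity("ab", "wxyz"))             # 0.0
```

The result always lies between 0 and 1: identical strings score 1, and the max(0, ...) clamp keeps the score at 0 when the edit distance exceeds the length of the shorter string, as in the last example.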

This method can also be applied to compare two sets of strings. Details can be found in the paper: A. Maedche and S. Staab, "Measuring Similarity between Ontologies", Karlsruhe, Germany.
