Pages

Monday, January 16, 2012

Appendix F. Additional Supplied Modules

F.15. fuzzystrmatch

F.15.1. Soundex

The Soundex system is a method of matching similar-sounding names by converting them to the same code. (Soundex is not very useful for non-English names). The soundex function converts a string to its Soundex code. The difference function converts two strings to their Soundex codes and then reports the number of matching code positions. Since Soundex codes have four characters, the result ranges from zero to four, with zero being no match and four being an exact match. (Thus, the function is misnamed — similarity would have been a better name.)

SELECT soundex('Anne'), soundex('Ann'), difference('Anne', 'Ann');
     
SELECT soundex('Anne'), soundex('Andrew'), difference('Anne', 'Andrew');
SELECT soundex('Anne'), soundex('Margaret'), difference('Anne', 'Margaret');
    
F.15.3. Metaphone

Metaphone & Double Metaphone are algorithms that produce variable length keys for indexing words by their sound. definitely better compared to SOUNDEX because it is more precise compared to the fixed 4 character code of SOUNDEX. 

Metaphone, like Soundex, is based on the idea of constructing a representative code for an input string. Two strings are then deemed similar if they have the same codes.(source has to be a non-null string with a maximum of 255 characters)

SELECT metaphone('GUMBO', 4);
 metaphone
-----------
 KM
F.15.4. Double Metaphone
The Double Metaphone system computes two "sounds like" strings for a given input string — a "primary" and an "alternate". In most cases they are the same, but for non-English names especially they can be a bit different, depending on pronunciation. These functions compute the primary and alternate codes:

select dmetaphone('gumbo');
 dmetaphone
------------
 KMP

No comments:

Post a Comment