Thursday, April 21, 2011

Why should you continue reading...

Another post about the Levenshtein distance in MS SQL... Why should you bother reading it?

There are ample examples out there of how to create such a function in SQL, but I can give you one very good reason to keep reading... SPEED!

I will start by showing you the benefits, so you know whether or not this is interesting for you. That could save you valuable time you would otherwise spend reading something that is of no use to you.

First, a screenshot of the results you get when using a Levenshtein function written in native SQL:

Here I compare two strings of 2400 characters each and, as you can see highlighted in green, it took about 20 seconds to do the job, yielding a result of an 88.89% match. OK, my server might not be running optimally, but what matters is the comparison, and the next screenshot is from the same server.

Next, a screenshot of the results you get when using a Levenshtein function built with the method described further down (keeping the secret until the last moment):

Comparing the same strings now takes (indeed, there is no trickery here)...
0, naught, nil, zero seconds, yielding the same result of an 88.89% match. My functions return a percentage as the result, since I find that makes more sense than the number of edits a Levenshtein function would normally return.
In fact, if you use SQL Profiler you will find that it takes less than a millisecond. It is so fast that the profiler doesn't even register a time difference between the start and the end of the function!
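
To make the "percentage instead of number of edits" idea concrete, here is a minimal sketch in Python (not the SQL used for the screenshots). The formula is an assumption on my part for illustration: the edit count is divided by the length of the longer string and turned into a match percentage. The names levenshtein and match_percentage are hypothetical and do not refer to the functions shown in the screenshots.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance, keeping only two rows."""
    if len(a) < len(b):
        a, b = b, a  # make b the shorter string so the rows stay small
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution (or match)
            ))
        previous = current
    return previous[-1]


def match_percentage(a: str, b: str) -> float:
    """Assumed conversion: 100% minus the edits relative to the longer string."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 100.0  # two empty strings are identical
    return round((1 - levenshtein(a, b) / longest) * 100, 2)


print(levenshtein("kitten", "sitting"))              # 3
print(match_percentage("abcdefghi", "abcdefghx"))    # 88.89
```

Under this assumed formula, one edit in nine characters happens to give 88.89%; that only illustrates the arithmetic and says nothing about the actual strings compared above.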

Does that at least tickle your fancy?

If it does, then check out my next post, where I will show you how it is done.
I will show you both ways, so you can check whether my native SQL function could be optimised to yield the same results... I think not, but I like to be surprised.

How it is done...
