Thanks for both answers, guys. I'll take a look at yacc.
> sound-alike and fuzzy logic aren't going to be very reliable, normally you'd
> just reduce it to a list of unique values and then hand check the results to
> remove the misspellings ( technically you've only got a problem if there are
> multiple spellings ).
Precisely, and there can be misspellings too. I've seen "BK" for book, in which case I would have to parse the field depending on other values.
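For what it's worth, the unique-values approach from the quote could look roughly like this in Python. This is only a sketch: the sample values and the `ALIASES` table are made up for illustration, and only "BK" for book is something I've actually seen in the data.

```python
from collections import Counter

# Hypothetical hand-built table mapping known odd spellings to one canonical value.
ALIASES = {"bk": "book"}

def unique_values(values):
    """Count each distinct value after trimming whitespace and lower-casing."""
    return Counter(v.strip().lower() for v in values)

def normalise(value):
    """Return the canonical spelling if known, else the cleaned-up value."""
    key = value.strip().lower()
    return ALIASES.get(key, key)

# Made-up sample data; the resulting list is what would get hand checked.
counts = unique_values(["Book", "BK", "book ", "boook"])
for value, n in counts.most_common():
    print(value, n, "->", normalise(value))
```

Anything that `normalise` leaves unchanged and that still looks wrong would go into `ALIASES` by hand.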
The code works like this: the first two digits are a general category (let's say "12" is for printed material), and each number after that is a sub-category ("34" for hardcover books, "5" for red).
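To make that concrete, here is a rough Python sketch of splitting such a code. It assumes the sub-categories are separated by some non-digit character (as in "12-34-5"), which is my assumption for the example and may not match the real data. The `parse_code` and `describe` names and the two lookup tables are hypothetical and hold only the sample values above.

```python
import re

# Hypothetical lookup tables holding only the sample values from this post.
CATEGORIES = {"12": "printed material"}
SUBCATEGORIES = {"34": "hardcover book", "5": "red"}

def parse_code(code):
    """Split a code into (category, [sub-categories]).

    Assumes a two-digit category followed by sub-categories that are
    separated by non-digit characters, e.g. "12-34-5".
    """
    code = code.strip()
    if not re.fullmatch(r"\d{2}(\D+\d+)*", code):
        raise ValueError(f"unrecognised code: {code!r}")
    category, rest = code[:2], code[2:]
    return category, re.findall(r"\d+", rest)

def describe(code):
    """Translate a code into labels, keeping unknown parts as raw values."""
    category, subs = parse_code(code)
    labels = [CATEGORIES.get(category, category)]
    labels += [SUBCATEGORIES.get(s, s) for s in subs]
    return labels

print(describe("12-34-5"))
```

Codes that don't fit the pattern (a stray "BK", for example) raise `ValueError`, so they can be collected and checked by hand instead of being guessed at.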
Don't blame me, I didn't make the old system :)