Author Topic: Anagram solver!  (Read 7086 times)

0 Members and 1 Guest are viewing this topic.

Offline mynameistmp

  • Full Member
  • ***
  • Posts: 111
  • Hi! I'm new here!
    • View Profile
Anagram solver!
« on: December 05, 2005, 06:58:11 am »
My Mom always does these anagram puzzles in her morning papers, so I took it upon myself to create a solution finder for her! Now I know that most of you probably aren't interested in what tmps Mother does or solving anagram problems, but you might find the code I made somewhat interesting.

What I am doing is taking a scrambled input string and checking to see if any combination of the string exists in the dictionary file (is it a valid english word or not). Here is what the program looks like in action:

Quote
bash-3.00$ time puzzle.sh adegln
Using anagram: adegln
Sorting adegln alphabetically... adegln
Scanning database...
Found 1 result(s):
angled
Found 2 result(s):
dangle

real    0m0.343s

As you can see, it works pretty quickly ;P The time program tells us less than a second, anyways.

How it works:

I started off with just a dictionary of all english words (thanks iago). I created script.sh which takes every line in the wordlist and rearranges each character alphabetically, keeping all of the lines in the original order. Now you have two wordlists, both parallel on a line to line basis. Then I created afilter.c. afilter (once compiled) takes data from standard input and rearranges the data alphabetically, then spits out the arrangement, in classical unix fashion ;P. Then I wrote puzzle.sh, a bash script that takes a command line argument (the scrambled word), pipes it to afilter wich outputs an alphabetically arranged version of the anagram. That output is grep'd from the second (alphabetized) wordlist. When a match is found, the line number is extracted via awk and awk is used again to scan the first (original) wordlist for that very line number, which is parallel to the second word list, thus giving you the matching (english) word.

Here is the link to the source (script.sh, afilter.c, puzzle.sh, both wordlists) if anybody is interested:
www.javaop.com/~tmp/puzzle.tar.gz

It is 600k, 95% of which is the dictionary file included.
« Last Edit: December 05, 2005, 06:59:59 am by mynameistmp »

Offline Joe

  • B&
  • Moderator
  • Hero Member
  • *****
  • Posts: 10319
  • In Soviet Russia, text read you!
    • View Profile
    • Github
Re: Anagram solver!
« Reply #1 on: December 05, 2005, 07:59:48 am »
That's pretty sweet.
I'd personally do as Joe suggests

You might be right about that, Joe.


Offline MyndFyre

  • Boticulator Extraordinaire
  • x86
  • Hero Member
  • *****
  • Posts: 4540
  • The wait is over.
    • View Profile
    • JinxBot :: the evolution in boticulation
Re: Anagram solver!
« Reply #2 on: December 05, 2005, 11:45:03 am »
I wonder: would it be possible to make this faster by creating a database, hashing each word by-letter, and then doing a word lookup by the smaller result subset?

Example:
H(x) is defined:
Code: [Select]
proc H(x):
  hashcode := 0
  for each letter in x
  {
     hashcode ^= letter
  }

  return hashcode
end proc
(Joe, that's real pseudocode :P)

You'll have many more collisions, but you also should speed up lookups.  Other hash functions would be acceptable too, as long as they had the commutative property.  For example, addition and multiplication could work because you get the same result regardless of the operations order.  This is important for an anagram finder, and in fact addition might be better than XOR.

Anyway, once you have the rows with the matching hash, you can do the same row comparison that you do now with the words.  :)
« Last Edit: December 06, 2005, 07:36:33 pm by MyndFyre[x86] »
I have a programming folder, and I have nothing of value there

Running with Code has a new home!

Our species really annoys me.

Offline mynameistmp

  • Full Member
  • ***
  • Posts: 111
  • Hi! I'm new here!
    • View Profile
Re: Anagram solver!
« Reply #3 on: December 05, 2005, 03:45:43 pm »
Quote
I wonder: would it be possible to make this faster by creating a database, hashing each word by-letter, and then doing a word lookup by the smaller result subset?

I could think of a few ways to optimize it (afilter is rather slow itself), however the difference can be no more than a fraction of a second, as it is running at a fraction of a second as it is. Note: all of the search/compare algorithm is implemented by grep itself.

Offline MyndFyre

  • Boticulator Extraordinaire
  • x86
  • Hero Member
  • *****
  • Posts: 4540
  • The wait is over.
    • View Profile
    • JinxBot :: the evolution in boticulation
Re: Anagram solver!
« Reply #4 on: December 05, 2005, 05:48:15 pm »
Of course it's no more than a fraction of a second.  :P  Still, though, a few jillion runs and those fractional seconds add up.  :P
I have a programming folder, and I have nothing of value there

Running with Code has a new home!

Our species really annoys me.

Offline iago

  • Leader
  • Administrator
  • Hero Member
  • *****
  • Posts: 17914
  • Fnord.
    • View Profile
    • SkullSecurity
Re: Anagram solver!
« Reply #5 on: December 05, 2005, 06:05:50 pm »
Tmp's solution is pretty similar to another problem I outlined, from the book Programming Pearls, by Bentley.  The code to sort the letters in each word:
http://www.cs.bell-labs.com/cm/cs/pearls/sign.c

The wordlist is piped into that, then the output from that is piped into the sort program, and the output from that is piped into this:
http://www.cs.bell-labs.com/cm/cs/pearls/squash.c

Which combines similar words onto the same line.  The output from that could be grepped or you could use an optimized search on it. 

I highly recommend Programming Pearls, it's an amazing book. 

Offline Quik

  • Webmaster Guy
  • x86
  • Hero Member
  • *****
  • Posts: 3262
  • \x51 \x75 \x69 \x6B \x5B \x78 \x38 \x36 \x5D
    • View Profile
Re: Anagram solver!
« Reply #6 on: December 05, 2005, 07:22:27 pm »
This is a very interesting idea, and looks quite successful. Although, I think you may be ruining it for your mother.
Quote
[20:21:13] xar: i was just thinking about the time iago came over here and we made this huge bomb and light up the sky for 6 min
[20:21:15] xar: that was funny

Offline Joe

  • B&
  • Moderator
  • Hero Member
  • *****
  • Posts: 10319
  • In Soviet Russia, text read you!
    • View Profile
    • Github
Re: Anagram solver!
« Reply #7 on: December 05, 2005, 08:21:53 pm »
Thanks! I'll have to use this for my science homework from now on. I *hate* unscrambling stuff.

If I get a chance, I suppose I'll port this to entirely C++, so that it can be compiled with VC++ for Windows.
I'd personally do as Joe suggests

You might be right about that, Joe.


Offline iago

  • Leader
  • Administrator
  • Hero Member
  • *****
  • Posts: 17914
  • Fnord.
    • View Profile
    • SkullSecurity
Re: Anagram solver!
« Reply #8 on: December 05, 2005, 08:34:01 pm »
I have to complement Tmp's choice in program.  He chose my #1 favorite problem to solve :)


Thanks! I'll have to use this for my science homework from now on. I *hate* unscrambling stuff.

If I get a chance, I suppose I'll port this to entirely C++, so that it can be compiled with VC++ for Windows.

It might be a pain to get that working on Windows, Windows sucks at piping and stuff.  But who knows?

Offline Joe

  • B&
  • Moderator
  • Hero Member
  • *****
  • Posts: 10319
  • In Soviet Russia, text read you!
    • View Profile
    • Github
Re: Anagram solver!
« Reply #9 on: December 06, 2005, 07:56:50 am »
Well, I could probably do it all internally. Grep would be the hardest part, where I'd probably fail.
I'd personally do as Joe suggests

You might be right about that, Joe.