Overclock.net › Forums › Software, Programming and Coding › Coding and Programming › Programming Challenge #8 (challenges for both beginners and experts)
New Posts  All Forums:Forum Nav:

Programming Challenge #8 (challenges for both beginners and experts)

post #1 of 42
Thread Starter 
Create a crossword solver.

(the next challenge is up)

Problem:
Most of you should be familiar with what crossword solvers are, but in case you're not, here's their brief overview: you have a crossword and you know some of the letters for a missing word but you also have a number of blanks. Crossword solvers provide every solution that matches against a pattern (see below) to allow the user to find the answer for the missing word.

Specification:
  • Your application must work against a list of words (see resources),
  • Your results should exclude words with numeric characters and punctuation (eg no hyphens nor apostrophes)
  • and provide a list of solutions that match the a user enterable pattern like such: c??e (which would return code, core, cope and many other for character words that start with c and end with e).
  • As said in the previous point, the search pattern must be user enterable (ie you cannot hardcode in c??e as the string to search for)
  • And finally, other specifications will be defined depending on the challenge and difficulty you wish to attempt (see solutions)

Ideally, I suggest we all use the same word list (see resources) and load that from disk at run time - however I'll allow for some flexibility here to enable greater variability in the difficulty levels.

I've also added a second challenge under hardest in case anyone fancies taking this exercise further (though I don't expect anyone to complete it)

Resources:
Word list:
I suggest we use the english.0 file from ispells dictionary:
http://downloads.sourceforge.net/wordlist/ispell-enwl-3.1.20.zip

Definitions / thesaurus (only required for hardest challenge): lengthy section (Click to show)
Most of the resources I've found here are just frontends to web APIs. And if you wish to use an online service for the hardest challenge then you're welcome to do so (I used colinsdictionary.com for my answer). However I have also found some offline files you can also use should you prefer:
http://wordnetcode.princeton.edu/wn3.1.dict.tar.gz
(That archive was taken from WordNet: http://wordnet.princeton.edu/wordnet/download/current-version/ )

The aforementioned archive contains a bunch of text files that link with each other as a bespoke database, so you'll have to do a little a bit of parsing to make any use of them. But as the data is stored in ASCII, it shouldn't be add any complexity to the program (it's more an annoying additional step).

From a quick look through the archive, I'd suggest concentrating your focus on the files prefixed with "data" (eg data.noun). However you may want to look through the entire archive just in case.

Solutions:
Easiest (challenge 1):
Any language using any includes / imports / modules available. Word list can be a subset already hardcoded into an array

Easy (challenge 1):
Any language using any includes / imports / modules available. Word list has to be loaded from disk (see resources)


Medium (challenge 1):
You're not allow to use your languages pattern matching (eg regex) functions / includes to perform the comparison (I appreciate this will create sub-optimal performance - but this is an academic exercise)


Hardest (challenge 2):
Any language using any includes / imports / modules available, however your results also have to correctly answer the clue. eg if the question is "a string of characters to convey a secret meaning" and the pattern is c??e then the answers core and cope would have a lower rank and/or be filtered out but code would match.

You don't need a 100% success rate 100% of the time and you can still return a multiple set of words - just so long as the suggestions are ranked to show their likelihood (ie your program must perform some logic to attempt to "solve" the clue as well as or instead of just bringing back a list or words based on a pattern match)

To complete this challenge, you'll need additional user input (the crossword question) and you'll need additional data sources to rank your results (it doesn't matter if you use these data sources in addition to the normal word list or in place of. And it doesn't matter what form these sources come in (eg dictionary definitions, thesaurus, etc) just as long as your program is making the comparisons and ranking the results.

This challenge is about proof of concept - about solving a real world problem using programming languages rather than a traditional punching code out tests that other programming challenges are. So while your code still needs to compile, I don't expect your results to be 100% accurate 100% of the time (though there will be bonus kudos points awarded to whoever's routine returns the most accurate results).


Happy coding smile.gif
Edited by Plan9 - 6/12/13 at 6:54am
post #2 of 42
A quick hack on the easy one.
Warning: Spoiler! (Click to show)
Code:
static void Main(string[] args)
{
    string[] words = File.ReadAllLines("english.0");

    if (!args.Any())
    {
        Console.Write("Enter a pattern or stop wasting my time.");
        return;
    }

    Regex regex = new Regex("^" + args[0].Replace("?", "[\\w]") + "$", RegexOptions.IgnoreCase);
    IEnumerable<string> matches = words.Where(w => regex.Match(w).Success);
    Console.Write(matches.Any() ? matches.Aggregate((in1, in2) => string.Format("{0}{1}{2}", in1, Environment.NewLine, in2)) : "No matches");
}

Ok, time for sleep biggrin.gif
Edited by randomizer - 5/29/13 at 6:10am
    
CPUMotherboardGraphicsRAM
i7 920 D0 MSI X58 Pro-E GTX 560 Ti 448 3x2GB G.Skill DDR3-1333 9-9-9-24 
Hard DriveHard DriveOptical DriveOS
840 Pro Caviar Black LG BD-ROM Windows 8.1 Pro x64 
MonitorMonitorKeyboardPower
Dell U2713HM Dell U2311H Turbo-Trak (Google it :D) Corsair HX-520 
CaseMouseMouse PadAudio
CM690 Mionix Avior 7000 Everglide Titan AKG K 242 HD 
  hide details  
Reply
    
CPUMotherboardGraphicsRAM
i7 920 D0 MSI X58 Pro-E GTX 560 Ti 448 3x2GB G.Skill DDR3-1333 9-9-9-24 
Hard DriveHard DriveOptical DriveOS
840 Pro Caviar Black LG BD-ROM Windows 8.1 Pro x64 
MonitorMonitorKeyboardPower
Dell U2713HM Dell U2311H Turbo-Trak (Google it :D) Corsair HX-520 
CaseMouseMouse PadAudio
CM690 Mionix Avior 7000 Everglide Titan AKG K 242 HD 
  hide details  
Reply
post #3 of 42
edit1: Got rid of Regex extension. Wasn't needed by loading from file on disc.
edit2: Got rid of non-alpha chars by simply checking if it was or not. Missed the specification in OP.


Quote:
C#

Crossword Solver (Medium).zip 128k .zip file

Code is included in ZIP (open with any text editor or IDE).
*Requires .NET 4.0 (Windows XP SP3 or higher).

Long Code (Click to show)
Code:
using System;
using System.Collections.Generic;
using System.Windows.Forms;
using System.IO;

namespace Crossword_Solver
{
    public partial class frmMain : Form
    {
        string fileName = "dic.txt";
        string[] words;

        public frmMain()
        {
            InitializeComponent();
        }
        
        private void frmMain_Load(object sender, EventArgs e)
        {
            tbInput.Text = Properties.Settings.Default.LastSearch;

            try
            {
                words = File.ReadAllLines(fileName);
            }
            catch(Exception ex)
            {
                MessageBox.Show(ex.Message, "Error", MessageBoxButtons.OK, MessageBoxIcon.Error);
                Environment.Exit(-1);
            }
        }

        private void frmMain_FormClosing(object sender, FormClosingEventArgs e)
        {
            Properties.Settings.Default.LastSearch = tbInput.Text;
            Properties.Settings.Default.Save();
        }

        private void btnHelp_Click(object sender, EventArgs e)
        {
            MessageBox.Show("Crossword Solver" + Environment.NewLine +
                            "© Kenny Parker, 2013. All rights reserved." + Environment.NewLine + Environment.NewLine +
                            "Enter in any word with missing letters to show all posibilities." + Environment.NewLine +
                            "Replace empty letters with \"?\"." + Environment.NewLine + 
                            "Example: pr g am = pr?g?am ",
                            "Help", MessageBoxButtons.OK, MessageBoxIcon.Information);
        }

        private void btnSearch_Click(object sender, EventArgs e)
        {
            btnSearch.Enabled = false;
            lbOutput.Clear();

            SearchWord(tbInput.Text)
                .ForEach(word => lbOutput.Items.Add(word));

            btnSearch.Enabled = true;
        }

        private List<string> SearchWord(string input)
        {
            var wordMatches = new List<string>();

            input = tbInput.Text.ToLower();

            for (int w = 0; w < words.Length; w++)
            {
                if (words[w].Length == input.Length)
                {
                    words[w] = words[w].ToLower();

                    int letterMatches = 0;

                    for (int l = 0; l < words[w].Length; l++)
                    {
                        if ((input[l] == '?' || words[w][l] == input[l]) && char.IsLetter(words[w][l])) letterMatches++;
                    }

                    if (letterMatches == input.Length) wordMatches.Add(words[w]);
                }
            }

            return wordMatches;
        }
    }
}


I think Hardest is too difficult to do alone.

The crossword question isn't always the definition.
Even when the definition is used, it does not always match up to other source definitions.

It would be nice if you have us a link to a definition file because I can't find one.
Edited by kennyparker1337 - 5/29/13 at 9:57am
post #4 of 42
Thread Starter 
Quote:
Originally Posted by kennyparker1337 View Post

C#

Crossword Solver (Medium).zip 128k .zip file

Code is included in ZIP (open with any text editor).

*Requires Windows XP SP3 or higher.

Nice work, but you're including non-alpha characters:

(also, do you mind posting your source code but inside the [SPOILER=Warning: Spoiler!][/SPOILER] tags please - so people can browse the source without having to download the zip file)

@randomizer: I've not tested your code, but it looks like your program doesn't filter out words with non-alpha characters too.

[edit]
I've tidied up the formatting of the specifications to in my opening post to make it easier to read.
Edited by Plan9 - 5/29/13 at 9:32am
post #5 of 42
^

Already fixed! I'm quick. ph34r2.gif
post #6 of 42
Thread Starter 
Quote:
Originally Posted by kennyparker1337 View Post

The crossword question isn't always the definition.
Even when the definition is used, it does not always match up to other source definitions.
Indeed. That's was the clever bit of this challenge. It's aimed at those who are skilled enough programmers that the other challenges don't offer any problems they haven't already solved in other projects - ie the hardest challenge isn't as much about new developers learning new programming skills, but instead about seasoned developers thinking outside the box to solve a counter-intuitive puzzle (from a programmers perspective).
Quote:
Originally Posted by kennyparker1337 View Post

It would be nice if you have us a link to a definition file because I can't find one.
Added, See OP. smile.gif
post #7 of 42
There are a lot of problems with Hardest.

Take this example:
???
Lay down the lawn.

"sod" will come up on the search but if you look at it's definition it is "surface layer of ground containing a mat of grass and grass roots".
Not a single word match between definition and hint.

I could hook into http://crosswordtracker.com/ but then there is no use because it would be easier to just provide the hint to the website instead of my program in the first place.

The challenge should be named Watson not Hardest. jealoussmiley.gif
Edited by kennyparker1337 - 5/29/13 at 11:40am
post #8 of 42
Thread Starter 
Quote:
Originally Posted by kennyparker1337 View Post

There are a lot of problems with Hardest.
There's supposed to be - otherwise it wouldn't be hard. rolleyes.gif
Quote:
Originally Posted by kennyparker1337 View Post

Take this example:
???
Lay down the lawn.

"sod" will come up on the search but if you look at it's definition it is "surface layer of ground containing a mat of grass and grass roots".
Not a single word match between definition and hint.
I think you'd have to be pretty dumb to expect 100% accuracy, but it's definitely possible. And if you're not happy with the word list I provided then bloody well find your own instead of moaning that my spoonfeeding isn't good enough rolleyes.gif
Quote:
Originally Posted by kennyparker1337 View Post

I could hook into http://crosswordtracker.com/ but then there is no use because it would be easier to just provide the hint to the website instead of my program in the first place.

The challenge should be named Watson not Hardest. jealoussmiley.gif
Clearly it's not "Watson" hard if you've managed to find a random website that solves this challenge after just a short time looking. rolleyes.gif

Honestly mate, if you're not smart enough to solve this challenge then that's fine. I did say it's a tough challenge and I did say that I didn't expect anyone to complete it. But please don't belittle the point of this thread nor the effort I've put in to create a range of challenges to meet the range of skills on these forums. After all, the whole point of the 2nd challenge was to get seasoned developers thinking rather than just churning out the same code to solve the same run of the mill exercises. Just like the hardest challenge I gave in my last programming challenge (and on that one, someone did solve it by writing a C++ application to wrote assembly directly to RAM). So instead of moaning that a problem is impossible because you can't manage it, how about you join in the spirit of this thread and maybe someone else might surprise you by finding a solution. smile.gif
post #9 of 42
No need for such backlash, jesus.
I have plenty spirit. I was the first person to submit a Medium finish.
I was just trying to gauge you on how Hardest should be done.

I have found other definition libraries but they posed the same problems.
I wanted to see if you had a better one, that is all.

I wasn't complaining that your library sucked because it didn't.
It just posed a problem for me.

I now see that you don't expect this feature to work all the time.
Thanks for calling me dumb in the process.

The Watson thing was just a joke because it did similar things like using keywords to look for pattern matches.
Quote:
I think you'd have to be pretty dumb to expect 100% accuracy, but it's definitely possible.
And if you're not happy with the word list I provided then bloody well find your own instead of moaning that my spoonfeeding isn't good enough
Honestly mate, if you're not smart enough to solve this challenge then that's fine.

Honestly I get a feeling this challenge isn't worth the amount of time it would require, if that is the feedback I get.
Edited by kennyparker1337 - 5/29/13 at 3:28pm
post #10 of 42
Thread Starter 
Quote:
Originally Posted by kennyparker1337 View Post

No need for such backlash, jesus.
I have plenty spirit. I was the first person to submit a Medium finish.
I was just trying to gauge you on how Hardest should be done.
The point of the exercise was to work that out for yourself smile.gif
If I told you how it should be done then anyone can code it. If I don't tell you how it should be done then people have to work out how to code their own routine for finding the most likely answers
Quote:
Originally Posted by kennyparker1337 View Post

I have found other definition libraries but they posed the same problems.
I wanted to see if you had a better one, that is all.

I wasn't complaining that your library sucked because it didn't.
It just posed a problem for me.
To be fair, working out how to compare the data was meant to be part of the challenge. I say that because most of the sources you'll find are going to have the same problems, but there is an way around that (and I have a working solution I knocked up a couple of hours ago just to prove my theory). But if I told everyone how to find the most likely answers then the solution wouldn't be hard any longer - it would just be a straight coding exercise smile.gif
Quote:
Originally Posted by kennyparker1337 View Post

I now see that you don't expect this feature to work all the time.
Fair enough. I made the assumption that people would take that as a given because most AI (and even Watson - as you referenced before) is can be pretty unreliable at times. I'll update the OP to make that point clear.

edit: just gone back to the OP and I had already stated in there that I wasn't expecting a 100% success rate: "however your answers have to match the definition (or as closely as you can) of the word."
Quote:
Originally Posted by kennyparker1337 View Post

Honestly I get a feeling this challenge isn't worth the amount of time it would require, if that is the feedback I get.
I gave you nice feedback for your first challenge. smile.gif

The reason I've my last post was so blunt was because (rightly or wrongly) your comments were starting to sound quite ungrateful. I appreciate now that this was just crossed wires, but try to look at it from my perspective: I've spent a couple of hours setting up a new challenge that can test a wide range of different skill levels (which in itself, is harder than you'd think). Dug out all the references that you guys would need to get started and tried to make the whole challenge as appealing as I can. I then add a tough 2nd challenge which is intentionally left open ended because it's designed to test problem solving as much as programming aptitude. So I mistook your list of problems to be negative comments rather than questions asking for clarification.

Going back on topic though, I'll happily give more hints at how to solve the 2nd challenge, but I want to see what other people come up with first. But it's definitely possible as I have a working solution. In fact it's easier than you think smile.gif
Edited by Plan9 - 5/29/13 at 6:20pm
New Posts  All Forums:Forum Nav:
  Return Home
  Back to Forum: Coding and Programming
Overclock.net › Forums › Software, Programming and Coding › Coding and Programming › Programming Challenge #8 (challenges for both beginners and experts)