Programming Challenge #8 (challenges for both beginners and experts) - Overclock.net - An Overclocking Community

Forum Jump: 

Programming Challenge #8 (challenges for both beginners and experts)

Reply
 
Thread Tools
post #1 of 42 (permalink) Old 05-29-2013, 02:52 AM - Thread Starter
New to Overclock.net
 
Plan9's Avatar
 
Join Date: Nov 2011
Location: Planet Vegeta
Posts: 8,040
Rep: 574 (Unique: 277)
Create a crossword solver.

(the next challenge is up)

Problem:
Most of you should be familiar with what crossword solvers are, but in case you're not, here's their brief overview: you have a crossword and you know some of the letters for a missing word but you also have a number of blanks. Crossword solvers provide every solution that matches against a pattern (see below) to allow the user to find the answer for the missing word.

Specification:
  • Your application must work against a list of words (see resources),
  • Your results should exclude words with numeric characters and punctuation (eg no hyphens nor apostrophes)
  • and provide a list of solutions that match the a user enterable pattern like such: c??e (which would return code, core, cope and many other for character words that start with c and end with e).
  • As said in the previous point, the search pattern must be user enterable (ie you cannot hardcode in c??e as the string to search for)
  • And finally, other specifications will be defined depending on the challenge and difficulty you wish to attempt (see solutions)

Ideally, I suggest we all use the same word list (see resources) and load that from disk at run time - however I'll allow for some flexibility here to enable greater variability in the difficulty levels.

I've also added a second challenge under hardest in case anyone fancies taking this exercise further (though I don't expect anyone to complete it)

Resources:
Word list:
I suggest we use the english.0 file from ispells dictionary:
http://downloads.sourceforge.net/wordlist/ispell-enwl-3.1.20.zip

Definitions / thesaurus (only required for hardest challenge): lengthy section (Click to show)
Most of the resources I've found here are just frontends to web APIs. And if you wish to use an online service for the hardest challenge then you're welcome to do so (I used colinsdictionary.com for my answer). However I have also found some offline files you can also use should you prefer:
http://wordnetcode.princeton.edu/wn3.1.dict.tar.gz
(That archive was taken from WordNet: http://wordnet.princeton.edu/wordnet/download/current-version/ )

The aforementioned archive contains a bunch of text files that link with each other as a bespoke database, so you'll have to do a little a bit of parsing to make any use of them. But as the data is stored in ASCII, it shouldn't be add any complexity to the program (it's more an annoying additional step).

From a quick look through the archive, I'd suggest concentrating your focus on the files prefixed with "data" (eg data.noun). However you may want to look through the entire archive just in case.

Solutions:
Easiest (challenge 1):
Any language using any includes / imports / modules available. Word list can be a subset already hardcoded into an array

Easy (challenge 1):
Any language using any includes / imports / modules available. Word list has to be loaded from disk (see resources)


Medium (challenge 1):
You're not allow to use your languages pattern matching (eg regex) functions / includes to perform the comparison (I appreciate this will create sub-optimal performance - but this is an academic exercise)


Hardest (challenge 2):
Any language using any includes / imports / modules available, however your results also have to correctly answer the clue. eg if the question is "a string of characters to convey a secret meaning" and the pattern is c??e then the answers core and cope would have a lower rank and/or be filtered out but code would match.

You don't need a 100% success rate 100% of the time and you can still return a multiple set of words - just so long as the suggestions are ranked to show their likelihood (ie your program must perform some logic to attempt to "solve" the clue as well as or instead of just bringing back a list or words based on a pattern match)

To complete this challenge, you'll need additional user input (the crossword question) and you'll need additional data sources to rank your results (it doesn't matter if you use these data sources in addition to the normal word list or in place of. And it doesn't matter what form these sources come in (eg dictionary definitions, thesaurus, etc) just as long as your program is making the comparisons and ranking the results.

This challenge is about proof of concept - about solving a real world problem using programming languages rather than a traditional punching code out tests that other programming challenges are. So while your code still needs to compile, I don't expect your results to be 100% accurate 100% of the time (though there will be bonus kudos points awarded to whoever's routine returns the most accurate results).


Happy coding smile.gif
Plan9 is offline  
Sponsored Links
Advertisement
 
post #2 of 42 (permalink) Old 05-29-2013, 06:07 AM
Fantastic Mr Fox
 
randomizer's Avatar
 
Join Date: Apr 2009
Posts: 5,817
Rep: 175 (Unique: 120)
A quick hack on the easy one.
Warning: Spoiler! (Click to show)
Code:
static void Main(string[] args)
{
    string[] words = File.ReadAllLines("english.0");

    if (!args.Any())
    {
        Console.Write("Enter a pattern or stop wasting my time.");
        return;
    }

    Regex regex = new Regex("^" + args[0].Replace("?", "[\\w]") + "$", RegexOptions.IgnoreCase);
    IEnumerable<string> matches = words.Where(w => regex.Match(w).Success);
    Console.Write(matches.Any() ? matches.Aggregate((in1, in2) => string.Format("{0}{1}{2}", in1, Environment.NewLine, in2)) : "No matches");
}

Ok, time for sleep biggrin.gif

CPU
i7 920 D0
Motherboard
MSI X58 Pro-E
GPU
Gigabyte GTX 970 (GV-N970IX-4GD)
RAM
3x2GB G.Skill DDR3-1333 9-9-9-24
Hard Drive
840 Pro
Hard Drive
Caviar Black
Optical Drive
LG BD-ROM
Power Supply
Corsair HX-520
Case
CM690
Operating System
Windows 8.1 Pro x64
Monitor
Dell U2713HM
Monitor
Dell U2311H
Keyboard
Turbo-Trak (Google it :D)
Mouse
Mionix Avior 7000
Mouse
Everglide Titan
Audio
AKG K 242 HD
▲ hide details ▲
randomizer is offline  
post #3 of 42 (permalink) Old 05-29-2013, 08:59 AM
sɪɴɢɪɴɢ ɢᴏᴅᴅᴇss
 
kennyparker1337's Avatar
 
Join Date: Jul 2010
Location: 35.190.36.24
Posts: 3,356
Rep: 352 (Unique: 304)
edit1: Got rid of Regex extension. Wasn't needed by loading from file on disc.
edit2: Got rid of non-alpha chars by simply checking if it was or not. Missed the specification in OP.


Quote:
C#

Crossword Solver (Medium).zip 128k .zip file

Code is included in ZIP (open with any text editor or IDE).
*Requires .NET 4.0 (Windows XP SP3 or higher).

Long Code (Click to show)
Code:
using System;
using System.Collections.Generic;
using System.Windows.Forms;
using System.IO;

namespace Crossword_Solver
{
    public partial class frmMain : Form
    {
        string fileName = "dic.txt";
        string[] words;

        public frmMain()
        {
            InitializeComponent();
        }
        
        private void frmMain_Load(object sender, EventArgs e)
        {
            tbInput.Text = Properties.Settings.Default.LastSearch;

            try
            {
                words = File.ReadAllLines(fileName);
            }
            catch(Exception ex)
            {
                MessageBox.Show(ex.Message, "Error", MessageBoxButtons.OK, MessageBoxIcon.Error);
                Environment.Exit(-1);
            }
        }

        private void frmMain_FormClosing(object sender, FormClosingEventArgs e)
        {
            Properties.Settings.Default.LastSearch = tbInput.Text;
            Properties.Settings.Default.Save();
        }

        private void btnHelp_Click(object sender, EventArgs e)
        {
            MessageBox.Show("Crossword Solver" + Environment.NewLine +
                            "© Kenny Parker, 2013. All rights reserved." + Environment.NewLine + Environment.NewLine +
                            "Enter in any word with missing letters to show all posibilities." + Environment.NewLine +
                            "Replace empty letters with \"?\"." + Environment.NewLine + 
                            "Example: pr g am = pr?g?am ",
                            "Help", MessageBoxButtons.OK, MessageBoxIcon.Information);
        }

        private void btnSearch_Click(object sender, EventArgs e)
        {
            btnSearch.Enabled = false;
            lbOutput.Clear();

            SearchWord(tbInput.Text)
                .ForEach(word => lbOutput.Items.Add(word));

            btnSearch.Enabled = true;
        }

        private List<string> SearchWord(string input)
        {
            var wordMatches = new List<string>();

            input = tbInput.Text.ToLower();

            for (int w = 0; w < words.Length; w++)
            {
                if (words[w].Length == input.Length)
                {
                    words[w] = words[w].ToLower();

                    int letterMatches = 0;

                    for (int l = 0; l < words[w].Length; l++)
                    {
                        if ((input[l] == '?' || words[w][l] == input[l]) && char.IsLetter(words[w][l])) letterMatches++;
                    }

                    if (letterMatches == input.Length) wordMatches.Add(words[w]);
                }
            }

            return wordMatches;
        }
    }
}


I think Hardest is too difficult to do alone.

The crossword question isn't always the definition.
Even when the definition is used, it does not always match up to other source definitions.

It would be nice if you have us a link to a definition file because I can't find one.

Complete Overclocking Guide: Sandy Bridge & Ivy Bridge | *ASRock Edition*
"If you wish to make an apple pie from scratch, you must first invent the universe." ― Carl Sagan
"That which can be asserted without evidence, can be dismissed without evidence." ― Christopher Hitchens
"The good thing about science is that it's true whether or not you believe in it." ― Neil deGrasse Tyson


kennyparker1337 is offline  
Sponsored Links
Advertisement
 
post #4 of 42 (permalink) Old 05-29-2013, 09:25 AM - Thread Starter
New to Overclock.net
 
Plan9's Avatar
 
Join Date: Nov 2011
Location: Planet Vegeta
Posts: 8,040
Rep: 574 (Unique: 277)
Quote:
Originally Posted by kennyparker1337 View Post

C#

Crossword Solver (Medium).zip 128k .zip file

Code is included in ZIP (open with any text editor).

*Requires Windows XP SP3 or higher.

Nice work, but you're including non-alpha characters:

(also, do you mind posting your source code but inside the [SPOILER=Warning: Spoiler!][/SPOILER] tags please - so people can browse the source without having to download the zip file)

@randomizer: I've not tested your code, but it looks like your program doesn't filter out words with non-alpha characters too.

[edit]
I've tidied up the formatting of the specifications to in my opening post to make it easier to read.
Plan9 is offline  
post #5 of 42 (permalink) Old 05-29-2013, 09:54 AM
sɪɴɢɪɴɢ ɢᴏᴅᴅᴇss
 
kennyparker1337's Avatar
 
Join Date: Jul 2010
Location: 35.190.36.24
Posts: 3,356
Rep: 352 (Unique: 304)
^

Already fixed! I'm quick. ph34r2.gif

Complete Overclocking Guide: Sandy Bridge & Ivy Bridge | *ASRock Edition*
"If you wish to make an apple pie from scratch, you must first invent the universe." ― Carl Sagan
"That which can be asserted without evidence, can be dismissed without evidence." ― Christopher Hitchens
"The good thing about science is that it's true whether or not you believe in it." ― Neil deGrasse Tyson


kennyparker1337 is offline  
post #6 of 42 (permalink) Old 05-29-2013, 10:46 AM - Thread Starter
New to Overclock.net
 
Plan9's Avatar
 
Join Date: Nov 2011
Location: Planet Vegeta
Posts: 8,040
Rep: 574 (Unique: 277)
Quote:
Originally Posted by kennyparker1337 View Post

The crossword question isn't always the definition.
Even when the definition is used, it does not always match up to other source definitions.
Indeed. That's was the clever bit of this challenge. It's aimed at those who are skilled enough programmers that the other challenges don't offer any problems they haven't already solved in other projects - ie the hardest challenge isn't as much about new developers learning new programming skills, but instead about seasoned developers thinking outside the box to solve a counter-intuitive puzzle (from a programmers perspective).
Quote:
Originally Posted by kennyparker1337 View Post

It would be nice if you have us a link to a definition file because I can't find one.
Added, See OP. smile.gif
Plan9 is offline  
post #7 of 42 (permalink) Old 05-29-2013, 11:35 AM
sɪɴɢɪɴɢ ɢᴏᴅᴅᴇss
 
kennyparker1337's Avatar
 
Join Date: Jul 2010
Location: 35.190.36.24
Posts: 3,356
Rep: 352 (Unique: 304)
There are a lot of problems with Hardest.

Take this example:
???
Lay down the lawn.

"sod" will come up on the search but if you look at it's definition it is "surface layer of ground containing a mat of grass and grass roots".
Not a single word match between definition and hint.

I could hook into http://crosswordtracker.com/ but then there is no use because it would be easier to just provide the hint to the website instead of my program in the first place.

The challenge should be named Watson not Hardest. jealoussmiley.gif

Complete Overclocking Guide: Sandy Bridge & Ivy Bridge | *ASRock Edition*
"If you wish to make an apple pie from scratch, you must first invent the universe." ― Carl Sagan
"That which can be asserted without evidence, can be dismissed without evidence." ― Christopher Hitchens
"The good thing about science is that it's true whether or not you believe in it." ― Neil deGrasse Tyson


kennyparker1337 is offline  
post #8 of 42 (permalink) Old 05-29-2013, 02:01 PM - Thread Starter
New to Overclock.net
 
Plan9's Avatar
 
Join Date: Nov 2011
Location: Planet Vegeta
Posts: 8,040
Rep: 574 (Unique: 277)
Quote:
Originally Posted by kennyparker1337 View Post

There are a lot of problems with Hardest.
There's supposed to be - otherwise it wouldn't be hard. rolleyes.gif
Quote:
Originally Posted by kennyparker1337 View Post

Take this example:
???
Lay down the lawn.

"sod" will come up on the search but if you look at it's definition it is "surface layer of ground containing a mat of grass and grass roots".
Not a single word match between definition and hint.
I think you'd have to be pretty dumb to expect 100% accuracy, but it's definitely possible. And if you're not happy with the word list I provided then bloody well find your own instead of moaning that my spoonfeeding isn't good enough rolleyes.gif
Quote:
Originally Posted by kennyparker1337 View Post

I could hook into http://crosswordtracker.com/ but then there is no use because it would be easier to just provide the hint to the website instead of my program in the first place.

The challenge should be named Watson not Hardest. jealoussmiley.gif
Clearly it's not "Watson" hard if you've managed to find a random website that solves this challenge after just a short time looking. rolleyes.gif

Honestly mate, if you're not smart enough to solve this challenge then that's fine. I did say it's a tough challenge and I did say that I didn't expect anyone to complete it. But please don't belittle the point of this thread nor the effort I've put in to create a range of challenges to meet the range of skills on these forums. After all, the whole point of the 2nd challenge was to get seasoned developers thinking rather than just churning out the same code to solve the same run of the mill exercises. Just like the hardest challenge I gave in my last programming challenge (and on that one, someone did solve it by writing a C++ application to wrote assembly directly to RAM). So instead of moaning that a problem is impossible because you can't manage it, how about you join in the spirit of this thread and maybe someone else might surprise you by finding a solution. smile.gif
Plan9 is offline  
post #9 of 42 (permalink) Old 05-29-2013, 03:25 PM
sɪɴɢɪɴɢ ɢᴏᴅᴅᴇss
 
kennyparker1337's Avatar
 
Join Date: Jul 2010
Location: 35.190.36.24
Posts: 3,356
Rep: 352 (Unique: 304)
No need for such backlash, jesus.
I have plenty spirit. I was the first person to submit a Medium finish.
I was just trying to gauge you on how Hardest should be done.

I have found other definition libraries but they posed the same problems.
I wanted to see if you had a better one, that is all.

I wasn't complaining that your library sucked because it didn't.
It just posed a problem for me.

I now see that you don't expect this feature to work all the time.
Thanks for calling me dumb in the process.

The Watson thing was just a joke because it did similar things like using keywords to look for pattern matches.
Quote:
I think you'd have to be pretty dumb to expect 100% accuracy, but it's definitely possible.
And if you're not happy with the word list I provided then bloody well find your own instead of moaning that my spoonfeeding isn't good enough
Honestly mate, if you're not smart enough to solve this challenge then that's fine.

Honestly I get a feeling this challenge isn't worth the amount of time it would require, if that is the feedback I get.

Complete Overclocking Guide: Sandy Bridge & Ivy Bridge | *ASRock Edition*
"If you wish to make an apple pie from scratch, you must first invent the universe." ― Carl Sagan
"That which can be asserted without evidence, can be dismissed without evidence." ― Christopher Hitchens
"The good thing about science is that it's true whether or not you believe in it." ― Neil deGrasse Tyson


kennyparker1337 is offline  
post #10 of 42 (permalink) Old 05-29-2013, 06:08 PM - Thread Starter
New to Overclock.net
 
Plan9's Avatar
 
Join Date: Nov 2011
Location: Planet Vegeta
Posts: 8,040
Rep: 574 (Unique: 277)
Quote:
Originally Posted by kennyparker1337 View Post

No need for such backlash, jesus.
I have plenty spirit. I was the first person to submit a Medium finish.
I was just trying to gauge you on how Hardest should be done.
The point of the exercise was to work that out for yourself smile.gif
If I told you how it should be done then anyone can code it. If I don't tell you how it should be done then people have to work out how to code their own routine for finding the most likely answers
Quote:
Originally Posted by kennyparker1337 View Post

I have found other definition libraries but they posed the same problems.
I wanted to see if you had a better one, that is all.

I wasn't complaining that your library sucked because it didn't.
It just posed a problem for me.
To be fair, working out how to compare the data was meant to be part of the challenge. I say that because most of the sources you'll find are going to have the same problems, but there is an way around that (and I have a working solution I knocked up a couple of hours ago just to prove my theory). But if I told everyone how to find the most likely answers then the solution wouldn't be hard any longer - it would just be a straight coding exercise smile.gif
Quote:
Originally Posted by kennyparker1337 View Post

I now see that you don't expect this feature to work all the time.
Fair enough. I made the assumption that people would take that as a given because most AI (and even Watson - as you referenced before) is can be pretty unreliable at times. I'll update the OP to make that point clear.

edit: just gone back to the OP and I had already stated in there that I wasn't expecting a 100% success rate: "however your answers have to match the definition (or as closely as you can) of the word."
Quote:
Originally Posted by kennyparker1337 View Post

Honestly I get a feeling this challenge isn't worth the amount of time it would require, if that is the feedback I get.
I gave you nice feedback for your first challenge. smile.gif

The reason I've my last post was so blunt was because (rightly or wrongly) your comments were starting to sound quite ungrateful. I appreciate now that this was just crossed wires, but try to look at it from my perspective: I've spent a couple of hours setting up a new challenge that can test a wide range of different skill levels (which in itself, is harder than you'd think). Dug out all the references that you guys would need to get started and tried to make the whole challenge as appealing as I can. I then add a tough 2nd challenge which is intentionally left open ended because it's designed to test problem solving as much as programming aptitude. So I mistook your list of problems to be negative comments rather than questions asking for clarification.

Going back on topic though, I'll happily give more hints at how to solve the 2nd challenge, but I want to see what other people come up with first. But it's definitely possible as I have a working solution. In fact it's easier than you think smile.gif
Plan9 is offline  
Reply

Quick Reply
Message:
Options

Register Now

In order to be able to post messages on the Overclock.net - An Overclocking Community forums, you must first register.
Please enter your desired user name, your email address and other required details in the form below.
User Name:
If you do not want to register, fill this field only and the name will be used as user name for your post.
Password
Please enter a password for your user account. Note that passwords are case-sensitive.
Password:
Confirm Password:
Email Address
Please enter a valid email address for yourself.
Email Address:

Log-in



Currently Active Users Viewing This Thread: 1 (0 members and 1 guests)
 
Thread Tools
Show Printable Version Show Printable Version
Email this Page Email this Page


Forum Jump: 

Posting Rules  
You may post new threads
You may post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are Off
Pingbacks are Off
Refbacks are Off