Overclock.net › Forums › Software, Programming and Coding › Coding and Programming › Programming Challenge #8 (challenges for both beginners and experts)
New Posts  All Forums:Forum Nav:

Programming Challenge #8 (challenges for both beginners and experts) - Page 2

post #11 of 42
I'm sorry if my posts came off as negative.
I was only meaning for clarification.

My mind was clouded after the 2nd post of yours.
You did give me nice feedback the first time. thumb.gif

I'll see if I can get a partially working finish going.

Oh btw, the wording "or as closely as you can" makes it seem like the definition might not match but the functionality would still have to work (ie. get around it).

Like you said, I could implement it but I couldn't guarantee 100% results.
post #12 of 42
Thread Starter 
Quote:
Originally Posted by kennyparker1337 View Post

I'm sorry if my posts came off as negative.
I was only meaning for clarification.

My mind was clouded after the 2nd post of yours.
You did give me nice feedback the first time. thumb.gif

I'll see if I can get a partially working finish going.

Oh btw, the wording "or as closely as you can" makes it seem like the definition might not match but the functionality would still have to work (ie. get around it).

Like you said, I could implement it but I couldn't guarantee 100% results.

I've taken your comments onboard and completely rewritten the question in the hope that it clears things up smile.gif

And sorry about being rude before redface.gif
post #13 of 42
Medium, C#: Some Code, quite long! (Click to show)
Code:
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

namespace CrossSolve
{
    class Program
    {
        static void Main(string[] args)
        {
            string dicFile = @"english.0";
            if (args.Any() && File.Exists(args[0]))
            {
                dicFile = args[0];
            }
            else
            {
                if (!File.Exists(@"english.0"))
                {
                    return;
                }
            }
            List<string> words = File.ReadLines(dicFile).Where(n => n.All(char.IsLetter)).Select(n => n.ToLowerInvariant()).ToList();
            InitShow();
            for (;DoTakeInputAndSolve(words); )
            {
            }
        }
        private static List<string> TryAndSolve(IEnumerable<string> words, string input)
        {
            IEnumerable<string> wordsInternal = words.Select(n => n).Where(n => n.Length == input.Length);
            for (int i = 0; i < input.Length; i++)
            {
                if (input[i] == '?') continue;
                int whyDoesItWorkLikeThis = i;
                wordsInternal = wordsInternal.Where(n => n[whyDoesItWorkLikeThis] == input[whyDoesItWorkLikeThis]);
            }
            return wordsInternal.OrderBy(n => n).ToList();
        }
        private static bool DoTakeInputAndSolve(IEnumerable<string> words)
        {
            Console.WriteLine("Enter your word:");
            string input = Console.ReadLine().ToLowerInvariant();
            if (string.IsNullOrWhiteSpace(input)) return false;
            List<string> solution = TryAndSolve(words, input.Replace('*', '?'));
            if (!solution.Any())
            {
                Console.WriteLine("No matches found.");
            }
            else
            {
                solution.ForEach(n => Console.WriteLine(n));
            }
            return true;
        }
        private static void InitShow()
        {
            Console.WriteLine("Welcome to CrosswordSolver.");
            Console.WriteLine("Type your words in with a question mark or an asterisk");
            Console.WriteLine("where the missing letters are. e.g.: p?ogr?m??r.");
            Console.WriteLine("CrosswordSolver will try to find the best matches for it.");
            Console.WriteLine("Exit by entering a blank input.");
        }
    }
}

Might take a crack at Hardest when I've got more time.
Great challenge though. smile.gif

Also, it requires english.0 to be in the same dir as the exe, unless you pass a path to it (or any other dictionary, tbh) as the first command line arg.
Edited by 3930K - 5/30/13 at 4:22am
post #14 of 42
Quote:
Originally Posted by Plan9 View Post

@randomizer: I've not tested your code, but it looks like your program doesn't filter out words with non-alpha characters too.

Sorry, I misread the requirements to mean that the wildcard character shouldn't match a non alpha-numeric character (which mine would have anyway; I believe underscores are considered a "word" character). I knew I should have just waited until the next day before rushing out some code. I lose brain function after about 10pm tongue.gif

Here's a modified version, which should only match words with purely alpha characters:
Warning: Spoiler! (Click to show)
Code:
static void Main(string[] args)
{
    string[] words = File.ReadAllLines("english.0");

    if (!args.Any())
    {
        Console.Write("Enter a pattern or stop wasting my time.");

        return;
    }

    Regex regex = new Regex("^" + args[0].Replace("?", "[a-z]") + "$", RegexOptions.IgnoreCase);
    Regex thoseNastySymbols = new Regex("[^a-z]", RegexOptions.IgnoreCase);
    IEnumerable<string> matches = words.Where(word => regex.Match(word).Success && !thoseNastySymbols.Match(word).Success);
    Console.Write(matches.Any() ? matches.Aggregate((in1, in2) => string.Format("{0}{1}{2}", in1, Environment.NewLine, in2)) : "No matches");
}

I should try one of the harder ones. But hey, someone has to solve the easy one too!
Edited by randomizer - 5/30/13 at 2:31am
    
CPUMotherboardGraphicsRAM
i7 920 D0 MSI X58 Pro-E GTX 560 Ti 448 3x2GB G.Skill DDR3-1333 9-9-9-24 
Hard DriveHard DriveOptical DriveOS
840 Pro Caviar Black LG BD-ROM Windows 8.1 Pro x64 
MonitorMonitorKeyboardPower
Dell U2713HM Dell U2311H Turbo-Trak (Google it :D) Corsair HX-520 
CaseMouseMouse PadAudio
CM690 Mionix Avior 7000 Everglide Titan AKG K 242 HD 
  hide details  
Reply
    
CPUMotherboardGraphicsRAM
i7 920 D0 MSI X58 Pro-E GTX 560 Ti 448 3x2GB G.Skill DDR3-1333 9-9-9-24 
Hard DriveHard DriveOptical DriveOS
840 Pro Caviar Black LG BD-ROM Windows 8.1 Pro x64 
MonitorMonitorKeyboardPower
Dell U2713HM Dell U2311H Turbo-Trak (Google it :D) Corsair HX-520 
CaseMouseMouse PadAudio
CM690 Mionix Avior 7000 Everglide Titan AKG K 242 HD 
  hide details  
Reply
post #15 of 42
Thread Starter 
Nice work guys.

I wasn't going to post my contribution to the hardest until a few others have had a go, but I'm thinking I might rewrite it to work offline (plus the source code is pretty much line noise anyway - it's certainly not my most readable of projects redface.gif), so here's the code: source code (Click to show)
Code:
#!/usr/bin/perl

use constant {REGEX_HTML_PARSING => 1};
# set to 0 if you have HTML::Strip CPAN module installed
# set to 1 if you don't (regex isn't great for HTML parsing)

my $excludes = 'noun|verb|and|or|it|is|to|for|etc';

my (@wordlist, %suggestion, $rx, $clue, $html_strip);

use if !REGEX_HTML_PARSING, HTML::Strip;
$html_strip = HTML::Strip->new() if !REGEX_HTML_PARSING;

&main();

sub main() {
    &importWordList();
    &ui();
    &getSuggestions();
    printf "Found %d matches. Narrowing down....\n", scalar(keys %suggestion);

    while ((my $word, my $score) = each(%suggestion)){
        printf "Looking up the definition for %s....\n", $word;
        my @url = (
                    'http://www.collinsdictionary.com/dictionary/english/',
                    'http://www.collinsdictionary.com/dictionary/english-thesaurus/'
                  );
        foreach (@url) {
            $suggestion{$word} = &compareClue($word, $suggestion{$word}, &getDefinition($word, $_));
        }
    }
    
    &displayResults();
}

sub displayResults() {
    print "\nSuggestions:\n------------\nWord:\tScore:\n";
    while ((my $word, my $score) = each(%suggestion)){
        printf "%s\t%d\n", $word, $suggestion{$word};
    }
}

sub ui() {
    print "Please input the search pattern: ";
    ($rx = lc <>)   =~ s/\?/[a-z]/g;
    print "Please input the crossword clue: ";
    ($clue = lc <>) =~ s/[.,]/ /gm;
}

sub compareClue() {
    my ($word, $score, @def) = @_;
    return 0 if !@def;
    my @lookups = &dedup(split /\s/, $clue);
    foreach my $term(@lookups) {
        foreach (@def) { $score++ if $_ eq $term }
    }
    return $score;
}

sub getDefinition() {
    my ($word, $url) = @_;
    (my $html = `curl --silent $url$word`) =~ s/(\n|\r)/ /mg;
    if ($html =~ m/<div class="definition_main">(.*?)<\/div><\/div>/m) {
        my $def = lc(" $1 ");
        !REGEX_HTML_PARSING ? $def = $html_strip->parse($def) : $def =~ s/<(?:[^>'"]*|(['"]).*?\1)*>//gs;
        $def =~ s/[.,:;]/ /gm;
        $def =~ s/\s([0-9]|$excludes)\s/ /gm;
        # i can't decide which of the following two lines is better:
        #return &dedup(split /\s/, $def);
        return split /\s/, $def;
    }
}

sub getSuggestions() {
    foreach my $word(@wordlist) { $suggestion{$word} = 0 if "$word\n" =~ m/^$rx$/ }
}

sub importWordList() {
    open WORDLIST, "<english.0" or die $!;
    while (<WORDLIST>) {
        $_ =~ s/\s+$//g;
        push @wordlist, lc($_) if $_ !~ m/[^a-z]/i;
    }
    close WORDLIST;
    @wordlist = &dedup(@wordlist);
}

sub dedup {
    return keys %{{ map { $_ => 1 } @_ }};
}
output (Click to show)
Code:
Please input the search pattern: ??t??m
Please input the crossword clue: respect highly
Found 5 matches. Narrowing down....
Looking up the definition for esteem....
Looking up the definition for autism....
Looking up the definition for anthem....
Looking up the definition for fathom....
Looking up the definition for bottom....

Suggestions:
------------
Word:   Score:
esteem  7
autism  0
anthem  0
fathom  0
bottom  0


It works ~9/10 times. But as each match requires internet lookups, it's not very practical if you have large number of potential matches (which is why I'm going to rewrite it to work offline)

The other cavet is that it will only run on POSIX systems (Linux, OS X, etc) because I lazily used curl to grab the definitions rather than a Perl library.

As with the others posted, english.0 needs to be in the working directory
Edited by Plan9 - 5/30/13 at 5:29am
post #16 of 42

I find it humorous that creating your own regex function isn't labeled as the hardest challenge as it requires building a fairly complex state machine and an understanding as to how to break the regex into tokens and then convert them into a custom state machine. I do suppose that language recognition is much harder as it is an application of heuristics in an attempt to compensate for the determinism of computer algorithms.

post #17 of 42
Thread Starter 
Quote:
Originally Posted by hajile View Post

I find it humorous that creating your own regex function isn't labeled as the hardest challenge as it requires building a fairly complex state machine and an understanding as to how to break the regex into tokens and then convert them into a custom state machine.

Have you even tried the medium challenge? Because your waaaay off with that remark. The medium challenge is actually rather easy once you break the requirements down. You certainly don't need to build a regex function to complete the medium challenge (in fact you don't even need regex to complete the easy challenges as the pattern matching is that basic). If you don't mind, I'll explain why in a spoiler as others may want to work out the solution for themselves. Warning: Spoiler! (Click to show)
The pattern matching required is even less sophisticated than globbing. So the solution is actually very easy: The only wildcard is for a single character match so the pattern matching routine would just skip the matching for that character position and move onto the next. And since we know that we only want letters, all we need to do is drop the words to lower case and then check that each character code is between 97 and 122 (inclusive). Sure you wouldn't pick up any accents nor unicode characters, but that's forgivable given the nature of the challenge. However you're welcome to build a more sophisticated routine that supports unicode if you want to smile.gif

Edited by Plan9 - 5/30/13 at 9:03am
post #18 of 42
Thread Starter 
medium challenge, written in Go lang (Click to show)
Code:
// 8-medium.go
package main

import (
    "fmt"
    "io/ioutil"
    "os"
    "strings"
)

var (
    pattern  string   // the pattern well match against
    wordlist []string // our word database
)

func loadWordList() {
    // loads the text file from disk and splits it into a string array
    // (well, technically a "slice" since this is Go code. but the concept
    // is the same.)
    b, err := ioutil.ReadFile("english.0")
    if err != nil {
        fmt.Println(err.Error())
        os.Exit(1)
    }
    wordlist = strings.Split(string(b), "\r\n")
}

func main() {
    loadWordList()

    fmt.Print("Please input the search pattern: ")

    // user input
    fmt.Scanln(&pattern)
    pattern = strings.ToLower(pattern)

    // cycle through the word list
    cycleThroughWords()
}

func cycleThroughWords() {
    // loop through each word in word list
    for _, word := range wordlist {

        // check the word is the same length otherwise
        // give up and move onto the next word
        if len(word) == len(pattern) {

            // now lets call our bespoke string matching function.
            // if it does match, output the word
            if matchWord(strings.ToLower(word)) {
                fmt.Println(word)
            }
        }
    }
}

func matchWord(word string) (matched bool) {
    // loop through each character in word
    for i := 0; i < len(pattern); i++ {

        // first lets check the character is a letter and not
        // a number nor punctuation.
        b := []byte(word[i:i+1])
        if b[0] < 97 || b[0] > 122 {
            return false
        }

        // now lets check if the pattern is a wildcard
        if pattern[i:i+1] != "?" {

            // and lastly, if its not a wildcard, lets check
            // that they're the same letters
            if pattern[i:i+1] != word[i:i+1] {
                return false
            }

        }
    }

    // if we get this far, then the word has matched
    return true
}

As you can see - it's all pretty basic stuff smile.gif in fact half of that code is comments laugher.gif
output (Click to show)
Code:
Please input the search pattern: ??t??m
anthem
autism
bottom
esteem
fathom

Edited by Plan9 - 5/30/13 at 8:54am
post #19 of 42
Sounds like a lot of fun to try and do! But I am confused as to how you can get the program to find real words, rather than just random letters to fill in the gaps. I use Liberty Basic, and I am also learning C++. Sorry if this sounds like a stupid question, I am 12 and haven't done programming for quite a while.
post #20 of 42
Thread Starter 
Quote:
Originally Posted by JacP3 View Post

Sounds like a lot of fun to try and do! But I am confused as to how you can get the program to find real words, rather than just random letters to fill in the gaps. I use Liberty Basic, and I am also learning C++. Sorry if this sounds like a stupid question, I am 12 and haven't done programming for quite a while.

There's a zip file linked (under the heading resources) on the opening post of this thread. That zip file has a few dictionaries inside it - dictionaries which are literally just text files with a few thousand English words (one per line). We've been using the file named english.0 (which is why that gets referenced a lot in this thread).

For the easy challenges you have two options. In the easiest one, you can hardcode a number of words (eg as constants) into your program. In the not-so-easy-but-still-pretty-simple challenge you have to load the dictionary file from disk. I'd suggest you load them into an array (as the dictionary is one word per line, it should be pretty simple to convert that to one word per array element).

You can then use that as the basis of your matches instead of random characters smile.gif
New Posts  All Forums:Forum Nav:
  Return Home
  Back to Forum: Coding and Programming
Overclock.net › Forums › Software, Programming and Coding › Coding and Programming › Programming Challenge #8 (challenges for both beginners and experts)