FAST 'on the fly' fuzzy string matching console tool written in C - Page 3

post #21 of 32
@Plan9
Your post #19 is completely off topic; do you realize that?

Quote:
I'd watch who you're calling an idiot there.
If you think you are an idiot, it’s your right to think whatever you want.

Quote:
I'm a Linux and UNIX sys admin too.
I’m completely sure that you are a SysAdmin who is dedicated to his work, with analytical thinking and a positive attitude towards users. That’s why I’m delighted when such an experienced Linux & Unix SysAdmin, with more than 3000 posts in a year, is talking to me.

Quote:
I've never hit a bottleneck with grep. Not even when searching through entire file systems or hundreds of thousands of lines of text.
Really? I never knew that! You are my Star with your wisdom & knowledge!

Quote:
So maybe you're just doing it wrong
Of course I’m wrong. Thanks for your time! It was a pleasure for me to be your user.
post #22 of 32
Quote:
Originally Posted by duhai

@Plan9
Your post #19 is completely off topic; do you realize that?
Then so was your comment that I was replying to. But then 99% of this thread has been off topic anyway - even without my contributions.
Quote:
Originally Posted by duhai

If you think you are an idiot, it’s your right to think whatever you want.
troll
Quote:
Originally Posted by duhai

I’m completely sure that you are a SysAdmin who is dedicated to his work, with analytical thinking and a positive attitude towards users. That’s why I’m delighted when such an experienced Linux & Unix SysAdmin, with more than 3000 posts in a year, is talking to me.
I don't really see what a post count has to do with anything.

Even so, I hadn't realised I spent nearly that much time on here. I really need to spend less time on forums.
post #23 of 32
@Sanmayce
Thanks for the reply.

Quote:
I want to beautify some of fragments especially the recursion - it should be exterminated;
I agree completely. The recursion must be left to the books.

Quote:
For a reason (of mystical nature, he-he) I am too emotional and I get easily offended when a source etude of mine is shared and the reactions are as if I am a criminal who has no right to touch the subject, only pretensions&disrespect,..
I see that you are very sensitive and emotional, and it would be a pity if you changed your mind about sharing because of a non-creative or toxic person. I hope that is not the case.
It’s awful to see an empty, poor soul that cares about no opinion except its own limited vision, without any sense of shame or any idea for something better. I’m surprised how persistent stupidity can be.

Quote:
I don't have any info yet, PTHREADS (I am too lazy to read how they work) are native for *nix while I use OPEN MP (even though easy to use, a few basic things I do not understand), I heard (sadly I am not in Linux, many basic things there are also beyond my grasp) OPEN MP support is available already in gcc. As soon as the elf is ready I will post here.
Most of the best servers are written in C, as is the Linux kernel, and they are open source. I’m telling you all of this because I’ve spent some time with Kazahana and there are things to be fixed. I fixed two of them, and the compilation with gcc 4.7.2 & OpenMP was successful. I prefer Linux pthreads to Intel’s OpenMP. I’m afraid that I’ll change part of the source code to make Kazahana portable and adjust it to my needs. BTW, it works perfectly.

Quote:
As for 'grep', there are crafty programmers who juggle with multi-threaded I/O they should make it pass in the next century.
You are right: grep is not Kazahana, and for those who don’t know Kazahana it's a mistake to draw a direct comparison between them.

Quote:
And one more thing: don't take me too seriously, because, I know what programming is and for that very reason I am afraid to call myself a programmer, that is, take my tools and enjoy them as a gift.
It was a nice joke, ha-ha-ha, and thanks for the gift!
post #24 of 32
If I've come across as toxic, it's because the OP has spent more time posting poems, manga and train pictures than differentiating this from existing tools.

I'm all for code sharing and free tools. But I also see reinventing existing tools as a bad thing unless they offer significant advantages, as the new tools aren't going to be widely available (which is a complete bind if you're an administrator).

Thus far all I've had is condescending non-answers and extracts from Japanese culture.

So if you want to kick off about the toxicity of people in this thread then perhaps you should be looking at how badly this thread failed to answer my original questions.

And for the record, grep can be run in parallel via xargs. And I have written pthreaded applications myself, so all the comments I've been making are from a peer trying to grasp the point of yet another reinvention of the grep tool (and there are hundreds out there).


[edit]

I should point out that I've been here myself. I'd written a replacement for a number of Windows CLI tools. It did grepping / fuzzy matching (far less sophisticated than this though, but I much prefer precise matching personally), could pipe output between STDOUT / STDERR and parameters, and a boat load of other POSIX-like stuff that I missed on Windows (most of which I've long since forgotten). I released the app and source and very few people were interested in it. But that didn't bother me because it was a personal project. However I then started to administrate other servers and found this utility useless without copying it onto every box I wanted to work with. And it was the same case with desktops too. Then when I switched jobs, I just never bothered to install it because I basically had to learn to use the existing CLI tools better so my tools became redundant.

And it's exactly the same case with all the Linux / UNIX configs and aliases. It was such a bind copying them onto each box that I just stopped bothering.

So if I come across as negative, it's because I've been down this road many many times myself. So if this is just a personal project, then that's great. But if you're trying to push your project out there then the OP needs to be clearer about what this tool actually does that users cannot already do - fuzzy matching and multi-threading alone aren't enough to make the effort worthwhile.

I know it can be harsh criticism to swallow, but like I said, if this is purely a personal project then who cares? But this does feel a bit like a sales/promotional thread. So if you're wanting to write applications that other people are really going to use and thus applications that are going to get your name known, then there are better concepts to focus your energy on.

So I resent being called "toxic" and "idiot" because I happen to offer some concerns. Sometimes you have to take the bad feedback with the good in order to grow as a developer and as a person.
Edited by Plan9 - 2/12/13 at 1:40am
post #25 of 32
@Sanmayce
Hi Sanmayce,

Kazahana is no longer just the fastest train; it is the fastest spaceship, equipped with 16/22/30/60/400 threads. With 60 threads it was almost 57 times faster than the application currently used at my work. It’s amazing! On Saturday I will run the test with 400 threads, and I hope by then to post a video clip of the beautiful spaceship flying across the HPC space.

Sanmayce, thank you!
Banzai !!!
post #26 of 32
Quote:
Originally Posted by duhai

With 60 threads it was almost 57 times faster than the application currently used at my work. It’s amazing! On Saturday I will run the test with 400 threads
400 threads?
What sort of hardware are you running that on?
post #27 of 32
Thread Starter 
@Plan9
Man, we are too different to get along.

>... if this is purely a personal project then who cares?
Who cares who cares!
Persons who need a free search tool will have the chance to try this one, hopefully without your 'censorship'.

>But this does feel a bit like a sales/promotional thread.
Again you knocked me down. Please stop throwing slanders at me. My only fault is that 6-7 years ago, when I was choosing my domain, I foolishly chose .com, which I have regretted ever since. Subsequently I learned that it stands for commercial, but in my defence .org and .net appeared to me too pompous and not good for a personal site. Anyway, I don't sell anything, literally or figuratively.
You have many things to unlearn; that is only my personal opinion, which happens to overlap with the truth in a huge number of cases.
I see where the problem is: you have no faith in people, and your presumption is 'guilty'. To be open is a noble feature. You see ghosts emanating from your lack of confidence. Maybe life has not been good to you, but as long as we are alive we have to learn how to play 'life'.

And for more manga (this word you don't know), being thankful and whatnot, you can read one of my posts at thefreedictionary; I hope you will unlearn something.

@Duhai
Hi my friend, thank you for all your jokes, lively spirit and gratefulness, that's what I cherish most.
Eternal damnation for me if I don't share Kazahana with you, just check your Overclock.net's Personal Messages Box, cheers!

At 2:21 of 'Flower':
"... D-I-I-I-STANT CHILD MY FLOWER ... you amaze me." - one of Kylie's immortal "childs".

Quickly, without second-guessing myself, I am making these verses the motto of Kazahana.
I couldn't miss that one of my oldest callnames shares a common glyph (WIND) with Kazahana. In addition, it turns out that an anime character exists with the same name, translated as 'Windbloom'. Immediately I saw the connection:
Kazahana - wind/aerial flower

While searching the WEB for a snowflower I found ... the blog of ... Snow White:
http://divinetheatre.blogspot.com/2011/12/blessings-of-season.html

Below, an actual snowflake (taken from Snow White's blog) under a microscope! Breathtaking and unique!


I still have not simulated the wildcard matching (which would discard the nasty recursions); however, I added two more wildcards:
- wildcard '.' any ALPHA character(s) or empty
- wildcard '`' any NON-ALPHA character(s) or empty
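
As a rough illustration of how these wildcards behave, here is a minimal Python sketch that translates a pattern into a regular expression. It is not Kazahana's code (the tool is written in C and has its own matcher); Python's `re` module is swapped in purely for brevity, and the names `WILDCARDS`, `compile_pattern` and `search_lines` are made up for this example. It assumes that ALPHA means the ASCII letters a-z, that both pattern and line are lowercased, and that a pattern has to cover the whole line, which is what the Severina.txt session in this post suggests. The table follows Note9 of the usage text.

```python
import re

# Wildcard -> regex fragment, following Note9 of the usage text.
# Assumption: ALPHA means ASCII letters a-z after lowercasing.
WILDCARDS = {
    "*": r".*",       # any character(s) or empty
    ".": r"[a-z]*",   # any ALPHA character(s) or empty
    "`": r"[^a-z]*",  # any NON-ALPHA character(s) or empty
    "@": r".?",       # any character, or empty
    "#": r".",        # any character, not empty
    "^": r"[a-z]?",   # any ALPHA character, or empty
    "$": r"[a-z]",    # any ALPHA character, not empty
    "|": r"[^a-z]?",  # any NON-ALPHA character, or empty
    "~": r"[^a-z]",   # any NON-ALPHA character, not empty
}


def compile_pattern(pattern):
    """Turn a wildcard pattern into a compiled regex (literals are escaped)."""
    parts = [WILDCARDS.get(ch, re.escape(ch)) for ch in pattern.lower()]
    return re.compile("".join(parts))


def search_lines(pattern, lines):
    """Return the lines that the pattern covers completely (case-insensitive)."""
    rx = compile_pattern(pattern)
    return [line for line in lines if rx.fullmatch(line.lower())]


if __name__ == "__main__":
    severina = ["Mitko Schtereff 4 president", "777 trumps 666", "Windbloom"]
    for pat in [". .`.", "`.`", ".", "*"]:
        print(pat, "->", search_lines(pat, severina))
```

Run against the three Severina.txt lines, the pattern made of dot, space, dot, backtick, dot keeps only the first line, and the backtick-dot-backtick pattern keeps the other two, the same split that Kazahana.txt shows in the session.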

How to use them is shown below:
Code:
E:\Kazahana_r1-++fix+>"Kazahana_r1-++fix+_HEXADECAD-Threads_IntelV12.exe"
Kazahana, a superfast exact & wildcards & Levenshtein Distance (Wagner-Fischer) searcher, revision 1-++fix+, copyleft Kaze 2013-Feb-13.
Usage: Kazahana [AtMostLevenshteinDistance] string textualfile
Note1: There are three regimes: exact, wildcards and fuzzy searches. First two kick in when 2 parameters are given, fuzzy when 3.
Note2: What decides whether exact or wildcards? Of course presence of at least one wildcard. To see exact search see Example #4.
Note3: Exact search hits with 'Railgun_Quadruplet_7Gulliver'.
Note4: Incoming string is automatically lowercased for wildcards searches i.e. they are case insensitive.
Note5: Incoming string could be up to 21168/126 chars for exact&wildcards/Levenshtein respectively.
Note6: Incoming textualfile could be bigger than 4GB.
Note7: Each line should end with [CR]LF, that is Windows or/and UNIX style.
Note8: The dump goes to Kazahana.txt file.
Note9: Seven+two wildcards are available:
       wildcard '*' any character(s) or empty,
       wildcard '.' any ALPHA character(s) or empty,
       wildcard '`' any NON-ALPHA character(s) or empty,
       wildcard '@'/'#' any character {or empty}/{and not empty},
       wildcard '^'/'$' any ALPHA character {or empty}/{and not empty},
       wildcard '|'/'~' any NON-ALPHA character {or empty}/{and not empty}.
Example1: E:\>Kazahana 0 ramjet MASAKARI_General-Purpose_Grade_English_Wordlist_r3_316423_words.wrd
Example2: E:\>Kazahana 3 psychedlicize MASAKARI_General-Purpose_Grade_English_Wordlist_r3_316423_words.wrd
Example3: E:\>Kazahana "psyched^^^^^^ize^" MASAKARI_General-Purpose_Grade_English_Wordlist_r3_316423_words.wrd
Example4: E:\>Kazahana "metal fatigue" enwiki-20121201-pages-articles.xml
Example5: E:\>Kazahana "out^^^^^^^^^^^^^ize*" MASAKARI_General-Purpose_Grade_English_Wordlist_r3_316423_words.wrd
          E:\>type Kazahana.txt
          [out^^^^^^^^^^^^^ize*] outhyperbolize /MASAKARI_General-Purpose_Grade_English_Wordlist_r3_316423_words.wrd/
          [out^^^^^^^^^^^^^ize*] outsize /MASAKARI_General-Purpose_Grade_English_Wordlist_r3_316423_words.wrd/
          [out^^^^^^^^^^^^^ize*] outsized /MASAKARI_General-Purpose_Grade_English_Wordlist_r3_316423_words.wrd/
          [out^^^^^^^^^^^^^ize*] outstrategize /MASAKARI_General-Purpose_Grade_English_Wordlist_r3_316423_words.wrd/
          [out^^^^^^^^^^^^^ize*] outtyrannize /MASAKARI_General-Purpose_Grade_English_Wordlist_r3_316423_words.wrd/

E:\Kazahana_r1-++fix+>copy con Severina.txt
Mitko Schtereff 4 president
777 trumps 666
Windbloom
^Z
        1 file(s) copied.

E:\Kazahana_r1-++fix+>"Kazahana_r1-++fix+_HEXADECAD-Threads_IntelV12.exe" ". .`." Severina.txt
Kazahana, a superfast exact & wildcards & Levenshtein Distance (Wagner-Fischer) searcher, revision 1-++fix+, copyleft Kaze 2013-Feb-13.
omp_get_num_procs( ) = 2
omp_get_max_threads( ) = 2
Enforcing HEXADECAD i.e. hexadecuple-threads ...
Allocating Master-Buffer 7MB ... OK

Kazahana: Total/Checked/Dumped xgrams: 3/3/1
Kazahana: Performance: 0 KB/clock
Kazahana: Performance: 1 xgrams/clock
Kazahana: Done.

E:\Kazahana_r1-++fix+>type Kazahana.txt
[. .`.] Mitko Schtereff 4 president /Severina.txt/

E:\Kazahana_r1-++fix+>"Kazahana_r1-++fix+_HEXADECAD-Threads_IntelV12.exe" `.` Severina.txt
Kazahana, a superfast exact & wildcards & Levenshtein Distance (Wagner-Fischer) searcher, revision 1-++fix+, copyleft Kaze 2013-Feb-13.
omp_get_num_procs( ) = 2
omp_get_max_threads( ) = 2
Enforcing HEXADECAD i.e. hexadecuple-threads ...
Allocating Master-Buffer 7MB ... OK

Kazahana: Total/Checked/Dumped xgrams: 3/3/2
Kazahana: Performance: 0 KB/clock
Kazahana: Performance: 3 xgrams/clock
Kazahana: Done.

E:\Kazahana_r1-++fix+>type Kazahana.txt
[`.`] 777 trumps 666 /Severina.txt/
[`.`] Windbloom /Severina.txt/

E:\Kazahana_r1-++fix+>"Kazahana_r1-++fix+_HEXADECAD-Threads_IntelV12.exe" . Severina.txt
Kazahana, a superfast exact & wildcards & Levenshtein Distance (Wagner-Fischer) searcher, revision 1-++fix+, copyleft Kaze 2013-Feb-13.
omp_get_num_procs( ) = 2
omp_get_max_threads( ) = 2
Enforcing HEXADECAD i.e. hexadecuple-threads ...
Allocating Master-Buffer 7MB ... OK

Kazahana: Total/Checked/Dumped xgrams: 3/3/1
Kazahana: Performance: 0 KB/clock
Kazahana: Performance: 3 xgrams/clock
Kazahana: Done.

E:\Kazahana_r1-++fix+>type Kazahana.txt
[.] Windbloom /Severina.txt/

E:\Kazahana_r1-++fix+>"Kazahana_r1-++fix+_HEXADECAD-Threads_IntelV12.exe" * Severina.txt
Kazahana, a superfast exact & wildcards & Levenshtein Distance (Wagner-Fischer) searcher, revision 1-++fix+, copyleft Kaze 2013-Feb-13.
omp_get_num_procs( ) = 2
omp_get_max_threads( ) = 2
Enforcing HEXADECAD i.e. hexadecuple-threads ...
Allocating Master-Buffer 7MB ... OK

Kazahana: Total/Checked/Dumped xgrams: 3/3/3
Kazahana: Performance: 0 KB/clock
Kazahana: Performance: 3 xgrams/clock
Kazahana: Done.

E:\Kazahana_r1-++fix+>type Kazahana.txt
[*] Mitko Schtereff 4 president /Severina.txt/
[*] 777 trumps 666 /Severina.txt/
[*] Windbloom /Severina.txt/

E:\Kazahana_r1-++fix+>
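
The fuzzy regime (three parameters, as in Example2) is named in the banner as Levenshtein Distance (Wagner-Fischer). For readers who have not met it, here is a minimal Python sketch of the Wagner-Fischer dynamic programme. It is only an illustration and not Kazahana's C implementation; `levenshtein` and `fuzzy_search` are hypothetical names, and the two-row memory layout is just one common choice.

```python
def levenshtein(a, b):
    """Edit distance between a and b (insert, delete, substitute cost 1 each)."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))  # distances from "" to prefixes of b
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def fuzzy_search(needle, words, at_most):
    """Keep the words whose distance to needle is at most `at_most`."""
    return [w for w in words if levenshtein(needle, w) <= at_most]


if __name__ == "__main__":
    print(levenshtein("psychedlicize", "psychedelicize"))
    print(fuzzy_search("psychedlicize", ["psychedelicize", "psychology"], 3))
```

The `at_most` argument plays the role of the AtMostLevenshteinDistance parameter from the usage line; with 0 the search degenerates to exact whole-word equality.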

Latest revision: Kazahana_r1-++fix+.zip

Tuning is something very interesting and rewarding (it accumulates valuable chunks of experience). One (in fact one more) nasty bottleneck remains to be widened.

It is still hard for me to see what causes the brutal damage to speed scalability shown further below.

After beautifying Gulliver's arrays, by making them global and eliminating the unnecessary reinitializations, I got a small but needed speed improvement:

Because the tested file is cached, the I/O traffic doesn't disturb us.

1-threaded Exact search for 'ramjet' into 889,537,624 bytes long file '4andabove_Gamera.tar.2.sorted':

r1-++fix+:
Kazahana: Performance: 632 KB/clock !632!
Kazahana: Performance: 632 KB/clock
Kazahana: Performance: 625 KB/clock
Kazahana: Performance: 632 KB/clock
Kazahana: Performance: 625 KB/clock
Kazahana: Performance: 625 KB/clock
Kazahana: Performance: 624 KB/clock

r1-++fix:
Kazahana: Performance: 625 KB/clock
Kazahana: Performance: 632 KB/clock !632!
Kazahana: Performance: 632 KB/clock
Kazahana: Performance: 632 KB/clock
Kazahana: Performance: 632 KB/clock
Kazahana: Performance: 632 KB/clock
Kazahana: Performance: 632 KB/clock

1-threaded Exact search for 'metal_fatigue' into 889,537,624 bytes long file '4andabove_Gamera.tar.2.sorted':

r1-++fix+:
Kazahana: Performance: 713 KB/clock
Kazahana: Performance: 722 KB/clock !722!
Kazahana: Performance: 722 KB/clock
Kazahana: Performance: 722 KB/clock
Kazahana: Performance: 722 KB/clock
Kazahana: Performance: 722 KB/clock
Kazahana: Performance: 722 KB/clock

r1-++fix:
Kazahana: Performance: 703 KB/clock
Kazahana: Performance: 704 KB/clock !704!
Kazahana: Performance: 704 KB/clock
Kazahana: Performance: 704 KB/clock
Kazahana: Performance: 703 KB/clock
Kazahana: Performance: 704 KB/clock
Kazahana: Performance: 704 KB/clock

1-threaded Exact search for 'incomprehensible_misunderstanding' into 889,537,624 bytes long file '4andabove_Gamera.tar.2.sorted':

r1-++fix+:
Kazahana: Performance: 806 KB/clock !806!
Kazahana: Performance: 805 KB/clock
Kazahana: Performance: 806 KB/clock
Kazahana: Performance: 806 KB/clock
Kazahana: Performance: 806 KB/clock
Kazahana: Performance: 806 KB/clock
Kazahana: Performance: 805 KB/clock

r1-++fix:
Kazahana: Performance: 794 KB/clock !794!
Kazahana: Performance: 794 KB/clock
Kazahana: Performance: 794 KB/clock
Kazahana: Performance: 794 KB/clock
Kazahana: Performance: 794 KB/clock
Kazahana: Performance: 794 KB/clock
Kazahana: Performance: 794 KB/clock

Or roughly ((632+722+806)-(632+704+794))/(632+704+794)*100% = 1.4% speed up for the new revision.

16-threaded Exact search for 'ramjet' into 889,537,624 bytes long file '4andabove_Gamera.tar.2.sorted':

r1-++fix+:
Kazahana: Performance: 695 KB/clock !695!
Kazahana: Performance: 695 KB/clock
Kazahana: Performance: 687 KB/clock
Kazahana: Performance: 687 KB/clock
Kazahana: Performance: 687 KB/clock
Kazahana: Performance: 695 KB/clock
Kazahana: Performance: 687 KB/clock

r1-++fix:
Kazahana: Performance: 678 KB/clock
Kazahana: Performance: 678 KB/clock
Kazahana: Performance: 678 KB/clock
Kazahana: Performance: 670 KB/clock
Kazahana: Performance: 670 KB/clock
Kazahana: Performance: 686 KB/clock !686!
Kazahana: Performance: 670 KB/clock

16-threaded Exact search for 'metal_fatigue' into 889,537,624 bytes long file '4andabove_Gamera.tar.2.sorted':

r1-++fix+:
Kazahana: Performance: 783 KB/clock
Kazahana: Performance: 762 KB/clock
Kazahana: Performance: 751 KB/clock
Kazahana: Performance: 772 KB/clock
Kazahana: Performance: 794 KB/clock !794!
Kazahana: Performance: 752 KB/clock
Kazahana: Performance: 762 KB/clock

r1-++fix:
Kazahana: Performance: 752 KB/clock
Kazahana: Performance: 741 KB/clock
Kazahana: Performance: 751 KB/clock
Kazahana: Performance: 762 KB/clock
Kazahana: Performance: 741 KB/clock
Kazahana: Performance: 762 KB/clock
Kazahana: Performance: 783 KB/clock !783!

16-threaded Exact search for 'incomprehensible_misunderstanding' into 889,537,624 bytes long file '4andabove_Gamera.tar.2.sorted':

r1-++fix+:
Kazahana: Performance: 794 KB/clock
Kazahana: Performance: 783 KB/clock
Kazahana: Performance: 817 KB/clock !817!
Kazahana: Performance: 794 KB/clock
Kazahana: Performance: 784 KB/clock
Kazahana: Performance: 772 KB/clock
Kazahana: Performance: 817 KB/clock

r1-++fix:
Kazahana: Performance: 762 KB/clock
Kazahana: Performance: 784 KB/clock !784!
Kazahana: Performance: 784 KB/clock
Kazahana: Performance: 741 KB/clock
Kazahana: Performance: 762 KB/clock
Kazahana: Performance: 783 KB/clock
Kazahana: Performance: 783 KB/clock

Or roughly ((695+794+817)-(686+783+784))/(686+783+784)*100% = 2.3% speed up for the new revision.

What confuses me badly is the miserable speed-up of the 16-threaded executable over the 1-threaded one: ((695+794+817)-(632+722+806))/(632+722+806)*100% = 6.7%. Obviously I was wrong to expect at least 50%. What causes this ugliness? Who can explain!?
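
The three percentages above all come from the same formula applied to the best-of-seven figures (the ones marked with exclamation marks). Below is a small Python sketch for anyone who wants to recompute them or plug in their own runs; `gain_percent` and the list names are made up for this example, and the numbers are copied from the 7MB-buffer results in this post.

```python
def gain_percent(new, old):
    """Relative gain of `new` over `old`, in percent, using summed KB/clock."""
    return (sum(new) - sum(old)) / sum(old) * 100.0


# Best-of-seven KB/clock for 'ramjet', 'metal_fatigue' and
# 'incomprehensible_misunderstanding', in that order (7MB master buffer).
one_thread_fix_plus = [632, 722, 806]
one_thread_fix = [632, 704, 794]
sixteen_threads_fix_plus = [695, 794, 817]
sixteen_threads_fix = [686, 783, 784]

if __name__ == "__main__":
    rows = [
        ("1 thread,   r1-++fix+ vs r1-++fix", one_thread_fix_plus, one_thread_fix),
        ("16 threads, r1-++fix+ vs r1-++fix", sixteen_threads_fix_plus, sixteen_threads_fix),
        ("r1-++fix+,  16 threads vs 1 thread", sixteen_threads_fix_plus, one_thread_fix_plus),
    ]
    for label, new, old in rows:
        print(f"{label}: {gain_percent(new, old):.2f}%")
```

Summing the three patterns before dividing weights the longer patterns a little more, since they run faster; averaging the per-pattern gains would be an equally defensible choice.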

I took desperate measures and reduced the master buffer from 7MB (the minimum needed to search long lines, such as Wikipedia's) down to 1MB.
The master buffer is devoured by those 16 threads, that is, each thread has its own haystack of approximately 1MB/16 = 65536 bytes.
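
How Kazahana actually carves up the master buffer is not shown in this post. Purely as an assumption, one way to give each thread its own haystack is to cut the buffer into roughly equal slices and push each cut forward to the next LF, so that no line is shared between two threads. The Python sketch below shows that idea; `split_haystacks` is a hypothetical name and nothing here is taken from the C source.

```python
def split_haystacks(buffer, threads):
    """Return (start, end) slices of `buffer`, at most one per thread.

    Every slice except possibly the last ends right after a line feed,
    so a line never straddles two slices.
    """
    size = len(buffer)
    target = max(1, size // threads)  # e.g. 1048576 // 16 == 65536
    slices = []
    start = 0
    for i in range(threads):
        if start >= size:
            break  # earlier slices already swallowed the whole buffer
        if i == threads - 1:
            end = size  # last thread takes whatever is left
        else:
            end = min(size, start + target)
            # Extend the slice up to and including the next line feed.
            nl = buffer.find(b"\n", end - 1)
            end = size if nl == -1 else nl + 1
        slices.append((start, end))
        start = end
    return slices


if __name__ == "__main__":
    data = b"aa\nbbbb\ncc\ndddddd\n"
    for start, end in split_haystacks(data, 4):
        print(start, end, data[start:end])
```

With long lines the slices become uneven and some threads may get nothing at all, which is one reason a small buffer and very long lines do not mix well.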

1-threaded Exact search for 'ramjet' into 889,537,624 bytes long file '4andabove_Gamera.tar.2.sorted':

r1-++fix+:
Kazahana: Performance: 731 KB/clock
Kazahana: Performance: 762 KB/clock !762!
Kazahana: Performance: 762 KB/clock
Kazahana: Performance: 762 KB/clock
Kazahana: Performance: 762 KB/clock
Kazahana: Performance: 762 KB/clock
Kazahana: Performance: 762 KB/clock

1-threaded Exact search for 'metal_fatigue' into 889,537,624 bytes long file '4andabove_Gamera.tar.2.sorted':

r1-++fix+:
Kazahana: Performance: 868 KB/clock
Kazahana: Performance: 869 KB/clock !869!
Kazahana: Performance: 869 KB/clock
Kazahana: Performance: 882 KB/clock
Kazahana: Performance: 869 KB/clock
Kazahana: Performance: 868 KB/clock
Kazahana: Performance: 869 KB/clock

1-threaded Exact search for 'incomprehensible_misunderstanding' into 889,537,624 bytes long file '4andabove_Gamera.tar.2.sorted':

r1-++fix+:
Kazahana: Performance: 1,091 KB/clock !1,091!
Kazahana: Performance: 1,089 KB/clock
Kazahana: Performance: 1,091 KB/clock
Kazahana: Performance: 1,089 KB/clock
Kazahana: Performance: 1,089 KB/clock
Kazahana: Performance: 1,089 KB/clock
Kazahana: Performance: 1,089 KB/clock

16-threaded Exact search for 'ramjet' into 889,537,624 bytes long file '4andabove_Gamera.tar.2.sorted':

r1-++fix+:
Kazahana: Performance: 830 KB/clock
Kazahana: Performance: 868 KB/clock
Kazahana: Performance: 869 KB/clock !869!
Kazahana: Performance: 868 KB/clock
Kazahana: Performance: 855 KB/clock
Kazahana: Performance: 869 KB/clock
Kazahana: Performance: 855 KB/clock

16-threaded Exact search for 'metal_fatigue' into 889,537,624 bytes long file '4andabove_Gamera.tar.2.sorted':

r1-++fix+:
Kazahana: Performance: 1,029 KB/clock !1,029!
Kazahana: Performance: 993 KB/clock
Kazahana: Performance: 976 KB/clock
Kazahana: Performance: 974 KB/clock
Kazahana: Performance: 958 KB/clock
Kazahana: Performance: 992 KB/clock
Kazahana: Performance: 943 KB/clock

16-threaded Exact search for 'incomprehensible_misunderstanding' into 889,537,624 bytes long file '4andabove_Gamera.tar.2.sorted':

r1-++fix+:
Kazahana: Performance: 1,181 KB/clock
Kazahana: Performance: 1,208 KB/clock
Kazahana: Performance: 1,264 KB/clock !1,264!
Kazahana: Performance: 1,183 KB/clock
Kazahana: Performance: 1,208 KB/clock
Kazahana: Performance: 1,209 KB/clock
Kazahana: Performance: 1,183 KB/clock

Again confusion: ((869+1029+1264)-(762+869+1091))/(762+869+1091)*100% = 16.1%, still far from my illusory 50%. The drop is too deep. Who can explain!?
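
One hedged way to put a number on the drop is Amdahl's law, S = 1 / ((1 - f) + f / p), where f is the fraction of the run that scales across p processors. The Python sketch below inverts it to estimate f from a measured speed-up. It assumes p = 2, only because the session earlier in this post printed omp_get_num_procs( ) = 2; if the benchmark ran on other hardware, p has to be changed. Under that assumption the model caps the speed-up at 2x however many threads are started. The model ignores memory bandwidth and cache effects, so the result is a rough indicator and not an explanation. The function names are made up.

```python
def amdahl_speedup(parallel_fraction, processors):
    """Speed-up predicted by Amdahl's law for a given parallel fraction."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / processors)


def implied_parallel_fraction(speedup, processors):
    """Invert Amdahl's law: which parallel fraction yields this speed-up?

    Requires processors > 1 and speedup > 0.
    """
    return (1.0 - 1.0 / speedup) / (1.0 - 1.0 / processors)


if __name__ == "__main__":
    # Summed best KB/clock figures with the 1MB master buffer, from the post.
    sixteen_threads = 869 + 1029 + 1264
    one_thread = 762 + 869 + 1091
    measured = sixteen_threads / one_thread
    f = implied_parallel_fraction(measured, processors=2)  # p = 2 is an assumption
    print(f"measured speed-up: {measured:.3f}")
    print(f"implied parallel fraction: {f:.3f}")
    print(f"ceiling with 2 processors: {amdahl_speedup(1.0, 2):.1f}x")
```

For the 1MB-buffer figures this gives a parallel fraction of roughly 0.28, meaning that under this simple model most of the run behaves as if it were serial.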

The benchmark/torture that interests me the most is this:

- Physical RAM: 64GB, preferably quad-channel - tuning the fastest tool on non-fastest hardware is not exciting;
- XEON class CPU 8cores/16threads, or AMD 8cores/8threads;
- Preferably both the RAM&CPU overclocked at MAX - I want a glimpse of the future.

It will show whether the disappointing results on my T7500 'Bonboniera' are proportional to the results on thread-rich CPUs like XEON/Bulldozer.
The above environment is excellent for tuning Kazahana because my master torture-test English Wikipedia (39.2GB) fits in the OS system cache thus eliminating the I/O traffic.

My primary interest lies in English-language phrase suggesting. Galadriel, and after that Kazahana, were meant to be word/phrase suggesteresses in my free phrase-checker Masakari; however, it doesn't hurt for both of them to be used as standalone tools.

Besides text torturing, what I need is at least one *nix programmer willing to help me port Kazahana. My skills in *nix are tragic; I would be very glad to see (or rather hear) her operational in a *nix environment - it would be the most meaningful 'mashallah' for me.
Just send me an email at sanmayce@sanmayce.com, I will give you (within 48 hours - my latency) a link to C source of latest 1-++fix+ revision.

One song from my childhood is still very dear to me: 'Diana Express - Severina', the world is really small: 'Diana Express' was (she is no more, she is in Trains' Heaven, Bulgaria is falling apart) the name of one of Bulgarian shinkansens (in Japanese: new rail line, in Bulgaria we simply say 'влакът стрела' i.e. 'the arrow train' not bullet train nor high velocity train, because Diana is the goddess of hunting, she is depicted as semi-naked beauty launching arrows gracefully and presumably with god-like accuracy), where 'SEVERINA' is the name of a girl (made of snow, the lyrics don't tell literally or metaphorically) coming every winter from the North, the closest equivalent of 'Severina' being 'Northina'.

Lyrics
Mitko Shterev - keyboards
Illya Angelov - lead vocal & guitar
Диана Експрес - Северина / Diana Express - Severina

Северина, момиче от сняг / Severina a girl made of snow
всяка зима е северен знак / every winter she is a northern sign
аз го имам в песен от юг / I have it in song from south
Северина - радост за друг / Severina - a joy not for me

И като сняг тихо вали / And like a snow she silently comes
вик от мойта любов / a scream from my love
и се топи и навява тъга / and she melts and brings sadness
песента ми за теб / my song for you

Северина, момиче от сняг / Severina a snow girl
на приказна фея / she is fabulous fairy's
е северен знак / northern sign
целува ме бързо / she kisses me quickly
и по снега тръгва зима / and winter start marching on
бяла тъга / white sadness


I've heard that Eskimo people have more than 200 words for snow, this trumps even the sensitivity of Japanese people who are best known for their reverence to Nature, just a few snow related ones:

hatsuyuki : first snow (of season)
hyouden : field of eternal snow
koyuki : light snow
ooyuki : heavy snow
setsuzou : snow sculpture
shinshin : sound of heavy snow-fall
shinshin : mind body
yukionna : snow woman, fairy

Mutsi-mutsi, 'shinshin' is all-zen, in my view, any language lacking a word for sound of snowfalling quickly must incorporate it and fill the GAP.

How much I would like someone versed in their languageS to teach me all variants of 'snowflake'.

In my language 'snowflake' is 'снежинка', whereas 'Snow White' is 'Снежанка'. No bias here: these two Bulgarian words are fantabulous and so ringy, and they are feminine (English in that respect fails to connote the most beautiful facet: tenderness). Another lovely variant is the Russian 'Снегурочка'; Russian is a brother language, yet the 'snowflake' counterpart eludes me.

Please tell me what words are in use for 'snowflake' in your language.
post #28 of 32
Quote:
Originally Posted by Sanmayce

@Plan9
Man, we are too different to get along.
You're the one getting shirty though. Every time I ask what differentiates this from existing tools that come pre-installed, you get funny and say I've offended you.
Quote:
Originally Posted by Sanmayce

Persons who need a free search tool will have the chance to try this one, hopefully without your 'censorship'.
They already exist - pre-installed with every OS. And I'm not censoring you, I'm just asking what this tool does that's so awe-inspiring. You're the one censoring yourself by not using that opportunity to sing the praises of your application.
Quote:
Originally Posted by Sanmayce

Again you knocked me down, please stop throwing slanders at me
That isn't slander. I'm actually getting quite annoyed at you now because it's impossible to hold a mature discussion without you throwing your toys out of the pram.
Quote:
Originally Posted by Sanmayce

, my only fault is that 6-7 years ago, when I was choosing my domain, I foolishly chose .com, which I have regretted ever since. Subsequently I learned that it stands for commercial, but in my defence .org and .net appeared to me too pompous and not good for a personal site. Anyway, I don't sell anything, literally or figuratively.
That's not the reason why this felt like a promotional thread. The reason is that you started a thread centred around an app of yours - a thread that you've spammed across multiple forums too, I might add - then you keep trying to change the subject whenever competing products are mentioned, and you make false claims that your app is the only free utility of its kind available.

So while I would normally give kudos to those who do write freeware and particularly those who release the source, this thread comes across as mighty suspicious - and that's entirely down to the way you've conducted yourself. It's impossible to get straight answers from you. Then out of the blue a user who hasn't posted before, yet shares the same posting style as you, comes on board and does the tried and tested routine of a third party endorsing a product. Equally suspicious.
Quote:
Originally Posted by Sanmayce

You have many things to unlearn; that is only my personal opinion, which happens to overlap with the truth in a huge number of cases.
And you call me rude
Quote:
Originally Posted by Sanmayce

And for more manga (this word you don't know), being thankful and whatnot, you can read one of my posts at thefreedictionary; I hope you will unlearn something.
1) I know what manga means given that I was the one who posted the term.
2) I'm not being ungrateful, I'm just asking what this sodding application does that makes it so bloody special. However instead of answering what should be a straightforward question, you kick off as if I've insulted your mother.

I'm going to give up on this thread now though because it's become quite clear that you're too immature to chat about the software itself (which should have been the crux of this thread). And looking back at this thread, it's quite obvious that everyone else (bar duhai, who I'd put money on being your alias) is just as confused about the point of this application as I am. So you're doing your utility a real injustice by having this loopy attitude of yours.
Edited by Plan9 - 2/14/13 at 1:10am
post #29 of 32
Thread Starter 
No problema, so be it.
post #30 of 32
Thread Starter 
Just found a very cool two-year-old video:

Cheapest super computer in the world made by Prof. Hamada:

Description:
"This super computer was built by a Japanese university professor. It's built using ordinary computer parts that you can find in any computer shop everywhere, and what is -very- special about this computer is that it's very cheap compared to other super computers around the world, computers owned by governments and big corporations that cost around 1.2 bil$ each. This one cost only 420000$, and it broke the world record of performance (calculations per second) compared to the other extremely expensive super computers."


At 4:10 a very nice message.

At 5:04
... the biggest applications haven't even been imagined yet.

Nice, this guy is so natural, should have his own TV show:

What should I say, not a bad machine for my ENWIKI torture at all.

EDIT, 23 Feb:
The article going along with the above video.

I was badly surprised when I looked at the AIDA64 memory results for the "powerful" dual Xeon E5-2687W @3.4GHz:
Memory Copy: 10000MB/s

Just now I got why the system is called 'INSANITY' - a quad-channel monster machine with such miserable memory bandwidth!?


Two months ago I asked a forum fellow (cavallino) for stats on his dual-channel XEON, and AIDA reported 21530MB/s. How is that possible!?
Edited by Sanmayce - 2/23/13 at 4:20am