Overclock.net - An Overclocking Community


Sanmayce 05-22-2019 03:53 AM

Benchmark 'Sherlock Holmes' - superfast grepping into 12GB English subtitles
 
Grepping time.

Until several days ago I didn't know that 'ripgrep' existed; its author claims it is the fastest.
Speaking of 'Exact Matching', he benchmarked it on a corpus of plain English sentences (a rip from 400+ thousand subtitle files).
His console tool is written in Rust. I proposed a showdown against my Kazahana tool, written in C, and he agreed conditionally.

What confuses me is why a GitHub project like his, with 14,000+ stars, and its author are reluctant to compare speeds in an open Rust vs C showdown.
The guy is obviously not a fan of heavy/exhaustive benchmarking, so I will do it myself. My wish was (and still is) to see which optimizations are left unused.

To reproduce his benchmark scenario, here is his pattern:

Open a Command Prompt (you may double-click my wide-screen prompt shortcut 'MokujIN GREEN 224 prompt.lnk') and enter:

Code:


F:\grep_vs_ripgrep_vs_Kazahana>GREP_BENCHMARK.bat "Sherlock Holmes" "OpenSubtitle_corpus_en_2018_(441,450,449_lines_FROM_446,612_files).txt"

Code:

Pattern: "Sherlock Holmes"
Pattern Length: 15
Haystack: OpenSubtitle_corpus_en_2018_(441,450,449_lines_FROM_446,612_files).txt
Haystack Length: 13,113,340,782 (Cached in System RAM)
Testmachine: Windows 10, i7-3630QM (4cores/8threads) 6MB cache, 16GB DDR3 1600MHz(@800MHz)
--------------------------------------------------------------------------------------
| Searcher                                                     | Global Time |  Hits |
|------------------------------------------------------------------------------------|
| Kazahana_Trolldom_Monad_GCC_472_SSE41_32bit.exe       520120 |       5.954 | 7,673 |
| Kazahana_Trolldom_Hexadecad_GCC_730_SSE41_64bit.exe   520120 |       5.083 | 7,673 |
| Kazahana_Trolldom_Hexadecad_IntelV15_SSE41_64bit.exe  520120 |       5.513 | 7,673 |
| grep-2.5.4.exe -F -c (LC_ALL=C)                              |      20.165 | 7,673 |
| ripgrep-11.0.1-x86_64-pc-windows-gnu.exe                     |       4.817 | 7,673 |
--------------------------------------------------------------------------------------


https://www.overclock.net/forum/atta...p;d=1558522378

https://software.intel.com/sites/def...-%27gun%27.png

ripgrep binaries are downloadable at: https://github.com/BurntSushi/ripgrep

The full benchmark package is downloadable at my Internet Drive, grep_vs_ripgrep_vs_Kazahana.7z, (1,411,102,425 bytes):
https://drive.google.com/file/d/1EZu...ew?usp=sharing

It contains:

Code:

F:\grep_vs_ripgrep_vs_Kazahana>dir
 Volume in drive F is Sanmayce_223GB_B
 Volume Serial Number is 443D-57A7

 Directory of F:\grep_vs_ripgrep_vs_Kazahana

05/21/2019  10:48 PM    <DIR>          .
05/21/2019  10:48 PM    <DIR>          ..
05/22/2019  12:21 AM          451,824 grep-2.5.4-bin.zip
05/22/2019  12:21 AM          898,241 grep-2.5.4-dep.zip
05/22/2019  12:21 AM        1,361,303 grep-2.5.4-src.zip
05/22/2019  12:21 AM            96,256 grep.exe
05/22/2019  12:21 AM            19,385 GREP_BENCHMARK.bat
05/22/2019  12:21 AM              502 Kazahana_compile_GCC_32bit.bat
05/22/2019  12:21 AM              502 Kazahana_compile_GCC_64bit.bat
05/22/2019  12:21 AM              617 Kazahana_compile_Intel12_32bit.bat
05/22/2019  12:21 AM        2,220,512 Kazahana_r1-++fix+nowait_critical_nixFIX_WolfRAM+fixITER+EX+CS_fix_DEFINE_Trolldom.c
05/22/2019  12:21 AM          431,699 Kazahana_Trolldom_Hexadecad_GCC_730_SSE41_64bit.exe
05/22/2019  12:21 AM          217,088 Kazahana_Trolldom_Hexadecad_IntelV15_SSE41_64bit.exe
05/22/2019  12:21 AM          251,390 Kazahana_Trolldom_Monad_GCC_472_SSE41_32bit.exe
05/22/2019  12:21 AM          165,782 Kazahana_Trolldom_Monad_GCC_730_SSE41_64bit.exe
05/22/2019  12:21 AM          195,584 Kazahana_Trolldom_Monad_IntelV15_SSE41_64bit.exe
05/22/2019  12:21 AM        1,008,128 libiconv2.dll
05/22/2019  12:21 AM          103,424 libintl3.dll
05/22/2019  12:21 AM        1,114,552 libiomp5md.dll
05/22/2019  12:21 AM            94,540 LineWordreporter.c
05/22/2019  12:21 AM            69,120 LineWordreporter.exe
05/22/2019  12:21 AM            1,633 MokujIN Amber 224 prompt.lnk
05/22/2019  12:21 AM            1,633 MokujIN GREEN 224 prompt.lnk
05/22/2019  12:21 AM    13,113,340,782 OpenSubtitle_corpus_en_2018_(441,450,449_lines_FROM_446,612_files).txt
05/22/2019  12:21 AM          140,288 pcre3.dll
05/22/2019  12:21 AM            94,300 pthreadGC2.dll
05/22/2019  12:21 AM            79,360 regex2.dll
05/22/2019  12:21 AM            87,572 RESULTS_CPU_i7-3630QM.txt
05/22/2019  12:21 AM          207,289 RESULTS_CPU_i7-3630QM_pattern-'gun'.png
05/22/2019  12:21 AM          208,156 RESULTS_CPU_i7-3630QM_pattern-'Sherlock '.png
05/22/2019  12:21 AM          208,205 RESULTS_CPU_i7-3630QM_pattern-'Sherlock Holmes'.png
05/22/2019  12:21 AM        6,923,495 ripgrep-11.0.1-i686-pc-windows-gnu.zip
05/22/2019  12:21 AM        1,595,373 ripgrep-11.0.1-i686-pc-windows-msvc.zip
05/22/2019  12:21 AM        27,519,247 ripgrep-11.0.1-x86_64-pc-windows-gnu.exe
05/22/2019  12:21 AM        7,084,974 ripgrep-11.0.1-x86_64-pc-windows-gnu.zip
05/22/2019  12:21 AM        1,767,433 ripgrep-11.0.1-x86_64-pc-windows-msvc.zip
05/22/2019  12:21 AM            4,096 timer32.exe
              35 File(s) 13,167,964,285 bytes
              2 Dir(s)  1,365,311,488 bytes free

F:\grep_vs_ripgrep_vs_Kazahana>

Q: Why is Kazahana revision 'Trolldom' not as fast as it should be?
A: Simple: Kazahana "loses time" splitting the buffer into 16 chunks. Also, Kazahana was intended as a PLAIN C tool, whereas patterns of length 2/3 should be searched with SIMD.

Of course, plain C is awesome for portability, but I have an idea for a SIMD revision of Exact Matching that will be beyond superfast. The idea is simplicity itself:

Code:

Pattern: "to"
Haystack:                           "otto...........toz"
HaystackVector1:                    "otto...........t"
HaystackVector2:                    "tto...........to"
Vector1:                            "tttttttttttttttt"
Vector2:                            "oooooooooooooooo"

Mask1=(HaystackVector1 == Vector1):  0110...........1
Mask2=(HaystackVector2 == Vector2):  001............1
Result=(Mask1 AND Mask2):            0010...........1

I intend to implement it much as in my 'Kamboocha' LCSS tool.
If Result is zero, skip 16 bytes. Of course, YMM registers are more desirable: then 32 'if' statements will be "folded" into one.
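
The two-vector idea above can be sketched in C with SSE2 intrinsics. This is only a minimal illustration and not Kazahana code: the function name count_pair_sse2 and its signature are made up for the example, and it counts hits for a pattern of exactly 2 bytes. The per-byte comparison is done with _mm_cmpeq_epi8, the two masks are combined with _mm_and_si128, and _mm_movemask_epi8 turns the result into a 16-bit integer that can be tested against zero in a single 'if'.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stddef.h>

/* Hypothetical helper: count occurrences of the 2-byte pattern p[0]p[1]
   in hay[0..n). Overlapping occurrences are counted. */
static size_t count_pair_sse2(const char *hay, size_t n, const char *p)
{
    size_t hits = 0, i = 0;
    if (n < 2) return 0;

    const __m128i v1 = _mm_set1_epi8(p[0]);  /* Vector1: first byte x16  */
    const __m128i v2 = _mm_set1_epi8(p[1]);  /* Vector2: second byte x16 */

    /* Each pass tests the 16 start offsets i..i+15 and reads bytes i..i+16,
       so it needs at least 17 bytes left. */
    for (; i + 17 <= n; i += 16) {
        __m128i h1 = _mm_loadu_si128((const __m128i *)(hay + i));      /* HaystackVector1 */
        __m128i h2 = _mm_loadu_si128((const __m128i *)(hay + i + 1));  /* HaystackVector2 */
        __m128i m1 = _mm_cmpeq_epi8(h1, v1);                           /* Mask1 */
        __m128i m2 = _mm_cmpeq_epi8(h2, v2);                           /* Mask2 */
        unsigned mask = (unsigned)_mm_movemask_epi8(_mm_and_si128(m1, m2));

        if (mask == 0) continue;      /* Result is zero: skip 16 bytes */
        while (mask) {                /* one set bit per hit */
            mask &= mask - 1;
            hits++;
        }
    }

    /* Scalar tail for the remaining start offsets. */
    for (; i + 1 < n; i++)
        if (hay[i] == p[0] && hay[i + 1] == p[1]) hits++;

    return hits;
}
```

Called on the 18-byte haystack from the diagram with pattern "to", it counts the same two positions that the Result mask marks (offsets 2 and 15). A YMM version would presumably follow the same shape with the 256-bit AVX2 counterparts of these intrinsics, testing 32 offsets per pass.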

