Benchmark 'Sherlock Holmes' - superfast grepping into 12GB English subtitles - Overclock.net - An Overclocking Community

post #1 of 1 (permalink) Old 05-22-2019, 03:53 AM - Thread Starter
Sanmayce (Integer Benchmarker)
Join Date: Mar 2012
Location: Sofia
Posts: 393
Rep: 22 (Unique: 19)
Benchmark 'Sherlock Holmes' - superfast grepping into 12GB English subtitles

Grepping time.

Until several days ago I didn't know that 'ripgrep' existed; its author claims it is the fastest.
For 'Exact Matching' he benchmarked it on a corpus of plain English sentences (a rip from 400+ thousand subtitle files).
His console tool is written in Rust. I proposed a showdown against my Kazahana tool, written in C, and he agreed conditionally.

What confuses me is that a GitHub project like his, with 14,000+ stars, and its author are reluctant to compare speeds in an open Rust vs C showdown.
The guy is obviously not a fan of heavy/exhaustive benchmarking, so I will do it myself; my wish was (and still is) to see which optimizations are left unused.

To reproduce his benchmark scenario, here comes his pattern:

Open a Command Prompt (you may double-click my wide-screen prompt shortcut 'MokujIN GREEN 224 prompt.lnk') and enter:

Code:
F:\grep_vs_ripgrep_vs_Kazahana>GREP_BENCHMARK.bat "Sherlock Holmes" "OpenSubtitle_corpus_en_2018_(441,450,449_lines_FROM_446,612_files).txt"
Code:
Pattern: "Sherlock Holmes"
Pattern Length: 15
Haystack: OpenSubtitle_corpus_en_2018_(441,450,449_lines_FROM_446,612_files).txt
Haystack Length: 13,113,340,782 (Cached in System RAM)
Testmachine: Windows 10, i7-3630QM (4cores/8threads) 6MB cache, 16GB DDR3 1600MHz(@800MHz)
---------------------------------------------------------------------------------------------------
| Searcher                                                            | Global  Time |       Hits |
|--------------------------------------------------------------------------------------------------
| Kazahana_Trolldom_Monad_GCC_472_SSE41_32bit.exe              520120 |        5.954 |      7,673 |
| Kazahana_Trolldom_Hexadecad_GCC_730_SSE41_64bit.exe          520120 |        5.083 |      7,673 |
| Kazahana_Trolldom_Hexadecad_IntelV15_SSE41_64bit.exe         520120 |        5.513 |      7,673 |
| grep-2.5.4.exe -F -c (LC_ALL=C)                                     |       20.165 |      7,673 |
| ripgrep-11.0.1-x86_64-pc-windows-gnu.exe                            |        4.817 |      7,673 |
---------------------------------------------------------------------------------------------------





Binaries are downloadable at: https://github.com/BurntSushi/ripgrep

The full benchmark package is downloadable from my Internet Drive as grep_vs_ripgrep_vs_Kazahana.7z (1,411,102,425 bytes):
https://drive.google.com/file/d/1EZu...ew?usp=sharing

It contains:

Code:
F:\grep_vs_ripgrep_vs_Kazahana>dir
 Volume in drive F is Sanmayce_223GB_B
 Volume Serial Number is 443D-57A7

 Directory of F:\grep_vs_ripgrep_vs_Kazahana

05/21/2019  10:48 PM    <DIR>          .
05/21/2019  10:48 PM    <DIR>          ..
05/22/2019  12:21 AM           451,824 grep-2.5.4-bin.zip
05/22/2019  12:21 AM           898,241 grep-2.5.4-dep.zip
05/22/2019  12:21 AM         1,361,303 grep-2.5.4-src.zip
05/22/2019  12:21 AM            96,256 grep.exe
05/22/2019  12:21 AM            19,385 GREP_BENCHMARK.bat
05/22/2019  12:21 AM               502 Kazahana_compile_GCC_32bit.bat
05/22/2019  12:21 AM               502 Kazahana_compile_GCC_64bit.bat
05/22/2019  12:21 AM               617 Kazahana_compile_Intel12_32bit.bat
05/22/2019  12:21 AM         2,220,512 Kazahana_r1-++fix+nowait_critical_nixFIX_WolfRAM+fixITER+EX+CS_fix_DEFINE_Trolldom.c
05/22/2019  12:21 AM           431,699 Kazahana_Trolldom_Hexadecad_GCC_730_SSE41_64bit.exe
05/22/2019  12:21 AM           217,088 Kazahana_Trolldom_Hexadecad_IntelV15_SSE41_64bit.exe
05/22/2019  12:21 AM           251,390 Kazahana_Trolldom_Monad_GCC_472_SSE41_32bit.exe
05/22/2019  12:21 AM           165,782 Kazahana_Trolldom_Monad_GCC_730_SSE41_64bit.exe
05/22/2019  12:21 AM           195,584 Kazahana_Trolldom_Monad_IntelV15_SSE41_64bit.exe
05/22/2019  12:21 AM         1,008,128 libiconv2.dll
05/22/2019  12:21 AM           103,424 libintl3.dll
05/22/2019  12:21 AM         1,114,552 libiomp5md.dll
05/22/2019  12:21 AM            94,540 LineWordreporter.c
05/22/2019  12:21 AM            69,120 LineWordreporter.exe
05/22/2019  12:21 AM             1,633 MokujIN Amber 224 prompt.lnk
05/22/2019  12:21 AM             1,633 MokujIN GREEN 224 prompt.lnk
05/22/2019  12:21 AM    13,113,340,782 OpenSubtitle_corpus_en_2018_(441,450,449_lines_FROM_446,612_files).txt
05/22/2019  12:21 AM           140,288 pcre3.dll
05/22/2019  12:21 AM            94,300 pthreadGC2.dll
05/22/2019  12:21 AM            79,360 regex2.dll
05/22/2019  12:21 AM            87,572 RESULTS_CPU_i7-3630QM.txt
05/22/2019  12:21 AM           207,289 RESULTS_CPU_i7-3630QM_pattern-'gun'.png
05/22/2019  12:21 AM           208,156 RESULTS_CPU_i7-3630QM_pattern-'Sherlock '.png
05/22/2019  12:21 AM           208,205 RESULTS_CPU_i7-3630QM_pattern-'Sherlock Holmes'.png
05/22/2019  12:21 AM         6,923,495 ripgrep-11.0.1-i686-pc-windows-gnu.zip
05/22/2019  12:21 AM         1,595,373 ripgrep-11.0.1-i686-pc-windows-msvc.zip
05/22/2019  12:21 AM        27,519,247 ripgrep-11.0.1-x86_64-pc-windows-gnu.exe
05/22/2019  12:21 AM         7,084,974 ripgrep-11.0.1-x86_64-pc-windows-gnu.zip
05/22/2019  12:21 AM         1,767,433 ripgrep-11.0.1-x86_64-pc-windows-msvc.zip
05/22/2019  12:21 AM             4,096 timer32.exe
              35 File(s) 13,167,964,285 bytes
               2 Dir(s)   1,365,311,488 bytes free

F:\grep_vs_ripgrep_vs_Kazahana>
Q: Why is Kazahana revision 'Trolldom' not as fast as it should be?
A: Simple: Kazahana "loses time" splitting the buffer into 16 chunks. Also, Kazahana was intended as a PLAIN C tool, whereas patterns of length 2/3 should be searched with SIMD.

Of course, plain C is awesome for portability, but I have an idea for a SIMD revision of Exact Matching that will be beyond superfast. The idea is simplicity itself:

Code:
Pattern: "to"
Haystack:                           "otto...........toz"
HaystackVector1:                    "otto...........t"
HaystackVector2:                    "tto...........to"
Vector1:                            "tttttttttttttttt"
Vector2:                            "oooooooooooooooo"

Mask1=(HaystackVector1 == Vector1):  0110...........1
Mask2=(HaystackVector2 == Vector2):  001............1
Result=(Mask1 AND Mask2):            0010...........1
I intend to implement it pretty much as in my 'Kamboocha' LCSS tool.
If the Result is zero, skip 16 bytes. Of course YMM is more desirable: then 32 'if' statements will be "folded" into one.
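
The scheme above can be sketched in C with SSE2 intrinsics. This is only a minimal illustration of the two-byte case, not the Kazahana or Kamboocha code: the function name count_pair_sse2 and its parameters are made up for the example. It also counts occurrences rather than matching lines, which is an assumption that differs from what 'grep -c' reports. The byte-wise compare is _mm_cmpeq_epi8, and _mm_movemask_epi8 packs the 16 compare results into a 16-bit Result.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stddef.h>

/* Hypothetical helper: count occurrences of the 2-byte pattern (p0, p1)
 * in hay[0..len). Illustration only, not the author's implementation. */
static size_t count_pair_sse2(const unsigned char *hay, size_t len,
                              unsigned char p0, unsigned char p1)
{
    size_t hits = 0, i = 0;
    const __m128i v1 = _mm_set1_epi8((char)p0);  /* Vector1: first byte x16  */
    const __m128i v2 = _mm_set1_epi8((char)p1);  /* Vector2: second byte x16 */

    /* Each step reads bytes i .. i+16, so it needs i + 17 <= len. */
    for (; i + 17 <= len; i += 16) {
        /* HaystackVector1 at offset 0, HaystackVector2 at offset 1. */
        __m128i h1 = _mm_loadu_si128((const __m128i *)(hay + i));
        __m128i h2 = _mm_loadu_si128((const __m128i *)(hay + i + 1));

        /* Mask1 AND Mask2: byte k is 0xFF iff the pattern starts at i+k. */
        __m128i m = _mm_and_si128(_mm_cmpeq_epi8(h1, v1),
                                  _mm_cmpeq_epi8(h2, v2));

        /* Result as a 16-bit integer; zero means skip these 16 bytes. */
        unsigned result = (unsigned)_mm_movemask_epi8(m);
        while (result) {            /* one iteration per set bit */
            hits++;
            result &= result - 1;   /* clear the lowest set bit */
        }
    }

    /* Scalar tail for the start positions the vector loop did not cover. */
    for (; i + 1 < len; i++)
        if (hay[i] == p0 && hay[i + 1] == p1)
            hits++;

    return hits;
}
```

Calling count_pair_sse2 on the 18-byte haystack "otto...........toz" with 't' and 'o' returns 2, matching the two set bits in the Result row above. A YMM version would follow the same shape with 32-byte loads, assuming AVX2 is available.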
Attached image: RESULTS_CPU_i7-3630QM_pattern-'Sherlock Holmes'.png (203.3 KB)


Last edited by Sanmayce; 05-22-2019 at 04:04 AM.