Alrighty, I bit and made a little update to the workbook. I've basically automated the entire process so that nobody has to sort manually or record time on a stopwatch. Just click "Benchmark". I'll want to make a few more updates to the sheet because I don't know whether it tests all areas of the CPU or only the cache (it doesn't look like it, but the data is still in the file).

Linky
 
Can you change the OP file to the newer link in my post? I made a minor change to how the range is picked for sorting the data.
Edit: Disabled the ability to save the workbook, since saving the sheet in the same order that the code sorts it gives incorrect results.
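
For anyone wondering how the save-blocking works, it boils down to something like this sketch in the ThisWorkbook module (the message text here is just illustrative, not the exact wording in the file):

Code:

Private Sub Workbook_BeforeSave(ByVal SaveAsUI As Boolean, Cancel As Boolean)
    ' Block every save attempt so the data is never persisted
    ' in the sorted order the benchmark leaves behind.
    Cancel = True
    MsgBox "Saving is disabled; the benchmark needs the original row order."
End Sub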
 
I think MS added the ability to use more than 2 cores to its Office products with Office 2010. Also @ OP, if you're looking to do work with data, you should try to get the PowerPivot add-on for your Excel 2010 or later. It's attuned to some of the tasks data scientists do with other software.

With the current Excel results, can some of you try adjusting the overclocks on your systems to see how the results vary?

Update: Excel (2010 & later) uses multiple cores for formulas, but when running macros it's limited to one core. On my system, it pegs my core 3 @ ~90% whereas the others hover at ~25-30%.

This bench could show IPC differences between CPUs (I don't know why pook is getting the result he's getting), but I'd like to know how the scores vary when changing CPU and RAM speeds.
 
I made a slightly bigger change to the way the file benchmarks. I've changed how the final data is sorted, since that column is dependent on the data in the trailing columns. The changes that I've made:

1. Truncated the data from the original 60,000-odd rows to just 1,000 rows where every cell is being calculated.
2. You no longer need to close and reopen the file to get a correct score. The file should now be consistent whether you're benchmarking for the first time or the 10th.

With just the above changes, the file takes ~3 times as long to bench as it did before, while also pegging the main thread at ~70-90%. Currently, the cells are only doing additions, divisions, and averages across a small range of cells (a rough sketch of the macro's flow is below). Let me know any other suggestions you'd like to see implemented.
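
To make it easier to suggest changes, here's roughly the shape of the benchmark macro. This is a sketch rather than the actual code; the sheet name "Data" and the sorted range are placeholders:

Code:

Sub Benchmark()
    Dim t As Double
    Application.ScreenUpdating = False
    Application.Calculation = xlCalculationManual

    t = Timer
    ' Sort the raw data, then force every formula to recalculate.
    With Worksheets("Data")
        .Range("A2:F1001").Sort Key1:=.Range("A2"), Order1:=xlAscending, Header:=xlNo
    End With
    Application.CalculateFullRebuild
    t = Timer - t

    Application.Calculation = xlCalculationAutomatic
    Application.ScreenUpdating = True
    MsgBox "Score: " & Format(t, "0.00") & " seconds (lower is better)"
End Sub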

P.S. I'm not an advanced VBA user, so I may not be able to accomplish everything you have in mind, but do let me know and I'll try to incorporate it.

Data1000.zip 1282k .zip file


I let the original file run on my work system the entire night with the changes that I made. My core 3 seems to be pegged at more than 70%, and the rest of the cores sit at ~30%.

 


Currently trying to research more options to spread the load across multiple threads and get better utilization. From what I've gathered these past few days, Excel cannot utilize more than one core/thread while running VBA (mine does use VBA), but general calculations within the sheets with formulas can run in parallel (it's why you see ~25% utilization on the other cores with my sheets).
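
For reference, you can check and toggle the formula-recalculation threading from VBA via Application.MultiThreadedCalculation. A quick sketch (note it only governs formula recalculation; the macro code itself still runs on a single thread):

Code:

Sub ShowCalcThreads()
    With Application.MultiThreadedCalculation
        .Enabled = True
        .ThreadMode = xlThreadModeAutomatic   ' let Excel use all logical cores
        MsgBox "Recalculation threads in use: " & .ThreadCount
    End With
End Sub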

My file seems to be a mix of both. I'm also still looking to incorporate lookups to tie RAM usage in with processor usage (something along the lines of the formula below).
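
Purely as an illustration of the kind of lookup column I mean (the sheet and range names are made up), each row would pull a matching value from elsewhere in the data set, forcing reads across a wider block of memory:

Code:

=INDEX(Data!$F$2:$F$1001, MATCH($A2, Data!$A$2:$A$1001, 0))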

It will take time, since I only find a couple of hours in the evening to spend on this, so I'll be a little slow with updates.

My next goal is to find what gives better utilization across threads as well as to get consistent loads on each core.
 
Quote:
Originally Posted by AlphaC View Post

My observation: the original file, with simply clicking the column filter (column heading) and hitting "sort by smallest to largest", results in a horribly single-threaded test. There's no way an i5 is faster than an i7-6950X.

ir88ed: I'm unsure of the purpose of sorting the columns, but if you need to sort multiple times you might consider using a Pivot Table, or using MATLAB code to process the data in parallel and then spitting it back out into the Excel sheet as a new sheet altogether.
I like this suggestion, but wouldn't that basically be benchmarking MATLAB and not Excel?
 
But how would that affect benching Excel?
 
Quote:
Originally Posted by AlphaC View Post

Quote:
Originally Posted by huzzug View Post

But how would that affect benching Excel?
64 bit might be better


Especially for over 4 GB of virtual RAM...

https://msdn.microsoft.com/en-us/vba/excel-vba/articles/excel-performance-and-limit-improvements
Quote:
Large data sets and the 64-bit version of Excel

The 64-bit version of Excel 2010 is not constrained to 2 GB of RAM like the 32-bit version applications. Therefore, the 64-bit version of Excel 2010 enables users to create much larger workbooks. The 64-bit version of Windows enables a larger addressable memory capacity, and Excel is designed to take advantage of that capacity. For example, users are able to fill more of the grid with data than was possible in previous versions of Excel. As more RAM is added to the computer, Excel uses that additional memory, allows larger and larger workbooks, and scales with the amount of RAM available.

In addition, because the 64-bit version of Excel enables larger data sets, both the 32-bit and 64-bit versions of Excel 2010 introduce improvements to common large data set tasks such as entering and filling down data, sorting, filtering, and copying and pasting data. Memory usage is also optimized to be more efficient in both the 32-bit and 64-bit versions of Excel.
Quote:
Calculation improvements

Starting in Excel 2007, multithreaded calculation improved calculation performance.

Starting in Excel 2010, additional performance improvements were made to further increase calculation speed. Excel 2010 can call user-defined functions asynchronously. Calling functions asynchronously improves performance by allowing several calculations to run at the same time. When you run user-defined functions on a compute cluster, calling functions asynchronously enables several computers to be used to complete the calculations. For more information, see Asynchronous User-Defined Functions.

Multi-core processing

Excel 2010 made additional investments to take advantage of multi-core processors and increase performance for routine tasks. Starting in Excel 2010, the following features use multi-core processors: saving a file, opening a file, refreshing a PivotTable (for external data sources, except OLAP and SharePoint), sorting a cell table, sorting a PivotTable, and auto-sizing a column.

For operations that involve reading and loading or writing data, such as opening a file, saving a file, or refreshing data, splitting the operation into two processes increases performance speed. The first process gets the data, and the second process loads the data into the appropriate structure in memory or writes the data to a file. In this way, as soon as the first process begins reading a portion of data, the second process can immediately start loading or writing that data, while the first process continues to read the next portion of data. Previously, the first process had to finish reading all the data in a certain section before the second process could load that section of the data into memory or write the data to a file.
https://msdn.microsoft.com/en-us/vba/excel-vba/articles/excel-improving-calcuation-performance
Quote:
Drill-down approach to finding obstructions

The drill-down approach starts by timing the calculation of the workbook, the calculation of each worksheet, and the blocks of formulas on slow-calculating sheets. Do each step in order and note the calculation times.
To find obstructions using the drill-down approach:

1. Ensure that you have only one workbook open and no other tasks are running.

2. Set calculation to manual.

3. Make a backup copy of the workbook.

4. Open the workbook that contains the Calculation Timers macros, or add them to the workbook.

5. Check the used range by pressing Ctrl+End on each worksheet in turn. This shows where the last used cell is. If this is beyond where you expect it to be, consider deleting the excess columns and rows and saving the workbook. For more information, see the "Minimizing the used range" section in Excel performance: Tips for optimizing performance obstructions.

6. Run the FullCalcTimer macro. The time to calculate all the formulas in the workbook is usually the worst-case time.

7. Run the RecalcTimer macro. A recalculation immediately after a full calculation usually gives you the best-case time.

8. Calculate workbook volatility as the ratio of recalculation time to full calculation time. This measures the extent to which volatile formulas and the evaluation of the calculation chain are obstructions.

9. Activate each sheet and run the SheetTimer macro in turn. Because you just recalculated the workbook, this gives you the recalculate time for each worksheet. This should enable you to determine which ones are the problem worksheets.

10. Run the RangeTimer macro on selected blocks of formulas. For each problem worksheet, divide the columns or rows into a small number of blocks. Select each block in turn, and then run the RangeTimer macro on the block. If necessary, drill down further by subdividing each block into a smaller number of blocks.

11. Prioritize the obstructions.


Speeding up calculations and reducing obstructions

It is not the number of formulas or the size of a workbook that consumes the calculation time. It is the number of cell references and calculation operations, and the efficiency of the functions being used.

Because most worksheets are constructed by copying formulas that contain a mixture of absolute and relative references, they usually contain a large number of formulas that contain repeated or duplicated calculations and references.

Avoid complex mega-formulas and array formulas. In general, it is better to have more rows and columns and fewer complex calculations. This gives both the smart recalculation and the multithreaded calculation in Excel a better opportunity to optimize the calculations. It is also easier to understand and debug. The following are a few rules to help you speed up workbook calculations.
In particular:
Quote:
Avoid single-threaded functions:

PHONETIC
CELL when either the "format" or "address" argument is used
INDIRECT
GETPIVOTDATA
CUBEMEMBER
CUBEVALUE
CUBEMEMBERPROPERTY
CUBESET
CUBERANKEDMEMBER
CUBEKPIMEMBER
CUBESETCOUNT
ADDRESS where the fifth parameter (the sheet_name) is given
Any database function (DSUM, DAVERAGE, and so on) that refers to a pivot table
ERROR.TYPE
HYPERLINK
VBA and COM add-in user defined functions
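For anyone who wants to try the drill-down timings on this workbook, the Calculation Timers macros boil down to something like the sketch below. It uses VBA's low-resolution Timer function for brevity; the MSDN samples use a higher-resolution MicroTimer built on the Windows QueryPerformanceCounter API:

Code:

Sub FullCalcTimer()
    ' Worst case: rebuild the dependency tree and calculate every formula.
    Dim t As Double
    t = Timer
    Application.CalculateFullRebuild
    MsgBox "Full calculation: " & Format(Timer - t, "0.000") & " s"
End Sub

Sub RecalcTimer()
    ' Best case: recalculate only dirty cells and volatile formulas.
    Dim t As Double
    t = Timer
    Application.Calculate
    MsgBox "Recalculation: " & Format(Timer - t, "0.000") & " s"
End Sub

Sub SheetTimer()
    ' Recalculate just the active worksheet.
    Dim t As Double
    t = Timer
    ActiveSheet.Calculate
    MsgBox ActiveSheet.Name & ": " & Format(Timer - t, "0.000") & " s"
End Sub

Sub RangeTimer()
    ' Recalculate just the currently selected block of formulas.
    Dim t As Double
    t = Timer
    Selection.Calculate
    MsgBox "Selection: " & Format(Timer - t, "0.000") & " s"
End Sub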
I've seen the advantages to using 64-bit, but VBA still works on one thread and seems to be 32-bit. The general functions within Excel are what can utilize more cores. My question is more to do with the current Excel bench, because I doubt your Excel is occupying more than 200 MB of RAM even with 32-bit.
 
Does Libre support VBA? Also, what are your system specs?
 
I still have the test with the entire 60K rows of data, and 10K & 5K versions as well. Let me know if any of you want to run them too. For giggles.
 
I made a few further changes to the workbook.

Data1000.zip 193k .zip file


Also, you need not close the workbook to re-run the bench. It should provide a consistent score.
 


Quote:
Originally Posted by LostParticle View Post

Quote:
Originally Posted by huzzug View Post

I made a few further changes to the workbook.

Data1000.zip 0k .zip file


Also, you need not close the workbook to re-run the bench. It should provide a consistent score.
I cannot extract your zipped file. Here's what I get:

Goofed up. You can try the new link now.
 
Is that bench with the latest file? Something seems amiss. It takes my system about an hour to complete the bench. How are you able to finish it within that time?
 
Are you guys running the file that I updated recently? My results seem way off if you are in fact running that file. I'll incorporate version numbers into it to make it easier to tell which file is being run.
 
Well, my question wasn't because I doubted any of you, but because I'm getting ~2500 secs on the benchmark, which is a pretty significant difference. Mine is Excel 2010. Maybe my system is doing something weird. It's a Sandy Bridge (don't know the model because it's a work machine), but it's running stock.

Edit: Seems I found something. Our IT had 32-bit Office installed on our systems. It seems that's what's causing these variations. Well, does anybody have a formal request I could submit to our corporate IT dept for upgrading Office to 64-bit?
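
In the meantime, a small check like this could be added to the workbook so every posted score also reports the Office version and bitness (a sketch using VBA's Win64 conditional compilation constant):

Code:

Sub ShowBitness()
    #If Win64 Then
        MsgBox "Excel " & Application.Version & " (64-bit VBA)"
    #Else
        MsgBox "Excel " & Application.Version & " (32-bit VBA)"
    #End If
End Sub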
 
Are these benches after the Meltdown & Spectre patches? Does Excel get affected by those patches? Anyone with an older Intel platform who can confirm the performance impact of the patch and give a before and after?
 
You can try to run the file on either OpenOffice or LibreOffice for Mac, but ActiveX isn't possible on Mac.
 