I am measuring the performance of OpenCL with Java bindings (also known as Java OpenCL or JOCL) on my computer's GPU and comparing it with plain Java on the CPU of the same machine. The code I am testing sums two 2D arrays (2000 rows by 2001 columns) 500 times (mathematically, my code is written as z = x + y + y + y + y + ..., extending to 500). To measure the performance of each platform, I let each program (the OpenCL one and the Java one) run for 20 seconds, calculate the number of FLOPs each program completed in that time, and then compare the two.
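For reference, here is a minimal sketch of how such a count could be set up on the plain Java side. It is not the actual benchmark code: the class and method names are hypothetical, it assumes single-precision `float` arrays, and it assumes "extending to 500" means 500 additions of y per element, so one pass over the arrays costs 2000 × 2001 × 500 = 2,001,000,000 additions.

```java
class FlopEstimate {

    /** Additions in one pass, assuming every element receives `repeats` additions. */
    static long flopPerPass(int rows, int cols, int repeats) {
        return (long) rows * cols * repeats;
    }

    /** Sustained rate in GFLOP/s for `passes` complete passes finished in `seconds`. */
    static double gflopPerSecond(long passes, long flopPerPass, double seconds) {
        return passes * (double) flopPerPass / seconds / 1e9;
    }

    /**
     * Repeats z = x + y + ... + y (`repeats` additions of y per element)
     * until the time budget is spent, and returns the number of complete passes.
     * At least one pass always runs, so the real elapsed time can exceed the budget.
     */
    static long runFor(double seconds, float[][] x, float[][] y, float[][] z, int repeats) {
        long deadline = System.nanoTime() + (long) (seconds * 1e9);
        long passes = 0;
        do {
            for (int i = 0; i < x.length; i++) {
                for (int j = 0; j < x[i].length; j++) {
                    float acc = x[i][j];
                    for (int k = 0; k < repeats; k++) {
                        acc += y[i][j];
                    }
                    z[i][j] = acc; // store the result so the work is not optimized away
                }
            }
            passes++;
        } while (System.nanoTime() < deadline);
        return passes;
    }

    public static void main(String[] args) {
        int rows = 2000, cols = 2001, repeats = 500;
        float[][] x = new float[rows][cols];
        float[][] y = new float[rows][cols];
        float[][] z = new float[rows][cols];
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < cols; j++) {
                x[i][j] = 1.0f;   // arbitrary fill values for the sketch
                y[i][j] = 0.001f;
            }
        }

        long start = System.nanoTime();
        long passes = runFor(20.0, x, y, z, repeats);
        double elapsed = (System.nanoTime() - start) / 1e9; // measured time, not the budget

        long perPass = flopPerPass(rows, cols, repeats);
        System.out.printf("passes=%d, elapsed=%.2f s, total=%.3f GFLOP, rate=%.3f GFLOP/s%n",
                passes, elapsed, passes * (double) perPass / 1e9,
                gflopPerSecond(passes, perPass, elapsed));
    }
}
```

Dividing by the measured elapsed time rather than the nominal 20 seconds matters here, because the last pass only finishes after the deadline has passed.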

I am running my code on a MacBook Pro with an Nvidia GeForce 9400M graphics card and a 2.4GHz Intel Core 2 Duo processor.

For my JOCL program, after running the code for 20 seconds, I calculate about 22 GigaFLOPs; for my Java program, I calculate about 16 GigaFLOPs over the same amount of time. Do these numbers make sense for my computer's hardware and the problem I am trying to solve? If not, can someone steer me toward a way to check whether they are reasonable?
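Since "GigaFLOPs" can mean either a total count of operations or a per-second rate, it may help to spell out both readings of the figures above. The sketch below only does that conversion; the class and method names are hypothetical, and the only inputs are the 22, 16, and 20-second values already quoted.

```java
class ReportedFigures {

    /** Rate in GFLOP/s if `totalGflop` is a total accumulated over `seconds`. */
    static double rateFromTotal(double totalGflop, double seconds) {
        return totalGflop / seconds;
    }

    /** Total GFLOP if `gflopPerSecond` is a sustained rate held for `seconds`. */
    static double totalFromRate(double gflopPerSecond, double seconds) {
        return gflopPerSecond * seconds;
    }

    public static void main(String[] args) {
        double seconds = 20.0;
        double[] reported = {22.0, 16.0}; // JOCL, plain Java
        String[] label = {"JOCL", "Java"};
        for (int i = 0; i < reported.length; i++) {
            System.out.printf("%s: if a total, %.2f GFLOP/s; if a rate, %.0f GFLOP in total%n",
                    label[i],
                    rateFromTotal(reported[i], seconds),
                    totalFromRate(reported[i], seconds));
        }
    }
}
```

The two readings differ by a factor of 400 (20 squared), so which one is meant largely decides whether the numbers look plausible for a given piece of hardware.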

Any help will be greatly appreciated.