Due to some Linux power management bugs I am trying to underclock my retired crypto-mining cards.
For some reason something is overriding my fan speeds and forcing the GPUs to reach 96°C
before the fans go to 100%.
The BIOS will not take them over 40%.
For some reason, if my HDMI cable is plugged in the R9 290 idles at 90 W; pull the cable out and it idles at 17 W.
I have been trying to undervolt the cards, but I think the settings are being overwritten somewhere else,
as the wattage stays the same each time I try, even with lower core speeds.
I was under the assumption that DPM0 was the base value and that 65282, 65283 etc. each just added 25 mV on top per step?
Or should I be entering values manually for each setting?
*edit* OK, made progress by manually entering each voltage.
However, after running two back-to-back benchmarks on the highest settings, 947:1250 @ 1 V, it is now crashing on auto,
even though I benchmarked every setting combination individually and each one passed.
So I guess this has something to do with not using the correct multipliers. Where is a good place to find the ratio table?
I did find some interesting things like 700:120 ran better than 800-900:120
can someone tell me what is causing this between D71000.rom and low180w.rom inside lowv.zip
about to re-flash D71000.rom as a sanity check before increasing voltage again ...
So far I have found:
100 MHz GPU and 150 MHz memory, stock voltage @ 14 W
100 MHz GPU and 150 MHz memory, 800 mV @ 12 W .. no artifacts while benchmarking
100 MHz GPU and 150 MHz memory, 750 mV @ 11 W .. no artifacts while benchmarking
100 MHz GPU and 150 MHz memory, 700 mV .. nope
100 MHz GPU and 120 MHz memory, 750 mV @ 11 W .. no artifacts while benchmarking
I think 120 MHz is as low as the memory can go for a 60 Hz refresh?
Let's try getting the GPU core speed lower again:
50 MHz GPU and 120 MHz memory, 750 mV @ 10 W .. might try lowering the voltage again
I guess I will test HEVC playback first; it will be good if I can watch video with the GPU idling.
Still butter smooth @ 50 MHz,
using 1/10th the power,
and now at ambient temperature on the low profile, so I can work on lowering the idle fan speed.
I might find my lowest voltage for max core speed first, then work out the ones in between.
100 MHz GPU 300 MHz memory 10 W
100 MHz GPU 300 MHz memory 40 W
100 MHz GPU 600 MHz memory 45 W
100 MHz GPU 1000 MHz memory 50 W
100 MHz GPU 1250 MHz memory 65 W
@ 750 mV low state (had a lockup @ 740)
50 MHz GPU 300 MHz memory 10 W
50 MHz GPU 300 MHz memory 35 W
50 MHz GPU 600 MHz memory 40 W
50 MHz GPU 1000 MHz memory 45 W
50 MHz GPU 1250 MHz memory 55 W
@ 925 mV high state: very laggy, cannot hold stock speeds,
and it affects the other displays running the desktop; hoping 950 mV will sort that out
@ 950 - 975 mV high state: unstable
mV was rock solid @ 95°C and scored a higher score @ 180 W?
then a slight config change and it stopped; @ 1025 seemed to be the next stable
Moving on to using my own frequencies and finding economical voltages for them.
I did get it down to 100 W draw under full load with a very small performance loss.
Ideally I want the GPU on the low profile, as I only use it for gaming, and I have to use the peak profile for the fans to work anyhow!
I am using the onboard graphics for displays, as amdgpu seems to activate some power-hungry mode if any displays are plugged in,
so the R9 290 is headless.
Fan speeds set in the BIOS seem to be getting ignored as well,
but at least the power cap is working.
I did have MAX ASIC temp set lower, at 50°C, in the hope that it would speed up the fans, but then someone said the lower that number, the more volts the GPU wants to pull, so I put it back up?
The cards had a hard life; they mined really well for years. I was passing over 330 W through them stock air-cooled and 480 W submerged.
With Vulkan's multi-GPU support I want to use them for gaming on my Linux box.
I wish I knew why all the power-saving features are disabled ...
# sudo cat /sys/kernel/debug/dri/1/amdgpu_pm_info
Clock Gating Flags Mask: 0x0
Graphics Medium Grain Clock Gating: Off
Graphics Medium Grain memory Light Sleep: Off
Graphics Coarse Grain Clock Gating: Off
Graphics Coarse Grain memory Light Sleep: Off
Graphics Coarse Grain Tree Shader Clock Gating: Off
Graphics Coarse Grain Tree Shader Light Sleep: Off
Graphics Command Processor Light Sleep: Off
Graphics Run List Controller Light Sleep: Off
Graphics 3D Coarse Grain Clock Gating: Off
Graphics 3D Coarse Grain memory Light Sleep: Off
Memory Controller Light Sleep: Off
Memory Controller Medium Grain Clock Gating: Off
System Direct Memory Access Light Sleep: Off
System Direct Memory Access Medium Grain Clock Gating: Off
Bus Interface Medium Grain Clock Gating: Off
Bus Interface Light Sleep: Off
Unified Video Decoder Medium Grain Clock Gating: Off
Video Compression Engine Medium Grain Clock Gating: Off
Host Data Path Light Sleep: Off
Host Data Path Medium Grain Clock Gating: Off
Digital Right Management Medium Grain Clock Gating: Off
Digital Right Management Light Sleep: Off
Rom Medium Grain Clock Gating: Off
Data Fabric Medium Grain Clock Gating: Off
Address Translation Hub Medium Grain Clock Gating: Off
Address Translation Hub Light Sleep: Off
GFX Clocks and Power:
150 MHz (MCLK)
100 MHz (SCLK)
662 MHz (PSTATE_SCLK)
1000 MHz (PSTATE_MCLK)
975 mV (VDDGFX)
14.179 W (average GPU)
GPU Temperature: 49 C
GPU Load: 0 %
MEM Load: 1 %
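When comparing runs it helps to pull just the clocks, voltage and wattage out of that dump. A minimal sketch, assuming the output format shown above; `pm_summary` is a made-up helper name, and it reads the text on stdin so it can be tried on a saved copy:

```sh
# Print "SCLK MCLK VDDGFX watts" from amdgpu_pm_info text read on stdin.
# The patterns need a literal "(" right before the name, so the
# PSTATE_SCLK / PSTATE_MCLK lines are not picked up by mistake.
pm_summary() {
    awk '
        /\(SCLK\)/        { sclk  = $1 }
        /\(MCLK\)/        { mclk  = $1 }
        /\(VDDGFX\)/      { vdd   = $1 }
        /\(average GPU\)/ { watts = $1 }
        END { print sclk, mclk, vdd, watts }
    '
}
```

Usage would be something like `sudo cat /sys/kernel/debug/dri/1/amdgpu_pm_info | pm_summary`, which prints the four numbers on one line in the order SCLK (MHz), MCLK (MHz), VDDGFX (mV), average power (W).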
Ideally I want to find the sweet spot for performance per watt for a multi-GPU setup,
so I don't have to buy new cards, as I have a dozen or so R9 290s sitting here doing nothing.
From memory, that extra 50 MHz to go from 950 to 1000 doubled the power consumption.
I also remember the performance difference between 1x and 8x was hardly noticeable, so I am interested in seeing if Vulkan can run 16 GPUs while I have some PCIe breakout boards floating around here.
On Linux I can change the power cap like this (power1_cap takes microwatts, so this sets 180 W):
echo 180000000 | sudo tee /sys/class/drm/card1/device/hwmon/hwmon1/power1_cap
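Since the cap file takes microwatts, a small wrapper saves counting zeros. This is only a sketch, assuming the same card1/hwmon1 path as the command above (the hwmon index can differ between machines and boots); `watts_to_uw` and `set_power_cap` are names made up for this example:

```sh
# Convert whole watts to the microwatt value that power1_cap expects.
watts_to_uw() {
    echo $(( $1 * 1000000 ))
}

# Hypothetical helper: write a cap in watts to a hwmon directory.
# The default path is the one used in this post; pass another as $2.
set_power_cap() {
    watts=$1
    hwmon=${2:-/sys/class/drm/card1/device/hwmon/hwmon1}
    watts_to_uw "$watts" | sudo tee "$hwmon/power1_cap"
}
```

`set_power_cap 180` should then write the same 180000000 as the one-liner above.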
This next benchmark always pushes the R9s hard, as I can hear them squeal instantly, so it is a quick way for me to do stability tests:
DRI_PRIME=1 glmark2 -b terrain
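For back-to-back runs, the benchmark can be wrapped in a small loop. A rough sketch, with `repeat_test` being a made-up name; it only notices the command exiting with an error, so a hard lockup will still just freeze the box:

```sh
# Run a command up to N times; stop and return non-zero at the first failure.
repeat_test() {
    runs=$1
    shift
    i=1
    while [ "$i" -le "$runs" ]; do
        echo "run $i of $runs"
        "$@" || { echo "failed on run $i"; return 1; }
        i=$((i + 1))
    done
    echo "all $runs runs passed"
}
```

For example `repeat_test 5 env DRI_PRIME=1 glmark2 -b terrain` would run the terrain scene five times in a row. Going through `env` keeps DRI_PRIME set for the benchmark regardless of how the shell treats variable assignments in front of a function call.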
Newer BIOSes are now ratio locked, with the core voltage tied to the memory voltage. This was added to avoid the huge number of returns AMD vendors were dealing with due to stupid people dropping voltages too far.
If your card is one of those locked to memory, then you have pointers, NOT mV-specified voltages, i.e. 65288, 65287 etc.
The same rule applies to these pointers as to mV-specified values: they only work in ascending order.
If you want to use the lowest pointer, 65282, across all the P-states, that is fine, though it may not achieve this if the memory is still specified at 1000 mV, since your card's core voltage is now locked to the memory voltage.
So if you want to achieve 850 mV with 65282, your memory voltage needs to be 900 mV, which will drastically cut down the ability to overclock the memory.
Core voltage is usually 50 mV above memory, so if your memory is 1000 mV then your core is 1050 mV etc.
Hmm, maybe this is my problem. How should I redo my voltages?
*edit* Well, after two days' worth of testing:
I have decided on a 100 MHz GPU core at 750 mV. This idles @ 10 W and has a peak load of 13 W. I was able to get it a lot lower with a 50 MHz core, 8 W idle and 10 W load, but then HEVC playback in the low state was pushing it too hard: it was at 100% load and unable to keep up in all situations, while at 100 MHz core it is only 75% load. So a few watts extra for a 200% boost in performance was worthwhile.
I dropped the VDDCI to 875 mV and let the BIOS manage the rest of the voltages. With manual voltages I was able to run the core at stock speed under 100 W with only a 10% loss in performance, but every so often, when switching between frequencies and forcing it to use high GPU core with low-state memory (947:120), it would hang, so I decided to let the BIOS manage it.
Using the peak profile it runs the fans at 100% and will eat up 180 W, but using high mode it will only use 130 W, although the fan will not go over 30% until it hits 96°C. If I manually run full speeds on GPU / RAM / fan it is not much higher, so it is not the fan using the mysterious 50 W of extra power; I guess the BIOS has another voltage table it uses.
Anyhow, I am happier now that I am using 1/10th of the idle power I was before.
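For switching between the low / high / peak modes mentioned above, amdgpu exposes a `power_dpm_force_performance_level` file per card. A hedged sketch: I am assuming "low", "high" and "peak profile" here correspond to the driver's `low`, `high` and `profile_peak` levels, and `valid_perf_level` / `set_perf_level` are names invented for this example:

```sh
# Succeed only if the argument is one of the level names amdgpu accepts.
valid_perf_level() {
    case "$1" in
        auto|low|high|manual|profile_standard|profile_min_sclk|profile_min_mclk|profile_peak|profile_exit)
            return 0 ;;
        *)
            return 1 ;;
    esac
}

# Hypothetical helper: write a level to a card's device directory.
# The default path assumes the R9 290 is card1, as elsewhere in this post.
set_perf_level() {
    level=$1
    dev=${2:-/sys/class/drm/card1/device}
    valid_perf_level "$level" || { echo "unknown level: $level" >&2; return 1; }
    echo "$level" | sudo tee "$dev/power_dpm_force_performance_level"
}
```

So `set_perf_level low` for desktop and video use, and `set_perf_level auto` to hand control back to the driver.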
The cards are GV-R929D5-4GD-B, and there are three different types of RAM in use across the cards, so I cannot use tighter RAM settings while using the same BIOS for all cards.
Now my only issue is still the main reason I started poking around in the BIOS: why does it ignore my fan speeds and keep running the fan low? Even with higher settings in the BIOS it still runs the same speeds as before. The only thing I can think of is that the firmware the kernel loads has its own settings that it reads as well.
On Ubuntu with kernel 5.3.8 something else is trying to take over the fans. I can manually adjust them, then at some stage something else locks me out. It seems that once the BIOS kicks in this other setting takes over, as I can see something forcing the BIOS fan speed down.
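For reference, manual fan control through the amdgpu hwmon files looks roughly like this. A sketch only, assuming the same card1/hwmon1 path as the power cap command earlier; `percent_to_pwm` and `set_fan_percent` are made-up names. `pwm1_enable` takes 1 for manual and 2 for automatic, and `pwm1` takes 0-255:

```sh
# Convert a fan percentage (0-100) to the 0-255 PWM scale, rounding down.
percent_to_pwm() {
    echo $(( $1 * 255 / 100 ))
}

# Hypothetical helper: switch the fan to manual mode, then set a speed.
set_fan_percent() {
    pct=$1
    hwmon=${2:-/sys/class/drm/card1/device/hwmon/hwmon1}
    echo 1 | sudo tee "$hwmon/pwm1_enable"          # 1 = manual control
    percent_to_pwm "$pct" | sudo tee "$hwmon/pwm1"  # 0-255
}
```

After `set_fan_percent 60`, reading `pwm1_enable` back every so often is one way to catch the moment something flips it out of manual mode again.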
I guess this next message is because I am not using 150 MHz:
amdgpu: [powerplay] VBIOS did not find boot engine clock value in dependency table. Using Memory DPM level 0!
This might be the reason for the fan issue:
failed to send message 282 ret is 254
On old cards and kernels, dpm=1 enabled the new DPM; then, when AMD PowerPlay came out, the definition was swapped, so dpm=0 selects PowerPlay and dpm=1 still selects the old power management.