Originally Posted by ryokoseigo
So turns out I did have a spare PSU, I thought I sold it hahaha. And even more surprising, I had a single paperclip in my work bag. (Is it 100% safe to have the paperclip as a jumper semi long term? I'm not even sure if current goes through it or not.) I currently have only 1 GPU running off of the main power supply now. Will test tonight if I can. I checked if the EVGA card has VRM monitor temps, and I believe it does, although I'm not 100% sure I know which temp it is. Either way, the highest temp is 71C, which simply isn't very high. It's convenient if it's the GPU, as it's obviously the one that's driving the display, and well, RMA is easy and I think EVGA offers parallel RMAs. However, if it doesn't crash, then I guess the only culprit that makes sense is the PSU. Also under warranty, but unless Corsair offers similar RMA methods, it's a real bummer to have to completely swap PSUs for a temporary situation. I guess I could still technically blame the mobo VRMs, but I just don't get how that would be the case. MPC uses less CPU than streaming a video in browser in most cases, so it seems mostly nonsense that an HDR video would push the VRMs too hot while literally doing anything else on the CPU doesn't.
It is definitely safe to run with the paperclip in long term, jumping the green wire's pin to any black wire's pin on the 24-pin connector. Green is the PSU's power-on pin (PS_ON), and it works by being shorted to ground. Only a tiny signal-level current flows through the clip; all it does is complete the circuit that tells the PSU to turn on, and breaking that circuit is what tells the PSU to turn off. The only danger in running with a paperclip long term is accidentally shorting something else to ground by touching the paperclip with some other live conductor. As long as you are careful not to do that, you are all good!
Yeah, my leading theory right now is definitely the VRM on the GPU. If you have HWiNFO open, the VRM temps are the ones under EVGA iCX, labeled PWR1 through PWR5. While 71C is definitely within spec (I believe I remember mine getting somewhere near that too before I put a full-cover water block on it), we have to remember that this is just a spot check, and not necessarily the hottest temperature on the VRM. Also, depending on where EVGA placed these sensors, it could still be significantly hotter inside the component. It's really hard to say without being able to diagnose it myself though; they could be completely fine. I just find it weird that every time you OC the card and then put it under a workload, your issue happens.
Also, I wouldn't count out your board's VRM just yet. While I agree it seems unlikely at this point, the type of issue you are experiencing is right in line with a protection trip: OCP, OVP, or OTP (over-current, over-voltage, or over-temperature protection). And when these protections are tripped on Ryzen chips, a good amount of the time it's actually the SOC VRM that is the issue, simply because it typically doesn't need to be overbuilt like the CPU side. Under certain memory-heavy tasks the SOC can get stressed, especially when pushing up the IF (Infinity Fabric) clock, so it is definitely possible that this is the issue. But I would only investigate that further if your testing shows it's definitely not the GPU.
If I were you, I would be trying to replicate the exact same situation that makes the system shut down, with the only difference being that the graphics cards are plugged into the extra PSU. For troubleshooting's sake, you want to recreate the exact conditions that caused the problem with just one change, in this case the PSU feeding the cards. This way, depending on the outcome of the test, you can be reasonably sure what the problem is, or at least what the problem is not.

If the problem happens again, it's likely that it's not your PSU, although there is one more test I would run to make sure: plug the EPS and 24-pin into the extra PSU as well. This takes your main PSU out of the equation, and you can see if the problem happens again with everything plugged into the extra PSU. If the problem still happens after you have moved everything to the extra PSU, then I would definitely focus on your graphics cards as the issue. If, however, you get through a day without issues while testing with just the graphics cards plugged into the extra PSU, then you can be reasonably sure your graphics cards aren't the issue, and start to focus your energy on your PSU, mobo, or CPU.

Basically just troubleshooting 101. Swap out parts, one by one, until the problem is eliminated, then isolate the part to make sure it's repeatable. Once you have found your part, if it ends up being your GPU for instance, there is a chance you can fix it on your own just by opening it up and examining it. But we will cross that bridge if/when we get there.
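If it helps to see the test plan above laid out as a decision table, here is a rough sketch in Python. The function name and the wording of the results are made up for illustration. The last branch (crashes with only the cards on the extra PSU, but not with everything on it) is not something I spelled out above; pointing at the main PSU there is my assumption.

```python
from typing import Optional


def next_suspects(crash_gpus_on_extra: bool,
                  crash_all_on_extra: Optional[bool] = None) -> str:
    """Map the two PSU swap test results to where to look next.

    crash_gpus_on_extra: did the shutdown happen with only the graphics
        cards powered from the extra PSU?
    crash_all_on_extra: did it happen with the EPS and 24-pin moved to the
        extra PSU too? None means that second test has not been run yet.
    """
    if not crash_gpus_on_extra:
        # A clean day with the cards on the extra PSU: cards look fine.
        return "PSU, mobo, or CPU"
    if crash_all_on_extra is None:
        # Crashed once; confirm by taking the main PSU out of the equation.
        return "run test 2: move EPS and 24-pin to the extra PSU"
    if crash_all_on_extra:
        # Still crashes with the main PSU fully out of the loop.
        return "graphics cards"
    # Assumption (not stated in the post): the crash went away once the
    # main PSU was removed entirely, so the main PSU is the likely suspect.
    return "main PSU"


if __name__ == "__main__":
    print(next_suspects(True))         # second test still needed
    print(next_suspects(True, True))   # graphics cards
    print(next_suspects(False))        # PSU, mobo, or CPU
```

The point is just that each test changes one thing, so each result rules something in or out.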