Overclock.net › Articles › About VRMs & MOSFETs / Motherboard Safety with high-TDP processors

About VRMs & MOSFETs / Motherboard Safety with high-TDP processors

 
 
 

The original discussion thread for this article is located at [CLICK HERE].
As the new OCN Articles system is open and editable by anyone, please feel free to correct any mistakes in this article.

-xd_1771, original writer


You will often see me warning users about VRM setups (e.g. STOP, get another motherboard and NOT THIS ONE) because their setup choice is not safe. Since I'm tired of waving my arms and screaming at people about MOSFETs/VRMs in every related thread I see, I've decided to complete this write-up. A local PSU guru once said...

Quote:
Originally Posted by Phaedrus2129;12516806 
VRMs are power supplies just like your system PSU. All the same dangers apply.


How VRMs work:

The VRM (voltage regulator module) contains the PWM controller, MOSFETs and chokes, organized into power phases (channels). These components are responsible for converting the output voltage from the power supply (12V, 5V, 3.3V) to the lower voltages that your CPU uses (e.g. 1.2V, 1.5V). Technically, wherever different voltages are required on the motherboard, a VRM system will be required - there may be several on the motherboard. This article, however, mainly focuses on the CPU VRMs; these are usually located to the left of the CPU socket. The CPU VRMs output the most heat and have to handle the most current of all the VRMs on the motherboard, so they are of particularly high importance.

  • What is a MOSFET? A MOSFET (Metal Oxide Semiconductor Field Effect Transistor) is a part of the voltage regulator module, usually to the left of the CPU socket. The MOSFETs themselves are the switching transistors that convert the 12V input into the Vcore voltage that the CPU uses. This element is crucial because it does pretty much all the power conversion for your CPU; the MOSFETs generate the most heat and are the most fragile parts of a VRM system. Boards that see frequent VRM failures usually have too few MOSFETs or too few power phases to support high power loads, and die as a result of overloading and overheating. This can be mitigated if the MOSFETs are heatsinked and/or actively cooled by a fan. MOSFETs also vary in quality: there are high quality MOSFETs with better design, higher capacity and more load tolerance - and there are poor quality MOSFETs that have lower quality standards and lower ratings.
  • The load is split into several phases or channels. More phases = more reliable operation and less heat. A higher phase count lets the manufacturer use cheaper transistors without sacrificing power output capability, while still resulting in less heat output and lower cost. More phases/channels is usually better. On AMD platforms, higher phase counts (e.g. 8+2) can only be found on ATX boards.
  • The entire process is controlled by PWM (Pulse-Width Modulation), a switching signal that the VRMs use to regulate and clean up the bulk of the power going through. PWM controllers are the chips on the motherboard where this control takes place. PWM frequency and modulation can affect the amount of vDroop (voltage droop) you experience, as well as power delivery stability. High quality analog PWM systems will result in little to no vDroop. In newer digital PWM systems, the PWM signal that controls the VRMs is digitally modulated. This may allow for more stable voltage delivery (i.e. no vDroop). The PWM controller also defines the number of (true) phases that are output.
    Note that vDroop is actually an Intel spec designed to enable power savings and may be enabled on purpose as opposed to having to do with the quality of the VRMs and PWM controller.
  • The type of CPU power connector (4-pin or 8-pin) has nothing to do with the VRMs and phase count - you could have 8-pin with 3+1 phases or 4-pin with 8+2 phases. The choice of 8-pin over 4-pin matters little for the amount of power the connector can deliver, but 8-pin connectors can result in more voltage stability and less vDroop.
  • VRMs play a role in overall system power efficiency. VRMs display similar characteristics to a power supply and likewise have an efficiency level; a larger VRM system (e.g. an 8+2 phase system) would be more efficient at converting the input voltage to the output voltage and waste less power as heat, similar to an 80+ rated power supply. This will also result in fewer amps being pulled from the power supply.
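
To put rough numbers on the points above, here is a minimal Python sketch of the arithmetic. It assumes the load is shared evenly between the CPU phases and treats the 125 W, 1.4 V and efficiency figures as made-up illustration values, not measurements from any board; the function names are hypothetical too.

```python
def cpu_current_a(power_w: float, vcore_v: float) -> float:
    """Current the CPU rail must deliver: I = P / V."""
    return power_w / vcore_v


def per_phase_current_a(total_a: float, cpu_phases: int) -> float:
    """Current per phase, assuming the load is shared evenly (a simplification)."""
    return total_a / cpu_phases


def input_current_a(power_w: float, efficiency: float, vin_v: float = 12.0) -> float:
    """Current drawn from the 12V rail for a given VRM efficiency (0..1)."""
    return power_w / efficiency / vin_v


if __name__ == "__main__":
    # Assumed example load: 125 W delivered at 1.4 V Vcore.
    total = cpu_current_a(125.0, 1.4)
    for phases in (4, 8):
        share = per_phase_current_a(total, phases)
        print(f"{phases} CPU phases: {share:.1f} A per phase")
    # Assumed efficiencies, for illustration only.
    for eff in (0.80, 0.90):
        print(f"{eff:.0%} efficient VRM: {input_current_a(125.0, eff):.1f} A from 12V")
```

The takeaway is only the trend: doubling the phase count halves the current each phase has to carry, and a more efficient VRM pulls fewer amps from the 12V rail for the same CPU load.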

The importance of power phase count

Failures on motherboards with higher phase counts have been relatively infrequent, if they occur at all. Most of the culprits for VRM failures are the lower-end 4+1 phase and 3+1 phase motherboards that aren't equipped to handle processors that consume lots of power and may be overclocked. Systems with 4+1 phases or fewer can be particularly risky because each transistor must carry more current and dissipate more heat. This is why you normally see motherboards with low phase counts failing (i.e. catching fire, frying, overloading), often on motherboards from only certain manufacturers or certain particular models.

 

However, the motherboard brand/maker and their quality control can also define the quality of a VRM system. For example, the majority of 2010-released MSI AMD motherboards with 4+1 phase or similar, heatsinked or not, were of poor quality and prone to failure. This was due to the use of transistors and driver chips that may not have been properly rated, and a lack of VRM over-current protection. However, the Biostar TA890FXE, which comes with a similar-sized 4+2 power phase, was not failure-prone. It featured a high amperage rating per transistor and was completely rock-solid.

 

An 8+2 phase system may not necessarily provide any more current than a 4+1 phase system if the total amperage capacity throughout the VRM system is the same; however, the 8+2 phase system would still do so with more efficiency, more stability, and less heat output. The situation with power phase count is summarized (in case the above was too long and complicated for you) by OCN PSU editor Phaedrus2129:

Quote:
Originally Posted by Phaedrus2129
However, as a practical consideration, many VRMs with more phases can supply more power. I mean, assuming you want to output 64A, it's usually cheaper to use sixteen 8A transistors than four 32A transistors. So more phases makes it cheaper to make the VRM more powerful (usually). So a VRM with fewer phases will often (but NOT ALWAYS) be less powerful, since making it more powerful is more expensive.
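
One way to read the numbers in the quote is sketched below in Python. It assumes (this is an interpretation, not something the quote states) that each phase uses two transistors, one high side and one low side, and that a phase can carry at most the rating of a single transistor. The function name is hypothetical.

```python
def vrm_capacity(num_transistors: int, amps_each: float, per_phase: int = 2):
    """Return (phase count, total amps), assuming `per_phase` transistors
    per phase and a per-phase limit equal to one transistor's rating."""
    phases = num_transistors // per_phase
    return phases, phases * amps_each


if __name__ == "__main__":
    # Sixteen 8A transistors vs four 32A transistors, as in the quote.
    for count, rating in ((16, 8), (4, 32)):
        phases, total = vrm_capacity(count, rating)
        print(f"{count} x {rating}A transistors: {phases} phases, {total}A total")
```

Under those assumptions both layouts reach the same 64A; the difference is that the first one spreads it over 8 phases of cheap parts, while the second needs only 2 phases but much beefier (and pricier) transistors.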

VRM system quality and design

The quality of the VRM system in question and its capability of handling processors that require lots of power usually come down to these things:

  • MOSFET amperage rating denotes how many amps each MOSFET is capable of handling. There are two MOSFETs for each phase: high side and low side. If each of these MOSFETs is rated for a low amperage, a board with a low phase count may be unsafe for power-intensive processors. Many low-phase-count MSI boards use transistors lacking in amperage capability and fail due to over-current.
  • Unfortunately it is not usually obvious what the MOSFET amperage rating is on a given motherboard, and spec sheets will have to be searched for. Transistors usually have their model number imprinted on them in small print. This model number can be searched online to obtain datasheets describing the capacity of the particular transistor, including its amperage rating. For basic reference, the AMD Motherboard VRM Information List tries to compile relevant information about VRM quality.
  • A low quality board can indicate that the MOSFETs are not rated for enough amperage for higher TDP applications.
  • MOSFETs per channel are usually in groups of 3 or 4. On a good quality motherboard you will see at least 2 primary transistors (the MOSFETs themselves) - 1 or more "high side" transistors and 1 or more "low side" transistors - plus one or two other chips nearby called MOSFET drivers.
  • Some motherboard manufacturers (particularly lower end ones that cannot devote as much cost to their motherboards) may choose to cut costs on some models and, instead of a proper driver, use a third (and fourth) transistor chip. This is a sacrifice of quality & reliability. Boards that use an improperly sized third transistor as the driver are often the cause of problems and failures with power-intensive processors. Note that the driver chips are sometimes integrated with the PWM controller in a not so obvious fashion - so don't fret if you happen to think that they are entirely missing on the board.
  • Smaller MOSFETs are usually low RDS (on). Low RDS (on) FETs are much more efficient and run cooler. Other styles of MOSFET may also be low RDS (on), but this may not be obvious.
  • Although motherboards with a higher phase count can use lower quality transistors, this does not mean they will supply less current than a lower phase count motherboard with higher quality transistors. The higher phase count motherboard would have the additional advantages of higher power delivery efficiency and cooler running (resulting in a lower chance of the common failure by overheating).

 

A proper MOSFET design will have two primary MOSFET chips (high side and low side) and one or more driver chips. MOSFETs are rated for a certain amperage; this may not be obvious and will require searching for spec sheets on the internet, hence the existence of the AMD Motherboard VRM Information List to inform the user about VRM quality.
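
Why low RDS (on) matters can be shown with the standard conduction-loss formula P = I² × R. The Python sketch below is a simplification: it only counts the heat made while the MOSFET is fully on, ignores switching losses and duty cycle, and uses made-up resistance and current values rather than figures from any particular datasheet.

```python
def conduction_loss_w(current_a: float, rds_on_ohm: float) -> float:
    """Heat dissipated in a MOSFET while it is on: P = I^2 * R."""
    return current_a ** 2 * rds_on_ohm


if __name__ == "__main__":
    current = 25.0  # assumed amps flowing through one MOSFET
    # Assumed on-resistances in milliohms, for illustration only.
    for milliohm in (3.0, 6.0, 12.0):
        watts = conduction_loss_w(current, milliohm / 1000.0)
        print(f"{milliohm:.0f} mOhm at {current:.0f} A: {watts:.2f} W of heat")
```

Two things fall out of the formula: halving RDS (on) halves the heat, and halving the current through each MOSFET (for example by doubling the phase count) cuts the heat per MOSFET to a quarter.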

Driver MOSFETs

You may have heard of Driver MOSFETs. Driver MOSFETs (also known as DrMOS) integrate the MOSFETs and the driver into one package. While this can result in more efficiency, early driver MOSFETs were more fragile and failure-prone at high voltage and current. This is rather noticeable on the Intel side, where many driver MOSFET boards were failing due to over-current, usually (only) during extreme (sub-zero) overclocking scenarios.


The combination of low phase count and driver MOSFETs on the infamous MSI 790FX-GD70 (and succeeding 890FX-GD70) resulted in a board that was particularly known for VRM failures in high amperage scenarios (e.g. Phenom II x6). The MSI 890FXA-GD65 and new 8+2 MSI motherboards have doubled the number of phases - and driver MOSFETs - resulting in higher possible amperage capability. Although these driver MOSFET motherboards with larger phase counts are still more prone to failure (which has happened on these boards), the problem is not nearly as rampant: with more phases, each driver MOSFET supplies less amperage and runs under less heat and stress.

VRMs on certain platforms

  • On AMD AM2+/AM3 systems: Split power phase. The majority of the phases actually bring power to the CPU, and the auxiliary phase(s) power the integrated memory controller/IMC. This is why VRMs are advertised in phrases such as "4+1" or "8+2" rather than simply 5-phase or 10-phase. In 4+1: 4 phases to the CPU, 1 to the IMC.
  • On Intel LGA1156/LGA1366 boards this is similar.
  • On Intel LGA1156/LGA1155 boards with chipsets that support the integrated graphics, the phases are arranged as, for example, 4+1+1: 4 phases for the CPU, 1 for integrated graphics, and 1 for the memory controller. On LGA1156/LGA1155 boards without integrated graphics support (e.g. P55, P67) and on LGA1366, phases are arranged similarly to AMD (e.g. 4+1, 6+2, etc.)
  • AMD's new sockets FM1 and FM2 work differently; the GPU portion of the APU shares the voltage & power phases with the CPU portion. Therefore the power phases work as the usual 4+1/8+2/etc. on other AMD boards (CPU/GPU + IMC), not 4+1+1 as on Intel boards.
  • Older boards (e.g. pre-AM2+ AMD boards and LGA775) do not feature split power phases; the channels are not separated for items such as the memory controller, either because it did not yet require a different voltage or more power, or because the memory controller simply did not exist on the CPU. These boards are advertised simply as 3-phase, 6-phase, etc. The VRM components were usually somewhat more spread out on these older boards, which actually helps since the heat of one area doesn't spread as easily to another. On FSB-based platforms such as LGA775 the memory controller is on the Northbridge, which takes its power from the main 24-pin connector.
  • Memory (RAM) power is always taken from a separate VRM system linked to the 24-pin connector, not the CPU VRM system.
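
The naming scheme described above can be captured in a few lines of Python. This is only a sketch of the convention as this article describes it (CPU + IMC for two-part names, CPU + integrated graphics + memory controller for three-part names on Intel boards); the function and key names are made up for illustration, and real marketing names do not always follow the pattern.

```python
def parse_phases(spec: str) -> dict:
    """Split an advertised phase string such as '4+1' or '4+1+1' into rails."""
    parts = [int(p) for p in spec.split("+")]
    if len(parts) == 1:
        # Older boards without split power phases, e.g. "6" for 6-phase.
        return {"cpu": parts[0]}
    if len(parts) == 2:
        # e.g. AM2+/AM3 or LGA1366: CPU phases + memory controller phases.
        return {"cpu": parts[0], "imc": parts[1]}
    if len(parts) == 3:
        # e.g. LGA1155 with integrated graphics: CPU + iGPU + memory controller.
        return {"cpu": parts[0], "igpu": parts[1], "imc": parts[2]}
    raise ValueError(f"unrecognised phase layout: {spec!r}")


if __name__ == "__main__":
    for name in ("6", "4+1", "8+2", "4+1+1"):
        print(name, "->", parse_phases(name))
```

Only the first number tells you how many phases feed the CPU cores, which is the figure that matters for the high-TDP concerns in this article.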

What VRM cooling will do for you

VRM cooling is an important part of keeping VRM temps down. VRM cooling is usually placed on the MOSFETs, the active transistors that are the most fragile and run the hottest. Often the VRMs get little to no airflow, so the more heat you can move away from them, the better. What I recommend in terms of cooling the VRMs when running a high TDP processor:

  • Add any sort of VRM heatsink, such as the MOS-C1, if there isn't one already, especially on 4+1 boards, even quality ones.
  • Add active cooling. A small fan, or a Spot Cool, will do. While most VRMs will run safely with heatsinks and no active fan cooling, huge temperature drops have been shown from even really weak 40mm fans that don't push much air.
  • Improve case airflow, e.g. add that top fan in the slot above the VRMs (hot air naturally rises).


OCN member mdocod has found that as of 3 March 2011, at least 71% of the VRM failure incidents in the compiled list of horror stories happened with CPU cooling that deviates from stock cooling. This value may be higher, given the number of incidents where the cooling was not described.

"Stock" CPU cooling is designed to blow down onto the motherboard components, including the VRMs. Aftermarket cooling, which includes tower coolers and any sort of water cooling, usually is not. Remember, the TDP rating on all boards is done with processors at stock speed and with stock cooling. That means your 4+1 phase or even 3+1 phase board (on the AMD platform) may actually be fine for a more power-hungry (e.g. 125W TDP) processor with stock cooling & at stock speed, but deviate from any one of these and you're on your own.

Motherboard TDP ratings and how they relate to VRM quality

A lot of people claim that because their board is rated for a TDP (a measure of processor heat output, but also a rough indicator of power consumption) of 125W-140W, it's still safe to run that processor on that board. It's not just that you should take these ratings with a grain of salt; you should also be reminded that all motherboards are rated for TDP capability with processors at stock speed and with the stock cooler installed.

 

At stock CPU speed and with the stock cooler (air blows past the heatsink fins and onto the board, so some air gets to the VRM area and other motherboard components for cooling) you are within that TDP limit. When you overclock or use any aftermarket CPU cooling that is not downward-blowing, you're then exceeding these limits, which may bring additional heat and instability into the VRMs (though this can be fixed with MOSFET heatsinks and a fan). Overclocking usually goes hand in hand with tower heatsinks that blow across the motherboard rather than down onto it; this removal of VRM cooling may significantly increase the chances of catastrophe. In a sample, 70% of all VRM failure incidents happened with aftermarket CPU cooling installed.

 

Motherboards with lower phase counts and lower rated transistors usually have VRM systems that run hotter and are more prone to failure. Heat causes a lot of VRM problems, including unstable power delivery and even fire hazards. Proper MOSFET/VRM cooling may help, and some boards allow you to monitor VRM temps (e.g. TMPIN2 in HWMonitor on some Gigabyte boards - on your board TMPIN2 may or may not exist, and it may not even be the VRMs). Though different VRM systems may be rated for different temperatures, ideally the VRM temperature should be about the same as the CPU load temperature (i.e. my VRMs load at around 60, with my CPU tagging along at slightly lower than that). Typically, installing proper VRM cooling will allow for higher TDP capability as the VRMs can run under less heat and stress - as a result the TDP is rated higher than usual for stock speed operation.

VRM Over-Current Protection

Over Current Protection (OCP) is something I have recently been examining. Protection features against VRM overheating/overloading exist depending on motherboard model and brand. I believe it is a crucial feature on motherboards today, because this is the function that will protect your VRMs from a catastrophic failure. This is why I have never seen ASUS boards fail even if people take a lowly 3+1 ASUS board and try to overclock a Phenom II x6 on it; ASUS boards feature this technology, as it is a part of the PWM controller design.

 

OCP can work in various ways; one of them is to downclock the CPU speed & voltage - via Cool'n'Quiet or its own function - if the VRM temperatures are detected as too high (similar to when CPU temps are too high), until the VRMs can recuperate and drop in temperature. As a result, it can reduce performance during a full load scenario. It is also how ASUS gets away with rating a few select 3+1 phase AMD motherboards at 125W, though at times the OCP may kick in too often at load even at stock speed with the stock cooler, in which case the rating is arguably slightly improper for the board (there are few if any 3+1 phase boards ready for 125W processors).

 

Another common way is a full board shutdown; if the MOSFETs are suddenly overloaded to the point where immediate shutdown is needed for protection (e.g. beginning an OCCT run on a 3+1 power phase board with a Phenom II x6 OC'ed and at 1.5V), then OCP will kick in and the board will shut down to protect itself. ASRock boards and some Gigabyte boards are known for this.
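
The two behaviours described above (throttle when the VRMs run too hot, shut down on a sudden overload) can be sketched as a simple decision function. This Python example is purely illustrative: real boards implement protection in the PWM controller and firmware, and the thresholds, names and inputs below are hypothetical values chosen for the example, not taken from any vendor.

```python
from enum import Enum


class Action(Enum):
    NONE = "keep running"
    THROTTLE = "downclock CPU speed and voltage"
    SHUTDOWN = "shut the board down"


def ocp_decision(vrm_temp_c: float, phase_current_a: float,
                 temp_limit_c: float = 100.0,
                 current_limit_a: float = 30.0) -> Action:
    """Pick a protective action. Both limits are hypothetical defaults."""
    if phase_current_a > current_limit_a:
        # Sudden overload: immediate shutdown to protect the MOSFETs.
        return Action.SHUTDOWN
    if vrm_temp_c > temp_limit_c:
        # Running too hot: throttle until the VRMs cool down again.
        return Action.THROTTLE
    return Action.NONE


if __name__ == "__main__":
    for temp, amps in ((60.0, 20.0), (110.0, 20.0), (70.0, 40.0)):
        print(f"{temp:.0f} C, {amps:.0f} A per phase -> {ocp_decision(temp, amps).value}")
```

A board with no protection at all is the equivalent of always returning `Action.NONE`, which is where the catastrophic failures come from.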

 

But some motherboards do not feature any sort of OCP. OCN members and I have found that most recent MSI AMD boards feature NO protection of any sort against VRM failure/over-current/over-temperature, and this is likely why a majority of the catastrophic failures in the horror stories list are MSI boards. At the moment I and others have been trying to find out which brands/specific motherboards do use over-current protection, and we are listing them for future reference. Once that is done, take it to heart to purchase a board with OCP for your own safety and for the best confidence in overclocking.

Database of VRM failure incidents

This is a compiled list of all VRM failures that I have found and recorded so far.

[CLICK HERE to view]

How do I tell how many phases this motherboard has?

It is not difficult to tell how many phases there are on a motherboard, for those who are curious. Look at the big black squares, called chokes (they're inductors: boxes containing coils that help filter and smooth the current). If you see 10... that usually means an 8+2. 5... usually a 4+1. Sometimes there are different combinations depending on the platform. Note that the number of chokes will not necessarily equal the number of phases (due to such things as split phasing, defined by the PWM controller); however, a split 4+1 phase with 10 chokes (a split 8+2 power system) is still capable of handling more current than a 4+1 phase with 5 chokes.

 

Also, a list of all AMD motherboards with detailed VRM information including quality/amount of phases can be found here.

What's good and bad

If you are planning to buy a motherboard and want to consider the VRMs, here are some pointers:

AMD Platforms on high TDP (~125W) processor (includes unlocked CPUs):
Remember, you can refer to the AMD Motherboard VRM info database for info about specific motherboards (see link above)

  • Look for at least a quality 4+1 phase design on the board for use with a high TDP processor. Higher is better though.
  • Be SURE it is of quality; that means:
    • Preferably low RDS (on) transistors
    • A proper transistor design and proper transistor amperage rating
    • A brand that is not known for VRM failures.
    • If you are overclocking with a high TDP processor and a 4+1, consider MOSFET/VRM cooling a MUST. Some boards may already have this. Fewer phases will overheat much more easily and be more prone to failure.
  • If you have enough budget to get a board with a better, larger VRM system (e.g. 8+2 phasing or similar) and/or room for a larger board size (mATX boards are typically fitted with inferior VRM designs due to limited space), there is not much need to worry.
    • Cooling is no longer as big an issue because the larger number of (smaller) transistors can run cooler over a larger area, as each one has to handle less current.

 

Intel platforms on high TDP processor:
The rules are similar; however, do pay attention to the overall TDP of the majority of Intel processors. Some platforms top out at 95W; such processors consume less power and may require fewer phases for good functionality, even when overclocked.

Other important VRM system-related resources

What to do if you suspect your VRMs have failed

  1. Unplug everything/cut power to the PC
  2. Check for visible damage (blown caps, missing parts from the mobo, burn marks) [damage might not always be visible]
  3. Use your sense of smell (if they blew it'd be pretty obvious to the nose, and it might smell really bad)
  4. Put out the fire! (If there's any)
  5. Run standard troubleshooting procedures to make sure it's not anything else (e.g. check the power supply)
  6. Try testing the motherboard with the 24-pin plugged in but without the 4-pin/8-pin CPU power plug. This is the ultimate test; if the motherboard only powers on when the CPU power plug is unplugged (though it obviously won't POST), you sir have a VRM failure on your hands.
  7. Report it on the discussion thread! The more VRM horror stories are in the failure database, the more aware this can make people about this overlooked issue.


Remember, not all VRM failures are visible and involve fire & explosions! Boards will sometimes quietly go with a shutdown and not boot again. Sometimes VRM failures can take out other parts, as with PSUs, and sometimes not.


LAST UPDATED: 17th November 2012

Comments (22)

Very informative... gonna get a copy of this one.
Thank you, this was very interesting to read.
Very Cool Guide!
Very interesting indeed, thank you.
That's a lot of MSI!!
Military class my posterior...thanks for the guide, wish I'd never bought the board I have now. =[
wow...glad i decided against an MSI board....nice read
My old Gigabyte GA-880G-UD3H handled the 1100T @3.8Ghz 1.4V perfectly fine. Just putting it out there. It was a 4+1 phase mobo
Thank you so much for assembling this information. It will definitely help me decide when I buy future components.
A great read,very informative,yet easy to read and understand...
Was interested in learning more about VRMs,and how they work,this definitely helped,and cleared out a lot of things...
Great job
Would a faulty but not blown VRM cause Hyper Transport voltage issues? I have an MSI 870A Fuzion that gives me this error (even while not OC'd), and I feel like I might have a ticking time bomb in my rig.
Thanks a lot! Clears a few doubts I had...gonna share it with my AMD buddies.
I read the article about a year ago and acted on it. Although not an over clocker I like reliability in my systems. My wife and I both have ASUS M4A77T/USB3 (AM3) motherboards. I have a Phenom II X4 955BE and she has an Athlon II 640. I added a couple of Cooler Master Hyper 212 Heatsinks with push pull fans and found out the MOSFETS actually ran warmer as there was no direct air "blasting" at board level from the stock coolers. I got 2 packets of Enzotech MOS-C1 C1100 copper heatsinks and the glue that contains silver and put a heat sink on each of the CPU VRM's. They ran much cooler. My wife's PC is in a cupboard so the cooling is overkill but anyone who has a serious computer using spouse knows all too well what happens if it goes bang. I have a couple of other ASUS AM3 boards and they are used in HTPC solutions. My other main PC has a M4A88T-V EVO/USB3 (AM3) which has MOSFET heatsinks and it runs a Phenom II X6 1100T and they get warm to the touch. Anyone who is into overclocking or unlocking cores needs to ensure they buy a board with a heatsink or add aftermarket ones. Consider it part of the job when you add the aftermarket cooler. It could save your CPU and Power Supply.
Wow. I respect your time you put into not just writing this, but sharing information you had to gather yourself. This kind of research would have taken me a lot longer, and probably with less understanding. I have a Gigabyte 990FXA UD5, and I had no idea what the 8+2 split phase meant, or why it was good or bad. It's running 1.54V eloquently, and now I know it's because it can handle the power from my PSU well. very cool.
OK. Supposing I do as this article suggests, then what?
I Guess stay away from msi...
There's more to it than counting 'phases.'
One of the most important parameters is the MOSFET's 'On Resistance' value. The heat produced when the MOSFET is switched on and flowing current is directly proportional to the device's resistance. On my MSI 870A-G54, the switching transistors are NIKOS P0603BGD. These have a rather high on-resistance of around 6 milliohms. This doesn't sound like much, but at ~50amps (they are rated for 70), it equates to 15 Watts of thermal power. On my board, two of these P0603s are run in parallel, with 4 pairs total. Since only one pair/phase is on at any given time, the total power isn't 15x8, it's only 15x2. Nevertheless, this ~30 watts of continuous power is enough to heat the 'VRM' area of the board significantly.

What's wrong with a little heat? Everything! As the MOSFET gets hotter, its on-resistance - you guessed it - increases. The hotter it gets, the more heat it makes. The only thing preventing an immediate runaway meltdown is the heat-sinking ability of the motherboard. The TO252 surface mount package is designed to dump almost all of its heat into the mobo via the surface mount solder junction. So long as the body of the motherboard can dissipate enough of this heat, the mosfet can be kept below the critical (runaway) temperature and all is hunky dory. Aiming a fan at the switching area of the motherboard is a very good idea. (I'm not at all convinced that a stick-on heatsink, applied to the top of the FET, will make much difference. The plastic body of the component does not conduct heat particularly well and a heatsink will not 'draw' off much energy. IMO, simply slapping a HS on your board will not make it safe!! Unless it blocks airflow to the mobo, it probably won't hurt, but it may not help much either.)

As you OC the CPU, the time that the mosfet is kept on increases. More on-time = more heat. This may be OK in the short term, but the mosfet will now be running hotter than before. Now throw in component degradation, which causes a slow increase in resistivity. Guess what? Yep! Higher operating temperatures accelerate this degradation.

As you can see, all the curves are stacked against you. A marginally cooled mosfet is a ticking bomb that will eventually fail. This is what MSI has chosen to install in my board. Why? I don't know. FETs with lower on-resistance also tend to have greater gate and overall device capacitance. (The junction area is physically larger - it can flow more current, but also requires a bigger 'kick' to switch on.) Higher capacitance causes a greater strain on whatever circuit is driving the mosfets. Maybe MSI cheaped out on the driver and was forced to use high resistance, low capacitance FETs? I suspect the choice of components came down to some damnable bean counter who figured they could save 2 cents each by buying the cheaper items. Crap is crap....
Very informative article. Every prosumer must know.
When I used to OC Phenom 955 on my M4A88TDM Evo the Mosfet used to get very hot to touch though I never exceeded 1.375 Vcore.
But now I use FX 8350 on M5A97 R2.0 which has heatsink on it, it does not get hot at all, but is slightly warm, even with 1.4 Vcore. Though the heat sink on NB does get very hot , maybe you have a detailed article on NB cooling as well.
Thanks Very Much!! Helped me buy a couple mobos so far (with the approved list and these tips) cheers