This article is based mostly on Intel CPUs. Though the information may carry over to AMD CPUs, I do not claim that anything below is true for AMD CPUs.
Please use common sense, good cooling, and test stability stringently, and you shouldn't have any problems.
How to Set Your Temperatures Correctly / TJ Max Explained
There has been a lot of speculation recently about temperature readings on the Core 2 line: that different programs give different readings, or that certain programs are better for 65nm Quads vs. 45nm Quads. The truth of the matter is that all of the programs are just taking a reading given by the motherboard and using a very basic equation to find the reported core temperature.
Core Temp = TJ Max - Distance to TJ Max.
The motherboard reports the Distance to TJ Max, and the program plugs it into the equation along with its own value for TJ Max (Thermal Junction Maximum). Therefore, if the wrong TJ Max is set, all core temperature readings are wrong. Intel has released the TJ Max values for most of their processors. A program's TJ Max may be wrong, but the Distance to TJ Max will always be correct.
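As a minimal sketch of the equation above (the function and parameter names are my own, not taken from any monitoring program), the arithmetic looks like this in Python. It also shows how a reading taken with the wrong TJ Max could be corrected, on the assumption that the Distance to TJ Max itself is unaffected:

```python
def core_temp(tj_max_c, distance_to_tj_max_c):
    """Core Temp = TJ Max - Distance to TJ Max (all values in degrees C)."""
    return tj_max_c - distance_to_tj_max_c


def correct_reading(reported_temp_c, assumed_tj_max_c, real_tj_max_c):
    """Re-derive a core temperature that was computed with the wrong TJ Max.

    The distance is recovered from the program's (wrong) TJ Max and then
    re-applied to the TJ Max you believe is right.
    """
    distance = assumed_tj_max_c - reported_temp_c
    return core_temp(real_tj_max_c, distance)


# A program assuming TJ Max = 100C shows 65C, so the distance is 35C.
# With a TJ Max of 90C the same distance gives 55C.
print(correct_reading(65, assumed_tj_max_c=100, real_tj_max_c=90))
```

In other words, a wrong TJ Max shifts every reading by the same amount, which is why fixing that one setting fixes all cores at once.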
Intel Source
Intel Atom N270 --- 90C
Notice: B3/L2/M0 steppings were not included directly in the PDF sourced above. I did not notice their exclusion from the PDF, or their subsequent inclusion by Tom's Hardware, from which I took the chart and changed it for the update. I have now removed the B3/L2/M0 steppings from the chart, with the exception of the L2/M0 E1000, which was in the PDF. Intel has said that it was their goal to raise TJ Max in the G0 steppings, so one could infer that the B3/L2 steppings would be the same as the B2 steppings, but there is no definitive word and I will not make assumptions. Also, I would guess that the M0 would be increased like the G0, but again there is no definitive word, as these TJ Max Targets were never released.
Update: Reportedly, Core i7 CPUs read their TJ Max settings off of the die, but some programs are not yet patched to the settings encoded into the die. The average TJ Max seems to be 100C for the Core i7 line thus far, but please PM me with a screenshot if your i7 seems to be showing a different TJ Max.
With this knowledge, you should set each temperature monitoring program with the correct TJ Max.
For Real Temp, simply hit settings and in the new tab window hit Set TJ Max so you can set the correct TJ Max for each core.
For Core Temp, go to Options/Adjust Offsets and set the offset to the difference between the real TJ Max value and the TJ Max that Core Temp shows. In my case it showed 100C for my TJ Max, but the real TJ Max for a G0 Q6600 is 90C. So, 90 - 100 = -10 offset.
For HWMonitor, close HWMonitor and then open hwmonitorw.ini with notepad. Set "CPU_0_TJMAX=" to your TJ Max. (Example: CPU_0_TJMAX=90.0) Save and reopen HWMonitor.
Now all three programs should read the same for each core temperature.
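The Core Temp offset and the HWMonitor edit described above can be sketched in Python. This is only an illustration: the function names are made up, and the assumption that hwmonitorw.ini can be treated as plain KEY=VALUE lines is mine. Only the CPU_0_TJMAX key and the 90.0 example value come from the steps above.

```python
def core_temp_offset(real_tj_max_c, shown_tj_max_c):
    """Offset to enter in Core Temp: real TJ Max minus the TJ Max it shows."""
    return real_tj_max_c - shown_tj_max_c


def set_tjmax(ini_text, tj_max_c, key="CPU_0_TJMAX"):
    """Return ini_text with the line starting with 'key=' set to tj_max_c.

    Assumes a plain KEY=VALUE line layout (not verified against HWMonitor).
    Raises ValueError if the key is missing rather than guessing where to add it.
    """
    lines = ini_text.splitlines()
    found = False
    for i, line in enumerate(lines):
        if line.strip().startswith(key + "="):
            lines[i] = f"{key}={tj_max_c:.1f}"
            found = True
    if not found:
        raise ValueError(f"{key} not found in ini text")
    return "\n".join(lines) + "\n"


print(core_temp_offset(90, 100))                # G0 Q6600 example: -10
print(set_tjmax("CPU_0_TJMAX=100.0\n", 90))     # CPU_0_TJMAX=90.0
```

To use it on a real file, you would read hwmonitorw.ini into a string with HWMonitor closed, pass it through set_tjmax, and write the result back.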
The Other Side of TJ Max
TJ Max is an inherently unreliable and inaccurate method of measuring idle/low temperatures. Due to the way the sensor was designed, the readings are so inaccurate under 50C that Intel says they can only be taken to mean that the temperature is somewhere below 50C. As the temperature approaches TJ Max, the precision increases, and at TJ Max the temperature is considered to be 100% accurate. Because of this error at low temperatures, sub-ambient or very high readings are sometimes given for idle temperatures. With the TJ Max method, your idle temperatures have no accuracy and therefore should be ignored. Under load, the temperatures become much more accurate and should be very carefully monitored.
A good example of the inaccuracy can be shown on the Q6600 (G0) with a TJ Max of 90C. At 90C, the reading is perfectly accurate, but at 50C it is only accurate to plus or minus 10C. Below 50C, the error grows even larger, to as much as plus or minus 30C. Notice that once TJ Max has been reached, nothing can be said about the temperature's accuracy or inaccuracy.
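To make the Q6600 (G0) example concrete, here is a rough Python sketch of the error band. Only three points come from the text: no error at TJ Max, plus or minus 10C at 50C, and plus or minus 30C below 50C. The straight line between 50C and TJ Max is purely my assumption for illustration, and the function name is made up.

```python
def approx_error_c(temp_c, tj_max_c=90.0):
    """Rough +/- error (degrees C) for a reading, using the Q6600 G0 example.

    Returns None above TJ Max, where nothing can be said about accuracy.
    The linear ramp between 50C and TJ Max is an ASSUMPTION, not Intel data.
    Only meaningful for tj_max_c above 50.
    """
    if temp_c > tj_max_c:
        return None
    if temp_c < 50.0:
        return 30.0
    return 10.0 * (tj_max_c - temp_c) / (tj_max_c - 50.0)


for reading in (35, 50, 70, 90):
    print(reading, approx_error_c(reading))
```

Whatever the true shape of the curve is, the takeaway is the same: trust load temps near TJ Max and ignore idle temps.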
The Controversy
The numbers listed above are not the actual TJ Max values for each CPU; they are the TJ Max Target values. Each CPU has its own TJ Max that should be near the values above, but it may not be exact. Each CPU is set individually at the factory, but the TJ Max Target is what they were aiming for, so it is the best estimate of what our core temperatures really are.
The argument then says that you should calibrate your TJ Max by measuring the ambient temperature in the room, setting your CPU to stock speed with stock voltages, and lowering your multiplier to the lowest possible value (usually 6 for Core 2 CPUs). Then adjust the TJ Max until your idle temp readings are about 7C higher than ambient. The problem with this argument is that Intel has implied that the Distance to TJ Max values are not linear, and therefore you would have accurate idle temps (not important) and very inaccurate load temps (very important).
The TJ Max Target values may not be exactly right for your chip, but the calibration method also relies on the idea that the Distance to TJ Max is a linear progression, which Intel implies it is not.
Another problem with calibrating to idle/ambient is that you immediately introduce massive amounts of uncertainty into the readings. The article lists possible values for temperature readings based on cooling type, but there are many reports of exceptionally "hot batches" of CPUs, which would idle much higher than ambient temperature not because of the cooling but because of the heat they dissipate. By saying that my CPU should idle around 8C higher than ambient at stock, I am already guessing at the capabilities of my cooler and at the heat the chip is dissipating. Depending on the cooler and the chip, the uncertainty would just vary too much for my taste (it could be very accurate or it could be very inaccurate).
More Information and better instructions for the Idle/Ambient Calibration Method
CPU Thermal Specification
**WARNING** This is a very highly debated subject, and I must say before I start that this is your processor and running it at whatever temperature you decide is your choice. Also, a nice reminder is that lower temperatures mean higher overclocks. If you let your temps go to the maximum, it can actually cause more instability.
**WARNING**
A very comprehensive list of voltages and thermal specifications for both AMD and Intel can be found here (Courtesy of DennyB). Something to be noted is that Thermal Specification is not the maximum temperature for the cores; it is the maximum temperature that the center of the Integrated Heat Spreader (IHS) should reach.
Quote from Intel
Quote:
Thermal Specification: The thermal specification shown is the maximum case temperature at the maximum Thermal Design Power (TDP) value for that processor. It is measured at the geometric center on the topside of the processor integrated heat spreader. For processors without integrated heat spreaders such as mobile processors, the thermal specification is referred to as the junction temperature (Tj). The maximum junction temperature is defined by an activation of the processor Intel® Thermal Monitor. The Intel Thermal Monitor's automatic mode is used to indicate that the maximum TJ has been reached.
(Courtesy of TwoCables)
On average, the temperature of the heat spreader is about 10-20C lower than that of the cores. This varies from CPU to CPU depending upon the solder. So if the thermal specification of a CPU is 71C, the cores would have to be running at about 81C before you would reach the thermal specification. Remember though, running your temperatures up that high can and will cause more instability. This is why people almost always achieve progressively higher overclocks on water, DICE, and LN2 respectively; CPUs run better colder.
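Since the monitoring programs show core temps rather than IHS temps, it can help to see the 10-20C rule of thumb above as arithmetic. This is a minimal Python sketch with a made-up function name. The 10C and 20C figures are the averages quoted above, not a measurement of any particular chip.

```python
def core_temp_range_at_ihs(ihs_temp_c, low_delta_c=10.0, high_delta_c=20.0):
    """Estimate the range of core temps that corresponds to a given IHS temp.

    Cores are assumed to run low_delta_c to high_delta_c hotter than the
    center of the heat spreader (rule of thumb, varies from CPU to CPU).
    """
    return (ihs_temp_c + low_delta_c, ihs_temp_c + high_delta_c)


# A 71C thermal specification corresponds to cores at roughly 81C to 91C.
print(core_temp_range_at_ihs(71.0))
```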
Another interesting idea is that the thermal specification (Tcase) may increase linearly as power increases. In fact, for the Q9000 series, T = 0.28P + 44.8, where T is temperature in degrees C and P is power in watts.
Source Page 79
This suggests that at higher overclocks the thermal specification of the CPU may be higher than normal, so it may be even harder to hurt your CPU with temps. It has been well documented that CPUs are resilient to heat and can withstand harsh conditions without degrading.
Example
Again, higher temperatures cause instability and will lower your maximum overclock.
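The linear Tcase relation quoted above for the Q9000 series is easy to evaluate for any power figure. The Python sketch below uses a made-up function name, and the 95 W and 120 W inputs are just illustrative values, not measured power draw at any overclock.

```python
def q9000_tcase_max_c(power_w):
    """Tcase limit in degrees C from T = 0.28P + 44.8 (Q9000 series)."""
    return 0.28 * power_w + 44.8


for watts in (95, 120):
    print(watts, round(q9000_tcase_max_c(watts), 1))
```

Whether the relation still holds beyond the power range covered in Intel's document is an open question, so treat values for heavily overclocked chips as a guess.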
When will you begin to thermally throttle? **Warning Technical**
You will not begin to throttle because of temps until you are approximately 20C above Thermal Specification. For example, the Thermal Specification (Tcase) of the Q6600 G0 is 71C. At approximately 91C, it will begin to thermally throttle. This explains why the TJ Max Target is 90C; Intel is trying to make the sensor most sensitive at the point where the CPU could be damaged. TJ Max Target is not the point of damage, but it is close to the THERMTRIP# point (the point when the CPU automatically throttles and shuts down until proper temperatures are restored). As discussed earlier, thermal specification (Tcase) is actually the temperature of the center of the IHS, and its temperature is usually 10C below core temps, so there is actually a 10C buffer between Distance to TJ Max = 0 and permanent silicon damage. In effect, Intel is allowing for no difference between the temperature of the cores and the temperature of the IHS, and is thus aligning Tcase (without buffer) with TJ Max. Remember, however, that depending upon the solder the difference between Tcase and core temps is usually 10C; Intel is simply being safe and not allowing for any difference even if it exists.
If you look at most CPUs' Tcase values, their TJ Max Targets are almost always 20C above their Thermal Specifications (Tcase), which matches this concept perfectly.
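The "roughly 20C above Tcase" pattern can be checked with a couple of lines of Python. This is a sketch with made-up names. The Tcase and TJ Max Target figures used (71C and 90C for the Q6600 G0, 60.1C and 80C for the E6600) are the ones quoted in this article, and the 20C margin is approximate.

```python
THERMTRIP_MARGIN_C = 20.0  # "approximately 20C above" Tcase, per the text


def approx_trip_point_c(tcase_c):
    """Approximate silicon temperature at which throttling/shutdown kicks in."""
    return tcase_c + THERMTRIP_MARGIN_C


def gap_to_target_c(tcase_c, tj_max_target_c):
    """How far the approximate trip point sits above the TJ Max Target."""
    return approx_trip_point_c(tcase_c) - tj_max_target_c


print(approx_trip_point_c(71.0))               # Q6600 G0: 91.0
print(round(gap_to_target_c(60.1, 80.0), 1))   # E6600: 0.1
```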
Intel Technical Quote
Quote:
In the event of a catastrophic cooling failure, the processor will automatically shut down when the silicon has reached a temperature approximately 20°C above the maximum Tc. Assertion of THERMTRIP# (Thermal Trip) indicates the processor junction temperature has reached a level beyond where permanent silicon damage may occur. Upon assertion of THERMTRIP#, the processor will shut off its internal clocks (thus, halting program execution) in an attempt to reduce the processor junction temperature. To protect the processor, its core voltage (VCC) must be removed following the assertion of THERMTRIP#. Driving of the THERMTRIP# signal is enabled within 10 μs of the assertion of PWRGOOD (provided VTT and VCC are asserted) and is disabled on de-assertion of PWRGOOD (if VTT or VCC are not valid, THERMTRIP# may also be disabled). Once activated, THERMTRIP# remains latched until PWRGOOD, VTT or VCC is deasserted. While the de-assertion of the PWRGOOD, VTT or VCC signal will de-assert THERMTRIP#, if the processor's junction temperature remains at or above the trip level, THERMTRIP# will again be asserted within 10 μs of the assertion of PWRGOOD (provided VTT and VCC are valid).
Source Page 72
Example.
The E6600 Thermal Spec is 60.1C, and allowing roughly 20C until silicon damage gave it a Target of 80C. Everyone knows how poorly stock heatsinks perform, and if you had a cooling failure, it might not kill the CPU if it could just about boil water before taking damage. Over the years, Intel has refined their craft, and on the same architecture size (65nm in this case) they were able to increase the Thermal Specification of the G0 to 71C with a TJ Max of 90C. It could also have been a change in the fabrication process going from B2 to G0 that increased the maximum operating temperature before damage.
The 45nm fabrication process could also have caused an increase in silicon heat resistance. If you look at the TJ Max Targets for almost all 45nm Intel CPUs, they are all higher, which according to that PDF means they can take higher temps. That's not conclusive evidence that Intel is working on making CPUs more durable, but why wouldn't they?
The exception is Extreme Edition CPUs, which are by most accounts designed for running on extreme cooling. For example, the QX9770 has a thermal specification of only 55.5C! This might be called heresy, but it's very possible that Extreme Edition CPUs are designed to run at extreme speeds but are actually made with lower quality silicon or a lower quality manufacturing process, as there is no need to make them stand up to high temps.
What about Maximum Voltages and Overclocking?
If you run at or below the Absolute Maximum Voltages for your CPU, you should never experience degradation or loss of life on your CPU. Overclocking will not decrease the lifetime of your CPU if and only if certain criteria are met.
1.) Electrical Specification must be satisfied (1.55v for 65nm Core 2 Series and 1.45v for 45nm Core 2 Series; 45nm Core i7/i5 list 1.55v as their maximum, and 32nm Core i5 list 1.40v). Make sure you check all other voltage specifications for VTT and CPU PLL. (DRAM voltage on the i7 is a different story.)
2.) Signal Quality must be clear (Overclock must be perfectly stable, GTL lanes may need to be tweaked)
3.) Mechanical specifications met (There is not a physical defect and the insides have not previously been gutted by running 1.9v through it)
4.) Thermal Specifications must be satisfied (The IHS temp must be below Tcase)
According to Intel, "Within functional operation limits, functionality and long-term reliability can be expected." This says nothing about running over stock voltages or stock clocks. Only that you need to be stable, cool, below the maximum voltage, and mechanically sound at any speed.
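The four criteria can be written down as a simple checklist. The Python sketch below is only an illustration: the dictionary keys and function name are made up, and the voltage figures are the ones listed in criterion 1 of this article. Stability and mechanical soundness cannot be computed, so they are passed in as your own yes/no judgment.

```python
# Maximum core voltages as listed in criterion 1 (volts).
# The keys are labels invented for this sketch.
MAX_VCORE = {
    "65nm Core 2": 1.55,
    "45nm Core 2": 1.45,
    "45nm Core i7/i5": 1.55,
    "32nm Core i5": 1.40,
}


def failed_criteria(family, vcore, ihs_temp_c, tcase_c, stable, mechanically_sound):
    """Return the names of the criteria that are NOT met (empty list = all met)."""
    failures = []
    if vcore > MAX_VCORE[family]:
        failures.append("electrical")
    if not stable:
        failures.append("signal quality")
    if not mechanically_sound:
        failures.append("mechanical")
    if ihs_temp_c >= tcase_c:  # the IHS temp must be below Tcase
        failures.append("thermal")
    return failures


print(failed_criteria("65nm Core 2", 1.45, 60.0, 71.0, True, True))    # []
print(failed_criteria("45nm Core 2", 1.50, 75.0, 71.0, False, True))
```

Note that the check only covers core voltage; VTT, CPU PLL, and DRAM limits would need their own entries.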
Interestingly enough, criterion 2 (Signal Quality) implies that unstable overclocks cause CPU degradation. This actually makes sense if you think of the unstable/inappropriate signals that cause BSODs and other errors as also causing physical damage while they move through the CPU at random.
For the exact wording in the Intel document
Quote:
Absolute Maximum and Minimum Ratings
Table 2-2 specifies absolute maximum and minimum ratings only and lie outside the
functional limits of the processor. Within functional operation limits, functionality and
long-term reliability can be expected.
At conditions outside functional operation condition limits, but within absolute
maximum and minimum ratings, neither functionality nor long-term reliability can be
expected. If a device is returned to conditions within functional operation limits after
having been subjected to conditions outside these limits, but within the absolute
maximum and minimum ratings, the device may be functional, but with its lifetime
degraded depending on exposure to conditions exceeding the functional operation
condition limits.
At conditions exceeding absolute maximum and minimum ratings, neither functionality
nor long-term reliability can be expected. Moreover, if a device is subjected to these
conditions for any length of time then, when returned to conditions within the
functional operating condition limits, it will either not function, or its reliability will be
severely degraded.
Although the processor contains protective circuitry to resist damage from static
electric discharge, precautions should always be taken to avoid high static voltages or
electric fields.
...
NOTES:
1. For functional operation, all processor electrical, signal quality, mechanical and thermal
specifications must be satisfied.
Source - Page 19
This may bring up the idea of OC Fade, but I have not seen an example of it on a completely stable (Prime95, LinX, Folding, etc.) 65nm Core 2 Series at or below 1.55v, or on a 45nm Core 2 Series at or below 1.45v.
Quote from Intel to keep in mind
Quote:
Moreover, if a device is subjected to these conditions for any length of time then, when returned to conditions within the functional operating condition limits, it will either not function, or its reliability will be severely degraded.
Please use common sense, good cooling, and test stability stringently, and you shouldn't have any problems.
Cheers,
ChickenInferno