It's an old topic really, but from what I've gathered it has been mostly misunderstood and never thoroughly covered.
While mice themselves are only rarely to blame for polling imprecision, I still think the topic deserves a dedicated thread here as it greatly impacts mousing and is an improvable issue.
The whole thing ended up longer than I wanted it to, mostly because I went to probably unnecessary lengths with my explanations, but whatever - I told some people I would post this here. If anything is hard to understand or inaccurate, please don't hesitate to point it out.

What is polling imprecision and how does it affect mousing?
Movement and other input data created within your mouse is grabbed by your PC periodically and made available for the general system and use in applications.
The rate at which data is transferred from your mouse's internal buffer to your PC is called the polling rate. The higher the polling rate, i.e. the lower the polling interval, the less potential delay between movement or other input activity (buttons, wheel) being registered by your mouse and that input being translated into events on your desktop or in an application.
A higher polling rate reduces input latency, with the set poll interval being the maximum added latency. Maximum, because data can enter the mouse buffer at any point in time according to your physical actions, while polling happens at a set rate regardless of the endpoint's buffer activity. That means you can physically actuate the mouse, it will fill its buffer with the corresponding data at point X after that action, and the next poll from the host will grab that data anywhere between X and X + one polling interval.
The timing relation between your physical action and the subsequent poll determines the latency added by polling. That can be mere microseconds or up to one whole polling interval. Optimally you would want to set your mouse to be polled at a rate of 1000Hz - once every millisecond. That way you not only ensure that polling latency is 1 millisecond at most, but you also increase the likelihood of polling latency being only a fraction of that, since more polls per unit of time make it more likely that a poll happens just after your mouse buffer is filled.
Note that this only goes for "single" events where the buffer is written to at an arbitrary time. If the buffer filling rate of your mouse exceeds the polling rate, i.e. if you move your mouse reasonably fast, the maximum polling latency will be met throughout that movement.
On the left side you see single events entering the buffer at arbitrary points in time. Looking at the individual input events, you can see that the first enters the buffer just after the last USB poll, so that one will experience nearly the maximum poll latency of 1ms before the next poll grabs it. The second event enters the buffer towards the very end of the polling interval, just before the next poll is initiated; for that second input, polling latency would be a few microseconds. The third input falls in between two poll processes, so roughly 500 microseconds of polling-induced latency for that one.
On the right side you are moving the mouse around or scrolling the wheel fast, and data enters the buffer faster than once per 1ms. So you will get a sum of those inputs delivered to the PC constantly, once each 1ms, with an average latency of 500us.
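To make the single-event case concrete, here is a minimal Python sketch of the latency math (function and variable names are my own, assuming polls land on exact 1ms boundaries and event times are uniformly distributed):

```python
import random

def poll_latency_us(event_time_us, poll_interval_us=1000):
    """Latency until the next poll grabs an event that entered the buffer
    at event_time_us, given polls at 0, interval, 2*interval, ..."""
    remainder = event_time_us % poll_interval_us
    return (poll_interval_us - remainder) % poll_interval_us

# Single events land at arbitrary times, so latency is uniform in [0, interval):
random.seed(42)
samples = [poll_latency_us(random.uniform(0, 1_000_000)) for _ in range(100_000)]
print(max(samples) < 1000)                       # never a full interval or more
print(400 < sum(samples) / len(samples) < 600)   # averages around 500 us
```

This is only the idealized case; it assumes perfectly timed polls, which is exactly what the rest of the post is about.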
Using the same depiction of poll processes, this is what imprecise poll times would look like:
The time in between polls is not consistent; polls are not evenly distributed over time. Normally, when the CPU cannot address a poll process in time and you thus get an off-timed poll, the system will try to compensate and handle the next poll process as much earlier as the previous one was late, or vice versa.
By the way, that also means that with higher polling rates (and, respectively, lower CPI) you additionally increase the speed at which you have to move the mouse before average polling latency occurs on every poll. For example, at 400cpi @ 500Hz you hit 1ms average latency on your inputs at 500 / 400 = 1.25inch/s, whereas at 1kHz the average of 500us latency on your inputs is only met at 1000 / 400 = 2.5inch/s.
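That threshold is simply polling rate divided by CPI; a tiny sketch of the arithmetic (the function name is my own):

```python
def saturation_speed_ips(polling_rate_hz, cpi):
    """Speed (inches/s) above which the mouse generates at least one count
    per poll interval, so every report carries the average polling latency."""
    return polling_rate_hz / cpi

print(saturation_speed_ips(500, 400))   # 1.25 inch/s at 500Hz, 400cpi
print(saturation_speed_ips(1000, 400))  # 2.5 inch/s at 1000Hz, 400cpi
```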
Polling precision does not affect input latency, or at least that is not the primary reason why you would want to improve it. Polling precision describes how timely the CPU can address an interrupt routine triggered by the USB host controller. You can think of it like framerate vs. frametime: you get 100 frames per second, but not every frame will be finished exactly 10 milliseconds after the last; some take longer to be rendered and others are rendered more quickly to compensate. Analogously, you get 1000 polls per second, but some poll processes take longer and are compensated for with more quickly triggered or addressed subsequent polls.
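This per-poll deviation is essentially what MouseTester's Interval vs. Time graph visualizes. A small Python sketch of the idea, using hypothetical timestamps (the names and values are my own, for illustration only):

```python
def poll_intervals_us(timestamps_us):
    """Time between consecutive polls, from a list of poll timestamps."""
    return [b - a for a, b in zip(timestamps_us, timestamps_us[1:])]

def max_variance_us(timestamps_us, nominal_us=1000):
    """Worst deviation of any interval from the nominal poll interval."""
    return max(abs(iv - nominal_us) for iv in poll_intervals_us(timestamps_us))

# One poll arrives 300 us late; the next is issued early to compensate:
polls = [0, 1000, 2300, 3000, 4000]
print(poll_intervals_us(polls))  # [1000, 1300, 700, 1000]
print(max_variance_us(polls))    # 300
```

Note how the average interval is still exactly 1000us; only the distribution of the polls over time is off.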
Polling variance is in the range of a couple hundred microseconds even on unoptimized systems, and up to a few thousand microseconds on flawed ones.
These are latencies you will not perceive, especially because they are compensated for and more severely mistimed polls are rare and happen amidst thousands of other polls.
What I argue you will perceive, however, is "stuttering". Going back to the frametime analogy: you can have framerates as high as 500fps, and there will still be noticeable microstuttering if frametimes are not consistent.
From my time playing around with this, I personally think the threshold is somewhere in the ~500us range, i.e. if you regularly get polls off-timed by as much as half a millisecond, stutter will be apparent in cursor behaviour and, more significantly, in in-game rotation. Any improvements beyond that still serve a purpose though: for the PC, even 100us is a very significant time span, and getting mouse events processed as quickly as possible will make everything regarding input run more precisely. This matters especially because games themselves put a lot of strain on the system, so any improvement to polling precision in a desktop environment may be necessary to get in-game polling precision to acceptable levels.
Polling precision can also be used as an indicator of how crisp your system is in general.
As you can see, I have optimized my system to be precise up to a maximum polling variance of 5us, with the vast majority of polls being processed correctly timed down to the nanosecond range - possibly even hitting a ceiling determined by the host controller's ability to trigger ISRs, the maximum hardware interrupt frequency of line-based interrupts, the TSC frequency my host controller operates at, or simply the maximum precision of the measurement program itself.
These measurements are taken with MouseTester by microe. Log Start -> circle your mouse fast enough to hit the USB polling rate, but not fast enough for malfunction to occur -> Log Stop. Look at the Interval vs. Time graph and dismiss the noise at the beginning and end of the motion with Data Point Start and Data Point End. You really don't have to move the mouse faster than 1m/s, if even that. Even at 400cpi you hit buffer filling rates of <1ms pretty quickly (polling rate / CPI = speed (inch/s) beyond which you fill the buffer more quickly than it is read from).

Where does polling imprecision come from?
Starting briefly with the fundamentals, how does USB polling work?
Most importantly, the endpoint device is passive. Upon registration on a USB host controller it requests, among other things, a certain protocol to be serviced with, including the polling interval as a bInterval specification. But that's it as far as the device's role in poll timing is concerned. While there is reason to believe the USB protocol allows for mid-operation changes of the bInterval specification for power saving purposes, I have yet to see any mouse utilize it. And when there are changes in the polling interval (when a mouse set to request 500Hz regularly shows 1000Hz readings, for example), it is most obviously a flaw, since the readings occur amidst constant tracking, where power saving features would be misplaced. I have seen this happen with Zowie mice.
Other possibilities of mice affecting polling that I have seen are the internal buffer of the mouse being fed at a rate below the set polling rate (e.g. the MX518, which caps out at around 700Hz), or mice simply not accepting certain polling specifications (seen most often in office-grade mice). Apparently firmware can mess with poll behaviour as well, or so I've recently been made aware by a user on here with a DA3G 1800cpi - most likely related to buffer writing behaviour as well.
It's important to note that issues on the mouse side, like buffer filling inconsistencies or limits, and flaws in the USB communication settings are all things that show up in polling variance as whole milliseconds. The device is still polled at a set millisecond interval, so when one poll returns no data because of the mouse messing up, you get the data returned on a subsequent poll. Thus any poll delay with roots on the mouse side will always be a multiple of milliseconds. Mouse-sided effects on reported poll times are also very obvious in that they are not compensated for (you get extreme values offset in only one direction on the Y axis).

Poll variance on the scale of microseconds and below, on the other hand, always has its roots on the PC side of the process.
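Based on that distinction, one could roughly sort a suspicious interval by its likely origin. A hedged heuristic sketch in Python (the tolerance value is an arbitrary assumption of mine, not a measured threshold):

```python
def likely_origin(interval_ms, nominal_ms=1.0, tol_ms=0.05):
    """Rough heuristic: mouse-side hiccups show as whole missed polls
    (intervals near integer multiples of the nominal interval), while
    host-side imprecision shows as fractional, sub-millisecond offsets."""
    multiple = round(interval_ms / nominal_ms)
    if multiple >= 2 and abs(interval_ms - multiple * nominal_ms) <= tol_ms:
        return "mouse side (missed poll)"
    if abs(interval_ms - nominal_ms) <= tol_ms:
        return "on time"
    return "host side (poll jitter)"

print(likely_origin(2.0))   # a whole poll returned no data -> mouse side
print(likely_origin(1.3))   # fractional offset -> host side
print(likely_origin(1.02))  # within tolerance -> on time
```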
The active role in the polling process is occupied by the USB host controller. Depending on the polling interval specification, it issues timed interrupt schedules to the CPU to handle an endpoint's I/O tasks. The host controller itself is not the source of polling imprecision; it operates according to a clock of its own and is precisely timed to fulfill the USB microframe standard of 125us.
Interrupt service routine spawned by the host controller followed by 6 scheduled deferred procedure calls executed on core 2:
In practice, the endpoint is checked (polled) for states each Xms depending on the bInterval specification. When the mouse buffer contains data, its state is flagged accordingly to let the host know to grab that data within that same poll process. After a successful transfer the host hands the device an ACK or acknowledgement of successful transfer upon which the endpoint buffer is flushed.
The important bit here is that the host issues polls according to the set interval, independent of what the mouse is doing. The USB protocol may allow for more relaxed timing schedules or interrupt priorities when the endpoint has been flagged inactive for a certain amount of time, i.e. has returned no state change after a certain number of polls, but I could not find any detailed information about that, nor is it really important here.
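For reference, the requested bInterval maps to a polling rate as follows (per the USB 2.0 specification: full-speed interrupt endpoints specify the period in 1ms frames, high-speed ones as a power of two of 125us microframes); a small sketch:

```python
def full_speed_poll_rate_hz(b_interval):
    """Full-speed USB: bInterval is the requested period in 1 ms frames."""
    return 1000 / b_interval

def high_speed_poll_rate_hz(b_interval):
    """High-speed USB: period is 2**(bInterval-1) microframes of 125 us."""
    period_us = (2 ** (b_interval - 1)) * 125
    return 1_000_000 / period_us

print(full_speed_poll_rate_hz(1))   # 1000.0 Hz - the typical mouse setting
print(full_speed_poll_rate_hz(2))   # 500.0 Hz
print(high_speed_poll_rate_hz(1))   # 8000.0 Hz - one poll per microframe
```

Mice are typically full-speed devices, which is why 1000Hz is the practical ceiling for them.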
So, there's this precisely timed interrupt schedule spawned by the host controller that the CPU has to satisfy upon being interrupted with an ISR. Any imprecision in polling behaviour will therefore be rooted in the CPU's inability to do so (or, rarely, in the controller's inability to trigger ISRs in time). Enabling the CPU to satisfy strict periodic demands is the main leverage point for getting rid of polling imprecision. We achieve this by either decreasing the amount of tasks the processor has to address (DPCs that compete with those scheduled by the USB host; ISRs prompted by other hardware, which are addressed prior to the DPCs), increasing the rate and efficiency with which the processor addresses tasks, or by switching task orders and priorities around.

How to optimize the system:
First, consider a few more general tips for a cleaner base system:
- Uninstall proprietary HID drivers and software.
- Install the latest chipset and hardware drivers. In regards to PCI Bus, USB filter, USB host controller and sound drivers you will have to see for yourself whether generic drivers supplied by Windows provide better polling and DPC results. Generally you should go the "the more recent the better" route for these.
- Flash your motherboard with the most recent stable BIOS.
- Disconnect any devices or internal hardware components you don't use. This includes things like the CD drive and leaving motherboard headers like the front panel support empty unless you want to use them. Preferably the only device hosted on USB is the mouse.
Same goes for programs; everything installed you don't use is by definition bloat.
- Clean your startups with TechNet's autoruns.
- Disable any integrated motherboard components you don't use from the BIOS. USB 3.0 chip, sound chip, video chip, unused serial ports, ...
- Use the system variable "DEVMGR_SHOW_NONPRESENT_DEVICES 1" to reveal driver corpses of disconnected devices and uninstall everything that's not used.
- Disable from the device manager any USB host controllers except for the one the mouse is hosted on and uninstall third-party USB 3.0 drivers. Uninstall the left-over hubs as well after you disable host controllers.
- With Windows services, I have always kept it simple:
Essential services are set to automatic:
All other services you set to manual. This way Windows and applications can start services according to their dependencies and needs. For instance, you will boot without internet access and a connection will be established once attempted by other tasks or applications (if you want it to be established upon boot, set WinHttpAutoProxySvc to Automatic). I have yet to run into problems with calls for services being ignored or dependencies not being co-launched, but there's a variety of software and associated interactions out there, so if you run into problems you will have to check which services you might have to manually start from the services.msc console for the piece of software to function properly.
The audio services are the only ones that I found do not respond to software calls. You can either set them to automatic or manually start them when you need them:
Some services you might want to specifically disable:
I won't address further basic Windows settings, as this is not supposed to be a guide on how to properly set up Windows. One additional thing to consider for NVIDIA users: the NVIDIA Display Driver Service (nvsvc) can also be set to manual. Normally NVIDIA launches 2 instances of the NVIDIA Display Driver Helper Service (nvvsvc.exe) and one instance of the NVIDIA User Experience Driver Component (nvxdsync.exe), regardless of whether or not you have NVIDIA Experience installed (which I recommend you do not). With the parent service nvsvc set to manual, those will not automatically start with Windows. They won't start when you launch a game either. There are no problems without them; game profile settings are applied properly and so is everything else. *Have to correct myself here: the profile settings are applied properly (confirmed with the NVIDIA Inspector frame limiter), but the game is less responsive without the processes running. Would be interesting to know what exactly they are doing. I recommend you either set nvsvc to automatically start with Windows or refer to the next paragraph.
However, if you do want to use your NVIDIA Control Panel, you just have to right-click your desktop. The NVIDIA shell extension will then start all three processes to provide desktop-level support via the context menu.
Going into more specific settings for the problem we are dealing with, the first thing is to configure the CPU. From the BIOS, disable any dynamic clocking, power saving or sleep features that apply to your processor.
Overclocking is optional and does not always benefit the handling of time-sensitive or multi-threaded tasks. Look online for stable clocks for your processor and the according voltages, as well as RAM timings. If your motherboard supports automated voltage and RAM controls, I have found no reason not to use them. This might depend on the processor and motherboard type, so do your research there.

HPET, or High Precision Event Timer. Let's keep it simple here and say it is a hardware timer and as such enables hardware to interrupt the CPU more frequently than older timers, as per QueryPerformanceFrequency. If your motherboard-CPU combination supports HPET, you should leave it enabled in the BIOS. Windows by default uses other timers and can synchronize and selectively utilize timers for hardware that requests it to do so.
Since Windows can call the HPET functions if needed or requested by hardware, there is no reason to force HPET onto Windows as the default timer with bcdedit /set useplatformclock true. For hardware that doesn't support HPET, Windows in that case would always have to synchronize HPET with the timer the hardware supports.
You should also not specifically disable it with bcdedit /set useplatformclock false, because if hardware does support high precision interrupt frequencies it might function better (as seen for example in hardware that supports MSI mode). Just use bcdedit /deletevalue useplatformclock and let your OS decide on hardware timers. Further reading: http://www.windowstimestamp.com/description

ISRs and DPCs:
Hardware interrupts to the CPU and their more or less delegated tasks.
The OS has no control over their execution. ISRs are triggered by physical hardware signals from devices to the CPU. When a device signals the CPU that it needs attention, the CPU immediately jumps to the driver's interrupt service routine. DPCs are scheduled by interrupt handlers and run at a priority only exceeded by hardware interrupt service routines.
ISR and DPC activity usually increases with system activity. As a system becomes more active, interrupts and DPCs will generally become more frequent, taking up more CPU time. This can get to the point where they visibly (or audibly) affect system performance. In these cases, no single ISR or DPC routine is the problem - it is their cumulative usage of CPU time.
Both DPC latency checkers like LatencyMon and thread activity checkers like DispatchMon are useful here.
You don't want to swamp your CPU with tasks if you care about swift execution times. DPC latency checkers schedule low priority tasks (DPCs) and count the time the CPU needs to address them. The more tasks the CPU is otherwise occupied with, the longer this takes and the more "latency" - as in delay in addressing general tasks - is present on the system.

There are three approaches to helping the processor do things more swiftly:
- Reducing the work load:
Disconnect and disable hardware or software devices, uninstall drivers and software, clean your autostart, and minimize the amount of background or scheduled tasks. All of this means fewer ISRs and DPCs the CPU has to dedicate cycles to, and quicker dedication of CPU time to the tasks you really want addressed.
- Increasing the work power:
Overclock the CPU, disable power management features like power saving, thermal control, dynamic clocking, sleep states, etc. and unpark your CPU cores with the CPUUnparkApp.
Use the high performance power plan. Make sure to go to regedit.exe -> HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Power\PowerSettings and go through all entries, setting "Attributes" DWORD entries you find to "2". This will reveal power settings in the advanced power plan settings otherwise unavailable.
For the high performance power plan you want to disable every power saving feature available, especially PCI power management and USB selective suspend. Reading will help. CPU power management will have a ton of entries. Ignore those; parking is disabled already and minimal CPU frequency will already be at 100%.
- Increasing the work efficiency:
The timer resolution of Windows. To my knowledge this is basically the tick rate of Windows, and it determines the scheduling precision of thread activity. That doesn't mean it affects the rate at which actions are performed on a hardware level, but how often Windows itself interrupts the hardware to check the status of scheduled or periodic operations and either add new tasks or time out others. Setting the timer resolution finer than the default 15.6ms can help applications (not hardware) perform more accurately, in that they are able to access drivers more frequently to have Windows create new or kill old tasks (any task the system may have delegated to the hardware that would otherwise run for an unnecessary amount of time with a longer timer period). TimerTool allows you to lower the timer period to 500 microseconds instead of 15600 microseconds, helping software execution but also keeping the CPU more efficient by killing tasks in a more timely manner should they happen to be scheduled beyond their needs.
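The difference is easy to put in numbers (a trivial sketch, assuming the 15.6ms Windows default and TimerTool's 0.5ms setting):

```python
def ticks_per_second(period_us):
    """How many timer interrupts per second a given timer period yields."""
    return 1_000_000 / period_us

print(round(ticks_per_second(15_600)))  # ~64 ticks/s at the 15.6 ms default
print(ticks_per_second(500))            # 2000.0 ticks/s at 0.5 ms
# Worst case, a timed operation can overrun by up to one full period,
# so the finer setting cuts that overrun from ~15.6 ms down to 0.5 ms.
```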
Task priorities. If you want certain tasks to be addressed more time-sensitively, setting IRQ priorities is a way to tell the CPU which interrupts are more important than others.
msinfo32.exe will tell you which IRQ# is assigned to which hardware component. For hardware that you want the CPU to prioritize, create an IRQ#Priority DWORD entry in your registry under HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\PriorityControl, where # corresponds to the actual number attached to the desired component. Set the value of that entry to 1.
Another potentially viable way to improve the handling of interrupts is to resolve IRQ conflicts. In msinfo32.exe, look for components that share an IRQ# under Conflicts and see whether you can disable any, change IRQs from your BIOS, get your mouse registered on a host controller that doesn't share an IRQ#, or check whether any components support MSI mode. For that, head to HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Enum\PCI and, for every entry that possesses an "Interrupt Management" folder within its Device Parameters, create a new folder called "MessageSignaledInterruptProperties". In it, create a DWORD entry called MSISupported and set it to 1. After you have done that with all devices under PCI, reboot and open your device manager. Hit View -> Resources by connection, expand the interrupt requests and scroll to the bottom. Every component that supports MSI mode will be at the very bottom with negative values. Take note of their hardware IDs and set the MSISupported entry back to 0 for every component that is not among those that support it.
Or use a little application called "MSI_util" that will spare you from having to adventure through your registry yourself.
Some unsorted stuff:

Note that unstable overclocks or mismatched RAM timings affect work efficiency, as do multi- and hyperthreading techniques. People have seen improvements from disabling Intel's Hyper-Threading feature. Disabling cores completely from the BIOS could be experimented with.
Another thing to think about is core affinity. The handling of resource-heavy components that are not known for multithreaded performance can be assigned to a single core - for this topic's purposes, preferably one that does not deal with your USB activity (look at DispatchMon).
For the audio handling specifically, go to your task manager's service tab and locate AudioSrv and AudioEndpointBuilder. Right-click, "Go to Process(es)". It will guide you to the corresponding svchost.exe process containing activity of the audio service, maintaining threads and scheduling tasks. Right-click on that process, "Set Affinity...". You can do this with processes from other drivers, services, or software as well.
Managing process priorities can in theory help (https://msdn.microsoft.com/en-us/library/windows/desktop/ms685100%28v=vs.85%29.aspx). These determine the scheduling pattern for your running software. Software should only compete with essential hardware-sided threads such as mouse input when set to real-time priority, so heed Microsoft's warning not to ever do that. You can experiment with setting other hardware-sided processes to lower priority levels though (such as the audio svchost.exe mentioned above).
You can now see why disconnecting and disabling as much hardware and uninstalling as much software as possible is beneficial: fewer interrupts and tasks the CPU has to deal with, and more cycles dedicated earlier to the stuff you truly use. So apply this logic as far as you deem it worth it. I argue, if you have a dedicated gaming PC, why not go all the way? And in the rare case you do need things like your USB 3 ports, printers or CD drive, Internet Explorer or Windows features, etc., you can just enable them. Even stuff that requires reboots should not be bothersome to enable, with the fast boot times that come with SSDs. Obviously, if you do all of your things on a single main PC, it's for you to decide just how far you want to go.
You will also have to go through drivers and test yourself what kind of effect they have on DPC latencies and polling precision. As I mentioned, for chipsets the latest should be the best because with those manufacturers have no real obligation to release drivers regularly - they only ever do when they see major possible improvements or fixes.
With graphics, network and sound cards and other add-on hardware, that's a different story. LatencyMon will give you a pretty good idea whether drivers are bad in regards to DPC latency and how healthy your USB activity is (look at driver execution times, especially USBPORT.SYS).
Sound components are complex and resource-heavy. I use an onboard chip with generic Windows drivers, but you will have to check for yourself what the least heavy implementation is. Maybe external (TOSLINK, coax, USB) or internal (PCI) alternatives perform better than onboard solutions. USB sound cards likely wreck polling precision, but if you use one (or any USB device other than the mouse, for that matter), try to get it hosted on a different controller than the mouse. If you use your keyboard on USB, consider reducing the keyboard polling rate with hidusbf.sys. Preferably, though, plug your keyboard into PS/2, which as an asynchronous interface requires no periodic polling and thus leaves the CPU alone at times a USB-interfaced keyboard wouldn't.
Other external sound solutions (TOSLINK, coax) still need an active playback device in your PC. I use my onboard chip to optically feed an external amplifier. But since the onboard chip doesn't have to apply digital-to-analog conversion or amplify the signal, I can imagine it operates less demandingly than when using the line-out.
Disable any playback or recording devices you are not using, including any High Definition Audio Controllers in the device manager that come with your GPU.
Maybe PCI add-on USB cards perform even better than those implemented onboard. I haven't tested that. Doubt it, but something you can look at if you have such an extension card.
Disable your antivirus while you are gaming. Or consider not using one to begin with.
Remember that running a game is very resource-heavy and will affect polling precision significantly. That is another reason why frame caps and low-quality / high-performance settings can be useful in games you are looking to improve the experience of.
Switch to 500Hz if you can't at all get 1000Hz stable on your system.
TimerTool has some interesting effects on polling behaviour. After manually triggering it, this is what different settings look like. I won't make any assumptions as to why this happens.
As you can see, basically anything above 0.5/1ms leads to my specific setup jumping between two discrete points of poll address timings - +30/-30us, to be exact. I always set the timer resolution to 0.5ms when I'm gaming.
Here is what totally messed up polling behaviour looks like (my laptop):
It feels horrible. Naturally, weaker base component performance, stricter thermal and power saving features, the built-in screen and so on play a role as well, but mousing instantly feels much better and more controlled when switching to 500Hz instead of 1000Hz. The improvements on that laptop just from reducing the polling rate:
I would like to see results from your setups and possible findings you have playing with TimerTool and tweaking other things.