Fixed: SNMP stops responding after policy update

It’s been a while since I’ve been truly excited about a service pack, but I definitely am about Service Pack 1 for Windows Server 2008 R2! For literally years now, I’ve watched SNMP behave erratically on our Windows servers. Originally, we used ipMonitor (before SolarWinds purchased it), and then last year we moved up to SolarWinds Orion NPM. Love the graphs. Love the traffic stats. I get really frustrated, though, when servers just flake out and stop answering SNMP requests…

So, quite ironically, I finally opened a case with SolarWinds last week (a.k.a. five days before we deployed SP1 network-wide). Nothing popped out at us, so we started capturing traffic with Wireshark and Microsoft Network Monitor at various points. Then on Sunday, we pushed SP1, which unbeknownst to us includes the hotfix described in KB980259.

Yesterday, merely 12 hours after the SP1 install, several servers started flagging in NPM as not responding to SNMP, and I decided to dig into the event logs, hoping to see something I might have missed before. On the failing nodes, the event below was logged twice, and its timestamps coincided perfectly with the moment SNMP stopped responding. On other servers it showed up once or not at all…

Log Name:       System
Source:         SNMP
Date:           3/21/2011 1:41:22 PM
Event ID:       1500
Task Category:  None
Level:          Error
Keywords:       Classic
User:           N/A
Computer:       node.domain.com
Description:    The SNMP Service encountered an error while accessing the registry key SYSTEM\CurrentControlSet\Services\SNMP\Parameters\ExtensionAgents.
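
If you want to hunt for this pattern across a pile of servers, here is a minimal Python sketch of how I would go about it. It assumes you have pasted or exported events as plain "Key: value" text like the entry above, with a blank line between events. The function names and the threshold of two hits per node are my own inventions, based only on what we saw on our failing nodes, so treat it as a starting point rather than a proper tool.

```python
import re
from collections import Counter

# The event from this post, used as sample input.
SAMPLE_EVENT = r"""
Log Name: System
Source: SNMP
Date: 3/21/2011 1:41:22 PM
Event ID: 1500
Level: Error
Computer: node.domain.com
Description: The SNMP Service encountered an error while accessing the registry key SYSTEM\CurrentControlSet\Services\SNMP\Parameters\ExtensionAgents.
""".strip()


def parse_event(block):
    """Turn one 'Key: value' event block into a dict of stripped strings."""
    fields = {}
    for line in block.splitlines():
        # Split on the first colon only, so timestamps keep their colons.
        key, sep, value = line.partition(":")
        if sep and key.strip():
            fields[key.strip()] = value.strip()
    return fields


def is_snmp_registry_error(event):
    """True for SNMP event 1500 complaining about the ExtensionAgents key."""
    return (
        event.get("Source") == "SNMP"
        and event.get("Event ID") == "1500"
        and "ExtensionAgents" in event.get("Description", "")
    )


def suspect_nodes(export_text, threshold=2):
    """Return computers with at least `threshold` matching events.

    The default of 2 is an assumption taken from our own failing nodes.
    """
    blocks = [b for b in re.split(r"\n\s*\n", export_text) if b.strip()]
    counts = Counter(
        event.get("Computer", "unknown")
        for event in map(parse_event, blocks)
        if is_snmp_registry_error(event)
    )
    return sorted(node for node, hits in counts.items() if hits >= threshold)


if __name__ == "__main__":
    export = SAMPLE_EVENT + "\n\n" + SAMPLE_EVENT
    print(suspect_nodes(export))
```

Matching on the source, the event ID and the registry key name together keeps unrelated SNMP errors out of the results.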

Interesting… I fired this discovery off to SW support before heading home for the day. They did some searching and came across Microsoft KBs 980259 and 972840, which cover a situation where SNMP stops responding after a Group Policy refresh. Our SNMP configuration is pushed via Group Policy, and apparently after some refreshes the SNMP service can fail to find the registry keys that the policy deletes and recreates. And of course once it fails, it never tries again.

Upon looking at the hotfixes, my coworker and I found that the R2 version is included in SP1, which explains why only non-R2 servers are breaking this week. The non-R2 fix will come in an as-yet-unannounced SP3 for Win2k8 and Vista (I’m not holding my breath).

Anyways, I just figured I’d share this with the world at large in case you, too, are having issues with SNMP and are either too busy to dig deeply (like we were) or simply haven’t come across the fixes. Enjoy!
