Windows Updates Failing on Server 2008 R2

I’ve seen a really strange error with Windows updates on some 2008 R2 servers where they fail to start downloading and installing updates. They can connect to Windows Update and find available updates, but once you select them and start the process off they fail after a few minutes. I’d tried all sorts, including rebooting and all the usual stuff.

The solution I found was probably the most unlikely thing I’ve ever seen, but here it is. On the notification tray, click the double up arrows and click customize;

Then tick the “Always show all icons and notifications on the taskbar” checkbox and ok out of the dialogue;

This then allowed me to start the updates downloading and installing. I think it’s something to do with Windows update not being able to create the taskbar icon in the notification area and the subsequent balloon notification that says updates are downloading, but that’s just a wild theory of mine, I have no proof that’s what it is.

I know the whole solution sounds a bit mental, but I’ve now done this on a fair number of servers that were refusing to download updates, and it’s always worked.

497 Day Bug

I recently ran into a server on a site that had seemingly stopped servicing any DHCP requests. So after a little digging and checking, the System event log showed the time the DHCP service first started throwing errors, the error here being EventID 1059 “The DHCP service failed to see a directory server for authorisation.” So DHCP seemed to be having trouble talking to a domain controller, which seemed odd, as it was also a domain controller itself. A quick check of dcdiag returned the following output;

Ldap search capabality attribute search failed on server HOSTNAME, return value = 81

So the server had stopped servicing LDAP requests too (and yes, the word “capability” is spelt incorrectly in the error dcdiag returns).

About the same time the DHCP service started reporting problems contacting a domain controller, the Group Policy client also started reporting it was unable to find a DC, which would be expected, as it was trying to contact the server itself and again failing. The Directory Service event log was complaining about replication, but not much else. Checking replication with repadmin /replsummary again threw an error about communication via LDAP.

Running the same command from another machine seemed to show the RPC server was down on the server, which it wasn’t; the service was up. So I checked the RPC ports with a quick netstat -no and was greeted with tens of thousands of ports all in a TIME_WAIT state. That would explain things then: if there are no RPC ports available, various things will start to break. Googling “Ports not closing TIME_WAIT” led me to a hotfix from Microsoft, “All the TCP/IP ports that are in a TIME_WAIT status are not closed after 497 days from system startup in Windows Vista, in Windows 7, in Windows Server 2008 and in Windows Server 2008 R2”.
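If you’d rather have a number than scroll through pages of netstat output, a rough sketch like this counts the TIME_WAIT entries. The function name Get-TimeWaitCount is my own invention for illustration, not a built-in cmdlet; it simply filters whatever lines of netstat output you hand it.

```powershell
# Hypothetical helper (not a built-in cmdlet): counts the lines of netstat
# output that contain the TIME_WAIT state.
function Get-TimeWaitCount {
    param([string[]]$Lines)
    @($Lines | Where-Object { $_ -match '\bTIME_WAIT\b' }).Count
}
```

On the server itself you would call it as Get-TimeWaitCount -Lines (netstat -no). A healthy box will usually show a handful; tens of thousands is the symptom described above.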

And a little further Googling around the problem showed this to not be a problem limited to Microsoft, with various other vendors and products mentioned as affected such as Avaya, Brocade, Cisco, EMC, QLogic and VAX/VMS;

The 497 Day Uptime Bug
497 – The number of the IT beast

From the IBM post linked above;

Basically a 32bit counter used to record uptime will cause this problem when it overflows. If you record a tick for every 10 msec of uptime, then a 32-bit counter will overflow after approximately 497.1 days. This is because a 32 bit counter equates to 2^32, which can count 4,294,967,296 ticks. Because a tick is counted every 10 msec, we create 8,640,000 ticks per day (100*60*60*24). So after 497.102696 days, the counter will overflow.
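The arithmetic in that quote is easy to check for yourself. This is just the sum from the quote written out in PowerShell; the variable names are mine.

```powershell
# 100 ticks per second (one every 10 msec), counted over a whole day
$ticksPerDay = 100 * 60 * 60 * 24          # 8,640,000
# Number of distinct values a 32-bit counter can hold
$maxTicks = [math]::Pow(2, 32)             # 4,294,967,296
# Days of uptime before the counter wraps around
$daysToOverflow = $maxTicks / $ticksPerDay
[math]::Round($daysToOverflow, 6)          # 497.102696
```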

So all that was left was to patch the thing, and while a hotfix is fine, it is fairly old and I did wonder if it had been included in standard Windows updates. Helpfully Microsoft advise that if you have the following security bulletin installed then the hotfix is not needed, suggesting it’s included in the security patch;

Microsoft Security Bulletin MS12-032 – Important

Again though, that’s a pretty old patch itself, and I guessed this must have been rolled into a standard patch at some point. So a quick search of the Microsoft Update Catalog for MS12-032 will show that update and all updates that supersede it, so you can then check if that update KB or any superseding update KB numbers are installed on your system.

If they’re installed, you should be covered against this. If not, keep an eye on your systems: once they get past 497.1 days of uptime, you may find that some services start to fail, like AD DS and other dependent services.
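To save checking by eye, something along these lines would compare the KB numbers you’ve copied out of the Update Catalog against what’s installed. Treat it as a sketch: Test-AnyHotFixInstalled is a made-up helper name, and the list of KB numbers is whatever you collected yourself, so I’ve deliberately not hard-coded any.

```powershell
# Hypothetical helper: returns $true if any of the wanted KB IDs appears in
# the list of installed hotfix IDs (-contains ignores case for strings).
function Test-AnyHotFixInstalled {
    param(
        [string[]]$WantedIds,
        [string[]]$InstalledIds
    )
    foreach ($id in $WantedIds) {
        if ($InstalledIds -contains $id) { return $true }
    }
    return $false
}
```

On a server you could feed it the installed list with -InstalledIds (Get-HotFix | ForEach-Object { $_.HotFixID }). Bear in mind Get-HotFix may not list every kind of update, so a $false result is a prompt to look closer rather than proof the patch is missing.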

Below is a basic PowerShell script I wrote to check, in this example, the domain controllers, to see if any were approaching 497 days of uptime, so they could then be checked in WSUS for the right patches. It prints each server’s name and last boot time, so you work the uptime out from that. WSUS is your friend here when it comes to rolling out fixes like this.

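# Needs the ActiveDirectory module (RSAT) for Get-ADDomainController
Import-Module ActiveDirectory
$dcs = Get-ADDomainController -Filter * | Sort-Object Name
foreach ($dc in $dcs)
{
    $name = $dc.Name
    # LastBootUpTime comes back from WMI as a DMTF date string
    $os = Get-WmiObject -ComputerName $name -Class Win32_OperatingSystem
    Write-Host "`n"
    $name
    # Convert the WMI date string into a normal DateTime
    [Management.ManagementDateTimeConverter]::ToDateTime($os.LastBootUpTime)
}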
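If you want the script to do the sums for you, a small helper like this could turn a boot time into days of uptime and flag anything getting close. It’s only a sketch: Get-UptimeStatus is a name I’ve made up, and the 450 day warning threshold is an arbitrary assumption to leave some headroom before 497.

```powershell
# Hypothetical helper: given a boot time, report the uptime in days and
# whether it has reached a warning threshold. The 450 day default is an
# arbitrary assumption, not a value from Microsoft.
function Get-UptimeStatus {
    param(
        [datetime]$LastBoot,
        [datetime]$Now = (Get-Date),
        [int]$WarnDays = 450
    )
    $days = ($Now - $LastBoot).TotalDays
    New-Object PSObject -Property @{
        UptimeDays = [math]::Round($days, 1)
        NearLimit  = ($days -ge $WarnDays)
    }
}
```

Inside the loop above you would pass it the converted boot time, for example Get-UptimeStatus -LastBoot $bootTime, where $bootTime holds the result of the ToDateTime call.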

Obviously the lesson here is keep your servers updated, but as we all know there are times when that’s not possible. In which case at least this should help you find and fix the ones that might be affected by this.

Group Policy Preferences Processing Order

Just to clarify something people should be aware of: the Group Policy Preferences processing order. Within each CSE (client-side extension), the settings are applied starting at number one and working down the list from there. I know it sounds obvious, but the documentation generally says “starting with the highest”, which I think leaves room for confusion, as “the highest” could be read as the highest number, meaning it finishes with one. That matters in the context of Group Policy, where the last setting applied wins.

Anyway, one and down from there.
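To show why the order matters, here’s a toy simulation. These are made-up objects, not real Group Policy Preference items; it just mimics three items that all set the same thing and applies them in order, starting at one.

```powershell
# Illustrative only: made-up items standing in for three preference items
# that all target the same setting.
$items = @(
    New-Object PSObject -Property @{ Order = 2; Value = 'from item 2' }
    New-Object PSObject -Property @{ Order = 3; Value = 'from item 3' }
    New-Object PSObject -Property @{ Order = 1; Value = 'from item 1' }
)

$setting = $null
foreach ($item in ($items | Sort-Object Order)) {
    # Each later item overwrites whatever the earlier one set
    $setting = $item.Value
}
$setting   # from item 3
```

So item one is applied first, and the item with the biggest number is applied last and wins.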

svchost.exe troubleshooting

If you’ve ever been in a situation where you have a service falling over with no obvious cause, it might be some other service running under the same svchost process causing the failure. As it turns out the Microsoft Performance Team have a very handy guide on svchost troubleshooting.

This covers how to isolate the suspected service into its own process, even going as far as running it with its own svchost process, so it’s easier to see if it really is the service you suspect causing the problem, or something else. In my case I was trying to pin down a crash with the lanmanserver service, and this was very useful.
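For reference, one way I know of to do the isolation is with tasklist and sc.exe, roughly as below. This is a sketch using lanmanserver as the example service; check it against the guide before doing it on anything important, and note the change only takes effect the next time the service starts.

```powershell
# List which services are sharing each svchost.exe process
tasklist /svc /fi "imagename eq svchost.exe"

# Give the suspect service its own svchost.exe process. Use sc.exe, not sc:
# in Windows PowerShell "sc" is an alias for Set-Content.
sc.exe config lanmanserver type= own

# To put it back into a shared process once you're done:
# sc.exe config lanmanserver type= share
```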

Group Policy – Unattended Sleep Timeout

Update: Thanks to a helpful comment from Alain Roy, the answer is here – Sleep unattended idle timeout

There is a Group Policy setting called “Specify the unattended sleep timeout” located here;

Computer Configuration – Administrative Templates – System – Power Management – Sleep Settings

The description given for the policy is;

This policy setting allows you to specify the period of inactivity before Windows transitions to sleep automatically when a user is not present at the computer.

If you enable this policy setting, you must provide a value, in seconds, indicating how much idle time should elapse before Windows automatically transitions to sleep when left unattended. If you specify 0 seconds, Windows does not automatically transition to sleep.

If you disable or do not configure this policy setting, users control this setting.

If the user has configured a slide show to run on the lock screen when the machine is locked, this can prevent the sleep transition from occurring. The “Prevent enabling lock screen slide show” policy setting can be used to disable the slide show feature.

What I want to know is how on earth the system determines when it’s unattended. What if you’re watching a full screen video, is that unattended? What if you’re just running an Excel calculation, is that unattended?

I can find very little information, none in fact, on the Internet on how this is determined, but if anyone knows, please share.

Enabling Individual Settings Within A Group Policy Preference

When you’re creating a set of Group Policy preferences, you can set all kinds of settings in a very similar way to how you would if you were sat in front of the machine. For example, IE settings are very intuitively laid out and it really is just like doing it within the internet control panel screen;

IE Preferences

The key thing to remember though, is that in a lot of cases, just setting the preference isn’t enough, you have to enable it too. So for example, entering a homepage into the preferences panel will not make it actually apply, as you can see the lines underneath it stay broken red, which means that the setting will not get applied;

Setting Not Applied

Although it’s not mentioned within the policy at all, there are keys you can press to enable or disable individual settings, or the entire page, and these are documented on TechNet.

Basically, to enable the homepage setting we saw above, press F6 once you’ve finished entering it. This turns the line under the setting solid green, which means it will now get applied (F7 disables it again, while F5 and F8 enable or disable every setting on the tab);

Applied Setting

So from this you can enable or disable any setting from within the policy, and hopefully take a little more control over your Internet Explorer settings going forwards.

.NET Framework Cleanup – Full Uninstall

I came across a server today that had some horrible problem with the .NET frameworks on it, and none of the updates or service packs from Windows Update would install. I couldn’t remove any of the .NET applications using either the Add/Remove Programs GUI or the correct msiexec strings. I’m not sure how the server came to be like this, but it was a problem I had to sort out and I was a little stumped, until I came across Aaron Stebner’s MSDN blog. He had a post about completely removing the .NET applications in their entirety.

The application works on all versions of .NET and you can find it on his blog here.

Once you’ve run the cleanup tool, reboot, and then you can just install the applications from Windows Update again. The application worked very well and once it was finished, all was well in the .NET world.

Force Password Check With PDC When Login Failure Occurs

I didn’t even know this setting existed as an option within Group Policy, but then again, Group Policy is a bit of a beast at the best of times.

So, the PDC emulator is the domain controller that gets password changes first: a change made anywhere in the domain is passed to the PDC straight away, and it then replicates out to all the other domain controllers. But what if you have a large infrastructure and the password change hasn’t replicated out yet to the domain controller a client is using to authenticate? Well, there’s a policy setting you can apply to your domain controllers that forces them to check with the PDC in the event they deny a logon request due to a bad password. The setting is called “Contact PDC on logon failure”. It is briefly detailed on TechNet and, within the Group Policy editor, lives at the location below;

Computer Configuration\Policies\Administrative Templates\System\Net Logon

Use with caution though. If used on domain controllers on slow WAN links, this will create a lot of traffic to the PDC, and in general it can put a lot of load on the PDC in large environments.

Windows 8 SMBv2 Incorrectly Cached Data From A Server 2008 R2 File Share

I had never noticed this problem when using Windows 7, but since upgrading to Windows 8 I’d seen frequent caching problems from a Windows Server 2008 R2 share. Both mapped drives and UNC paths showed the same out of date cached information, which seemed different to a lot of reports I’d read online that seemed to centre around mapped drives having the problem but not UNC paths. On my server offline files were disabled and the share was set to never be available offline on the server. A reboot provided a temporary workaround, but I needed a more permanent solution.

So after some investigation I started to get the idea that this might be related to caching when SMBv2 was negotiated and used between the client and server. After some further investigation I came across an article on TechNet which pointed to disabling the cache settings within SMBv2 on the client machine.

The fix was to add three DWORD values to the registry and set them all to 0;

FileInfoCacheLifetime
FileNotFoundCacheLifetime
DirectoryCacheLifetime

Add them to the key below;
HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\Lanmanworkstation\Parameters

Once these had been added the caching issue was sorted, and no reboot was required.
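If you’d rather script it than click through regedit, the same change can be sketched in PowerShell as below. I did mine by hand, so treat this as an untested equivalent: it uses the key and value names listed above and needs to be run from an elevated prompt on the client.

```powershell
# Key and value names are the ones listed above; 0 disables each cache.
$key = 'HKLM:\System\CurrentControlSet\Services\Lanmanworkstation\Parameters'
$names = 'FileInfoCacheLifetime', 'FileNotFoundCacheLifetime', 'DirectoryCacheLifetime'
foreach ($name in $names) {
    # -Force overwrites the value if it already exists
    New-ItemProperty -Path $key -Name $name -PropertyType DWord -Value 0 -Force | Out-Null
}
# Read the values back to confirm they were written
Get-ItemProperty -Path $key | Select-Object $names
```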

Using Full DVD/CD Writing Capabilities in a Hyper-V VM

See the note at the end of the post before implementing this.

Normally when you pass an optical drive through from the Hyper-V server to the guest OS in a VM, it acts as a read-only device, even if it has writing capabilities. This is a limitation of Hyper-V.

However, this can be worked around using iSCSI. Microsoft now make their own iSCSI target software available, but unless I’m mistaken that cannot share optical drives. Alcohol 52% Free Edition can set up an iSCSI target that handles optical drives, which you can then connect to from Windows 7.

Alcohol 52%

Install Alcohol 52% on the host OS, ignoring all the crapware and reboot as required.  Then start the application and look down to the bottom towards the list of drives.  Right click on the drive you wish to share and click sharing;

Then highlight the drive and click the new share button.

Give the share a name and highlight “share read” and “share write”, then click ok.

Back on the sharing window go to the options tab and enter a username and password. Make sure the service is started and the startup type is automatic, as this will ensure it still runs after a reboot.

Finally click ok and this completes the section on the host OS.

On the guest OS under Administrative Tools start the iSCSI initiator.  Under the discovery tab click discover portal and enter the IP address of the host OS, leaving the port the same and click ok.

Then head back to the targets tab and click refresh.  The iSCSI device setup should show, highlight it and click connect.  Then on the next window click ok, leaving the details the same.

Finally the iSCSI device should show connected and if you check your connected drives you should see the device listed in My Computer.

Job done.

Note: I salvaged this from a very old version of my blog. I know it applies to Hyper-V in Windows Server 2008 R2, but I’ve not tested it on any version of Server 2012. It will work if you’re in a pinch and need it, but I couldn’t really suggest it for use in a production environment.