Mark Gilbert's Tech Blog

Windows Updates Failing on Server 2008 R2

I’ve seen a really strange error with Windows updates on some 2008 R2 servers where they fail to start downloading and installing updates. They can connect to Windows Update and find available updates, but once you select them and start the process off they fail after a few minutes. I’d tried all sorts, including rebooting and all the usual stuff.

The solution I found was probably the most unlikely thing I’ve ever seen, but here it is. On the notification tray, click the double up arrows and click customize;

Then tick the “Always show all icons and notifications on the taskbar” checkbox and ok out of the dialogue;

This then allowed me to start the updates downloading and installing. I think it’s something to do with Windows update not being able to create the taskbar icon in the notification area and the subsequent balloon notification that says updates are downloading, but that’s just a wild theory of mine, I have no proof that’s what it is.

I know the whole solution sounds a bit mental, but I’ve done this on a fair number of servers that were playing up with regards to downloading updates now and it’s always worked.

Problems Loading Windows Update on Server 2000 and Server 2003

I recently had the misfortune of having some really old Server 2000 and Server 2003 boxes thrown my way that needed patching, and Windows Update was not loading in Internet Explorer 6 when it should have. Both servers gave slightly different error codes, but ultimately the rather quick fix was to go into Internet Explorer, and in the Tools menu, into Internet Options. Then in the Advanced tab, under Security, make sure that TLS 1.0 is enabled, which in the case of these two servers it was not.

For good measure I also disabled SSL 2.0 and 3.0, as those really should have been turned off by now. After this was done, a quick restart of the browser allowed me to get to Windows Update again.

Some Files Not Being Replicated By DFSR

I recently came across a problem within a DFSR replicated folder where some files were not being replicated between the folders. After a bit of checking to make sure the file types were not on the excluded list, I concluded that these could be temporary files, after seeing this mentioned on a couple of forum threads.
Checking the files in explorer or using attrib.exe did not show any temporary attributes set, however checking with fsutil.exe did show a temporary attribute. The command to run is;

fsutil usn readdata "filename"

When I ran the command I got the following output;

PS L:\> fsutil usn readdata "Camera log.xlsx"

Major Version    : 0x3
Minor Version    : 0x0
FileRef#         : 0x000000000000000000050000000e5dd6
Parent FileRef#  : 0x0000000000000000000100000000011a
Usn              : 0x00000002c60e7378
Time Stamp       : 0x0000000000000000 00:00:00 01/01/1601
Reason           : 0x0
Source Info      : 0x0
Security Id      : 0x0
File Attributes  : 0x120
File Name Length : 0x1e
File Name Offset : 0x4c
FileName         : Camera log.xlsx

The output here includes the file attributes on the file. The file attributes field is a bitmask that shows exactly what combinations of attributes have been set on the file in question. In my case shown here, 0x120 would be 0x20 (Archive) and 0x100 (Temporary) giving a bitmask of 0x120 for the file attributes.
Microsoft have an “Ask the Directory Services Team” blog post about this, listing all the possible values you can have in the file attributes field, but the short answer is that 0x100 is the temporary value, and if your bitmask includes the temporary attribute then the file won’t be replicated by DFSR.
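To make the bitmask a bit less abstract, here is a minimal Python sketch of how a value like 0x120 breaks down into individual attributes, and what clearing just the temporary bit looks like. The flag values are the standard Windows file attribute constants (only a subset is listed); the function names are my own and purely illustrative.

```python
# Subset of the Windows file attribute flags (standard constant values).
FILE_ATTRIBUTES = {
    0x0001: "ReadOnly",
    0x0002: "Hidden",
    0x0004: "System",
    0x0010: "Directory",
    0x0020: "Archive",
    0x0100: "Temporary",
    0x0200: "SparseFile",
    0x0400: "ReparsePoint",
    0x0800: "Compressed",
    0x1000: "Offline",
    0x2000: "NotContentIndexed",
    0x4000: "Encrypted",
}
TEMPORARY = 0x0100


def decode_attributes(mask):
    """Return the names of the known flags set in mask, lowest bit first."""
    return [name for bit, name in sorted(FILE_ATTRIBUTES.items()) if mask & bit]


def clear_temporary(mask):
    """Return mask with only the Temporary bit switched off."""
    return mask & ~TEMPORARY


mask = 0x120  # the File Attributes value from the fsutil output
print(decode_attributes(mask))     # ['Archive', 'Temporary']
print(hex(clear_temporary(mask)))  # 0x20
```

Masking with the inverse of 0x100 leaves every other attribute bit as it was, which is the same idea the PowerShell one-liners use.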

If you’re looking to just remove the attributes for the file in question then the following command in PowerShell will do it;
Get-childitem ".\Camera log.xlsx" | ForEach-Object -process {if (($_.attributes -band 0x100) -eq 0x100) {$_.attributes = ($_.attributes -band 0xFEFF)}}

If you want to trawl any subfolders and remove the temporary attribute from more files like this, then you can use the following;
Get-childitem .\ -recurse | ForEach-Object -process {if (($_.attributes -band 0x100) -eq 0x100) {$_.attributes = ($_.attributes -band 0xFEFF)}}

Or if you just want to do this in the current folder, just remove the recurse switch.

Server 2016 & Windows 10 Start Menu Not Working

I’d been having some problems with the start menu in both Server 2016 and Windows 10 stopping working. Googling around revealed various posts and loads of the same advice on how to fix the problem. These included using the Deployment Image Servicing and Management tool with the /restorehealth switch;

DISM /Online /Cleanup-Image /RestoreHealth

Reinstalling all modern apps via PowerShell with the following command;

Get-AppXPackage -AllUsers | Foreach {Add-AppxPackage -DisableDevelopmentMode -Register "$($_.InstallLocation)\AppXManifest.xml"}

Creating a new user account and just using that, which is not an option if the problem affects all accounts on the machine. The only thing that did help was to reinstall Windows, which left the start menu working. However, as soon as I domain joined the machine again, it stopped working after a restart. This led me to look at Group Policy as a potential culprit, and sure enough, moving the object to a separate OU and blocking all policy on it left the start menu working. After a long process of linking policies in one by one I came down to a very specific registry setting.

I’d set the ACLs on a specific registry subkey of HKLM, in this case it was HKLM\Software\Microsoft\RPC. These ACLs were missing one specific entry, namely APPLICATION PACKAGE AUTHORITY\ALL APPLICATION PACKAGES.
Adding this in with only read permissions and forcing a policy update brought the start menu immediately back to life. That ACL is one that has appeared in Server 2012 I think, but since that particular part of our policy predates 2012 that ACL wasn’t there. Oddly enough I’ve not seen this cause any problems with Server 2012/2012 R2/Windows 8/8.1, only with Server 2016 & Windows 10.

So the takeaway from this is that if you restrict any registry ACLs, make sure you include read access for APPLICATION PACKAGE AUTHORITY\ALL APPLICATION PACKAGES.

If all this was helpful and worked for you, please drop a quick note in the comments.

Server 2012 R2 Licencing Problem

I’ve seen a problem on a few servers, where they have been fully configured to use a licence server with available CALs, but after a time still report that there are no licence servers available to use. The servers had been configured to talk to the licence server by following the process Microsoft document at the link below;

Guidelines for installing the Remote Desktop Session Host role service on a computer running Windows Server 2012 without the Remote Desktop Connection Broker role service

Everything appears to check out, and I know the licence server is being used by other Server 2012 R2 servers for their CALs, so I know essentially the licence server is working. The server appeared to just not be using the licence server details I’d given it and simply falling over when the grace period ran out.

The solution was found in the registry, with the following key;

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Terminal Server\RCM\GracePeriod

After removing the binary value in there and only leaving the default string and rebooting the server, the servers would check in to the licence server. I had to take control of the registry key to make this happen, and then revert the permissions back after I’d finished.

Some people have reported seeing event IDs 1129 and 1130 in the TerminalServices-RemoteConnectionManager event log, but I didn’t see these in all cases.

Unable to perform Remote Desktop Services installation – Unable to connect to the server by using Windows PowerShell remoting

When starting an RDS farm install today I was presented with an error saying that the server could not be connected to via WinRM, which was odd as the server giving the error was the machine I was running the install from. A screenshot of the error is below;

I did a little Googling of the problem and found a number of posts reporting this was related to IPv6 and that the fix or workaround was to disable IPv6. In my eyes this isn’t a workaround, as Microsoft do advise against disabling IPv6. So, after a little more thinking about this, I wondered how the WinRM listeners were configured, and in particular the IPv6 listeners. Surprise, surprise, the IPv4 listeners were configured, but the IPv6 listener filter was simply empty in Group Policy. An empty listener address range in policy means those listeners are disabled. Configuring these correctly in the policy and restarting the server then allowed the RDS installation to proceed.

Microsoft do give some detail on how to configure this setting and I just thought I’d share, as disabling IPv6 shouldn’t really be a fix for anything.

497 Day Bug

I recently ran into a server on a site that had seemingly stopped servicing any DHCP requests. So after a little digging and checking, the System event log showed the time the DHCP service first started throwing errors, the error here being EventID 1059 “The DHCP service failed to see a directory server for authorisation.” So DHCP seemed to be having trouble talking to a domain controller, which seemed odd, as it was also a domain controller itself. A quick check of dcdiag returned the following output;

Ldap search capabality attribute search failed on server HOSTNAME, return value = 81

It appeared that the server had stopped servicing LDAP requests too, as well as the word “capability” being spelt incorrectly in the error returned.

About the same time the DHCP service started reporting problems contacting a domain controller, the Group Policy client also started reporting it was unable to find a DC, which would be expected as it was trying to contact the server itself and again it was failing. Checking the Directory Service event log showed that it was complaining about replication, but not much else. Checking the replication, with repadmin /replsummary again threw an error with communication via LDAP.
Running the same command from another machine seemed to show the RPC server was down on the server, which it wasn’t, as the service was up. So I checked the RPC ports with a quick netstat -no and was greeted with tens of thousands of ports all in a TIME_WAIT state. That would explain things then; if there are no RPC ports available, various things will start to break. Googling “Ports not closing TIME_WAIT” led me to a hotfix from Microsoft, All the TCP/IP ports that are in a TIME_WAIT status are not closed after 497 days from system startup in Windows Vista, in Windows 7, in Windows Server 2008 and in Windows Server 2008 R2.
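If you would rather have a number than scroll through pages of netstat output, the rough Python sketch below counts connections per state from the text that netstat -no prints. The sample rows are made up for illustration (documentation-range addresses), and the function name is my own; it only assumes the usual five-column TCP row layout.

```python
from collections import Counter


def count_tcp_states(netstat_output):
    """Count TCP connections per state in the text printed by netstat -no."""
    states = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # TCP rows look like: TCP  <local>  <remote>  <state>  <pid>
        if len(fields) == 5 and fields[0] == "TCP":
            states[fields[3]] += 1
    return states


# Made-up sample rows in the layout netstat -no uses on Windows.
sample = """
Active Connections

  Proto  Local Address          Foreign Address        State           PID
  TCP    192.0.2.10:389         192.0.2.21:50112       TIME_WAIT       0
  TCP    192.0.2.10:389         192.0.2.22:50340       TIME_WAIT       0
  TCP    192.0.2.10:49155       192.0.2.30:135         ESTABLISHED     612
"""

print(count_tcp_states(sample)["TIME_WAIT"])  # 2
```

On an affected server the TIME_WAIT count would be in the tens of thousands rather than a handful.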

And a little further Googling around the problem showed this to not be a problem limited to Microsoft, with various other vendors and products mentioned as affected such as Avaya, Brocade, Cisco, EMC, QLogic and VAX/VMS;

The 497 Day Uptime Bug
497 – The number of the IT beast

From the IBM post linked above;

Basically a 32bit counter used to record uptime will cause this problem when it overflows. If you record a tick for every 10 msec of uptime, then a 32-bit counter will overflow after approximately 497.1 days. This is because a 32 bit counter equates to 2^32, which can count 4,294,967,296 ticks. Because a tick is counted every 10 msec, we create 8,640,000 ticks per day (100*60*60*24). So after 497.102696 days, the counter will overflow.
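The arithmetic in that quote is easy to check for yourself. A quick Python sketch, using only the figures given above (a 32-bit counter and one tick every 10 msec):

```python
TICK_MS = 10                                    # one tick every 10 msec
TICKS_PER_DAY = 24 * 60 * 60 * 1000 // TICK_MS  # 100 * 60 * 60 * 24
COUNTER_RANGE = 2 ** 32                         # values a 32-bit counter can hold

days_to_overflow = COUNTER_RANGE / TICKS_PER_DAY

print(TICKS_PER_DAY)               # 8640000
print(COUNTER_RANGE)               # 4294967296
print(round(days_to_overflow, 6))  # 497.102696
```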

So all that was left was to patch the thing, and while a hotfix is fine, it is fairly old and I did wonder if it had been included in standard Windows updates. Helpfully Microsoft advise that if you have the following security bulletin installed then the hotfix is not needed, suggesting it’s included in the security patch;

Microsoft Security Bulletin MS12-032 – Important

Again though, that’s a pretty old patch itself, and I guessed this must have been rolled into a standard patch at some point. So a quick search of the Microsoft Update Catalog for MS12-032 will show that update and all updates that supersede it, so you can then check if that update KB or any superseding update KB numbers are installed on your system.

If they’re installed, you should be fine and covered off against this. If not, keep an eye on your systems, as when they get over 497.1 days of uptime you may find that some services start to fail, like AD DS and other dependent services.

Below is a basic PowerShell script I wrote to get the last boot time of, in this example, the domain controllers, to see if any were approaching the 497 day mark, so they could then be checked off in WSUS for the right patches. WSUS is your friend here when it comes to rolling out fixes like this.

$dcs = Get-ADDomainController -Filter * | sort name
foreach ($dc in $dcs) {
    $name = $dc.name
    $a = Get-wmiobject -ComputerName $name -ClassName win32_operatingsystem
    write-host "`n" $name
    [Management.ManagementDateTimeConverter]::ToDateTime($a.lastbootuptime)
}

Obviously the lesson here is keep your servers updated, but as we all know there are times when that’s not possible. In which case at least this should help you find and fix the ones that might be affected by this.
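Once you have a last boot time for a server, working out how long it has left before hitting the overflow is simple date arithmetic. The Python sketch below is only an illustration: the function names, the example dates and the 30 day warning window are all assumptions of mine, not anything Microsoft specify.

```python
from datetime import datetime, timedelta

# Uptime at which a 32-bit, 10 msec tick counter overflows (about 497.1 days).
OVERFLOW = timedelta(days=2 ** 32 / 8_640_000)


def days_until_overflow(last_boot, now):
    """Days left before uptime reaches the overflow point (negative if past it)."""
    return (OVERFLOW - (now - last_boot)).total_seconds() / 86400


def needs_attention(last_boot, now, warn_days=30):
    """True if the server is within warn_days of the overflow, or already past it."""
    return days_until_overflow(last_boot, now) <= warn_days


# Hypothetical example: a server last booted on 1 January 2020.
boot = datetime(2020, 1, 1)
print(needs_attention(boot, datetime(2021, 4, 1)))   # False, 456 days of uptime
print(needs_attention(boot, datetime(2021, 4, 20)))  # True, 475 days of uptime
```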

Problems Logging Into Active Directory Accounts on a Mac With a Home Folder Specified

So, after running into this problem, I was initially sceptical of what the cause may be. I’d seen talk around that Macs didn’t like their home folders to be part of an Active Directory domain that ends in the pseudo TLD of “.local”, but I never quite believed that this would be the cause.

Basically, symptoms would be that the machine will fail to log in using the domain credentials, and will just say something generic sounding like “Unable to login to the account, an error occurred”. After lots of testing and fettling with both the Mac and the domain settings (This was a new domain being provisioned for a specific event, and I wouldn’t suggest you just generally tinker with your domain controller configurations), it was found that the account could be logged in if the home drive was disabled in AD. In my case the home drive path was a location within a DFS namespace, but even a direct share on a file server gave the same results.

So, I spun up a new domain on a separate server (oh the joys of virtualisation) and this time gave the domain a .net TLD, with the home drive specified in the same way within a DFS namespace. Surprisingly the account logged in here first time after the Mac had been rebound to the new domain. Some further fiddling was required with the domain controllers to make sure that they were responding to all requests with FQDN responses as opposed to NetBIOS ones. The details on how to do this via PowerShell or a direct registry hack are linked. After these changes have been made a reboot of the server will be needed, but then they should respond with FQDN addresses for both DFS referrals and targets.

At this point, the whole thing should work, and as usual, I hope this saves someone some time in figuring this out.

HP VMware VIB Sources Not Connected – HP Killed The Old Pre-Rebrand URLs

I noticed today that when in the VMware Update Manager in admin view, some of the custom VIB sources I had added were showing as “Not Connected”. This was my custom location for HP VIBs of http://vibsdepot.hp.com/index.xml as I use the HP image on the hosts in this vCenter. When I forced VUM to check the URL again, it was coming back again as “Not Connected”. So I thought I would try loading the XML file in a browser, which presented me with this lovely little “notification”;

I say “notification” as what they’ve done is use a redirect to point you to a different URL, which then contains the message that you must use a different URL now.
The new HP VIB URL is https://vibsdepot.hpe.com/index.xml and note the https rather than http.

Adding the new updated URL to the XML file gets us right back into a connected state;

This has obviously been done following the HP and HPE split that was announced a few years ago, but which is obviously just starting to have consequences for things like this.

I hope this helps someone out.

Server 2012, 2012 R2 & 2016 Disable Or Remove Deduplication On A Volume

Update – 11/11/2020: I’ve added a link here to Microsoft’s updated Server 2016 documentation that details deduplication – Understanding Data Deduplication

I just thought I’d post about this, as it’s something I’ve come up against recently: how to disable deduplication on a volume on Server 2012, 2012 R2 or 2016 and inflate the data back to its original form. In this example, the volume in question is E:

So let’s start with step one;
DO NOT DISABLE DEDUPLICATION ON THE VOLUME
If you disable dedup on the volume first, you simply stop new data being processed, rather than rehydrating your already deduplicated data.

So with that in mind, step two would be to run the following command in PowerShell;
Start-DedupJob -Type Unoptimization -Volume E: -Full

When that job has completed, which you can check with the Get-DedupJob command, you’ll then find that deduplication has been disabled on the volume. Since there’s still the garbage collection job to run, we need to rather counter-intuitively turn dedup back on for the volume with the following command;
Enable-DedupVolume -Volume E:

Once this is done, the next step is to run the following command to start your garbage collection on the volume;
Start-DedupJob -Type GarbageCollection -Volume E: -Full

After that, the final step is to turn off dedup on the volume with the following command;
Disable-DedupVolume -Volume E:

And that should save you any unnecessary drama.

Note
When all this is done, the volume will still show in some places like server manager sat at 0% deduplication rate, which is fine, as we’ve turned it off. I would guess this is just a bug, but it seems once a volume has been touched by the deduplication processes, it never goes back to a blank value for dedup rate.