While designing a TS farm for one of my clients I recently got to know too much about the Windows 2003 Terminal Services Session Directory. Windows 2008 renamed this to Terminal Services Session Broker.  This little service basically tracks which user is logged onto which terminal server within a TS farm. Without this session directory, users’ sessions can get orphaned, resulting in wasted resource consumption.

In summary, the basic configuration is to have multiple TS servers in a farm. These servers can be load balanced using DNS round robin, hardware load balancers (e.g. Alteons), software load balancers (e.g. HAProxy – very good software BTW!!) or Windows NLB. Each of the TS servers is configured with the TS session directory.

When a user connects to the TS, the TS checks for an existing session for the user by contacting the TSSD. If the user does not have a session, the TS allocates a session and allows the user to attempt to log on. If an existing session is found within the farm, the TS issues an RDP redirect (the RDP client needs to be XP or more recent to support this) to the server with the existing session.
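The lookup-and-redirect flow described above can be sketched roughly as follows. This is a toy model only: `SessionDirectory` and `handle_connection` are hypothetical names made up for illustration, not anything the TSSD actually exposes, and the real service records sessions at logon rather than at connect time.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class SessionDirectory:
    """Toy stand-in for the TSSD: maps a user name to the server holding the session."""
    sessions: Dict[str, str] = field(default_factory=dict)

    def lookup(self, user: str) -> Optional[str]:
        return self.sessions.get(user)

    def register(self, user: str, server: str) -> None:
        self.sessions[user] = server


def handle_connection(directory: SessionDirectory, user: str, contacted_server: str) -> str:
    """Return the server that ends up hosting the user's session."""
    existing = directory.lookup(user)
    if existing is not None and existing != contacted_server:
        # A session already exists elsewhere in the farm: redirect the client there.
        return existing
    # No session in the farm (or it is already on this server): host it here.
    directory.register(user, contacted_server)
    return contacted_server


directory = SessionDirectory()
first = handle_connection(directory, "alice", "TS1")   # new session lands on TS1
second = handle_connection(directory, "alice", "TS2")  # load balancer picked TS2 this time
print(first, second)  # TS1 TS1
```

The second connection is sent back to TS1 even though the load balancer chose TS2, which is the whole point of the directory.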

This is all well and good, but what happens when the TSSD is unavailable (e.g. during patching or other maintenance windows)?

  • Contrary to some documents, users can still log on to the TS servers while the TSSD is unavailable.
  • Some documents indicate that the TSSD database is cleared when the TSSD service is restarted. This is not the case either. The TSSD service stores the existing sessions in a Jet database under %systemroot%\system32\tssesdir and will reload the database if it is deemed fresh enough. (I never did find out the definition of “fresh enough”). The tssdis.log file (which needs to be enabled) shows this behaviour.
  • MS documents also indicate there is no way to move this database (well for Windows 2003 anyway).

The problem arises when users log on to or log off from the TS servers while the TSSD is unavailable, as the TSSD database will have no record of the users’ sessions – again potentially resulting in orphaned sessions.
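To make the orphaning scenario concrete, here is a minimal simulation of a logon that happens while the directory is offline. Everything in it (`FlakyDirectory`, `connect`, the `farm` dictionary) is a hypothetical model for illustration and says nothing about how the real service is implemented.

```python
from typing import Dict, Optional, Set


class FlakyDirectory:
    """Toy session directory that can be taken offline (hypothetical, not the TSSD API)."""

    def __init__(self) -> None:
        self.records: Dict[str, str] = {}
        self.online = True

    def lookup(self, user: str) -> Optional[str]:
        return self.records.get(user) if self.online else None

    def register(self, user: str, server: str) -> None:
        # While offline the logon still succeeds, but nothing is recorded.
        if self.online:
            self.records[user] = server


def connect(directory: FlakyDirectory, farm: Dict[str, Set[str]], user: str, contacted: str) -> str:
    """farm maps a server name to the set of users holding a session on it."""
    target = directory.lookup(user) or contacted
    farm[target].add(user)
    directory.register(user, target)
    return target


farm: Dict[str, Set[str]] = {"TS1": set(), "TS2": set()}
directory = FlakyDirectory()

directory.online = False                 # maintenance window
connect(directory, farm, "alice", "TS1") # logon works, but is never recorded
directory.online = True                  # directory is back

landed = connect(directory, farm, "alice", "TS2")  # no record found, so no redirect
orphaned = [s for s, users in farm.items() if "alice" in users and s != landed]
print(landed, orphaned)  # TS2 ['TS1']
```

The session on TS1 is still consuming resources, but nothing will ever redirect the user back to it.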

The Microsoft documented method to overcome this is to cluster the TSSD service. That sounds good, but what about the TSSD database being under c:\windows\system32? Well, the long and the short of it is that the TSSD is cluster aware. However, MS fail to mention that when the TSSD runs on a server with MSCS installed, the TSSD service assumes it will be clustered and expects to be in a cluster service group. The next assumption it makes is that it can use the disk resource in the service group.

In short, the TSSD service can be clustered. When part of the cluster the TSSD service transparently moves its jet database onto one of the shared drives within the service group to provide a HA TSSD service. There is a MS document describing the addition of the TSSD into a cluster but it does not mention that the database is simply silently moved.

The setup document is at http://www.microsoft.com/windowsserver2003/techinfo/overview/sessiondirectory.mspx

(Apparently within Windows 2008 the location of the TSSB database can be changed by altering HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tssdis\Parameters)

A decent article explaining the TSSD concept more fully is available at http://www.brianmadden.com/blogs/brianmadden/archive/2004/11/30/understanding-the-terminal-server-session-directory.aspx Note, however, that it makes no mention of the tssesdir directory being moved when the service is clustered.

Well, I am trying to upgrade MailScanner on one of my servers. Alas, these upgrades are never straightforward. The MailScanner upgrade went reasonably smoothly, but when it came to starting MailScanner the problem showed itself. Starting MailScanner resulted in the following error:

root@host:/opt/MailScanner/bin# ./MailScanner
 is only avaliable with the XS version at /usr/local/share/perl/5.8.7/Compress/Zlib.pm line 9
BEGIN failed--compilation aborted at /usr/local/share/perl/5.8.7/Compress/Zlib.pm line 9.
Compilation failed in require at /opt/MailScanner/lib/MailScanner/SA.pm line 42.
BEGIN failed--compilation aborted at /opt/MailScanner/lib/MailScanner/SA.pm line 42.
Compilation failed in require at ./MailScanner.orig line 110.
BEGIN failed--compilation aborted at ./MailScanner.orig line 110.
root@host:/opt/MailScanner/bin#

The root of the error is the message about “is only avaliable with the XS version” within Compress/Zlib.pm line 9. There are a few posts relating to this but no real resolutions. One of the common suggestions, which works for a few people it seems, is to force reinstall the Scalar::Util Perl module. Use “force install Scalar::Util” from within the CPAN shell. This didn’t work for me.

I tried a short test script to try to reproduce the problem:

use Scalar::Util qw(dualvar);
use Compress::Zlib;
print "$Scalar::Util::VERSION\n";
my $foo = dualvar 10, "Hello";
print "$foo\n";

This worked, which indicates that the XS functionality is working properly.

I figured it might be related to the load order of the libraries. For MailScanner 4.79.11-1 I managed to work around this problem by inserting

use Scalar::Util qw(dualvar);

in at line 38 (just after “require 5.005;”). This appears to load the Scalar::Util XS functions before MailScanner alters the library search paths.

Hopefully this helps someone else with this problem.

I recently encountered a problem while upgrading the iLO2 firmware on an HP DL360G5 server. It seems that if you reopen the remote console window while remotely updating the iLO2 firmware you can corrupt the firmware upgrade process. Hmm. This caught me out. When I rebooted my server it sat there not beeping or doing anything for a couple of minutes. I thought I had toasted it. I started attempting the ROM recovery process (flipping DIP switches on the motherboard btw) but this didn’t help. Removing the RAM caused the server to beep, so it was obviously not totally dead. During one of my “boot-nothing-google-retry” cycles I left the server running. After about 10 minutes the server beeped and continued booting up – BUT the POST results showed no iLO2. So I figured it was an iLO firmware problem (until then I had thought it was a ROM problem), which changed my Google search terms. I eventually found HP document c01850906.

The document outlines a process for recovering from a corrupt iLO2 firmware update.

The document describes it better but the steps are pretty much the following:

Boot off maintenance CD (have patience, the boot might take 10 or so minutes)

run firmware update
ctl-alt dbx

alt-ctrl-1  enter
cd /mnt/bootdevice/compaq/swpackages
rmmod hpilo
rmmod ilo
sh CP012108.scexe --direct

Message about firmware 1.81 and programming flash

reboot

The full process is at:

 http://h20000.www2.hp.com/bizsupport/TechSupport/Document.jsp?objectID=c01850906

Document c01850906

Recently I needed to do some tests on some Cisco lab kit I have access to. The tests were with full BGP feeds – which cannot be handled by the lab routers due to minimal memory specifications. I set about investigating the memory upgrades and was shocked (again!) at the price of real Cisco RAM. More worryingly, some of the on-line shops I looked at did not convince me they were selling genuine Cisco RAM rather than “compatible”.

Anyway, I decided to look into getting “compatible” memory since these are lab routers, out of warranty and are not under SmartNet contract. Your mileage may vary with this information and you may invalidate your SmartNet or warranty status. That said…

I needed to upgrade Cisco 1801, Cisco 1841, Cisco 2801 and Cisco 2811 routers.

The 1801 went simply from 128MB to 384MB with a Crucial memory SO-DIMM part CT3264X335. The 1801 takes a 200-pin SO-DIMM, PC2700 (DDR333), CL2.5, unbuffered, non-ECC memory module. A slower DDR module may work, but this PC2700 one worked for me.

The Cisco 1841 took 144pin SDRAM SO-DIMM, PC133, non-ECC, unbuffered. This is the same as the 2801.

The Cisco 2801 takes 144 pin SDRAM SO-DIMM, Unbuffered, non-ECC, PC133. A list of compatible parts should be: KVR133X64SC3/256, MT8LSDT3264LHG-133, THLY25N01C75, CT32M64S8W7E, HYS64V32220GDL-7, MH32S64PFJ6L, EBS26UC6APS-75, HYM72V32M636BT-6, THLY25N01B75, NT256S64VH8A0GM-75, MT8LSDT3264HG-133

The 2811 can be upgraded from 256MB to 512MB or 768MB. The router takes 184-pin unbuffered, ECC, DDR PC2700 RAM. I initially tried a Crucial part (CT6472Z335, which was an MT9VDDT6472AY-335F1 module) which did not work. The MT9VDDT6472AY is a single rank 512MB module. I then tried a dual rank 512MB module (KVR266X72C2/512) which worked. I would expect an MT18VDDT6472AG or similar dual rank 512MB module (with 18 memory chips rather than 9) to work.
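The single rank versus dual rank distinction comes down to simple arithmetic: an ECC module presents a 72-bit bus (64 data bits plus 8 ECC bits), and one rank is the set of chips needed to fill that width. A rough sketch, assuming x8 chips (my assumption; the chip width is not something I checked on the datasheets), with `rank_count` being a made-up helper name:

```python
ECC_MODULE_WIDTH_BITS = 72  # 64 data bits + 8 ECC bits


def rank_count(chips: int, chip_width_bits: int = 8) -> int:
    """Number of ranks on an ECC module, assuming every chip has the same width."""
    total_bits = chips * chip_width_bits
    if total_bits % ECC_MODULE_WIDTH_BITS:
        raise ValueError("chip count does not fill a whole number of 72-bit ranks")
    return total_bits // ECC_MODULE_WIDTH_BITS


print(rank_count(9))   # 1 (e.g. a 9-chip module such as the MT9VDDT6472AY)
print(rank_count(18))  # 2 (e.g. an 18-chip module)
```

Under that assumption, a 9-chip module is single rank and an 18-chip module is dual rank, which matches what I saw on the 2811.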

During this investigation and upgrade, I also noticed that the 2500, 2600 and 3600 appear to take the same flash modules. They do support different flash module sizes, but realising that the modules are compatible makes troubleshooting or replacing parts simpler. Of course, you have to re-flash the appropriate IOS!

Hopefully this will help someone out there! Of course, using non-Cisco memory may get you into hot water if you ever ring Cisco for support! You have been warned.

Well, three weeks till the big one! I am starting to get things ready and ensuring that I have enough of everything that will be going with me.

Yesterday I did a 150km cycle – the Ride to the Horns cyclosportive around Aylesbury. Quite a tough course with multiple short sharp climbs. I’m sure I saw the gradient show 18% on my cycle computer a couple of times. The ride was enjoyable but seemed to drag on somewhat. I suspect that this was due to riding in a group, taking too long at the feed stations and most noticeably due to one of my fellow riders getting a puncture that took around 45 minutes to fix. Long story short, we got through three tubes, two CO2 canisters and a new tyre! I would recommend this ride to someone looking to get some hill climbing in. Apparently the route takes in most of the decent climbs in the area. My legs are feeling it today!

The bike went well. New wheels are a treat. The gearing seems to be good, apart from dropping the chain twice while shifting back onto the small chain-ring, so I might have to tweak that. The rear brake still rubs a touch so I may try and tweak that too.

My ongoing seat post creak seems to be resolved – for the moment at least, as I suspect it will return. My Felt B2’s seat post has been creaking for a while. The bike shop thought it was the saddle clamps, which got some oil and attention, but this failed to remedy the squeak. I then used carbon paste on the post and a drop of oil on the seat post clamp screws, which helped for a bit – but the creak returned. I have now put some electrical tape around the lip of the seat post opening on the frame, which means that the seat post clamp doesn’t rub on the frame. This seems to have resolved the creak for the moment. I may undo the clamp and see the state of the tape – it’s possible that it will need attention regularly, which will be fine if it keeps the squeak at bay!

I managed to get a 3900m swim in on Saturday in the 50m outdoor pool. Lovely and clear water at a decent temperature too. Not a bad swimming venue, if a little bit far to go to regularly.

Anyway a slightly easier week for me this week – Hurray!

Hi,

A technical post for a change… For a while now I have noticed that my Vista laptop has been running with a svchost.exe process at anything from 50% to 100% CPU busy time. I tracked it down to the plug and play and DCOM svchost.exe process. Using Google to investigate this issue revealed that a number of people had the same problem – busy CPU bound svchost.exe processes.

Common causes seem to be bad sound card drivers or VMware’s workstation product’s vNetwork services.  I tried various things to resolve my problem; none of which worked.

Last night I managed to resolve my problem of a busy svchost.exe process. It turns out to be caused by the Garmin Ant Agent needed for my Garmin 405! Quitting the Ant Agent immediately drops the svchost.exe process back to normal, and starting it causes the svchost.exe process CPU utilisation to jump skyward. This only happens with the Ant Agent USB device removed. With the USB device inserted the computer appears to be OK. I guess the root cause is the Ant Agent being overzealous with its polling to see if the USB device has been inserted.

So for now I will not have the Ant Agent running unless I need it to download a workout from my Garmin 405.