A long time ago in a different era, a young engineer and his friend founded a company called Winternals, which cooked up some tools to look inside the way Windows operated. The utilities were used to understand the way things really worked and went on to provide technologists a variety of ways to troubleshoot issues and optimize performance.
Early and popular tools, which went on to be published on the sysinternals.com website, included RegMon – which monitored what was happening in the Windows Registry – and FileMon, which kept an eye on the file system. Both tools could help a user figure out what an application was doing: maybe to check it wasn't misbehaving, or to hunt for undocumented settings by watching whether the app looked for a particular file or registry key. Sysinternals made the tools free, and since Winternals was acquired by Microsoft in 2006, they still are.
Co-founder Mark Russinovich wrote lots of other fun and useful stuff. For giggles, he built the first BSOD screensaver and a means to remotely deploy it on someone else's PC, making them think it had crashed, probably causing them to turn it off and on again. Or the ZoomIt tool that he used to great effect in his keynote speeches, which were always a highlight at events like TechEd or Ignite. Watching thousands of geeks queueing for an hour to make sure they could get a seat near the front almost invites Jobs-ian comparisons. For what can be relatively dry content, Mark has a great way of talking about how the technology really works and manages to be quite interesting: even if half of the concepts fly straight over your head, the rest is generally worth listening to – like a Brian Cox lecture.
After joining Microsoft, Mark continued to build Sysinternals tools and replaced RegMon and FileMon with Process Monitor, aka ProcMon. Another big utility, Process Explorer, is a kind of shibboleth amongst Windows techies… if you're still using TaskMan to look under the hood, then you're just not hard enough.
Despite moving on to become CTO for Azure and one of Microsoft's Technical Fellows, he still has a hand in Sysinternals, culminating recently in a celebration of the 25th anniversary of the first set of utilities. The day-long virtual conference gave deep-dive sessions on a few of the most popular tools, along with an interesting fireside chat with Mark and an overview of Sysinternals tools for Linux. See the recording here.
Oh, and one more thing. The Sysinternals Suite is now available in the Windows Store – so you can grab the latest versions of all the core tools (70 of them… yes, that’s right, 70, and for how much?) with just a few clicks.
Tip o’ the Week #99 – Is your hard disk just “on”?
One frustrating aspect of a modern PC is when it seems to slow down inexplicably, even when it’s not obviously busy. Sometimes that could be evidenced by the hard disk light flickering a lot of the time, or in extreme cases, solidly lit up. There are a number of reasons why this could be the case – here are some tips on finding out why and maybe what to do about it.
Your PC is just not good enough
A common reason why your disk is really busy (sometimes known as thrashing) is simply that the machine doesn’t have enough oomph to do what it’s being told to. It could be you just don’t have enough of some critical resources, such as memory. If there isn’t enough physical memory (RAM) in the machine, then when an application wants to hold information in memory, something else which is currently in memory needs to be “paged out” – written to disk, temporarily.
That’s all very well, until the application that was using the data that’s just been paged out needs it back – then, something else is paged out, and the previous data is read back in. If you get to the point where you’re really short of RAM, the PC will be thrashing to the exclusion of practically everything else. The whole process is a lot like the juggling you’d have to do if you were trying to work with more than two things but only had two hands.
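As an illustration of that juggling, here's a toy LRU page-replacement simulation in Python (a deliberately simplified model – the real Windows memory manager is far more sophisticated): with enough frames, the cyclic workload only faults while warming up, but take one frame away and every single access hits the disk.

```python
from collections import OrderedDict

def count_page_faults(accesses, ram_frames):
    """Simulate LRU page replacement; return the number of page faults
    (each fault stands in for a disk read of a paged-out page)."""
    frames = OrderedDict()  # page -> None, ordered by recency of use
    faults = 0
    for page in accesses:
        if page in frames:
            frames.move_to_end(page)  # hit: mark as most recently used
        else:
            faults += 1               # miss: fetch the page from disk
            if len(frames) >= ram_frames:
                frames.popitem(last=False)  # evict least recently used
            frames[page] = None
    return faults

# A working set of 4 pages, cycled 25 times = 100 accesses
workload = [1, 2, 3, 4] * 25

plenty = count_page_faults(workload, ram_frames=4)  # fits: 4 cold-start faults
scarce = count_page_faults(workload, ram_frames=3)  # thrashes: all 100 fault
print(plenty, scarce)
```

One frame short of the working set is the worst case for LRU on a cyclic pattern: the page you need next is always the one that was just evicted.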
The only solution to not having enough RAM is to add some more (not always straightforward), or make the machine do less. Look in Resource Monitor (press WindowsKey+R then enter "resmon") under the Memory tab, and you'll see how much of your physical memory is being used. You can also see which applications are using up all the memory, and maybe think about shutting them down, or make room for them by closing other applications.
Modern day whack-a-mole
Curing performance problems can be like pushing a blockage from one place to another, or like the whack-a-mole fairground game where you hit one issue and another one just pops up elsewhere. If your PC isn’t running out of memory, maybe the processor (CPU) is the bottleneck, or perhaps it’s the disk itself.
If the CPU is slow, then everything else will feel pretty slow – the whole machine will just feel like it’s overworked. If the disk is slow, then the machine will bog down every time it needs to do something disk-intensive. Combine a possibly slow disk with running out of memory, and you’ve got the perfect storm – a PC that is constantly shuttling stuff to-and-fro between memory and disk, and burdening the CPU with all the additional overhead to do so.
There are some things you can do to mitigate the “disk light on” issue, however.
It’s probably Outlook
ToW #96 covered an issue where Outlook might use up a large amount of disk space, and maintaining that kind of volume will put something of a strain on the PC. Outlook is probably the heaviest desktop application most of us use, and if it isn’t hammering your memory or processor, then it will probably be nailing your hard disk.
It’s still worth making sure your hard disk isn’t badly fragmented – a situation where files end up scattered across the surface of the disk in lots of pieces, or fragments. On a nice clean disk that’s largely empty, Windows can write a new file out in one contiguous run of clusters.
When files are deleted, all that happens is that the clusters they occupied get marked as free, so they can be overwritten in future. As the disk fills up, though, it may be that the only free space exists in small chunks all over the place – meaning Windows has to do more work to read and write files.
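To see why scattered free space costs more work, here's a hypothetical first-fit allocator sketch in Python (just the principle, nothing like NTFS's real allocation logic): the same 5-cluster file lands in a single extent on an empty disk, but in four separate extents when the free space is fragmented – and each extra extent means another seek.

```python
def extents_for_file(free_map, clusters_needed):
    """Greedy first-fit allocation over a free-cluster bitmap.
    Returns how many separate extents (fragments) a new file of
    `clusters_needed` clusters would occupy, or None if it won't fit."""
    extents = 0
    in_extent = False
    for is_free in free_map:
        if clusters_needed == 0:
            break
        if is_free:
            if not in_extent:
                extents += 1   # starting a new contiguous run
                in_extent = True
            clusters_needed -= 1
        else:
            in_extent = False  # run broken by an in-use cluster
    return extents if clusters_needed == 0 else None

clean = [True] * 8  # freshly formatted disk: one big free run
fragmented = [True, False, True, True, False, True, False, True]

print(extents_for_file(clean, 5))       # 1 extent: one contiguous write
print(extents_for_file(fragmented, 5))  # 4 extents: four seeks instead of one
```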
You can run Disk Defragmentation by going to Start and typing Disk Defrag. You'll then be able to run the Defrag process interactively, or schedule it to happen in the background – make sure you pick a time when you won't be busy on your PC, otherwise it will be the Disk Defrag that's making the light glow.
To give defragmentation a better shot at cleaning up the disk, it may be a good idea to close applications that are likely to be using big files (like Outlook, whose OST file is probably the biggest file on your hard disk). If you have a high degree of fragmentation, it would also be worth getting rid of the hidden Hibernate File on your hard disk – that's where Windows writes the contents of memory if the battery on your laptop runs out, so it's gigabytes in size.
To delete your Hibernate File, you need to fire up a command prompt in Administrator mode – go to Start menu and start typing command then right-click and choose Run as administrator.
A quick alternative is to go to Start, then type cmd and press CTRL-SHIFT-ENTER, which tells Windows to run whatever you’ve typed in as an administrator. Try it: you too can run notepad as an admin.
Once you have your admin Command Prompt (denoted by the window title of Administrator C:\Windows\etc), then type powercfg -h off to switch the Hibernate functionality off, and in so doing, ditch the hiberfil.sys file. Once you’ve finished defragmenting, you can switch hibernate back on by repeating with powercfg -h on.
Is your disk just too slow? How would you know?
Finally for this week, there’s a possibility that your disk is just basically slow, and there’s not a lot you can do about that short of replacing it. If you look in Device Manager (Start -> then type Device Manager) and expand out the Disk Drives section, you will see what kind of hard disk you have – try Binging the cryptic model number and you might find the specifications of the disk. Does it spin at 5,400rpm or 7,200rpm, or is it solid state? Does it have any cache? Maybe reviewers on Amazon et al will pan that model’s performance, or even suggest that a simple firmware upgrade of the disk itself will solve performance issues. [Here Be Dragons – be very careful if you go down this route].
You can see if your disk is the bottleneck to PC performance by looking at the Disk tab in Resource Monitor and expanding out the Storage section. You'll see Disk Queue Length as one of the columns – that's a measure of how many requests are waiting to be read from or written to the disk. If the machine is busy doing a lot of disk work, this might legitimately be quite high (maybe double figures), but if it's sustained then it could indicate that the disk is struggling to keep up with the requests the PC is making of it.
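For a rough feel of why a sustained queue is such bad news, here's a back-of-the-envelope M/M/1 queueing calculation in Python (a textbook approximation, not how Windows derives the counter): queue length stays modest until the disk nears 100% utilisation, then explodes.

```python
def average_queue_length(arrival_rate_iops, service_rate_iops):
    """M/M/1 approximation: mean number of requests in the system.
    Only meaningful while the disk can keep up (utilisation < 1)."""
    rho = arrival_rate_iops / service_rate_iops  # utilisation
    if rho >= 1:
        return float("inf")  # requests arrive faster than they're served
    return rho / (1 - rho)

print(average_queue_length(75, 100))   # 3.0 - lightly loaded, all is well
print(average_queue_length(99, 100))   # ~99 - near saturation, feels awful
print(average_queue_length(120, 100))  # inf - queue grows without bound
```

The non-linearity is the point: going from 75% to 99% busy doesn't make the queue a third longer, it makes it thirty-odd times longer.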
That could be a symptom that the disk just isn't quick enough, but it could also be a harbinger of the disk being faulty – maybe the reason it's taking ages is that it's physically about to fail. Best get it checked out.
And don’t forget ReadyBoost
After sending this original tip above within Microsoft, a reader (Rob Orwin) responded to remind me about ReadyBoost – so I added the following in a subsequent tip. In Rob’s own words.
Whenever my computer is being a bit sluggish, I stuff two memory sticks, which I always carry around in my laptop bag, into the USB ports and as if by magic everything starts running as if it's on steroids. It's instantaneous, as you only need to dedicate a device to ReadyBoost once, and then every time you put it in the USB port it gets automatically used as pseudo-RAM. Another option is to get a ReadyBoost-compatible SD card and stick it in the laptop's SD card slot – which pretty much no one ever uses. [and 4GB SD cards can be picked up for a few £s]
Yes, it’s not quite as fast as actually adding RAM, but it’s a lot easier and a great deal faster than having to use the HDD for virtual memory. I learnt this from a friend who’s a graphic designer. She uses ReadyBoost whenever she needs to do huge batch operations in Photoshop. The ReadyBoost feature was apparently the main reason she got her company to buy her a PC instead of a Mac. When a Mac is out of RAM, it’s out of RAM.
I even use ReadyBoost at home to run Windows 7 on a laptop that is 12 years old and has 256MB of RAM.
Exchange 2010 beta & high availability strategies
Today, the Exchange team released details of Exchange 14, now to be known as Exchange Server 2010. [download here]. There’s plenty of new stuff in the box, but I’m just going to look at one: high availability & data replication.
[My previous missives on Exchange 2007 HA are here, here and here]
There are some interesting differences between 2007 and 2010, particularly in the way databases are handled and what that means for clustering.
THERE IS NO SINGLE COPY CLUSTER ANY MORE.
Single Copy Clusters, or the traditional way of deploying Exchange onto a Windows Cluster with several nodes sharing a copy of the data held in a central SAN, have quite a few downsides … like there being that Single Copy, or the fact that the storage hardware is typically complex and expensive.
There are other pretty major changes: storage groups going away (it's just a database now, a move foreshadowed by Exchange 2007's advice to have only a single DB per SG), databases becoming the unit of failover (rather than the whole server…), and the ability to install multiple roles on servers providing high availability – so you could deploy a highly available, clustered/replicated environment for a small number of users, without needing lots of boxes or VMs.
Oh, Local Continuous Replication goes away too…
Well, reading the documentation explains a bit more about how Exchange 2010 will change the way high availability can be achieved – for one thing, removing the need to set up an MSCS cluster first should make it simpler. From that site:
Changes to High Availability from Previous Versions of Exchange
Exchange 2010 includes many changes to its core architecture. Two prominent features from Exchange 2007, namely CCR and SCR, have been combined and evolved into a single framework called a database availability group (DAG). The DAG handles both on-site data replication and off-site data replication, and forms a platform that makes operating a highly available Exchange environment easier than ever before. Other new high availability concepts are introduced in Exchange 2010, such as database mobility, and incremental deployment. The concepts of a backup-less and RAID-less organization are also being introduced in Exchange 2010.
In a nutshell, the key aspects to data and service availability for the Mailbox server role and mailbox databases are:
- Exchange 2010 uses an enhanced version of the same continuous replication technology introduced in Exchange 2007. See the section below entitled “Changes to Continuous Replication from Exchange Server 2007” for more information.
- Storage groups no longer exist in Exchange 2010. Instead, there are simply mailbox databases and mailbox database copies, and public folder databases. The primary management interface for Exchange databases has moved within the Exchange Management Console from the Mailbox node under Server Configuration to the Mailbox node under Organization Configuration.
- Some Windows Failover Clustering technology is used by Exchange 2010, but it is now completely managed under-the-hood by Exchange. Administrators do not need to install, build or configure any aspects of failover clustering when deploying highly available Mailbox servers.
- Each Mailbox server can host as many as 100 databases (in this Beta release of Exchange 2010, the maximum is 50). The total number of databases equals the combined number of active and passive databases on a server.
- Each mailbox database can have as many as 16 copies.
- In addition to the transport dumpster feature, a new Hub Transport server feature named shadow redundancy has been added. Shadow redundancy provides redundancy for messages for the entire time they are in transit. The solution involves a technique similar to the transport dumpster. With shadow redundancy, the deletion of a message from the transport database is delayed until the transport server verifies that all of the next hops for that message have completed delivery. If any of the next hops fail before reporting back successful delivery, the message is resubmitted for delivery to that next hop. For more information about shadow redundancy, see Understanding Shadow Redundancy.
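The delayed-deletion idea behind shadow redundancy can be sketched in a few lines of Python (a toy model of the concept only – the class, method names and messages here are invented for illustration, not Exchange's actual implementation): the transport server holds each message until every next hop has acknowledged delivery, and resubmits to any hop that fails first.

```python
class ShadowQueue:
    """Toy sketch of delayed deletion: keep each message until all
    next hops confirm delivery; resubmit to any hop that fails."""

    def __init__(self):
        self.in_transit = {}  # msg_id -> set of next hops still pending

    def submit(self, msg_id, next_hops):
        """Message enters the transport database with its next hops."""
        self.in_transit[msg_id] = set(next_hops)

    def ack(self, msg_id, hop):
        """A next hop reports successful delivery."""
        pending = self.in_transit.get(msg_id)
        if pending:
            pending.discard(hop)
            if not pending:
                # every hop has confirmed: now it's safe to delete
                del self.in_transit[msg_id]

    def hop_failed(self, msg_id, hop):
        """A next hop died before acknowledging: resubmit to it."""
        if msg_id in self.in_transit and hop in self.in_transit[msg_id]:
            return f"resubmit {msg_id} to {hop}"
        return None

q = ShadowQueue()
q.submit("m1", ["hubA", "hubB"])
q.ack("m1", "hubA")
print(q.hop_failed("m1", "hubB"))  # resubmit m1 to hubB
q.ack("m1", "hubB")
print("m1" in q.in_transit)        # False: safe to purge from the database
```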
SMSE – a System Center light hidden under a bushel
SMSE – pronounced (in the UK at least) as ‘Smuzzy’, short for Server Management Suite Enterprise – is a licensing package from Microsoft, which can be an amazingly effective way to buy systems management software for your Windows server estate.
If you’re planning to virtualise your Windows server world, then SMSE is something of a no-brainer, since buying a single SMSE license for the host machine allows you to use System Center to manage not just the host but any number of guest (or child) VMs running on it.
Combine that with the license for Windows Server 2008 Datacenter Edition, which allows unlimited licensing for Windows Server running as guests, and you’ve got a platform for running & managing as many Windows-based applications servers as you can squeeze onto the box, running on any virtualisation platform.
System Center is the umbrella name given to systems management technologies, broadly encompassing:
- Configuration Manager (as-was SMS, though totally re-engineered), which can be used for software distribution and “desired configuration state” management … so in a server example, you might want to know if someone has “tweaked” the configuration of a server, and either be alerted to the fact or maybe even reverse the change.
- Operations Manager (or MOM, as it was known before this version) performs systems monitoring and reporting, so it can monitor the health and performance of a whole array of systems, combined with “management packs” (or “knowledge modules”, as some would think of them) which tell Ops Mgr how a given application should behave. Ops Mgr can warn an administrator of an impending issue with their server application before it becomes a problem.
- Data Protection Manager – a new application, now in its 2nd release, which can be used either on its own or in conjunction with another enterprise backup solution to perform point-in-time snapshots of running applications and keep the data available. DPM lets the administrator deliver a shorter RTO and a more up-to-date RPO, at very low cost.
- Virtual Machine Manager – a new server product, also in its 2nd release, which manages the nuts & bolts of a virtual infrastructure, based either on Microsoft’s Hyper-V or VMware’s ESX with Virtual Center. If you have a mixture of Hyper-V and VMware, VMM lets you manage the whole thing from a single console.
It’s easy to overlook the management of guests in a virtualised environment – the effort in such a project typically goes into moving physical machines into the virtual world, but it’s equally important to make sure you’re managing the operations of what happens inside the guest VMs, as much as you’re managing the mechanics of the virtual environment.
I’ve used a line which I think sums up the proposition nicely, and I’ve seen others quote the same logic:
If you have a mess of physical servers and you virtualise them, all you’re left with is a virtual mess.
Applying the idea of SMSE to a virtual environment, for one cost (at US estimated retail price, $1500), you get management licenses for Ops Manager, Config Manager, VMM and DPM, for the host machine and all of its guests.
Think of a virtualised Exchange environment, for example – that $1500 would cover Ops Manager telling you that Exchange was working well, Config Manager keeping the servers up to date and patched properly (even offline VMs), VMM managing the operation of the virtual infrastructure, and DPM keeping backups of the data within the Exchange servers (and maybe even the running VMs).
Isn’t that a bargain?