What is it about Windows? Part II
Click Here to read Part I
The Registry
What is the registry? Well, it's a database of system and application settings. Put simply, it's a database! Now, let's think about that for a moment. Microsoft already makes a REAL database named MSSQL. There are many third party databases including MySQL, Oracle, Ingres, PostgreSQL, NDBM, and the list goes on. So, why did Microsoft opt for some no-name, crappily written kludge-of-a-database for such a critical system component as the registry? I mean, whose brilliant idea was this?
So, you're probably thinking, "But some of those databases weren't available back when the registry was introduced". Perhaps MySQL wasn't available then, but MSSQL, Oracle, NDBM and several others were. Microsoft even had MS Access with its MDB files, or could have used dBase (DBF files), both of which ship with actual repair tools. In fact, all of the databases mentioned support repairing corrupted tables. Why did Microsoft instead choose its own 'hive' file format (the SAM, SYSTEM, SOFTWARE and related files)? I'll never know. This database system is extremely easy to corrupt and all but impossible to repair. There aren't any repair tools available (or none that I know of that actually work, that is). So, if a critical registry hive gets corrupted, Windows won't even boot. Better have a backup of the hives, or you'll be backing up, formatting and reinstalling not only the operating system but all of your applications all over again. A typical MS standby when things just don't work right.
If you want to run a mission critical production system, would you trust your critical data to an operating system that hangs its day-to-day existence on a database that could become corrupted at any turn? This is part of the reason I trust UNIX/Linux a whole lot more. Of course, there are critical files on UNIX that, if corrupted, could prevent UNIX from booting. The difference is that on UNIX you can easily repair or replace those files and bring the entire system back. With Windows, if the registry gets corrupted and you can't boot, there is no easy fix. The Windows repair processes cannot fix a corrupted registry. If you don't happen to have a recent backup of the registry, you can't easily drop a new one in place. If you do drop a brand new registry in place, the system may not boot at all, or it could boot up showing no applications installed (depending on which hive was corrupted).
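As a rough illustration of what 'corrupted' can mean at the file level, here is a minimal Python sketch of a very shallow sanity check on a hive file. It assumes the commonly documented hive header layout: a `regf` signature followed by two 32-bit sequence numbers that should match after a clean write. The function name `check_hive_header` and the strings it returns are made up for this example, and passing this check says nothing about the health of the rest of the hive.

```python
import struct

def check_hive_header(path):
    """Shallow check of a registry hive file header (illustrative only)."""
    with open(path, "rb") as f:
        header = f.read(12)
    if len(header) < 12:
        return "too short to be a hive"
    # Assumed layout: 4-byte signature, then two little-endian
    # 32-bit sequence numbers.
    signature, seq1, seq2 = struct.unpack("<4sII", header)
    if signature != b"regf":
        return "bad signature"
    if seq1 != seq2:
        return "sequence numbers differ (possibly interrupted write)"
    return "header looks sane"
```

This is only a first filter for deciding which copy of a hive is worth trying; it is not a repair tool.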
System Restore in Windows XP can aid in recovering the registry, but only if System Restore has done its job and you have a recent backup. I find, however, that System Restore doesn't always do its job properly and that the saved registry files are weeks old or not relevant. Worse, you can't tell whether the hive files are the 'most current' simply by looking, and there are no tools to help you find out. Also, 9.9999 times out of 10, you can't simply use Automated System Recovery (ASR) to restore these files anyway. Every time I've had to recover registry hive files from System Restore, I've had to do it through a very tedious, extremely time consuming manual process in the Recovery Console. Because of the time involved, I've found an alternative recovery system that boots a live version of Windows from CD, so I get a full Windows environment instead of just an extremely limited shell.
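One crude way to at least rank candidate hive copies is by file modification time. The Python sketch below assumes the restore point snapshots have already been copied out to an ordinary folder you can read. The `_REGISTRY_` filename prefix is, as far as I know, how XP names the hive copies inside a restore point, but treat it (and the function name `newest_hive_copies`) as assumptions. A modification time is only a hint; it does not prove that a copy is current or intact.

```python
import os

def newest_hive_copies(root, prefix="_REGISTRY_"):
    """Return (mtime, path) pairs for candidate hive copies, newest first."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Assumed naming convention for hive copies in a snapshot.
            if name.startswith(prefix):
                full = os.path.join(dirpath, name)
                found.append((os.path.getmtime(full), full))
    return sorted(found, reverse=True)
```

The first entry of the returned list would be the copy to try first, with the older ones kept as fallbacks.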
It just amazes me that in this day and age, Windows is still stuck in 1984 with a database that has no recovery tools, no repair tools and no easy way to fix problems that arise. In spite of Microsoft, there is a flourishing community of hackers (the good kind) who have written recovery systems and tools that help you recover from (although not repair) hive corruption. It's unfortunate that, as a system administrator, I have to rely on third party tools to fix problems that arise regularly... and Microsoft releases absolutely nothing in this area. There are loads of Knowledge Base articles related to corrupted registries, but simply no tools from Microsoft to aid in fixing these issues.
It's most definitely about time for Microsoft to get rid of the extremely old and unreliable registry database system and replace it with an actual database that has journaling, recovery and repair tools.
System Administration + Datacenter Reliability
As a systems administrator and overall IT guy, I have worked with many systems in my 15-year career. I will say that Microsoft's operating systems are the absolute weakest when it comes to systems administration. For everyday easy stuff, like adding users, printers, applications, etc., Microsoft keeps it in the GUI world and makes it 'friendly'. However, most things related to system administration are not in this GUI world. When things break (hard drives die, cards fry, CPU fans break, files go missing, the MFT gets corrupted, etc.), there are extremely few tools to help manage, diagnose or fix these issues. Microsoft does provide some limited command line tools to aid in fixing some limited issues. But once you've gotten to the point where you need a command line tool (which most major failures require), you're likely well beyond the tools that are included with Windows (or in the Resource Kit). Microsoft has basically limited its systems admin tools to what the operating system exposes in GUI mode.
Windows' lack of hardware diagnostics and proactive failure monitoring
It is left up to the hardware manufacturer to provide tools and drivers to monitor the hardware for failures, as Windows has no provisions for this at all. Even so, the hardware monitoring tools provided by third parties are usually so limited as to be worthless.
For example, most hard drives today support S.M.A.R.T. (Self-Monitoring, Analysis and Reporting Technology). This technology has been available for at least 8-10 years. Simplistically, it is a system by which the drive itself keeps a log of how many errors have occurred, how often, and where on the drive they are. If the errors hit a certain threshold, the drive is marked as 'failed' internally. Yet Microsoft still does not support S.M.A.R.T. natively in the OS. There may be add-on software that can help, but if, for example, your S.M.A.R.T. drive fails, Windows doesn't report the error at all. It just continues to try to load data from this drive, failed or not!
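To make the threshold idea concrete, here is a small Python sketch. It assumes the usual S.M.A.R.T. convention that each attribute carries a normalized value and a vendor-set threshold, and that an attribute counts as failed once its value drops to or below that threshold. The attribute names and numbers are invented sample data, and nothing here reads a real drive.

```python
def failing_attributes(attributes):
    """Return names of attributes whose normalized value is at or below threshold."""
    failed = []
    for name, value, threshold in attributes:
        # Assumption: a threshold of 0 marks an informational attribute
        # that never counts as a failure.
        if threshold > 0 and value <= threshold:
            failed.append(name)
    return failed

# Hypothetical (name, normalized value, threshold) readings.
sample = [
    ("Reallocated_Sector_Ct", 36, 36),
    ("Spin_Up_Time", 97, 21),
    ("Power_On_Hours", 80, 0),
]
print(failing_attributes(sample))
```

A monitoring job built on this kind of check would raise an alert as soon as the returned list is non-empty, which is exactly the sort of proactive warning I'd expect an operating system to give.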
It is this lack of simple, proper systems administration tools, diagnostics and proactive hardware monitoring that makes Windows unreliable for data center operation. Basically, you hang your system out to do work, but if things fail, Windows never tells you. Midrange UNIX servers have had these basic diagnostic and monitoring tools for years, yet Windows has never gotten there. In the early 90's I worked on systems, likely designed in the late 80's, that could automatically remap bad memory regions, automatically take bad memory sticks offline, and send notifications of failed drives (the system actually tried to be proactive and detect such things). Here we are in 2007 and Windows still doesn't have tools and diagnostics such as these! ... And Microsoft has the gall to call one of their products a 'Datacenter' version!
(continued in Part III)
