WindowsDevCenter.com

PC Hacks for Windows

Hack #40. Partition and Format Wisely

Partition your drive for efficient file play.

Do you know how much of your hard disk space you’re wasting? If you are using Windows 95, 98, 98SE, or ME (even 2000 or XP without NTFS), you could be wasting at least 10% and possibly up to 30% of your hard drive space due to suboptimal allocation unit sizes.



Disk partitions are logical regions of a disk drive containing a filesystem. Partitioning a hard disk is like subdividing parcels of land and dictating how those parcels will be further subdivided into lots or common areas. Partitioning establishes how big the parcel of land to be used will be. The filesystem within a partition contains files and directories that are organized in what we see as a hierarchical tree structure.

We use the filesystem, along with the operating system’s tools and support for it, to put things into and take them out of the “parcels” (called clusters) of space allocated in the partition. If we put a small house on a large parcel (cluster), we have a lot of empty space. In terms of land, we could landscape or farm that space, but in terms of filesystems, we get only one “house” per cluster. One house may occupy many clusters, but nothing else can use a cluster that a house occupies, even if there is empty space within it.

The Different Partition Types

The different operating systems available for PCs provide support for various partition types and filesystems. There are five types of partitions you will encounter on x86 machines:

Primary

A primary partition is the first and often the only partition on a hard disk drive, occupying all available disk space. A primary partition is required for DOS and Windows 9x-Me, but Windows NT and later, as well as Linux, can boot from an extended partition. The primary partition can contain only one logical drive. You may have up to four primary partitions, or a maximum of three primary and one extended partition.

Extended

An extended partition can only exist if there is at least one primary partition. This partition may occupy the remainder of the drive’s free space or only a portion of it; the remainder may contain either NTFS or non-DOS partitions. The extended partition may contain one or more logical drives.

Logical

Within an extended partition, at least one logical partition must be made if a DOS or Windows filesystem will access the space as a drive letter. If the extended partition is created but no logical partitions are created within it, any operating system may lay claim to the space or change the extended partition into a non-DOS partition.

NTFS

An NTFS partition is typically created and used by Windows NT, 2000, XP, or 2003. DOS and Windows 9x-Me utilities have no direct control over or access to NTFS partitions. Each NTFS partition may contain logical partitions and drives of its own.

Non-DOS

A non-DOS partition is any partition type not supported by DOS or Windows, which could be any of the different versions of Linux, FreeBSD, SunOS, or others. Those specific operating systems may allocate space and filesystem support through numerous other filesystem types.
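The counting rules in the list above (at most four primary partitions, or three primary plus one extended, with logical drives living only inside an extended partition) can be sketched as a small check. This is only an illustration: the function name `check_layout` and the plain-string partition kinds are made up for the example and do not correspond to any real partitioning tool.

```python
from collections import Counter


def check_layout(partitions):
    """Return a list of rule violations for a list of partition kinds.

    Each item is one of "primary", "extended", or "logical" (hypothetical
    labels chosen for this sketch).
    """
    counts = Counter(partitions)
    problems = []
    # Primary and extended partitions share the same four table slots.
    if counts["primary"] + counts["extended"] > 4:
        problems.append("more than four primary/extended entries")
    if counts["extended"] > 1:
        problems.append("more than one extended partition")
    # An extended partition needs at least one primary partition.
    if counts["extended"] and not counts["primary"]:
        problems.append("extended partition without a primary partition")
    # Logical drives exist only inside an extended partition.
    if counts["logical"] and not counts["extended"]:
        problems.append("logical drive without an extended partition")
    return problems


print(check_layout(["primary", "extended", "logical", "logical"]))
print(check_layout(["primary"] * 4 + ["extended"]))
```

The first call describes a typical one-primary, one-extended drive and reports no problems; the second tries to squeeze a fifth entry into the table and is flagged.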

TIP: The FDISK utility that comes with MS-DOS and Windows 95, 98, and Me refers to filesystems created by other operating systems (such as Linux or FreeBSD) as non-DOS partitions.

On most IBM, HP, Compaq, and Dell systems, the first partition may be a non-DOS rather than a primary partition, may have an MBR, and may be made Active, but instead of an operating system may contain a boot loader to access diagnostic, system setup, and recovery files. This partition may consume a few tens of megabytes to a few gigabytes, depending on what the vendor wants to store in it for later use.

The BIOS in these systems will present a boot loader option to press a specific key to access these features and boot from this partition, and if that key is not pressed within a preset amount of time the BIOS will set another partition (primary, NTFS, or other non-DOS partition) to Active to load the operating system. Alternatively, some systems require use of a recovery boot diskette or CD that provides access to the recovery files on the recovery partition.

Each partition type may be either the Active (bootable) partition or not. An Active partition is not automatically a system or bootable partition, but must be made into a system or bootable partition by whichever operating system you install.

The Active partition is the one the PC system BIOS looks for in order to find bootable files and an operating system. To be bootable, the Active partition must have a Master Boot Record and must contain the bootable operating system files to start. The remainder of the operating system’s non-boot files may reside on another partition and logical drive. DOS and Windows 9x-Me will only boot from an Active and primary partition.
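To make the Active flag concrete, here is a minimal sketch that reads the partition table out of a 512-byte first sector. It assumes the conventional PC layout, which this hack does not spell out: four 16-byte entries starting at byte 446, a status byte of 0x80 marking the Active entry, and a 0x55AA signature in the last two bytes. The demo sector is synthetic, and both function names are invented for the example.

```python
import struct

SECTOR_SIZE = 512
TABLE_OFFSET = 446   # partition table position in the first sector (assumed layout)
ENTRY_SIZE = 16      # four 16-byte entries
ACTIVE_FLAG = 0x80   # status byte value that marks the Active partition


def read_partition_table(sector):
    """Parse the used partition entries from a 512-byte first sector."""
    if len(sector) != SECTOR_SIZE or sector[510:512] != b"\x55\xaa":
        raise ValueError("not a valid boot sector (missing 0x55AA signature)")
    entries = []
    for i in range(4):
        start = TABLE_OFFSET + i * ENTRY_SIZE
        raw = sector[start:start + ENTRY_SIZE]
        status, ptype = raw[0], raw[4]
        # Starting sector and sector count are little-endian 32-bit values.
        start_lba, sector_count = struct.unpack_from("<II", raw, 8)
        if ptype == 0:  # a type of 0 means the slot is unused
            continue
        entries.append({
            "slot": i + 1,
            "active": status == ACTIVE_FLAG,
            "type": ptype,
            "start_lba": start_lba,
            "size_mb": sector_count * SECTOR_SIZE / 2**20,
        })
    return entries


def make_demo_sector():
    """Build a synthetic sector: an Active FAT-16 entry plus an extended entry."""
    sector = bytearray(SECTOR_SIZE)
    # Type 0x06 is FAT-16 and 0x05 is an extended partition; CHS fields are zeroed.
    struct.pack_into("<B3sB3sII", sector, TABLE_OFFSET,
                     0x80, b"\0\0\0", 0x06, b"\0\0\0", 63, 1_000_000)
    struct.pack_into("<B3sB3sII", sector, TABLE_OFFSET + ENTRY_SIZE,
                     0x00, b"\0\0\0", 0x05, b"\0\0\0", 1_000_063, 2_000_000)
    sector[510:512] = b"\x55\xaa"
    return bytes(sector)


for entry in read_partition_table(make_demo_sector()):
    print(entry)
```

Reading the real first sector of a drive requires administrator rights and a platform-specific device path, so the sketch deliberately works on bytes you supply rather than opening a disk itself.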

TIP: Third-party multiboot utilities like BootMagic, LILO, GRUB and System Commander change the Active partition to select which operating system will boot up.

The boot loader in Windows NT through Windows Server 2003 can boot DOS or other versions of Windows in the same Active partition.

Windows 2000 and Windows Server 2003 support two types of disk configurations, Basic and Dynamic, created with the Windows Disk Management console. A Basic disk can use the partition tables supported by respective versions of Windows, MS-DOS, and Windows NT. A Basic disk, the typical type of disk you use, can hold primary partitions, extended partitions, and logical drives.

Basic volumes include partitions and logical drives and may contain volumes created using Windows NT 4.0 or earlier, such as volume sets, stripe sets, mirror sets, and stripe sets with parity. In Windows 2000, these volumes are called spanned volumes, striped volumes, mirrored volumes, and RAID-5 volumes, respectively.

Like a Basic disk, a Dynamic disk can hold simple volumes, spanned volumes, mirrored volumes, striped volumes, and RAID-5 volumes. However, with Dynamic storage, you can perform disk and volume management without having to restart the operating system.

In most cases, you will encounter only primary and extended partitions, or NTFS partitions with Basic disks.

The Different Filesystems

After you’ve partitioned a drive, you need to decide which filesystems will actually live on it. There are dozens of different filesystems you may encounter in the wild. Here are a few:

DOS FAT-12, FAT-16, FAT-32

DOS filesystems known as FAT-12, FAT-16, and FAT-32 (although no version of DOS supports FAT-32) have evolved from the early days (when only diskettes were available) to support increasingly larger hard drives. The “FAT” in the filesystem names stands for File Allocation Table, which is something every filesystem has in some form but that stuck as the exclusive name for DOS filesystems. The numeric designation refers to the width of the table entries that identify the clusters where files are stored: 12 bits allows 4,096 clusters/files (including directories) and 16 bits allows 65,536 clusters/files to be kept track of. FAT-32 entries are 32 bits wide, but only 28 of those bits address clusters, which allows roughly 268 million clusters/files in a single partition. The directory entry in a FAT filesystem records the starting cluster and size of each file, and the File Allocation Table chains together the rest of the file’s clusters. The directory entry also holds the file attributes: Read-Only, Archive, Hidden, and System. No file access security is provided for in a DOS/FAT filesystem.
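The relationship between entry width and cluster count is plain powers of two, as this short sketch shows. One assumption goes beyond the naming convention: FAT-32 table entries are 32 bits wide, but the on-disk format reserves the top four bits, so the sketch uses 28 bits for FAT-32. A handful of entry values are also reserved as markers in every FAT variant, so the usable counts are slightly lower than the figures printed here. The function name is made up for the example.

```python
def max_table_entries(bits):
    """Number of distinct values a FAT entry of the given width can hold."""
    return 2 ** bits


# Bits that actually address clusters (28 for FAT-32 is an assumption noted above).
for name, bits in [("FAT-12", 12), ("FAT-16", 16), ("FAT-32", 28)]:
    print(f"{name}: {max_table_entries(bits):,} entries")
```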

NTFS

One or more NTFS partitions may exist on a hard drive, with or without Primary and Extended or Non-DOS partitions. NTFS is a journaling filesystem, meaning that it records information about filesystem activity to improve recoverability in the event of a system crash. NTFS uses two methods to keep track of directories and files: first, a Master File Table (MFT) that “knows all” about the directories or folders and files on the disk and, second, the files themselves, which store information about the files. In fact, if a file is small enough, it is contained within the MFT itself rather than on a separate area of the disk. Directories in NTFS store information about the directory, not the files in them. The information about a file in an NTFS filesystem stores not only filename, location, and attributes, but also security information. The number of files and directories NTFS can store is almost limitless, unless the Master File Tables grow so large from keeping track of so many files that they consume all the free space on the drive.

ext, ext2, and ext3

The ext filesystems are used by Linux. ext supports drives and individual files as large as 2 GB. ext2 supports partitions as large as 4 terabytes and files as large as 2 GB (Linux 2.2) and over 2 GB for Linux 2.4 and above. ext3 is a journaling filesystem compatible with ext2. A journaling filesystem makes a record of filesystem changes before they are made, which adds greater reliability to the file activity.

reiserfs

reiserfs is a journaling filesystem with exceptional granular security capabilities suitable for military applications, developed under DARPA (Defense Advanced Research Projects Agency) sponsorship. Significant information on reiserfs can be found at http://www.namesys.com.

jfs

jfs is a journaling filesystem for Linux servers developed by IBM. More information about jfs can be found at http://oss.software.ibm.com/developerworks/opensource/jfs/.

The FAT and NTFS filesystems track disk space use in predefined allocations of clusters. Clusters are made up of one or more 512-byte units of storage space. Under the FAT-16 filesystem, the maximum number of clusters is determined by a 16-bit numbering system and a predetermined maximum number of 512-byte sectors of space per cluster. These days, the only place you are likely to encounter FAT-16 partitions is on much older computers, flash memory cards, and embedded systems, but you can still create a FAT-16 filesystem if you need to access the space through an older (6.22 and earlier) version of DOS.

Under these design constraints and limitations, the largest possible disk partition in a FAT-16 filesystem may consist of 65,536 clusters of data. The maximum allowable cluster data size is 64 sectors per cluster, or 32,768 bytes. In total, the maximum size of a FAT-16 disk partition is approximately 2,048 megabytes (2 gigabytes). (The previous partition size limitation under early DOS versions using the FAT-12 filesystem was a meager 32 megabytes.) Table 5-1 lists the FAT-16 cluster sizes for various partition sizes. By the way, a cluster may contain only one file reference, so there is also a limitation on the total number of files a FAT filesystem can keep track of: 65,536 files for FAT-16.

Table 5-1. Cluster sizes for FAT-16 partitions

Partition size      FAT-16 cluster size
0–127 MB            2 KB = 2,048 bytes (4 sectors)
128–255 MB          4 KB = 4,096 bytes (8 sectors)
256–511 MB          8 KB = 8,192 bytes (16 sectors)
512–1,023 MB        16 KB = 16,384 bytes (32 sectors)
1,024–2,047 MB      32 KB = 32,768 bytes (64 sectors)
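The 2 GB ceiling and the lookup in Table 5-1 can be checked with a few lines of arithmetic. This is a sketch only: the list simply transcribes the table, and `fat16_cluster_size` is a name invented for the example.

```python
SECTOR_BYTES = 512

# (upper bound of partition size in MB, cluster size in bytes), from Table 5-1
FAT16_CLUSTERS = [
    (127, 2_048),
    (255, 4_096),
    (511, 8_192),
    (1_023, 16_384),
    (2_047, 32_768),
]


def fat16_cluster_size(partition_mb):
    """Return the FAT-16 cluster size in bytes for a partition size in MB."""
    for upper_mb, cluster_bytes in FAT16_CLUSTERS:
        if partition_mb <= upper_mb:
            return cluster_bytes
    raise ValueError("partition too large for FAT-16")


# 65,536 clusters of 32,768 bytes each is the largest FAT-16 partition.
max_bytes = 65_536 * 32_768
print(max_bytes // 2**20, "MB maximum")

size = fat16_cluster_size(300)
print(size, "bytes per cluster,", size // SECTOR_BYTES, "sectors")
```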

TIP: It is possible under any filesystem, FAT included, to run out of disk space not because you’ve filled up your entire drive with files but because you’ve used up the number of file allocations the filesystem provides, so the more clusters you have, the higher the total number of files you can store.

The FAT-32 filesystem supported under Windows 95 OEM SR2, 98, 98SE, Me, NT (SP4 and later), 2000, and XP can accommodate disk drives up to 2 terabytes in size (Windows 2000 will not format a FAT-32 volume larger than 32 GB), with as many as 268 million clusters/files and cluster sizes up to 32 KB. For very small (512-byte) files, a 32 KB cluster results in less than 2% file storage efficiency and a gross waste of space, indicating that partitioning your drive to use smaller cluster sizes is advisable for many of us. FAT-32 limits the maximum file size to 4 GB (minus one byte), which is adequate for most of us, but if you expect to work with larger files (large databases, for example) you must use NTFS. Table 5-2 lists the cluster sizes for FAT-32 partitions.

Table 5-2. Cluster sizes for FAT-32 partitions

Partition size            FAT-32 cluster size
0–259 MB                  512 bytes (1 sector)
260–511 MB                4 KB (8 sectors)
512–8,191 MB              8 KB (16 sectors)
8,192–16,383 MB           16 KB (32 sectors)
32,768 MB–2 terabytes     32 KB (64 sectors)
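The "less than 2%" figure for a 512-byte file in a 32 KB cluster is easy to reproduce. The sketch below assumes the simple model used throughout this hack, in which a file always occupies a whole number of clusters; the function name is made up for the example.

```python
import math


def storage_efficiency(file_bytes, cluster_bytes):
    """Fraction of a file's allocated disk space that holds actual data."""
    # Even a tiny file ties up one full cluster in this simple model.
    clusters = max(1, math.ceil(file_bytes / cluster_bytes))
    return file_bytes / (clusters * cluster_bytes)


for cluster in (512, 4_096, 32_768):
    pct = storage_efficiency(512, cluster) * 100
    print(f"512-byte file, {cluster:>6}-byte clusters: {pct:.2f}% efficient")
```

With 32,768-byte clusters the result is 512 / 32,768, or about 1.56%, which is where the figure in the text comes from.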

NTFS also allocates disk space in units as small as 512 bytes, which, coincidentally, is the size of a single sector of disk space. As with the FAT filesystem, unless you select an allocation unit of 512 bytes when partitioning and formatting, sectors are usually combined to make up clusters. NTFS has a large enough numeric range to keep track of a great many clusters, so clusters can be as small as a single 512-byte sector or made up of multiple sectors. The maximum number of units (clusters or sectors) that NTFS can keep track of provides for maximum disk space capacities on the order of terabytes. The most space any file will waste is some portion of a single cluster, as is also the case with FAT filesystems. As shown in Table 5-3, NTFS uses clusters to track file storage, but these clusters are much smaller than the clusters of FAT-16 or FAT-32 filesystems. It is possible to reformat NTFS partitions using XP’s Disk Management console to use smaller or larger cluster sizes at your discretion.

Table 5-3. Cluster sizes for Windows NTFS partitions

Partition size        NTFS cluster size
0–512 MB              512 bytes (1 sector)
512–1,024 MB          1,024 bytes (2 sectors)
1,024–2,048 MB        2,048 bytes (4 sectors)
2,048–4,096 MB        4,096 bytes (8 sectors); 8,192 bytes and larger possible
4,096–8,192 MB        8,192 bytes (16 sectors)
8,192–16,384 MB       16,384 bytes (32 sectors)
16,384–32,768 MB      32,768 bytes (64 sectors)
> 32,768 MB           65,536 bytes (128 sectors)

In most cases, when you cannot predict the general size or types of files you will be saving, large or small, it is preferable for storage efficiency to use the smallest cluster size possible. By large files I mean those measured in tens or hundreds of megabytes, something that really chews up disk space that you want to access with as few repetitive disk operations as possible (such as huge database files that may be found on servers, or video files). Most of us, unless we collect a lot of audio and video files, have mostly small datafiles (far less than a megabyte), including all the text and graphics from web pages, email, and average documents and spreadsheets.

Depending on the disk-caching read-ahead method and amount of cache used within a specific disk drive, using a 1 KB cluster size under NTFS will require the equivalent of 1,000 discrete disk accesses to read or write a 1 megabyte file versus 250 accesses with a 4 KB cluster size, but the alternative is having your average 1–2 KB web page and little (50–256 byte) graphics files chewing up 2–4 KB more disk space than they need to. A measly 2–4 KB may not seem like much, but if you let Internet Explorer’s Temporary Internet File caching grow to 512 MB or larger, you’re easily wasting 256 MB of disk space on a bunch of web files you may never see again anyway. Figure 5-1 illustrates the size of the datafile on an NTFS volume with 8,192-byte (16-sector) clusters (483 512-byte sectors or 30.1875 8,192-byte clusters) and the amount of disk space the file actually consumes (488 512-byte sectors or 30.5 8,192-byte clusters). This file ends up wasting 0.5 clusters or 4,096 bytes of disk space. If you add up a lot of datafiles wasting half a cluster or more—especially if the cluster sizes are 8, 16, 32, or 64 KB each—you end up with a lot of unusable disk space, also known as “slack” space, occupied by absolutely nothing of value.

Figure 5-1. Windows File Properties reveals actual data size versus disk space used
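The trade-off between fewer disk accesses and less slack can be estimated for your own mix of files. The sketch below is a rough model, not a measurement: it assumes every file occupies a whole number of clusters (ignoring, for instance, the very small files NTFS keeps inside the MFT), and the file sizes in the sample cache are invented for illustration.

```python
import math


def slack_bytes(file_sizes, cluster_bytes):
    """Total allocated-but-unused bytes for a collection of file sizes."""
    total = 0
    for size in file_sizes:
        clusters = math.ceil(size / cluster_bytes)
        total += clusters * cluster_bytes - size
    return total


def cluster_accesses(file_bytes, cluster_bytes):
    """Number of cluster-sized reads needed to cover one file."""
    return math.ceil(file_bytes / cluster_bytes)


# Hypothetical browser cache: 10,000 small pages and 20,000 tiny graphics.
cache = [1_500] * 10_000 + [200] * 20_000

for cluster in (1_024, 4_096, 8_192):
    wasted_mb = slack_bytes(cache, cluster) / 2**20
    reads = cluster_accesses(2**20, cluster)
    print(f"{cluster:>5}-byte clusters: {wasted_mb:6.1f} MB slack, "
          f"{reads} accesses per 1 MB file")
```

Counting a megabyte as 1,048,576 bytes, the access counts come out to 1,024 and 256 for 1 KB and 4 KB clusters, which the text rounds to 1,000 and 250. To try this on real data, replace `cache` with sizes gathered from a folder, for example via `os.path.getsize`.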

It is important to note that the disk operations and performance seen by the operating system can be significantly different than what goes on inside the drive itself. The drive, of course, has to read or write all sectors containing data the operating system wants and will be doing all of the mechanical work to find each and every sector needed, be they contiguous (unfragmented files) or spread out in different places on the drive (fragmented files).

The operating system’s file and directory scheme keeps track of files in the file tables (the directory or the Master File Table) and tells the drive where to get file fragments from. The drive only knows how to find tracks and sectors and doesn’t know where specific files are. If the drive’s firmware and caching scheme are smart, it will optimize file placement and file reads by itself. If the drive has a large internal cache, it will take in all or most of the operating system’s commands, tell the OS it’s “got it,” and go off and do the work, releasing the OS to do other things. Someday perhaps we’ll have operating-system-aware disk drives or specific disk drives that filesystems offload functions to so the OS can be an OS rather than a file manager, but for now the operating system and driver vendors are responsible for optimizing their file- and disk-handling functions.

TIP: NTFS supports and can be configured to use on-the-fly file compression to save disk space on a drive-by-drive or file-by-file basis. NTFS will not compress files on a drive using a cluster size larger than 4 KB.

Jim Aspinwall has been the Windows Helpdesk columnist and feature editor for CNET.com and is the author of three books on PC maintenance.

