If you do, you’re a step ahead of Microsoft.

For all their highly qualified programmers, Microsoft don’t use the correct units for file sizes. In Windows systems data sizes are written in the binary size form, but labelled in the decimal form. This can be a bit confusing, so I’ll do my best to explain it.

N.B. Microsoft are not the only offender in this matter, but I mention them because they are at the core of many computer users’ lives.

If you are familiar with the difference between mega and mebi, giga and gibi, feel free to skip ahead. I won’t be offended. Much. 😛

Decimal units

Decimal data units are what you are used to seeing and what you will assume you are seeing on Microsoft systems too. A few of these unit prefixes are kilo, mega, giga and tera. They are also commonly used, incorrectly, to label binary data sizes, and there have even been lawsuits disputing storage sizes.

Decimal prefixes can be represented as powers of ten, 10^x. These are the same prefixes used in everyday metric measures, for example 1 kilometre = 1000 metres.

Binary units

Binary units are used less often and I know of some programmers and engineers who have never even heard of them before. A few of these are kibi, mebi, gibi and tebi. These are contractions, for example “mebi” is a contraction of “mega binary”.

Although many have not heard of these units, they are used more than you might realise, because binary sizes of bytes are often labelled with decimal units in error, such as with Microsoft giving gibibyte values the gigabyte unit (GB).

Byte units in the binary form can be represented as powers of two, 2^x bytes.

Comparison

This becomes much clearer when binary and decimal values are compared side by side.

(Please read “,” as a thousands separator, not a decimal point.)

| Decimal (SI) unit | Bytes | Binary (IEC) unit | Bytes |
|---|---|---|---|
| kB (kilobyte) | 10^3 = 1,000 | KiB (kibibyte) | 2^10 = 1,024 |
| MB (megabyte) | 10^6 = 1,000,000 | MiB (mebibyte) | 2^20 = 1,048,576 |
| GB (gigabyte) | 10^9 = 1,000,000,000 | GiB (gibibyte) | 2^30 = 1,073,741,824 |
| TB (terabyte) | 10^12 = 1,000,000,000,000 | TiB (tebibyte) | 2^40 = 1,099,511,627,776 |

As you can see, in the higher units the difference becomes substantial. We are now at the point where terabytes are becoming commonplace, and if the unit is incorrect you could be misled by a difference of almost 10%.
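To put numbers on that gap, here is a minimal Python sketch that works out how much larger each binary unit is than its decimal counterpart. The names `PREFIXES` and `prefix_gap_percent` are my own, made up for illustration; the values come straight from the powers in the comparison above.

```python
# Decimal (SI) and binary (IEC) unit symbols paired by rank:
# rank 1 = kilo/kibi, rank 2 = mega/mebi, and so on.
PREFIXES = [
    ("kB", "KiB", 1),
    ("MB", "MiB", 2),
    ("GB", "GiB", 3),
    ("TB", "TiB", 4),
]


def prefix_gap_percent(rank: int) -> float:
    """Return how much larger the binary unit is than the decimal one, in percent."""
    decimal = 10 ** (3 * rank)  # 10^3, 10^6, 10^9, 10^12
    binary = 2 ** (10 * rank)   # 2^10, 2^20, 2^30, 2^40
    return (binary - decimal) / decimal * 100


for si, iec, rank in PREFIXES:
    print(f"{iec} is {prefix_gap_percent(rank):.2f}% larger than {si}")
# KiB is 2.40% larger than kB
# MiB is 4.86% larger than MB
# GiB is 7.37% larger than GB
# TiB is 9.95% larger than TB
```

The gap grows with every step up, which is why the mislabelling is more noticeable on today's drives than it was on floppy disks.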

Microsoft and the units

Microsoft do all their workings in binary sizes of bytes (KiB, MiB, GiB) but label them as decimal (kB, MB, GB). To provide some evidence of this, I will show you some screen-shots regarding hard drive size.

Screen-shot of System Information on Windows Vista. The section shown is 'Disks' in the section 'Storage' in the section 'Components'. The highlighted item is 'Size' with value '298.09 GB (320,070,320,640 bytes)'

In System Information you can see that my hard-drive is 320,070,320,640 bytes. Dell advertised this drive as 320GB. This is the correct number of gigabytes, as 320,070,320,640 / 1,000,000,000 (bytes in a gigabyte) = 320 as a whole number.

To the left of the bytes is what Microsoft say the drive is in gigabytes: 298.09GB. This number is a binary representation, in gibibytes, but labelled as the decimal form, gigabytes: 320,070,320,640 / 1,073,741,824 (bytes in a gibibyte) = 298.09 (to 2 decimal places).

To summarise the point, my hard-drive’s size should be written as 320GB or 298GiB.
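If you would like to check the arithmetic yourself, the two divisions can be sketched in a few lines of Python. The function names are hypothetical, chosen for this example; the byte count is the one shown in System Information above.

```python
GB = 10 ** 9   # bytes in a gigabyte (decimal)
GIB = 2 ** 30  # bytes in a gibibyte (binary)


def to_gigabytes(size_bytes: int) -> float:
    """Convert a byte count to decimal gigabytes."""
    return size_bytes / GB


def to_gibibytes(size_bytes: int) -> float:
    """Convert a byte count to binary gibibytes."""
    return size_bytes / GIB


drive_bytes = 320_070_320_640  # size reported by System Information

print(f"{to_gigabytes(drive_bytes):.2f} GB")   # 320.07 GB
print(f"{to_gibibytes(drive_bytes):.2f} GiB")  # 298.09 GiB
```

Both figures describe exactly the same drive; only the unit differs.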

Screen-shot of the Properties dialogue box on Windows Vista for drive C. A section is highlighted, reading: 'Used space: 122,234,949,632 bytes  113 GB. Free space: 182,066,348,032 bytes  169 GB. Capacity: 304,301,297,664 bytes  283 GB.'

You can see this problem again in the drive properties dialogue box. In the screen-shot, used space, free space and capacity should read 122GB, 182GB and 304GB respectively, or the units should be GiB.
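As a rough illustration of how a size could be labelled correctly in either form, here is a small Python sketch. It is not how Windows formats sizes; `format_size` and the unit lists are assumptions of mine, and unlike the dialogue box (which appears to drop the fraction) it rounds to one decimal place.

```python
SI_UNITS = ["B", "kB", "MB", "GB", "TB"]       # decimal: steps of 1000
IEC_UNITS = ["B", "KiB", "MiB", "GiB", "TiB"]  # binary: steps of 1024


def format_size(size_bytes: int, binary: bool = False) -> str:
    """Format a byte count with the unit that matches the base used."""
    base = 1024 if binary else 1000
    units = IEC_UNITS if binary else SI_UNITS
    value = float(size_bytes)
    index = 0
    # Step up one prefix at a time until the value fits the current unit.
    while value >= base and index < len(units) - 1:
        value /= base
        index += 1
    return f"{value:.1f} {units[index]}"


# Byte counts taken from the drive properties screen-shot.
for label, size in [
    ("Used space", 122_234_949_632),
    ("Free space", 182_066_348_032),
    ("Capacity", 304_301_297_664),
]:
    print(f"{label}: {format_size(size)} or {format_size(size, binary=True)}")
# Used space: 122.2 GB or 113.8 GiB
# Free space: 182.1 GB or 169.6 GiB
# Capacity: 304.3 GB or 283.4 GiB
```

Either column would be fine on its own. The problem only appears when the binary numbers are paired with the decimal labels.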

Why does it matter?

If you only ever use Windows and never look at the size of your hard-drive or other storage, and never look at file sizes on the Internet or external media, then it doesn’t matter, because you will only ever be working with sizes in the same form.

However, if you don’t live in a Microsoft bubble then you will come across decimal data sizes that don’t match up when Windows tells you the size. For example, you purchase a drive of one size and Windows says it is a different size, due to using the wrong units. This can become a problem.
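To see the mismatch before you buy, you can work out roughly what number Windows would show for an advertised capacity. This is only a sketch: `windows_style_size` is a name I made up, the 1000GB (1TB) drive is a hypothetical example, and real drives usually hold slightly more bytes than the round figure on the box, so the displayed number may differ a little.

```python
GIB = 2 ** 30  # bytes in a gibibyte


def windows_style_size(advertised_gb: float) -> float:
    """Convert decimal gigabytes to the gibibyte figure Windows labels as 'GB'."""
    size_bytes = advertised_gb * 10 ** 9  # advertised sizes use decimal gigabytes
    return size_bytes / GIB


# A hypothetical drive sold as 1000GB (1TB).
print(f"{windows_style_size(1000):.2f}")  # 931.32
```

So a drive sold as 1TB would appear as roughly “931 GB”, even though nothing is missing.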

How many users would consider that it is Microsoft that is wrong and not the smaller company they purchased the storage from? I certainly jumped to the conclusion that Dell were in the wrong about my hard-drive, until I looked in System Information.

These units are standardised for a reason: so that people know what values they are reading based on the unit provided. Microsoft using the wrong units could be compared to speeds in miles per hour being labelled as kilometres per hour. The difference in values there is far greater, but you get the point. 😉

Why don’t Microsoft use the correct one?

That’s not a question I can directly answer, but the likelihood is that Microsoft have not yet adopted the IEC’s binary prefixes because they are still relatively new, having been approved by the IEC in December 1998.

Microsoft’s operating systems have been around for far longer than binary prefixes, so it could confuse those who have been using Windows since before the prefixes existed if Microsoft suddenly made the change.

However, I do still find it odd that Microsoft chose to label binary multiples with decimal units, because unit prefixes such as “kilo” and “mega” have been used in science for a long time. To call 1024 bytes a kilobyte when 1000 metres is a kilometre is just asking for trouble, because it goes against the norm.

References & further reading

While writing this article I used several Wikipedia articles to make sure I didn’t get anything wrong. (Don’t worry, I checked the sources and other sites to confirm.)

If you would like to learn more about this topic, I recommend you read the IEC’s article on prefixes for binary multiples. That article explains things far better than I have. I suppose I could have just linked to it and not written a full article, but I wanted to input my own opinions on the subject. 😀

There is also an interesting Wikipedia article on binary prefixes.

    Have your say

    8 comments

    • I bought a 500 GB drive and it only appears as a 465 GB drive in Windows explorer. I think they are all in it together, so they can sell items that are seemingly higher in quoted capacity, but the actual realized capacity as recognized in Windows is less.

    • Ah, so that’s why my 4GB drive is recognized less than 4GBs.

      I can see how it’s more of a problem as time goes by.

      Right now we’re starting to consider even Terabytes and Tebibytes…it’ll definitely get more serious.

    • Thanks – that really cleared up my confusion. I think a good analogy would be selling a quart of milk as if it were a litre (since a quart is pretty close to a litre.) If you sell one bottle to a single customer they might not mind so much, but what if you sold it to a distributor buying 10,000 units?

      • A slightly better analogy is that the manufacturer is selling the milk accurately as 10,000 quarts, but the customer believes they received only 9464 quarts because the company who made their measuring device measures in litres but reports the value as quarts. (Microsoft is selling the incorrectly-labelled measuring devices.)

    • The thing is the kibi, mebi, gibi binary prefixes didn’t come in until 1999. We are getting to a point where this needs sorting out but I wouldn’t blame Microsoft for it. If I was to put my conspiracy theory hat on I’d blame the hard drive manufacturers: they had the lawsuits coming for false advertisement. They were fine in the 90’s but they could see that within 10 years they’d be facing lawsuits, so they engineered a new standard.

      • So you’re saying Microsoft hasn’t had enough time or something? It’s been sixteen years! They’ve made at least *eight* releases of Windows since the IEC released the binary prefixes standard. Linux (the kernel) started using the IEC prefixes around 2001, and both Mac OS and most software on Linux systems have switched to the IEC prefixes by 2008 or 2009. Microsoft is the last holdout, as usual.

        The hard drive manufacturers did not come up with the binary prefixes; they have used only SI unit prefixes for about the last twenty years, so they’ve had no need for them.

    • Pointless carping, in my opinion. Like saying it’s not a blizzard because the wind only hit 49 mph, not 50. Consider that any storage system running close to the difference between the units is already in trouble. Also most file systems probably waste 10% in unallocated space at the end of clusters, in the middle of files, and fragmentation.
