Endianness and C

7th Feb 2008 | 11:50

This is in response to mathew's recent post. I'm posting it here (and emailing it) because I can't get his frustrating Wordpress installation to log me in.

He asks, "It’s possible to write some C code to work out whether a machine’s architecture is little-endian or big-endian with respect to bytes. Is it possible, using only ANSI C, to work out whether the machine’s architecture is big-endian or little-endian with respect to bits?"

Machines do not (in general) have any bitwise endianness that's visible to programmers. In order to access bits you have to use shift and mask operations in registers, and these operations do not imply an absolute numbering of bits in the same way that the machine's addressing architecture does for bytes. (Instruction architecture documentation will often number the bits in a word, but a program running on the machine can't tell whether this documentation uses big-endian or little-endian conventions, or whether it numbers the bits from zero or one.)
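To make the contrast concrete, here is a minimal sketch in ANSI C. The function names are made up for illustration. The first function detects byte order by looking at how a word is laid out in memory, which works because bytes are individually addressable. The second picks out a bit with shift and mask. Its result is defined purely in terms of the value, so it gives the same answer on every machine and reveals nothing about how the hardware or its documentation numbers bits.

```c
#include <string.h>

/* Returns 1 if the least significant byte of a word is stored at the
   lowest address (little-endian), 0 for big-endian or any mixed order.
   This works because memory is byte-addressable. */
int is_little_endian_bytes(void)
{
    unsigned long word = 0x01020304UL;
    unsigned char bytes[sizeof word];

    memcpy(bytes, &word, sizeof word);
    return bytes[0] == 0x04;
}

/* Returns bit n of value, counting from the least significant bit.
   n must be less than the width of unsigned long. The shift is defined
   on the value, not on any storage layout, so the result is the same
   on every conforming implementation. */
unsigned bit_from_lsb(unsigned long value, unsigned n)
{
    return (unsigned)((value >> n) & 1UL);
}
```

Calling `bit_from_lsb(0x5UL, 0)` yields 1 on any machine, whereas the result of `is_little_endian_bytes()` depends on the hardware.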

Machines do have bitwise endianness at a lower level that is invisible to the programmer, when buses or IO serialise and deserialise bytes, e.g. to and from RAM or disk or network. The serialised form is not accessible to programs, so its endianness doesn't matter. There is an exception to this, which is output devices where the serialised form is visible to the user, e.g. displays with sub-byte pixels. But this is the device's endianness, not the whole machine's, and different devices attached to the same machine can legitimately disagree.
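As a rough illustration of why the serialised order does not matter so long as both ends agree, here is a toy model of a serial link in C. The functions `serialise_byte` and `deserialise_byte` and the `wire` array are invented for this sketch and do not model any real bus. The program only ever sees the byte that comes out of the far end, which is the same for either bit order.

```c
/* Puts the low 8 bits of byte onto a pretend wire, one bit per element.
   msb_first chooses the bit order used on the wire. */
void serialise_byte(unsigned byte, int msb_first, unsigned char wire[8])
{
    int i;

    for (i = 0; i < 8; i++) {
        int shift = msb_first ? 7 - i : i;
        wire[i] = (unsigned char)((byte >> shift) & 1u);
    }
}

/* Reassembles a byte from the pretend wire using the given bit order. */
unsigned deserialise_byte(const unsigned char wire[8], int msb_first)
{
    unsigned byte = 0;
    int i;

    for (i = 0; i < 8; i++) {
        int shift = msb_first ? 7 - i : i;
        byte |= (unsigned)wire[i] << shift;
    }
    return byte;
}
```

If both ends use the same convention, the byte round-trips unchanged whichever convention that is. The order only becomes observable when something reads the wire with a different convention, which is the situation with the output devices mentioned above.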

The same arguments about machines' instruction architectures apply to most of the C programming model, i.e. bits are mostly only accessible by shifts and masks, so endianness isn't visible. The exception is bit fields in structs, which allow words to be broken down into units of arbitrary size. Bit fields must have an endianness convention for how they are packed into words, but this is chosen by the compiler and need not agree with the hardware's byte-wise endianness. However, the compiler usually does agree with the hardware, so the following two structures usually have the same layout, though the standard does not require this:

        struct a {
                unsigned char first;
                unsigned char second;
        };
        struct b {
                unsigned first : 8;
                unsigned second : 8;
        };
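
Whether a particular compiler lays the two structures out the same way can be checked at run time. The following is a sketch only: `layouts_match` is a hypothetical helper, and it assumes 8-bit bytes and no padding between the two `unsigned char` members. It compares the first two bytes of storage of each structure after giving the members the same values.

```c
#include <string.h>

struct a {
    unsigned char first;
    unsigned char second;
};

struct b {
    unsigned first : 8;
    unsigned second : 8;
};

/* Returns 1 if the first two bytes of struct b's storage match those of
   struct a when the members hold the same values, 0 otherwise.
   The answer is implementation-defined: it depends on how the compiler
   packs bit fields into their storage unit. */
int layouts_match(void)
{
    struct a bytes;
    struct b fields;
    unsigned char raw_a[sizeof(struct a)];
    unsigned char raw_b[sizeof(struct b)];

    /* Zero the bit-field struct first so unused bits are predictable. */
    memset(&fields, 0, sizeof fields);
    bytes.first = 0x12;
    bytes.second = 0x34;
    fields.first = 0x12;
    fields.second = 0x34;

    memcpy(raw_a, &bytes, sizeof bytes);
    memcpy(raw_b, &fields, sizeof fields);

    return raw_a[0] == raw_b[0] && raw_a[1] == raw_b[1];
}
```

A result of 1 means the compiler's bit-field packing agrees with the hardware's byte order for this case. A result of 0 would be unusual but is permitted by the standard.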


Comments {3}


from: cartesiandaemon
date: 7th Feb 2008 12:40 (UTC)

I'm a little confused. Do modern computers even *have* bit endianness? I see someone asked that question a lot more directly on the other post. What you call "no endianness visible to programmers" I would have described as "no endianness". For instance, if a consistent bit-endianness is used wherever one is required (you mention serial buses), that would be the machine's endianness. But need it be consistent?

If you *could* address sub-byte addresses (say, you specified addresses in *bits*, and normally but not always had to give a multiple of 8 or 64) then there obviously would be an endianness, and the compiler would likely choose to agree with it, which your example would show.

But if you can't, then apart from a few exceptions, isn't a byte a unit -- a collection of eight wires which might be left-to-right or right-to-left depending on optimizations? So you can ask "Is the first of a bit field the same as the LSB or MSB?" But the hardware would treat that as arbitrary, so long as LSB matches to LSB when bytes meet?


Tony Finch

from: fanf
date: 7th Feb 2008 12:54 (UTC)

You asked:

Do modern computers even *have* bit endianness?

I said:

Machines do not (in general) have any bitwise endianness



from: cartesiandaemon
date: 7th Feb 2008 13:22 (UTC)

Oh yes. Sorry, never mind.
