The byte order fallacy

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
5 messages Options
Reply | Threaded
Open this post in threaded view
|

The byte order fallacy

Ben Coman
 
I bumped into an interesting article [1] by Rob Pike that... 
i = *((int*)data);
#ifdef BIG_ENDIAN
/* swap the bytes */
i = ((i&0xFF)<<24) | (((i>>8)&0xFF)<<16) | (((i>>16)&0xFF)<<8) | (((i>>24)&0xFF)<<0);
#endif

is the wrong way to handle endianness since... 
  "It assumes integers are addressable at any byte offset; on some machines that's not true."
  "It depends on integers being 32 bits long, or requires more #ifdefs to pick a 32-bit integer type."

I guess some people here have byte-ordering war wounds so I'd value your opinion on whether I should file that as good advice.
I notice we have a couple of instances of this...

$  find platforms -name "*.c" -exec grep -Hi "<<" {} \; | grep ">>" | grep 24 | grep 8
./win32/plugins/FontPlugin/sqWin32FontPlugin.c:  #define BYTE_SWAP(w) ((w << 24) | ((w & 0xFF00) << 8) | ((w >> 8) & 0xFF00) | (w >> 24))
./win32/vm/sqWin32Window.c:   #define BYTE_SWAP(w) w = (w<<24) | ((w&0xFF00)<<8) | ((w>>8)&0xFF00) | (w>>24)

cheers -ben

[1] https://commandcenter.blogspot.com.au/2012/04/byte-order-fallacy.html

Reply | Threaded
Open this post in threaded view
|

Re: The byte order fallacy

Nicolas Cellier
 
Hi Ben,
last year I added SQ_SWAP_4_BYTES and SQ_SWAP_8_BYTES macros in platforms/Cross/vm/sqMemoryAccess.h
These macros use the compiler intrinsics if possible.
https://github.com/OpenSmalltalk/opensmalltalk-vm/commit/c0da5abcc7a538efd3ab6dcc4bb2d2ef3ac5c65a
So we should not define any new one and rather replace the suboptimal ones with the ones above.

i = *((int*)data);
is doubly bad because of possible alignment problem as you told,but also because it introduces pointer aliasing, what the C compiler does not like, and the standards neither (C89 C99 ...).

However, our int are 32 bits long in all supported platforms. Would it not be the case, the VM would break in many places.
For example, you can see the cost in term of number of commits for the fact that sizeof long < sizeof void * in LLP64 both in VMMaker and Opensmalltalk-vm source.

2017-06-13 6:13 GMT+02:00 Ben Coman <[hidden email]>:
 
I bumped into an interesting article [1] by Rob Pike that... 
i = *((int*)data);
#ifdef BIG_ENDIAN
/* swap the bytes */
i = ((i&0xFF)<<24) | (((i>>8)&0xFF)<<16) | (((i>>16)&0xFF)<<8) | (((i>>24)&0xFF)<<0);
#endif

is the wrong way to handle endianness since... 
  "It assumes integers are addressable at any byte offset; on some machines that's not true."
  "It depends on integers being 32 bits long, or requires more #ifdefs to pick a 32-bit integer type."

I guess some people here have byte-ordering war wounds so I'd value your opinion on whether I should file that as good advice.
I notice we have a couple of instances of this...

$  find platforms -name "*.c" -exec grep -Hi "<<" {} \; | grep ">>" | grep 24 | grep 8
./win32/plugins/FontPlugin/sqWin32FontPlugin.c:  #define BYTE_SWAP(w) ((w << 24) | ((w & 0xFF00) << 8) | ((w >> 8) & 0xFF00) | (w >> 24))
./win32/vm/sqWin32Window.c:   #define BYTE_SWAP(w) w = (w<<24) | ((w&0xFF00)<<8) | ((w>>8)&0xFF00) | (w>>24)

cheers -ben

[1] https://commandcenter.blogspot.com.au/2012/04/byte-order-fallacy.html



Reply | Threaded
Open this post in threaded view
|

Re: The byte order fallacy

Ben Coman
 
cool. thx Nicolas.
cheers -ben

On Tue, Jun 13, 2017 at 3:38 PM, Nicolas Cellier <[hidden email]> wrote:
 
Hi Ben,
last year I added SQ_SWAP_4_BYTES and SQ_SWAP_8_BYTES macros in platforms/Cross/vm/sqMemoryAccess.h
These macros use the compiler intrinsics if possible.
https://github.com/OpenSmalltalk/opensmalltalk-vm/commit/c0da5abcc7a538efd3ab6dcc4bb2d2ef3ac5c65a
So we should not define any new one and rather replace the suboptimal ones with the ones above.

i = *((int*)data);
is doubly bad because of possible alignment problem as you told,but also because it introduces pointer aliasing, what the C compiler does not like, and the standards neither (C89 C99 ...).

However, our int are 32 bits long in all supported platforms. Would it not be the case, the VM would break in many places.
For example, you can see the cost in term of number of commits for the fact that sizeof long < sizeof void * in LLP64 both in VMMaker and Opensmalltalk-vm source.

2017-06-13 6:13 GMT+02:00 Ben Coman <[hidden email]>:
 
I bumped into an interesting article [1] by Rob Pike that... 
i = *((int*)data);
#ifdef BIG_ENDIAN
/* swap the bytes */
i = ((i&0xFF)<<24) | (((i>>8)&0xFF)<<16) | (((i>>16)&0xFF)<<8) | (((i>>24)&0xFF)<<0);
#endif

is the wrong way to handle endianness since... 
  "It assumes integers are addressable at any byte offset; on some machines that's not true."
  "It depends on integers being 32 bits long, or requires more #ifdefs to pick a 32-bit integer type."

I guess some people here have byte-ordering war wounds so I'd value your opinion on whether I should file that as good advice.
I notice we have a couple of instances of this...

$  find platforms -name "*.c" -exec grep -Hi "<<" {} \; | grep ">>" | grep 24 | grep 8
./win32/plugins/FontPlugin/sqWin32FontPlugin.c:  #define BYTE_SWAP(w) ((w << 24) | ((w & 0xFF00) << 8) | ((w >> 8) & 0xFF00) | (w >> 24))
./win32/vm/sqWin32Window.c:   #define BYTE_SWAP(w) w = (w<<24) | ((w&0xFF00)<<8) | ((w>>8)&0xFF00) | (w>>24)

cheers -ben

[1] https://commandcenter.blogspot.com.au/2012/04/byte-order-fallacy.html





Reply | Threaded
Open this post in threaded view
|

Re: The byte order fallacy

K K Subbu
In reply to this post by Nicolas Cellier
 
On Tuesday 13 June 2017 01:08 PM, Nicolas Cellier wrote:
> last year I added SQ_SWAP_4_BYTES and SQ_SWAP_8_BYTES macros in
> platforms/Cross/vm/sqMemoryAccess.h
> These macros use the compiler intrinsics if possible.
> https://github.com/OpenSmalltalk/opensmalltalk-vm/commit/c0da5abcc7a538efd3ab6dcc4bb2d2ef3ac5c65a
> So we should not define any new one and rather replace the suboptimal
> ones with the ones above.

Why not use htonl family of functions? They are part of POSIX standards
and widely available in <netinet/in.h>.

See the man pages for htonl and endian.

Regards .. Subbu
Reply | Threaded
Open this post in threaded view
|

Re: The byte order fallacy

Tobias Pape
 

> On 13.06.2017, at 20:28, K K Subbu <[hidden email]> wrote:
>
> On Tuesday 13 June 2017 01:08 PM, Nicolas Cellier wrote:
>> last year I added SQ_SWAP_4_BYTES and SQ_SWAP_8_BYTES macros in
>> platforms/Cross/vm/sqMemoryAccess.h
>> These macros use the compiler intrinsics if possible.
>> https://github.com/OpenSmalltalk/opensmalltalk-vm/commit/c0da5abcc7a538efd3ab6dcc4bb2d2ef3ac5c65a
>> So we should not define any new one and rather replace the suboptimal
>> ones with the ones above.
>
> Why not use htonl family of functions? They are part of POSIX standards and widely available in <netinet/in.h>.
>
> See the man pages for htonl and endian.

because "On machines which have a byte order which is the same as the network order, [htonl is] defined as null macro[]."
which defies the generality of the swapping functions. Also, per posix there's only 16 and 32 bit versions of those, but
we need 32 and 64 bit versions.

Best regards
        -Tobias


>
> Regards .. Subbu