Are all integers stored as two's complement internally?

Started by
9 comments, last by Adam_42 4 years, 2 months ago

I'm brushing up on my assembly skills, basically just following a book at this point, but I'm also doing some research on my own by disassembling all the programs that I assemble (I assemble with ml64, the Microsoft assembler, and disassemble with x64dbg). I'm getting some strange results in the disassembly. For example, I'm storing 9,223,372,036,854,775,807 in a quad word, which should be the max integer value that can be held by a qword, and the disassembly shows FF FF FF FF FF FF FF 7F. However, if I add 1 to the number, making it 9,223,372,036,854,775,808, the disassembly shows 00 00 00 00 00 00 00 80. Why is that? What is the 80 in the last 8 bits? Is that related to all the numbers being stored in two's complement? I also have a related question: if all the numbers are stored in two's complement, doesn't that mean that all the stored numbers are automatically 63 bits (the last bit being used to make two's complement work)? What if I want to create an unsigned integer and use all 64 bits? How would I specify that to the assembler? Thanks.

You didn't come into this world. You came out of it, like a wave from the ocean. You are not a stranger here. -Alan Watts

You're looking at the output in the wrong order.

I assume little endian, so FF FF FF FF FF FF FF 7F in the disassembly is actually 7FFFFFFFFFFFFFFF as a qword. Add one and you get 8000000000000000,
which means the MSB is getting set.
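
To make the byte order concrete, here is a minimal Python sketch (not what x64dbg does internally, just the same layout rule). The helper name `qword_bytes_le` is made up for this illustration; it prints a 64-bit value the way a memory dump shows it, lowest byte first.

```python
def qword_bytes_le(value):
    """Return the 8 bytes of a 64-bit value as hex, lowest-order byte first."""
    # Mask to 64 bits so the value always fits in a qword.
    return (value & 0xFFFFFFFFFFFFFFFF).to_bytes(8, "little").hex(" ").upper()

max_signed = 0x7FFFFFFFFFFFFFFF  # 9,223,372,036,854,775,807

print(qword_bytes_le(max_signed))      # FF FF FF FF FF FF FF 7F
print(qword_bytes_le(max_signed + 1))  # 00 00 00 00 00 00 00 80
```

The trailing 80 is simply the highest-order byte of 8000000000000000, shown last because the dump is little endian.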

Whether that's a negative number, or the next positive value depends on how you treat the value. For the CPU there's no difference.
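
A small Python sketch of that point, assuming the same eight bytes from the dump: the stored bits never change, only the reading of them does. The helper names `as_unsigned` and `as_signed` are invented for the example.

```python
def as_unsigned(raw):
    """Read little-endian bytes as an unsigned integer."""
    return int.from_bytes(raw, "little", signed=False)

def as_signed(raw):
    """Read the very same bytes as a two's complement signed integer."""
    return int.from_bytes(raw, "little", signed=True)

raw = bytes.fromhex("00 00 00 00 00 00 00 80")  # bytes as shown in the dump

print(as_unsigned(raw))  # 9223372036854775808
print(as_signed(raw))    # -9223372036854775808
```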

Fruny: Ftagn! Ia! Ia! std::time_put_byname! Mglui naflftagn std::codecvt eY'ha-nthlei!,char,mbstate_t>

Even in higher-level languages (where the language allows it, and C/C++ do) you can treat your number as whatever you like. Whether the high-order bit indicates a sign or is used as an additional value bit depends only on the current processing context. You can even split numbers and handle them as pairs of smaller numbers, low-order part first or vice versa.

One technique to speed up a memory compare, for example, is to treat the memory as a sequence of QWORDs (64 bits each) and compare those against each other (if possible).

However, endianness is always a problem once you start looking at numbers in binary. Network traffic is supposed to handle everything in big endian, same for databases, while Windows is the only little endian system (as far as I know).

Shaarigan said:

However, endianness is always a problem once you start looking at numbers in binary. Network traffic is supposed to handle everything in big endian, same for databases, while Windows is the only little endian system (as far as I know).

Endianness is a property of the processor, not the operating system. Almost all processors you are likely to come across in game development are little endian these days, with the exception of the ARM series, which are bi-endian. But you're correct that network communications are defined as big endian, a by-product of the prevailing choice on the machines used when these standards were defined.
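
As a rough illustration of the two byte orders, here is a Python sketch using the standard `struct` module. The helper names `pack_le`, `pack_be` and `pack_network` are made up for the example; the format codes `<`, `>` and `!` are the module's own.

```python
import struct

def pack_le(value):
    """Pack a 64-bit unsigned value in little-endian order (x86-style memory)."""
    return struct.pack("<Q", value)

def pack_be(value):
    """Pack a 64-bit unsigned value in big-endian order."""
    return struct.pack(">Q", value)

def pack_network(value):
    """Pack using struct's 'network' byte order, which is big endian."""
    return struct.pack("!Q", value)

value = 0x0102030405060708
print(pack_le(value).hex(" "))       # 08 07 06 05 04 03 02 01
print(pack_be(value).hex(" "))       # 01 02 03 04 05 06 07 08
print(pack_network(value).hex(" "))  # 01 02 03 04 05 06 07 08
```

Code that sends integers over the wire usually converts explicitly like this rather than relying on whatever order the host CPU happens to use.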

VanillaSnake21 said:
if all the numbers are stored in two's complement, doesn't that mean that all the stored numbers are automatically 63 bits

Well, two's complement uses all N bits to store an N-bit integer: N-1 ordinary value bits plus a top bit that carries the weight -2^(N-1). So all 64-bit integers represented in two's complement take 64 bits of storage. It might be quibbling, but calling them 63-bit numbers would be like saying the decimal number “10” is a single-digit number because the least-significant digit is a zero.

Yes, most common processors support doing integer arithmetic using two's complement representation, and most modern compilers will generate code accordingly. If you do a close reading of ISO/IEC 9899 (the C programming language standard), it weasels around how the underlying hardware represents integers: editions before C23 also permit sign-magnitude and ones' complement, and the minimum ranges in the limits section are written to accommodate them. In practice, though, two's complement is what essentially every implementation you will meet uses, and C23 finally requires it.

Various historic mainframes supported other forms of binary representation of integers. The x86 still has BCD instructions (because why not have more instructions than anyone can ever learn? it boosts transistor counts!) but ARM and other more ‘modern’ processors do not.

Stephen M. Webb
Professional Free Software Developer

@Endurion Oooh, right, the endianness. Makes sense now. For some reason I thought it overflowed when I added the one, which made me believe it was limited at that number. Now I'm doing more testing and I see I can go all the way up to 18,446,744,073,709,551,615 to make the debugger read FF FF FF FF FF FF FF FF, so yeah, I can see it's either negative or positive depending on what sort of convention I assume. Thank you.

@endurion Hey, I just have another question about that. Let's say I want to add two unsigned integers together with the “add” instruction. How would the CPU know if the numbers are negative or positive, since the MSB is set in both of them?

The difference between signed and unsigned is one of interpretation. As far as the CPU is concerned, all integers are simultaneously signed AND unsigned - the resulting bit patterns are the same either way. There is a set of flags that indicate noteworthy results - zero, negative, signed overflow, unsigned overflow (i.e. carry) and these are all set or cleared appropriately after any arithmetic instruction.

So for example, if you tell the CPU to add the bytes 7F and 01, the result will be 80, and the flags will indicate there was signed overflow, no carry, and that the result is negative and not zero. If you instead add the bytes F0 and 30, the result is 20, with the flags showing a carry, but no signed overflow, and that the result is neither negative nor zero. And if you add, say, E0 and E0, you get C0: carry, no overflow, negative, and not zero. From there, it's up to your own code to decide which of those flags are actually relevant. If you're doing unsigned addition, for example, you care about the carry, but if you're doing signed math you care about the overflow and negative flags.
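
The three byte additions above can be checked with a small Python model of an 8-bit add. This is only a sketch of how the flags are conventionally derived, not the CPU's actual circuitry, and the name `add8` and the dictionary keys are invented for the illustration.

```python
def add8(a, b):
    """Add two bytes; return (result byte, flags) like an 8-bit ADD would."""
    a &= 0xFF
    b &= 0xFF
    total = a + b
    result = total & 0xFF
    flags = {
        # Unsigned overflow: the true sum did not fit in 8 bits.
        "carry": total > 0xFF,
        # Signed overflow: both inputs share a sign bit that the result lacks.
        "overflow": ((a ^ result) & (b ^ result) & 0x80) != 0,
        # "Negative" is just a copy of the result's top bit.
        "negative": (result & 0x80) != 0,
        "zero": result == 0,
    }
    return result, flags

for a, b in [(0x7F, 0x01), (0xF0, 0x30), (0xE0, 0xE0)]:
    result, flags = add8(a, b)
    set_flags = [name for name, is_set in flags.items() if is_set]
    print(f"{a:02X} + {b:02X} = {result:02X}  set: {set_flags}")
```

Note that the result byte is computed once, with no knowledge of signedness; only the choice of which flag to look at afterwards differs.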

@anthony serrano

OK, I see what you mean. I've done some sample problems and can see that the bit patterns are indeed the same; it's just that with unsigned addition you care about the carry and with signed addition you discard the carry. Thanks for explaining that!

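Note that for multiplication, there are different instructions for unsigned and signed integers (MUL and IMUL respectively), as it does matter in that case: the low half of the product is the same either way, but the upper half of the full-width result differs.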

Division is the same (DIV and IDIV).
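
Here is a simplified Python sketch of why multiplication and division need two variants, using byte operands. The function names are invented, and the division helpers are a simplification: the real DIV and IDIV take a double-width dividend, which is ignored here.

```python
def to_signed8(x):
    """Reinterpret a byte as a two's complement signed value (-128..127)."""
    x &= 0xFF
    return x - 0x100 if x & 0x80 else x

def mul8_unsigned(a, b):
    """MUL-style: bytes read as unsigned, 16-bit product."""
    return ((a & 0xFF) * (b & 0xFF)) & 0xFFFF

def mul8_signed(a, b):
    """IMUL-style: bytes read as signed, 16-bit product."""
    return (to_signed8(a) * to_signed8(b)) & 0xFFFF

def div8_unsigned(a, b):
    """DIV-style quotient with both bytes read as unsigned."""
    return ((a & 0xFF) // (b & 0xFF)) & 0xFF

def div8_signed(a, b):
    """IDIV-style quotient: signed operands, truncating toward zero."""
    sa, sb = to_signed8(a), to_signed8(b)
    quotient = abs(sa) // abs(sb)
    if (sa < 0) != (sb < 0):
        quotient = -quotient
    return quotient & 0xFF

print(f"{mul8_unsigned(0xFF, 0x02):04X}")  # 01FE  (255 * 2 = 510)
print(f"{mul8_signed(0xFF, 0x02):04X}")    # FFFE  (-1 * 2 = -2)
print(f"{div8_unsigned(0xFE, 0x02):02X}")  # 7F    (254 / 2 = 127)
print(f"{div8_signed(0xFE, 0x02):02X}")    # FF    (-2 / 2 = -1)
```

The low byte of both products is FE, which is why addition-style "one instruction fits both" reasoning still holds for the truncated product but not for the full-width one.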
