Adding two IEEE754 floating-point representations and interpreting the result.

This isn't for any class or homework. As part of my personal study, I'm trying to better understand the IEEE754 representation of decimal floating-point numbers in binary. I'd like to add two numbers: $1.111$ and $2.222$, then compare the result by converting the IEEE754 representation of the sum back to decimal.

Per this online tool:

$1.111 = 00111111100011100011010100111111$

$2.222 = 01000000000011100011010100111111$

Summing these two together using signed binary addition, I get:

$0111 1111 1001 1100 0110 1010 0111 1110$

In hexadecimal, this is:

$7F9C6A7E$

And according to this other version of the tool, that corresponds to $NaN$.

What's going on here?

asked Nov 25 at 0:53

AleksandrH


You can't expect doing integer addition on floating-point representations to give meaningful results.
– Henning Makholm
Nov 25 at 1:01

How would I go about trying to do what I want to do here?
– AleksandrH
Nov 25 at 1:06

I have no idea what it is you want to do. Use floating-point addition rather than integer?
– Henning Makholm
Nov 25 at 1:07

Yes, I was under the impression that once I have the two floating-point numbers represented as binary strings, I could simply add them together bit by bit and then translate the resulting 32-bit string to decimal floating point. The IEEE754 standard defines conversions in both directions (binary to decimal and decimal to binary).
– AleksandrH
Nov 25 at 1:12

You have to adjust them so they have the same exponent before you add them. You ought to read about how the IEEE754 representation is actually constructed.
– saulspatz
Nov 25 at 1:12



1 Answer
You cannot expect to use integer binary addition on two floating-point representations and get a meaningful result.
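To see concretely why the integer sum lands on NaN, here is a minimal Python sketch (the helper name `bits_to_float` is my own invention, not part of any tool mentioned in the question). It adds the two 32-bit patterns as plain integers and then looks at the exponent field of the result. The biased exponent fields 01111111 and 10000000 add up to 11111111, which IEEE 754 reserves for infinities and NaNs, and the fraction field is nonzero, so the pattern is a NaN.

```python
import math
import struct


def bits_to_float(bits: int) -> float:
    """Reinterpret a 32-bit pattern as an IEEE 754 single-precision value."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]


a_bits = 0b00111111100011100011010100111111  # pattern quoted for 1.111
b_bits = 0b01000000000011100011010100111111  # pattern quoted for 2.222

# Plain 32-bit integer addition of the two bit patterns.
int_sum = (a_bits + b_bits) & 0xFFFFFFFF

exponent_field = (int_sum >> 23) & 0xFF   # 8 exponent bits
fraction_field = int_sum & 0x7FFFFF       # 23 fraction bits

print(f"{int_sum:08X}")                    # 7F9C6A7E
print(exponent_field, fraction_field != 0)
print(math.isnan(bits_to_float(int_sum)))
```

In other words, the integer adder has no idea where the field boundaries are, so it adds exponents to exponents and fractions to fractions as if they were one number.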

First, $1.111$ cannot be represented exactly in binary floating point. Your 00111111100011100011010100111111 is actually the IEEE-754 single precision representation of the number
$$ 1.11099994182586669921875 $$
which is the closest representable number to $1.111$. This breaks up as

  0      01111111        00011100011010100111111

sign  biased exponent  fractional part of mantissa

and stands for the number
$$ 1.00011100011010100111111_2 \times 2^{127-127} $$
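The field split above can be reproduced with a few shifts and masks. This is only a sketch: `decode_single` is a hypothetical helper, and it assumes a normal number (no zeros, subnormals, infinities or NaNs). It uses exact rational arithmetic so that no extra rounding sneaks in.

```python
from fractions import Fraction


def decode_single(bits: int):
    """Split a 32-bit pattern into sign, biased exponent, fraction and exact value.

    Assumes a normal single-precision number (biased exponent 1..254).
    """
    sign = bits >> 31
    biased_exp = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    # Normal numbers carry an implicit leading 1 in front of the fraction bits.
    significand = Fraction((1 << 23) | fraction, 1 << 23)
    value = (-1) ** sign * significand * Fraction(2) ** (biased_exp - 127)
    return sign, biased_exp, fraction, value


sign, biased_exp, fraction, value = decode_single(0x3F8E353F)
print(sign, f"{biased_exp:08b}", f"{fraction:023b}")
print(value, float(value))
```

Running the same function on the pattern for $2.222$ gives the same fraction bits with a biased exponent of 128, so its exact value is twice the one above.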

The representation of $2.222$ is twice that, with the same mantissa but the exponent one higher. When we add them we must position the mantissas correctly with respect to each other:

   1.00011100011010100111111

+ 10.0011100011010100111111

----------------------------

= 11.01010101001111110111101

  11.0101010100111111011110   <-- rounded to 1+23 bits mantissa using round-to-even



 0    10000000   10101010100111111011110

sign biased exp    fractional mantissa

And the representation 01000000010101010100111111011110 corresponds to the number
$$ 3.332999706268310546875 $$
Note that this is not the closest representable number to $3.333$, which would be the next one,
$$ 3.3329999446868896484375 $$
but the round-to-even rule led to rounding down the full result of the addition, which compounded the error inherent in the two inputs each being slightly smaller than $1.111$ and $2.222$.
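The whole procedure (align the binary points, add, round to 24 significant bits with ties going to the even neighbour) can be sketched in Python with integer arithmetic. `add_singles` is a hypothetical name, and the sketch assumes two positive normal inputs whose sum does not overflow. It is not a full IEEE 754 adder: signs, subnormals, infinities and NaNs are ignored.

```python
def add_singles(a_bits: int, b_bits: int) -> int:
    """Add two positive, normal single-precision bit patterns.

    Returns the bit pattern of the sum, rounded to nearest with ties to even.
    """
    ea, eb = (a_bits >> 23) & 0xFF, (b_bits >> 23) & 0xFF
    # Restore the implicit leading 1 to get 24-bit integer significands.
    ma = (1 << 23) | (a_bits & 0x7FFFFF)
    mb = (1 << 23) | (b_bits & 0x7FFFFF)

    # Align the binary points: express both operands in units of the
    # smaller exponent. Python integers make the left shift exact.
    e = min(ea, eb)
    total = (ma << (ea - e)) + (mb << (eb - e))

    # The sum of two normal significands needs at least 25 bits,
    # so at least one low bit has to be rounded away.
    shift = total.bit_length() - 24
    half = 1 << (shift - 1)
    remainder = total & ((1 << shift) - 1)
    total >>= shift
    if remainder > half or (remainder == half and total & 1):
        total += 1                  # round up (ties go to the even neighbour)
        if total == 1 << 24:        # rounding carried into a 25th bit
            total >>= 1
            shift += 1

    return ((e + shift) << 23) | (total & 0x7FFFFF)


result = add_singles(0x3F8E353F, 0x400E353F)
print(f"{result:032b}")  # 01000000010101010100111111011110
```

For the two inputs here the discarded bit is exactly a tie, and the kept significand is already even, so the result is rounded down, matching the hand calculation above.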

edited Nov 25 at 1:39

answered Nov 25 at 1:23

Henning Makholm


I followed this well until we got to the $10.00...$ part. Why did the decimal point move one place to the right?
– AleksandrH
Nov 25 at 1:43

@AleksandrH: Because the second addend has a biased exponent of 10000000, so it represents the number $1.\langle\mathit{mantissa}\rangle_2 \times 2^{128-127}$; in other words, the binary point is shifted one position to the right.
– Henning Makholm
Nov 25 at 1:46

Yeah, I don't understand. Sorry for wasting your time.
– AleksandrH
Nov 25 at 14:06

@AleksandrH: The job of the exponent is to encode where the binary point is. That's what makes the representation "floating point" -- you can move the point! In the $2.222$ representation the exponent is $1$ (after we subtract the fixed bias), meaning that the point is after one of the explicitly represented mantissa bits.
– Henning Makholm
Nov 25 at 14:15
