Non-deterministic sum of floats





.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty,.everyoneloves__bot-mid-leaderboard:empty{ margin-bottom:0;
}







9















Let me state the obvious fist: I completely understand that floating point types cannot accurately represent decimal values. This is not about that! Nevertheless, floating point calculations are supposed to be deterministic.



Now that this is out of the way, let me show you the curious case I observed today. I have a list of floating-point values, and I want to sum them up:



CREATE TABLE #someFloats (val float);
INSERT INTO #someFloats (val) VALUES (1), (1), (1.2), (1.2), (1.2), (3), (5);

SELECT STR(SUM(#someFloats.val), 30, 15) FROM #someFloats;

DROP TABLE #someFloats;

-- yields:
-- 13.600000000000001


So far, so good - no surprises here. We all know that 1.2 can't be represented exactly in binary representation, so the "imprecise" result is expected.



Now the following strange thing happens when I left-join another table:



CREATE TABLE #A (a int);
INSERT INTO #A (a) VALUES (1), (2);

CREATE TABLE #someFloats (val float);
INSERT INTO #someFloats (val) VALUES (1), (1), (1.2), (1.2), (1.2), (3), (5);

SELECT #A.a, STR(SUM(#someFloats.val), 30, 15)
FROM #someFloats LEFT JOIN #A ON 1 = 1
GROUP BY #A.a;

DROP TABLE #someFloats;
DROP TABLE #A;

-- yields
-- 1 13.600000000000001
-- 2 13.599999999999998


(sql fiddle, you can also see the execution plan there)



I have the same sum over the same values, but a different floating-point error. If I add more rows to table #A, we can see that the value alternates between those two values. I was only able to reproduce this issue with a LEFT JOIN; INNER JOIN works as expected here.



This is inconvenient, because it means that a DISTINCT, GROUP BY or PIVOT sees them as different values (which is actually how we discovered this issue).



The obvious solution is to round the value, but I'm curious: Is there a logical explanation for this behavior?










share|improve this question





























    9















    Let me state the obvious fist: I completely understand that floating point types cannot accurately represent decimal values. This is not about that! Nevertheless, floating point calculations are supposed to be deterministic.



    Now that this is out of the way, let me show you the curious case I observed today. I have a list of floating-point values, and I want to sum them up:



    CREATE TABLE #someFloats (val float);
    INSERT INTO #someFloats (val) VALUES (1), (1), (1.2), (1.2), (1.2), (3), (5);

    SELECT STR(SUM(#someFloats.val), 30, 15) FROM #someFloats;

    DROP TABLE #someFloats;

    -- yields:
    -- 13.600000000000001


    So far, so good - no surprises here. We all know that 1.2 can't be represented exactly in binary representation, so the "imprecise" result is expected.



    Now the following strange thing happens when I left-join another table:



    CREATE TABLE #A (a int);
    INSERT INTO #A (a) VALUES (1), (2);

    CREATE TABLE #someFloats (val float);
    INSERT INTO #someFloats (val) VALUES (1), (1), (1.2), (1.2), (1.2), (3), (5);

    SELECT #A.a, STR(SUM(#someFloats.val), 30, 15)
    FROM #someFloats LEFT JOIN #A ON 1 = 1
    GROUP BY #A.a;

    DROP TABLE #someFloats;
    DROP TABLE #A;

    -- yields
    -- 1 13.600000000000001
    -- 2 13.599999999999998


    (sql fiddle, you can also see the execution plan there)



    I have the same sum over the same values, but a different floating-point error. If I add more rows to table #A, we can see that the value alternates between those two values. I was only able to reproduce this issue with a LEFT JOIN; INNER JOIN works as expected here.



    This is inconvenient, because it means that a DISTINCT, GROUP BY or PIVOT sees them as different values (which is actually how we discovered this issue).



    The obvious solution is to round the value, but I'm curious: Is there a logical explanation for this behavior?










    share|improve this question

























      9












      9








      9


      1






      Let me state the obvious fist: I completely understand that floating point types cannot accurately represent decimal values. This is not about that! Nevertheless, floating point calculations are supposed to be deterministic.



      Now that this is out of the way, let me show you the curious case I observed today. I have a list of floating-point values, and I want to sum them up:



      CREATE TABLE #someFloats (val float);
      INSERT INTO #someFloats (val) VALUES (1), (1), (1.2), (1.2), (1.2), (3), (5);

      SELECT STR(SUM(#someFloats.val), 30, 15) FROM #someFloats;

      DROP TABLE #someFloats;

      -- yields:
      -- 13.600000000000001


      So far, so good - no surprises here. We all know that 1.2 can't be represented exactly in binary representation, so the "imprecise" result is expected.



      Now the following strange thing happens when I left-join another table:



      CREATE TABLE #A (a int);
      INSERT INTO #A (a) VALUES (1), (2);

      CREATE TABLE #someFloats (val float);
      INSERT INTO #someFloats (val) VALUES (1), (1), (1.2), (1.2), (1.2), (3), (5);

      SELECT #A.a, STR(SUM(#someFloats.val), 30, 15)
      FROM #someFloats LEFT JOIN #A ON 1 = 1
      GROUP BY #A.a;

      DROP TABLE #someFloats;
      DROP TABLE #A;

      -- yields
      -- 1 13.600000000000001
      -- 2 13.599999999999998


      (sql fiddle, you can also see the execution plan there)



      I have the same sum over the same values, but a different floating-point error. If I add more rows to table #A, we can see that the value alternates between those two values. I was only able to reproduce this issue with a LEFT JOIN; INNER JOIN works as expected here.



      This is inconvenient, because it means that a DISTINCT, GROUP BY or PIVOT sees them as different values (which is actually how we discovered this issue).



      The obvious solution is to round the value, but I'm curious: Is there a logical explanation for this behavior?










      share|improve this question














      Let me state the obvious fist: I completely understand that floating point types cannot accurately represent decimal values. This is not about that! Nevertheless, floating point calculations are supposed to be deterministic.



      Now that this is out of the way, let me show you the curious case I observed today. I have a list of floating-point values, and I want to sum them up:



      CREATE TABLE #someFloats (val float);
      INSERT INTO #someFloats (val) VALUES (1), (1), (1.2), (1.2), (1.2), (3), (5);

      SELECT STR(SUM(#someFloats.val), 30, 15) FROM #someFloats;

      DROP TABLE #someFloats;

      -- yields:
      -- 13.600000000000001


      So far, so good - no surprises here. We all know that 1.2 can't be represented exactly in binary representation, so the "imprecise" result is expected.



      Now the following strange thing happens when I left-join another table:



      CREATE TABLE #A (a int);
      INSERT INTO #A (a) VALUES (1), (2);

      CREATE TABLE #someFloats (val float);
      INSERT INTO #someFloats (val) VALUES (1), (1), (1.2), (1.2), (1.2), (3), (5);

      SELECT #A.a, STR(SUM(#someFloats.val), 30, 15)
      FROM #someFloats LEFT JOIN #A ON 1 = 1
      GROUP BY #A.a;

      DROP TABLE #someFloats;
      DROP TABLE #A;

      -- yields
      -- 1 13.600000000000001
      -- 2 13.599999999999998


      (sql fiddle, you can also see the execution plan there)



      I have the same sum over the same values, but a different floating-point error. If I add more rows to table #A, we can see that the value alternates between those two values. I was only able to reproduce this issue with a LEFT JOIN; INNER JOIN works as expected here.



      This is inconvenient, because it means that a DISTINCT, GROUP BY or PIVOT sees them as different values (which is actually how we discovered this issue).



      The obvious solution is to round the value, but I'm curious: Is there a logical explanation for this behavior?







      sql-server floating-point






      share|improve this question













      share|improve this question











      share|improve this question




      share|improve this question










      asked Mar 29 at 14:27









      HeinziHeinzi

      1,3721532




      1,3721532






















          1 Answer
          1






          active

          oldest

          votes


















          12














          Actually, the link you're referring to does not say that floating point arithmetic calculations are always deterministic. In fact, in one of the answers it's mentioned that addition is not associative (meaning (a + b) + c does not necessarily equal a + (b + c)), which is also said in this answer.



          If stream aggregation happens to process rows of each group in different order, this could explain the behaviour you observe.






          share|improve this answer
























          • Associativity has no relation to determinism, so that bit is misleading.

            – Mooing Duck
            Mar 30 at 0:42











          • Non-associativity of floating point addition leads to non-deterministic behaviour of the SQL Server aggregate function SUM(), would you agree @MooingDuck?

            – mustaccio
            Mar 30 at 1:39











          • No? Integer Division is a clear counterexample. It is non-associative, but entirely deterministic. Likewise, floating point division should be non-associative and still deterministic. From that, we conclude it's reasonable for addition to be non-associative and still deterministic. That being said, if the order of additions isn't deterministic, then the result will likewise not be deterministic, so your first and last sentence are still correct regardless.

            – Mooing Duck
            Mar 30 at 4:36











          • Integer division is a counterexample for the SQL Server SUM() over floating point arguments, how exactly?

            – mustaccio
            Mar 30 at 12:21








          • 1





            Integer division is non-associative and deterministic. Therefore, arithmetic operations associativity is not related to determinism. Therefore any non-associativity of SUM() must be irrelevant toward it's determinism. I agree that SUM appears to be non deterministic, but you should remove mentions of associativity, since that's unrelated.

            – Mooing Duck
            Mar 30 at 22:20












          Your Answer








          StackExchange.ready(function() {
          var channelOptions = {
          tags: "".split(" "),
          id: "182"
          };
          initTagRenderer("".split(" "), "".split(" "), channelOptions);

          StackExchange.using("externalEditor", function() {
          // Have to fire editor after snippets, if snippets enabled
          if (StackExchange.settings.snippets.snippetsEnabled) {
          StackExchange.using("snippets", function() {
          createEditor();
          });
          }
          else {
          createEditor();
          }
          });

          function createEditor() {
          StackExchange.prepareEditor({
          heartbeatType: 'answer',
          autoActivateHeartbeat: false,
          convertImagesToLinks: false,
          noModals: true,
          showLowRepImageUploadWarning: true,
          reputationToPostImages: null,
          bindNavPrevention: true,
          postfix: "",
          imageUploader: {
          brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
          contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
          allowUrls: true
          },
          onDemand: true,
          discardSelector: ".discard-answer"
          ,immediatelyShowMarkdownHelp:true
          });


          }
          });














          draft saved

          draft discarded


















          StackExchange.ready(
          function () {
          StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdba.stackexchange.com%2fquestions%2f233513%2fnon-deterministic-sum-of-floats%23new-answer', 'question_page');
          }
          );

          Post as a guest















          Required, but never shown

























          1 Answer
          1






          active

          oldest

          votes








          1 Answer
          1






          active

          oldest

          votes









          active

          oldest

          votes






          active

          oldest

          votes









          12














          Actually, the link you're referring to does not say that floating point arithmetic calculations are always deterministic. In fact, in one of the answers it's mentioned that addition is not associative (meaning (a + b) + c does not necessarily equal a + (b + c)), which is also said in this answer.



          If stream aggregation happens to process rows of each group in different order, this could explain the behaviour you observe.






          share|improve this answer
























          • Associativity has no relation to determinism, so that bit is misleading.

            – Mooing Duck
            Mar 30 at 0:42











          • Non-associativity of floating point addition leads to non-deterministic behaviour of the SQL Server aggregate function SUM(), would you agree @MooingDuck?

            – mustaccio
            Mar 30 at 1:39











          • No? Integer Division is a clear counterexample. It is non-associative, but entirely deterministic. Likewise, floating point division should be non-associative and still deterministic. From that, we conclude it's reasonable for addition to be non-associative and still deterministic. That being said, if the order of additions isn't deterministic, then the result will likewise not be deterministic, so your first and last sentence are still correct regardless.

            – Mooing Duck
            Mar 30 at 4:36











          • Integer division is a counterexample for the SQL Server SUM() over floating point arguments, how exactly?

            – mustaccio
            Mar 30 at 12:21








          • 1





            Integer division is non-associative and deterministic. Therefore, arithmetic operations associativity is not related to determinism. Therefore any non-associativity of SUM() must be irrelevant toward it's determinism. I agree that SUM appears to be non deterministic, but you should remove mentions of associativity, since that's unrelated.

            – Mooing Duck
            Mar 30 at 22:20
















          12














          Actually, the link you're referring to does not say that floating point arithmetic calculations are always deterministic. In fact, in one of the answers it's mentioned that addition is not associative (meaning (a + b) + c does not necessarily equal a + (b + c)), which is also said in this answer.



          If stream aggregation happens to process rows of each group in different order, this could explain the behaviour you observe.






          share|improve this answer
























          • Associativity has no relation to determinism, so that bit is misleading.

            – Mooing Duck
            Mar 30 at 0:42











          • Non-associativity of floating point addition leads to non-deterministic behaviour of the SQL Server aggregate function SUM(), would you agree @MooingDuck?

            – mustaccio
            Mar 30 at 1:39











          • No? Integer Division is a clear counterexample. It is non-associative, but entirely deterministic. Likewise, floating point division should be non-associative and still deterministic. From that, we conclude it's reasonable for addition to be non-associative and still deterministic. That being said, if the order of additions isn't deterministic, then the result will likewise not be deterministic, so your first and last sentence are still correct regardless.

            – Mooing Duck
            Mar 30 at 4:36











          • Integer division is a counterexample for the SQL Server SUM() over floating point arguments, how exactly?

            – mustaccio
            Mar 30 at 12:21








          • 1





            Integer division is non-associative and deterministic. Therefore, arithmetic operations associativity is not related to determinism. Therefore any non-associativity of SUM() must be irrelevant toward it's determinism. I agree that SUM appears to be non deterministic, but you should remove mentions of associativity, since that's unrelated.

            – Mooing Duck
            Mar 30 at 22:20














          12












          12








          12







          Actually, the link you're referring to does not say that floating point arithmetic calculations are always deterministic. In fact, in one of the answers it's mentioned that addition is not associative (meaning (a + b) + c does not necessarily equal a + (b + c)), which is also said in this answer.



          If stream aggregation happens to process rows of each group in different order, this could explain the behaviour you observe.






          share|improve this answer













          Actually, the link you're referring to does not say that floating point arithmetic calculations are always deterministic. In fact, in one of the answers it's mentioned that addition is not associative (meaning (a + b) + c does not necessarily equal a + (b + c)), which is also said in this answer.



          If stream aggregation happens to process rows of each group in different order, this could explain the behaviour you observe.







          share|improve this answer












          share|improve this answer



          share|improve this answer










          answered Mar 29 at 15:23









          mustacciomustaccio

          10.1k72239




          10.1k72239













          • Associativity has no relation to determinism, so that bit is misleading.

            – Mooing Duck
            Mar 30 at 0:42











          • Non-associativity of floating point addition leads to non-deterministic behaviour of the SQL Server aggregate function SUM(), would you agree @MooingDuck?

            – mustaccio
            Mar 30 at 1:39











          • No? Integer Division is a clear counterexample. It is non-associative, but entirely deterministic. Likewise, floating point division should be non-associative and still deterministic. From that, we conclude it's reasonable for addition to be non-associative and still deterministic. That being said, if the order of additions isn't deterministic, then the result will likewise not be deterministic, so your first and last sentence are still correct regardless.

            – Mooing Duck
            Mar 30 at 4:36











          • Integer division is a counterexample for the SQL Server SUM() over floating point arguments, how exactly?

            – mustaccio
            Mar 30 at 12:21








          • 1





            Integer division is non-associative and deterministic. Therefore, arithmetic operations associativity is not related to determinism. Therefore any non-associativity of SUM() must be irrelevant toward it's determinism. I agree that SUM appears to be non deterministic, but you should remove mentions of associativity, since that's unrelated.

            – Mooing Duck
            Mar 30 at 22:20



















          • Associativity has no relation to determinism, so that bit is misleading.

            – Mooing Duck
            Mar 30 at 0:42











          • Non-associativity of floating point addition leads to non-deterministic behaviour of the SQL Server aggregate function SUM(), would you agree @MooingDuck?

            – mustaccio
            Mar 30 at 1:39











          • No? Integer Division is a clear counterexample. It is non-associative, but entirely deterministic. Likewise, floating point division should be non-associative and still deterministic. From that, we conclude it's reasonable for addition to be non-associative and still deterministic. That being said, if the order of additions isn't deterministic, then the result will likewise not be deterministic, so your first and last sentence are still correct regardless.

            – Mooing Duck
            Mar 30 at 4:36











          • Integer division is a counterexample for the SQL Server SUM() over floating point arguments, how exactly?

            – mustaccio
            Mar 30 at 12:21








          • 1





            Integer division is non-associative and deterministic. Therefore, arithmetic operations associativity is not related to determinism. Therefore any non-associativity of SUM() must be irrelevant toward it's determinism. I agree that SUM appears to be non deterministic, but you should remove mentions of associativity, since that's unrelated.

            – Mooing Duck
            Mar 30 at 22:20

















          Associativity has no relation to determinism, so that bit is misleading.

          – Mooing Duck
          Mar 30 at 0:42





          Associativity has no relation to determinism, so that bit is misleading.

          – Mooing Duck
          Mar 30 at 0:42













          Non-associativity of floating point addition leads to non-deterministic behaviour of the SQL Server aggregate function SUM(), would you agree @MooingDuck?

          – mustaccio
          Mar 30 at 1:39





          Non-associativity of floating point addition leads to non-deterministic behaviour of the SQL Server aggregate function SUM(), would you agree @MooingDuck?

          – mustaccio
          Mar 30 at 1:39













          No? Integer Division is a clear counterexample. It is non-associative, but entirely deterministic. Likewise, floating point division should be non-associative and still deterministic. From that, we conclude it's reasonable for addition to be non-associative and still deterministic. That being said, if the order of additions isn't deterministic, then the result will likewise not be deterministic, so your first and last sentence are still correct regardless.

          – Mooing Duck
          Mar 30 at 4:36





          No? Integer Division is a clear counterexample. It is non-associative, but entirely deterministic. Likewise, floating point division should be non-associative and still deterministic. From that, we conclude it's reasonable for addition to be non-associative and still deterministic. That being said, if the order of additions isn't deterministic, then the result will likewise not be deterministic, so your first and last sentence are still correct regardless.

          – Mooing Duck
          Mar 30 at 4:36













          Integer division is a counterexample for the SQL Server SUM() over floating point arguments, how exactly?

          – mustaccio
          Mar 30 at 12:21







          Integer division is a counterexample for the SQL Server SUM() over floating point arguments, how exactly?

          – mustaccio
          Mar 30 at 12:21






          1




          1





          Integer division is non-associative and deterministic. Therefore, arithmetic operations associativity is not related to determinism. Therefore any non-associativity of SUM() must be irrelevant toward it's determinism. I agree that SUM appears to be non deterministic, but you should remove mentions of associativity, since that's unrelated.

          – Mooing Duck
          Mar 30 at 22:20





          Integer division is non-associative and deterministic. Therefore, arithmetic operations associativity is not related to determinism. Therefore any non-associativity of SUM() must be irrelevant toward it's determinism. I agree that SUM appears to be non deterministic, but you should remove mentions of associativity, since that's unrelated.

          – Mooing Duck
          Mar 30 at 22:20


















          draft saved

          draft discarded




















































          Thanks for contributing an answer to Database Administrators Stack Exchange!


          • Please be sure to answer the question. Provide details and share your research!

          But avoid



          • Asking for help, clarification, or responding to other answers.

          • Making statements based on opinion; back them up with references or personal experience.


          To learn more, see our tips on writing great answers.




          draft saved


          draft discarded














          StackExchange.ready(
          function () {
          StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdba.stackexchange.com%2fquestions%2f233513%2fnon-deterministic-sum-of-floats%23new-answer', 'question_page');
          }
          );

          Post as a guest















          Required, but never shown





















































          Required, but never shown














          Required, but never shown












          Required, but never shown







          Required, but never shown

































          Required, but never shown














          Required, but never shown












          Required, but never shown







          Required, but never shown







          Popular posts from this blog

          Plaza Victoria

          Brian Clough

          Cáceres