Calculating cumulative error in distribution of elements
I am dealing with a real-life situation where I have a distribution of 12 unique elements that can be permuted in any way possible (so 12! = 479,001,600 possible permutations). Each index position has a fixed measure, and each element has a dimension that fits that measure but will almost never fit 100%. In other words, there will always be some error that must be held within a threshold, and even when it is within the threshold it is best to minimize it.
Think of having 12 children of similar size and 12 pairs of differently sized shoes. The ideal is to assign each child a pair that fits, but if that's not possible, it's okay if a couple of children are off by half a size each. You also want to avoid an outlier scenario in which 11 children fit almost perfectly but one kid is off by two sizes (hypothetically, because two sizes would be more than the allowed threshold). Because of that drastic deviation, it is better to have each kid off by half a size than one kid off by two sizes, even when the outlier assignment has the smaller cumulative error (all added together).
I need to come up with a measurement that will reveal the optimal distribution (permutation) of elements with regard to this error. Ideally, if each error were 0, the cumulative measurement for the set would be 0 (perfect). But inevitably there will be some errors. As explained in the paragraph above, just summing the individual errors would not suffice, because a permutation in which 11 elements had 0 error but 1 had an error just under the threshold would throw the balance off disproportionately.
I was thinking of using standard deviation to represent how optimal each permutation is with regard to error. Is this a good way to approach it, or should I use something else?
statistics standard-deviation
asked Aug 22 '15 at 17:37
amphibient
1 Answer
Problems like this all come down to a choice of error function. The conventional choice is least squares: in each slot, measure the distance between actual and predicted (here, the element's dimension and the slot's measure), square it, then sum over the slots. This tends to give plausible results in a variety of situations, and it has the huge advantage of being computationally fairly tractable. Try it! It is a theorem of Gauss that the least squares fit minimizes error variance (in a fairly broad class of situations), so it may be what you are looking for.
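As a rough illustration of the least squares score described above, here is a minimal Python sketch. The function names (`squared_error`, `best_permutation`) and all the numbers are made up for illustration, and the exhaustive search is only practical for a handful of elements.

```python
from itertools import permutations

def squared_error(slots, elements, perm):
    """Sum of squared misfits when slot i receives elements[perm[i]]."""
    return sum((slots[i] - elements[j]) ** 2 for i, j in enumerate(perm))

def best_permutation(slots, elements):
    """Try every permutation and keep the one with the smallest score."""
    return min(permutations(range(len(elements))),
               key=lambda perm: squared_error(slots, elements, perm))

# Two error profiles with the same plain sum (2.0) but different shapes.
spread = [0.5, 0.5, 0.5, 0.5]    # four misfits of half a size
outlier = [0.0, 0.0, 0.0, 2.0]   # one misfit of two sizes
sum_sq_spread = sum(e ** 2 for e in spread)    # 1.0
sum_sq_outlier = sum(e ** 2 for e in outlier)  # 4.0, so the outlier scores worse

# Hypothetical slot measures and element dimensions (4 instead of 12).
slots = [36.0, 38.0, 40.0, 42.0]
elements = [41.5, 36.5, 38.0, 40.5]
best = best_permutation(slots, elements)
print(best, squared_error(slots, elements, best))
```

For 12 elements, looping over all 12! permutations in pure Python is slow. Because the score is a sum of per-slot costs, this is an instance of the assignment problem, which existing solvers such as SciPy's `scipy.optimize.linear_sum_assignment` handle directly.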
If you find you don't care for the results, you can try other measures of error, perhaps a function which grows faster than $(\text{distance})^2$. That will penalize outliers even more heavily, but the calculation will be harder.
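To make the idea of a faster-growing error function concrete, the sketch below compares a plain sum, a sum of squares, a sum of fourth powers, and the worst single error on two made-up error profiles. The names `power_error` and `max_error`, the exponent 4, and the outlier value of 1.7 (standing in for "just under a threshold of two sizes") are all assumptions for illustration.

```python
def power_error(errors, p=2):
    """Sum of |error|**p; a larger p punishes big misfits more heavily."""
    return sum(abs(e) ** p for e in errors)

def max_error(errors):
    """Worst single misfit, the limiting case of ever larger exponents."""
    return max(abs(e) for e in errors)

# Hypothetical profiles for 12 slots.
spread = [0.5] * 12            # every element off by half a size
outlier = [0.0] * 11 + [1.7]   # eleven perfect fits, one large misfit

plain_sum = (power_error(spread, 1), power_error(outlier, 1))
squares = (power_error(spread, 2), power_error(outlier, 2))
fourth = (power_error(spread, 4), power_error(outlier, 4))
worst = (max_error(spread), max_error(outlier))

for name, (s, o) in [("sum", plain_sum), ("squares", squares),
                     ("fourth powers", fourth), ("max", worst)]:
    print(f"{name:14s} spread={s:.4f} outlier={o:.4f}")
```

With these particular numbers the plain sum and even the sum of squares still rate the outlier profile as better, while the fourth power and the maximum rate the spread profile as better. How steep the penalty needs to be depends on how strongly a single large misfit should count.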
So is standard deviation not good?
– amphibient
Aug 22 '15 at 18:01
Minimizing variance is the same as minimizing standard deviation; people tend to speak of variance in these contexts because it is additive. But look at the results when you try least squares. It works in a lot of situations, but not always. In financial models, for example, you almost always need to add an additional error term, or several.
– lulu
Aug 22 '15 at 22:38
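The equivalence mentioned in this comment follows from the square root being an increasing function, so ranking candidates by a sum of squares or by its root gives the same order. The sketch below checks that on three made-up error profiles. One assumption worth flagging: the deviation here is taken about zero error (a root mean square), because a standard deviation taken about the mean error would score a uniform misfit as 0.

```python
import math
import statistics

def sum_sq(errors):
    """Sum of squared errors, measured from zero error."""
    return sum(e * e for e in errors)

def rms(errors):
    """Root mean square error: the square root of the mean squared error."""
    return math.sqrt(sum_sq(errors) / len(errors))

# Hypothetical error profiles for three candidate permutations.
candidates = {
    "spread": [0.5] * 12,
    "outlier": [0.0] * 11 + [2.0],
    "even": [0.25] * 12,
}

rank_by_sum_sq = sorted(candidates, key=lambda k: sum_sq(candidates[k]))
rank_by_rms = sorted(candidates, key=lambda k: rms(candidates[k]))
print(rank_by_sum_sq, rank_by_rms)  # same ordering either way

# Caveat: the population standard deviation is measured about the mean,
# so twelve identical half-size misfits have zero spread.
uniform_sd = statistics.pstdev([0.5] * 12)
print(uniform_sd)
```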
answered Aug 22 '15 at 17:46
lulu