Convergence of Linear Neural Networks in the Easiest Framework
EDIT: Thanks to the first answer, I'm relaxing some assumptions.
I'm trying to understand the basics of machine learning, and I have this theoretical question:
I have a one-layer linear neural network $f: \mathbb{R}^d \rightarrow \mathbb{R}^2$ and two classes to learn, so we are talking about binary classification.
Assume that, during training, I only show my network data points sampled from a simple distribution $D$ supported inside a circle of radius $r$, and nothing else. Furthermore, assume that all of these points have the same label, $0$. After training with SGD, I want my network to perform well only inside this circle; I don't care how it behaves on data sampled outside it (I assume it would be close to random classification, since the network only sees data from inside the circle).
How many iterations will SGD need in order to converge to a good local minimum?
How can I prove it?
My intuition is that the algorithm should converge rather fast, since a function that achieves 100% accuracy is very simple: its decision boundary is the circle itself, so we just need the network to approximate any closed curve around the data and output $0$ for every point inside it. But I may be wrong.
Thank you very much!
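For concreteness, here is a minimal numpy sketch of the setup I have in mind (the sampling scheme, initialization, and learning rate are illustrative choices, not part of the question): a one-layer linear model trained with plain per-sample SGD on points inside a circle of radius $r$, all labeled $0$, checking how quickly training accuracy reaches 100%.

```python
import numpy as np

rng = np.random.default_rng(0)

# Sample n points uniformly inside a circle of radius r
# (an illustrative choice; the question only fixes the support).
r, n, d = 1.0, 200, 2
theta = rng.uniform(0, 2 * np.pi, n)
rad = r * np.sqrt(rng.uniform(0, 1, n))
X = np.stack([rad * np.cos(theta), rad * np.sin(theta)], axis=1)
y = np.zeros(n, dtype=int)  # every training point has label 0

# One-layer linear network f(x) = Wx + b with 2 outputs,
# trained with softmax cross-entropy.
W = 0.01 * rng.standard_normal((2, d))
b = np.zeros(2)
lr = 0.1

def accuracy():
    return np.mean(np.argmax(X @ W.T + b, axis=1) == y)

steps = 0
for epoch in range(5):
    for i in rng.permutation(n):        # plain SGD, one sample at a time
        logits = W @ X[i] + b
        p = np.exp(logits - logits.max())
        p /= p.sum()                    # softmax probabilities
        grad = p.copy()
        grad[0] -= 1.0                  # d(cross-entropy)/d(logits), true label 0
        W -= lr * np.outer(grad, X[i])
        b -= lr * grad
        steps += 1

print(accuracy())  # 1.0 — with a single class, the bias alone separates the data
```

Since every label is $0$, each SGD step pushes the bias toward class $0$ regardless of the input, so training accuracy hits 100% after only a handful of updates; this is the "should be rather fast" intuition above.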
convergence algorithms machine-learning neural-networks
edited Nov 29 '18 at 16:25
asked Nov 29 '18 at 15:26 by Alfred
1 Answer
This question cannot be answered as stated. There are several issues you need to be more precise about:
Achieving zero training error means the data is perfectly described by your network. It isn't unreasonable to assume a data set could be made this way, since neural networks (as you've defined them) are really just cascades of function compositions. To get zero training error, simply take your neural network, completely untrained, and use it to generate your training set. You immediately get zero training error.
Presuming you didn't want to do that, the answer would depend far too strongly on the data itself, and therein lies the major issue with this question: if you knew how many neurons you needed to get zero training error, then you'd already know enough about the data not to need to learn anything about it. Zero training error means you can produce a perfect neural network describing the (training) data. If you could do that in the first place, you'd know so much about your data that learning would be pointless.
I think you need to re-frame your question to be much more specific and tractable. Also, there is in general no way to know before training how good your training error will be. This is the same as the previous point: if you knew you could get your training error down to X with your particular neural network, you'd already know enough about your data not to need to learn from it.
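The "use the untrained network to generate your training set" argument can be sketched in a few lines of numpy (the dimensions and random initialization here are arbitrary illustrative choices): a randomly initialized linear network labels its own inputs, and then trivially classifies that training set with zero error, without any training at all.

```python
import numpy as np

rng = np.random.default_rng(1)

# A one-layer linear "network" f(x) = Wx + b with 2 outputs,
# with completely untrained (random) parameters.
d = 5
W = rng.standard_normal((2, d))
b = rng.standard_normal(2)

# Generate a training set whose labels come from the network itself.
X = rng.standard_normal((100, d))
y = np.argmax(X @ W.T + b, axis=1)

# The same untrained network now has zero error on its own training set,
# by construction.
train_error = np.mean(np.argmax(X @ W.T + b, axis=1) != y)
print(train_error)  # 0.0
```

Zero training error here tells you nothing about learning: the labels were defined to agree with the network, which is exactly why the question needs to pin down where the data comes from.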
Ok thank you very much, I'll try to edit it! – Alfred, Nov 29 '18 at 16:09
answered Nov 29 '18 at 15:55 by Michael Stachowsky
Thanks for contributing an answer to Mathematics Stack Exchange!