Convergence of Linear Neural Networks in the Easiest Framework
EDIT: Thanks to the first answer, I'm lightening some assumptions:

I'm trying to understand the basics of machine learning, and I have this theoretical question:

I have a one-layer linear neural network $f: \mathbb{R}^d \rightarrow \mathbb{R}^2$ and two classes to learn, so we are talking about binary classification.

Assume that, during training, I show my network only datapoints sampled from a simple distribution $D$ supported on a circle of radius $r$, and nothing else. Furthermore, assume that these points all have the same label, $0$. After training with SGD, I want my network to perform well just on this circle; I don't care how it behaves on data sampled elsewhere (I assume it would be close to random classification, since it only ever sees data from this circle).

How many iterations will SGD need in order to converge to a good local minimum?

How can I prove it?

My intuition is that the algorithm should converge rather quickly, since a function that achieves 100% accuracy is very simple: it's the boundary of the circle itself, so we just need the network to approximate any closed curve around the circle and output $0$ for every point inside it. But I may be wrong. A minimal simulation of the setup is sketched below.

Thank you very much!
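For concreteness, here is a minimal simulation of the setup, assuming a softmax output with cross-entropy loss and one sample per SGD step; $d = 2$, $r$, the learning rate, and the step count are arbitrary illustrative choices, not part of the question:

    import numpy as np

    # Sketch: a one-layer linear network f(x) = Wx + b with two outputs,
    # trained by SGD with cross-entropy loss on points drawn from a
    # circle of radius r, all labeled 0. This fixes d = 2 so the circle
    # points fill the input space; all hyperparameters are arbitrary.

    rng = np.random.default_rng(0)
    d, r, lr, steps = 2, 1.0, 0.1, 500

    W = rng.normal(scale=0.1, size=(2, d))   # weights: 2 classes x d inputs
    b = np.zeros(2)                          # biases

    def softmax(z):
        z = z - z.max()                      # numerical stability
        e = np.exp(z)
        return e / e.sum()

    for t in range(steps):
        theta = rng.uniform(0, 2 * np.pi)    # one point on the circle
        x = r * np.array([np.cos(theta), np.sin(theta)])
        p = softmax(W @ x + b)               # predicted class probabilities
        g = p - np.array([1.0, 0.0])         # d(cross-entropy)/d(logits), label 0
        W -= lr * np.outer(g, x)             # SGD update
        b -= lr * g

    # After training, the network should put nearly all probability
    # mass on class 0 for points on the circle.
    theta = rng.uniform(0, 2 * np.pi, size=3)
    for x in r * np.stack([np.cos(theta), np.sin(theta)], axis=1):
        print(softmax(W @ x + b))

Because every sample carries label $0$, even the bias term alone can push the class-$0$ logit up, so the loss drops quickly here, which matches my intuition that convergence should be fast in this degenerate setting.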
Tags: convergence, algorithms, machine-learning, neural-networks
edited Nov 29 '18 at 16:25 by Alfred

asked Nov 29 '18 at 15:26 by Alfred (184)
1 Answer
This question cannot be answered as posed. There are several issues you need to be more precise about:

1. Achieving zero training error means that the data is perfectly described by your network. It isn't unreasonable to assume that a data set could be made this way, since neural networks (as you've defined them) are really just cascades of function compositions. To get zero training error, simply take your neural network, completely untrained, and use it to generate your training set; you will immediately get zero training error. (A sketch of this labeling trick appears below.)

2. Presuming you didn't want to do that, the answer would depend far too strongly on the data itself, and therein lies the major issue with this question: if you knew how many neurons you needed to get zero training error, then you'd already know enough about the data not to need to learn anything from it. Zero training error means that you can produce a perfect neural network describing the (training) data. If you could do that in the first place, you'd know so much about your data that learning would be pointless.

I think you need to re-frame your question to be much more specific, and answerable. Also, in general there is no way to know before training how good your training error will be. This is the same as point 2 above: if you knew you could get your training error down to $X$ with your particular neural network, you'd already know enough about your data not to need to learn from it.
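To make point 1 concrete, here is a minimal sketch of that labeling trick, assuming a one-layer linear network like the one in the question; the sizes $d$ and $n$ are arbitrary:

    import numpy as np

    # Label the data with the untrained network itself, so that the
    # same network trivially achieves zero training error.

    rng = np.random.default_rng(1)
    d, n = 5, 100                        # arbitrary illustrative sizes

    W = rng.normal(size=(2, d))          # untrained one-layer linear network
    b = rng.normal(size=2)

    X = rng.normal(size=(n, d))          # any inputs whatsoever
    y = np.argmax(X @ W.T + b, axis=1)   # labels generated BY the network

    preds = np.argmax(X @ W.T + b, axis=1)
    print("training error:", np.mean(preds != y))   # 0.0 by construction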
answered Nov 29 '18 at 15:55 by Michael Stachowsky (1,250)

Ok thank you very much, I'll try to edit it! – Alfred Nov 29 '18 at 16:09