Convexity of a Log Likelihood Function












Goal



I would like to prove that the negative log-likelihood function of a sample drawn from a normal distribution is convex.



Below is a figure showing an example of such a function:



[Figure: example negative log-likelihood surface]



The motivation for this question is detailed at the end of the post.



Sketch of the proof



Here is what I have done so far.



First, I write the likelihood function for a single observation:



$$\mathcal{L}(\mu, \sigma \mid x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$



For convenience I take the negative logarithm of the likelihood; since the logarithm is monotonic, this does not change the location of the extremum (maximizing the likelihood is equivalent to minimizing the negative log-likelihood):



$$f(\mu, \sigma \mid x) = -\ln \mathcal{L}(\mu, \sigma \mid x) = \frac{1}{2}\ln(2\pi) + \ln(\sigma) + \frac{(x-\mu)^2}{2\sigma^2}$$



Then I can compute the Jacobian (here, the gradient of $f$ written as a column vector):



$$
\mathbf{J}_f = \left[
\begin{matrix}
-\frac{x-\mu}{\sigma^2}\\
-\frac{(x-\mu)^2 - \sigma^2}{\sigma^3}
\end{matrix}
\right]
$$



And the Hessian of the negative log-likelihood function:



$$
\mathbf{H} = \left[
\begin{matrix}
\frac{1}{\sigma^2} & \frac{2(x - \mu)}{\sigma^3} \\
\frac{2(x - \mu)}{\sigma^3} & \frac{3(x - \mu)^2 - \sigma^2}{\sigma^4}
\end{matrix}
\right]
$$



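As a sanity check on these derivatives, here is a small SymPy sketch (a minimal verification I add for completeness, assuming SymPy is available) that recomputes the gradient and Hessian symbolically:

```python
import sympy as sp

x, mu = sp.symbols('x mu', real=True)
sigma = sp.symbols('sigma', positive=True)

# Negative log-likelihood of a single observation, as defined above
f = sp.log(2 * sp.pi) / 2 + sp.log(sigma) + (x - mu)**2 / (2 * sigma**2)

# Gradient (the column vector J_f above) and Hessian with respect to (mu, sigma)
grad = sp.Matrix([f.diff(mu), f.diff(sigma)])
H = sp.hessian(f, (mu, sigma))

print(sp.simplify(grad))  # [-(x-mu)/sigma**2, ...]: matches J_f up to rearrangement
print(sp.simplify(H))     # matches the 2x2 matrix H given above
```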



Now I compute the averaged Hessian of the negative log-likelihood over $N$ independent observations (the positive factor $\frac{1}{N}$ does not affect positive semi-definiteness):



$$
\mathbf{A} = \frac{1}{N}\sum\limits_{i=1}^{N}\mathbf{H}_i = \left[
\begin{matrix}
\frac{1}{\sigma^2} & \frac{2(\bar{x} - \mu)}{\sigma^3} \\
\frac{2(\bar{x} - \mu)}{\sigma^3} & \frac{\frac{3}{N}\sum\limits_{i=1}^{N}(x_i-\mu)^2 - \sigma^2}{\sigma^4}
\end{matrix}
\right]
$$

where $\mathbf{H}_i$ denotes the Hessian evaluated at observation $x_i$.
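To make $\mathbf{A}$ concrete, the following NumPy sketch (a minimal illustration with a hypothetical helper `averaged_hessian`, using the same parameters as the numerical example at the end of the post) assembles it for a sample and checks its eigenvalues:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=2.0, scale=3.0, size=1000)  # sample as in the motivating example

def averaged_hessian(x, mu, sigma):
    """Averaged Hessian A of the negative log-likelihood at (mu, sigma)."""
    xbar = x.mean()
    a11 = 1.0 / sigma**2
    a12 = 2.0 * (xbar - mu) / sigma**3
    a22 = (3.0 * np.mean((x - mu)**2) - sigma**2) / sigma**4
    return np.array([[a11, a12], [a12, a22]])

A = averaged_hessian(x, mu=2.0, sigma=3.0)
print(np.linalg.eigvalsh(A))  # both eigenvalues should be non-negative here
```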



If everything is right at this point:




  • Proving the function is convex is equivalent to proving that the Hessian is positive semi-definite everywhere on the domain;


Additionally I know that:




  • A positive semi-definite matrix has all its eigenvalues non-negative;

  • Because the Hessian matrix is symmetric, all its eigenvalues are real;


So if I prove that all eigenvalues are non-negative real numbers, then I can claim the function is convex. Equivalently, as @LinAlg suggested, since for a $2 \times 2$ symmetric matrix the determinant is the product of the eigenvalues and the trace is their sum, it suffices to check that both the determinant and the trace of the matrix $\mathbf{A}$ are non-negative:



$$
\begin{align}
\det(\mathbf{A}) \geq 0 \Leftrightarrow\ & \frac{3}{N}\sum\limits_{i=1}^{N}(x_i-\mu)^2 - \sigma^2 - 4(\bar{x} - \mu)^2 \geq 0 \\
\operatorname{tr}(\mathbf{A}) \geq 0 \Leftrightarrow\ & \frac{3}{N}\sum\limits_{i=1}^{N}(x_i-\mu)^2 \geq 0
\end{align}
$$



It is obvious that $\operatorname{tr}(\mathbf{A}) \geq 0$, since the left-hand side is a sum of squares.
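Both conditions are also easy to check numerically; a short sketch reusing the sample `x` and the helper `averaged_hessian` from the block above:

```python
import numpy as np

# Reuses x and averaged_hessian() from the earlier sketch.
A = averaged_hessian(x, mu=2.0, sigma=3.0)
print(np.linalg.det(A) >= 0)  # expected True near the true parameters
print(np.trace(A) >= 0)       # always True: a sum of squares divided by sigma**4
```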



Inequality



The inequality $\det(\mathbf{A}) \geq 0$ is not obvious at first glance; it requires a bit of algebra. Expanding the squares, summing, simplifying and grouping gives:



$$
3\left[\frac{1}{N}\sum\limits_{i=1}^{N}x_i^2 - \bar{x}^2\right] - (\bar{x}-\mu)^2 - \sigma^2 \geq 0
$$



Now I can rewrite it using the sample variance $s^2_x = \frac{1}{N-1}\sum_{i=1}^{N}(x_i - \bar{x})^2$, noting that $\frac{1}{N}\sum_{i=1}^{N}x_i^2 - \bar{x}^2 = \frac{N-1}{N}s^2_x$:



$$
3\left[\frac{N-1}{N}s^2_x - \sigma^2\right] + 2\sigma^2 \geq (\bar{x}-\mu)^2
$$



For sufficiently large $N$, this tends to:



$$
3\left(s^2_x - \sigma^2\right) + 2\sigma^2 \geq (\bar{x}-\mu)^2
$$



Or:



$$
\left|\bar{x}-\mu\right| \leq \sqrt{3s^2_x - \sigma^2}
$$



provided the radicand is non-negative. This last inequality provides a bound on the deviation of the sample mean, which must be lower than approximately $\sqrt{2}\,\sigma$ when $s^2_x \approx \sigma^2$. Finally, if the estimators converge to their expected values ($\bar{x} \to \mu$, $s^2_x \to \sigma^2$), it reduces to:



$$
\sigma \geq 0
$$



which is trivially true by definition.
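A quick numerical illustration of the bound $\left|\bar{x}-\mu\right| \leq \sqrt{3s^2_x - \sigma^2}$ (a sketch using the known true values of $\mu$ and $\sigma$ from the motivating example):

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma = 2.0, 3.0
x = rng.normal(mu, sigma, size=1000)

lhs = abs(x.mean() - mu)
rhs = np.sqrt(3 * x.var(ddof=1) - sigma**2)  # sqrt(3 s_x^2 - sigma^2)
print(lhs, rhs, lhs <= rhs)  # lhs is of order sigma/sqrt(N); rhs is roughly sqrt(2)*sigma
```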



My interpretation of this inequality is:



If we have a sufficiently large sample drawn from a normal distribution, and the mean and variance estimates are close enough to their expected values, then the negative log-likelihood function should be convex. If the expected values $\mu$ and $\sigma$ are known, it is possible to assess the convexity.



Questions




  • Are my reasoning and the interpretation of the result correct?

  • Can we formally prove that $\det(\mathbf{A}) \geq 0$?


Motivation



This question arose from a numerical example I am developing. I am sampling from a normal distribution $\mathcal{N}(\mu=2, \sigma=3)$ with $N=1000$, and I would like to illustrate all the steps of a maximum likelihood estimation. When I visualized the function to minimize, I wondered: can we say that this function is convex? That is what led me to write this post on MSE.
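For reference, a minimal sketch of this kind of numerical example (my reconstruction, assuming NumPy and SciPy; not the original code):

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
x = rng.normal(loc=2.0, scale=3.0, size=1000)

def nll(theta, x):
    """Negative log-likelihood of a normal sample at theta = (mu, sigma)."""
    mu, sigma = theta
    n = len(x)
    return 0.5 * n * np.log(2 * np.pi) + n * np.log(sigma) \
        + np.sum((x - mu)**2) / (2 * sigma**2)

# Constrain sigma > 0; L-BFGS-B handles the bound.
res = minimize(nll, x0=[0.0, 1.0], args=(x,),
               method='L-BFGS-B', bounds=[(None, None), (1e-9, None)])
print(res.x)  # estimates close to (2, 3)
```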



[Figures: plots from the numerical example]

proof-verification convex-analysis normal-distribution positive-semidefinite log-likelihood

asked Dec 18 '18 at 10:08, edited Dec 19 '18 at 8:31 – jlandercy












  • Your simplification of $\mathbf{A}$ is not correct, since you 'abuse' the bias and $\sigma$. The determinant is the product of the eigenvalues and the trace is the sum of the eigenvalues, so it suffices to check if the trace and determinant are positive. – LinAlg, Dec 18 '18 at 15:36

  • @LinAlg, thank you for the comment. Could you be more explicit: what do you mean by abusing the bias and $\sigma$? – jlandercy, Dec 18 '18 at 16:33

  • $\sigma$ and the bias do not depend on the data but are fixed numbers based on the probability distribution. – LinAlg, Dec 18 '18 at 16:47

  • @LinAlg I have updated my post to take your remarks into account. Would you mind reviewing it? I have some difficulty showing that $\det(\mathbf{A}) \geq 0$. Thank you. – jlandercy, Dec 18 '18 at 22:56

  • I have checked your steps and they are correct. For fixed $x$ you get an ellipsoid on which the function is convex. – LinAlg, Dec 18 '18 at 23:05