Derivative of inner product











Suppose the inner product of a vector $\mathbf{x}$ with itself can be expressed as

$$\langle \mathbf{x}, \mathbf{x}\rangle_G = \mathbf{x}^T G\mathbf{x}$$

where $G$ is some symmetric matrix. If I take the derivative of this inner product with respect to $\mathbf{x}$, I should get a vector, since this is the derivative of a scalar function by a vector (https://en.wikipedia.org/wiki/Matrix_calculus#Scalar-by-vector).

Nevertheless, the following formula tells me that I should get a row vector rather than a normal (column) vector:

$$\frac{\mathrm{d}}{\mathrm{d} \mathbf{x}} (\mathbf{x}^TG\mathbf{x}) = 2\mathbf{x}^T G$$

(http://www.cs.huji.ac.il/~csip/tirgul3_derivatives.pdf), which is a row vector.

Why do I get this contradiction?







      linear-algebra derivatives vectors inner-product-space






asked Nov 30 at 9:15 by The Bosco






















          4 Answers
For a smooth $f:\mathbb{R}^n\to\mathbb{R}^m$, you have $df:\mathbb{R}^n\to\mathcal{L}(\mathbb{R}^n,\mathbb{R}^m)$.

Being differentiable is equivalent to:
$$
f(x+h)=f(x)+df(x)\cdot h+o(\|h\|)
$$

In your case, $f(x)=\langle x,x \rangle_G$ and $m=1$, hence the differential at $x$, $df(x)$, is in $\mathcal{L}(\mathbb{R}^n,\mathbb{R})$. It's a linear form.

Let's be more explicit, using the symmetry of $G$ to combine the two cross terms:
\begin{align*}
f(x+h)=& \langle x+h,x+h \rangle_G \\
=& \underbrace{\langle x,x \rangle_G}_{f(x)} + \underbrace{2\langle x,h \rangle_G }_{df(x)\cdot h}+ \underbrace{\langle h,h \rangle_G}_{\in o(\|h\|)}
\end{align*}

Hence your differential is defined by
$$
df(x)\cdot h = 2\langle x,h \rangle_G = (2x^tG)h
$$

where $2x^tG=\left(\partial_{x_1} f,\dots,\partial_{x_n} f\right)$ is your "row" vector.

Note that, because $m=1$, you can also use a vector $\nabla f(x)$ to represent $df(x)$ using the canonical scalar product. This vector is by definition the gradient of $f$:

$$
df(x)\cdot h = \langle \nabla f(x),h \rangle = \langle 2Gx,h \rangle
$$

where $\nabla f(x)=2Gx=\left(\begin{array}{c}\partial_{x_1} f \\ \vdots \\ \partial_{x_n} f\end{array}\right)$. This is your "column" vector.
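The two representations can also be checked numerically. The following is a minimal pure-Python sketch, not part of the original answer: the matrix `G`, the point `x`, the direction `h`, and all helper names are arbitrary illustrative choices. It compares the analytic gradient $2Gx$ with central finite differences and evaluates $df(x)\cdot h$.

```python
# Sketch: compare the analytic gradient 2*G*x with central finite differences.
# G, x and h are arbitrary illustrative values, not taken from the question.

def quad_form(G, x):
    """Return x^T G x for a square matrix G (list of rows) and a vector x."""
    n = len(x)
    return sum(x[i] * G[i][j] * x[j] for i in range(n) for j in range(n))

def analytic_gradient(G, x):
    """Return 2*G*x, which is the gradient only when G is symmetric."""
    n = len(x)
    return [2 * sum(G[i][j] * x[j] for j in range(n)) for i in range(n)]

def numeric_gradient(G, x, eps=1e-6):
    """Approximate each partial derivative with a central difference."""
    grad = []
    for i in range(len(x)):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[i] += eps
        x_minus[i] -= eps
        grad.append((quad_form(G, x_plus) - quad_form(G, x_minus)) / (2 * eps))
    return grad

G = [[2.0, 1.0], [1.0, 3.0]]  # symmetric
x = [1.0, -2.0]
h = [0.5, 0.25]

grad = analytic_gradient(G, x)
approx = numeric_gradient(G, x)

# (2 x^T G) h and <2Gx, h> are the same sum of products, so one number
# serves both the "row" and the "column" reading.
df_h = sum(g * hi for g, hi in zip(grad, h))
```

The components stored in `grad` are identical under both conventions; only the way they are laid out on paper differs.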







answered Nov 30 at 10:09 by Picaud Vincent (edited Nov 30 at 10:43)






















The difference lies in how the author of the second reference chooses to arrange the components of the gradient. In the first paragraph they state:

Let $x\in \mathbb{R}^n$ (a column vector) and let $f : \mathbb{R}^n \to R$. The derivative of $f$ with respect to $x$ is a row vector:
$$
\frac{\partial f}{\partial x} = \left(\frac{\partial f}{\partial x_1}, \cdots , \frac{\partial f}{\partial x_n} \right)
$$

You can argue this is a better option than the first one (e.g. this answer), but at the end of the day it is just a matter of notation. Pick the one you prefer and stick with it to avoid problems down the line.
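As a small worked illustration (the matrix below is an arbitrary choice, not taken from either reference), take $G=\begin{pmatrix}2&1\\1&3\end{pmatrix}$ and $x=(x_1,x_2)^T$, so that

$$
f(x)=x^TGx=2x_1^2+2x_1x_2+3x_2^2,\qquad \frac{\partial f}{\partial x_1}=4x_1+2x_2,\qquad \frac{\partial f}{\partial x_2}=2x_1+6x_2 .
$$

The two conventions arrange these same partial derivatives differently:

$$
2x^TG=\left(4x_1+2x_2,\; 2x_1+6x_2\right),\qquad 2Gx=\begin{pmatrix}4x_1+2x_2\\ 2x_1+6x_2\end{pmatrix}.
$$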







answered Nov 30 at 9:29 by caverac






















More generally, suppose we differentiate any scalar-valued function $f$ of a vector $\mathbf{x}$ with respect to $\mathbf{x}$. By the chain rule, $$df=\sum_i\frac{\partial f}{\partial x_i}dx_i=\boldsymbol{\nabla}f\cdot d\mathbf{x}=\boldsymbol{\nabla}f^T d\mathbf{x}.$$ (Technically, I should write $df=(\boldsymbol{\nabla}f^T d\mathbf{x})_{11}$ to take the unique entry of a $1\times 1$ matrix.)

If you want to define the derivative of $f$ with respect to $\mathbf{x}$ as the $d\mathbf{x}$ coefficient in $df$, you use the last expression, obtaining the row vector $\boldsymbol{\nabla}f^T$. Defining it instead as the left-hand argument of the dot product, giving the column vector $\boldsymbol{\nabla}f$, is an alternative convention.
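The remark about the $1\times 1$ matrix can be made concrete with explicit shapes. This is only a sketch with hypothetical values: the gradient and displacement below are made up for illustration, and `matmul` and `transpose` are small helpers written here, not library functions.

```python
# Sketch: a row vector (1 x n) times a column vector (n x 1) is a 1 x 1 matrix,
# and df is its single entry. All numbers are hypothetical.

def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [
        [sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
        for i in range(len(A))
    ]

def transpose(A):
    """Swap rows and columns of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]

grad_col = [[0.0], [-10.0]]  # column-vector convention, shape 2 x 1
dx_col = [[0.01], [0.02]]    # small displacement, shape 2 x 1

grad_row = transpose(grad_col)      # row-vector convention, shape 1 x 2
product = matmul(grad_row, dx_col)  # shape 1 x 1
df = product[0][0]                  # the unique entry of the 1 x 1 matrix
```

Either layout carries the same numbers; the row form is simply the one that can be multiplied directly against the column $d\mathbf{x}$.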







answered Nov 30 at 9:42 by J.G.






















Why not use the Leibniz rule? With $\langle .,.\rangle$ denoting the standard inner product, we have
$$D_p(\langle x,x\rangle_G)=2\langle p,x\rangle_G=2p^TGx=2\langle p,Gx\rangle.$$

Note that the derivative of $f\colon\mathbb R^n\to\mathbb R$ is not a vector, but a linear form instead. The gradient $\nabla^{\langle .,.\rangle_G}f$ with respect to the inner product $\langle .,.\rangle_G$ is the unique vector which represents this linear form in the presence of the specified inner product. In our case we have
$$\nabla^{\langle .,.\rangle_G}f(x)=2x,\quad\text{that is}\quad
D_p(\langle x,x\rangle_G)=\langle p,2x\rangle_G$$

whereas
$$\nabla^{\langle .,.\rangle}f(x)=2Gx,\quad\text{and that is}\quad
D_p(\langle x,x\rangle_G)=\langle p,2Gx\rangle$$







                                              share|cite|improve this answer














                                              share|cite|improve this answer



                                              share|cite|improve this answer








                                              edited Nov 30 at 16:21

























                                              answered Nov 30 at 16:14









                                              Michael Hoppe






























