Question

I asked myself whether the this pointer could be overused, since I usually use it every single time I refer to a member variable or function. I wondered if it could have a performance impact, since there must be a pointer which needs to be dereferenced every time. So I wrote some test code:

struct A {
    int x;

    A(int X) {
        x = X; /* And a second time with this->x = X; */
    }
};

int main() {
    A a(8);

    return 0;
}

and surprisingly even with -O0 they output the exact same assembler code.

Also if I use a member function and call it in another member function it shows the same behavior. So is the this pointer just a compile time thing and not an actual pointer? Or are there cases where this is actually translated and dereferenced? I use GCC 4.4.3 btw.

Possible duplicate of Is there overhead using this-> in c++? — Nov 12 at 19:04

StoryTeller 88.8k12179245 · Accepted Answer · 2018-11-12 15:05:58Z

So is the this pointer just a compile time thing and not an actual pointer?

It very much is a run-time thing. It refers to the object on which the member function is invoked, and naturally that object can exist at run time.

What is a compile-time thing is how name lookup works. When a compiler encounters x = X it must figure out which x is being assigned. So it looks the name up and finds the member variable. Since this->x and x refer to the same thing, naturally you get the same assembly output.
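
A minimal sketch of what that lookup result means in practice, using a hypothetical struct Point that is not part of the question: the unqualified name and the explicit this-> spelling resolve to the same member of the same object, so their addresses compare equal.

```cpp
#include <cassert>

// Hypothetical type, for illustration only.
struct Point {
    int x;

    // Unqualified name: lookup finds the member, and the compiler
    // treats the expression as (*this).x.
    const int* implicit_addr() const { return &x; }

    // Explicit spelling of the same member access.
    const int* explicit_addr() const { return &this->x; }
};

// Both spellings name the same int inside the same instance.
inline void check_same_member() {
    Point p{3};
    assert(p.implicit_addr() == p.explicit_addr());
    assert(p.implicit_addr() == &p.x);
}
```

Calling check_same_member() from main should pass silently; the choice between x and this->x only affects how the name is found, not what is accessed.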

Toby Speight 15.8k133965 · Answer 2 · 2018-11-12 18:58:53Z

It is an actual pointer, as the standard specifies it (§12.2.2.1):

In the body of a non-static (12.2.1) member function, the keyword this is a prvalue expression whose value is the address of the object for which the function is called. The type of this in a member function of a class X is X*.

this is actually implicit every time you reference a non-static member variable or member function within a class's own code. It is also needed (whether implicit or explicit) because the compiler needs to tie the function or the variable back to an actual object at runtime.

Using it explicitly is rarely useful, unless you need, for example, to disambiguate between a parameter and a member variable within a member function. Otherwise, without it the compiler will shadow the member variable with the parameter (See it live on Coliru).
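
To make the shadowing concrete, here is a small sketch with a hypothetical struct Widget (not the code behind the Coliru link). Inside both member functions the plain name x refers to the parameter, and only this->x reaches the member.

```cpp
#include <cassert>

// Hypothetical type, for illustration only.
struct Widget {
    int x = 1;  // member

    // x is the parameter, this->x is the member.
    int delta(int x) const { return x - this->x; }

    // Both operands are the parameter, so the member is never read.
    int delta_shadowed(int x) const { return x - x; }
};

inline void check_shadowing() {
    Widget w;
    assert(w.delta(5) == 4);           // 5 - member (1)
    assert(w.delta_shadowed(5) == 0);  // 5 - 5
}
```

GCC and Clang can report this kind of hiding with -Wshadow, which is not part of -Wall.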

answered Nov 12 at 15:20 by JBL · edited Nov 12 at 18:58 by Toby Speight

You also need to explicitly write this-> when accessing a member of a dependent base type from a template member. Not often needed, and a good compiler will diagnose exactly when you forget it, but worth mentioning.
– Toby Speight
Nov 12 at 18:58
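
For the template case raised in this comment, a minimal sketch with hypothetical class templates Base and Derived: because the base class depends on T, unqualified value or get() would not be looked up in it, and this-> defers the lookup until the template is instantiated.

```cpp
#include <cassert>

// Hypothetical templates, for illustration only.
template <typename T>
struct Base {
    T value{};
    T get() const { return value; }
};

template <typename T>
struct Derived : Base<T> {
    // Writing plain `value + get()` here is rejected by conforming
    // compilers, because Base<T> is a dependent base class.
    T doubled() const { return this->value + this->get(); }
};

inline void check_dependent_base() {
    Derived<int> d;
    d.value = 21;
    assert(d.doubled() == 42);
}
```

Qualifying the name as Base<T>::value is another way to get the same effect.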

It can also be very useful to write "this->" when developing with an IDE, because the IDE can then provide a list of members to select from. (Personally, I tend not to use an IDE, but if one chooses to, taking advantage of it seems sensible.)
– Martin Bonner
2 days ago

"Using it explicitly is rarely useful": from the compiler's perspective, true; from a human perspective, some teams will enforce this as a style rule to prevent bugs introduced by human error.
– Tezra
2 days ago

score 14 · Answer 3 · 2018-11-12 17:01:01Z

this always has to exist when you are in a non-static method. Whether you explicitly use it or not, you have to have a reference to the current instance, and this is what this gives you.

In both cases, you are going to access memory through the this pointer. It's just that you can omit it in some cases.

answered Nov 12 at 15:05 by Matthieu Brucher · edited Nov 12 at 17:01

Essentially, syntactic sugar (whether by inclusion or omission, it's a shortcut).
– Draco18s
Nov 12 at 16:55

Peter Cordes 114k16173297 · Answer 4 · 2018-11-12 16:03:26Z

This is almost a duplicate of How do objects work in x86 at the assembly level?, where I comment the asm output of some examples, including showing which register the this pointer was passed in.

In asm, this works exactly like a hidden first arg, so both the member-function foo::add(int) and the non-member add which takes an explicit foo* first arg compile to exactly the same asm.

struct foo {
    int m;
    void add(int a);  // not inline so we get a stand-alone definition emitted
};

void foo::add(int a) {
    this->m += a;
}

void add(foo *obj, int a) {
    obj->m += a;
}

On the Godbolt compiler explorer, compiling for x86-64 with the System V ABI (first arg in RDI, second in RSI), we get:

# gcc8.2 -O3
foo::add(int):
        add     DWORD PTR [rdi], esi   # memory-destination add
        ret
add(foo*, int):
        add     DWORD PTR [rdi], esi
        ret

I use GCC 4.4.3

That was released in January 2010, so it's missing nearly a decade of improvements to the optimizer, and to error messages. The gcc7 series has been out and stable for a while. Expect missed optimizations with such an old compiler, especially for modern instruction sets like AVX.

Peter Mortensen 13.3k1983111 · Answer 5 · 2018-11-13 05:14:28Z

After compilation, every symbol is just an address, so it can't be a run-time issue.

Any member symbol is compiled to an offset in the current class anyway, even if you didn't use this.

When a name is used in C++, it can refer to one of the following:

In the global namespace (like ::name), or in the current namespace, or in a used namespace (when using namespace ... has been used)

In the current class

Local definition, in an enclosing block

Local definition, in the current block

Therefore, when you write code, the compiler has to search each of these scopes for the symbol name, starting from the current block and working outward to the global namespace.

Using this->name helps the compiler narrow the search for name to the current class scope only, meaning it skips local definitions and, if the name is not found in class scope, does not look for it in the global scope.
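
The scopes listed above can be sketched as follows. The struct Holder and its member functions are hypothetical; the same identifier name exists at global, class, and block scope, and each spelling picks a different one.

```cpp
#include <cassert>

int name = 1;  // global scope

// Hypothetical type, for illustration only.
struct Holder {
    int name = 2;  // class scope

    int lookup_unqualified() const {
        int name = 3;  // block scope hides the member and the global
        return name;
    }

    int lookup_this() const {
        int name = 3;
        (void)name;         // silence the unused-variable warning
        return this->name;  // class scope only, the local is skipped
    }

    int lookup_global() const { return ::name; }
};

inline void check_lookup() {
    Holder h;
    assert(h.lookup_unqualified() == 3);
    assert(h.lookup_this() == 2);
    assert(h.lookup_global() == 1);
}
```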

Helmut Zeisel 663 · Answer 6 · 2018-11-12 16:00:19Z

Here is a simple example of how "this" could be useful at run time:

#include <vector>
#include <string>
#include <iostream>

class A;
typedef std::vector<A*> News;
class A
{
public:
    A(const char* n): name(n){}
    std::string name;
    void subscribe(News& n)
    {
       n.push_back(this);
    }
};

int main()
{
    A a1("Alex"), a2("Bob"), a3("Chris");
    News news;

    a1.subscribe(news);
    a3.subscribe(news);

    std::cout << "Subscriber:";
    for(auto& a: news)
    {
      std::cout << " " << a->name;
    }
    return 0;
}

Trass3r 2,7701429 · Answer 7 · 2018-11-13 05:49:29Z

Your machine does not know anything about class methods; they are normal functions under the hood.
Hence methods have to be implemented by always passing a pointer to the current object. It's just implicit in C++, i.e. T Class::method(...) is just syntactic sugar for T Class_Method(Class* this, ...).

Other languages like Python or Lua choose to make it explicit, and modern object-oriented C APIs like Vulkan (unlike OpenGL) use a similar pattern.
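
The equivalence can be sketched in standard C++ as well. The names Counter, counter_add and run_all_three are hypothetical; std::mem_fn wraps a member function so that the object is passed as an ordinary first argument, which is roughly the explicit form described above.

```cpp
#include <cassert>
#include <functional>

// Hypothetical type, for illustration only.
struct Counter {
    int total = 0;
    void add(int n) { total += n; }  // `this` is supplied implicitly
};

// Free-function spelling of the same operation: the object pointer
// is an explicit parameter (it cannot be named `this` in C++).
inline void counter_add(Counter* self, int n) { self->total += n; }

inline int run_all_three() {
    Counter c;
    c.add(1);                           // implicit this == &c
    counter_add(&c, 2);                 // explicit pointer
    std::mem_fn(&Counter::add)(&c, 3);  // member called with the object first
    assert(c.total == 6);
    return c.total;
}
```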

Agent_L 3,1811620 · Answer 8 · 2018-11-13 12:26:00Z

since I usually use it every single time I refer to a member variable or function.

You always use this when you refer to a member variable or function. There is simply no other way to reach members. The only choice is implicit vs explicit notation.

Let's go back to see how it was done before this to understand what this is.

Without OOP:

struct A {
    int x;
};

void bar(int);  // some function taking an int, assumed to be defined elsewhere

void foo(A* that) {
    bar(that->x);
}

With OOP, but writing this explicitly:

struct A {
    int x;

    void foo(void) {
        bar(this->x);
    }
};

Using the shorter notation:

struct A {
    int x;

    void foo(void) {
        bar(x);
    }
};

But the difference is only in the source code. All are compiled to the same thing. If you create a member method, the compiler will create a pointer argument for you and name it "this". If you omit this-> when referring to a member, the compiler is just clever enough to insert it for you most of the time. That's it. The only difference is six fewer characters in the source.

Writing this explicitly makes sense when there is an ambiguity, namely another variable named just like your member variable:

struct A {
    int x;

    A(int x) {
        this->x = x;
    }
};

There are some instances, like __thiscall, where OO and non-OO code may end up a bit different in asm, but whether the pointer is passed on the stack and then optimized into a register, or passed in ECX from the very beginning, doesn't make it "not a pointer".

Szak1 37539 · Answer 9 · 2018-11-13 15:17:44Z

"this" can also safeguard against shadowing by a function parameter, for example:

class Vector {
   public:
      double x,y,z;
      void SetLocation(double x, double y, double z);
};

void Vector::SetLocation(double x, double y, double z) {
   this->x = x; //Passed parameter assigned to member variable
   this->y = y;
   this->z = z;
}

(Obviously, writing such code is discouraged.)

Usually shadowing comes up as an issue when the member variable is being shadowed by an introduced local variable (where you normally aren't thinking of what is in the global scope), so use of this->x is encouraged to prevent such modification bugs.
– Tezra
2 days ago

Yeah unfortunately -Wshadow is not enabled with -Wall. gcc.gnu.org/onlinedocs/gcc/Warning-Options.html
– Trass3r
yesterday

score 2 · Answer 10 · 2018-11-14 01:38:58Z

If the compiler inlines a member function that is called with static rather than dynamic binding, it might be able to optimize away the this pointer. Take this simple example:

#include <iostream>

using std::cout;
using std::endl;

class example {
  public:
  int foo() const { return x; }
  int foo(const int i) { return (x = i); }

  private:
  int x;
};

int main(void)
{
  example e;
  e.foo(10);
  cout << e.foo() << endl;
}

GCC 7.3.0 with the -march=x86-64 -O -S flags is able to compile cout << e.foo() to three instructions:

movl    $10, %esi
leaq    _ZSt4cout(%rip), %rdi
call    _ZNSolsEi@PLT

This is a call to std::ostream::operator<<. Remember that cout << e.foo(); is syntactic sugar for std::ostream::operator<< (cout, e.foo());. And operator<<(int) could be written two ways: static operator<< (ostream&, int), as a non-member function, where the operand on the left is an explicit parameter, or operator<<(int), as a member function, where it’s implicitly this.

The compiler was able to deduce that e.foo() will always be the constant 10. Since the 64-bit x86 calling convention is to pass function arguments in registers, that compiles down to the single movl instruction, which sets the second function parameter to 10. The leaq instruction sets the first argument (which might be an explicit ostream& or the implicit this) to &cout. Then the program makes a call to the function.

In more complex cases, though—such as if you have a function taking an example& as a parameter—the compiler needs to look up this, as this is what tells the program which instance it’s working with, and therefore, which instance’s x data member to look up.

Consider this example:

class example {
  public:
  int foo() const { return x; }
  int foo(const int i) { return (x = i); }

  private:
  int x;
};

int bar( const example& e )
{
  return e.foo();
}

The function bar() gets compiled to a bit of boilerplate and the instruction:

movl    (%rdi), %eax
ret

You remember from the previous example that %rdi on x86-64 is the first function argument, the implicit this pointer for the call to e.foo(). Putting it in parentheses, (%rdi), means look up the variable at that location. (Since the only data in an example instance is x, &e.x happens to be the same as &e in this case.) Moving the contents to %eax sets the return value.

In this case, the compiler needed the implicit this argument to foo(/* example* this */) to be able to find &e and therefore &e.x. In fact, inside a member function (that isn’t static), x, this->x and (*this).x all mean the same thing.

Alexander 30.1k44474 · Answer 11 · 2018-11-14 02:44:59Z

this is a pointer. It's like an implicit parameter that's part of every method. You could imagine using plain C functions and writing code like:

Socket *makeSocket(int port) { ... }
void send(Socket *this, Value v) { ... }
Value receive(Socket *this) { ... }

Socket *mySocket = makeSocket(1234);
send(mySocket, someValue); // The subject, `mySocket`, is passed in as a param called "this", explicitly
Value newData = receive(mySocket);

In C++, similar code might look like:

mySocket.send(someValue); // The subject, `mySocket`, is passed in as a param called "this"
Value newData = mySocket.receive();

Justin Time 2,9621328 · Answer 12 · 2018-11-14 22:53:40Z

this is indeed a runtime pointer (albeit one implicitly supplied by the compiler), as has been reiterated in most answers. It is used to indicate which instance of a class a given member function is to operate on when called; for any given instance c of class C, when any member function cf() is called, c.cf() will be supplied a this pointer equal to &c (this naturally also applies to any struct s of type S, when calling member function s.sf(), as shall be used for cleaner demonstrations). It can even be cv-qualified just as any other pointer, with the same effects (but, unfortunately, not the same syntax due to being special); this is commonly used for const correctness, and much less frequently for volatile correctness.

#include <cstdint>
#include <iostream>
#include <string>

template<typename T>
uintptr_t addr_out(T* ptr) { return reinterpret_cast<uintptr_t>(ptr); }

struct S {
    int i;

    uintptr_t address() const { return addr_out(this); }
};

// Format a given numerical value into a hex value for easy display.
// Implementation omitted for brevity.
template<typename T>
std::string hex_out_s(T val, bool disp0X = true);

// ...

S s[2];

std::cout << "Control example: Two distinct instances of simple class.\n";
std::cout << "s[0] address:\t\t\t\t"        << hex_out_s(addr_out(&s[0]))
          << "\n* s[0] this pointer:\t\t\t" << hex_out_s(s[0].address())
          << "\n\n";
std::cout << "s[1] address:\t\t\t\t"        << hex_out_s(addr_out(&s[1]))
          << "\n* s[1] this pointer:\t\t\t" << hex_out_s(s[1].address())
          << "\n\n";

Sample output:

Control example: Two distinct instances of simple class.
s[0] address:                           0x0000003836e8fb40
* s[0] this pointer:                    0x0000003836e8fb40

s[1] address:                           0x0000003836e8fb44
* s[1] this pointer:                    0x0000003836e8fb44

These values aren't guaranteed, and can easily change from one execution to the next; this can most easily be observed while creating and testing a program, through the use of build tools.
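
The cv-qualification of this mentioned at the top of this answer can be checked at compile time. A minimal sketch, assuming a C++11 compiler and a hypothetical struct Probe that is not part of the program above: the qualifiers written after the parameter list show up in the pointee type of this.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical type, for illustration only.
struct Probe {
    int i = 0;

    void mutate() {
        static_assert(std::is_same<decltype(this), Probe*>::value,
                      "unqualified member function: this is Probe*");
        i = 1;
    }

    int read() const {
        static_assert(std::is_same<decltype(this), const Probe*>::value,
                      "const member function: this is const Probe*");
        return i;
    }

    int read_volatile() const volatile {
        static_assert(std::is_same<decltype(this), const volatile Probe*>::value,
                      "const volatile member function: this is const volatile Probe*");
        return i;
    }
};

inline void check_cv() {
    Probe p;
    p.mutate();
    assert(p.read() == 1);
    assert(p.read_volatile() == 1);
}
```

If any of the static_assert conditions were wrong, the translation unit would fail to compile, so a successful build is already the check.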

Mechanically, it's similar to a hidden parameter added to the start of each member function's argument list; x.f() cv can be seen as a special variant of f(cv X* this), albeit with a different format for linguistic reasons. In fact, there were recent proposals by both Stroustrup and Sutter to unify the call syntax of x.f(y) and f(x, y), which would've made this implicit behaviour an explicit linguistic rule. It unfortunately was met with concerns that it may cause a few unwanted surprises for library developers, and thus has not yet been implemented; to my knowledge, the most recent proposal is a joint proposal, for f(x,y) to be able to fall back on x.f(y) if no f(x,y) is found, similar to the interaction between, e.g., std::begin(x) and member function x.begin().

In this case, this would be more akin to a normal pointer, and the programmer would be able to specify it manually. If a solution is found to allow the more robust form without violating the principle of least astonishment (or bringing any other concerns to pass), then an equivalent to this would also be able to be implicitly generated as a normal pointer for non-member functions, as well.

Relatedly, one important thing to note is that this is the instance's address, as seen by that instance; while the pointer itself is a runtime thing, it doesn't always have the value you'd think it has. This becomes relevant when looking at classes with more complex inheritance hierarchies. Specifically, when looking at cases where one or more member classes that contain member functions don't have the same address as the derived class itself. Three cases in particular come to mind:

(Note that these are demonstrated using MSVC, with class layouts output via the undocumented -d1reportSingleClassLayout compiler parameter, due to me finding it more easily readable than GCC or Clang equivalents.)

Non-standard layout: When a class is standard layout, the address of an instance's first data member is exactly identical to the address of the instance itself; thus, this can be said to be equivalent to the first data member's address. This will hold true even if said data member is a member of a base class, as long as the derived class continues to follow standard layout rules. ...Conversely, this also means that if the derived class isn't standard layout, then this is no longer guaranteed.

struct StandardBase {
    int i;

    uintptr_t address() const { return addr_out(this); }
};

struct NonStandardDerived : StandardBase {
    virtual void f() {}

    uintptr_t address() const { return addr_out(this); }
};

static_assert(std::is_standard_layout<StandardBase>::value, "Nyeh.");
static_assert(!std::is_standard_layout<NonStandardDerived>::value, ".heyN");

// ...

NonStandardDerived n;

std::cout << "Derived class with non-standard layout:"
          << "\n* n address:\t\t\t\t\t"                      << hex_out_s(addr_out(&n))
          << "\n* n this pointer:\t\t\t\t"                   << hex_out_s(n.address())
          << "\n* n this pointer (as StandardBase):\t\t"     << hex_out_s(n.StandardBase::address())
          << "\n* n this pointer (as NonStandardDerived):\t" << hex_out_s(n.NonStandardDerived::address())
          << "\n\n";

Sample output:

Derived class with non-standard layout:
* n address:                                    0x00000061e86cf3c0
* n this pointer:                               0x00000061e86cf3c0
* n this pointer (as StandardBase):             0x00000061e86cf3c8
* n this pointer (as NonStandardDerived):       0x00000061e86cf3c0

Note that StandardBase::address() is supplied with a different this pointer than NonStandardDerived::address(), even when called on the same instance. This is because the latter's use of a vtable caused the compiler to insert a hidden member.

class StandardBase      size(4):
        +---
 0      | i
        +---
class NonStandardDerived        size(16):
        +---
 0      | {vfptr}
        | +--- (base class StandardBase)
 8      | | i
        | +---
        | <alignment member> (size=4)
        +---
NonStandardDerived::$vftable@:
        | &NonStandardDerived_meta
        |  0
 0      | &NonStandardDerived::f
NonStandardDerived::f this adjustor: 0
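
As a small cross-check of the standard-layout rule described above, the following sketch (hypothetical struct Plain, not part of the original program) compares the address of an object with the address of its first data member. For a standard-layout class the two are required to match.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical type, for illustration only.
struct Plain {
    int first;
    int second;
};

static_assert(std::is_standard_layout<Plain>::value,
              "Plain is expected to be standard layout");

// For a standard-layout class, a pointer to the object and a pointer
// to its first non-static data member hold the same address.
inline bool first_member_shares_address(const Plain& p) {
    return static_cast<const void*>(&p) == static_cast<const void*>(&p.first);
}

inline void check_standard_layout() {
    Plain p{1, 2};
    assert(first_member_shares_address(p));
}
```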

Virtual base classes: Due to virtual bases trailing after the most-derived class, the this pointer supplied to a member function inherited from a virtual base will be different than the one provided to members of the derived class itself.

struct VBase {
    uintptr_t address() const { return addr_out(this); }
};
struct VDerived : virtual VBase {
    uintptr_t address() const { return addr_out(this); }
};

// ...

VDerived v;

std::cout << "Derived class with virtual base:"
          << "\n* v address:\t\t\t\t\t"              << hex_out_s(addr_out(&v))
          << "\n* v this pointer:\t\t\t\t"           << hex_out_s(v.address())
          << "\n* this pointer (as VBase):\t\t\t"    << hex_out_s(v.VBase::address())
          << "\n* this pointer (as VDerived):\t\t\t" << hex_out_s(v.VDerived::address())
          << "\n\n";

Sample output:

Derived class with virtual base:
* v address:                                    0x0000008f8314f8b0
* v this pointer:                               0x0000008f8314f8b0
* this pointer (as VBase):                      0x0000008f8314f8b8
* this pointer (as VDerived):                   0x0000008f8314f8b0

Once again, the base class' member function is supplied with a different this pointer, due to VDerived's inherited VBase having a different starting address than VDerived itself.

class VDerived  size(8):
        +---
 0      | {vbptr}
        +---
        +--- (virtual base VBase)
        +---
VDerived::$vbtable@:
 0      | 0
 1      | 8 (VDerivedd(VDerived+0)VBase)
vbi:       class  offset o.vbptr  o.vbte fVtorDisp
           VBase       8       0       4 0

Multiple inheritance: As can be expected, multiple inheritance can easily lead to cases where the this pointer passed to one member function is different than the this pointer passed to a different member function, even if both functions are called with the same instance. This can come up for member functions of any base class other than the first, similarly to when working with non-standard layout classes (where all base classes after the first start at a different address than the derived class itself)... but it can be especially surprising in the case of virtual functions, when multiple members supply virtual functions with the same signature.

struct Base1 {
    int i;

    virtual uintptr_t address() const { return addr_out(this); }
    uintptr_t raw_address() { return addr_out(this); }
};
struct Base2 {
    short s;

    virtual uintptr_t address() const { return addr_out(this); }
    uintptr_t raw_address() { return addr_out(this); }
};
struct Derived : Base1, Base2 {
    bool b;

    uintptr_t address() const override { return addr_out(this); }
    uintptr_t raw_address() { return addr_out(this); }
};

// ...

Derived d;

std::cout << "Derived class with multiple inheritance:"
          << "\n  (Calling address() through a static_cast reference, then the appropriate raw_address().)"
          << "\n* d address:\t\t\t\t\t"               << hex_out_s(addr_out(&d))
          << "\n* d this pointer:\t\t\t\t"            << hex_out_s(d.address())                          << " (" << hex_out_s(d.raw_address())          << ")"
          << "\n* d this pointer (as Base1):\t\t\t"   << hex_out_s(static_cast<Base1&>((d)).address())   << " (" << hex_out_s(d.Base1::raw_address())   << ")"
          << "\n* d this pointer (as Base2):\t\t\t"   << hex_out_s(static_cast<Base2&>((d)).address())   << " (" << hex_out_s(d.Base2::raw_address())   << ")"
          << "\n* d this pointer (as Derived):\t\t\t" << hex_out_s(static_cast<Derived&>((d)).address()) << " (" << hex_out_s(d.Derived::raw_address()) << ")"
          << "\n\n";

Sample output:

Derived class with multiple inheritance:
  (Calling address() through a static_cast reference, then the appropriate raw_address().)
* d address:                                    0x00000056911ef530
* d this pointer:                               0x00000056911ef530 (0x00000056911ef530)
* d this pointer (as Base1):                    0x00000056911ef530 (0x00000056911ef530)
* d this pointer (as Base2):                    0x00000056911ef530 (0x00000056911ef540)
* d this pointer (as Derived):                  0x00000056911ef530 (0x00000056911ef530)

We would expect each raw_address() to follow the same rules, since each is explicitly a separate function, and thus that Base2::raw_address() will return a different value than Derived::raw_address(). But since we know virtual functions will always call the most-derived form, how is address() correct when called from a reference to Base2? This is due to a little compiler trickery called an "adjustor thunk", which is a helper that takes a base class instance's this pointer and adjusts it to point to the most-derived class instead, when necessary.

class Derived   size(40):
        +---
        | +--- (base class Base1)
 0      | | {vfptr}
 8      | | i
        | | <alignment member> (size=4)
        | +---
        | +--- (base class Base2)
16      | | {vfptr}
24      | | s
        | | <alignment member> (size=6)
        | +---
32      | b
        | <alignment member> (size=7)
        +---
Derived::$vftable@Base1@:
        | &Derived_meta
        |  0
 0      | &Derived::address
Derived::$vftable@Base2@:
        | -16
 0      | &thunk: this-=16; goto Derived::address
Derived::address this adjustor: 0
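
The pointer adjustment under multiple inheritance can also be observed without any compiler-specific layout dump. A minimal sketch with hypothetical types Left, Right and Both: the two non-empty base subobjects cannot share an address, so member functions of the two bases receive different this pointers for the same object, while converting back to the derived type recovers the original address.

```cpp
#include <cassert>

// Hypothetical types, for illustration only.
struct Left  { int a = 0; const void* self() const { return this; } };
struct Right { int b = 0; const void* self() const { return this; } };
struct Both : Left, Right {};

inline void check_base_this() {
    Both obj;

    // Two distinct, non-empty base subobjects have distinct addresses,
    // so the two inherited functions see different `this` values.
    assert(obj.Left::self() != obj.Right::self());

    // Derived-to-base conversion applies the subobject offset ...
    Right* as_right = &obj;
    assert(static_cast<const void*>(as_right) == obj.Right::self());

    // ... and base-to-derived conversion undoes it.
    assert(static_cast<Both*>(as_right) == &obj);
}
```

Which of the two bases sits at offset zero is left to the ABI, so the sketch deliberately avoids asserting a particular layout.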

If you're curious, feel free to tinker around with this little program, to take a look at how the addresses change if you run it multiple times, or at cases where it might have a different value than you may expect.

StoryTeller 88.8k12179245 · Accepted Answer · 2018-11-12 15:05:58Z

So is the this pointer just a compile time thing and not an actual pointer?

It very much is a run time thing. It refers to the object on which the member function is invoked, naturally that object can exist at run time.

What is a compile time thing is how name lookup works. When a compiler encounters x = X it must figure out what is this x that is being assigned. So it looks it up, and finds the member variable. Since this->x and x refer to the same thing, naturally you get the same assembly output.

Comments are not for extended discussion; this conversation has been moved to chat. — yesterday

Toby Speight 15.8k133965 · Answer 14 · 2018-11-12 18:58:53Z

up vote
23
down vote

It is an actual pointer, as the standard specifies it (§12.2.2.1):

In the body of a non-static (12.2.1) member function, the keyword this is a prvalue expression whose value is the address of the object for which the function is called. The type of this in a member function of a class X is X*.

this is actually implicit every time you reference a non-static member variable or member function within a class own code. It is also needed (either when implicit or explicit) because the compiler needs to tie back the function or the variable to an actual object at runtime.

Using it explicitly is rarely useful, unless you need, for example, to disambiguate between a parameter and a member variable within a member function. Otherwise, without it the compiler will shadow the member variable with the parameter (See it live on Coliru).

edited Nov 12 at 18:58

Toby Speight

15.8k133965

answered Nov 12 at 15:20

JBL

9,52433567

6

You also need to explicitly write this-> when accessing a member of a non-dependent base type from a template member. Not often needed, and a good compiler will diagnose exactly when you forget it, but worth mentioning.
– Toby Speight
Nov 12 at 18:58

1

It can also be very useful to write "this->" when developing with an IDE, because the IDE can then provide a list of members to select from. (Personally, I tend not to use an IDE, but if one chooses to, taking advantage of it seems sensible.)
– Martin Bonner
2 days ago

3

"Using it explicitly is rarely useful", from the compiler perspective, true; From a human perspective, some teams will enforce this as a style rule to prevent human-error introduced bugs.
– Tezra
2 days ago

add a comment |

score 14 · Answer 15 · 2018-11-12 17:01:01Z

up vote
14
down vote

this always has to exist when you are in a non-static method. Whether you explicitly use it or not, you have to have a reference to the current instance, and this is what this gives you.

In both cases, you are going to access memory through the this pointer. It's just that you can omit it in some cases.

edited Nov 12 at 17:01

answered Nov 12 at 15:05

Matthieu Brucher

5,5061128

Essentially, syntactical sugar (whether by inclusion or omission, its a shortcut).
– Draco18s
Nov 12 at 16:55

add a comment |

Peter Cordes 114k16173297 · Answer 16 · 2018-11-12 16:03:26Z

This is almost a duplicate of How do objects work in x86 at the assembly level?, where I comment the asm output of some examples, including showing which register the this pointer was passed in.

In asm, this works exactly like a hidden first arg, so both the member-function foo::add(int) and the non-member add which takes an explicit foo* first arg compile to exactly the same asm.

struct foo {

    int m;

    void add(int a);  // not inline so we get a stand-alone definition emitted

};



void foo::add(int a) {

    this->m += a;

}



void add(foo *obj, int a) {

    obj->m += a;

}

On the Godbolt compiler explorer, compiling for x86-64 with the System V ABI (first arg in RDI, second in RSI), we get:

# gcc8.2 -O3

foo::add(int):

        add     DWORD PTR [rdi], esi   # memory-destination add

        ret

add(foo*, int):

        add     DWORD PTR [rdi], esi

        ret

I use GCC 4.4.3

That was released in January 2010, so it's missing nearly a decade of improvements to the optimizer, and to error messages. The gcc7 series has been out and stable for a while. Expect missed optimizations with such an old compiler, especially for modern instruction sets like AVX.

Peter Mortensen 13.3k1983111 · Answer 17 · 2018-11-13 05:14:28Z

After compilation, every symbol is just an address, so it can't be a run-time issue.

Any member symbol is compiled to an offset in the current class anyway, even if you didn't use this.

When name is used in C++ it can be one of the following.

In the global namespace (like ::name), or in the current namespace, or in the used namespace (when using namespace ... been used)

In the current class

Local definition, in upper block

Local definition, in current block

Therefore, when you write code, the compiler should scan each, in a manner to look for the symbol name, from the current block and up to the global namespace.

Using this->name helps the compiler to narrow the search for name to only look for it in the current class scope, meaning it skips local definitions, and if not found in class scope, do not look for it in the global scope.

Helmut Zeisel 663 · Answer 18 · 2018-11-12 16:00:19Z

Here is a simple example how "this" could be useful during runtime:

#include <vector>

#include <string>

#include <iostream>



class A;

typedef std::vector<A*> News; 

class A

{

public:

    A(const char* n): name(n){}

    std::string name;

    void subscribe(News& n)

    {

       n.push_back(this);

    }

};



int main()

{

    A a1("Alex"), a2("Bob"), a3("Chris");

    News news;



    a1.subscribe(news);

    a3.subscribe(news);



    std::cout << "Subscriber:";

    for(auto& a: news)

    {

      std::cout << " " << a->name;

    }

    return 0;

}

Trass3r 2,7701429 · Answer 19 · 2018-11-13 05:49:29Z

Your machine does not know anything about class methods, they are normal functions under the hood.
Hence methods have to be implemented by always passing a pointer to the current object, it's just implicit in C++, i.e. T Class::method(...) is just syntactic sugar for T Class_Method(Class* this, ...).

Other languages like Python or Lua choose to make it explicit and modern object-oriented C APIs like Vulkan (unlike OpenGL) use a similar pattern.

Agent_L 3,1811620 · Answer 20 · 2018-11-13 12:26:00Z

since I usually use it every single time I refer to a member variable or function.

You always use this when you refer to a member variable or function. There is simply no other way to reach members. The only choice is implicit vs explicit notation.

Let's go back to see how it was done before this to understand what this is.

Without OOP:

struct A {

    int x;

};



void foo(A* that) {

    bar(that->x)

}

With OOP but writing this explicitly

struct A {

    int x;



    void foo(void) {

        bar(this->x)

    }

};

using shorter notation:

struct A {

    int x;



    void foo(void) {

        bar(x)

    }

};

But the difference is only in the source code. All three are compiled to the same thing. If you create a member function, the compiler creates a pointer parameter for you and names it "this". If you omit this-> when referring to a member, the compiler is clever enough to insert it for you most of the time. That's it. The only difference is six fewer characters in the source.

Writing this explicitly makes sense when there is an ambiguity, namely another variable named just like your member variable:

struct A {

    int x;



    A(int x) {

        this->x = x;

    }

};

There are some instances, like __thiscall, where OO and non-OO code may end up a bit different in asm, but whether the pointer is passed on the stack and then optimized into a register, or passed in ECX from the very beginning, doesn't make it "not a pointer".

Szak1 37539 · Answer 21 · 2018-11-13 15:17:44Z


"this" can also safeguard against shadowing by a function parameter, for example:

class Vector {

   public:

      double x,y,z;

      void SetLocation(double x, double y, double z);

};



void Vector::SetLocation(double x, double y, double z) {

   this->x = x; //Passed parameter assigned to member variable

   this->y = y;

   this->z = z;

}

(Obviously, writing such code is discouraged.)


Usually shadowing comes up as an issue when the member variable is being shadowed by an introduced local variable (where you normally aren't thinking of what is in the global scope), so use of this->x is encouraged to prevent such modification bugs.
– Tezra
2 days ago

Yeah unfortunately -Wshadow is not enabled with -Wall. gcc.gnu.org/onlinedocs/gcc/Warning-Options.html
– Trass3r
yesterday


score 2 · Answer 22 · 2018-11-14 01:38:58Z

If the compiler inlines a member function that is called with static rather than dynamic binding, it might be able to optimize away the this pointer. Take this simple example:

#include <iostream>



using std::cout;

using std::endl;



class example {

  public:

  int foo() const { return x; }

  int foo(const int i) { return (x = i); }



  private:

  int x;

};



int main(void)

{

  example e;

  e.foo(10);

  cout << e.foo() << endl;

}

GCC 7.3.0 with the -march=x86-64 -O -S flags is able to compile cout << e.foo() to three instructions:

movl    $10, %esi

leaq    _ZSt4cout(%rip), %rdi

call    _ZNSolsEi@PLT

This is a call to std::ostream::operator<<. Remember that cout << e.foo(); is syntactic sugar for cout.operator<<(e.foo());. In general, an operator<< taking an int on the right could be written two ways: as a non-member function operator<<(ostream&, int), where the operand on the left is an explicit parameter, or as a member function operator<<(int), where it’s implicitly this.

The compiler was able to deduce that e.foo() will always be the constant 10. Since the 64-bit x86 calling convention is to pass function arguments in registers, that compiles down to the single movl instruction, which sets the second function parameter to 10. The leaq instruction sets the first argument (which might be an explicit ostream& or the implicit this) to &cout. Then the program makes a call to the function.

In more complex cases, though—such as if you have a function taking an example& as a parameter—the compiler needs to look up this, as this is what tells the program which instance it’s working with, and therefore, which instance’s x data member to look up.

Consider this example:

class example {

  public:

  int foo() const { return x; }

  int foo(const int i) { return (x = i); }



  private:

  int x;

};



int bar( const example& e )

{

  return e.foo();

}

The function bar() gets compiled to a bit of boilerplate and the instruction:

movl    (%rdi), %eax

ret

You remember from the previous example that %rdi on x86-64 is the first function argument, the implicit this pointer for the call to e.foo(). Putting it in parentheses, (%rdi), means look up the variable at that location. (Since the only data in an example instance is x, &e.x happens to be the same as &e in this case.) Moving the contents to %eax sets the return value.

In this case, the compiler needed the implicit this argument to foo(/* example* this */) to be able to find &e and therefore &e.x. In fact, inside a member function (that isn’t static), x, this->x and (*this).x all mean the same thing.

Alexander 30.1k44474 · Answer 23 · 2018-11-14 02:44:59Z

this is a pointer. It's like an implicit parameter that's part of every method. You could imagine using plain C functions and writing code like:

Socket *makeSocket(int port) { ... }

void send(Socket *this, Value v) { ... }

Value receive(Socket *this) { ... }



Socket *mySocket = makeSocket(1234);

send(mySocket, someValue); // The subject, `mySocket`, is passed in as a param called "this", explicitly

Value newData = receive(mySocket);

In C++, similar code might look like:

mySocket.send(someValue); // The subject, `mySocket`, is passed in as a param called "this"

Value newData = mySocket.receive();

Justin Time 2,9621328 · Answer 24 · 2018-11-14 22:53:40Z

this is indeed a runtime pointer (albeit one implicitly supplied by the compiler), as most answers have reiterated. It indicates which instance of a class a given member function is to operate on when called; for any given instance c of class C, when any member function cf() is called, c.cf() will be supplied a this pointer equal to &c (this naturally also applies to any struct s of type S, when calling member function s.sf(), as shall be used for cleaner demonstrations). It can even be cv-qualified just like any other pointer, with the same effects (but, unfortunately, not the same syntax, due to being special); this is commonly used for const correctness, and much less frequently for volatile correctness.

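#include <cstdint>      // uintptr_t

#include <iostream>

#include <string>

#include <type_traits>  // std::is_standard_layout, used further down



template<typename T>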

uintptr_t addr_out(T* ptr) { return reinterpret_cast<uintptr_t>(ptr); }



struct S {

    int i;



    uintptr_t address() const { return addr_out(this); }

};



// Format a given numerical value into a hex value for easy display.

// Implementation omitted for brevity.

template<typename T>

std::string hex_out_s(T val, bool disp0X = true);



// ...



S s[2];



std::cout << "Control example: Two distinct instances of simple class.\n";

std::cout << "s[0] address:\t\t\t\t"        << hex_out_s(addr_out(&s[0]))

          << "\n* s[0] this pointer:\t\t\t" << hex_out_s(s[0].address())

          << "\n\n";

std::cout << "s[1] address:\t\t\t\t"        << hex_out_s(addr_out(&s[1]))

          << "\n* s[1] this pointer:\t\t\t" << hex_out_s(s[1].address())

          << "\n\n";

Sample output:

Control example: Two distinct instances of simple class.

s[0] address:                           0x0000003836e8fb40

* s[0] this pointer:                    0x0000003836e8fb40



s[1] address:                           0x0000003836e8fb44

* s[1] this pointer:                    0x0000003836e8fb44

These values aren't guaranteed, and can easily change from one execution to the next; this can most easily be observed while creating and testing a program, through the use of build tools.

Mechanically, it's similar to a hidden parameter added to the start of each member function's argument list; x.f() cv can be seen as a special variant of f(cv X* this), albeit with a different format for linguistic reasons. In fact, there were recent proposals by both Stroustrup and Sutter to unify the call syntax of x.f(y) and f(x, y), which would've made this implicit behaviour an explicit linguistic rule. It unfortunately was met with concerns that it may cause a few unwanted surprises for library developers, and thus not yet implemented; to my knowledge, the most recent proposal is a joint proposal, for f(x,y) to be able to fall back on x.f(y) if no f(x,y) is found, similar to the interaction between, e.g., std::begin(x) and member function x.begin().

In this case, this would be more akin to a normal pointer, and the programmer would be able to specify it manually. If a solution is found to allow the more robust form without violating the principle of least astonishment (or bringing any other concerns to pass), then an equivalent to this would also be able to be implicitly generated as a normal pointer for non-member functions, as well.

Relatedly, one important thing to note is that this is the instance's address, as seen by that instance; while the pointer itself is a runtime thing, it doesn't always have the value you'd think it has. This becomes relevant when looking at classes with more complex inheritance hierarchies. Specifically, when looking at cases where one or more member classes that contain member functions don't have the same address as the derived class itself. Three cases in particular come to mind:

(Note that these are demonstrated using MSVC, with class layouts output via the undocumented -d1reportSingleClassLayout compiler parameter, due to me finding it more easily readable than GCC or Clang equivalents.)

Non-standard layout: When a class is standard layout, the address of an instance's first data member is exactly identical to the address of the instance itself; thus, this can be said to be equivalent to the first data member's address. This will hold true even if said data member is a member of a base class, as long as the derived class continues to follow standard layout rules. ...Conversely, this also means that if the derived class isn't standard layout, then this is no longer guaranteed.

struct StandardBase {

    int i;



    uintptr_t address() const { return addr_out(this); }

};



struct NonStandardDerived : StandardBase {

    virtual void f() {}



    uintptr_t address() const { return addr_out(this); }

};



static_assert(std::is_standard_layout<StandardBase>::value, "Nyeh.");

static_assert(!std::is_standard_layout<NonStandardDerived>::value, ".heyN");



// ...



NonStandardDerived n;



std::cout << "Derived class with non-standard layout:"

          << "\n* n address:\t\t\t\t\t"                      << hex_out_s(addr_out(&n))

          << "\n* n this pointer:\t\t\t\t"                   << hex_out_s(n.address())

          << "\n* n this pointer (as StandardBase):\t\t"     << hex_out_s(n.StandardBase::address())

          << "\n* n this pointer (as NonStandardDerived):\t" << hex_out_s(n.NonStandardDerived::address())

          << "\n\n";

Sample output:

Derived class with non-standard layout:

* n address:                                    0x00000061e86cf3c0

* n this pointer:                               0x00000061e86cf3c0

* n this pointer (as StandardBase):             0x00000061e86cf3c8

* n this pointer (as NonStandardDerived):       0x00000061e86cf3c0

Note that StandardBase::address() is supplied with a different this pointer than NonStandardDerived::address(), even when called on the same instance. This is because the latter's use of a vtable caused the compiler to insert a hidden member.

class StandardBase      size(4):

        +---

 0      | i

        +---

class NonStandardDerived        size(16):

        +---

 0      | {vfptr}

        | +--- (base class StandardBase)

 8      | | i

        | +---

        | <alignment member> (size=4)

        +---

NonStandardDerived::$vftable@:

        | &NonStandardDerived_meta

        |  0

 0      | &NonStandardDerived::f 

NonStandardDerived::f this adjustor: 0

Virtual base classes: Due to virtual bases trailing after the most-derived class, the this pointer supplied to a member function inherited from a virtual base will be different than the one provided to members of the derived class itself.

struct VBase {

    uintptr_t address() const { return addr_out(this); }

};

struct VDerived : virtual VBase {

    uintptr_t address() const { return addr_out(this); }

};



// ...



VDerived v;



std::cout << "Derived class with virtual base:"

          << "\n* v address:\t\t\t\t\t"              << hex_out_s(addr_out(&v))

          << "\n* v this pointer:\t\t\t\t"           << hex_out_s(v.address())

          << "\n* this pointer (as VBase):\t\t\t"    << hex_out_s(v.VBase::address())

          << "\n* this pointer (as VDerived):\t\t\t" << hex_out_s(v.VDerived::address())

          << "\n\n";

Sample output:

Derived class with virtual base:

* v address:                                    0x0000008f8314f8b0

* v this pointer:                               0x0000008f8314f8b0

* this pointer (as VBase):                      0x0000008f8314f8b8

* this pointer (as VDerived):                   0x0000008f8314f8b0

Once again, the base class' member function is supplied with a different this pointer, due to VDerived's inherited VBase having a different starting address than VDerived itself.

class VDerived  size(8):

        +---

 0      | {vbptr}

        +---

        +--- (virtual base VBase)

        +---

VDerived::$vbtable@:

 0      | 0

 1      | 8 (VDerivedd(VDerived+0)VBase)

vbi:       class  offset o.vbptr  o.vbte fVtorDisp

           VBase       8       0       4 0

Multiple inheritance: As can be expected, multiple inheritance can easily lead to cases where the this pointer passed to one member function is different than the this pointer passed to a different member function, even if both functions are called with the same instance. This can come up for member functions of any base class other than the first, similarly to when working with non-standard layout classes (where all base classes after the first start at a different address than the derived class itself)... but it can be especially surprising in the case of virtual functions, when multiple members supply virtual functions with the same signature.

struct Base1 {

    int i;



    virtual uintptr_t address() const { return addr_out(this); }

    uintptr_t raw_address() { return addr_out(this); }

};

struct Base2 {

    short s;



    virtual uintptr_t address() const { return addr_out(this); }

    uintptr_t raw_address() { return addr_out(this); }

};

struct Derived : Base1, Base2 {

    bool b;



    uintptr_t address() const override { return addr_out(this); }

    uintptr_t raw_address() { return addr_out(this); }

};



// ...



Derived d;



std::cout << "Derived class with multiple inheritance:"

          << "\n  (Calling address() through a static_cast reference, then the appropriate raw_address().)"

          << "\n* d address:\t\t\t\t\t"               << hex_out_s(addr_out(&d))

          << "\n* d this pointer:\t\t\t\t"            << hex_out_s(d.address())                          << " (" << hex_out_s(d.raw_address())          << ")"

          << "\n* d this pointer (as Base1):\t\t\t"   << hex_out_s(static_cast<Base1&>((d)).address())   << " (" << hex_out_s(d.Base1::raw_address())   << ")"

          << "\n* d this pointer (as Base2):\t\t\t"   << hex_out_s(static_cast<Base2&>((d)).address())   << " (" << hex_out_s(d.Base2::raw_address())   << ")"

          << "\n* d this pointer (as Derived):\t\t\t" << hex_out_s(static_cast<Derived&>((d)).address()) << " (" << hex_out_s(d.Derived::raw_address()) << ")"

          << "\n\n";

Sample output:

Derived class with multiple inheritance:

  (Calling address() through a static_cast reference, then the appropriate raw_address().)

* d address:                                    0x00000056911ef530

* d this pointer:                               0x00000056911ef530 (0x00000056911ef530)

* d this pointer (as Base1):                    0x00000056911ef530 (0x00000056911ef530)

* d this pointer (as Base2):                    0x00000056911ef530 (0x00000056911ef540)

* d this pointer (as Derived):                  0x00000056911ef530 (0x00000056911ef530)

We would expect each raw_address() to follow the same rules as before, since each is explicitly a separate function, and thus Base2::raw_address() to return a different value than Derived::raw_address(). But since we know a virtual call always dispatches to the most-derived override, how is address() correct when called through a reference to Base2? This is due to a little compiler trickery called an "adjustor thunk", which is a helper that takes a base class instance's this pointer and adjusts it to point to the most-derived class instead, when necessary.

class Derived   size(40):

        +---

        | +--- (base class Base1)

 0      | | {vfptr}

 8      | | i

        | | <alignment member> (size=4)

        | +---

        | +--- (base class Base2)

16      | | {vfptr}

24      | | s

        | | <alignment member> (size=6)

        | +---

32      | b

        | <alignment member> (size=7)

        +---

Derived::$vftable@Base1@:

        | &Derived_meta

        |  0

 0      | &Derived::address 

Derived::$vftable@Base2@:

        | -16

 0      | &thunk: this-=16; goto Derived::address 

Derived::address this adjustor: 0

If you're curious, feel free to tinker around with this little program, to take a look at how the addresses change if you run it multiple times, or at cases where it might have a different value than you may expect.
