Pandas dataframe: Remove secondary upcoming same value












9














I have a dataframe:



col1  col2
a     0
b     1
c     1
d     0
c     1
d     0


On 'col2' I want to keep only the first 1 from the top and replace every 1 below the first one with a 0, such that the output is:



col1  col2
a     0
b     1
c     0
d     0
c     0
d     0


Thank you very much.
























































































      python pandas dataframe














      edited Dec 6 at 15:46









      timgeb

      48.9k116390














      asked Dec 6 at 15:33









      s900n

      435616




























          8 Answers


















          9














          You can find the index of the first 1 and set others to 0:



          mask = df['col2'].eq(1)
          df.loc[mask & (df.index != mask.idxmax()), 'col2'] = 0


          For better performance, see Efficiently return the index of the first value satisfying condition in array.
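As a minimal end-to-end sketch of the mask approach above, using the question's data: the `mask.any()` guard is an addition that is not in the answer. It is there because `idxmax()` on an all-False mask returns the first label, which would otherwise be treated as "the first 1". The sketch also assumes the index labels are unique.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({"col1": list("abcdcd"), "col2": [0, 1, 1, 0, 1, 0]})

mask = df["col2"].eq(1)

# Guard (an assumption added here): only proceed when at least one 1 exists.
if mask.any():
    first_label = mask.idxmax()  # label of the first True in the mask
    # Zero every 1 except the one at first_label.
    df.loc[mask & (df.index != first_label), "col2"] = 0

print(df["col2"].tolist())  # [0, 1, 0, 0, 0, 0]
```

Values other than 1 are left untouched, because only rows where the mask is True are assigned.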



























          • Can you think of a good solution for the case when the index is arbitrary, like Index(['u', 'v', 'w', 'x', 'y', 'z']), and col2 could be something like [2, 0, 0, 1, 3, 1]?
            – timgeb
            Dec 6 at 16:18












          • @timgeb, To adapt this solution, I think you can use positional indexing (instead of index labels). Something like df.loc[mask & (np.arange(df.shape[0]) != np.where(mask)[0][0]), 'col2'] = 0. But I'm sure there are more Pythonic ways.
            – jpp
            Dec 6 at 16:28












          • Ah, I thought of using numpy, too. Just a bit differently. See my case 3. ;)
            – timgeb
            Dec 6 at 16:30
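The positional idea from the comments can be sketched as follows, using the arbitrary index and the values `[2, 0, 0, 1, 3, 1]` mentioned there. The `col1` contents and the emptiness check are made up for this illustration.

```python
import numpy as np
import pandas as pd

# Arbitrary (non-integer) index and mixed values, as described in the comment.
# The col1 letters are placeholders invented for this example.
df = pd.DataFrame(
    {"col1": list("abcdef"), "col2": [2, 0, 0, 1, 3, 1]},
    index=list("uvwxyz"),
)

mask = df["col2"].eq(1).to_numpy()   # plain boolean array, no labels involved
hits = np.where(mask)[0]             # positions where col2 == 1

if hits.size:                        # skip when there is no 1 at all
    # True for every 1 except the one at the first hit position.
    later = mask & (np.arange(len(df)) != hits[0])
    df.loc[later, "col2"] = 0

print(df["col2"].tolist())  # [2, 0, 0, 1, 3, 0]
```

Because the comparison is done on positions rather than labels, this does not depend on the index being integer-based.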





















          4














          np.flatnonzero



          Because I thought we needed more answers



          df.loc[df.index[np.flatnonzero(df.col2)[1:]], 'col2'] -= 1
          df

            col1  col2
          0    a     0
          1    b     1
          2    c     0
          3    d     0
          4    c     0
          5    d     0




          Same thing but a little more sneaky.



          df.col2.values[np.flatnonzero(df.col2.values)[1:]] -= 1
          df

            col1  col2
          0    a     0
          1    b     1
          2    c     0
          3    d     0
          4    c     0
          5    d     0
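A hedged variation on the `np.flatnonzero` idea: `flatnonzero(df.col2)` picks up every nonzero value, and `-= 1` only yields 0 when those values are exactly 1. The sketch below instead compares against 1 explicitly and assigns 0 by position. The sample values (including the 2) are chosen for illustration. Writing through `.values` in place, as in the second snippet, depends on pandas handing back a writable view, which may not hold in newer versions with Copy-on-Write enabled, so this sketch goes through `iloc`.

```python
import numpy as np
import pandas as pd

# Illustrative data: a 2 is included to show that non-1 values are preserved.
df = pd.DataFrame({"col1": list("abcdcd"), "col2": [0, 1, 2, 0, 1, 1]})

# Positions of every 1 in col2; slicing with [1:] skips the first hit.
one_positions = np.flatnonzero(df["col2"].to_numpy() == 1)

# Positional assignment avoids any dependence on index labels.
df.iloc[one_positions[1:], df.columns.get_loc("col2")] = 0

print(df["col2"].tolist())  # [0, 1, 2, 0, 0, 0]
```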




































            4














            Case 1: df has only ones and zeros in col2 and integer indexes.



            >>> df
              col1  col2
            0    a     0
            1    b     1
            2    c     1
            3    d     0
            4    c     1
            5    d     0


            You can use:



            >>> df.loc[df['col2'].idxmax() + 1:, 'col2'] = 0
            >>> df
              col1  col2
            0    a     0
            1    b     1
            2    c     0
            3    d     0
            4    c     0
            5    d     0




            Case 2: df can have all kinds of values in col2 and has integer indexes.



            >>> df # demo dataframe
              col1  col2
            0    a     0
            1    b     1
            2    c     2
            3    d     2
            4    c     3
            5    d     3


            You can use:



            >>> df.loc[(df['col2'] == 1).idxmax() + 1:, 'col2'] = 0
            >>> df
              col1  col2
            0    a     0
            1    b     1
            2    c     0
            3    d     0
            4    c     0
            5    d     0




            Case 3: df can have all kinds of values in col2 and has an arbitrary index.



            >>> df
              col1  col2
            u    a    -1
            v    b     1
            w    c     2
            x    d     2
            y    c     3
            z    d     3


            You can use:



            >>> df.iloc[(df['col2'].values == 1).argmax() + 1:, df.columns.get_loc('col2')] = 0
            >>> df
              col1  col2
            u    a    -1
            v    b     1
            w    c     0
            x    d     0
            y    c     0
            z    d     0
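A self-contained sketch of Case 3, written with a single positional assignment: the `hits.any()` guard is an addition made here, since `argmax()` on an all-False array returns 0 and would otherwise zero everything after the first row. Like the three cases above, it sets every value after the first 1 to 0, not only the later 1s.

```python
import pandas as pd

# Data from Case 3: arbitrary index, arbitrary values.
df = pd.DataFrame(
    {"col1": list("abcdcd"), "col2": [-1, 1, 2, 2, 3, 3]},
    index=list("uvwxyz"),
)

hits = df["col2"].to_numpy() == 1

if hits.any():                          # guard added for this sketch
    first_pos = int(hits.argmax())      # position of the first 1
    col_pos = df.columns.get_loc("col2")
    # One positional assignment on the frame itself, no chained indexing.
    df.iloc[first_pos + 1:, col_pos] = 0

print(df["col2"].tolist())  # [-1, 1, 0, 0, 0, 0]
```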




































              3














              Using drop_duplicates with reindex



              df.col2=df.col2.drop_duplicates().reindex(df.index,fill_value=0)
              df
              Out[1078]:
                col1  col2
              0    a     0
              1    b     1
              2    c     0
              3    d     0
              4    c     0
              5    d     0
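To make the `drop_duplicates` plus `reindex` trick easier to follow, here is a step-by-step sketch, assuming unique index labels. The second series is an invented example showing a side effect: the first occurrence of every distinct value is kept, not only the first 1.

```python
import pandas as pd

df = pd.DataFrame({"col1": list("abcdcd"), "col2": [0, 1, 1, 0, 1, 0]})

# Keeps only the first occurrence of each distinct value (labels 0 and 1 here).
firsts = df["col2"].drop_duplicates()

# Rows that were dropped come back via reindex and are filled with 0.
df["col2"] = firsts.reindex(df.index, fill_value=0)
print(df["col2"].tolist())  # [0, 1, 0, 0, 0, 0]

# Invented example: with other values present, later repeats of 2 are zeroed too.
other = pd.Series([0, 1, 2, 2, 1])
other_result = other.drop_duplicates().reindex(other.index, fill_value=0)
print(other_result.tolist())  # [0, 1, 2, 0, 0]
```

Whether that behaviour for values other than 1 is desirable depends on the data; for a strict 0/1 column it matches the requested output.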


































                3














                You can use numpy for an efficient solution:



                a = df.col2.values
                b = np.zeros_like(a)
                b[a.argmax()] = 1
                df.assign(col2=b)




                  col1  col2
                0    a     0
                1    b     1
                2    c     0
                3    d     0
                4    c     0
                5    d     0
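A runnable sketch of the NumPy approach with its import included. Two details differ from the snippet above and are assumptions of this sketch: it looks for the first value equal to 1 rather than the maximum (the two coincide only for a 0/1 column), and it leaves the result all zeros when no 1 is present. Note that `assign` returns a new frame, so the result has to be bound to a name.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"col1": list("abcdcd"), "col2": [0, 1, 1, 0, 1, 0]})

a = df["col2"].to_numpy()
b = np.zeros_like(a)          # new all-zero array with the same shape and dtype

is_one = a == 1
if is_one.any():              # guard added for this sketch
    b[is_one.argmax()] = 1    # mark only the first 1

# assign() does not modify df; it returns a copy with col2 replaced.
result = df.assign(col2=b)

print(result["col2"].tolist())  # [0, 1, 0, 0, 0, 0]
print(df["col2"].tolist())      # [0, 1, 1, 0, 1, 0]
```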




































                  1














                  I like this too:



                  data['col2'][np.where(data['col2'] == 1)[0][0]+1:] = 0






















                  • Chained indexing is not recommended.
                    – jpp
                    Dec 6 at 16:41












                  • Thanks for the update..
                    – iamklaus
                    Dec 7 at 8:43



















                  1














                  So many options, here's mine... almost the same as timgeb's answer (found independently), but still different ;)



                  Find the index of col2 that has the first occurrence of a 1, and change all row values after that index to 0:



                  df['col2'].iloc[df.col2.idxmax()+1:] = 0


























                  • Be careful, this sets all values to 0 after the specified index, not just the ones equal to 1. Though that's the same with some other answers too.
                    – jpp
                    Dec 6 at 16:42












                  • Totally agree. Your solution is more general.
                    – Sander van den Oord
                    Dec 6 at 17:42
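The caveat raised in the comments can be addressed along these lines. This is only a sketch under stated assumptions: the index labels are unique, and the non-default integer index and the value 2 are invented to show the difference between labels and positions. `idxmax()` returns a label, so it is converted to a position with `Index.get_loc` before slicing, and only later values equal to 1 are reset.

```python
import pandas as pd

# Invented data: non-default integer labels and a non-1 value after the first 1.
df = pd.DataFrame(
    {"col1": list("abcdcd"), "col2": [0, 1, 2, 0, 1, 0]},
    index=[10, 20, 30, 40, 50, 60],
)

is_one = df["col2"].eq(1)

if is_one.any():
    first_label = is_one.idxmax()              # a label (20), not a position
    first_pos = df.index.get_loc(first_label)  # the matching position (1)

    # Boolean array of the 1s, with everything up to the first 1 switched off.
    later_ones = is_one.to_numpy().copy()
    later_ones[: first_pos + 1] = False

    df.loc[later_ones, "col2"] = 0

print(df["col2"].tolist())  # [0, 1, 2, 0, 0, 0]
```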



















                  0














                  id = list(df["col2"]).index(1)
                  df.iloc[id+1:]["col2"].replace(1,0,inplace=True)






















                  • While this code may answer the question, providing additional context regarding how and/or why it solves the problem would improve the answer's long-term value.
                    – Nic3500
                    Dec 6 at 16:00










                  • Chained indexing is not recommended.
                    – jpp
                    Dec 6 at 16:41
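For context on the comments above: the answer's second line calls `replace(..., inplace=True)` on the result of `df.iloc[...]["col2"]`, which may be a temporary copy, so the change is not guaranteed to reach `df`. A minimal sketch of the same idea without chained indexing follows. The variable names are illustrative, and `first_pos` is used instead of `id` to avoid shadowing the Python builtin.

```python
import pandas as pd

df = pd.DataFrame({"col1": list("abcdcd"), "col2": [0, 1, 1, 0, 1, 0]})

values = df["col2"].tolist()

if 1 in values:                       # list.index raises ValueError otherwise
    first_pos = values.index(1)       # position of the first 1
    col_pos = df.columns.get_loc("col2")

    # Compute the replaced tail, then write it back with one iloc assignment.
    tail = df.iloc[first_pos + 1:, col_pos].replace(1, 0)
    df.iloc[first_pos + 1:, col_pos] = tail

print(df["col2"].tolist())  # [0, 1, 0, 0, 0, 0]
```

Only the 1s after the first one are replaced, so other values in the tail would be preserved.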














































                  answered Dec 6 at 15:37









                  jpp

                  90.8k2052101



























                      edited Dec 6 at 15:56

























                      answered Dec 6 at 15:51









                      piRSquared

                      152k22144285












                              edited Dec 6 at 16:29

























                              answered Dec 6 at 15:38









                              timgeb















                                  Using drop_duplicates with reindex



                                  df.col2 = df.col2.drop_duplicates().reindex(df.index, fill_value=0)
                                  df
                                  Out[1078]:
                                  col1 col2
                                  0 a 0
                                  1 b 1
                                  2 c 0
                                  3 d 0
                                  4 c 0
                                  5 d 0
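To see why this works, it may help to look at the intermediate result: `drop_duplicates` keeps the first row of each distinct value, and `reindex` brings the dropped rows back, filled with `fill_value`. A small sketch on the question's data; the second series (`other`) is an assumed example showing that the trick relies on col2 holding only 0 and 1.

```python
import pandas as pd

df = pd.DataFrame({"col1": list("abcdcd"), "col2": [0, 1, 1, 0, 1, 0]})

# Only the first 0 (row 0) and the first 1 (row 1) survive.
firsts = df["col2"].drop_duplicates()

# Rows 2..5 were dropped; reindex brings them back filled with 0.
restored = firsts.reindex(df.index, fill_value=0)

# With other values present, their first occurrences survive too,
# but later duplicates such as the second 2 are turned into 0.
other = pd.Series([0, 1, 2, 2, 1])
other_restored = other.drop_duplicates().reindex(other.index, fill_value=0)
```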















                                      answered Dec 6 at 15:41









                                      W-B















                                          You can use numpy for an efficient solution:



                                          import numpy as np

                                          a = df.col2.values
                                          b = np.zeros_like(a)
                                          b[a.argmax()] = 1  # argmax gives the position of the first maximum, i.e. the first 1
                                          df.assign(col2=b)




                                            col1  col2
                                          0 a 0
                                          1 b 1
                                          2 c 0
                                          3 d 0
                                          4 c 0
                                          5 d 0
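Two details of this approach are worth spelling out. `argmax` returns the position of the first maximum, so it finds the first 1 only when 1 is the largest value in col2 (and if col2 is all zeros, it returns 0, so row 0 would receive a 1). Also, `df.assign` returns a new frame rather than modifying `df`. A small self-contained sketch on the question's data:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"col1": list("abcdcd"), "col2": [0, 1, 1, 0, 1, 0]})

a = df["col2"].to_numpy()
first_one = a.argmax()      # ties are resolved in favour of the first maximum

b = np.zeros_like(a)        # same shape and dtype as a, all zeros
b[first_one] = 1

result = df.assign(col2=b)  # new DataFrame; df itself is left untouched
```

To keep the change, assign the result back, for example `df = df.assign(col2=b)`.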













                                              edited Dec 6 at 15:53

























                                              answered Dec 6 at 15:39









                                              user3483203















                                                  I like this too:



                                                  data['col2'][np.where(data['col2'] == 1)[0][0]+1:] = 0






















                                                  • Chained indexing is not recommended.
                                                    – jpp
                                                    Dec 6 at 16:41












                                                  • Thanks for the update..
                                                    – iamklaus
                                                    Dec 7 at 8:43
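A sketch of how the same idea could be written without chained indexing, as the comment suggests: compute the position of the first 1, then assign through a single `.iloc` call on the frame. The frame is named `data` to match the snippet above; using `np.flatnonzero` in place of `np.where(...)[0]` is a substitution made here, not part of the original answer.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame({"col1": list("abcdcd"), "col2": [0, 1, 1, 0, 1, 0]})

# Positions (not labels) of every 1 in col2.
ones = np.flatnonzero(data["col2"].to_numpy() == 1)
start = ones[0] + 1  # raises IndexError if col2 contains no 1

# One indexing operation on the frame itself, so the write goes to `data` directly.
data.iloc[start:, data.columns.get_loc("col2")] = 0
```

With chained indexing (`data['col2'][...] = 0`) the assignment may land on a temporary object, which is why a single `.loc` or `.iloc` call is generally preferred.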


























                                                  answered Dec 6 at 15:42









                                                  iamklaus















                                                  So many options; here's mine. It is almost the same as timgeb's answer (found independently), but still different ;)



                                                  Find the index in col2 of the first occurrence of a 1, and change all row values after that index to 0 (this assumes the default integer index, since idxmax returns a label):



                                                  df.iloc[df['col2'].idxmax() + 1:, df.columns.get_loc('col2')] = 0


























                                                  • Be careful, this sets all values to 0 after the specified index, not just the ones equal to 1. Though that's the same with some other answers too.
                                                    – jpp
                                                    Dec 6 at 16:42












                                                  • Totally agree. Your solution is more general.
                                                    – Sander van den Oord
                                                    Dec 6 at 17:42
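For the concern raised in the comment, here is a minimal sketch of one way to reset only the repeated 1s and leave every other value alone. It is not taken from any of the answers; the sample frame with values 2 and 3 is an assumption chosen to make the difference visible.

```python
import pandas as pd

df = pd.DataFrame({"col1": list("abcde"), "col2": [0, 1, 2, 1, 3]})

is_one = df["col2"].eq(1)
# cumsum() counts the 1s seen so far; a count above 1 marks every row from
# the second 1 onward, and `& is_one` keeps only the rows that are 1 themselves.
repeat = is_one & is_one.cumsum().gt(1)
df.loc[repeat, "col2"] = 0
```

Since it uses a boolean mask with `.loc`, this does not depend on the kind of index the frame has.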


























                                                  answered Dec 6 at 15:55









                                                  Sander van den Oord















                                                  id = list(df["col2"]).index(1)
                                                  df.iloc[id+1:]["col2"].replace(1,0,inplace=True)






















                                                  • While this code may answer the question, providing additional context regarding how and/or why it solves the problem would improve the answer's long-term value.
                                                    – Nic3500
                                                    Dec 6 at 16:00










                                                  • Chained indexing is not recommended.
                                                    – jpp
                                                    Dec 6 at 16:41
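For context, the idea in this answer is to find the position of the first 1 with `list.index`, then replace 1 with 0 in the rows below it. A sketch of that idea using a single assignment instead of chained indexing with `inplace=True`; the variable is named `first` here because `id` shadows a Python builtin, and the sample frame is the one from the question.

```python
import pandas as pd

df = pd.DataFrame({"col1": list("abcdcd"), "col2": [0, 1, 1, 0, 1, 0]})

# Position of the first 1; list.index raises ValueError if col2 has no 1.
first = df["col2"].tolist().index(1)
col = df.columns.get_loc("col2")

# Replace 1 with 0 in the rows below, then write the result back in one step.
tail = df["col2"].iloc[first + 1:].replace(1, 0)
df.iloc[first + 1:, col] = tail.to_numpy()
```

Unlike the slicing approaches, `replace(1, 0)` only touches the 1s, so other values below the first 1 would be preserved.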


























                                                  answered Dec 6 at 15:43









                                                  shyamrag cp
