How to get all distinct words within a set of lines?












I would like to extract a list of distinct words from a set of lines. Is there a way of doing this?

Say, for example, I have lines that look like this:



[
[(isPhysicallySettledFxFwd, NO,"Y"),(isPhysicallySettledFxFwd,isPhysicallySettledFxSwap,"N")],
[(isPhysicallySettledFxSwap,NO,"Y"),(isPhysicallySettledFxSwap, isPhysicallySettledCommodity,"Y")],
[(isPhysicallySettledCommodity,NO,"Y"),(isPhysicallySettledCommodity,YES,"Y")]
]


Then I would get a list of distinct words, looking like this:



isPhysicallySettledFxFwd
isPhysicallySettledFxSwap
isPhysicallySettledCommodity
NO
YES
Y
N
(
)
"
[
]
,


I am not sure how to even start, apart from copying the lines to Excel and doing lots of manipulations...









Tags: regular-expression, functions, vi-words, list





asked Apr 19 at 5:36 by user3203476

3 Answers
You can do something like this:

:let a=[]
:%s/\w\+/\=add(a, submatch(0))/gn
:new
:put =uniq(sort(a))

This first declares an empty list a to work with. Then we run a :%s command that matches every run of word characters (\w\+) and acts on all matches in each line (g flag of the :s command), but does not actually replace anything (n flag). We use a sub-replace-expression (\=) in the replacement part to store each captured match (submatch(0)) in list a.

Finally, we create a new window and put the sorted, de-duplicated content of list a into it. uniq() only removes adjacent duplicates, which is why the list is sorted first.

You can get a lot more sophisticated, like only capturing certain words or counting occurrences, but this shows how flexible the :s command is.
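The counting variant mentioned above could look roughly like this when done outside Vim. This is only a sketch under stated assumptions: the function name count_words and the sample line are made up for illustration, and it assumes a grep that supports -o (GNU and BSD grep do, although POSIX does not require it).

```shell
#!/bin/sh
# count_words is a hypothetical helper name, not something from the answer above.
# It reads text on stdin and prints "word count" pairs, one per line.
count_words() {
    # -o prints every match on its own line, -E enables extended regexes;
    # [[:alnum:]_]+ is the portable spelling of Vim's \w\+
    grep -oE '[[:alnum:]_]+' |
        LC_ALL=C sort |          # uniq -c needs duplicates to be adjacent
        uniq -c |                # prefix each distinct word with its count
        awk '{ print $2, $1 }'   # normalise the layout to "word count"
}

# Made-up sample in the shape of the question's data
printf '%s\n' '[(foo, NO,"Y"),(foo,bar,"N")]' | count_words
```

For the sample line this should print N 1, NO 1, Y 1, bar 1 and foo 2, one pair per line; uppercase words come first because LC_ALL=C sorts by byte value.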






answered Apr 19 at 6:07 by Christian Brabandt

• How wonderful! Thank you!! – user3203476, Apr 19 at 6:27



















Maybe this:

:%s/\W/\r&\r/g
:sort u
:g/^\s*$/d

The first command puts a line break before and after each non-word character.

The second command sorts the entire file with the "unique" option (u), so all duplicate lines are removed.

The third command deletes all lines that are empty or contain only whitespace.
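The same three steps can also be tried on a file from the shell. A minimal sketch, assuming standard sed, sort and grep: the function name split_tokens is hypothetical, and the bracket expression [^[:alnum:]_] stands in for Vim's \W.

```shell
#!/bin/sh
# split_tokens is a hypothetical helper name for this sketch.
split_tokens() {
    # Step 1: surround every non-word character with newlines.
    # A backslash followed by a real newline is the portable way to put a
    # line break into a sed replacement, so the next two lines must not be indented.
    sed 's/[^[:alnum:]_]/\
&\
/g' |
        LC_ALL=C sort -u |           # Step 2: sort and drop duplicate lines
        grep -v '^[[:space:]]*$'     # Step 3: drop empty or whitespace-only lines
}

# Made-up sample in the shape of the question's data
printf '%s\n' '(a, NO,"Y")' | split_tokens
```

Unlike the Ex commands, this leaves the original file untouched and writes the token list to standard output.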






answered Apr 19 at 6:28 by Ralf

You can use grep with the --only-matching/-o flag to accomplish this:

:%!grep -o '\w\+\|\W' | sort -u
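In basic regular expressions, \w, \+ and \| are GNU extensions, so the pattern above may not work with every grep. The following is a sketch of a more portable spelling using -E and POSIX bracket expressions; the function name distinct_tokens is hypothetical, and the extra grep -v stage is an assumption about the desired result, added because the list in the question contains no whitespace entry.

```shell
#!/bin/sh
# distinct_tokens is a hypothetical name used only for this sketch.
distinct_tokens() {
    # A token is either a run of word characters or one single non-word character.
    grep -oE '[[:alnum:]_]+|[^[:alnum:]_]' |
        grep -v '^[[:space:]]*$' |   # drop tokens that consist only of whitespace
        LC_ALL=C sort -u             # sort and keep one copy of each token
}

# Made-up sample in the shape of the question's data
printf '%s\n' '[(a, NO,"Y")]' | distinct_tokens
```

Inside Vim, a pipeline like this can be applied to the whole buffer with :%! in the same way as the command above.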





answered Apr 19 at 15:03 by Peter Rincker
