Searching for duplicate names in a file

Agent007:
Thanks a million Void Main!! That really worked... You're right, I only wanted the IDs to be listed. Btw, how does it actually work? I mean, what's the -f1 -d'@' for? Also, why the need for pipes?

thanks & rgds,
007

flap:

quote:Originally posted by void main:
I would have to think a little more about doing it without changing the @ sign.
--- End quote ---


How about this?

tr '@' ' ' < test.txt | sort -k4,4 -u | gawk '{print $1 " " $2 " " $3 " " $4 "@" $5 " " $6 " " $7 " " $8 " " $9 " " $10}'

Or is there a way of making that gawk statement smaller?

voidmain:

quote:Originally posted by flap:


How about this?

tr '@' ' ' < test.txt | sort -k4,4 -u | gawk '{print $1 " " $2 " " $3 " " $4 "@" $5 " " $6 " " $7 " " $8 " " $9 " " $10}'

Or is there a way of making that gawk statement smaller?
--- End quote ---


I'm sure there is; gawk/awk is very powerful. Unfortunately my brain was already full 10 years ago before reaching the awk chapter. And I don't believe your command will actually prevent lines with duplicate ids (the part before the @).

[ January 20, 2003: Message edited by: void main ]
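
On the question of shortening the gawk step: one possible sketch swaps gawk for sed, which can replace only the Nth match on a line. This is not the approach from the thread, and it rests on assumptions: fields are separated by exactly one space, each line contains exactly one '@', and that '@' sits in the fourth field. The sample lines are made up for illustration, and the temporary file stands in for test.txt.

```shell
# Hypothetical sample data; the fourth space-separated field is userid@host.
sample=$(mktemp)
cat > "$sample" <<'EOF'
Jan 20 10:01 alice@hostA login ok
Jan 20 10:02 bob@hostB login ok
Jan 20 10:03 alice@hostC login ok
EOF

# tr turns '@' into a space, so the userid becomes field 4 on its own.
# sort -k4,4 -u keeps one line per distinct value of field 4.
# sed 's/ /@/4' turns the 4th space on each line back into '@'
# (assumes single-space separators and one '@' per input line).
tr '@' ' ' < "$sample" | sort -k4,4 -u | sed 's/ /@/4'
```

With the sample above this prints two lines, one for alice and one for bob, each with the '@' restored.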

flap:
Well, the duplicate IDs have already been removed by your command, the output of which is piped through gawk.

voidmain:

quote:Originally posted by Agent007:
Thanks a million Void Main!! That really worked... You're right, I only wanted the IDs to be listed. Btw, how does it actually work? I mean, what's the -f1 -d'@' for? Also, why the need for pipes?

thanks & rgds,
007
--- End quote ---


It's really simple once you play with some of the basic UNIX commands. The first part of the command, "cut -f4 -d' ' test.txt", says to break each line of the file into columns separated by spaces (the -d' ') and then output only the fourth column (the -f4). That output will be in the form "userid@host". So you pipe that output into "cut -f1 -d'@'", which splits the input into columns delimited by '@' characters. That results in two columns, and the "-f1" says to output only the first column, which is the "userid". Take that output and pipe it directly into the "sort -u" command, which sorts the input, removes duplicates, and then spits the result back at you. The pipes are what connect the pieces: each one feeds the output of the command on its left to the input of the command on its right. It would be wise to invest in a shell programming book. This will become second nature to you...
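
To see each stage on its own, here is a small sketch that runs the pipeline one step at a time. The sample lines are made up for illustration (the only assumption is that the fourth space-separated column is userid@host), and the temporary file stands in for test.txt.

```shell
# Hypothetical sample data; the fourth space-separated column is userid@host.
sample=$(mktemp)
cat > "$sample" <<'EOF'
Jan 20 10:01 alice@hostA login ok
Jan 20 10:02 bob@hostB login ok
Jan 20 10:03 alice@hostC login ok
EOF

# Stage 1: keep only the fourth space-delimited column (userid@host).
cut -f4 -d' ' "$sample"

# Stage 2: split that on '@' and keep the part before it (userid).
cut -f4 -d' ' "$sample" | cut -f1 -d'@'

# Stage 3: sort the ids and drop the repeats.
cut -f4 -d' ' "$sample" | cut -f1 -d'@' | sort -u
```

The last command prints alice and bob, one per line. Running the stages separately like this is a handy way to debug any pipeline: add one command at a time and look at what comes out.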
