Find duplicate lines from a CSV

published on March 21, 2016.

Finding duplicate lines from a CSV file is something I have to do from time to time, yet not on a regular enough basis to remember it all. Plus, I’m trying to blog more often.

cut -d, -f1 file.csv | tr -d '"' | sort | uniq -dc

cut splits each line at the commas and selects the first field, tr deletes any double quotes that enclose the field, sort puts identical values next to each other, and finally uniq -dc shows only the duplicated values and prefixes each one with its count of occurrences. Note that this finds duplicates in the first field, not fully identical lines.
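To see what the output looks like, here is a minimal sketch that runs the same pipeline against a small made-up file. The file name sample.csv and its contents are assumptions for illustration only, not data from a real project.

```shell
# Hypothetical sample data, created only for this demonstration.
cat > sample.csv <<'EOF'
"alice",1
"bob",2
"alice",3
"carol",4
"bob",5
"alice",6
EOF

# Same pipeline as above: take the first field, strip the quotes,
# sort so equal values are adjacent, then print duplicates with counts.
dupes=$(cut -d, -f1 sample.csv | tr -d '"' | sort | uniq -dc)
echo "$dupes"
```

This should report a count of 3 for alice and 2 for bob, while carol is left out because it appears only once. The amount of padding before each count depends on the uniq implementation. One caveat: cut splits on every comma, so this approach only holds up when the first field does not itself contain a comma inside the quotes.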

Tags: linux, shell.
Categories: Development.


Robert Basic
