I'm trying to order a .csv file with just over 300 entries and output it all back out ordered by the numerical values in one specific column under a dialect.
Here's the code I've written so far but it just seems to output the data as it went in
import csv
import itertools
from itertools import groupby as gb
reader = csv.DictReader(open('Full_List.csv', 'r'))
groups = gb(reader, lambda d: d['red label'])
result = [max(g, key=lambda d: d['red label']) for k, g in groups]
writer = csv.DictWriter(open('output.csv', 'w'), reader.fieldnames)
writer.writeheader()
writer.writerows(result)
There's only 50 rows in the whole file that contain a value under the dialect "red label" and all the others are left blank.
It's in the Z column on the .csv(but not that last one) so I'd assume the index of the column is 25(0 being the first).
Any help would be greatly appreciated.
–
df = pd.read_csv('Full_List.csv')
df = df.sort('red label')
df.to_csv('Full_List_sorted.csv', index=False)
You may need to adjust the options to read_csv
and to_csv
to match the format of your CSV file.
https://stackoverflow.com/questions/15559812/sorting-by-specific-column-data-using-csv-in-python/15561404#15561404
share
improve this answer
–
–
–
groupby
isn't for sorting, it's for chunking an iterable. For sorting use sorted
.
import csv
reader = csv.DictReader(open('Full_List.csv', 'r'))
result = sorted(reader, key=lambda d: float(d['red label']))
writer = csv.DictWriter(open('output.csv', 'w'), reader.fieldnames)
writer.writeheader()
writer.writerows(result)
Note: I changed your lambda to cast your character data to float for correct numerical sorting.
https://stackoverflow.com/questions/15559812/sorting-by-specific-column-data-using-csv-in-python/15559985#15559985
share
improve this answer
–
–
–
–
I found with testing that the following works on csv files that I have. Note that all rows of the column have valid entries.
from optparse import OptionParser
# Create options.statistic using -s
# Open and set up input file
ifile = open(options.filein, 'rb')
reader = cvs.DictReader(ifile)
# Create the sorted list
print 'Try the float version'
sortedlist = sorted(reader, key = lambda d: float(d[options.statistic]), reverse=options.high)
except ValueError:
print 'Need to use the text version'
ifile.seek(0)
ifile.next()
sortedlist = sorted(reader, key=lambda d: d[options.statistic], reverse=options.high)
# Close the input file. This allows the input file to be the same as the output file
ifile.close()
# Open the output file
ofile = open(options.fileout, 'wb')
writer = csv.DictWriter(ofile, fieldnames=outfields, extrasactions='ignore', restval = '')
# Output the header
writer.writerow(dict((fn, fn) for fn in outfields))
# Output the sorted list
writer.writerows(sortedlist)
ofile.close()
https://stackoverflow.com/questions/15559812/sorting-by-specific-column-data-using-csv-in-python/21169395#21169395
share
improve this answer