How to convert a fixed-width file into CSV
Learn how to convert a fixed-width file to CSV with the standard GNU Unix tool gawk.
(The more valuable and massive a data set is, the less likely it's in a format you can just parse. Has anybody else noticed that?)
Here's how to convert a fixed-width file to CSV with the standard GNU Unix tool gawk.
Theoretical (see "Real life" below)
Thanks to Stack Overflow (reproduced verbatim):
gawk '$1=$1' OFS=, FIELDWIDTHS='4 2 5 1 1' infile > outfile.csv
Where FIELDWIDTHS is a space-separated list of field widths (in characters) and OFS is the output field separator.
Real life
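In real life, fixed-width files contain commas and double quotes, so each field has to be wrapped in double quotes and any embedded double quote has to be doubled. Trailing padding gets stripped along the way.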
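# put this in a file called fixed2csv.awk
{
    for (i = 1; i <= NF; i++) {
        sub(/\s+$/, "", $i)      # strip trailing padding from the field
        gsub(/"/, "\"\"", $i)    # double every embedded double quote, not just the first
        # wrap the field in quotes; OFS between fields, ORS after the last one
        printf "\"%s\"%s", $i, (i < NF ? OFS : ORS)
    }
}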
Then run it on your data:
gawk -f fixed2csv.awk OFS=, FIELDWIDTHS='4 2 5 1 1' infile > outfile.csv
Thanks to Ed Morton on Stack Overflow for the inspiration!