rvp
 
rvp Go ahead. As far as I'm concerned, all the stuff I write for the board is public-domain.
Yay. A wiki page is coming soon. Meanwhile, I have created a Git repo for my script, with tests and CI. I have also fixed a bug and implemented two new options: units (e.g., B,KiB,MiB) and zero (to give size zero a special representation if you want to, for example, avoid "0.0 bytes").
rvp BTW, assigning back to a field like this: $f = sprintf(format, size, u[i]) causes field-splitting--which'll mess up filenames if they contain multiple adjacent blanks.
Thanks! I didn't consider this. I've added a comment warning the user the script can mess up whitespace.
I'd call it more "field-joining" than "field-splitting" 🙂, since assigning to a field causes $0 to be reconstructed from $1–$NF joined together with OFS.
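For anyone following along, here is a minimal demonstration of that rebuild. The function name collapse_demo and the sample line are made up for illustration and are not taken from my script; the only assumption is a POSIX awk with the default FS and OFS.

```sh
# Assigning to any field makes awk rebuild $0 from $1..$NF joined by OFS
# (a single space by default), so the tab and the run of three blanks in
# the input are both replaced by one space each.
collapse_demo() {
    printf '4096\tmy   file.txt\n' | awk '{ $1 = "4.0 KiB"; print }'
}

collapse_demo
# prints: 4.0 KiB my file.txt
```

The filename "my   file.txt" comes out as "my file.txt", which is exactly the whitespace damage being discussed.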
I could avoid this problem if I only humanized the first field, but I like the feature that lets you select which field to humanize. To preserve the whitespace with a user-selectable field, I'd have to do my own AWK-style parsing manually—in AWK, which would be a hassle. I could use split with seps if I were targeting Gawk, but I am not. I probably won't fix it. The user shouldn't try to parse the output of du anyway, since it doesn't handle newlines in filenames.
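To give an idea of what the manual parsing would involve, here is a rough sketch in portable AWK. It is not what my script does, and the name replace_field_demo and its two arguments (field number, replacement text) are hypothetical. It assumes default whitespace splitting on blanks and tabs only, and it swaps in a fixed string where a real version would compute the humanized size.

```sh
# Replace field N of each line without touching $0 through a field
# assignment, so the original separators are kept byte for byte.
replace_field_demo() {
    awk -v n="$1" -v repl="$2" '
    {
        rest = $0; pos = 1; start = 0; len = 0
        for (i = 1; i <= n; i++) {
            # Skip the separator run (if any) in front of field i.
            if (match(rest, /^[ \t]+/)) {
                pos += RLENGTH
                rest = substr(rest, RLENGTH + 1)
            }
            # Field i is the next run of non-blank characters.
            if (!match(rest, /^[^ \t]+/))
                break                      # the line has fewer than n fields
            if (i == n) { start = pos; len = RLENGTH }
            pos += RLENGTH
            rest = substr(rest, RLENGTH + 1)
        }
        if (start)
            print substr($0, 1, start - 1) repl substr($0, start + len)
        else
            print                          # field not found: pass through
    }'
}

printf '4096\tmy   file.txt\n' | replace_field_demo 1 '4.0 KiB'
```

With this approach the tab and the three blanks inside the filename survive, at the cost of reimplementing the splitting rules. Supporting a custom FS would take more work, which is the hassle I mean.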
(If someone wants one, I have implemented an AWK-style field parser that preserves the separators between the fields for future use, but in Tcl, where it is more convenient. It is part of a "what if awk(1) but SQL?" thing I wrote.)