File Size Statistics Script (Lua)

I used ChatGPT to write a script that generates file size statistics for everything in the directory it is placed in. It uses LuaFileSystem, and once it has processed all the files it prints output like the following:

2359    files found.
Average (mean) file size:       44842.524374735 bytes
Standard deviation:     320478.50592438
Multiple modes:
Mode 1: 126     bytes
Mode 2: 204     bytes
Frequency:      7
[####################] 0.00 - 199271.16: 2245 files
[##########          ] 199271.16 - 398542.33: 59 files
[#######             ] 398542.33 - 597813.49: 16 files
[#######             ] 597813.49 - 797084.65: 14 files
[#####               ] 797084.65 - 996355.82: 6 files
[#####               ] 996355.82 - 1195626.98: 8 files
[##                  ] 1195626.98 - 1394898.14: 2 files
[#                   ] 1394898.14 - 1594169.31: 1 files
[#                   ] 1594169.31 - 1793440.47: 1 files
[                    ] 1793440.47 - 1992711.63: 0 files
[                    ] 1992711.63 - 2191982.80: 0 files
[#                   ] 2191982.80 - 2391253.96: 1 files
[                    ] 2391253.96 - 2590525.12: 0 files
[                    ] 2590525.12 - 2789796.29: 0 files
[                    ] 2789796.29 - 2989067.45: 0 files
[##                  ] 2989067.45 - 3188338.61: 2 files
[                    ] 3188338.61 - 3387609.78: 0 files
[                    ] 3387609.78 - 3586880.94: 0 files
[                    ] 3586880.94 - 3786152.10: 0 files
[                    ] 3786152.10 - 3985423.27: 0 files
[                    ] 3985423.27 - 4184694.43: 0 files
[#                   ] 4184694.43 - 4383965.59: 1 files
[                    ] 4383965.59 - 4583236.76: 0 files
[                    ] 4583236.76 - 4782507.92: 0 files
[                    ] 4782507.92 - 4981779.08: 0 files
[                    ] 4981779.08 - 5181050.24: 0 files
[#                   ] 5181050.24 - 5380321.41: 1 files
[                    ] 5380321.41 - 5579592.57: 0 files
[                    ] 5579592.57 - 5778863.73: 0 files
[                    ] 5778863.73 - 5978134.90: 0 files
[                    ] 5978134.90 - 6177406.06: 0 files
[                    ] 6177406.06 - 6376677.22: 0 files
[#                   ] 6376677.22 - 6575948.39: 1 files
[                    ] 6575948.39 - 6775219.55: 0 files
[                    ] 6775219.55 - 6974490.71: 0 files
[                    ] 6974490.71 - 7173761.88: 0 files
[                    ] 7173761.88 - 7373033.04: 0 files
[                    ] 7373033.04 - 7572304.20: 0 files
[                    ] 7572304.20 - 7771575.37: 0 files
[                    ] 7771575.37 - 7970846.53: 0 files
[                    ] 7970846.53 - 8170117.69: 0 files
[                    ] 8170117.69 - 8369388.86: 0 files
[                    ] 8369388.86 - 8568660.02: 0 files
[                    ] 8568660.02 - 8767931.18: 0 files
[                    ] 8767931.18 - 8967202.35: 0 files
[                    ] 8967202.35 - 9166473.51: 0 files
[                    ] 9166473.51 - 9365744.67: 0 files
[                    ] 9365744.67 - 9565015.84: 0 files
[#                   ] 9565015.84 - 9764287.00: 1 files
0th percentile: 0       bytes
10th percentile:        167     bytes
20th percentile:        317     bytes
30th percentile:        476     bytes
40th percentile:        692     bytes
50th percentile (median):       986     bytes
60th percentile:        1428    bytes
70th percentile:        2101    bytes
80th percentile:        3650    bytes
90th percentile:        38917   bytes
100th percentile:       9764287 bytes
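
For readers curious how numbers like these can be produced, here is a minimal sketch in Lua. It is not the script from this post: the function names (`collect_sizes`, `mean`, `stddev`, `modes`, `percentile`) are made up for illustration, recursing into subdirectories is an assumption, the standard deviation is assumed to be the population form, and the percentile uses a simple rank rule that may differ from the original. Only `lfs.dir` and `lfs.attributes` come from LuaFileSystem.

```lua
-- Illustrative sketch; names and formulas are assumptions, not the original script.

-- Recursively collect file sizes (in bytes) under `dir` with LuaFileSystem.
-- lfs is required inside the function so the statistics below load without it.
local function collect_sizes(dir, sizes)
  local lfs = require("lfs")
  sizes = sizes or {}
  for name in lfs.dir(dir) do
    if name ~= "." and name ~= ".." then
      local path = dir .. "/" .. name
      local mode = lfs.attributes(path, "mode")
      if mode == "file" then
        sizes[#sizes + 1] = lfs.attributes(path, "size")
      elseif mode == "directory" then
        collect_sizes(path, sizes)
      end
    end
  end
  return sizes
end

-- Arithmetic mean of a non-empty list of sizes.
local function mean(sizes)
  local sum = 0
  for _, s in ipairs(sizes) do sum = sum + s end
  return sum / #sizes
end

-- Population standard deviation (assumption: divides by n, not n - 1).
local function stddev(sizes)
  local m, acc = mean(sizes), 0
  for _, s in ipairs(sizes) do acc = acc + (s - m) ^ 2 end
  return math.sqrt(acc / #sizes)
end

-- Returns a sorted list of the most frequent sizes plus their shared frequency.
local function modes(sizes)
  local counts, best = {}, 0
  for _, s in ipairs(sizes) do
    counts[s] = (counts[s] or 0) + 1
    if counts[s] > best then best = counts[s] end
  end
  local result = {}
  for value, count in pairs(counts) do
    if count == best then result[#result + 1] = value end
  end
  table.sort(result)
  return result, best
end

-- p-th percentile by rank on a sorted copy. Lua tables are 1-based,
-- so the rank is clamped to at least 1 (p = 0 yields the smallest size).
local function percentile(sizes, p)
  local sorted = {}
  for i, s in ipairs(sizes) do sorted[i] = s end
  table.sort(sorted)
  local index = math.max(1, math.ceil(p * #sorted / 100))
  return sorted[index]
end
```

With LuaFileSystem installed, `collect_sizes(".")` would return the list that the other functions take as input, for example `percentile(sizes, 50)` for the median. All of them expect a non-empty list.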

With minimal effort, you could change it quite a bit, because it’s written as pure functions. I wouldn’t have achieved this myself, nor produced it so quickly, if I hadn’t had ChatGPT do the easy stuff for me. I found the experience quite helpful. While ChatGPT did once forget that Lua indexes tables starting at 1, made a few weird decisions, and wrote downright inefficient code in some places, it allowed me to focus on making the script work exactly how I wanted, instead of settling for mostly correct or “good enough for now”.
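
To illustrate why pure functions make this kind of script easy to adjust, here is a hedged sketch of a histogram helper. The name `histogram` and its `opts.bins` / `opts.width` parameters are hypothetical. The defaults (a bin count of ceil(sqrt(n)) and a log-scaled bar) are assumptions chosen because they are consistent with the sample output above (49 bins for 2359 files, 20-character bars); the original script may do it differently. The bin index also shows the 1-based indexing detail that is easy to get wrong in Lua.

```lua
-- Hypothetical helper: buckets sizes into equal-width bins and renders text bars.
-- opts.bins defaults to ceil(sqrt(n)); opts.width is the bar length in characters.
local function histogram(sizes, opts)
  opts = opts or {}
  local n = #sizes
  if n == 0 then return {}, {} end
  local bins = opts.bins or math.ceil(math.sqrt(n))
  local width = opts.width or 20

  local max_size = 0
  for _, s in ipairs(sizes) do max_size = math.max(max_size, s) end
  local bin_width = max_size / bins

  local counts = {}
  for i = 1, bins do counts[i] = 0 end
  for _, s in ipairs(sizes) do
    -- floor(s / bin_width) is 0-based, so add 1 for Lua's 1-based tables;
    -- clamp so the largest file lands in the last bin instead of past it.
    local i = 1
    if bin_width > 0 then
      i = math.min(bins, math.floor(s / bin_width) + 1)
    end
    counts[i] = counts[i] + 1
  end

  local peak = 0
  for i = 1, bins do peak = math.max(peak, counts[i]) end

  local lines = {}
  for i = 1, bins do
    -- Log scaling (assumption) keeps sparsely filled bins visible next to a huge first bin.
    local filled = math.floor(math.log(counts[i] + 1) / math.log(peak + 1) * width)
    lines[i] = string.format("[%s%s] %.2f - %.2f: %d files",
      string.rep("#", filled), string.rep(" ", width - filled),
      (i - 1) * bin_width, i * bin_width, counts[i])
  end
  return lines, counts
end
```

Because the function only takes a list and returns strings, changing the look of the chart is a matter of passing different options, for example `histogram(sizes, { bins = 10, width = 40 })`, and printing each returned line.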

(Btw, the example output above is from my Obsidian vault. You can read a bit more about how I use Obsidian to organize my notes here.)
