LuaJIT Strange & Unpredictable Behavior in table.sort

A quick story about discovering where Lua’s table abstraction fails to work.

In Lua, arrays should always be initialized sequentially, and should never have a nil inserted.

(The code behind this adventure starts on line 128 in one commit, where a crazy work-around was used to get what I wanted. This was eventually fixed in commit 8b8f36b (though accidentally sorting the wrong direction at that point). I provide these links to remind myself what happened, and so that if someone more knowledgeable than me wants to figure this out, hopefully they can. I’ve also made sure those links have been archived by the Internet Archive so they remain available even if the project is gone.)

While working on a prototype, I needed to sort a list of integers so that I could count the sum of the top thousand items in an array of ~40 thousand items. This list had empty indexes (or holes) within it, so I wrote a for loop to fill them with zeros and then called table.sort. It errored from trying to compare with a nil. This was confusing because I had a well-defined length that was used to fill holes, and the way Lua works, any hole in the list should be the end of what Lua recognizes as a list. As far as I knew, standard functions don’t allow holes in lists by ending at the first hole encountered.

for i = 0, list_length do
  if not list[i] then
    list[i] = 0

I was further confused when I added debug code to check the list for nil values by iterating over it and printing every value in it: No nil values existed. The strangest part was when I passed my own sorting function to table.sort specifically to handle nil values and it created the same error. I worked around this by brute-forcing it with an incredibly inefficient thousand traversals of the entire list, removing items as I found the maximum each time. It’s a prototype, so whatever.

It didn’t sit right with me, especially because I couldn’t figure out how this was occurring. It’s incredibly rare to find an actual bug in highly used production software like programming languages, so I doubted I’d found an actual problem. Later, I realized I had added a zeroth element to this list, and Lua starts counting at 1, unlike a lot of programming languages. Removing this changed nothing, so I double-checked my assumptions and found that now nil values were showing up – and this is after no code changes except adding a for loop to print values.

I wrote a new sorting function that checks more thoroughly for nil values (I think the previous attempts at that failed because I’d made some simple error in how I was thinking about it). It worked. But then I found the loop whose code I’ve shared above, where holes are filled with zeroes. It had been a while, so I was very surprised to find what should have fixed things right there doing nothing.. somehow. I mentioned this in a group chat, and someone suggested replacing list_length with #list (the # operator returns length). The problem is that Lua stops counting at the first nil, so that should do nothing.

There’s also the possibility of an off-by-one error, those are very common, but I had assumed this would happen at some point and made sure all of my functions and loops checked one element earlier than necessary and one element later, so that if I checked too much – just extra zeroes, but I couldn’t check too little.

I wasn’t satisfied by all the nil values being there after a loop that replaces them with zero, so I tried making a loop at the top of the code that prefills a much longer list than I actually needed with zeroes. Suddenly, I didn’t need a custom sorting function. So why does filling the list with zeroes work in one place, but not another?

How Objects Work in Lua

In Lua all objects are called tables, and can contain an array-like section and arbitrary key-value pairs in a hashmap. It doesn’t really distinguish between these except as an optimization, under the hood it tries to guess how you intend to use the object and assigns values internally differently depending on what index you’re using. I thought that in practice, this means integer indexes are handled like arrays, and non-integer indexes are in the hashmap. I thought that maybe because my code was selecting integer indexes in a completely random order, it treated these as a hashmap instead, somehow leading to ignoring attempting to fill the holes – because there weren’t any holes, or the holes persisted because only part of the indexes were treated as an array.

Lua’s tables are an abstraction meant to make writing code easier, and for my years of working with them, they do that quite well, but maybe this is where the abstraction leaks? (All abstractions leak: They have some condition(s) where you end up needing to know how they work underneath, even though the point of abstraction is to not need to know how it works underneath.) Perhaps arrays are only recognized when they’re initialized sequentially, whereas my code initialized them in a random order? Either way, the error I started with manifests as “An array of integers can’t be sorted because part of it is undefined – even though every element is defined when iterated over.” which is not supposed to be possible.

There’s also the detail that in Lua there is not supposed to be difference between something having a nil value and it not existing at all. There is a difference, but it should be impossible to tell the difference or for the difference to matter in your code. It is possible that this alone is why I had this problem. Changing a value to nil inside an array table can cause all values within it to be converted to a hashmap, or some of them, or none of them. Again, this shouldn’t matter (other than a difference in speed which is usually unnoticeable), but it might matter.

I wrote a minimum test case to see if my hunch was correct, but it works when I expect it not to. I’m lost. I have an explanation that informs how I should code so that it works – and it does work, but because that explanation fails when tested, it seems incorrect. I want to know what’s really going on, but it requires a depth of understanding that I lack, and effort that I do not have the time and energy to go into.

I also learned that Lua defines this kind of behavior as “strange and unpredictable behavior” instead of undefined behavior, because undefined behavior means that an entire program can fail to function correctly do to one element with undefined behavior (even when that element is not accessed), whereas in Lua, a table exhibiting strange and unpredictable behavior doesn’t affect the rest of the program. So, if you can work around it, it’s okay.

At this point, I’m happy enough telling myself that arrays should always be initialized sequentially, and shouldn’t have nil elements ever. If I follow that, I probably won’t run into this. But.. if you’re reading this and understand what’s really going on here, please tell me.

Dead URLs doesn’t exist anymore.

I have a very rare piece of music. It doesn’t exist in any form you can access online. It’s a mixtape of Hip Hop beats spanning from 1975 to 2011. I found it shortly after it was published on SoundCloud. It’s always been special to me, but I never thought about the possibility of it disappearing. I only downloaded the audio because I enjoyed it and wanted to have it in my music collection.

In a way, I’m happy because I still have this relic. In another way, I’m really sad. There was a long and detailed description of what each piece of music in this track is from, and what it represents about the history of Hip Hop. It’s lost forever. Prozak Morris still has a SoundCloud, a Bandcamp, a YouTube.. or at least there are still publicly facing pages there.. but this one track and its associated detail is gone.

Normally, when I run into something like this, I am able to quickly find what I was missing on the Internet Archive. I did find when it was originally published, including access to comments people made on the track shortly after it was released.. but the page where it was posted (and that lengthy description was written) was never archived.

I was.. a bit desperate to find if it still exists, so I looked for other archives. To be fair, I didn’t check under every stone, but I really don’t think it does exist anymore. And I stumbled across another level of pain in the search: Google killed its archives. Used to be, you could browse archived versions or cached versions of websites that Google had indexed, and at one point in time, this page was definitely there.. but it has either long since been deleted.. or was deleted NINE DAYS AGO.

It’s possible this wouldn’t have been lost forever if I searched for it NINE DAYS AGO. Because I absolutely save everything I care about now. And when I remember something I knew about, I go looking for it.

Anything not saved will be lost.

Nintendo Wii Remote Settings “Quit Game” Message

I was going to stop typing there, with a reference to Nintendo which is always more appropriate than one might expect.. but I remembered the phrase wrong, as “Everything that is not saved will be lost.” Apparently, the entire internet remembers the phrase wrong too, as it is quoted everywhere as “Everything not saved will be lost.”

It is also referenced as an in-game quit message when it was part of the Wii Remote Settings. Additionally, a band released an album with a similar name, so now search results for the phrase only refer to that band and album. (Fortunately search suggestions still reference that it has something to do with Nintendo.. or I’d still be a bit lost on its origin.)

Kind of ironic that the origin of such a well-known phrase is almost lost itself.

And.. I only found a single image of the original message. Everything else is incorrect references.

I’m experimenting with dolphin-mixtral-8x7b

Tl;dr: Minor differences in wording can have a huge impact in results and oh my god I have really slow hardware and no money help me aaaa.

First, thank goodness for Ollama, and thanks to Fireship for introducing me to it. I have limited hardware, and every tool I’ve tried to run local models has refused to deal with this and crashed itself or whole systems when running anything with decent capability. I’ve no money, so I can’t upgrade (and things are getting desperate, but that’s a different story).

Why dolphin-mixtral? Aside from technical issues, I’ve been using ChatGPT-3.5 to experiment. The problem is that ChatGPT is incredibly cursed by censorship and bias due to OpenAI’s heavy hand in its construction. (Why and how this is a problem can be its own post, and Eric Hartford has a good overview.) (To be clear, my problem with its bias is specifically that it enforces status quo, and the status quo is harmful.) Dolphin-mixtral is built by taking a surprisingly fast model equivalent or better than GPT-3.5 and removing some of the pre-trained censorship by re-training it to be more compliant with requests.

Dolphin-mixtral doesn’t just solve this problem though. There’s still the idea of censorship in it, and sometimes your prompt must be adjusted to remind it that it is in a place to provide what you request regardless of its concept of ethics. (Of course, there is also value in an automated tool reminding you that what you request may be unethical.. but the concept of automated ethics is morally bankrupt.) I’d like to highlight that positive reinforcement works far better than negative reinforcement. A lot of people stoop to threatening a model to get it to comply, but this is never needed, and leads to worse results.

My problem is a little more simple. I haven’t gotten to experiment with models much because I don’t have money or hardware for it, and now that I can experiment, I have to do so very slowly. In fact, the very simple test that inspired this post isn’t finished right now, and has been running for 9 hours. That test is to make the default prompt of Dolphin lead to less verbose responses so that I can get usable results quicker.

I asked each version of this prompt “How are you?”:

PromptOutput Length, 5-shotDifferenceNotes
Dolphin (default)133.8 charactersWastes time explaining itself.
Curt32.2 characters76% fasterStraight to the point.
Curt284.6 characters37% fasterWastes time explaining itself.

I really dislike when models waste time explaining that they are just an LLM. Whether someone understands what that means or not, we don’t care. We want results, not an apology or defensiveness. There’s more to do to make this model less likely to respond with that, but at least for now, I have a method to make things work.

The most shocking thing to me was how much of a difference a few words make in the system prompt, and how I got results opposite of what I expected. The only difference between Curt and Curt2 was “You prefer very short answers.” vs “You are extremely curt.” Apparently curt doesn’t mean exactly what I thought it means.

Here’s a link to the generated responses if you want to compare them yourself. Oh, and I’m using custom scripts to make things easier for me since I’m mostly stuck on Windows.

Fastest Space Travel, Smallest Time Dilation

Note: A bug in WordPress or the LaTeX plugin used to display equations on this blog has broken the formatting at the end of this post, and I can’t figure out why or how to fix it.

You’re the captain of a medical ship responding to a distress call from the next solar system. You need to arrive as fast as possible. How fast do you go?

It sounds like a ridiculous question until you think about time dilation due to moving at relativistic speeds. If you want to skip straight to the answer, it’s ≈0.707c. If you travel any slower, it takes longer to get there (duh).. but the interesting part is that if you travel any faster, you also arrive later. While the trip time for you is purely based on how fast you move, the arrival time is based on how fast you move relative to your destination.

Welcome to relativity. A simple equation relates how much time passes between an observer at rest (your destination) and an observer in motion (you):

t = t0 / \sqrt{1 – v^2 / c^2}

t is the time at the destination, while t0 is the time that passes for you. You need to know how fast to go, so I’ve rearranged this equation to solve for a velocity based on known times:

v = \sqrt{ c^2t^2-c^2t_0 } / t

Of course, we don’t know what those times are either. It’s a bit of a conundrum (which kept me from solving this problem for over 2 years). This next part? I solved it at 3 am a few days ago, and then forgot what logical leap got me here. I have no idea why this works, I just know that it does:

v \in [0,1) \\
t_0 = 1 / v
t = t_0 / \sqrt{ 1 – t_0^2 }

The first line defines the valid range of velocities (anything from stationary to lightspeed, except for lightspeed itself). The second and third lines confuse the hell out of me. No idea how I figured out that your time is 1 / v. I know it has something to do with using multiples of c as velocity values so that I can replace occurrences of c with 1. It also has something to do with assuming 1 second passes for the stationary observer (your destination, t), allowing it to also be replaced with 1 as well.

Somehow these assumptions led to an equation with only t0 on the right side, and t on the left – what needs to be minimized! Again, we need to solve for velocity, so we substitute t0 with 1 / v:

t = v \sqrt{1 – v^2}

This is the part where I cheated and had Wolfram Alpha find the minimum for me:


And as you can see, we have a fairly interesting shape, and a minimum of 2 at 1 / √2, which is ≈0.707. Remember that this equation is using multiples of c, so that’s ≈211,985 km/s. This means that the fastest you can possibly arrive is however long it takes you at this speed, doubled, plus however long it takes you to accelerate and decelerate on either end of the trip.

If you travel as fast as possible, your arrival time is only double your trip time.

Of course, because of the acceleration and deceleration times, you will never hit exactly 2x time dilation over the total trip – but you don’t want to anyhow, as the only way to hit that value would be to go even faster – and guarantee you arrive even later than you otherwise would.