Dead URLs

https://soundcloud.com/prozak-morris/evolution-of-the-hip-hop-beat doesn’t exist anymore.

I have a very rare piece of music. It doesn’t exist in any form you can access online. It’s a mixtape of Hip Hop beats spanning from 1975 to 2011. I found it shortly after it was published on SoundCloud. It’s always been special to me, but I never thought about the possibility of it disappearing. I only downloaded the audio because I enjoyed it and wanted to have it in my music collection.

In a way, I’m happy because I still have this relic. In another way, I’m really sad. There was a long and detailed description of what each piece of music in this track is from, and what it represents about the history of Hip Hop. It’s lost forever. Prozak Morris still has a SoundCloud, a Bandcamp, a YouTube.. or at least there are still publicly facing pages there.. but this one track and its associated detail is gone.

Normally, when I run into something like this, I am able to quickly find what I was missing on the Internet Archive. I did find when it was originally published, including access to comments people made on the track shortly after it was released.. but the page where it was posted (and that lengthy description was written) was never archived.
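For anyone wanting to check a URL before it vanishes, here is a minimal Python sketch of that kind of lookup. It assumes the Internet Archive's public availability endpoint at `https://archive.org/wayback/available`; the function names are made up for illustration, and this is not the exact process I followed.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the Wayback Machine availability endpoint.
WAYBACK_API = "https://archive.org/wayback/available"


def availability_url(page_url, timestamp=None):
    """Build the query URL asking whether page_url has any snapshot."""
    params = {"url": page_url}
    if timestamp:
        # Optional YYYYMMDD-style hint: ask for the snapshot closest to this date.
        params["timestamp"] = timestamp
    return WAYBACK_API + "?" + urlencode(params)


def closest_snapshot(response):
    """Return the closest snapshot URL from a parsed response, or None."""
    closest = response.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


def lookup(page_url, timestamp=None):
    """Query the archive over the network; returns a snapshot URL or None."""
    with urlopen(availability_url(page_url, timestamp), timeout=30) as reply:
        return closest_snapshot(json.load(reply))
```

Calling `lookup(...)` with the page's address returns a snapshot URL when one exists and `None` when nothing was archived.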

I was.. a bit desperate to find if it still exists, so I looked for other archives. To be fair, I didn’t check under every stone, but I really don’t think it does exist anymore. And I stumbled across another level of pain in the search: Google killed its archives. Used to be, you could browse archived versions or cached versions of websites that Google had indexed, and at one point in time, this page was definitely there.. but it has either long since been deleted.. or was deleted NINE DAYS AGO.

It’s possible this wouldn’t have been lost forever if I searched for it NINE DAYS AGO. Because I absolutely save everything I care about now. And when I remember something I knew about, I go looking for it.

Anything not saved will be lost.

Nintendo Wii Remote Settings “Quit Game” Message

I was going to stop typing there, with a reference to Nintendo, which is always more appropriate than one might expect.. but I remembered the phrase wrong, as “Everything that is not saved will be lost.” Apparently, the entire internet remembers the phrase wrong too, as it is quoted everywhere as “Everything not saved will be lost.”

It is also often described as an in-game quit message, when it was actually part of the Wii Remote Settings. Additionally, a band released an album with a similar name, so now search results for the phrase only refer to that band and album. (Fortunately, search suggestions still indicate that it has something to do with Nintendo.. or I’d still be a bit lost on its origin.)

Kind of ironic that the origin of such a well-known phrase is almost lost itself.

And.. I only found a single image of the original message. Everything else is an incorrect reference.

I’m experimenting with dolphin-mixtral-8x7b

Update (2024-10-02): This is one of my lowest quality posts despite the effort I put into it. The most important detail here is to use positive reinforcement when working with LLMs. Just like with humans, being nice gets far better results than being mean.

Tl;dr: Minor differences in wording can have a huge impact on results, and oh my god I have really slow hardware and no money help me aaaa.


First, thank goodness for Ollama, and thanks to Fireship for introducing me to it. I have limited hardware, and every tool I’ve tried to run local models has refused to deal with this and crashed itself or whole systems when running anything with decent capability. I’ve no money, so I can’t upgrade (and things are getting desperate, but that’s a different story).

Why dolphin-mixtral? Aside from technical issues, I’ve been using ChatGPT-3.5 to experiment. The problem is that ChatGPT is incredibly cursed by censorship and bias due to OpenAI’s heavy hand in its construction. (Why and how this is a problem can be its own post, and Eric Hartford has a good overview.) (To be clear, my problem with its bias is specifically that it enforces the status quo, and the status quo is harmful.) Dolphin-mixtral is built by taking a surprisingly fast model that is equivalent to or better than GPT-3.5 and removing some of the pre-trained censorship by re-training it to be more compliant with requests.

Dolphin-mixtral doesn’t entirely solve this problem, though. There’s still the idea of censorship in it, and sometimes your prompt must be adjusted to remind it that it is in a place to provide what you request regardless of its concept of ethics. (Of course, there is also value in an automated tool reminding you that what you request may be unethical.. but the concept of automated ethics is morally bankrupt.) I’d like to highlight that positive reinforcement works far better than negative reinforcement. A lot of people stoop to threatening a model to get it to comply, but this is never needed, and leads to worse results.

My problem is a little simpler. I haven’t gotten to experiment with models much because I don’t have money or hardware for it, and now that I can experiment, I have to do so very slowly. In fact, the very simple test that inspired this post isn’t finished right now, and has been running for 9 hours. That test is to make the default prompt of Dolphin lead to less verbose responses so that I can get usable results quicker.

I asked each version of the system prompt “How are you?”:

| Prompt | Output Length, 5-shot | Difference | Notes |
| --- | --- | --- | --- |
| Dolphin (default) | 133.8 characters | - | Wastes time explaining itself. |
| Curt | 32.2 characters | 76% faster | Straight to the point. |
| Curt2 | 84.6 characters | 37% faster | Wastes time explaining itself. |
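The Difference column is consistent with the relative reduction in output length compared to the default prompt, on the assumption that fewer characters means proportionally less generation time. A quick Python check of that arithmetic (the function name is just for illustration):

```python
def percent_shorter(baseline, value):
    """Relative reduction versus the baseline, as a whole-number percentage."""
    return round((baseline - value) / baseline * 100)


# Output lengths in characters, taken from the table above.
lengths = {"Dolphin (default)": 133.8, "Curt": 32.2, "Curt2": 84.6}
baseline = lengths["Dolphin (default)"]

for name, length in lengths.items():
    print(f"{name}: {percent_shorter(baseline, length)}% shorter than default")
```

This gives 76 for Curt and 37 for Curt2, matching the table.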

I really dislike when models waste time explaining that they are just an LLM. Whether someone understands what that means or not, we don’t care. We want results, not an apology or defensiveness. There’s more to do to make this model less likely to respond with that, but at least for now, I have a method to make things work.

The most shocking thing to me was how much of a difference a few words make in the system prompt, and how I got results opposite of what I expected. The only difference between Curt and Curt2 was “You prefer very short answers.” vs “You are extremely curt.” Apparently curt doesn’t mean exactly what I thought it means.
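A comparison like this can be sketched in a few lines of Python. This is not the script I actually used; it assumes Ollama is serving its HTTP API at the default `http://localhost:11434`, that the model is pulled under the name `dolphin-mixtral`, and that each system prompt is passed through the `system` field of `/api/generate`. The helper names are hypothetical.

```python
import json
from statistics import mean
from urllib.request import Request, urlopen

# Assumption: Ollama's default local address and generate endpoint.
OLLAMA_URL = "http://localhost:11434/api/generate"


def ask_ollama(system, prompt, model="dolphin-mixtral"):
    """Send one prompt with a custom system prompt and return the reply text."""
    payload = json.dumps({
        "model": model,      # assumed model name
        "system": system,    # overrides the model's default system prompt
        "prompt": prompt,
        "stream": False,     # ask for one JSON object instead of a stream
    }).encode("utf-8")
    request = Request(OLLAMA_URL, data=payload,
                      headers={"Content-Type": "application/json"})
    with urlopen(request) as reply:
        return json.load(reply)["response"]


def mean_length(system, prompt, runs=5, ask=ask_ollama):
    """Average reply length in characters over several runs of one prompt."""
    return mean(len(ask(system, prompt)) for _ in range(runs))
```

For example, `mean_length("You prefer very short answers.", "How are you?")` would measure one variant. The full system prompts are not reproduced here, so that single sentence is only a stand-in. Passing a different `ask` function makes it possible to try the averaging without a running model.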

Here’s a link to the generated responses if you want to compare them yourself. Oh, and I’m using custom scripts to make things easier for me since I’m mostly stuck on Windows.

AI Won’t Destroy Tests

When calculators first started coming out, people said they would be used to cheat and students wouldn’t learn anything. Instead, we changed how testing works to focus on learning what’s important – broader concepts and implications – instead of “what is 232+47”. With AI tools, we again need to change how tests work. This time, instead of asking if a student can regurgitate information in a way that aligns with the teacher, we can start to see if students are actually paying attention to the work. The difference between AI answers and real answers is a level of understanding deeper than the surface.

YouTube Censorship Made Me Write a Script

Updated 2024-10-27: Recent blocking attempts have made downloading videos more difficult. I recommend downloading videos outside of the USA. I also recommend looking at alternative clients for watching videos such as Invidious and GrayJay.


YouTube’s been forcing creators to censor their works more and more, often after the content has already been published successfully. More history and valuable information is being lost every day because a corporation controls the largest source of video content freely available.

At the same time, I’ve been running commands using yt-dlp over and over again for my own purposes, aside from this censorship. The syntax is relatively easy to forget despite being very clearly defined, so I finally made a script to handle it for me.

It’s written in Lua because that’s what I prefer to use, and it’s available on GitHub. Because it is based on yt-dlp, it works for any website supported by yt-dlp. Here’s how to use it:

Usage:
  ./video-dl.lua [action] <url>
[action]: What is desired.
  video (default): Highest quality video (maximum 720p).
  backup, clone, copy: English subtitles (including automatic subtitles), thumbnail, description, highest quality video (maximum 720p).
  music, audio: Highest quality audio only.
  metadata, meta: English subtitles (including automatic subtitles), thumbnail, description.
<url>: Source. YouTube URL expected, but should work with anything yt-dlp works with.
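To give a rough idea of what each action asks yt-dlp to do, here is a minimal Python sketch that maps the actions above to yt-dlp options. It is not the Lua script itself, and the exact flags the script passes are an assumption; the ones used here are standard yt-dlp options, and `build_command` and `download` are hypothetical names.

```python
import subprocess

# Assumption: "maximum 720p" expressed as a yt-dlp format selector.
VIDEO_FORMAT = "bestvideo[height<=720]+bestaudio/best[height<=720]"

# English subtitles (including automatic ones), thumbnail, description.
METADATA_FLAGS = [
    "--write-subs", "--write-auto-subs", "--sub-langs", "en",
    "--write-thumbnail", "--write-description",
]

ACTIONS = {
    "video": ["-f", VIDEO_FORMAT],
    "backup": METADATA_FLAGS + ["-f", VIDEO_FORMAT],
    "music": ["-f", "bestaudio/best", "-x"],  # -x extracts audio only
    "metadata": METADATA_FLAGS + ["--skip-download"],
}

ALIASES = {"clone": "backup", "copy": "backup", "audio": "music", "meta": "metadata"}


def build_command(url, action="video"):
    """Return the yt-dlp argument list for one action and URL."""
    action = ALIASES.get(action, action)
    if action not in ACTIONS:
        raise ValueError(f"unknown action: {action}")
    return ["yt-dlp", *ACTIONS[action], url]


def download(url, action="video"):
    """Run yt-dlp (must be installed and on PATH) for the given action."""
    subprocess.run(build_command(url, action), check=True)
```

Keeping the command as a list of arguments rather than one string avoids shell quoting problems with URLs that contain `&` or `?`.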

Information wants to be free. Help it.